Generative AI and LLMs: Architecture and Data Preparation

Generative AI and LLMs: Architecture and Data Preparation

Name: Generative AI and LLMs: Architecture and Data Preparation
Rating: 4.687022900763359 (131 reviews)

This course is part of multiple programs.

Instructors: Joseph Santarcangelo

10,817 already enrolled

Included with Coursera Plus

Learn more

2 modules

Gain insight into a topic and learn the fundamentals.

4.7

(131 reviews)

Intermediate level

Recommended experience

5 hours to complete

3 weeks at 1 hour a week

Flexible schedule

Learn at your own pace

2 modules

Gain insight into a topic and learn the fundamentals.

4.7

(131 reviews)

Intermediate level

Recommended experience

5 hours to complete

3 weeks at 1 hour a week

Flexible schedule

Learn at your own pace

What you'll learn

Differentiate between generative AI architectures and models, such as RNNs, Transformers, VAEs, GANs, and Diffusion Models.
Describe how LLMs, such as GPT, BERT, BART, and T5, are used in language processing.
Implement tokenization to preprocess raw textual data using NLP libraries such as NLTK, spaCy, BertTokenizer, and XLNetTokenizer.
Create an NLP data loader using PyTorch to perform tokenization, numericalization, and padding of text data.

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

4 assignments

Taught in English

See how employees at top companies are mastering in-demand skills

Learn more about Coursera for Business

Build your subject-matter expertise

This course is available as part of

When you enroll in this course, you'll also be asked to select a specific program.

Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV

Share it on social media and in your performance review

There are 2 modules in this course

This IBM short course, a part of Generative AI Engineering Essentials with LLMs Professional Certificate, will teach you the basics of using generative AI and Large Language Models (LLMs). This course is suitable for existing and aspiring data scientists, machine learning engineers, deep-learning engineers, and AI engineers.

You will learn about the types of generative AI and its real-world applications. You will gain the knowledge to differentiate between various generative AI architectures and models, such as Recurrent Neural Networks (RNNs), Transformers, Generative Adversarial Networks (GANs), Variational AutoEncoders (VAEs), and Diffusion Models. You will learn the differences in the training approaches used for each model. You will be able to explain the use of LLMs, such as Generative Pre-Trained Transformers (GPT) and Bidirectional Encoder Representations from Transformers (BERT). You will also learn about the tokenization process, tokenization methods, and the use of tokenizers for word-based, character-based, and subword-based tokenization. You will be able to explain how you can use data loaders for training generative AI models and list the PyTorch libraries for preparing and handling data within data loaders. The knowledge acquired will help you use the generative AI libraries in Hugging Face. It will also prepare you to implement tokenization and create an NLP data loader. For this course, a basic knowledge of Python and PyTorch and an awareness of machine learning and neural networks would be an advantage, though not strictly required.

In this module, you will learn about the significance of generative AI models and how they are used across a wide range of fields for generating various types of content. You will learn about the architectures and models commonly used in generative AI and the differences in the training approaches of these models. You will learn how large language models (LLMs) are used to build NLP-based applications. You will build a simple chatbot using the transformers library from Hugging Face.

What's included

5 videos2 readings2 assignments1 app item3 plugins

5 videosTotal 28 minutes

Overview of AI Engineering with LLMs5 minutesPreview module
Course Introduction3 minutes
Significance of Generative AI 5 minutes
Generative AI Architectures and Models 6 minutes
Generative AI for NLP7 minutes

2 readingsTotal 13 minutes

Course Overview10 minutes
Summary and Highlights3 minutes

2 assignmentsTotal 25 minutes

Graded Quiz: Generative AI Architecture15 minutes
Practice Quiz: Generative AI Overview and Architecture10 minutes

1 app itemTotal 60 minutes

Lab: Exploring Generative AI Libraries60 minutes

3 pluginsTotal 32 minutes

Helpful Tips for Course Completion2 minutes
Reading: Basics of AI Hallucinations10 minutes
Reading: Overview of Libraries and Tools20 minutes

In this module, you will learn to prepare data for training large language models (LLMs) by implementing tokenization. You will learn about the tokenization methods and the use of tokenizers. You will also learn about the purpose of data loaders and how you can use the DataLoader class in PyTorch. You will implement tokenization using various libraries such as nltk, spaCy, BertTokenizer, and XLNetTokenizer. You will also create a data loader with a collate function that processes batches of text.

What's included

2 videos5 readings2 assignments2 app items2 plugins

2 videosTotal 13 minutes

Tokenization6 minutesPreview module
Overview of Data Loaders6 minutes

5 readingsTotal 13 minutes

Data Quality and Diversity for Effective LLM Training 5 minutes
Summary and Highlights2 minutes
Course Conclusion3 minutes
Congratulations and Next Steps2 minutes
Team and Acknowledgments1 minute

2 assignmentsTotal 25 minutes

Graded Quiz: Data Preparation for LLMs15 minutes
Practice Quiz: Preparing Data10 minutes

2 app itemsTotal 120 minutes

Lab: Implementing Tokenization60 minutes
Lab: Creating an NLP Data Loader60 minutes

2 pluginsTotal 9 minutes

Cheat Sheet: Generative AI and LLMs: Architecture and Data Preparation5 minutes
Course Glossary: Generative AI and LLMs: Architecture and Data Preparation4 minutes

Instructors

Instructor ratings

4.3 (30 ratings)

Joseph Santarcangelo

IBM

33 Courses1,786,512 learners

Offered by

IBM

Recommended if you're interested in Machine Learning

Board Infinity
The Rise of Generative AI
Course
University of Michigan
Responsible Generative AI
Specialization
Coursera
Setting a Generative AI Strategy
Course
Fractal Analytics
Generative AI Essentials: A Comprehensive Introduction
Course

Why people choose Coursera for their career

Felipe M.

Learner since 2018

"To be able to take courses at my own pace and rhythm has been an amazing experience. I can learn whenever it fits my schedule and mood."

Jennifer J.

Learner since 2020

"I directly applied the concepts and skills I learned from my courses to an exciting new project at work."

Larry W.

Learner since 2021

"When I need courses on topics that my university doesn't offer, Coursera is one of the best places to go."

Chaitanya A.

"Learning isn't just about being better at your job: it's so much more than that. Coursera allows me to learn without limits."

Learner reviews

4.7

131 reviews

5 stars
81.02%
4 stars
11.67%
3 stars
4.37%
2 stars
1.45%
1 star
1.45%

Showing 3 of 131

Reviewed on Jan 2, 2025

It was very informative and I enjoyed the journey I learned the patterns from the deep.

Reviewed on Oct 17, 2024

I am pretty much new to NLP data preparation. However this course made me comfortable with Date preparation activities.

Reviewed on Oct 20, 2024

I highly recommend using a human to deliver the lectures, which might enhance student engagement. Great introductory course.

View more reviews

New to Machine Learning? Start here.

Open new doors with Coursera Plus

Unlimited access to 10,000+ world-class courses, hands-on projects, and job-ready certificate programs - all included in your subscription

Learn more

Advance your career with an online degree

Earn a degree from world-class universities - 100% online

Explore degrees

Join over 3,400 global companies that choose Coursera for Business

Upskill your employees to excel in the digital economy

Learn more

Frequently asked questions

It will take only two weeks to complete this course if you spend two hours of study time per week.

It will be good if you have a basic knowledge of Python and PyTorch and a familiarity with machine learning and neural network concepts.

This course is part of a specialization. When you complete the specialization, you will prepare yourself with the skills and confidence to take on jobs such as AI Engineer, NLP Engineer, Machine Learning Engineer, Deep Learning Engineer, and Data Scientist.

Access to lectures and assignments depends on your type of enrollment. If you take a course in audit mode, you will be able to see most course materials for free. To access graded assignments and to earn a Certificate, you will need to purchase the Certificate experience, during or after your audit. If you don't see the audit option:

The course may not offer an audit option. You can try a Free Trial instead, or apply for Financial Aid.
The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

When you enroll in the course, you get access to all of the courses in the Certificate, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

Generative AI and LLMs: Architecture and Data Preparation

What you'll learn

Details to know

See how employees at top companies are mastering in-demand skills

Build your subject-matter expertise

Earn a career certificate

There are 2 modules in this course

Generative AI Architecture

What's included

Data Preparation for LLMs

What's included

Instructors

Offered by

Recommended if you're interested in Machine Learning

The Rise of Generative AI

Responsible Generative AI

Setting a Generative AI Strategy

Generative AI Essentials: A Comprehensive Introduction

Why people choose Coursera for their career

Learner reviews

New to Machine Learning? Start here.

Open new doors with Coursera Plus

Advance your career with an online degree

Join over 3,400 global companies that choose Coursera for Business

Frequently asked questions

How long does it take to complete the Specialization?

Do I need any background knowledge to complete this course successfully?

Which roles can I perform after completing this course?

More questions