Generative AI and LLMs: Architecture and Data Preparation (Coursera)

Offered by IBM

This IBM short course, part of the Generative AI Engineering Essentials with LLMs Professional Certificate, teaches the basics of using generative AI and large language models (LLMs). It is suitable for existing and aspiring data scientists, machine learning engineers, deep-learning engineers, and AI engineers.

You will learn about the types of generative AI and their real-world applications. You will gain the knowledge to differentiate between various generative AI architectures and models, such as Recurrent Neural Networks (RNNs), Transformers, Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and Diffusion Models. You will learn the differences in the training approaches used for each model. You will be able to explain the use of LLMs, such as Generative Pre-trained Transformers (GPT) and Bidirectional Encoder Representations from Transformers (BERT).
You will also learn about the tokenization process, tokenization methods, and the use of tokenizers for word-based, character-based, and subword-based tokenization. You will be able to explain how you can use data loaders for training generative AI models and list the PyTorch libraries for preparing and handling data within data loaders. The knowledge acquired will help you use the generative AI libraries in Hugging Face. It will also prepare you to implement tokenization and create an NLP data loader.
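As a rough illustration of the three tokenization styles named above, the following Python sketch uses only the standard library. The function names and the toy vocabulary are hypothetical and not taken from the course. The subword function is a simplified greedy longest-match scheme in the style of WordPiece, not the tokenizer of any specific library.

```python
import re


def word_tokenize(text):
    # Word-based: runs of word characters, plus each punctuation mark on its own.
    return re.findall(r"\w+|[^\w\s]", text)


def char_tokenize(text):
    # Character-based: every character (spaces included) becomes a token.
    return list(text)


def subword_tokenize(word, vocab):
    # Subword-based: greedy longest-match-first over a fixed vocabulary.
    # Pieces that continue a word carry a "##" prefix (assumed convention);
    # a word that cannot be fully covered maps to a single "[UNK]" token.
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return ["[UNK]"]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary, for illustration only.
toy_vocab = {"token", "##ization", "##s"}

print(word_tokenize("Tokenization matters, a lot."))
print(char_tokenize("AI!"))
print(subword_tokenize("tokenization", toy_vocab))
```

Word-based tokens are easy to read but produce large vocabularies, character-based tokens keep the vocabulary tiny at the cost of long sequences, and subword schemes sit between the two.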
For this course, a basic knowledge of Python and PyTorch and an awareness of machine learning and neural networks would be an advantage, though not strictly required.

What you'll learn

  • Differentiate between generative AI architectures and models, such as RNNs, Transformers, VAEs, GANs, and Diffusion Models.
  • Describe how LLMs, such as GPT, BERT, BART, and T5, are used in language processing.
  • Implement tokenization to preprocess raw textual data using NLP libraries and tokenizer classes such as NLTK, spaCy, BertTokenizer, and XLNetTokenizer.
  • Create an NLP data loader using PyTorch to perform tokenization, numericalization, and padding of text data.
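The numericalization and padding steps in the last item can be sketched in plain Python before any PyTorch is involved. This is a minimal sketch under stated assumptions: the helper names, the "<pad>" and "<unk>" special tokens, and whitespace tokenization are illustrative choices, not the course's implementation.

```python
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # assumed special tokens


def build_vocab(tokenized_texts, min_freq=1):
    # Count tokens and assign indices; special tokens come first,
    # so "<pad>" is index 0 and "<unk>" is index 1.
    counts = Counter(tok for toks in tokenized_texts for tok in toks)
    itos = [PAD, UNK] + sorted(t for t, c in counts.items() if c >= min_freq)
    return {tok: idx for idx, tok in enumerate(itos)}


def numericalize(tokens, vocab):
    # Map each token to its index; unseen tokens fall back to "<unk>".
    return [vocab.get(tok, vocab[UNK]) for tok in tokens]


def pad_batch(sequences, pad_id=0):
    # Right-pad every sequence to the length of the longest one.
    max_len = max(len(s) for s in sequences)
    return [s + [pad_id] * (max_len - len(s)) for s in sequences]


corpus = ["the cat sat", "the dog"]
tokenized = [s.split() for s in corpus]  # whitespace tokenization for brevity
vocab = build_vocab(tokenized)
ids = [numericalize(t, vocab) for t in tokenized]
batch = pad_batch(ids, pad_id=vocab[PAD])
print(batch)  # [[5, 2, 4], [5, 3, 0]]
```

Padding gives every sequence in a batch the same length, which is what allows the batch to be stacked into a single tensor.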

Syllabus

Generative AI Architecture
In this module, you will learn about the significance of generative AI models and how they are used across a wide range of fields for generating various types of content. You will learn about the architectures and models commonly used in generative AI and the differences in the training approaches of these models. You will learn how large language models (LLMs) are used to build NLP-based applications. You will build a simple chatbot using the transformers library from Hugging Face.

Data Preparation for LLMs
In this module, you will learn to prepare data for training large language models (LLMs) by implementing tokenization. You will learn about the tokenization methods and the use of tokenizers. You will also learn about the purpose of data loaders and how you can use the DataLoader class in PyTorch. You will implement tokenization using libraries and tokenizer classes such as NLTK, spaCy, BertTokenizer, and XLNetTokenizer. You will also create a data loader with a collate function that processes batches of text.
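A data loader with a collate function might look roughly like the sketch below, which assumes PyTorch is installed. The TextDataset class, the toy sentences, the vocabulary, and whitespace tokenization are all hypothetical stand-ins. Only Dataset, DataLoader, and pad_sequence are real PyTorch APIs.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader, Dataset


class TextDataset(Dataset):
    # Hypothetical dataset that simply returns raw strings.
    def __init__(self, texts):
        self.texts = texts

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        return self.texts[idx]


texts = ["generative models create text", "tokenizers split text", "hello"]

# Toy vocabulary built from the data itself; indices 0 and 1 are reserved.
vocab = {"<pad>": 0, "<unk>": 1}
for text in texts:
    for tok in text.split():
        vocab.setdefault(tok, len(vocab))


def collate_fn(batch):
    # Tokenize (whitespace split), numericalize, then pad to the longest
    # sequence in this batch so the result is one rectangular tensor.
    seqs = [
        torch.tensor([vocab.get(tok, vocab["<unk>"]) for tok in text.split()],
                     dtype=torch.long)
        for text in batch
    ]
    return pad_sequence(seqs, batch_first=True, padding_value=vocab["<pad>"])


loader = DataLoader(TextDataset(texts), batch_size=2, shuffle=False,
                    collate_fn=collate_fn)
batches = list(loader)
for padded in batches:
    print(padded.shape)
```

Padding inside the collate function means each batch is only as wide as its own longest sequence, rather than the longest sequence in the whole dataset.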

Related Courses

Artificial Intelligence Data Fairness and Bias (Coursera)
LearnQuest

In this course, we will explore fundamental issues of fairness and bias in machine learning. As predictive models begin making important decisions, from college admission to loan decisions, it becomes paramount to keep models from making unfair predictions. From human bias to dataset awareness, we will explore many aspects of building more ethical models.

Jun 15th 2026
3 Weeks

Generative AI Essentials: Overview and Impact (Coursera)
University of Michigan

With the rise of generative artificial intelligence, there has been a growing demand to explore how to use these powerful tools not only in our work but also in our day-to-day lives. Generative AI Essentials: Overview and Impact introduces learners to large language models and generative AI tools, like ChatGPT. In this course, you’ll explore generative AI essentials, how to ethically use artificial intelligence, its implications for authorship, and what regulations for generative AI could look like.

Jun 19th 2026
1 Week

Neural Networks and Random Forests (Coursera)
LearnQuest

In this course, we will build on our knowledge of basic models and explore advanced AI techniques. We’ll start with a deep dive into neural networks, building our knowledge from the ground up by examining their structure and properties. Then we’ll code some simple neural network models and learn to avoid overfitting using regularization and other hyperparameter tricks.

Jun 15th 2026
3 Weeks

Follow a Machine Learning Workflow (Coursera)
CertNexus

Machine learning is not just a single task or even a small group of tasks; it is an entire process, one that practitioners must follow from beginning to end. It is this process—also called a workflow—that enables the organization to get the most useful results out of their machine learning technologies. No matter what form the final product or service takes, leveraging the workflow is key to the success of the business's AI solution. This second course within the Certified Artificial Intelligence Practitioner (CAIP) professional certificate explores each step along the machine learning workflow, from problem formulation all the way to model presentation and deployment.

Jun 15th 2026
5-12 Weeks

Big Data, Artificial Intelligence, and Ethics (Coursera)
University of California, Davis

This course gives you context and first-hand experience with the two major catalyzers of the computational science revolution: big data and artificial intelligence. With more than 99% of all mediated information in digital format and with 98% of the world population using digital technology, humanity produces an impressive digital footprint.

Jun 15th 2026
4 Weeks

Digitalisation in the Aerospace Industry (Coursera)
Technische Universität München - TUM

The online course Digitalisation in Aerospace aims at making you aware of special production requirements connected with digitalisation. You will learn about the role of robotics and automation in manufacturing and gain a better understanding of differing perspectives on research and manufacturing as well as the points where these intersect.

Jun 15th 2026
3 Weeks

Python Project for AI & Application Development (Coursera)
IBM

This mini-course is intended to help you apply foundational Python skills by implementing different techniques to develop applications and AI-powered solutions. Assume the role of a developer, then unit test and package an application with the help of multiple hands-on labs. After completing this course, you will have the confidence to begin developing AI-enabled applications using Python, build and run unit tests, and package the application for distribution.

Jun 15th 2026
1 Week

Solve Business Problems with AI and Machine Learning (Coursera)
CertNexus

Artificial intelligence (AI) and machine learning (ML) have become an essential part of the toolset for many organizations. When used effectively, these tools provide actionable insights that drive critical decisions and enable organizations to create exciting, new, and innovative products and services. This is the first of four courses in the Certified Artificial Intelligence Practitioner (CAIP) professional certification. This course is meant as an entry point into the world of AI/ML. You'll learn about the business problems that AI/ML can solve, as well as the specific AI/ML technologies that can solve them. In addition, you'll get an overview of the general workflow involved in machine learning, as well as the tools and other resources that support it.

Jun 15th 2026
4 Weeks

Business Application of Machine Learning and Artificial Intelligence in Healthcare (Coursera)
Northeastern University

The future of healthcare is becoming dependent on our ability to integrate Machine Learning and Artificial Intelligence into our organizations. But it is not enough to recognize the opportunities of AI; we as leaders in the healthcare industry have to first determine the best use for these applications, ensuring that we focus our investment on solving problems that impact the bottom line.

Jun 15th 2026
4 Weeks