Introduction to Data Engineering (Coursera)

Offered by IBM,

This course introduces the core concepts, processes, and tools you need to build a foundational knowledge of data engineering. You will gain an understanding of the modern data ecosystem and the roles Data Engineers, Data Scientists, and Data Analysts play in it. The Data Engineering Ecosystem includes several different components, among them disparate data types, formats, and sources of data.

Data Pipelines gather data from multiple sources, transform it into analytics-ready data, and make it available to data consumers for analytics and decision-making. Data repositories, such as relational and non-relational databases, data warehouses, data marts, data lakes, and big data stores process and store this data. Data Integration Platforms combine disparate data into a unified view for the data consumers. You will learn about each of these components in this course. You will also learn about Big Data and the use of some of the Big Data processing tools.
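As a rough illustration of the pipeline idea described above, the following minimal sketch gathers records from two sources, normalizes them into one schema, and exposes a simple view for a data consumer. The inlined sources, field names, and function names are all hypothetical and are not taken from the course material.

```python
import csv
import io
import json

# Hypothetical source 1: a CSV export (inlined so the sketch is self-contained).
CSV_SOURCE = """order_id,customer,amount_usd
1,alice,19.99
2,bob,5.00
"""

# Hypothetical source 2: a JSON feed with different field names, amounts in cents.
JSON_SOURCE = '[{"id": 3, "user": "Alice", "amount_cents": 1250}]'


def extract():
    """Gather raw records from both sources."""
    csv_rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    json_rows = json.loads(JSON_SOURCE)
    return csv_rows, json_rows


def transform(csv_rows, json_rows):
    """Normalize both record shapes into one analytics-ready schema."""
    unified = []
    for row in csv_rows:
        unified.append({
            "order_id": int(row["order_id"]),
            "customer": row["customer"].strip().lower(),
            "amount_cents": round(float(row["amount_usd"]) * 100),
        })
    for row in json_rows:
        unified.append({
            "order_id": int(row["id"]),
            "customer": row["user"].strip().lower(),
            "amount_cents": int(row["amount_cents"]),
        })
    return unified


def total_by_customer(records):
    """A consumer-facing view: total spend per customer, in cents."""
    totals = {}
    for rec in records:
        totals[rec["customer"]] = totals.get(rec["customer"], 0) + rec["amount_cents"]
    return totals


records = transform(*extract())
totals = total_by_customer(records)
print(totals)
```

In a real pipeline the sources would be files, APIs, or databases rather than inlined strings, and the unified records would usually land in a data repository instead of staying in memory.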
A typical Data Engineering lifecycle includes architecting data platforms, designing data stores, and gathering, importing, wrangling, querying, and analyzing data. It also includes performance monitoring and fine-tuning to ensure systems are performing at optimal levels. In this course, you will learn about the data engineering lifecycle. You will also learn about security, governance, and compliance.
Data Engineering is recognized as one of the fastest-growing fields today. The career opportunities available in the field and the different paths you can take to enter this field are discussed in the course.
The course also includes hands-on labs that guide you to create your IBM Cloud Lite account, provision a database instance, load data into the database instance, and perform some basic querying operations that help you understand your dataset.

What You Will Learn

  • Demonstrate the skills required for an entry-level data engineering role.
  • Implement various concepts in the data engineering lifecycle.
  • Showcase working knowledge of Python, Relational Databases, NoSQL Data Stores, Big Data Engines, Data Warehouses, and Data Pipelines.
  • Describe data security, governance, and compliance.

Syllabus

WEEK 1
What is Data Engineering?
In this module, you will learn about the different entities that come together to form a modern data ecosystem and the role Data Engineers, Data Scientists, Data Analysts, Business Analysts, and Business Intelligence Analysts play in this ecosystem. You will learn what data engineering is and the key tasks in a data engineering lifecycle. You will also gain an understanding of the responsibilities of a data engineer, the skillsets they need in order to be successful, and what a typical day in the life of a data engineer looks like. At the end of the module, you will be guided to create a Lite account on IBM Cloud.

WEEK 2
The Data Engineering Ecosystem
In this module, you will learn about the data engineering ecosystem, the different types of data structures, file formats, sources of data, and the languages data professionals use in their day-to-day tasks. You will gain an understanding of several different types of data repositories such as relational and non-relational databases, data warehouses, data marts, and data lakes. You will learn about ETL and ELT processes, data pipelines, and data integration platforms. You will also gain an understanding of what big data is, and the tools used for processing and storing big data. During the course of this module, you will be guided to provision an instance of IBM Db2 using the Cloud Lite account you created in the previous module.
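The difference between ETL and ELT mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: SQLite (from the Python standard library) stands in for a real data store, and the sample readings, table names, and function names are hypothetical.

```python
import sqlite3

# Hypothetical raw input: temperatures recorded as text, in Fahrenheit.
RAW = [("berlin", "68.0"), ("oslo", "50.0")]


def to_celsius(fahrenheit):
    """Convert Fahrenheit to Celsius, rounded to one decimal place."""
    return round((fahrenheit - 32) * 5 / 9, 1)


def run_etl(conn):
    """ETL: transform in application code, then load only the clean rows."""
    conn.execute("CREATE TABLE etl_readings (city TEXT, temp_c REAL)")
    clean = [(city, to_celsius(float(temp))) for city, temp in RAW]
    conn.executemany("INSERT INTO etl_readings VALUES (?, ?)", clean)
    return conn.execute(
        "SELECT city, temp_c FROM etl_readings ORDER BY city").fetchall()


def run_elt(conn):
    """ELT: load the raw rows as-is, then transform inside the data store."""
    conn.execute("CREATE TABLE raw_readings (city TEXT, temp_f TEXT)")
    conn.executemany("INSERT INTO raw_readings VALUES (?, ?)", RAW)
    conn.execute(
        """
        CREATE TABLE elt_readings AS
        SELECT city,
               ROUND((CAST(temp_f AS REAL) - 32) * 5.0 / 9.0, 1) AS temp_c
        FROM raw_readings
        """
    )
    return conn.execute(
        "SELECT city, temp_c FROM elt_readings ORDER BY city").fetchall()


conn = sqlite3.connect(":memory:")
etl_rows = run_etl(conn)
elt_rows = run_elt(conn)
print(etl_rows)
print(elt_rows)
conn.close()
```

Both paths end with the same cleaned table. The practical difference is where the transformation runs and whether the raw data is kept in the store for later reprocessing.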

WEEK 3
Data Engineering Lifecycle
In this module, we will walk you through the data engineering lifecycle. You will learn about the architecture of a data platform, factors for selecting and designing data stores, and the different facets of security as it applies to data platforms and data lifecycle management. You will also learn about the process, steps, and tools used for gathering, importing, wrangling, and querying data. You will gain an understanding of performance monitoring and the steps you can take to troubleshoot performance issues. We will also talk about governance regulations, why we need them, and how technology enables compliance with regulations. During the course of this module, you will be guided to load data from a CSV file into the IBM Db2 instance you created in the previous module. You will also be guided to explore your dataset using some basic SQL queries that will be provided to you.
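The lab described above uses IBM Db2 on IBM Cloud. As a local stand-in, the following sketch shows the same general flow (load CSV rows into a table, then explore them with basic SQL) using SQLite from the Python standard library. The CSV content, table name, and queries are hypothetical and are not the ones provided in the course.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content, inlined in place of a file on disk.
CSV_TEXT = """name,department,salary
Ana,Engineering,120000
Ben,Engineering,95000
Cal,Sales,70000
"""

# SQLite is used here only as a stand-in for a Db2 instance.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")

# Load: parse the CSV and insert the rows with a parameterized statement.
reader = csv.DictReader(io.StringIO(CSV_TEXT))
rows = [(r["name"], r["department"], int(r["salary"])) for r in reader]
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
conn.commit()

# Explore: row count, distinct values, and a grouped aggregate.
row_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
departments = [d for (d,) in conn.execute(
    "SELECT DISTINCT department FROM employees ORDER BY department")]
avg_salary = conn.execute(
    "SELECT department, AVG(salary) FROM employees "
    "GROUP BY department ORDER BY department").fetchall()

print(row_count, departments, avg_salary)
conn.close()
```

Queries like `COUNT(*)`, `SELECT DISTINCT`, and `GROUP BY` are standard SQL, so the exploration step should carry over to Db2 with little change, although the connection and loading mechanics differ.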

WEEK 4
Career Opportunities and Data Engineering in Action
In this module, you will learn about career opportunities in the field of Data Engineering and the different paths you can take to build your skills as a Data Engineer. At the end of the module, you will be presented with the final graded assignment, which is divided into two parts. The first part includes a couple of quiz questions, and the second part includes open-ended questions that will be reviewed and graded by a peer.

Related Courses

Code Free Data Science (Coursera)
University of California, San Diego

The Code Free Data Science class is designed for learners seeking to gain or expand their knowledge in the area of Data Science. Participants will receive the basic training in effective predictive analytic approaches accompanying the growing discipline of Data Science without any programming requirements. Machine Learning methods will be presented by utilizing the KNIME Analytics Platform to discover patterns and relationships in data.

Jun 8th 2026
4 Weeks

Building Scalable Java Microservices with Spring Boot and Spring Cloud (Coursera)
Google Cloud

"Microservices" describes a software design pattern in which an application is a collection of loosely coupled services. These services are fine-grained, and can be individually maintained and scaled. The microservices architecture is ideal for the public cloud, with its focus on elastic scaling with on-demand resources. In this course, you will learn how to build Java applications using Spring Boot and Spring Cloud on Google Cloud Platform.

Jun 9th 2026
2 Weeks

Introduction to Machine Learning (Coursera)
Duke University

This course will provide you with a foundational understanding of machine learning models (logistic regression, multilayer perceptrons, convolutional neural networks, natural language processing, etc.) and demonstrate how these models can solve complex problems in a variety of industries, from medical diagnostics to image recognition to text prediction.

Jun 12th 2026
5-12 Weeks

Building Database Applications in PHP (Coursera)
University of Michigan

In this course, we'll look at the object-oriented patterns available in PHP. You'll learn how to connect to a MySQL database using the PHP Data Objects (PDO) library and issue SQL commands in the PHP language. We'll also look at how PHP uses cookies and manages session data. You'll learn how PHP avoids double posting data, how flash messages are implemented, and how to use a session to log in users in web applications.

Jun 8th 2026
5-12 Weeks

Introduction to Recommender Systems: Non-Personalized and Content-Based (Coursera)
University of Minnesota

This course, which is designed to serve as the first course in the Recommender Systems specialization, introduces the concept of recommender systems, reviews several examples in detail, and leads you through non-personalized recommendation using summary statistics and product associations, basic stereotype-based or demographic recommendations, and content-based filtering recommendations.

Jun 8th 2026
4 Weeks

GIS Data Formats, Design and Quality (Coursera)
University of California, Davis

In this course, the second in the Geographic Information Systems (GIS) Specialization, you will learn to design data tables and to separate and join data in a relational database; write query strings to subset data; create and work with raster data; and create web maps.

Jun 8th 2026
4 Weeks

Communicating Data Science Results (Coursera)
University of Washington

Making predictions is not enough! Effective data scientists know how to explain and interpret their results, and communicate findings accurately to stakeholders to inform business decisions. Visualization is the field of research in computer science that studies effective communication of quantitative results by linking perception, cognition, and algorithms to exploit the enormous bandwidth of the human visual cortex. In this course you will learn to recognize, design, and use effective visualizations.

Jun 8th 2026
3 Weeks

La recherche documentaire (Coursera)
École Polytechnique

This course primarily aims to enable students to identify the relevant sources in a given field, and to teach them to build a state-of-the-art review and to evaluate sources, particularly those freely accessible on the Internet. It also seeks to optimize literature searching by encouraging students to make the most of database search tools and queries. By the end of the course, they should be able to build and maintain an organized bibliography and to cite their sources properly to avoid plagiarism.

Jun 8th 2026
3 Weeks

Managing Machine Learning Projects with Google Cloud (Coursera)
Google Cloud

Business professionals in non-technical roles have a unique opportunity to lead or influence machine learning projects. If you have questions about machine learning and want to understand how to use it, without the technical jargon, this course is for you. Learn how to translate business problems into machine learning use cases and vet them for feasibility and impact.

Jun 8th 2026
4 Weeks

Introduction to Data Science in Python (Coursera)
University of Michigan

This course will introduce the learner to the basics of the Python programming environment, including fundamental Python programming techniques such as lambdas, reading and manipulating CSV files, and the NumPy library. The course will introduce data manipulation and cleaning techniques using the popular Python pandas data science library and introduce the abstraction of the Series and DataFrame as the central data structures for data analysis, along with tutorials on how to use functions such as groupby, merge, and pivot tables effectively. By the end of this course, students will be able to take tabular data, clean it, manipulate it, and run basic inferential statistical analyses.

Jun 8th 2026
4 Weeks