Python Project for Data Engineering (Coursera)

Offered by IBM,
Python Project for Data Engineering (Coursera)

This mini-course is intended to apply foundational Python skills by implementing different techniques to collect and work with data. Assume the role of a Data Engineer and extract data from multiple file formats, transform it into specific datatypes, and then load it into a single source for analysis. Continue with the course and test your knowledge by implementing webscraping and extracting data with APIs all with the help of multiple hands-on labs. After completing this course you will have acquired the confidence to begin collecting large datasets from multiple sources and transform them into one primary source, or begin web scraping to gain valuable business insights all with the use of Python.

Class Deals by MOOC List - Click here and see Coursera's Active Discounts, Deals, and Promo Codes.

Prerequisite: Python for Data Science, AI and Developmentcourse from IBM is a pre-requisite for this project course. Please ensure that before taking this course you have either completed the Python for Data Science, AI and Development course from IBM or have equivalent proficiency in working with Python and data.
NOTE: This course is not intended to teach you Python and does not have too much instructional content. It is intended for you to apply prior Python knowledge.
Completing this course will count towards your learning in any of the following programs:

What You Will Learn

  • Demonstrate your Skills in Python - the language of choice for Data Engineering
  • Implement Webscraping, and use APIs to extract data in Python
  • Play the role of a Data Engineer working on a real project to extract, transform and load data using Jupyter notebook and Watson Studio

Syllabus

WEEK 1: Python Project for Data Engineering

Go to Class
MOOC List is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Related Courses

Data Management for Clinical Research (Coursera) Coursera
Vanderbilt University

Data Management for Clinical Research (Coursera)

This course presents critical concepts and practical methods to support planning, collection, storage, and dissemination of data in clinical research. Understanding and implementing solid data management principles is critical for any scientific domain. Regardless of your current (or anticipated) role in the research enterprise, a strong working knowledge and skill set in data management principles and practice will increase your productivity and improve your science. Our goal is to use these modules to help you learn and practice this skill set.

Jun 22nd 2026
5-12 Weeks
Comparing Genes, Proteins, and Genomes (Bioinformatics III) (Coursera) Coursera
University of California, San Diego

Comparing Genes, Proteins, and Genomes (Bioinformatics III) (Coursera)

Once we have sequenced genomes in the previous course, we would like to compare them to determine how species have evolved and what makes them different. In the first half of the course, we will compare two short biological sequences, such as genes (i.e., short sequences of DNA) or proteins. We will encounter a powerful algorithmic tool called dynamic programming that will help us determine the number of mutations that have separated the two genes/proteins.

Jun 22nd 2026
5-12 Weeks
Machine Learning: Regression (Coursera) Coursera
University of Washington

Machine Learning: Regression (Coursera)

Case Study - Predicting Housing Prices. In our first case study, predicting house prices, you will create models that predict a continuous value (price) from input features (square footage, number of bedrooms and bathrooms,...). This is just one of the many places where regression can be applied. Other applications range from predicting health outcomes in medicine, stock prices in finance, and power usage in high-performance computing, to analyzing which regulators are important for gene expression.

Jun 22nd 2026
5-12 Weeks
Analytical Solutions to Common Healthcare Problems (Coursera) Coursera
University of California, Davis

Analytical Solutions to Common Healthcare Problems (Coursera)

In this course, we’re going to go over analytical solutions to common healthcare problems. I will review these business problems and you’ll build out various data structures to organize your data. We’ll then explore ways to group data and categorize medical codes into analytical categories. You will then be able to extract, transform, and load data into data structures required for solving medical problems and be able to also harmonize data from multiple sources.

Jun 22nd 2026
4 Weeks
Introduction to Data Science in Python (Coursera) Coursera
University of Michigan

Introduction to Data Science in Python (Coursera)

This course will introduce the learner to the basics of the python programming environment, including fundamental python programming techniques such as lambdas, reading and manipulating csv files, and the numpy library. The course will introduce data manipulation and cleaning techniques using the popular python pandas data science library and introduce the abstraction of the Series and DataFrame as the central data structures for data analysis, along with tutorials on how to use functions such as groupby, merge, and pivot tables effectively. By the end of this course, students will be able to take tabular data, clean it, manipulate it, and run basic inferential statistical analyses.

Jun 22nd 2026
4 Weeks
Data Manipulation at Scale: Systems and Algorithms (Coursera) Coursera
University of Washington

Data Manipulation at Scale: Systems and Algorithms (Coursera)

Data analysis has replaced data acquisition as the bottleneck to evidence-based decision making --- we are drowning in it. Extracting knowledge from large, heterogeneous, and noisy datasets requires not only powerful computing resources, but the programming abstractions to use them effectively. The abstractions that emerged in the last decade blend ideas from parallel databases, distributed systems, and programming languages to create a new class of scalable data analytics platforms that form the foundation for data science at realistic scales.

Jun 22nd 2026
4 Weeks
Everyday Excel, Part 2 (Coursera) Coursera
University of Colorado Boulder

Everyday Excel, Part 2 (Coursera)

"Everyday Excel, Part 2" is a continuation of the popular "Everyday Excel, [Parts 1](https://www.mooc-list.com/course/everyday-excel-part-1-coursera)". Building on concepts learned in the first course, you will continue to expand your knowledge of applications in Excel. This course is aimed at intermediate users, but even advanced users will pick up new skills and tools in Excel. By the end of this course, you will have the skills and tools to take on the project-based "Everyday Excel, Part 3 (Projects)".

Jun 22nd 2026
5-12 Weeks
Machine Learning Foundations: A Case Study Approach (Coursera) Coursera
University of Washington

Machine Learning Foundations: A Case Study Approach (Coursera)

Do you have data and wonder what it can tell you? Do you need a deeper understanding of the core ways in which machine learning can improve your business? Do you want to be able to converse with specialists about anything from regression and classification to deep learning and recommender systems? In this course, you will get hands-on experience with machine learning from a series of practical case-studies.

Jun 22nd 2026
5-12 Weeks
Machine Learning: Classification (Coursera) Coursera
University of Washington

Machine Learning: Classification (Coursera)

Case Studies: Analyzing Sentiment & Loan Default Prediction. In our case study on analyzing sentiment, you will create models that predict a class (positive/negative sentiment) from input features (text of the reviews, user profile information,...). In our second case study for this course, loan default prediction, you will tackle financial data, and predict when a loan is likely to be risky or safe for the bank.

Jun 22nd 2026
5-12 Weeks
Python Project: pillow, tesseract, and opencv (Coursera) Coursera
University of Michigan

Python Project: pillow, tesseract, and opencv (Coursera)

This course will walk you through a hands-on project suitable for a portfolio. You will be introduced to third-party APIs and will be shown how to manipulate images using the Python imaging library (pillow), how to apply optical character recognition to images to recognize text (tesseract and py-tesseract), and how to identify faces in images using the popular opencv library. By the end of the course you will have worked with three different libraries available for Python 3 to create a real-world data-analysis project.

Jun 22nd 2026
3 Weeks