Exploratory Data Analysis for Machine Learning (Coursera)

Offered by IBM

This first course in the IBM Machine Learning Professional Certificate introduces you to Machine Learning and to the content of the professional certificate. In this course you will see why good-quality data matters. You will learn common techniques to retrieve your data, clean it, apply feature engineering, and get it ready for preliminary analysis and hypothesis testing.

By the end of this course you should be able to:

  • Retrieve data from multiple data sources: SQL, NoSQL databases, APIs, Cloud
  • Describe and use common feature selection and feature engineering techniques
  • Handle categorical and ordinal features, as well as missing values
  • Use a variety of techniques for detecting and dealing with outliers
  • Articulate why feature scaling is important and use a variety of scaling techniques
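
The techniques in the list above can be sketched in a few lines of Python. This is an illustrative sketch using only the standard library, not material from the course: the function names, the `ages` data, and the choice of median imputation, the 1.5 × IQR rule, and min-max scaling are all assumptions made for the example.

```python
import statistics

def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def iqr_outlier_mask(values, k=1.5):
    """Return True for values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]

def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(categories):
    """Encode a categorical feature as one 0/1 column per level."""
    levels = sorted(set(categories))
    return [[1 if c == level else 0 for level in levels] for c in categories]

# Hypothetical feature with one missing value and one implausible value.
ages = [23, 25, None, 31, 29, 27, 120]
filled = fill_missing_with_median(ages)
mask = iqr_outlier_mask(filled)
inliers = [v for v, is_outlier in zip(filled, mask) if not is_outlier]
scaled = min_max_scale(inliers)
colors = one_hot(["red", "green", "red"])
```

Scaling after removing the outlier matters here: with the value 120 left in, min-max scaling would squeeze every other age into a narrow band near 0.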

Who should take this course?
This course targets aspiring data scientists interested in acquiring hands-on experience with Machine Learning and Artificial Intelligence in a business setting.
What skills should you have?
To get the most out of this course, you should be familiar with programming in a Python development environment and have a fundamental understanding of Calculus, Linear Algebra, Probability, and Statistics.

Syllabus

WEEK 1
A Brief History of Modern AI and its Applications
Artificial Intelligence is not new, but it is new in the sense that it is easier than ever to get started using Machine Learning in business settings. This module gives a quick introduction to AI and Machine Learning and a brief history of modern AI. It also explores some current applications of AI and Machine Learning, so you can think about how to use them in your day-to-day business practice or personal projects.
Retrieving Data, Exploratory Data Analysis, and Feature Engineering
Good data is the fuel that powers Machine Learning and Artificial Intelligence. In this module you will learn how to retrieve data from different sources, how to clean it to ensure its quality, and how to conduct exploratory analysis to visually confirm it is ready for machine learning modeling.
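
As a rough illustration of the retrieve-then-inspect workflow this module describes, the sketch below loads rows from a SQL source and computes a first numeric summary. It is not taken from the course: the in-memory SQLite database, the `orders` table, its columns, and the `summarize` helper are all hypothetical.

```python
import sqlite3
import statistics

# Hypothetical data source: an in-memory SQLite database with a made-up table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("north", 120.0), ("south", 80.5), ("north", None), ("east", 95.0)],
)

# Retrieve the raw rows; NULL amounts come back as None.
rows = conn.execute("SELECT region, amount FROM orders").fetchall()
conn.close()

def summarize(rows):
    """Count rows and missing amounts, and summarize the observed amounts."""
    amounts = [amount for _, amount in rows if amount is not None]
    return {
        "n_rows": len(rows),
        "n_missing_amount": len(rows) - len(amounts),
        "mean_amount": statistics.mean(amounts),
        "min_amount": min(amounts),
        "max_amount": max(amounts),
    }

summary = summarize(rows)
print(summary)
```

A numeric summary like this is usually a first check before plotting: it surfaces missing values and implausible ranges that a chart would also reveal.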

WEEK 2
Inferential Statistics and Hypothesis Testing
Inferential statistics and hypothesis testing are two types of data analysis that are often overlooked in the early stages of analyzing your data. They can give you quick insights into the quality of your data. They also help you confirm business intuition and decide what to analyze next using Machine Learning. This module covers useful definitions and simple examples that will help you get started forming hypotheses around your business problem and testing them.
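
One simple way to test a hypothesis about two groups is a permutation test on the difference in means. The sketch below is an assumption-laden illustration, not the method the course teaches: the function name `permutation_test`, the two samples, and the number of permutations are invented for the example.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_test(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means.

    Null hypothesis: both samples come from the same distribution, so the
    group labels are exchangeable. Returns an estimated p-value.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        # Small tolerance so float rounding does not hide exact ties.
        if diff >= observed - 1e-12:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Hypothetical samples, e.g. a metric measured under two variants.
variant_a = [12.1, 11.8, 12.5, 12.0, 12.3, 11.9]
variant_b = [10.2, 10.5, 9.9, 10.1, 10.4, 10.0]

p_value = permutation_test(variant_a, variant_b)
p_same = permutation_test(variant_a, variant_a)
print(f"p-value (a vs b): {p_value:.4f}")
print(f"p-value (a vs a): {p_same:.4f}")
```

A small p-value suggests the observed difference would be unlikely if the two groups were interchangeable. Comparing a sample with itself gives a p-value of 1, since the observed difference is zero.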

Related Courses

Probabilistic Graphical Models 3: Learning (Coursera)
Stanford University

Probabilistic graphical models (PGMs) are a rich framework for encoding probability distributions over complex domains: joint (multivariate) distributions over large numbers of random variables that interact with each other. These representations sit at the intersection of statistics and computer science, relying on concepts from probability theory, graph algorithms, machine learning, and more. They are the basis for the state-of-the-art methods in a wide variety of applications, such as medical diagnosis, image understanding, speech recognition, natural language processing, and many, many more. They are also a foundational tool in formulating many machine learning problems.

Jun 8th 2026
5-12 Weeks

Introduction to Probability and Data with R (Coursera)
Duke University

This course introduces you to sampling and exploring data, as well as basic probability theory and Bayes' rule. You will examine various types of sampling methods, and discuss how such methods can impact the scope of inference. A variety of exploratory data analysis techniques will be covered, including numeric summary statistics and basic data visualization.

Jun 8th 2026
5-12 Weeks

Inferential and Predictive Statistics for Business (Coursera)
University of Illinois at Urbana-Champaign

This course provides an analytical framework to help you evaluate key problems in a structured fashion and will equip you with tools to better manage the uncertainties that pervade and complicate business processes. The course aims to cover statistical ideas that apply to managers. We will consider two basic themes: first, recognizing and describing the variation present in everything around us, and second, modeling and making decisions in the presence of that variation.

Jun 8th 2026
4 Weeks

Sequence Models (Coursera)
DeepLearning.AI

This course will teach you how to build models for natural language, audio, and other sequence data. Thanks to deep learning, sequence algorithms are working far better than just two years ago, and this is enabling numerous exciting applications in speech recognition, music synthesis, chatbots, machine translation, natural language understanding, and many others.

Jun 8th 2026
3 Weeks

Machine Learning: Classification (Coursera)
University of Washington

Case Studies: Analyzing Sentiment & Loan Default Prediction. In our case study on analyzing sentiment, you will create models that predict a class (positive/negative sentiment) from input features (text of the reviews, user profile information, ...). In our second case study for this course, loan default prediction, you will tackle financial data and predict whether a loan is likely to be risky or safe for the bank.

Jun 8th 2026
5-12 Weeks

Six Sigma Advanced Analyze Phase (Coursera)
University System of Georgia

This course is for you if you want to dive deeper into Six Sigma or to strengthen and expand your knowledge of the basic components of the green belt level of Six Sigma and Lean. Six Sigma skills are widely sought by employers both nationally and internationally. These skills have been proven to help improve business processes and performance. This course will take you deeper into the principles and tools associated with the "Analyze" phase of the DMAIC structure of Six Sigma.

Jun 8th 2026
3 Weeks

Generative AI Essentials: Overview and Impact (Coursera)
University of Michigan

With the rise of generative artificial intelligence, there has been a growing demand to explore how to use these powerful tools not only in our work but also in our day-to-day lives. Generative AI Essentials: Overview and Impact introduces learners to large language models and generative AI tools, like ChatGPT. In this course, you’ll explore generative AI essentials, how to ethically use artificial intelligence, its implications for authorship, and what regulations for generative AI could look like.

Jun 12th 2026
1 Week

Practical Predictive Analytics: Models and Methods (Coursera)
University of Washington

Statistical experiment design and analytics are at the heart of data science. In this course you will design statistical experiments and analyze the results using modern methods. You will also explore the common pitfalls in interpreting statistical arguments, especially those associated with big data. Together, these topics will help you internalize a core set of practical and effective machine learning methods and concepts, and apply them to solve real-world problems.

Jun 8th 2026
4 Weeks

Exploratory Data Analysis (Coursera)
Johns Hopkins University

This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data.

Jun 8th 2026
4 Weeks

Inferential Statistics (Coursera)
University of Amsterdam

Inferential statistics are concerned with making inferences from relations found in a sample to relations in the population. Inferential statistics help us decide, for example, whether the differences between groups that we see in our data are strong enough to support our hypothesis that group differences exist in general, in the entire population. We will start by considering the basic principles of significance testing: the sampling and test statistic distributions, the p-value, the significance level, power, and type I and type II errors. Then we will consider a large number of statistical tests and techniques that help us make inferences for different types of data and different types of research designs.

Jun 8th 2026
5-12 Weeks

Machine Learning Foundations: A Case Study Approach (Coursera)
University of Washington

Do you have data and wonder what it can tell you? Do you need a deeper understanding of the core ways in which machine learning can improve your business? Do you want to be able to converse with specialists about anything from regression and classification to deep learning and recommender systems? In this course, you will get hands-on experience with machine learning through a series of practical case studies.

Jun 8th 2026
5-12 Weeks

Preparing for the Google Cloud Professional Data Engineer Exam (Coursera)
Google Cloud

From the course: "The best way to prepare for the exam is to be competent in the skills required of the job." This course uses a top-down approach to recognize knowledge and skills already known, and to surface information and skill areas for additional preparation. You can use this course to help create your own custom preparation plan. It helps you distinguish what you know from what you don't know. And it helps you develop and practice skills required of practitioners who perform this job.

Jun 13th 2026
5-12 Weeks