Probability & Statistics for Machine Learning & Data Science (Coursera)

Offered by DeepLearning.AI,
Probability & Statistics for Machine Learning & Data Science (Coursera)

Mathematics for Machine Learning and Data science is a foundational online program created in by DeepLearning.AI and taught by Luis Serrano. This beginner-friendly program is where you’ll master the fundamental mathematics toolkit of machine learning.

Class Deals by MOOC List - Click here and see Coursera's Active Discounts, Deals, and Promo Codes.

After completing this course, learners will be able to:
• Describe and quantify the uncertainty inherent in predictions made by machine learning models, using the concepts of probability, random variables, and probability distributions.
• Visually and intuitively understand the properties of commonly used probability distributions in machine learning and data science like Bernoulli, Binomial, and Gaussian distributions
• Apply common statistical methods like maximum likelihood estimation (MLE) and maximum a priori estimation (MAP) to machine learning problems
• Assess the performance of machine learning models using interval estimates and margin of errors
• Apply concepts of statistical hypothesis testing to commonly used tests in data science like AB testing
Many machine learning engineers and data scientists struggle with mathematics. Challenging interview questions often hold people back from leveling up in their careers, and even experienced practitioners can feel held by a lack of math skills.
This specialization uses innovative pedagogy in mathematics to help you learn quickly and intuitively, with courses that use easy-to-follow plugins and visualizations to help you see how the math behind machine learning actually works. Upon completion, you’ll understand the mathematics behind all the most common algorithms and data analysis techniques — plus the know-how to incorporate them into your machine learning career.
Course 3 of 3 in the Mathematics for Machine Learning and Data Science Specialization.

What You Will Learn

  • Describe and quantify the uncertainty inherent in predictions made by machine learning models
  • Visually and intuitively understand the properties of commonly used probability distributions in machine learning and data science
  • Apply common statistical methods like maximum likelihood estimation (MLE) and maximum a priori estimation (MAP) to machine learning problems
  • Assess the performance of machine learning models using interval estimates and margin of errors

Syllabus

WEEK 1
Week 1 - Introduction to Probability and Probability Distributions
In this week, you will learn about probability of events and various rules of probability to correctly do arithmetic with probabilities. You will learn the concept of conditional probability and the key idea behind Bayes theorem. In lesson 2, we generalize the concept of probability of events to probability distribution over random variables. You will learn about some common probability distributions like the Binomial distribution and the Normal distribution.

WEEK 2
Week 2 - Describing probability distributions and probability distributions with multiple variables
This week you will learn about different measures to describe probability distributions as well as any dataset. These include the measures of central tendency (mean, median, and mode), variance, skewness, and kurtosis. The concept of the expected value of a random variable is introduced to understand each of these measures. You will also learn about some visual tools to describe data and distributions. In lesson 2, you will learn about the probability distribution of two or more random variables using concepts like joint distribution, marginal distribution, and conditional distribution. You will end the week by learning about covariance: a generalization of variance to two or more random variables.

WEEK 3
Week 3 - Sampling and Point estimation
This week shifts its focus from probability and statistics. You will start by learning the concept of a sample and a population and two fundamental results from statistics that concern samples and population: the law of large numbers and the central limit theorem. In lesson 2, you will learn the first and the simplest method of estimation in statistics: point estimation. You will see how maximum likelihood estimation, the most common point estimation method, works and how it connects with regularization (technique used to reduce overfitting in machine learning) using Bayes theorem.

WEEK 4
Week 4 - Confidence Intervals and Hypothesis testing
This week you will learn another estimation method called interval estimation. The most common interval estimates are confidence intervals and you will see how they are calculated and how to correctly interpret them. In lesson 2, we cover the third estimation method called hypothesis testing where estimates are formulated as hypothesis and then tested in the presence of available evidence or sample of data. You will learn the concept of p-value that helps in making a decision for a hypothesis test and also learn some common tests like the t-test, two-sample t-test, and the paired t-test. We end the week with an interesting application of hypothesis testing in data science: A/B testing.

Go to Class
MOOC List is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Related Courses

Practical Predictive Analytics: Models and Methods (Coursera) Coursera
University of Washington

Practical Predictive Analytics: Models and Methods (Coursera)

Statistical experiment design and analytics are at the heart of data science. In this course you will design statistical experiments and analyze the results using modern methods. You will also explore the common pitfalls in interpreting statistical arguments, especially those associated with big data. Collectively, this course will help you internalize a core set of practical and effective machine learning methods and concepts, and apply them to solve some real world problems.

Jun 8th 2026
4 Weeks
Learn to code with AI (Coursera) Coursera
Scrimba

Learn to code with AI (Coursera)

Imagine waking up tomorrow as a web developer. What would you want to build? With AI tools like ChatGPT, you're already a developer, regardless of your experience, if you know how to work with them. So in this course, you'll build functional, interactive front-end projects while learning how to write effective prompts and debug and refine your code with the help of AI.

Jun 10th 2026
2 Weeks
Exploratory Data Analysis (Coursera) Coursera
Johns Hopkins University

Exploratory Data Analysis (Coursera)

This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data.

Jun 8th 2026
4 Weeks
Understanding Clinical Research: Behind the Statistics (Coursera) Coursera
University of Cape Town

Understanding Clinical Research: Behind the Statistics (Coursera)

If you’ve ever skipped over`the results section of a medical paper because terms like “confidence interval” or “p-value” go over your head, then you’re in the right place. You may be a clinical practitioner reading research articles to keep up-to-date with developments in your field or a medical student wondering how to approach your own research. Greater confidence in understanding statistical analysis and the results can benefit both working professionals and those undertaking research themselves.

Jun 8th 2026
5-12 Weeks
Advanced Algorithms and Complexity (Coursera) Coursera
University of California, San Diego,Higher School of Economics - HSE University

Advanced Algorithms and Complexity (Coursera)

You've learned the basic algorithms now and are ready to step into the area of more complex problems and algorithms to solve them. Advanced algorithms build upon basic ones and use new ideas. We will start with networks flows which are used in more typical applications such as optimal matchings, finding disjoint paths and flight scheduling as well as more surprising ones like image segmentation in computer vision.

Jun 8th 2026
5-12 Weeks
Machine Learning: Clustering & Retrieval (Coursera) Coursera
University of Washington

Machine Learning: Clustering & Retrieval (Coursera)

Case Studies: Finding Similar Documents. A reader is interested in a specific news article and you want to find similar articles to recommend. What is the right notion of similarity? Moreover, what if there are millions of other documents? Each time you want to a retrieve a new document, do you need to search through all other documents? How do you group similar documents together? How do you discover new, emerging topics that the documents cover?

Jun 8th 2026
5-12 Weeks
Text Retrieval and Search Engines (Coursera) Coursera
University of Illinois at Urbana-Champaign

Text Retrieval and Search Engines (Coursera)

Recent years have seen a dramatic growth of natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media such as blog articles, forum posts, product reviews, and tweets. Text data are unique in that they are usually generated directly by humans rather than a computer system or sensors, and are thus especially valuable for discovering knowledge about people’s opinions and preferences, in addition to many other kinds of knowledge that we encode in text.

Jun 8th 2026
5-12 Weeks
Data Manipulation at Scale: Systems and Algorithms (Coursera) Coursera
University of Washington

Data Manipulation at Scale: Systems and Algorithms (Coursera)

Data analysis has replaced data acquisition as the bottleneck to evidence-based decision making --- we are drowning in it. Extracting knowledge from large, heterogeneous, and noisy datasets requires not only powerful computing resources, but the programming abstractions to use them effectively. The abstractions that emerged in the last decade blend ideas from parallel databases, distributed systems, and programming languages to create a new class of scalable data analytics platforms that form the foundation for data science at realistic scales.

Jun 8th 2026
4 Weeks
Communicating Data Science Results (Coursera) Coursera
University of Washington

Communicating Data Science Results (Coursera)

Making predictions is not enough! Effective data scientists know how to explain and interpret their results, and communicate findings accurately to stakeholders to inform business decisions. Visualization is the field of research in computer science that studies effective communication of quantitative results by linking perception, cognition, and algorithms to exploit the enormous bandwidth of the human visual cortex. In this course you will learn to recognize, design, and use effective visualizations.

Jun 8th 2026
3 Weeks
Statistical Inference (Coursera) Coursera
Johns Hopkins University

Statistical Inference (Coursera)

Statistical inference is the process of drawing conclusions about populations or scientific truths from data. There are many modes of performing inference including statistical modeling, data oriented strategies and explicit use of designs and randomization in analyses. Furthermore, there are broad theories (frequentists, Bayesian, likelihood, design based, …) and numerous complexities (missing data, observed and unobserved confounding, biases) for performing inference.

Jun 8th 2026
4 Weeks
Introduction to Recommender Systems: Non-Personalized and Content-Based (Coursera) Coursera
University of Minnesota

Introduction to Recommender Systems: Non-Personalized and Content-Based (Coursera)

This course, which is designed to serve as the first course in the Recommender Systems specialization, introduces the concept of recommender systems, reviews several examples in detail, and leads you through non-personalized recommendation using summary statistics and product associations, basic stereotype-based or demographic recommendations, and content-based filtering recommendations.

Jun 8th 2026
4 Weeks
Linear Regression and Modeling (Coursera) Coursera
Duke University

Linear Regression and Modeling (Coursera)

This course introduces simple and multiple linear regression models. These models allow you to assess the relationship between variables in a data set and a continuous response variable. Is there a relationship between the physical attractiveness of a professor and their student evaluation scores? Can we predict the test score for a child based on certain characteristics of his or her mother? In this course, you will learn the fundamental theory behind linear regression and, through data examples, learn to fit, examine, and utilize regression models to examine relationships between multiple variables, using the free statistical software R and RStudio.

Jun 8th 2026
4 Weeks