Probability Theory: Foundation for Data Science (Coursera)

Understand the foundations of probability and its relationship to statistics and data science. We’ll learn what it means to calculate a probability, independent and dependent outcomes, and conditional events. We’ll study discrete and continuous random variables and see how this fits with data collection. We’ll end the course with Gaussian (normal) random variables and the Central Limit Theorem and understand its fundamental importance for all of statistics and data science.

This course can be taken for academic credit as part of CU Boulder’s Master of Science in Data Science (MS-DS) degree offered on the Coursera platform. The MS-DS is an interdisciplinary degree that brings together faculty from CU Boulder’s departments of Applied Mathematics, Computer Science, Information Science, and others. With performance-based admissions and no application process, the MS-DS is ideal for individuals with a broad range of undergraduate education and/or professional experience in computer science, information science, mathematics, and statistics.

Course 1 of 3 in the Statistical Inference for Data Science Applications Specialization

What You Will Learn

  • Explain why probability is important to statistics and data science.
  • See the relationship between conditional and independent events in a statistical experiment.
  • Calculate the expectation and variance of several random variables and develop some intuition.

Syllabus

WEEK 1
Descriptive Statistics and the Axioms of Probability
Understand the foundation of probability and its relationship to statistics and data science. We’ll learn what it means to calculate a probability, independent and dependent outcomes, and conditional events. We’ll study discrete and continuous random variables and see how this fits with data collection. We’ll end the course with Gaussian (normal) random variables and the Central Limit Theorem and understand its fundamental importance for all of statistics and data science.

WEEK 2
Conditional Probability
“Conditional probability” is a very useful concept from probability theory. In this module we introduce the idea of “conditioning” and Bayes’ Formula. The fundamental concept of an “independent event” then arises naturally from the notion of conditioning. Conditional and independent events are fundamental concepts in understanding statistical results.
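
As an informal illustration of the ideas named in this module (not course material), the sketch below applies Bayes' Formula and checks independence by comparing P(A and B) with P(A)P(B). The function names and the screening-test numbers are hypothetical, chosen only for the example.

```python
from fractions import Fraction


def bayes_posterior(prior, p_b_given_a, p_b_given_not_a):
    """Return P(A | B) using Bayes' Formula.

    The denominator P(B) comes from the law of total probability.
    """
    p_b = p_b_given_a * prior + p_b_given_not_a * (1 - prior)
    return p_b_given_a * prior / p_b


def prob(event, outcomes):
    """Probability of an event over equally likely outcomes."""
    outcomes = list(outcomes)
    return Fraction(sum(1 for x in outcomes if event(x)), len(outcomes))


# Hypothetical screening test: 1% prevalence, 95% sensitivity,
# 5% false-positive rate (assumed numbers, not from the course).
posterior = bayes_posterior(Fraction(1, 100), Fraction(95, 100), Fraction(5, 100))

# Independence on one roll of a fair die:
# A = "roll is even", B = "roll is 1 or 2".
die = range(1, 7)
p_a = prob(lambda x: x % 2 == 0, die)
p_b = prob(lambda x: x <= 2, die)
p_a_and_b = prob(lambda x: x % 2 == 0 and x <= 2, die)
independent = p_a_and_b == p_a * p_b

print(posterior, float(posterior), independent)
```

With these assumed numbers the posterior is 19/118, about 0.16, even though the test is 95% sensitive, because the condition is rare. The two die events are independent since 1/6 = 1/2 × 1/3.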

WEEK 3
Discrete Random Variables
The concept of a “random variable” (r.v.) is fundamental and often used in statistics. In this module we’ll study various named discrete random variables. We’ll learn some of their properties and why they are important. We’ll also calculate the expectation and variance for these random variables.
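
A minimal sketch of the expectation and variance calculations mentioned above, using the binomial distribution as one example of a named discrete random variable. The helper names are hypothetical, and the choice of n = 10 and p = 0.3 is arbitrary.

```python
from math import comb


def binomial_pmf(n, p):
    """Probability mass function of Binomial(n, p) as a dict {k: P(X = k)}."""
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}


def expectation(pmf):
    """E[X] = sum of x * P(X = x) over all values x."""
    return sum(x * px for x, px in pmf.items())


def variance(pmf):
    """Var(X) = E[(X - E[X])^2]."""
    mu = expectation(pmf)
    return sum((x - mu) ** 2 * px for x, px in pmf.items())


pmf = binomial_pmf(10, 0.3)
mu = expectation(pmf)
var = variance(pmf)
print(mu, var)
```

Up to floating-point rounding, the results agree with the closed forms np = 3 and np(1 - p) = 2.1.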

WEEK 4
Continuous Random Variables
In this module, we’ll extend our definition of random variables to include continuous random variables. The concepts in this unit are crucial since a substantial portion of statistics deals with the analysis of continuous random variables. We’ll begin with uniform and exponential random variables and then study Gaussian, or normal, random variables.
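
For continuous random variables, probabilities are assigned to intervals rather than to single values. The sketch below, which is an illustration and not taken from the course, computes P(lo < X ≤ hi) as a difference of cumulative distribution function values for the three families named above. The helper names and parameter values are assumptions.

```python
from math import exp
from statistics import NormalDist


def uniform_cdf(x, a, b):
    """CDF of Uniform(a, b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def exponential_cdf(x, rate):
    """CDF of an exponential random variable with the given rate."""
    return 1.0 - exp(-rate * x) if x > 0 else 0.0


def interval_probability(cdf, lo, hi):
    """P(lo < X <= hi) = F(hi) - F(lo)."""
    return cdf(hi) - cdf(lo)


p_uniform = interval_probability(lambda x: uniform_cdf(x, 0, 10), 2, 5)
p_exponential = interval_probability(lambda x: exponential_cdf(x, 2.0), 0, 0.5)
p_normal = interval_probability(NormalDist(0, 1).cdf, -1, 1)
print(p_uniform, p_exponential, p_normal)
```

The last value is the familiar result that roughly 68% of a standard normal distribution lies within one standard deviation of the mean.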

WEEK 5
Joint Distributions and Covariance
The power of statistics lies in being able to study the outcomes and effects of multiple random variables (sometimes referred to as “data”). In this module, we’ll learn about the concept of a “joint distribution”, which allows us to generalize probability theory to the multivariate case.
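
To make the idea of a joint distribution concrete, here is a small sketch that stores a joint probability mass function for two random variables as a dictionary and computes their covariance as E[XY] - E[X]E[Y]. The table of probabilities is made up for the example.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X, Y); keys are (x, y), values are P(X = x, Y = y).
joint = {
    (0, 0): Fraction(2, 5),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(2, 5),
}


def expected(joint_pmf, g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint_pmf.items())


def covariance(joint_pmf):
    """Cov(X, Y) = E[XY] - E[X] E[Y]."""
    e_x = expected(joint_pmf, lambda x, y: x)
    e_y = expected(joint_pmf, lambda x, y: y)
    e_xy = expected(joint_pmf, lambda x, y: x * y)
    return e_xy - e_x * e_y


cov = covariance(joint)
print(cov)
```

Here the covariance is 3/20. It is positive because X and Y tend to take the same value in this table.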

WEEK 6
Central Limit Theorem
The Central Limit Theorem (CLT) is a crucial result used in the analysis of data. In this module, we’ll introduce the CLT and its applications, such as characterizing the distribution of the mean of a large data set. This will set the stage for the next course.
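
The CLT can be explored with a short simulation. The sketch below is an illustration under assumed settings (exponential data with rate 1, samples of size 50, 2000 replicates, a fixed seed), not an exercise from the course. It draws many samples from a skewed distribution and summarizes the distribution of their means.

```python
import random
from statistics import mean, stdev


def sample_means(n, replicates, rng):
    """Means of `replicates` independent samples of size n from Exponential(1)."""
    return [
        sum(rng.expovariate(1.0) for _ in range(n)) / n
        for _ in range(replicates)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 50
means = sample_means(n, 2000, rng)

center = mean(means)          # should be near the population mean, 1
spread = stdev(means)         # should be near 1 / sqrt(n)
predicted_spread = 1 / n**0.5

print(center, spread, predicted_spread)
```

Although individual exponential draws are strongly skewed, the sample means cluster around 1 with a spread close to 1/sqrt(50), which is what the CLT predicts for the mean of a large sample.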

Related Courses

Neural Networks and Deep Learning (Coursera)
DeepLearning.AI

If you want to break into cutting-edge AI, this course will help you do so. Deep learning engineers are highly sought after, and mastering deep learning will give you numerous new career opportunities. Deep learning is also a new "superpower" that will let you build AI systems that just weren't possible a few years ago. In this course, you will learn the foundations of deep learning.

Jun 22nd 2026
4 Weeks

Statistical Inference (Coursera)
Johns Hopkins University

Statistical inference is the process of drawing conclusions about populations or scientific truths from data. There are many modes of performing inference, including statistical modeling, data-oriented strategies, and explicit use of designs and randomization in analyses. Furthermore, there are broad theories (frequentist, Bayesian, likelihood, design-based, …) and numerous complexities (missing data, observed and unobserved confounding, biases) for performing inference.

Jun 22nd 2026
4 Weeks

Practical Predictive Analytics: Models and Methods (Coursera)
University of Washington

Statistical experiment design and analytics are at the heart of data science. In this course you will design statistical experiments and analyze the results using modern methods. You will also explore the common pitfalls in interpreting statistical arguments, especially those associated with big data. Collectively, this course will help you internalize a core set of practical and effective machine learning methods and concepts, and apply them to solve some real world problems.

Jun 22nd 2026
4 Weeks

The Data Scientist's Toolbox (Coursera)
Johns Hopkins University

In this course you will get an introduction to the main tools and ideas in the data scientist's toolbox. The course gives an overview of the data, questions, and tools that data analysts and data scientists work with. There are two components to this course. The first is a conceptual introduction to the ideas behind turning data into actionable knowledge. The second is a practical introduction to the tools that will be used in the program like version control, markdown, git, GitHub, R, and RStudio.

Jun 22nd 2026
4 Weeks

Linear Regression and Modeling (Coursera)
Duke University

This course introduces simple and multiple linear regression models. These models allow you to assess the relationship between variables in a data set and a continuous response variable. Is there a relationship between the physical attractiveness of a professor and their student evaluation scores? Can we predict the test score for a child based on certain characteristics of his or her mother? In this course, you will learn the fundamental theory behind linear regression and, through data examples, learn to fit, examine, and utilize regression models to examine relationships between multiple variables, using the free statistical software R and RStudio.

Jun 22nd 2026
4 Weeks

Six Sigma Advanced Define and Measure Phases (Coursera)
University System of Georgia

This course is for you if you are looking to dive deeper into Six Sigma or to strengthen and expand your knowledge of the basic components of the green belt level of Six Sigma and Lean. Six Sigma skills are widely sought by employers both nationally and internationally. These skills have been proven to help improve business processes and performance. This course will take you deeper into the principles and tools associated with the "Define" and "Measure" phases of the DMAIC structure of Six Sigma.

Jun 22nd 2026
5-12 Weeks

Introducción a Data Science: Programación Estadística con R (Coursera)
Universidad Nacional Autónoma de México

This course will give you the foundations of the statistical programming language R, the lingua franca of statistics, which will let you write programs that read, manipulate, and analyze quantitative data. We will explain how to install the language; you will also see an introduction to the base graphics systems and to the ggplot2 plotting package for visualizing these data. In addition, you will learn to use RStudio, one of the most popular IDEs in the R user community.

Jun 22nd 2026
4 Weeks

Experimentation for Improvement (Coursera)
McMaster University

We are always using experiments to improve our lives, our community, and our work. Are you doing it efficiently? Or are you (incorrectly) changing one thing at a time and hoping for the best? In this course, you will learn how to plan efficient experiments - testing with many variables. Our goal is to find the best results using only a few experiments. A key part of the course is how to optimize a system.

Jun 22nd 2026
5-12 Weeks

Bayesian Statistics: From Concept to Data Analysis (Coursera)
University of California, Santa Cruz

This course introduces the Bayesian approach to statistics, starting with the concept of probability and moving to the analysis of data. We will learn about the philosophy of the Bayesian approach as well as how to implement it for common types of data. We will compare the Bayesian approach to the more commonly taught Frequentist approach, and see some of the benefits of the Bayesian approach.

Jun 22nd 2026
4 Weeks

Think Again III: How to Reason Inductively (Coursera)
Duke University

Want to solve a murder mystery? What caused your computer to fail? Who can you trust in your everyday life? In this course, you will learn how to analyze and assess five common forms of inductive arguments: generalizations from samples, applications of generalizations, inference to the best explanation, arguments from analogy, and causal reasoning. The course closes by showing how you can use probability to help make decisions of all sorts.

Jun 22nd 2026
4 Weeks