Probability Theory, Statistics and Exploratory Data Analysis (Coursera)

Exploring Data Science requires a certain background in probability and statistics. This course introduces you to the necessary parts of probability theory and statistics, guiding you from the very basics all the way up to the level required to jump-start your ascent in Data Science.

The core concept of the course is the random variable, i.e. a variable whose values are determined by a random experiment. Random variables are used as models of the data-generating processes we want to study. Properties of the data are deeply linked to the corresponding properties of random variables, such as expected value, variance and correlation. Dependencies between random variables are the crucial factor that allows us to predict unknown quantities from known values, which forms the basis of supervised machine learning. We begin with the notions of independent events and conditional probability, then introduce the two main classes of random variables, discrete and continuous, and study their properties. Finally, we learn about different types of data and their connection with random variables.
While introducing you to the theory, we'll pay special attention to the practical aspects of working with probabilities, sampling, data analysis, and data visualization in Python.
Course 4 of 4 in the Mathematics for Data Science Specialization.

Syllabus

WEEK 1
Conditional probability and Independence
This week we discuss conditional probability and independence of events. Sometimes we can use these definitions to find probabilities; at other times we check whether the definition of independence holds in order to determine whether events are independent. We discuss the important law of total probability, which allows us to find the probability of an event when we know its conditional probabilities under a set of hypotheses and the probabilities of those hypotheses. We also discuss Bayes's rule, which allows us to find the probability of a hypothesis given that some event occurred. We demonstrate how Python can be used to calculate conditional probabilities and check independence of events.
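
As a rough illustration of these ideas (not taken from the course materials), the sketch below enumerates the outcomes of two fair dice and uses exact fractions to check independence, apply the law of total probability, and apply Bayes's rule. The dice example and the helper names `prob`, `cond_prob` and `independent` are assumptions made for this sketch.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered outcomes of two fair dice (an illustrative choice).
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """P(event) under the uniform distribution on `outcomes`."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))


def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(lambda o: event(o) and given(o)) / prob(given)


def independent(a, b):
    """Events are independent when P(a and b) == P(a) * P(b)."""
    return prob(lambda o: a(o) and b(o)) == prob(a) * prob(b)


def first_is_six(o):
    return o[0] == 6


def sum_is_seven(o):
    return o[0] + o[1] == 7


def sum_at_least_ten(o):
    return o[0] + o[1] >= 10


# Law of total probability, with hypotheses "the first die shows k".
total = sum(
    cond_prob(sum_at_least_ten, lambda o, k=k: o[0] == k)
    * prob(lambda o, k=k: o[0] == k)
    for k in range(1, 7)
)

# Bayes's rule: P(first is six | sum >= 10).
posterior = (
    cond_prob(sum_at_least_ten, first_is_six) * prob(first_is_six)
    / prob(sum_at_least_ten)
)

print(independent(first_is_six, sum_is_seven))      # True
print(independent(first_is_six, sum_at_least_ten))  # False
print(total, posterior)                             # 1/6 1/2
```

Using `Fraction` keeps every probability exact, so the independence check can rely on equality rather than a floating-point tolerance.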

WEEK 2
Random variables
A random variable denotes a value that depends on the result of some random experiment. Some natural examples of random variables come from gambling and lotteries. There are two main classes of random variables that we will consider in this course. This week we'll study discrete random variables, which take a finite or countable number of values. Discrete random variables can be described by their distribution. We'll consider various discrete distributions, introduce the notions of expected value and variance, and learn to generate and visualize discrete random variables with Python.
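
A minimal sketch of what this can look like in Python, assuming a fair die as the example distribution (the course's own examples may differ): the distribution is stored as a value-to-probability mapping, expected value and variance are computed from it, and a sample is generated and shown as a crude text histogram.

```python
import random
from collections import Counter
from fractions import Fraction

# Distribution of a fair die: value -> probability (illustrative example).
pmf = {k: Fraction(1, 6) for k in range(1, 7)}


def expected_value(pmf):
    """E[X] = sum of x * P(X = x)."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var(X) = E[(X - E[X])^2]."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


# Generate a sample from the distribution (fixed seed for reproducibility).
rng = random.Random(0)
values = list(pmf)
weights = [float(p) for p in pmf.values()]
sample = rng.choices(values, weights=weights, k=100_000)

sample_mean = sum(sample) / len(sample)
frequencies = Counter(sample)

print(expected_value(pmf), variance(pmf))  # 7/2 35/12
print(sample_mean)

# Text histogram of relative frequencies; each '#' is about 1% of the sample.
for value in sorted(frequencies):
    print(value, "#" * round(100 * frequencies[value] / len(sample)))
```

The sample mean should land close to the expected value, which is the link between a random variable and data generated from it.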

WEEK 3
Systems of random variables; properties of expectation and variance, covariance and correlation.
Several random variables associated with the same random experiment constitute a system of random variables. To describe a system of discrete random variables, one can use the joint distribution, which takes into account all possible combinations of values that the random variables may take. We'll find some joint distributions, investigate their properties and introduce independence of random variables. Then we'll discuss properties of expected value and variance with respect to arithmetic operations and introduce measures of dependence between random variables: covariance and correlation.
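
To make the joint distribution concrete, here is a small sketch under assumed example variables: X is the first of two fair dice and S is their sum. It builds the joint distribution, derives the marginals, tests independence, and computes covariance and correlation. The variable choice and helper names are illustrative, not the course's.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

# Joint distribution of X (first die) and S (sum of two fair dice).
joint = {}
for a, b in product(range(1, 7), repeat=2):
    key = (a, a + b)
    joint[key] = joint.get(key, 0) + Fraction(1, 36)


def marginal(joint, index):
    """Marginal distribution of the component at position `index`."""
    out = {}
    for values, p in joint.items():
        out[values[index]] = out.get(values[index], 0) + p
    return out


def expectation(joint, f):
    """E[f(X, S)] computed from the joint distribution."""
    return sum(f(x, s) * p for (x, s), p in joint.items())


px, ps = marginal(joint, 0), marginal(joint, 1)

# Independence would require joint == product of marginals everywhere.
are_independent = all(
    joint.get((x, s), 0) == px[x] * ps[s] for x in px for s in ps
)

ex = expectation(joint, lambda x, s: x)
es = expectation(joint, lambda x, s: s)
var_x = expectation(joint, lambda x, s: (x - ex) ** 2)
var_s = expectation(joint, lambda x, s: (s - es) ** 2)
cov = expectation(joint, lambda x, s: (x - ex) * (s - es))
corr = float(cov) / sqrt(float(var_x) * float(var_s))

print(are_independent)  # False
print(ex, es, cov)      # 7/2 7 35/12
print(corr)
```

Here S is the sum of two independent dice, so E[S] is twice E[X] and Var(S) is twice Var(X), which matches the properties of expectation and variance under addition.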

WEEK 4
Continuous random variables
This week we'll study continuous random variables, which constitute an important data type in statistics and data analysis. For continuous random variables we'll define the probability density function (PDF) and the cumulative distribution function (CDF), see how they are linked, and see how sampling from a random variable may be used to approximate its PDF. We'll introduce expected value, variance, covariance and correlation for continuous random variables and discuss their properties. Finally, we'll use Python to generate independent and correlated continuous random variables.
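
The following sketch, which assumes standard normal variables and a target correlation of 0.8 purely for illustration, shows one way to approximate a PDF from a sample, relate the PDF to the CDF, and generate correlated variables by mixing independent ones. It uses only the standard library; the course itself may use other tools.

```python
import random
from math import sqrt
from statistics import NormalDist

rng = random.Random(0)
n = 100_000
rho = 0.8  # target correlation (illustrative value)

# Independent standard normal samples.
x = [rng.gauss(0, 1) for _ in range(n)]
z = [rng.gauss(0, 1) for _ in range(n)]
# Mixing x with independent noise z gives y with corr(x, y) close to rho.
y = [rho * a + sqrt(1 - rho ** 2) * b for a, b in zip(x, z)]


def histogram_density(sample, left, right):
    """Fraction of the sample in [left, right), divided by the bin width."""
    count = sum(1 for v in sample if left <= v < right)
    return count / (len(sample) * (right - left))


def pearson(u, v):
    """Sample Pearson correlation coefficient."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / sqrt(var_u * var_v)


std_normal = NormalDist(0, 1)

# One histogram bin around 0 approximates the PDF at 0.
estimate = histogram_density(x, -0.1, 0.1)
# The PDF is the derivative of the CDF, so a CDF difference over a
# narrow interval divided by its width also approximates the PDF.
cdf_slope = (std_normal.cdf(0.1) - std_normal.cdf(-0.1)) / 0.2

print(estimate, cdf_slope, std_normal.pdf(0.0))
print(pearson(x, z), pearson(x, y))
```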

WEEK 5
From random variables to statistical data. Data summarization and descriptive statistics.
This week we'll introduce types of statistical data and discuss models that are used to pass from statistical data to random variables. We'll introduce descriptive statistics of sample data, such as various measures of central tendency and statistical dispersion, and find correspondences between properties of random variables (population) and the sample descriptive statistics, which are essential for statistical predictions. We’ll talk about visualization of statistical data and learn to work with it in Python.
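
As a small illustration of the descriptive statistics mentioned here, the sketch below computes measures of central tendency and dispersion with Python's standard `statistics` module. The data set is hypothetical, chosen only so that the numbers come out round.

```python
import statistics

# Hypothetical sample, chosen so that the results are easy to check by hand.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency.
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of dispersion.
data_range = max(data) - min(data)
pop_variance = statistics.pvariance(data)    # divides by n
sample_variance = statistics.variance(data)  # divides by n - 1
sample_std = statistics.stdev(data)
quartiles = statistics.quantiles(data, n=4)  # three cut points

print(mean, median, mode)
print(data_range, pop_variance, sample_variance, sample_std)
print(quartiles)
```

The sample variance divides by n - 1 rather than n, which makes it an unbiased estimate of the variance of the underlying random variable; this is one of the correspondences between sample statistics and population properties.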

WEEK 6
Correlations and visualizations
This week we’ll consider correlation in statistical data and find out how it's related to the level of dependence within the data and what it means for scatter plots. We’ll consider several types of correlation suitable for different types of data and discuss the difference between correlation and causation. Finally, we’ll learn to visualize dependence between numeric variables and calculate correlation with Python.
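
The description does not name the correlation types covered, so the sketch below assumes two common ones, Pearson and Spearman (rank) correlation, and compares them on made-up data with a monotone but nonlinear relationship. The `ranks` helper assumes no tied values, for simplicity.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation: measures linear dependence."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / sqrt(var_u * var_v)


def ranks(values):
    """Rank of each value (1 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def spearman(u, v):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(u), ranks(v))


# Made-up data: y grows monotonically with x, but not linearly.
x = list(range(1, 11))
y = [a ** 3 for a in x]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(r_pearson, r_spearman)
```

Because y always increases with x, the rank correlation is 1, while the Pearson coefficient stays below 1 since the points do not lie on a straight line.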

Related Courses

A Crash Course in Causality: Inferring Causal Effects from Observational Data (Coursera)
University of Pennsylvania

We have all heard the phrase “correlation does not equal causation.” What, then, does equal causation? This course aims to answer that question and more! Over a period of 5 weeks, you will learn how causal effects are defined, what assumptions about your data and models are necessary, and how to implement and interpret some popular statistical methods. Learners will have the opportunity to apply these methods to example data in R (free statistical software environment).

Jun 22nd 2026
5-12 Weeks

Deep Learning and Reinforcement Learning (Coursera)
IBM

This course introduces you to two of the most sought-after disciplines in Machine Learning: Deep Learning and Reinforcement Learning. Deep Learning is a subset of Machine Learning that has applications in both Supervised and Unsupervised Learning, and is frequently used to power most of the AI applications that we use on a daily basis. First you will learn about the theory behind Neural Networks, which are the basis of Deep Learning, as well as several modern architectures of Deep Learning.

Jun 22nd 2026
5-12 Weeks

Differential Equations for Engineers (Coursera)
The Hong Kong University of Science and Technology - HKUST

This course is about differential equations and covers material that all engineers should know. Both basic theory and applications are taught. In the first five weeks we will learn about ordinary differential equations, and in the final week, partial differential equations. The course is composed of 56 short lecture videos, with a few simple problems to solve following each lecture. And after each substantial topic, there is a short practice quiz. Solutions to the problems and practice quizzes can be found in instructor-provided lecture notes. There are a total of six weeks in the course, and at the end of each week there is an assessed quiz.

Jun 22nd 2026
5-12 Weeks

Bayesian Statistics: From Concept to Data Analysis (Coursera)
University of California, Santa Cruz

This course introduces the Bayesian approach to statistics, starting with the concept of probability and moving to the analysis of data. We will learn about the philosophy of the Bayesian approach as well as how to implement it for common types of data. We will compare the Bayesian approach to the more commonly-taught Frequentist approach, and see some of the benefits of the Bayesian approach.

Jun 22nd 2026
4 Weeks

Estadística aplicada a los negocios (Coursera)
Universidad Austral

Decision-making is at the heart of business. Managing means making decisions, often under pressure, with disorganized information and in a context of uncertainty. A basic aspect is understanding and analyzing information, and organizing data in a way that facilitates its later use and decision-making.

Jun 22nd 2026
4 Weeks

Mathematical Foundations for Cryptography (Coursera)
University of Colorado System

Welcome to Course 2 of Introduction to Applied Cryptography. In this course, you will be introduced to basic mathematical principles and functions that form the foundation for cryptographic and cryptanalysis methods. These principles and functions will be helpful in understanding symmetric and asymmetric cryptographic methods examined in Course 3 and Course 4. These topics should prove especially useful to you if you are new to cybersecurity. It is recommended that you have a basic knowledge of computer science and basic math skills such as algebra and probability.

Jun 22nd 2026
4 Weeks

Regression Models (Coursera)
Johns Hopkins University

Linear models, as their name implies, relate an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares and inference using regression models.

Jun 22nd 2026
4 Weeks

Think Again III: How to Reason Inductively (Coursera)
Duke University

Want to solve a murder mystery? What caused your computer to fail? Who can you trust in your everyday life? In this course, you will learn how to analyze and assess five common forms of inductive arguments: generalizations from samples, applications of generalizations, inference to the best explanation, arguments from analogy, and causal reasoning. The course closes by showing how you can use probability to help make decisions of all sorts.

Jun 22nd 2026
4 Weeks