Introduction to Bayesian Data Analysis (openHPI)

Bayesian data analysis is increasingly becoming the tool of choice for many data-analysis problems. This free course on Bayesian data analysis will teach you basic ideas about random variables and probability distributions, Bayes' rule, and its application in simple data analysis problems. You will learn to use the R package brms (which is a front-end for the probabilistic programming language Stan). The focus will be on regression modeling, culminating in a brief introduction to hierarchical models (otherwise known as mixed or multilevel models). This course is appropriate for anyone familiar with the programming language R and for anyone who has done some frequentist data analysis (e.g., linear modeling and/or linear mixed modeling) in the past.

Introduction: Why are Bayesian methods important for data analysts?
Here are some of the advantages of Bayesian methods over the standard frequentist approach used in data analysis:

  • Prior knowledge/expertise can be incorporated into the data analysis
  • Models can be flexibly specified to reflect the assumed generative process
  • The results of the analysis – the posterior distributions of the parameters of interest – have an intuitive interpretation
  • Hypothesis testing can be carried out in a more meaningful manner than with the commonly used null hypothesis significance testing

Prerequisites: Who is this course for?
We assume the following in this course:

  • Basic familiarity with the programming language R (openHPI offers a free R course for beginners, in German)
  • Experience with data analysis using linear models
  • It is helpful (but not necessary) to have had some exposure to linear mixed models using the R library lme4
  • High-school mathematics (pre-calculus)
  • Some basic concepts from probability theory (sum and product rule, conditional probability)

This course is not appropriate for participants who don't know R programming or who have no experience at all with data analysis.

Course outcomes: What will you learn from this course?

  • Some basic ideas relating to random variables
  • Some fundamental properties of probability distributions
  • Application of Bayes' rule in data analysis
  • The concept of likelihood and its role in Bayesian statistical modeling
  • Bayesian regression models using brms (a front-end for Stan)
  • How to visualize and interpret prior and posterior distributions
  • How to generate prior and posterior predictive distributions for evaluating models
  • How to interpret the results of simple regression models

After completing this course, you will be in a good position to learn how to use more advanced Bayesian methods, such as hierarchical models, finite mixture models, multinomial processing tree models, measurement error models, etc.

What you'll learn

  • Bayesian statistics
  • Data analysis
  • Bayesian regression models using brms

Course contents

Week 0 - Initial Setup:
Install R, RStudio, rstan, brms, and other necessary R packages; set up R Markdown for reproducible data analyses.

Week 1 - Introduction:
Learn the foundational ideas about random variables and probability distributions; Reading: Chapter 1 of the textbook (excluding the section on bivariate distributions).

Week 2 - Bayesian data analysis:
Understand Bayes' rule; derive the posterior using Bayes' rule; visualize the prior, likelihood, and posterior; understand the relationship between the prior, likelihood, and posterior; incorporate prior knowledge into the analysis; Reading: Chapter 2.
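
As a rough illustration of what "deriving the posterior using Bayes' rule" means, the sketch below approximates a posterior on a grid. It is not taken from the course materials, which work in R. The data (8 successes in 10 trials), the flat prior, and the function names `grid_posterior` and `flat_prior` are all assumptions made up for this example, and plain Python is used only to keep it self-contained.

```python
from math import comb

def grid_posterior(successes, trials, prior, grid_size=1001):
    """Approximate the posterior of a success probability theta on a grid.

    Bayes' rule: posterior is proportional to prior times likelihood.
    """
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    # Binomial likelihood of the observed data at each candidate theta
    likelihood = [
        comb(trials, successes) * t**successes * (1 - t) ** (trials - successes)
        for t in thetas
    ]
    unnormalized = [prior(t) * lik for t, lik in zip(thetas, likelihood)]
    # Normalize so the grid probabilities sum to 1
    total = sum(unnormalized)
    posterior = [u / total for u in unnormalized]
    return thetas, posterior

def flat_prior(theta):
    """Hypothetical prior: every value of theta is equally plausible."""
    return 1.0

# Made-up data for illustration: 8 successes in 10 trials
thetas, posterior = grid_posterior(successes=8, trials=10, prior=flat_prior)
posterior_mean = sum(t * p for t, p in zip(thetas, posterior))
print(round(posterior_mean, 3))
```

With a flat prior and binomial data, the exact posterior is a Beta(9, 3) distribution with mean 9/12 = 0.75, so the grid result can be checked against that value. Swapping `flat_prior` for a more informative function is one way to see how prior knowledge shifts the posterior.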

Week 3 - Computational Bayesian data analysis:
Derive the posterior through sampling; perform simple regression modeling of a simple button-pressing task using Stan/brms; generate prior predictive distributions, carry out sensitivity analyses, and explore different classes of priors; generate posterior predictive distributions; derive the log-normal likelihood; Reading: Chapter 3.
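
The idea of obtaining a posterior through sampling, and then a posterior predictive distribution from those samples, can be sketched as follows. This is a toy under stated assumptions, not the course's method: the course uses Stan via brms, whose sampler is considerably more sophisticated, whereas a simple random-walk Metropolis sampler is swapped in here for brevity. The binomial data (8 successes in 10 trials), the flat prior, the tuning values, and all function names are hypothetical.

```python
import math
import random

def log_unnormalized_posterior(theta, successes, trials):
    """Log of (flat prior times binomial likelihood), up to a constant."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior mass outside (0, 1)
    return successes * math.log(theta) + (trials - successes) * math.log(1.0 - theta)

def metropolis(successes, trials, n_samples=20000, step=0.2, seed=1):
    """Random-walk Metropolis sampler for theta (illustrative settings)."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_unnormalized_posterior(theta, successes, trials)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_unnormalized_posterior(proposal, successes, trials)
        # Accept with probability min(1, posterior ratio); 1 - random() avoids log(0)
        if math.log(1.0 - rng.random()) < candidate - current:
            theta, current = proposal, candidate
        samples.append(theta)
    return samples

def posterior_predictive(samples, trials, seed=2):
    """Simulate one new dataset (number of successes) per posterior sample."""
    rng = random.Random(seed)
    return [sum(rng.random() < theta for _ in range(trials)) for theta in samples]

# Made-up data for illustration: 8 successes in 10 trials
draws = metropolis(successes=8, trials=10)[2000:]  # discard early draws as warm-up
sample_mean = sum(draws) / len(draws)
predicted = posterior_predictive(draws, trials=10)
print(round(sample_mean, 2), round(sum(predicted) / len(predicted), 1))
```

The sample mean should land close to 0.75, the mean of the exact Beta(9, 3) posterior for this toy problem. Comparing the simulated success counts in `predicted` with the observed count is the basic logic behind a posterior predictive check.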

Week 4 - Bayesian regression and hierarchical models:
Perform simple linear regressions using the normal and binomial likelihoods to answer the following research questions: (i) Does attentional load affect pupil size? (ii) Does trial id affect response times? (iii) Does set size affect recall accuracy? Take a brief look ahead at linear mixed models; Reading: Chapter 4 and Chapter 5 up to section 5.3.

Final Exam

