Reproducible Research (Coursera)

Reproducible Research (Coursera)

This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations.

Class Deals by MOOC List - Click here and see Coursera's Active Discounts, Deals, and Promo Codes.

Reproducibility allows for people to focus on the actual content of a data analysis, rather than on superficial details reported in a written summary. In addition, reproducibility makes an analysis more useful to others because the data and code that actually conducted the analysis are available. This course will focus on literate statistical analysis tools which allow one to publish data analyses in a single document that allows others to easily execute the same analysis to obtain the same results.
Completing this course will count towards your learning in any of the following programs:

What You Will Learn

  • Organize data analysis to help make it more reproducible
  • Write up a reproducible data analysis using knitr
  • Determine the reproducibility of analysis project
  • Publish reproducible web documents using Markdown

Syllabus

WEEK 1
Concepts, Ideas, & Structure
This week will cover the basic ideas of reproducible research since they may be unfamiliar to some of you. We also cover structuring and organizing a data analysis to help make it more reproducible. I recommend that you watch the videos in the order that they are listed on the web page, but watching the videos out of order isn't going to ruin the story.

WEEK 2
Markdown & knitr
This week we cover some of the core tools for developing reproducible documents. We cover the literate programming tool knitr and show how to integrate it with Markdown to publish reproducible web documents. We also introduce the first peer assessment which will require you to write up a reproducible data analysis using knitr.

WEEK 3
Reproducible Research Checklist & Evidence-based Data Analysis
This week covers what one could call a basic check list for ensuring that a data analysis is reproducible. While it's not absolutely sufficient to follow the check list, it provides a necessary minimum standard that would be applicable to almost any area of analysis.

WEEK 4
Week 4: Case Studies & Commentaries
This week there are two case studies involving the importance of reproducibility in science for you to watch.

Go to Class
MOOC List is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Related Courses

Leadership Through Marketing (Coursera) Coursera
Northwestern University

Leadership Through Marketing (Coursera)

The success of every organization depends on attracting and retaining customers. Although the marketing concepts for doing so are well established, digital technology has empowered customers, while producing massive amounts of data, revolutionizing the processes through which organizations attract and retain customers. In this course, students will learn how to identify new opportunities to create value for empowered consumers, develop strategies that yield an advantage over rivals, and develop the data science skills to lead more effectively, allocate resources, and to confront this very challenging environment with confidence.

Jun 28th 2026
4 Weeks
Data Manipulation at Scale: Systems and Algorithms (Coursera) Coursera
University of Washington

Data Manipulation at Scale: Systems and Algorithms (Coursera)

Data analysis has replaced data acquisition as the bottleneck to evidence-based decision making --- we are drowning in it. Extracting knowledge from large, heterogeneous, and noisy datasets requires not only powerful computing resources, but the programming abstractions to use them effectively. The abstractions that emerged in the last decade blend ideas from parallel databases, distributed systems, and programming languages to create a new class of scalable data analytics platforms that form the foundation for data science at realistic scales.

Jun 22nd 2026
4 Weeks
Introduction to Genomic Technologies (Coursera) Coursera
Johns Hopkins University

Introduction to Genomic Technologies (Coursera)

This course introduces you to the basic biology of modern genomics and the experimental tools that we use to measure it. We'll introduce the Central Dogma of Molecular Biology and cover how next-generation sequencing can be used to measure DNA, RNA, and epigenetic patterns. You'll also get an introduction to the key concepts in computing and data science that you'll need to understand how data from next-generation sequencing experiments are generated and analyzed.

Jun 22nd 2026
4 Weeks
Regression Models (Coursera) Coursera
Johns Hopkins University

Regression Models (Coursera)

Linear models, as their name implies, relates an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares and inference using regression models.

Jun 22nd 2026
4 Weeks
Introduction to Machine Learning (Coursera) Coursera
Duke University

Introduction to Machine Learning (Coursera)

This course will provide you a foundational understanding of machine learning models (logistic regression, multilayer perceptrons, convolutional neural networks, natural language processing, etc.) as well as demonstrate how these models can solve complex problems in a variety of industries, from medical diagnostics to image recognition to text prediction.

Jun 26th 2026
5-12 Weeks
Fundamentals of GIS (Coursera) Coursera
University of California, Davis

Fundamentals of GIS (Coursera)

Explore the world of spatial analysis and cartography with geographic information systems (GIS). What you will learn: define core geospatial concepts; practice with subset data using selections and feature attributes; create map books using advanced mapping techniques; create layer and map packages.

Jun 22nd 2026
4 Weeks
Experimentation for Improvement (Coursera) Coursera
McMaster University

Experimentation for Improvement (Coursera)

We are always using experiments to improve our lives, our community, and our work. Are you doing it efficiently? Or are you (incorrectly) changing one thing at a time and hoping for the best? In this course, you will learn how to plan efficient experiments - testing with many variables. Our goal is to find the best results using only a few experiments. A key part of the course is how to optimize a system.

Jun 22nd 2026
5-12 Weeks