Advanced Reproducibility in Cancer Informatics (Coursera)

This course introduces tools that help enhance reproducibility and replicability in the context of cancer informatics. It uses hands-on exercises to show in practical terms how to get started with these tools, but it is by no means a comprehensive dive into any of them. The tools and concepts covered include Git and GitHub, code review, Docker, and GitHub Actions.

The course is intended for students in the biomedical sciences and researchers who use informatics tools in their research. It is the follow-up course to the Introduction to Reproducibility in Cancer Informatics course. Learners who take this course should:

  • Have some familiarity with R or Python
  • Have taken the Introduction to Reproducibility in Cancer Informatics course
  • Have some familiarity with GitHub

Motivation
Data analyses are generally not reproducible without direct contact with the original researchers and a substantial amount of time and effort (Beaulieu-Jones, 2017). Reproducibility in cancer informatics (as in other fields) is still not monitored or incentivized, even though it is fundamental to the scientific method. Despite the lack of incentive, many researchers strive for reproducibility in their own work but often lack the skills or training to achieve it effectively.
Equipping researchers with the skills to create reproducible data analyses increases the efficiency of everyone involved. Reproducible analyses are more likely to be understood, applied, and replicated by others. This helps expedite the scientific process by helping researchers avoid false positive dead ends. Open source clarity in reproducible methods also saves researchers' time so they don't have to reinvent the proverbial wheel for methods that everyone in the field is already performing.
Curriculum
The course includes hands-on exercises that show learners how to apply reproducible code concepts to their own code. Individuals who take this course are encouraged to complete these activities as they follow along with the course material to help increase the reproducibility of their analyses.
Goal of this course:
To equip learners with a deeper knowledge of the capabilities of reproducibility tools and how they can apply them to their existing analysis scripts and projects.
What is NOT the goal of this course:
To be a comprehensive dive into each of the tools discussed.
How to use the course
Each chapter has associated exercises that you are encouraged to complete in order to get the full benefit of the course.
This course is designed with busy professional learners in mind, who may have to pick up and put down the course as their schedule allows. In general, you can skip to the chapters you find most useful (the one instance where a prior chapter is required is noted).

What You Will Learn

  • Enhance reproducibility and replicability of data analyses
  • Introduction to reproducibility tools

Syllabus

WEEK 1
Getting started in this course
This section describes the rationale and context for this course as well as its target audience.
Defining Reproducibility
This section defines reproducibility for the purposes of this course.

WEEK 2
Version control with GitHub
This section discusses how to get started with creating branches and pull requests on GitHub.

WEEK 3
Code review - as an author
In this section we discuss the responsibility of an author of a pull request in code review.
Code review - as a reviewer
In this section we discuss the responsibility of a reviewer of a pull request in code review.

WEEK 4
Launching Docker
This section walks through how to get started with Docker.
Modifying a Docker image
This section describes how to modify an existing Docker image.

WEEK 5
Automation as a reproducibility tool
This section describes the motivation for using automation tools to enhance reproducibility.

Related Courses

Communicating Data Science Results (Coursera) Coursera
University of Washington

Making predictions is not enough! Effective data scientists know how to explain and interpret their results, and communicate findings accurately to stakeholders to inform business decisions. Visualization is the field of research in computer science that studies effective communication of quantitative results by linking perception, cognition, and algorithms to exploit the enormous bandwidth of the human visual cortex. In this course you will learn to recognize, design, and use effective visualizations.

Jun 8th 2026
3 Weeks
Framework for Data Collection and Analysis (Coursera) Coursera
University of Maryland, College Park

This course will provide you with an overview of existing data products and a good understanding of the data collection landscape. With the help of various examples you will learn how to identify which data sources likely match your research question, how to turn your research question into measurable pieces, and how to think about an analysis plan.

Jun 8th 2026
4 Weeks
Infonomics II: Business Information Management and Measurement (Coursera) Coursera
University of Illinois at Urbana-Champaign

Even decades into the Information Age, accounting practices yet fail to recognize the financial value of information. Moreover, traditional asset management practices fail to recognize information as an asset to be managed with earnest discipline. This has led to a business culture of complacence, and the inability for most organizations to fully leverage available information assets. This second course in the two-part Infonomics series explores how and why to adapt well-honed asset management principles and practices to information, and how to apply accepted and new valuation models to gauge information’s potential and realized economic benefits.

Jun 10th 2026
4 Weeks
Data Visualization with Advanced Excel (Coursera) Coursera
PwC

In this course, you will get hands-on instruction in advanced Excel 2013 functions. You’ll learn to use PowerPivot to build databases and data models. We’ll show you how to perform different types of scenario and simulation analysis, and you’ll have an opportunity to practice these skills using some of Excel's built-in tools, including Solver, data tables, Scenario Manager, and Goal Seek.

Jun 8th 2026
4 Weeks
Exploratory Data Analysis (Coursera) Coursera
Johns Hopkins University

This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data.

Jun 8th 2026
4 Weeks
Effective Problem-Solving and Decision-Making (Coursera) Coursera
University of California, Irvine

Critical thinking – the application of scientific methods and logical reasoning to problems and decisions – is the foundation of effective problem solving and decision making. Critical thinking enables us to avoid common obstacles, test our beliefs and assumptions, and correct distortions in our thought processes. Gain confidence in assessing problems accurately, evaluating alternative solutions, and anticipating likely risks. Learn how to use analysis, synthesis, and positive inquiry to address individual and organizational problems and develop the critical thinking skills needed in today’s turbulent times. Using case studies and situations encountered by class members, explore successful models and proven methods that are readily transferable on-the-job.

Jun 8th 2026
4 Weeks
Communicating Business Analytics Results (Coursera) Coursera
University of Colorado Boulder

The analytical process does not end with models that can predict with accuracy or prescribe the best solution to business problems. Developing these models and gaining insights from data do not necessarily lead to successful implementations. Success depends on the ability to communicate results to those who make decisions.

Jun 8th 2026
4 Weeks
Graph Analytics for Big Data (Coursera) Coursera
University of California, San Diego

Want to understand your data network structure and how it changes under different conditions? Curious to know how to identify closely interacting clusters within a graph? Have you heard of the fast-growing area of graph analytics and want to learn more? This course gives you a broad overview of the field of graph analytics so you can learn new ways to model, store, retrieve and analyze graph-structured data.

Jun 8th 2026
5-12 Weeks
Introduction to Probability and Data with R (Coursera) Coursera
Duke University

This course introduces you to sampling and exploring data, as well as basic probability theory and Bayes' rule. You will examine various types of sampling methods, and discuss how such methods can impact the scope of inference. A variety of exploratory data analysis techniques will be covered, including numeric summary statistics and basic data visualization.

Jun 8th 2026
5-12 Weeks
Foundations of strategic business analytics (Coursera) Coursera
ESSEC Business School

Who is this course for? This course is designed for students, business analysts, and data scientists who want to apply statistical knowledge and techniques to business contexts. For example, it may suit experienced statisticians, analysts, and engineers who want to move into a more business-oriented role. You will find this course exciting and rewarding if you already have a background in statistics, can use R or another programming language, and are familiar with databases and data analysis techniques such as regression, classification, and clustering.

Jun 8th 2026
4 Weeks
The Data Scientist's Toolbox (Coursera) Coursera
Johns Hopkins University

In this course you will get an introduction to the main tools and ideas in the data scientist's toolbox. The course gives an overview of the data, questions, and tools that data analysts and data scientists work with. There are two components to this course. The first is a conceptual introduction to the ideas behind turning data into actionable knowledge. The second is a practical introduction to the tools that will be used in the program like version control, markdown, git, GitHub, R, and RStudio.

Jun 8th 2026
4 Weeks
Introduction to Data Science in Python (Coursera) Coursera
University of Michigan

This course will introduce the learner to the basics of the Python programming environment, including fundamental Python programming techniques such as lambdas, reading and manipulating CSV files, and the NumPy library. The course will introduce data manipulation and cleaning techniques using the popular pandas data science library and introduce the abstraction of the Series and DataFrame as the central data structures for data analysis, along with tutorials on how to use functions such as groupby, merge, and pivot tables effectively. By the end of this course, students will be able to take tabular data, clean it, manipulate it, and run basic inferential statistical analyses.

Jun 8th 2026
4 Weeks