Data Visualization for Genome Biology (Coursera)

Offered by the University of Toronto.

The past decade has seen a vast increase in the amount of data available to biologists, driven by the dramatic decrease in cost and concomitant rise in throughput of next-generation sequencing technologies. As a result, a project unimaginable 10 years ago was recently proposed: the Earth BioGenome Project, which aims to sequence the genomes of all eukaryotic species on the planet within the next 10 years. So while data are no longer limiting, accessing and interpreting those data has become a bottleneck. One important aspect of interpreting data is data visualization. This course introduces theoretical topics in data visualization through mini-lectures, and applied aspects through hands-on labs.

The labs use both web-based tools and R, so students at all computer skill levels can benefit.

Syllabus

Week 1
In this module we'll cover three straightforward approaches for generating simple plots. As we'll see in the lab, visualizing a dataset often reveals the overall shape of the data, which descriptive statistics like the mean and standard deviation might not capture. Plotting datasets is also a useful way to identify outliers. In the mini-lectures we go over some common biological data visualization paradigms and, more generally, the common chart types, and we also talk about the context and grammar of data visualization.
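
The point about descriptive statistics hiding the shape of the data can be illustrated with a minimal Python sketch. The two samples below are hypothetical toy values invented for this illustration (they are not course data): they share the same mean and standard deviation, yet a simple text histogram shows that one is unimodal and the other has two clusters.

```python
import statistics
from collections import Counter

# Hypothetical toy samples chosen for illustration; not course data.
unimodal = [2, 4, 4, 5, 5, 5, 5, 6, 6, 8]   # values pile up around 5
bimodal = [3, 3, 4, 4, 4, 6, 6, 6, 7, 7]    # two clusters, nothing at 5


def text_histogram(values):
    """Return one line per integer value, with one '#' per observation."""
    counts = Counter(values)
    return [f"{v:>2} | {'#' * counts[v]}"
            for v in range(min(values), max(values) + 1)]


for name, sample in [("unimodal", unimodal), ("bimodal", bimodal)]:
    # The summary statistics are identical for both samples...
    print(name, "mean =", statistics.mean(sample),
          "sd =", round(statistics.stdev(sample), 3))
    # ...but the histograms look quite different.
    print("\n".join(text_histogram(sample)))
```

The summary line printed for each sample is the same, so only the plotted shape distinguishes them.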

Week 2
In this week's module we explore ways of displaying biological variation and give a little background on track viewers. We also cover visual perception, Gestalt principles, and issues related to colour perception, which matter for accessibility. In the lab we'll use an online app, PlotsOfDifferences, to generate some charts that display variation nicely, and we'll also use R to generate some box plots, histograms, and violin plots. Last but not least, we'll try adjusting some of the settings in JBrowse to help assess gene expression levels in a more intuitive manner. Thanks to Dr. Joachim Goedhart, University of Amsterdam, Netherlands, for permission to use PlotsOfDifferences in the lab.

Week 3
In this week's module we explore ways of visualizing gene expression data after briefly covering how we can measure gene expression levels with RNA-seq and identify significantly differentially expressed genes using statistical tests. We also cover design thinking. In the lab we'll use an online platform, Galaxy, to generate a volcano plot for visualizing significantly differentially expressed genes, and we'll also use R to generate some heatmaps of gene expression. Last but not least, we'll create our own "electronic fluorescent pictographs" for a gene expression data set.
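
As a rough illustration of what a volcano plot encodes, the Python sketch below places each gene at (log2 fold change, -log10 p-value) and labels it by whether it passes both a fold-change and a significance cutoff. The gene names, values, and cutoffs are hypothetical, and this is not the Galaxy workflow used in the lab; in practice the p-values would usually be adjusted for multiple testing first.

```python
import math


def volcano_point(log2_fold_change, p_value):
    """Return the (x, y) position of one gene on a volcano plot."""
    return (log2_fold_change, -math.log10(p_value))


def classify(log2_fold_change, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene as 'up', 'down', or 'not significant'.

    The default cutoffs are assumptions chosen for illustration only.
    """
    if p_value >= p_cutoff or abs(log2_fold_change) < fc_cutoff:
        return "not significant"
    return "up" if log2_fold_change > 0 else "down"


# Hypothetical genes: name -> (log2 fold change, p-value)
genes = {
    "geneA": (2.3, 0.0004),
    "geneB": (-1.8, 0.003),
    "geneC": (0.4, 0.01),
    "geneD": (3.1, 0.2),
}

for name, (lfc, p) in genes.items():
    x, y = volcano_point(lfc, p)
    print(f"{name}: x={x:+.2f}, y={y:.2f}, {classify(lfc, p)}")
```

Genes with a large fold change but a weak p-value (like the hypothetical geneD) sit far from the centre but low on the plot, which is why both cutoffs are applied.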

Week 4
In this week's module we cover how the Gene Ontology can be used to make sense of often overwhelmingly long lists of genes from transcriptomic and other kinds of 'omic experiments, especially through Gene Ontology enrichment analyses. We'll also look at Agile Development and User Testing and how these can help improve data visualization tools. In the lab, we'll try our hand at three online Gene Ontology analysis apps, and create some nice overview charts for GO enrichment results in R. Thanks to Dr. Roy Navon, Technion University, Israel, for permission to use GOrilla in the lab. Thanks to Dr. Juri Reimand of the University of Toronto for permission to use g:Profiler. And thanks to Dr. Zhen Su of the China Agricultural University for permission to use AgriGO.
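
One common way to score enrichment of a single GO term is a one-sided hypergeometric test: given how many genes in the background carry the term, how surprising is the number seen in the gene list? The Python sketch below shows that calculation under stated assumptions. The function name and example numbers are hypothetical, and the apps named above apply their own statistics and multiple-testing corrections, which this sketch does not reproduce.

```python
from math import comb


def enrichment_p_value(background_size, annotated_in_background,
                       list_size, annotated_in_list):
    """One-sided hypergeometric p-value: P(overlap >= annotated_in_list).

    background_size          total genes considered (N)
    annotated_in_background  genes carrying the GO term (K)
    list_size                genes in the list of interest (n)
    annotated_in_list        genes in the list carrying the term (k)
    """
    max_overlap = min(annotated_in_background, list_size)
    favourable = sum(
        comb(annotated_in_background, i)
        * comb(background_size - annotated_in_background, list_size - i)
        for i in range(annotated_in_list, max_overlap + 1)
    )
    return favourable / comb(background_size, list_size)


# Hypothetical numbers: 20,000 background genes, 200 carry the term,
# and 15 of a 300-gene list carry it (about 3 would be expected by chance).
p = enrichment_p_value(20000, 200, 300, 15)
print(f"p = {p:.3g}")
```

Because many GO terms are tested at once, a raw p-value like this one would normally be corrected for multiple testing before being reported.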

Week 5
In this week's module, we explore tools for displaying and analyzing graph networks, notably those created when we generate protein-protein interactions, especially in a high-throughput manner. These PPIs are deposited in online databases like BioGRID, and can be retrieved on-the-fly via web services for display in powerful network visualization apps like Cytoscape. We'll talk about other web services/APIs that are available for biology in one of the mini-lectures, and in the lab we'll use Cytoscape to explore interactors of BRCA2. We'll also use a plug-in called BiNGO to do Gene Ontology enrichment analyses of its interactors, continuing our exploration of GO that we started last week. Last, we'll try using D3 to display an interaction network in a web page.
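
An interaction network is, at heart, a list of pairs. The Python sketch below shows one simple way to hold such pairs as an adjacency structure and look up the interactors of a query protein. The protein names P1 to P5 and the edges are placeholders invented for illustration; they are not BioGRID records, and this is not how Cytoscape stores networks internally.

```python
from collections import defaultdict

# Placeholder interaction pairs (hypothetical, undirected).
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]


def build_adjacency(edge_list):
    """Map each protein to the set of proteins it interacts with."""
    adjacency = defaultdict(set)
    for a, b in edge_list:
        adjacency[a].add(b)
        adjacency[b].add(a)   # interactions are treated as symmetric
    return adjacency


network = build_adjacency(edges)

# Direct interactors of the query protein.
interactors = sorted(network["P1"])

# Degree (number of interaction partners) of every protein.
degrees = {node: len(partners) for node, partners in network.items()}

print("Interactors of P1:", interactors)
print("Degrees:", degrees)
```

A list of interactors obtained this way is the kind of gene set that can then be passed to an enrichment analysis.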

Week 6
In this module we cover methods for generating and making sense of ever bigger biological data sets. The growth in sequencing capacity has enabled projects that were unimaginable even a few years ago, such as the Earth BioGenome Project, which aims to sequence the genome of a representative of every eukaryotic species on the planet. To make sense of these large data sets, it is often useful to apply dimensionality reduction methods, like t-SNE, PCA, and UMAP, to help visualize how similar samples are. Logic diagrams (Venn-Euler or UpSet plots) are also useful for displaying how sets of genes overlap with one another. Thanks to Dr. Tim Hulsen (Philips Research, the Netherlands) for permission to use the DeepVenn app in the lab.
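
The numbers behind a Venn or UpSet-style diagram are counts of genes that belong to exactly one particular combination of sets. A minimal Python sketch of that bookkeeping follows. The set names and gene identifiers are hypothetical, and this is not the algorithm of any particular app.

```python
from collections import Counter

# Hypothetical gene sets, e.g. genes detected in three experiments.
gene_sets = {
    "A": {"g1", "g2", "g3", "g4"},
    "B": {"g3", "g4", "g5"},
    "C": {"g4", "g6"},
}


def exclusive_intersections(named_sets):
    """Count genes by the exact combination of sets that contain them."""
    membership = {}
    for name, genes in named_sets.items():
        for gene in genes:
            membership.setdefault(gene, set()).add(name)
    return Counter(frozenset(names) for names in membership.values())


counts = exclusive_intersections(gene_sets)

# Report combinations from largest to smallest, as an UpSet plot would.
for combo, n in sorted(counts.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print("&".join(sorted(combo)), n)
```

Each gene is counted once, under the full combination of sets it belongs to, so the counts sum to the number of distinct genes.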

Related Courses

Framework for Data Collection and Analysis (Coursera)
University of Maryland, College Park

This course will provide you with an overview of existing data products and a good understanding of the data collection landscape. With the help of various examples you will learn how to identify which data sources are likely to match your research question, how to turn your research question into measurable pieces, and how to think about an analysis plan.

Jun 22nd 2026
4 Weeks
Dino 101: Dinosaur Paleobiology (Coursera)
University of Alberta

Dino 101: Dinosaur Paleobiology is a 12-lesson course teaching a comprehensive overview of non-avian dinosaurs. Topics covered: anatomy, eating, locomotion, growth, environmental and behavioral adaptations, origins and extinction. Lessons are delivered from museums, fossil-preparation labs and dig sites. Estimated workload: 3-5 hrs/week.

Jun 27th 2026
5-12 Weeks
Leadership Through Marketing (Coursera)
Northwestern University

The success of every organization depends on attracting and retaining customers. Although the marketing concepts for doing so are well established, digital technology has empowered customers while producing massive amounts of data, revolutionizing the processes through which organizations attract and retain customers. In this course, students will learn how to identify new opportunities to create value for empowered consumers, develop strategies that yield an advantage over rivals, and develop the data science skills to lead more effectively, allocate resources, and confront this very challenging environment with confidence.

Jun 28th 2026
4 Weeks
Data Visualization (Coursera)
University of Illinois at Urbana-Champaign

Learn the general concepts of data mining along with basic methodologies and applications. Then dive into one subfield in data mining: pattern discovery. Learn in-depth concepts, methods, and applications of pattern discovery in data mining. We will also introduce methods for pattern-based classification and some interesting applications of pattern discovery. This course provides you the opportunity to learn skills and content to practice and engage in scalable pattern discovery methods on massive transactional data, discuss pattern evaluation measures, and study methods for mining diverse kinds of patterns, sequential patterns, and sub-graph patterns.

Jun 22nd 2026
4 Weeks
Mathematical Biostatistics Boot Camp 1 (Coursera)
Johns Hopkins University

This class presents the fundamental probability and statistical concepts used in elementary data analysis. It will be taught at an introductory level for students with junior or senior college-level mathematical training including a working knowledge of calculus. A small amount of linear algebra and programming are useful for the class, but not required.

Jun 22nd 2026
4 Weeks
Fundamentals of GIS (Coursera)
University of California, Davis

Explore the world of spatial analysis and cartography with geographic information systems (GIS). What you will learn: define core geospatial concepts; practice with subset data using selections and feature attributes; create map books using advanced mapping techniques; create layer and map packages.

Jun 22nd 2026
4 Weeks
Introduction to Spreadsheets and Models (Coursera)
University of Pennsylvania

The simple spreadsheet is one of the most powerful data analysis tools that exists, and it’s available to almost anyone. Major corporations and small businesses alike use spreadsheet models to determine where key measures of their success are now, and where they are likely to be in the future. But in order to get the most out of a spreadsheet, you have to know how to use it. This course is designed to give you an introduction to basic spreadsheet tools and formulas so that you can begin to harness the power of spreadsheets to map the data you have now and to predict the data you may have in the future.

Jun 22nd 2026
4 Weeks
Graph Analytics for Big Data (Coursera)
University of California, San Diego

Want to understand your data network structure and how it changes under different conditions? Curious to know how to identify closely interacting clusters within a graph? Have you heard of the fast-growing area of graph analytics and want to learn more? This course gives you a broad overview of the field of graph analytics so you can learn new ways to model, store, retrieve and analyze graph-structured data.

Jun 22nd 2026
5-12 Weeks
Statistical Inference (Coursera)
Johns Hopkins University

Statistical inference is the process of drawing conclusions about populations or scientific truths from data. There are many modes of performing inference, including statistical modeling, data-oriented strategies, and explicit use of designs and randomization in analyses. Furthermore, there are broad theories (frequentist, Bayesian, likelihood, design-based, …) and numerous complexities (missing data, observed and unobserved confounding, biases) for performing inference.

Jun 22nd 2026
4 Weeks