
High-Dimensional Data Analysis (edX)


This course focuses on several techniques that are widely used in the analysis of high-dimensional data. If you’re interested in data analysis and interpretation, then this is the data science course for you. We start by learning the mathematical definition of distance and use it to motivate the singular value decomposition (SVD) for dimension reduction, as well as multidimensional scaling and its connection to principal component analysis.


We will learn about the batch effect, the most challenging data analytical problem in genomics today, and describe how these techniques can be used to detect and adjust for batch effects. Specifically, we will describe principal component analysis and factor analysis and demonstrate how these concepts are applied to data visualization and data analysis of high-throughput experimental data.

Finally, we give a brief introduction to machine learning and apply it to high-throughput data. We describe the general idea behind clustering analysis, describe K-means and hierarchical clustering, and demonstrate how these are used in genomics. We also describe prediction algorithms such as k-nearest neighbors, along with the concepts of training sets, test sets, error rates and cross-validation.
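
To make the ideas of distance, k-nearest neighbors, training sets, test sets and error rates concrete, here is a minimal sketch in Python. It is not taken from the course materials: the function names and the toy data are hypothetical, and Euclidean distance is assumed as the distance measure.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_x, train_y, query, k=3):
    """Predict a label for `query` by majority vote among the k nearest training points."""
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: euclidean(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def error_rate(train_x, train_y, test_x, test_y, k=3):
    """Fraction of test points whose predicted label differs from the true label."""
    wrong = sum(
        knn_predict(train_x, train_y, x, k) != y
        for x, y in zip(test_x, test_y)
    )
    return wrong / len(test_x)


# Hypothetical toy data: two well-separated groups in two dimensions.
train_x = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (2.0, 2.1), (2.2, 1.9), (1.9, 2.3)]
train_y = ["A", "A", "A", "B", "B", "B"]
test_x = [(0.1, 0.1), (2.1, 2.0)]
test_y = ["A", "B"]

print(error_rate(train_x, train_y, test_x, test_y, k=3))
```

The model is fit on the training set only, and the error rate is computed on the held-out test set. Cross-validation repeats this split several times and averages the resulting error rates.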
Given the diversity in educational background of our students, we have divided the series into seven parts. You can take the entire series or individual courses that interest you. If you are a statistician, you should consider skipping the first two or three courses; similarly, if you are a biologist, you should consider skipping some of the introductory biology lectures. Note that the statistics and programming aspects of the class ramp up in difficulty relatively quickly across the first three courses. By the third course we will be teaching advanced statistical concepts such as hierarchical models, and by the fourth, advanced software engineering skills such as parallel computing and reproducible research concepts.
This course is part of the Data Analysis for Life Sciences XSeries.
These courses make up two Professional Certificates and are self-paced:

Data Analysis for Life Sciences:

Genomics Data Analysis:

What you'll learn

  • Mathematical Distance
  • Dimension Reduction
  • Singular Value Decomposition and Principal Component Analysis
  • Multidimensional Scaling Plots
  • Factor Analysis
  • Dealing with Batch Effects
  • Clustering
  • Heatmaps
  • Basic Machine Learning Concepts

Related Courses

Foundations of Data Analysis - Part 1: Statistics Using R (edX)
University of Texas at Austin, UTAustinX

Use R to learn fundamental statistical topics such as descriptive statistics and modeling. In this first part of a two-part course, we’ll walk through the basics of statistical thinking, starting with an interesting question. Then, we’ll learn the correct statistical tool to help answer our question of interest, using R and hands-on labs. Finally, we’ll learn how to interpret our findings and develop a meaningful conclusion.

No sessions available
5-12 Weeks
Computer Applications of Artificial Intelligence and e-Construction (edX)
Purdue University, PurdueX

Learn the fundamentals of artificial intelligence, machine learning and natural language processing, and their applications in e-Construction. This course is the third in a sequence of interrelated courses on current computer applications in the construction industry. It emphasizes advanced computational tools, including artificial intelligence, machine learning and natural language processing, and their applications in e-Construction.

Mar 28th 2022
5-12 Weeks
Foundations of Data Analysis - Part 1: Statistics Using R (edX)
University of Texas at Austin, UTAustinX

This is a hands-on course with a data lab to teach fundamental statistical topics such as descriptive statistics, inferential testing, and modeling. In this first part of a two-part course, we’ll walk through the basics of statistical thinking, starting with an interesting question. Then, we’ll learn the correct statistical tool to help answer our question of interest, using R and hands-on labs.

No sessions available
5-12 Weeks
Designing and Running Randomized Evaluations (edX)
MIT, MITx

Learn how to both design randomized evaluations and implement them in the field to measure the impact of social programs. A randomized evaluation, also known as a randomized controlled trial (RCT), field experiment or field trial, is a type of impact evaluation that uses random assignment to allocate resources, run programs, or apply policies as part of the study design.

Sep 7th 2021
5-12 Weeks
Case Studies in Functional Genomics (edX)
HarvardX, Harvard University

Perform RNA-Seq, ChIP-Seq, and DNA methylation data analyses, using open source software, including R and Bioconductor. We will explain how to perform the standard processing and normalization steps, starting with raw data, to get to the point where one can investigate relevant biological questions.

Self-Paced
Enabling Technologies for Data Science and Analytics: The Internet of Things (edX)
Columbia University, ColumbiaX

Discover the relationship between Big Data and the Internet of Things (IoT). The Internet of Things is rapidly growing. It is predicted that more than 25 billion devices will be connected by 2020. In this data science course, you will learn about the major components of the Internet of Things and how data is acquired from sensors. You will also examine ways of analyzing event data, sentiment analysis, facial recognition software and how data generated from devices can be used to make decisions.

Self-Paced
Applications of Linear Algebra Part 2 (edX)
Davidson College, DavidsonX

Explore applications of linear algebra in data mining, by learning the fundamentals of search engines and clustering movies into genres, and in computer graphics, by posterizing an image. Our world is in a data deluge with ever-increasing sizes of datasets. Linear algebra is a tool to manage and analyze such data. This course is part 2 of a 2-part course, with this part extending smoothly from the first. Note, however, that part 1 is not a prerequisite for part 2.

No sessions available
4 Weeks
Behavioural Economics in Action (edX)
University of Toronto, University of TorontoX

Learn to use principles and methods of behavioural economics to change behaviours, improve welfare and make better products and policy. How can we get people to save more money, eat healthy foods, engage in healthy behaviors, and make better choices in general? There has been a lot written about the fact that human beings do not process information and make decisions in an optimal fashion.

Self-Paced