EdX

Understanding the World Through Data (edX)

Offered by MIT, MITx
Become a data explorer – learn how to leverage data and basic machine learning algorithms to understand the world. Speech recognition, drones, and self-driving cars – things that once seemed like pure science fiction – are now widely available technologies, and just a few examples of how humans have taught machines to analyze data and make decisions. In this hands-on, introductory course, you will examine all the forms in which data exists, learn tools that uncover relationships between data, and leverage basic algorithms to understand the world from a new perspective.

Whether you're a high school student or someone switching careers, all you need to get started in this course is a curiosity about the topic of machine learning and a willingness to tinker around with your computer.
The course is organized into modules. Within each module, you'll have access to videos, short exercises, and a final capstone project. In Module 1, you'll begin by looking at different kinds of data. To help you explore the data, you'll dive right into programming in Python. You don't need any programming background; we will guide you on how to use Python to explore and visualize data.
One kind of data you'll work with is data that relates one variable to another. Coming up with a relationship between two variables—one depending on the other—is at the center of Module 2. In that module, you'll build up some core concepts before seeing your first machine learning algorithm. The goal is to use programming to create models that describe mathematical relationships between data. You'll be able to see how good the model is and use it to make predictions about new data.
In Module 3, you'll see a discussion about where imperfections in collected data might come from. You rarely have perfectly “clean” data sets, so it's important to understand how imperfections impact the model that an algorithm might come up with. To this end, we will introduce the notion of data distributions and build up to the concepts of biased and unbiased noise.
Another kind of data you'll work with is data that belongs to different groups (or classes). Creating a model that predicts which group a data point belongs to is at the center of Module 4. You'll work through different ways of thinking about this problem and see three different approaches to classification.

What you'll learn

  • Python programming and the Colab notebook programming environment
  • Dependent and independent variables
  • Coming up with relationships between data using linear and polynomial regression models
  • Recognizing how data is distributed
  • How to observe noise in distributions and when to ignore it
  • Categorizing data into groups with classification models
  • And more!

Syllabus

Module 1: How to represent and manipulate data
Examples of numerical data
The Python programming language and the Colab notebook programming environment
Loading datafiles in Colab as dataframes and performing simple operations (selecting rows or columns, filtering data by specific conditions, grouping data, applying functions on the resulting groups)
Finding the correlation between columns of the dataframe
Visualizing the data using line plots, scatter plots, histograms, and correlation matrices
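
The dataframe operations listed in Module 1 can be sketched roughly as follows. This is a minimal illustration that assumes the pandas library (the syllabus does not name the library it uses) and a small made-up table; the column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical data typed in by hand for illustration only.
df = pd.DataFrame({
    "city": ["A", "A", "B", "B", "C", "C"],
    "temperature": [20.0, 22.0, 30.0, 32.0, 10.0, 12.0],
    "ice_cream_sales": [200, 220, 300, 320, 100, 120],
})

# Select a single column.
temps = df["temperature"]

# Filter rows by a condition.
warm = df[df["temperature"] > 15]

# Group rows by city and apply a function (the mean) to each group.
mean_by_city = df.groupby("city")["temperature"].mean()

# Correlation between two columns, and the full correlation matrix
# of the numeric columns.
corr = df["temperature"].corr(df["ice_cream_sales"])
corr_matrix = df[["temperature", "ice_cream_sales"]].corr()

print(warm)
print(mean_by_city)
print(corr_matrix)
```

In a Colab notebook, a data file would typically be loaded with `pd.read_csv` rather than typed in by hand.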

Module 2: Reverse engineering nature
Dependent and independent variables and how they correspond to real life scenarios
Intuition for what a linear model is
Intuition for what a polynomial model is
Python libraries that can perform linear regression on data
Comparing the quality of different models (mean squared error and R^2 values)
Fitting higher order polynomials
Overfitting
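
As a rough sketch of the Module 2 topics, the snippet below fits a linear model and a higher-order polynomial to the same noisy data and compares them using mean squared error and R^2. It assumes NumPy; the synthetic data, the choice of degree 9, and the simple every-other-point holdout are illustrative assumptions, not taken from the course.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption): a straight line plus random noise.
x = np.linspace(-1.0, 1.0, 40)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.3, size=x.size)

# Hold out every other point so the models can be checked on unseen data.
x_train, y_train = x[::2], y[::2]
x_test, y_test = x[1::2], y[1::2]


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return float(np.mean((y_true - y_pred) ** 2))


def r_squared(y_true, y_pred):
    """Coefficient of determination, R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


results = {}
for degree in (1, 9):
    # Least-squares polynomial fit on the training points only.
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = {
        "train_mse": mse(y_train, np.polyval(coeffs, x_train)),
        "test_mse": mse(y_test, np.polyval(coeffs, x_test)),
        "train_r2": r_squared(y_train, np.polyval(coeffs, x_train)),
    }

for degree, scores in results.items():
    print(degree, scores)
```

The higher-degree fit cannot do worse than the line on the training points, but on noisy data like this it will often do worse on the held-out points, which is the usual symptom of overfitting.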

Module 3: Distributions and Latent Variables
Uniform distributions
Gaussian distributions
Distribution mean and standard deviation
Noise in distributions (biased and unbiased noise)
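
The distribution concepts in Module 3 can be illustrated with a short simulation. The sketch below assumes NumPy and interprets "unbiased noise" as noise with zero mean and "biased noise" as noise with a nonzero mean; that interpretation, and all the numbers used, are assumptions rather than the course's own definitions.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Uniform distribution on [0, 1): every value in the range is equally likely.
uniform = rng.uniform(0.0, 1.0, size=n)

# Gaussian (normal) distribution with mean 5 and standard deviation 2.
gaussian = rng.normal(5.0, 2.0, size=n)

print("uniform  mean/std:", uniform.mean(), uniform.std())
print("gaussian mean/std:", gaussian.mean(), gaussian.std())

# Repeated measurements of a hypothetical true value.
true_value = 10.0

# Zero-mean noise: errors average out, so the sample mean approaches the truth.
unbiased = true_value + rng.normal(0.0, 1.0, size=n)

# Noise with a nonzero mean: the sample mean stays shifted however
# many measurements are taken.
biased = true_value + rng.normal(0.5, 1.0, size=n)

print("unbiased mean:", unbiased.mean())
print("biased mean:  ", biased.mean())
```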

Module 4: How machines think
Categorizing data based on particular conditions being met
Using linear regression to classify a new datapoint as above or below the best fit line
Using a support vector classifier to separate two groups of data and classifying a new datapoint into a group
Using logistic regression to classify data into two groups and finding the probabilities of a new datapoint falling into each group
Understanding how to divide data into training and test sets
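
One of the Module 4 ideas, classifying a new datapoint as above or below the best fit line, might look like the sketch below. It assumes NumPy; the measurements and the `classify` helper are hypothetical and only illustrate the idea.

```python
import numpy as np

# Hypothetical measurements.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.0])

# Fit the best fit line y = slope * x + intercept by least squares.
slope, intercept = np.polyfit(x, y, 1)


def classify(x_new, y_new):
    """Label a new point by comparing it with the line's prediction."""
    predicted = slope * x_new + intercept
    return "above" if y_new > predicted else "below"


print(classify(3.0, 8.0))
print(classify(3.0, 4.0))
```

The support vector classifier and logistic regression items in the syllabus follow the same pattern of fitting on known data and then labeling a new point, but they learn the separating boundary from labeled groups instead of from a regression line.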


Related Courses

Analyzing and Visualizing Data with Power BI (edX) EdX
Davidson College,DavidsonX

Step up your analytics game and learn one of the most in-demand job skills in the United States. Power BI is a robust business analytics and visualization tool from Microsoft that helps data professionals bring their data to life and tell more meaningful stories. This four-week course is a beginner's guide to working with data in Power BI and is perfect for professionals. You'll become confident in working with data, creating data visualizations, and preparing reports and dashboards.

Self-Paced
Datos para la efectividad de las políticas públicas (edX) EdX
Inter-American Development Bank - IDB,IDBx

This course will help you take control of data and become familiar with the tools for using it in the planning, management, and evaluation of public policies. In this information age, data is available everywhere and is growing at an exponential rate. How can we make sense of all this data and take advantage of it when making decisions? How do we use it to help guide the management and planning of our policies? Whether you are a citizen or a policy planner, you should be able to answer these questions.

Self-Paced
Case Studies in Functional Genomics (edX) EdX
HarvardX,Harvard University

Perform RNA-Seq, ChIP-Seq, and DNA methylation data analyses, using open source software, including R and Bioconductor. We will explain how to perform the standard processing and normalization steps, starting with raw data, to get to the point where one can investigate relevant biological questions.

Self-Paced
CS50's Introduction to Computer Science (edX) EdX
HarvardX,Harvard University

This is CS50, Harvard University's introduction to the intellectual enterprises of computer science and the art of programming, for majors and non-majors alike, with or without prior programming experience. An entry-level course taught by David J. Malan, CS50 teaches students how to think algorithmically and solve problems efficiently.

Self-Paced
The Analytics Edge (edX) EdX
MIT,MITx

Through inspiring examples and stories, discover the power of data and use analytics to provide an edge to your career and your life. In the last decade, the amount of data available to organizations has reached unprecedented levels. Data is transforming business, social interactions, and the future of our society. In this course, you will learn how to use data and analytics to give an edge to your career and your life.

This course is archived
13-24 Weeks
Artificial Intelligence (AI) (edX) EdX
Columbia University,ColumbiaX

Learn the fundamentals of Artificial Intelligence (AI), and apply them. Design intelligent agents to solve real-world problems including search, games, machine learning, logic, and constraint satisfaction problems. What do self-driving cars, face recognition, web search, industrial robots, missile guidance, and tumor detection have in common? They are all complex real-world problems being solved with applications of artificial intelligence (AI).

This course is archived
5-12 Weeks
Introduction to Applied Biostatistics: Statistics for Medical Research (edX) EdX
Osaka University

Learn data analysis for medical research with practical hands-on examples using R Commander. Want to learn how to analyze real-world medical data, but unsure where to begin? This Applied Biostatistics course provides an introduction to important topics in medical statistical concepts and reasoning.

No sessions available
5-12 Weeks
Advanced Algorithmics and Graph Theory with Python (edX) EdX
Institut Mines-Telecom,IMTx

Strengthen your skills in algorithmics and graph theory, and gain experience in programming in Python along the way. Algorithmics and programming are fundamental skills for engineering students, data scientists and analysts, computer hobbyists or developers. Learning how to program algorithms can be tedious if you aren’t given an opportunity to immediately practice what you learn. In this course, you won't just focus on theory or study a simple catalog of methods, procedures, and concepts. Instead, you’ll be given a challenge wherein you'll be asked to beat an algorithm we’ve written for you by coming up with your own clever solution.

Sep 4th 2023
5-12 Weeks
Applied Quantum Computing III: Algorithm and Software (edX) EdX
Purdue University,PurdueX

Learn domain-specific quantum algorithms and how to run them on present-day quantum hardware. This course is part III of a series of quantum computing courses that covers topics from fundamentals to present-day hardware platforms to quantum software and programming. The goal of part III is to discuss some of the key domain-specific algorithms that are developed by exploiting the fundamental quantum phenomena (e.g. entanglement) and computing models discussed in part I.

Mar 25th 2024
5-12 Weeks
Knowledge Management and Big Data in Business (edX) EdX
The Hong Kong Polytechnic University,HKPolyUx

Learn why and how knowledge management and Big Data are vital to the new business era. The business landscape is changing so rapidly that traditional management, business and computing courses do not meet the needs for the next generation of workers in the business world. Most traditional methods are of a repetitive, rule-based nature and will be gradually replaced by Artificial Intelligence.

Self-Paced