AI skills for Engineers: Supervised Machine Learning (edX)

Learn the fundamentals of machine learning to help you correctly apply various classification and regression machine learning algorithms to real-life problems using the Python toolbox scikit-learn. Machine learning classification and regression techniques have potential uses in various engineering disciplines. These machine learning models allow you to make predictions for a category (classification) or for a number (regression) given sensor data, and can be used in, for example, predicting properties of objects (such as their weight or shape).

Using hands-on and interactive exercises, you will get insight into:

  • Machine learning and its variants, such as supervised learning, semi-supervised learning, unsupervised learning and reinforcement learning.
  • Regression techniques such as linear regression and K-nearest neighbor regression, how to deal with outliers, and evaluation metrics such as the mean squared error (MSE) and mean absolute error (MAE).
  • Classification techniques such as the histogram method, the nearest mean (or nearest medoid) method and the nearest neighbor classifier. We cover the classification setting and important concepts such as the Bayes error and the Bayes classifier, which is the optimal classifier in theory.
  • Training models using (stochastic) gradient descent and its variants. We learn how to tune this optimizer and how to use it to construct a logistic regression classification model.
  • Overfitting, which means a classifier works well on a training set but not on unseen test data. We discuss how to build complex non-linear models, and we analyze overfitting using the bias-variance decomposition and the curse of dimensionality. Finally, we discuss how to evaluate and tune machine learning models fairly, and how to estimate how much data they need for sufficient performance.
  • Regularization methods, which can help to mitigate overfitting. We discuss two regularization techniques for estimating the linear regression coefficients: ridge regression and LASSO. The latter can also be used for variable selection.
  • Classifier evaluation metrics such as the ROC curve and confusion matrix, which can give more insight into the performance of classifiers. We also discuss what constitutes a “good” accuracy; this is judged against so-called dummy classifiers, which are naïve baselines.
  • Support Vector Machines (SVMs), which are more advanced classification models that can provide good performance even in high-dimensional spaces and with little data. We discuss their different variants such as the soft-margin SVM, the hard-margin SVM and the nonlinear kernel SVM.
  • Decision trees, which are simple models that can easily be understood by lay people. They are easy to use and visualize, and unlike a black-box model they are interpretable white-box models, making them suitable for various applications.

The lectures feature a unique combination of videos mixed with hands-on interaction with machine learning algorithms to stimulate a deeper understanding. In the exercises you apply the algorithms in Python using scikit-learn and in the final project you will further deepen your understanding of the various concepts by building and tuning a machine learning pipeline from start to finish.
This course is part of the AI Skills: Basic and Advanced Techniques in Machine Learning Professional Certificate.

What you'll learn

  • Apply common operations (pre-processing, plotting, etc.) to datasets using Python.
  • Explain the concepts of supervised, semi-supervised and unsupervised machine learning, and of reinforcement learning.
  • Explain how various supervised learning models work and recognize their limitations.
  • Analyze which factors impact the performance of learning algorithms.
  • Apply learning algorithms to datasets using Python and scikit-learn and evaluate their performance.
  • Optimize a machine learning pipeline using Python and scikit-learn.

Syllabus

Week 1: Introduction & Regression
This week is an introduction to the course with an overview of the topics. We give a brief introduction to machine learning and its different variants, and we will make a gentle start with regression. In the regression setting, a machine learning model will need to predict a number.
In the introductory part we cover:

  • Why use machine learning?
  • Machine learning basics and terminology
  • The biggest challenge in machine learning
  • Machine learning frameworks: supervised, semi-supervised, unsupervised and reinforcement learning

In the regression part we cover:

  • The regression setting and its assumptions
  • The mean squared error (MSE) and mean absolute error (MAE)
  • Outliers in regression
  • Linear regression and K-nearest neighbour regression
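
The two error metrics in this list react very differently to outliers. The following is a minimal sketch in plain Python, not course material: the helper names and the toy numbers are assumptions made for illustration. scikit-learn offers functions with the same purpose in `sklearn.metrics`.

```python
def mean_squared_error(y_true, y_pred):
    # Average of the squared residuals; large errors dominate.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    # Average of the absolute residuals; every error counts linearly.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Three perfect predictions and one outlier with a residual of 10.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 14.0]

mse = mean_squared_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
print(mse, mae)  # 25.0 2.5
```

A single outlier contributes 100 to the sum of squares but only 10 to the sum of absolute errors, which is why the MSE is considered the more outlier-sensitive of the two.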

Week 2: Classification & Training Models
This week we discuss the classification setting and how to train models using gradient descent. In the classification setting, a machine learning model will need to predict a category or class. Gradient descent is an iterative procedure to train models, such as logistic regression and neural networks.
In the classification part we cover:

  • Terminology and basics of classification
  • Building classifiers using histograms, nearest mean (nearest medoid) classifier, K-nearest neighbour (KNN) classifier
  • The Bayes classifier and the Bayes error
  • How to use the KNN classifier in practice
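
As an illustration of the K-nearest neighbour classifier, here is a small sketch in plain Python. The function name `knn_predict`, the choice of Euclidean distance and the toy data are assumptions for this example; in the course exercises scikit-learn would normally be used instead.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    # Sort the training points by Euclidean distance to the query.
    neighbours = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Majority vote among the k closest training points.
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


train_points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_labels = ["small", "small", "small", "large", "large", "large"]

print(knn_predict(train_points, train_labels, (0.5, 0.5)))  # small
print(knn_predict(train_points, train_labels, (5.5, 5.5)))  # large
```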

In the training models part, we cover:

  • The basics of gradient descent
  • The three variants of gradient descent: batch, mini-batch and stochastic gradient descent (SGD)
  • How to tune gradient descent
  • The basics of logistic regression
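
To make the link between gradient descent and logistic regression concrete, the sketch below trains a one-feature logistic regression model with batch gradient descent on the mean log-loss. It is a simplified illustration under assumed settings (learning rate 0.5, 200 steps, a four-point toy dataset), not the course's implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(xs, ys, learning_rate=0.5, n_steps=200):
    """Batch gradient descent on the mean log-loss for a single feature."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_steps):
        # Batch variant: the gradient uses every training example per step.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        # Move against the gradient, scaled by the learning rate.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict(w, b, x):
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]
w, b = train_logistic_regression(xs, ys)
print(predict(w, b, -1.5), predict(w, b, 1.5))  # 0 1
```

The mini-batch and stochastic variants would compute `errors` on a random subset or on a single example per step, trading a noisier gradient for cheaper updates.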

Week 3: Overfitting & Regularization
This week focuses on overfitting and regularization. Overfitting is the problem where a machine learning algorithm performs well on the training set but does not perform well on new and unseen data. Regularization covers various techniques that aim to solve this problem.
In the overfitting part we cover:

  • How to use linear models for nonlinear tasks
  • The bias-variance trade-off and the curse of dimensionality
  • How to use learning curves to estimate the amount of data needed
  • Cross validation, model selection and hyperparameter tuning
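
Cross validation relies on splitting the data into folds, each of which serves once as the validation set. The sketch below shows one way to generate such splits in plain Python; the function name `k_fold_indices` is hypothetical and the split is done without shuffling, which is an assumption made to keep the output easy to follow. scikit-learn provides `KFold` in `sklearn.model_selection` for the same purpose.

```python
def k_fold_indices(n_samples, n_folds):
    """Return (train_indices, validation_indices) pairs, without shuffling."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // n_folds] * n_folds
    for i in range(n_samples % n_folds):
        fold_sizes[i] += 1

    splits, start = [], 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        splits.append((train, validation))
        start = stop
    return splits


splits = k_fold_indices(6, 3)
for train, validation in splits:
    print(train, validation)
```

For hyperparameter tuning, a model would be trained on each `train` part and scored on the matching `validation` part, and the average score would be compared across hyperparameter settings.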

In the regularization part we cover:

  • Ridge regression
  • LASSO regularization and how it’s used for variable selection
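
The difference between the two penalties is easiest to see with a single feature and no intercept, where both estimators have a closed form. The sketch below is an illustration under that simplifying assumption; the objective scaling is stated in the comments and differs from the scaling scikit-learn uses for `Ridge` and `Lasso`, so the penalty values are not directly comparable.

```python
import math


def ridge_coefficient(xs, ys, lam):
    # Minimizes 0.5 * sum((y - w*x)^2) + 0.5 * lam * w^2.
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def lasso_coefficient(xs, ys, lam):
    # Minimizes 0.5 * sum((y - w*x)^2) + lam * |w| via soft-thresholding.
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    shrunk = max(abs(sxy) - lam, 0.0)
    return math.copysign(shrunk, sxy) / sxx


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # exact relation y = 2x

print(ridge_coefficient(xs, ys, 0.0))   # 2.0 (no penalty)
print(ridge_coefficient(xs, ys, 14.0))  # 1.0 (shrunk, but not zero)
print(lasso_coefficient(xs, ys, 14.0))  # 1.0
print(lasso_coefficient(xs, ys, 30.0))  # 0.0 (coefficient removed)
```

Ridge shrinks the coefficient towards zero without reaching it for any finite penalty, whereas LASSO sets it exactly to zero once the penalty is large enough, which is what makes it usable for variable selection.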

Week 4: Classifier Evaluation & Support Vector Machines
This week, the first part delves deeper into the various evaluation metrics for classifiers. In the second part we cover the support vector machine, a well-known and more advanced classification model.
In the classifier evaluation part, we cover:

  • What a “good” accuracy means (e.g., naïve baselines/dummy classifiers)
  • The confusion matrix (false positive, false negative, costs)
  • ROC curves
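
The following sketch shows why accuracy needs a baseline. It counts the confusion-matrix entries for a binary problem and compares the classifier's accuracy with that of a dummy classifier that always predicts the majority class. The labels and predictions are invented toy data, and the helper name `confusion_counts` is an assumption for this example.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


# Imbalanced toy problem: eight negatives, two positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)

# Dummy baseline: always predict the most frequent class.
baseline_accuracy = Counter(y_true).most_common(1)[0][1] / len(y_true)

# One point on the ROC curve: (false positive rate, true positive rate).
true_positive_rate = tp / (tp + fn)
false_positive_rate = fp / (fp + tn)

print(tp, fp, fn, tn)               # 1 1 1 7
print(accuracy, baseline_accuracy)  # 0.8 0.8
```

Here an accuracy of 0.8 is no better than the naïve baseline, while the confusion matrix reveals that only half of the positives are found. A full ROC curve would repeat the last two computations for every decision threshold.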

In the support vector machine (SVM) part, we cover:

  • Basics of the SVM, the margin and the hard-margin SVM
  • The soft-margin SVM
  • Kernels
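
Two building blocks behind these topics can be written in a few lines. The soft-margin SVM is commonly formulated with the hinge loss, which penalizes points that are misclassified or fall inside the margin, and a widely used nonlinear kernel is the Gaussian (RBF) kernel. The sketch below is illustrative only; whether the course uses these exact formulations is an assumption, as are the function names and the default `gamma`.

```python
import math


def hinge_loss(label, score):
    """label is +1 or -1; score is the signed output of the decision function."""
    return max(0.0, 1.0 - label * score)


def rbf_kernel(x, z, gamma=1.0):
    # Similarity that decays with the squared Euclidean distance.
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


print(hinge_loss(+1, 2.0))  # 0.0: correct and outside the margin
print(hinge_loss(+1, 0.5))  # 0.5: correct but inside the margin
print(hinge_loss(-1, 0.5))  # 1.5: misclassified
print(rbf_kernel((0.0, 0.0), (0.0, 0.0)))  # 1.0: identical points
```

A hard-margin SVM corresponds to requiring a hinge loss of zero for every training point, which is only possible when the classes are separable.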

Week 5: Decision Trees & Final Project
In this last week we discuss decision trees and you will work on a more in-depth final project. Decision trees are simple and interpretable models that are very user-friendly. The final project will involve building a machine learning pipeline, including hyperparameter tuning and a careful and fair evaluation, to solve a small practical application (MNIST).
In the decision tree part, we cover:

  • Basics of decision trees and their terminology
  • How to train decision trees with CART
  • Overfitting and other pros and cons of decision trees
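
The core step of CART-style training for classification is to pick the split that makes the child nodes as pure as possible, typically measured with the Gini impurity. The sketch below performs this search for a single numeric feature. It is a simplified illustration: the helper names are hypothetical, and a real tree would repeat the search over all features and recurse into the child nodes.

```python
def gini(labels):
    # Gini impurity: 0.0 for a pure node, larger for mixed nodes.
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(xs, ys):
    """Return (threshold, weighted_gini) of the best split on one feature."""
    values = sorted(set(xs))
    best = None
    # Candidate thresholds are midpoints between consecutive feature values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


xs = [1.0, 2.0, 3.0, 4.0]
ys = [0, 0, 1, 1]

print(gini(ys))            # 0.5
print(best_split(xs, ys))  # (2.5, 0.0)
```

Because the search keeps splitting until nodes are pure unless it is stopped, an unconstrained tree can fit noise in the training set, which is the overfitting risk mentioned in the list above.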

Week 6: Wrap up
In this week, there will be extra time for questions regarding the earlier weeks and to discuss the final project.
