FUN

Machine learning in Python with scikit-learn (FUN)

Offered by INRIA

Build predictive models with scikit-learn and gain a practical understanding of the strengths and limitations of machine learning! Predictive modeling is a pillar of modern data science. In this field, scikit-learn is a central tool: it is easily accessible yet powerful, and it dovetails naturally with the wider ecosystem of data-science tools based on the Python programming language.

This course is an in-depth introduction to predictive modeling with scikit-learn. Step-by-step, didactic lessons introduce the fundamental methodological and software tools of machine learning. As such, the course is a stepping stone to more advanced challenges in artificial intelligence, text mining, or data science.

The course is more than a cookbook: it will teach you to be critical about each step in the design of a predictive modeling pipeline, from choices in data preprocessing, to choosing models, to gaining insights into their failure modes and interpreting their predictions.
The training is essentially practical, focusing on example applications with code executed by the participants.
The MOOC is completely free of charge. All the course materials are also available in a GitHub repository.
The authors of the course are scikit-learn core developers, and they will be your guides throughout the training!

What you will learn
At the end of this course, you will be able to:

Grasp the fundamental concepts of machine learning
Build a predictive modeling pipeline with scikit-learn
Develop intuitions behind machine learning models from linear models to gradient-boosted decision trees
Evaluate the statistical performance of your models
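
To give a flavor of these outcomes, here is a minimal sketch of a predictive modeling pipeline evaluated with cross-validation. It is not taken from the course materials: the synthetic dataset, the choice of a scaler followed by logistic regression, and the 5-fold setting are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real dataset (assumption for illustration).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Chain preprocessing and the model, so the scaler is fit on training folds only.
model = make_pipeline(StandardScaler(), LogisticRegression())

# 5-fold cross-validation; the default score for a classifier is accuracy.
scores = cross_val_score(model, X, y, cv=5)
print(f"Accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Wrapping the scaler and the classifier in a single pipeline keeps preprocessing inside each cross-validation fold, which avoids leaking information from the held-out data into the fitted model.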

Format
The course will cover practical aspects through the use of Jupyter notebooks and regular exercises. Throughout the course, we will highlight scikit-learn best practices and give you the intuition to use scikit-learn in a methodologically sound way.
Prerequisites
The course aims to be accessible without a strong technical background. The requirements for this course are:

basic knowledge of Python programming: defining variables, writing functions, importing modules
some prior experience with the NumPy, pandas and Matplotlib libraries is recommended but not required

Course plan

Introduction
Module 1. The Predictive Modeling Pipeline
Module 2. Selecting the best model
Module 3. Hyperparameter tuning
Module 4. Linear Models
Module 5. Decision tree models
Module 6. Ensemble of models
Module 7. Evaluating model performance

Related Courses

Coursera

Johns Hopkins University

Algorithms for DNA Sequencing (Coursera)

Statistics & Data Analysis Data Science

We will learn computational methods -- algorithms and data structures -- for analyzing DNA sequencing data. We will learn a little about DNA, genomics, and how DNA sequencing is used. We will use Python to implement key algorithms and data structures and to analyze real genomes and DNA sequencing datasets.

Jun 22nd 2026

4 Weeks

Python Algorithms DNA

Coursera

University of Illinois at Urbana-Champaign

Text Retrieval and Search Engines (Coursera)

Statistics & Data Analysis Data Science

Recent years have seen a dramatic growth of natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media such as blog articles, forum posts, product reviews, and tweets. Text data are unique in that they are usually generated directly by humans rather than a computer system or sensors, and are thus especially valuable for discovering knowledge about people’s opinions and preferences, in addition to many other kinds of knowledge that we encode in text.

Jun 22nd 2026

5-12 Weeks

Machine Learning Search Data Mining

FUN

Institut Mines-Telecom

Fondamentaux pour le Big Data (FUN)

CS: Software Engineering Sci: Mathematics

The MOOC "Fondamentaux pour le big data" (Fundamentals for Big Data) efficiently brings you up to the prerequisite level in computer science and statistics needed to follow training programs in the field of big data. Big data offers new job opportunities in companies and public administrations. Many training programs that prepare for these career opportunities exist. Following them requires basic knowledge of statistics and computer science, which this MOOC helps you acquire in the areas of analysis, algebra, probability, statistics, Python programming, and databases.

No sessions available

5-12 Weeks

Programming Python Statistics

FUN

CY Cergy Paris Université

S'initier à la Data Science et à ses enjeux (FUN)

CS: Programming Data Science

Data science for a changing world! Big data, and data analysis more generally, play an increasingly important role in the strategies of many organizations. Performance monitoring, behavior analysis, discovery of new market opportunities: the applications are many and span a variety of sectors. From e-commerce to finance, by way of transport and healthcare, companies need talent trained in collecting and storing data, as well as in processing and modeling it.

Self Paced

Self-Paced

Artificial Intelligence Machine Learning Big Data

Coursera

University of Toronto

Learn to Program: The Fundamentals (Coursera)

CS: Programming

Behind every mouse click and touch-screen tap, there is a computer program that makes things happen. This course introduces the fundamental building blocks of programming and teaches you how to write fun and useful programs using the Python language.

Jun 22nd 2026

5-12 Weeks

Programming Python Semantics

Coursera

University of Colorado Boulder

Predictive Modeling and Analytics (Coursera)

Statistics & Data Analysis Data Science

Welcome to the second course in the Data Analytics for Business specialization! This course will introduce you to some of the most widely used predictive modeling techniques and their core principles. By taking this course, you will form a solid foundation of predictive analytics, which refers to tools and techniques for building statistical or machine learning models to make predictions based on data. You will learn how to carry out exploratory data analysis to gain insights and prepare data for predictive modeling, an essential skill valued in the business.

Jun 22nd 2026

4 Weeks

Data Analysis Analytics Data Cleaning

Coursera

Stanford University

Probabilistic Graphical Models 2: Inference (Coursera)

Statistics & Data Analysis Data Science

Probabilistic graphical models (PGMs) are a rich framework for encoding probability distributions over complex domains: joint (multivariate) distributions over large numbers of random variables that interact with each other. These representations sit at the intersection of statistics and computer science, relying on concepts from probability theory, graph algorithms, machine learning, and more.

Jun 22nd 2026

5-12 Weeks

Machine Learning PGM Inference

Coursera

Johns Hopkins University

Python for Genomic Data Science (Coursera)

Statistics & Data Analysis Data Science

This class provides an introduction to the Python programming language and the iPython notebook. This is the third course in the Genomic Big Data Science Specialization from Johns Hopkins University.

Jun 22nd 2026

4 Weeks

Programming Python Big Data

Coursera

University of Washington

Machine Learning: Clustering & Retrieval (Coursera)

Statistics & Data Analysis Data Science

Case Studies: Finding Similar Documents. A reader is interested in a specific news article and you want to find similar articles to recommend. What is the right notion of similarity? Moreover, what if there are millions of other documents? Each time you want to retrieve a new document, do you need to search through all other documents? How do you group similar documents together? How do you discover new, emerging topics that the documents cover?

Jun 22nd 2026

5-12 Weeks

Machine Learning Clustering MapReduce

Coursera

Johns Hopkins University

Regression Models (Coursera)

Statistics & Data Analysis Data Science

Linear models, as their name implies, relate an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares and inference using regression models.

Jun 22nd 2026

4 Weeks

Statistics Regression Linear Regression

Coursera

Scrimba

Learn to code with AI (Coursera)

CS: Software Engineering

Imagine waking up tomorrow as a web developer. What would you want to build? With AI tools like ChatGPT, you're already a developer, regardless of your experience, if you know how to work with them. So in this course, you'll build functional, interactive front-end projects while learning how to write effective prompts and debug and refine your code with the help of AI.

Jun 24th 2026

2 Weeks

Programming Artificial Intelligence HTML

Coursera

DeepLearning.AI

Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization (Coursera)

Data Science

This course will teach you the "magic" of getting deep learning to work well. Rather than the deep learning process being a black box, you will understand what drives performance, and be able to more systematically get good results. You will also learn TensorFlow.

Jun 22nd 2026

3 Weeks

Algorithms Machine Learning Neural Networks