FUN

Machine learning in Python with scikit-learn (FUN)

Offered by INRIA
Build predictive models with scikit-learn and gain a practical understanding of the strengths and limitations of machine learning! Predictive modeling is a pillar of modern data science. In this field, scikit-learn is a central tool: it is easily accessible yet powerful, and it fits naturally into the wider ecosystem of data-science tools based on the Python programming language.

This course is an in-depth introduction to predictive modeling with scikit-learn. Step-by-step, didactic lessons introduce the fundamental methodological and software tools of machine learning. As such, the course is a stepping stone to more advanced challenges in artificial intelligence, text mining, or data science.

The course is more than a cookbook: it will teach you to be critical about each step in the design of a predictive modeling pipeline, from data preprocessing choices to model selection, understanding failure modes, and interpreting predictions.
The training is primarily hands-on, focusing on example applications with code that participants run themselves.
The MOOC is completely free of charge. All the course materials are also available in a GitHub repository.
The authors of the course are scikit-learn core developers; they will be your guides throughout the training!

What you will learn
At the end of this course, you will be able to:

  • Grasp the fundamental concepts of machine learning
  • Build a predictive modeling pipeline with scikit-learn
  • Develop intuitions behind machine learning models from linear models to gradient-boosted decision trees
  • Evaluate the statistical performance of your models
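
To give a flavor of the second and fourth outcomes above, here is a minimal sketch of a predictive modeling pipeline evaluated with cross-validation. It is not taken from the course materials: the synthetic dataset, the choice of `StandardScaler` with `LogisticRegression`, and the 5-fold setting are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real dataset (an assumption for illustration).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Chain preprocessing and a linear model so the scaler is fit on training folds only.
model = make_pipeline(StandardScaler(), LogisticRegression())

# 5-fold cross-validation returns one accuracy score per fold.
scores = cross_val_score(model, X, y, cv=5)
print(f"Accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Wrapping the scaler and the classifier in a single pipeline keeps the preprocessing inside each cross-validation fold, which avoids leaking information from the held-out data into the model.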

Format
The course will cover practical aspects through the use of Jupyter notebooks and regular exercises. Throughout the course, we will highlight scikit-learn best practices and give you the intuition to use scikit-learn in a methodologically sound way.
Prerequisites
The course aims to be accessible without a strong technical background. The requirements for this course are:

  • basic knowledge of Python programming: defining variables, writing functions, importing modules
  • some prior experience with the NumPy, pandas and Matplotlib libraries is recommended but not required

Course plan

Introduction
Module 1. The Predictive Modeling Pipeline
Module 2. Selecting the best model
Module 3. Hyperparameter tuning
Module 4. Linear Models
Module 5. Decision tree models
Module 6. Ensemble of models
Module 7. Evaluating model performance
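
As a rough illustration of how the tuning, decision tree, and evaluation topics in the plan above can fit together, the following sketch tunes the depth of a decision tree with a grid search and then scores it on held-out data. It is a hypothetical example rather than course content: the synthetic dataset, the `max_depth` grid, and the default train/test split are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data stands in for a real dataset (an assumption for illustration).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Hold out a test set that the hyperparameter search never sees.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Candidate tree depths to compare (an illustrative grid, not a recommendation).
param_grid = {"max_depth": [2, 4, 8]}
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5)
search.fit(X_train, y_train)

# Score the best model, refit on the full training set, on the held-out data.
test_score = search.score(X_test, y_test)
print(search.best_params_, f"test accuracy: {test_score:.3f}")
```

Keeping a separate test set means the reported accuracy is not biased by the search that selected `max_depth`.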

Related Courses

Text Retrieval and Search Engines (Coursera) Coursera
University of Illinois at Urbana-Champaign

Recent years have seen a dramatic growth of natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media such as blog articles, forum posts, product reviews, and tweets. Text data are unique in that they are usually generated directly by humans rather than a computer system or sensors, and are thus especially valuable for discovering knowledge about people’s opinions and preferences, in addition to many other kinds of knowledge that we encode in text.

Jun 22nd 2026
5-12 Weeks
Fondamentaux pour le Big Data (FUN) FUN
Institut Mines-Telecom

The MOOC "Fondamentaux pour le big data" (Big Data Fundamentals) efficiently brings you up to the prerequisite level in computer science and statistics needed to follow training programs in the field of big data. Big data offers new job opportunities in companies and public administrations, and many training programs prepare for these careers. Following them requires basic knowledge of statistics and computer science, which this MOOC helps you acquire in the areas of analysis, algebra, probability, statistics, Python programming, and databases.

No sessions available
5-12 Weeks
S'initier à la Data Science et à ses enjeux (FUN) FUN
CY Cergy Paris Université

Data science for a changing world! Big data, and data analysis more generally, play an increasingly important role in the strategies of many organizations. Performance monitoring, behavior analysis, discovery of new market opportunities: the applications are numerous and span a wide range of sectors. From e-commerce to finance, by way of transport and healthcare, companies need talent trained in collecting and storing data, as well as in processing and modeling it.

Self Paced
Self-Paced
Predictive Modeling and Analytics (Coursera) Coursera
University of Colorado Boulder

Welcome to the second course in the Data Analytics for Business specialization! This course will introduce you to some of the most widely used predictive modeling techniques and their core principles. By taking this course, you will form a solid foundation of predictive analytics, which refers to tools and techniques for building statistical or machine learning models to make predictions based on data. You will learn how to carry out exploratory data analysis to gain insights and prepare data for predictive modeling, an essential skill valued in the business.

Jun 22nd 2026
4 Weeks
Probabilistic Graphical Models 2: Inference (Coursera) Coursera
Stanford University

Probabilistic graphical models (PGMs) are a rich framework for encoding probability distributions over complex domains: joint (multivariate) distributions over large numbers of random variables that interact with each other. These representations sit at the intersection of statistics and computer science, relying on concepts from probability theory, graph algorithms, machine learning, and more.

Jun 22nd 2026
5-12 Weeks
Machine Learning: Clustering & Retrieval (Coursera) Coursera
University of Washington

Case Studies: Finding Similar Documents. A reader is interested in a specific news article and you want to find similar articles to recommend. What is the right notion of similarity? Moreover, what if there are millions of other documents? Each time you want to retrieve a new document, do you need to search through all other documents? How do you group similar documents together? How do you discover new, emerging topics that the documents cover?

Jun 22nd 2026
5-12 Weeks
Regression Models (Coursera) Coursera
Johns Hopkins University

Linear models, as their name implies, relate an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares and inference using regression models.

Jun 22nd 2026
4 Weeks
Learn to code with AI (Coursera) Coursera
Scrimba

Imagine waking up tomorrow as a web developer. What would you want to build? With AI tools like ChatGPT, you're already a developer, regardless of your experience, if you know how to work with them. So in this course, you'll build functional, interactive front-end projects while learning how to write effective prompts and debug and refine your code with the help of AI.

Jun 24th 2026
2 Weeks