Data Analysis with R (Coursera)

Offered by IBM

Welcome to Data Analysis with R. Now that you have a basic understanding of R programming language fundamentals, it is time to put that knowledge to work! The R programming language is purpose-built for data analysis. R is the key that opens the door between the problems that you want to solve with data and the answers you need to meet your objectives.

This course starts with a question and then walks you through the process of answering it through data. You will first learn important techniques for preparing (or wrangling) your data for analysis. You will then learn how to gain a better understanding of your data through exploratory data analysis, helping you to summarize your data and identify relevant relationships between variables that can lead to insights. Once your data is ready to analyze, you will learn how to develop your model and evaluate and tune its performance. By following this process, you can be sure that your data analysis performs to the standards that you have set, and you can have confidence in the results.
You will build hands-on experience by playing the role of a data analyst who is analyzing airline departure and arrival data to predict flight delays. Using an Airline Reporting Carrier On-Time Performance Dataset, you will practice reading data files, preprocessing data, creating models, improving models, and evaluating them to ultimately choose the best model.
Watch the videos, work through the labs, and add to your portfolio. Good luck!

What You Will Learn

  • Prepare data for analysis by handling missing values, formatting and normalizing data, binning, and turning categorical values into numeric values.
  • Conduct exploratory data analysis using descriptive statistics, data grouping, analysis of variance (ANOVA), and correlation statistics.
  • Develop a predictive model using various regression methods.
  • Evaluate a model for overfitting and underfitting conditions and tune its performance using regularization and grid search.

Syllabus

WEEK 1
Introduction to Data Analysis with R
All data analysis starts with a problem that you need to solve. Understanding your data, and the types of questions you can answer with it, is a key part of that. The R programming language gives you the tools you need to conduct powerful data analysis and acts as the conduit between your data and the real-world problems you want to solve.
In this module, you’ll review a type of problem that you can solve in R and the underlying data that forms the basis for your analysis. You’ll also learn about the R packages for data analysis, which provide a powerful set of tools that you’re likely to use in everyday data analyses. Finally, you’ll see how to import data and gain basic insights from the dataset.

WEEK 2
Data Wrangling
Data wrangling, or data pre-processing, is an essential first step to achieving accurate and complete analysis of your data. This process transforms your raw data into a format that can be easily categorized or mapped to other data, creating predictable relationships between them, and making it easier to build the models you need to answer questions about your data.
This module provides an introduction to data pre-processing in R and then provides you with the tools you need to identify and handle missing values in your dataset, transform data formats to align them with other data you may want to compare them to, normalize your data, create categories of information through data binning, and convert categorical variables into quantitative values that can then be used in numeric-based analyses.
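
The wrangling steps named above can be sketched in a few lines. The course itself works in R; the block below is only a language-neutral illustration in plain Python, and the function names, the sample delay values, and the carrier codes are all hypothetical rather than taken from the course dataset. It assumes mean imputation, min-max scaling, and equal-width bins, which are common choices but not necessarily the ones the labs use.

```python
from statistics import mean


def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    avg = mean(observed)
    return [avg if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly so the smallest is 0 and the largest is 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def bin_equal_width(values, labels):
    """Assign each value to one of len(labels) equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / len(labels)
    binned = []
    for v in values:
        # The maximum value would land one past the last bin, so clamp it.
        idx = min(int((v - lo) / width), len(labels) - 1)
        binned.append(labels[idx])
    return binned


def one_hot(categories):
    """Turn a categorical column into one 0/1 indicator column per level."""
    levels = sorted(set(categories))
    return {lvl: [1 if c == lvl else 0 for c in categories] for lvl in levels}


# Hypothetical arrival delays in minutes; None marks a missing value.
delays = [0, 30, None, 60, 10]
carriers = ["AA", "DL", "AA", "UA", "DL"]

filled = fill_missing(delays)
normalized = min_max_normalize(filled)
binned = bin_equal_width(filled, ["low", "medium", "high"])
indicators = one_hot(carriers)
```

Each helper returns a new list or dictionary instead of modifying its input, so the raw column stays available for comparison after every step.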

WEEK 3
Exploratory Data Analysis
Exploratory data analysis, or EDA, is an approach to analyzing data that summarizes its main characteristics and helps you gain a better understanding of the dataset, uncover relationships between different variables, and extract important variables for the problem you are trying to solve.
The main question you are trying to answer in this module is: "What causes flight delays?" In this module, you’ll learn some useful exploratory data analysis techniques that will help answer this question.
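
Two of the simplest exploratory techniques, grouping and correlation, can be sketched as follows. This is an illustration in plain Python, not the course's R code; the helper names and the four-row sample are hypothetical. Note that a correlation coefficient measures linear association between two variables and does not by itself establish a cause.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def group_means(keys, values):
    """Average the values within each group defined by keys."""
    groups = defaultdict(list)
    for k, v in zip(keys, values):
        groups[k].append(v)
    return {k: mean(vs) for k, vs in groups.items()}


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length columns."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical sample: carrier code, departure delay, arrival delay (minutes).
carriers = ["AA", "AA", "DL", "DL"]
dep_delay = [10, 20, 30, 40]
arr_delay = [12, 22, 32, 42]

mean_arrival_by_carrier = group_means(carriers, arr_delay)
r = pearson(dep_delay, arr_delay)
```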

WEEK 4
Model Development in R
You have identified the problem that you’re trying to solve and have pre-processed the dataset you’ll use in your analysis, and you have conducted some exploratory data analysis to answer some of your initial questions. Now, it’s time to develop your model and assess the strength of your assumptions.
In this module, you will examine model development by trying to predict the arrival delay of a flight using the Airline dataset. You’ll learn regression techniques for determining the correlation between variables in your dataset, and evaluate the result both visually and through the calculation of metrics.
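
As a rough illustration of the idea, the sketch below fits a simple linear regression of arrival delay on departure delay by ordinary least squares and then computes two common evaluation metrics, mean squared error and R-squared. It is written in plain Python instead of the course's R, the data are made up, and the choice of departure delay as the single predictor is an assumption made only for the example.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def mse(actual, predicted):
    """Mean squared error between observed and predicted values."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))


def r_squared(actual, predicted):
    """Share of the variance in actual that the predictions explain."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical delays in minutes.
dep_delay = [0, 10, 20, 30]
arr_delay = [2, 11, 24, 31]

slope, intercept = fit_line(dep_delay, arr_delay)
predicted = [intercept + slope * d for d in dep_delay]
error = mse(arr_delay, predicted)
fit_quality = r_squared(arr_delay, predicted)
```

Plotting the residuals (observed minus predicted) against the predictor is the usual visual counterpart to these two numbers.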

WEEK 5
Model Evaluation
You have a firm understanding of your data and have pre-processed it to ensure the best possible outcomes. And you have conducted exploratory data analysis and developed your model. Everything looks good so far, but how can you be certain your model works in the real world and performs optimally?
In this module, you’ll learn how to use the tidymodels framework to evaluate your model. Tidymodels is a collection of packages for modeling and machine learning using tidyverse principles. Using these packages, you’ll learn how to cross-validate your models, identify potential problems, like overfitting and underfitting, and handle overfitting problems using a technique called regularization. You’ll also learn how to tune your models using grid search.
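
The three ideas in this module fit together as shown in the sketch below: regularization adds a penalty that shrinks the model's coefficients, cross-validation estimates error on data the model was not fitted to, and grid search tries each candidate penalty and keeps the one with the lowest cross-validated error. This is plain Python for illustration and does not reproduce the tidymodels API. The function names, the data, and the candidate penalties are hypothetical, and ridge regression with a single predictor is assumed as the regularized model.

```python
from statistics import mean


def fit_ridge(x, y, penalty):
    """One-predictor ridge regression; the intercept is not penalized."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / (sxx + penalty)  # penalty = 0 gives ordinary least squares
    return slope, my - slope * mx


def cv_error(x, y, penalty, k):
    """Mean validation MSE over k folds; fold i holds out rows with index % k == i."""
    fold_errors = []
    for i in range(k):
        train = [j for j in range(len(x)) if j % k != i]
        valid = [j for j in range(len(x)) if j % k == i]
        slope, intercept = fit_ridge([x[j] for j in train],
                                     [y[j] for j in train], penalty)
        fold_errors.append(
            mean((y[j] - (intercept + slope * x[j])) ** 2 for j in valid))
    return mean(fold_errors)


def grid_search(x, y, penalties, k=3):
    """Score every candidate penalty and return the best one with all scores."""
    scores = {p: cv_error(x, y, p, k) for p in penalties}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical departure and arrival delays in minutes.
dep_delay = [0, 10, 20, 30, 40, 50]
arr_delay = [3, 11, 24, 30, 43, 52]
candidates = [0, 1, 10, 100, 1000]

best_penalty, scores = grid_search(dep_delay, arr_delay, candidates)
```

A model that scores well on its training rows but poorly on the held-out folds is overfitting; a very large penalty pushes the slope toward zero and can cause the opposite problem, underfitting.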

WEEK 6
Project


Related Courses

Practical Machine Learning (Coursera) Coursera
Johns Hopkins University

One of the most common tasks performed by data scientists and data analysts is prediction and machine learning. This course will cover the basic components of building and applying prediction functions with an emphasis on practical applications. The course will provide basic grounding in concepts such as training and test sets, overfitting, and error rates.

Jun 15th 2026
4 Weeks
Mastering Data Analysis in Excel (Coursera) Coursera
Duke University

Important: The focus of this course is on math - specifically, data-analysis concepts and methods - not on Excel for its own sake. We use Excel to do our calculations, and all math formulas are given as Excel Spreadsheets, but we do not attempt to cover Excel Macros, Visual Basic, Pivot Tables, or other intermediate-to-advanced Excel functionality. This course will prepare you to design and implement realistic predictive models based on data. In the Final Project (module 6) you will assume the role of a business data analyst for a bank, and develop two different predictive models to determine which applicants for credit cards should be accepted and which rejected. Your first model will focus on minimizing default risk, and your second on maximizing bank profits.

Jun 15th 2026
5-12 Weeks
Machine Learning Foundations: A Case Study Approach (Coursera) Coursera
University of Washington

Do you have data and wonder what it can tell you? Do you need a deeper understanding of the core ways in which machine learning can improve your business? Do you want to be able to converse with specialists about anything from regression and classification to deep learning and recommender systems? In this course, you will get hands-on experience with machine learning from a series of practical case-studies.

Jun 15th 2026
5-12 Weeks
Reproducible Research (Coursera) Coursera
Johns Hopkins University

This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations.

Jun 15th 2026
4 Weeks
Analyze Data to Answer Questions (Coursera) Coursera
Google

This is the fifth course in the Google Data Analytics Certificate. These courses will equip you with the skills needed to apply to introductory-level data analyst jobs. In this course, you’ll explore the “analyze” phase of the data analysis process. You’ll take what you’ve learned to this point and apply it to your analysis to make sense of the data you’ve collected. You’ll learn how to organize and format your data using spreadsheets and SQL to help you look at and think about your data in different ways. You’ll also find out how to perform complex calculations on your data to complete business objectives.

Jun 16th 2026
4 Weeks
Data Science in Real Life (Coursera) Coursera
Johns Hopkins University

Have you ever had the perfect data science experience? The data pull went perfectly. There were no merging errors or missing data. Hypotheses were clearly defined prior to analyses. Randomization was performed for the treatment of interest. The analytic plan was outlined prior to analysis and followed exactly. The conclusions were clear and actionable decisions were obvious. Has that ever happened to you? Of course not. Data analysis in real life is messy. How does one manage a team facing real data analyses? In this one-week course, we contrast the ideal with what happens in real life. By contrasting the two, you will learn key concepts that will help you manage real-life analyses.

Jun 15th 2026
1 Week
Foundations of marketing analytics (Coursera) Coursera
ESSEC Business School

Who is this course for? This course is designed for students, business analysts, and data scientists who want to apply statistical knowledge and techniques to business contexts. For example, it may be suited to experienced statisticians, analysts, and engineers who want to move into a business role, in particular in marketing. You will find this course exciting and rewarding if you already have a background in statistics, can use R or another programming language, and are familiar with databases and data analysis techniques such as regression, classification, and clustering. It also contains a number of recitals and RStudio tutorials that will consolidate your skills and let you work more freely with data and explore new features and statistical functions in R.

Jun 15th 2026
5-12 Weeks
The Data Scientist's Toolbox (Coursera) Coursera
Johns Hopkins University

In this course you will get an introduction to the main tools and ideas in the data scientist's toolbox. The course gives an overview of the data, questions, and tools that data analysts and data scientists work with. There are two components to this course. The first is a conceptual introduction to the ideas behind turning data into actionable knowledge. The second is a practical introduction to the tools that will be used in the program like version control, markdown, git, GitHub, R, and RStudio.

Jun 15th 2026
4 Weeks
Share Data Through the Art of Visualization (Coursera) Coursera
Google

This is the sixth course in the Google Data Analytics Certificate. These courses will equip you with the skills needed to apply to introductory-level data analyst jobs. You’ll learn how to visualize and present your data findings as you complete the data analysis process. This course will show you how data visualizations, such as visual dashboards, can help bring your data to life. You’ll also explore Tableau, a data visualization platform that will help you create effective visualizations for your presentations.

Jun 16th 2026
4 Weeks
Fundamentals of GIS (Coursera) Coursera
University of California, Davis

Explore the world of spatial analysis and cartography with geographic information systems (GIS). What you will learn: define core geospatial concepts; practice with subset data using selections and feature attributes; create map books using advanced mapping techniques; create layer and map packages.

Jun 15th 2026
4 Weeks
Big Data Integration and Processing (Coursera) Coursera
University of California, San Diego

At the end of the course, you will be able to: Retrieve data from example database and big data management systems; Describe the connections between data management operations and the big data processing patterns needed to utilize them in large-scale analytical applications; Identify when a big data problem needs data integration; Execute simple big data integration and processing on Hadoop and Spark platforms.

Jun 15th 2026
5-12 Weeks