Data Analysis with R (Udacity)

Offered by Udacity, Facebook

Visually Analyze and Summarize Data Sets. Exploratory data analysis is an approach for summarizing and visualizing the important characteristics of a data set. Promoted by John Tukey, exploratory data analysis focuses on exploring data to understand the data’s underlying structure and variables, to develop intuition about the data set, to consider how that data set came into existence, and to decide how it can be investigated with more formal statistical methods.

If you're interested in supplemental reading material for the course, check out the Exploratory Data Analysis book (not required).
This course is also a part of our Data Analyst Nanodegree.

What You Will Learn

Lesson 1
What is EDA?

  • Start by learning what exploratory data analysis (EDA) is and why it is important.

Lesson 2
R Basics

  • EDA, which comes before formal hypothesis testing and modeling, makes use of visual methods to analyze and summarize data sets.
  • R will be our tool for generating those visuals and conducting analyses.
  • We will install RStudio and packages, learn the layout and basic commands of R, practice writing basic R scripts, and inspect data sets.

Lesson 3
Explore One Variable

  • Perform EDA to understand the distribution of a variable and to check for anomalies and outliers.
  • Learn how to quantify and visualize individual variables within a data set to make sense of a pseudo-data set of Facebook users.
  • Create histograms and boxplots, transform variables, and examine tradeoffs in visualizations.
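
The one-variable steps named in this lesson can be sketched in a few lines. The course itself works in R; the sketch below uses Python's standard library purely for illustration, and the `friend_counts` values and helper names are made up for this example, not taken from the course's Facebook data set. It computes the five numbers a boxplot draws, counts values per histogram bin, flags outliers with the common 1.5 × IQR rule (one convention among several), and applies a log transform to a long-tailed variable.

```python
import math
import statistics
from collections import Counter

# Hypothetical, made-up friend counts; NOT data from the course.
friend_counts = [1, 3, 5, 8, 12, 20, 35, 60, 150, 900]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a boxplot displays."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def histogram_counts(values, bin_width):
    """Count how many values fall into each bin [start, start + bin_width)."""
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def iqr_outliers(values):
    """Flag values beyond 1.5 * IQR from the quartiles (a common convention)."""
    _, q1, _, q3, _ = five_number_summary(values)
    spread = 1.5 * (q3 - q1)
    return [v for v in values if v < q1 - spread or v > q3 + spread]


# A log10 transform compresses the long right tail of the counts.
log_counts = [math.log10(v) for v in friend_counts]

print(five_number_summary(friend_counts))
print(histogram_counts(friend_counts, bin_width=50))
print(iqr_outliers(friend_counts))
```

Changing `bin_width` shows the tradeoff the lesson mentions: wide bins hide detail near zero, while narrow bins leave most bins empty for skewed data.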

Lesson 4
Explore Two Variables

  • EDA allows us to identify the most important variables and relationships within a data set before building predictive models.
  • Learn techniques for exploring the relationship between any two variables in a data set.
  • Create scatter plots, calculate correlations, and investigate conditional means.
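
As a rough illustration of two of the techniques listed for this lesson, the sketch below computes a Pearson correlation and a set of conditional means (the mean of one variable for each value of another). The course itself uses R; this is Python's standard library, and the `ages` and `friends` values and both function names are hypothetical, chosen only for this example.

```python
import math
from collections import defaultdict


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def conditional_means(xs, ys):
    """Mean of y for each distinct value of x, ordered by x."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    return {x: sum(v) / len(v) for x, v in sorted(groups.items())}


# Hypothetical, made-up values; NOT data from the course.
ages = [13, 13, 14, 14, 15, 15]
friends = [100, 200, 150, 250, 300, 400]

print(pearson_r(ages, friends))
print(conditional_means(ages, friends))
```

Plotting the conditional means against the grouping variable gives a smoothed view of a scatter plot, which is one way to see a trend that individual points obscure.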

Lesson 5
Explore Many Variables

  • Learn powerful methods and visualizations for examining relationships among multiple variables.
  • Learn how to reshape data frames and how to use aesthetics like color and shape to uncover more information.
  • Continue to build intuition around the Facebook data set and explore some new data sets as well.

Lesson 6
Diamonds and Price Predictions

  • Investigate the diamonds data set alongside Facebook Data Scientist Solomon Messing.
  • See how predictive modeling can allow us to determine a good price for a diamond.
  • As a final project

Prerequisites and Requirements
A background in statistics is helpful but not required. Consider taking Intro to Descriptive Statistics prior to taking this course. Relevant topics include:

  • Mean, median, mode
  • Normal, uniform, and skewed distributions
  • Histograms and box plots

Familiarity with the following CS and Math topics will help students:

  • Variable assignment
  • Comparison and logical operators (<, >, <=, >=, ==, &, |)
  • If else statements
  • Square roots, logarithms, and exponentials

Why Take This Course
You will...

  • Understand data analysis via EDA as a journey and a way to explore data
  • Explore data at multiple levels using appropriate visualizations
  • Acquire statistical knowledge for summarizing data
  • Demonstrate curiosity and skepticism when performing data analysis
  • Develop intuition around a data set and understand how the data was generated.

Related Courses

Regression Models (Coursera)
Johns Hopkins University

Linear models, as their name implies, relate an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares, and inference using regression models.

Jun 8th 2026
4 Weeks

Statistical Inference (Coursera)
Johns Hopkins University

Statistical inference is the process of drawing conclusions about populations or scientific truths from data. There are many modes of performing inference, including statistical modeling, data-oriented strategies, and explicit use of designs and randomization in analyses. Furthermore, there are broad theories (frequentist, Bayesian, likelihood, design-based, …) and numerous complexities (missing data, observed and unobserved confounding, biases) for performing inference.

Jun 8th 2026
4 Weeks

Data Visualization and D3.js (Udacity)
Udacity, Zipfian Academy

Communicating with Data. Learn the fundamentals of data visualization and practice communicating with data. This course covers how to apply design principles, human perception, color theory, and effective storytelling to data visualization. If you present data to others, aspire to be an analyst or data scientist, or if you’d like to become more technical with visualization tools, then you can grow your skills with this course.

Self-Paced

Real-Time Analytics with Apache Storm (Udacity)
Udacity, Twitter

The world is trending in real time! Learn from Twitter to scalably process tweets, or any big data stream, in real time to drive D3 visualizations using Apache Storm, the "Hadoop of Real Time." Storm is free, open source, and fun to use! Learn from Karthik Ramasamy about the distributed, fault-tolerant, and flexible technology used to power Twitter’s real-time data flow pipeline. Twitter open sourced Storm in 2011, and it graduated to a top-level Apache project in September 2014.

Self-Paced

Introduction to Machine Learning Course (Udacity)
Udacity

This class will teach you the end-to-end process of investigating data through a machine learning lens. Learn online, with Udacity. Machine Learning is a first-class ticket to the most exciting careers in data analysis today. As data sources proliferate along with the computing power to process them, going straight to the data is one of the most straightforward ways to quickly gain insights and make predictions.

Self-Paced

Data Wrangling with MongoDB (Udacity)
Udacity, MongoDB University

In this course, we will explore how to wrangle data from diverse sources and shape it to enable data-driven applications. Some data scientists spend the bulk of their time doing this! Students will learn how to gather and extract data from widely used data formats. They will learn how to assess the quality of data and explore best practices for data cleaning. We will also introduce students to MongoDB, covering the essentials of storing data and the MongoDB query language together with exploratory analysis using the MongoDB aggregation framework.

Self-Paced

Introduction to Spreadsheets and Models (Coursera)
University of Pennsylvania

The simple spreadsheet is one of the most powerful data analysis tools that exist, and it’s available to almost anyone. Major corporations and small businesses alike use spreadsheet models to determine where key measures of their success are now, and where they are likely to be in the future. But in order to get the most out of a spreadsheet, you have to know how to use it. This course is designed to give you an introduction to basic spreadsheet tools and formulas so that you can begin to harness the power of spreadsheets to map the data you have now and to predict the data you may have in the future.

Jun 8th 2026
4 Weeks

Text Retrieval and Search Engines (Coursera)
University of Illinois at Urbana-Champaign

Recent years have seen a dramatic growth of natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media such as blog articles, forum posts, product reviews, and tweets. Text data are unique in that they are usually generated directly by humans rather than a computer system or sensors, and are thus especially valuable for discovering knowledge about people’s opinions and preferences, in addition to many other kinds of knowledge that we encode in text.

Jun 8th 2026
5-12 Weeks

Spark (Udacity)
Udacity, Insight

Master how to work with big data and build machine learning models at scale using Spark! In this course, you’ll learn how to use Spark to work with big data and build machine learning models at scale, including how to wrangle and model massive datasets with PySpark, the Python library for interacting with Spark. In the first lesson, you will learn about big data and how Spark fits into the big data ecosystem. In lesson two, you will be practicing processing and cleaning datasets to get comfortable with Spark’s SQL and dataframe APIs. In the third lesson, you will debug and optimize your Spark code when running on a cluster. In lesson four, you will use Spark’s Machine Learning Library to train machine learning models at scale.

Self-Paced