Data Wrangling, Analysis and AB Testing with SQL (Coursera)

Data Wrangling, Analysis and AB Testing with SQL (Coursera)

This course allows you to apply the SQL skills taught in “SQL for Data Science” to four increasingly complex and authentic data science inquiry case studies. We'll learn how to convert timestamps of all types to common formats and perform date/time calculations. We'll select and perform the optimal JOIN for a data science inquiry and clean data within an analysis dataset by deduping, running quality checks, backfilling, and handling nulls.

Class Deals by MOOC List - Click here and see Coursera's Active Discounts, Deals, and Promo Codes.

We'll learn how to segment and analyze data per segment using windowing functions and use case statements to execute conditional logic to address a data science inquiry. We'll also describe how to convert a query into a scheduled job and how to insert data into a date partition. Finally, given a predictive analysis need, we'll engineer a feature from raw data using the tools and skills we've built over the course. The real-world application of these skills will give you the framework for performing the analysis of an AB test.
Course 2 of 4 in the Learn SQL Basics for Data Science Specialization.

What You Will Learn

  • Validate and clean a dataset
  • Assess and create datasets to answer your questions
  • Solve problems using SQL
  • Build a simple testing framework to touch on AB Testing

Syllabus

WEEK 1
Data of Unknown Quality
In this module, you will be able to create trustworthy analysis from a new set of data. You will be able to coalesce some nulls and identify unreliable data and discover reasons why data might be missing. You will also be able to answer ambiguous questions by defining new metrics.

WEEK 2
Creating Clean Datasets
In this module, you will be able to name the main the categories of data types. You will be able to explain how the unfiltered data can be manipulated into a table where you can conduct data analysis. You will be able to discuss why a data warehouse is separate from a production database, and you will be able to use the tools you learned to create your own trustworthy tables.

WEEK 3
SQL Problem Solving
In this module, you will be able to map out your joins and be able to highlight the level of detail needed for different kinds of questions. You will be able to practice answering data questions, which should help you feel ready to get asked a whole slough of questions, vague questions, ambiguous questions, or even poorly worded questions. Finally, you will develop a strategy for answering all those questions using data.

WEEK 4
Case Study: AB Testing
In this module, you will be able to use your SQL skills to set up a basic AB testing system. You will be able to apply hypothesis testing to prove or disprove a hypothesis about how user behavior changed. You will be able to test and interpret the results using a metric or metrics that are tied directly to some business metrics. You will be able to test your SQL skills and give you the base experience you need to learn anything more complicated in terms of AB testing in the future.

Go to Class
MOOC List is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Related Courses

Pattern Discovery in Data Mining (Coursera) Coursera
University of Illinois at Urbana-Champaign

Pattern Discovery in Data Mining (Coursera)

Learn the general concepts of data mining along with basic methodologies and applications. Then dive into one subfield in data mining: pattern discovery. Learn in-depth concepts, methods, and applications of pattern discovery in data mining. We will also introduce methods for data-driven phrase mining and some interesting applications of pattern discovery. This course provides you the opportunity to learn skills and content to practice and engage in scalable pattern discovery methods on massive transactional data, discuss pattern evaluation measures, and study methods for mining diverse kinds of patterns, sequential patterns, and sub-graph patterns.

Jun 8th 2026
4 Weeks
Business Intelligence Concepts, Tools, and Applications (Coursera) Coursera
University of Colorado System

Business Intelligence Concepts, Tools, and Applications (Coursera)

This is the fourth course in the Data Warehouse for Business Intelligence specialization. Ideally, the courses should be taken in sequence. In this course, you will gain the knowledge and skills for using data warehouses for business intelligence purposes and for working as a business intelligence developer. You’ll have the opportunity to work with large data sets in a data warehouse environment and will learn the use of MicroStrategy's Online Analytical Processing (OLAP) and Visualization capabilities to create visualizations and dashboards.

Jun 8th 2026
5-12 Weeks
Graph Analytics for Big Data (Coursera) Coursera
University of California, San Diego

Graph Analytics for Big Data (Coursera)

Want to understand your data network structure and how it changes under different conditions? Curious to know how to identify closely interacting clusters within a graph? Have you heard of the fast-growing area of graph analytics and want to learn more? This course gives you a broad overview of the field of graph analytics so you can learn new ways to model, store, retrieve and analyze graph-structured data.

Jun 8th 2026
5-12 Weeks
Cloud Computing Applications, Part 2: Big Data and Applications in the Cloud (Coursera) Coursera
University of Illinois at Urbana-Champaign

Cloud Computing Applications, Part 2: Big Data and Applications in the Cloud (Coursera)

Welcome to the Cloud Computing Applications course, the second part of a two-course series designed to give you a comprehensive view on the world of Cloud Computing and Big Data! In this second course we continue Cloud Computing Applications by exploring how the Cloud opens up data analytics of huge volumes of data that are static or streamed at high velocity and represent an enormous variety of information. Cloud applications and data analytics represent a disruptive change in the ways that society is informed by, and uses information.

Jun 8th 2026
4 Weeks
Text Retrieval and Search Engines (Coursera) Coursera
University of Illinois at Urbana-Champaign

Text Retrieval and Search Engines (Coursera)

Recent years have seen a dramatic growth of natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media such as blog articles, forum posts, product reviews, and tweets. Text data are unique in that they are usually generated directly by humans rather than a computer system or sensors, and are thus especially valuable for discovering knowledge about people’s opinions and preferences, in addition to many other kinds of knowledge that we encode in text.

Jun 8th 2026
5-12 Weeks
Customer Analytics (Coursera) Coursera
University of Pennsylvania

Customer Analytics (Coursera)

Data about our browsing and buying patterns are everywhere. From credit card transactions and online shopping carts, to customer loyalty programs and user-generated ratings/reviews, there is a staggering amount of data that can be used to describe our past buying behaviors, predict future ones, and prescribe new ways to influence future purchasing decisions. In this brand new course, four of Wharton’s top marketing professors will dive deeper into the key areas of customer analytics: descriptive analytics, predictive analytics, prescriptive analytics, and their application to real-world business practices including Amazon, Google, and Starbucks to name a few.

Jun 8th 2026
5-12 Weeks
Bioinformatic Methods II (Coursera) Coursera
University of Toronto

Bioinformatic Methods II (Coursera)

Large-scale biology projects such as the sequencing of the human genome and gene expression surveys using RNA-seq, microarrays and other technologies have created a wealth of data for biologists. However, the challenge facing scientists is analyzing and even accessing these data to extract useful information pertaining to the system being studied. This course focuses on employing existing bioinformatic resources – mainly web-based programs and databases – to access the wealth of data to answer questions relevant to the average biologist, and is highly hands-on.

Jun 8th 2026
5-12 Weeks
The Structured Query Language (SQL) (Coursera) Coursera
University of Colorado Boulder

The Structured Query Language (SQL) (Coursera)

In this course you will learn all about the Structured Query Language ("SQL".) We will review the origins of the language and its conceptual foundations. But primarily, we will focus on learning all the standard SQL commands, their syntax, and how to use these commands to conduct analysis of the data within a relational database. Our scope includes not only the SELECT statement for retrieving data and creating analytical reports, but also includes the DDL ("Data Definition Language") and DML ("Data Manipulation Language") commands necessary to create and maintain database objects.

Jun 9th 2026
5-12 Weeks
Statistical Inference (Coursera) Coursera
Johns Hopkins University

Statistical Inference (Coursera)

Statistical inference is the process of drawing conclusions about populations or scientific truths from data. There are many modes of performing inference including statistical modeling, data oriented strategies and explicit use of designs and randomization in analyses. Furthermore, there are broad theories (frequentists, Bayesian, likelihood, design based, …) and numerous complexities (missing data, observed and unobserved confounding, biases) for performing inference.

Jun 8th 2026
4 Weeks
Machine Learning Foundations: A Case Study Approach (Coursera) Coursera
University of Washington

Machine Learning Foundations: A Case Study Approach (Coursera)

Do you have data and wonder what it can tell you? Do you need a deeper understanding of the core ways in which machine learning can improve your business? Do you want to be able to converse with specialists about anything from regression and classification to deep learning and recommender systems? In this course, you will get hands-on experience with machine learning from a series of practical case-studies.

Jun 8th 2026
5-12 Weeks
Reproducible Research (Coursera) Coursera
Johns Hopkins University

Reproducible Research (Coursera)

This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations.

Jun 8th 2026
4 Weeks
Social Network Analysis (Coursera) Coursera
University of California, Davis

Social Network Analysis (Coursera)

This course is designed to quite literally ‘make a science’ out of something at the heart of society: social networks. Humans are natural network scientists, as we compute new network configurations all the time, almost unaware, when thinking about friends and family (which are particular forms of social networks), about colleagues and organizational relations (other, overlapping network structures), and about how to navigate delicate or opportunistic network configurations to save guard or advance in our social standing (with society being one big social network itself).

Jun 8th 2026
5-12 Weeks