Process Data from Dirty to Clean (Coursera)

Offered by Google,
Process Data from Dirty to Clean (Coursera)

This is the fourth course in the Google Data Analytics Certificate. These courses will equip you with the skills needed to apply to introductory-level data analyst jobs. In this course, you’ll continue to build your understanding of data analytics and the concepts and tools that data analysts use in their work. You’ll learn how to check and clean your data using spreadsheets and SQL as well as how to verify and report your data cleaning results. Current Google data analysts will continue to instruct and provide you with hands-on ways to accomplish common data analyst tasks with the best tools and resources.

Class Deals by MOOC List - Click here and see Coursera's Active Discounts, Deals, and Promo Codes.

Learners who complete this certificate program will be equipped to apply for introductory-level jobs as data analysts. No previous experience is necessary.
By the end of this course, you will be able to do the following:

  • Learn how to check for data integrity.
  • Discover data cleaning techniques using spreadsheets.
  • Develop basic SQL queries for use on databases.
  • Apply basic SQL functions for cleaning and transforming data.
  • Gain an understanding of how to verify the results of cleaning data.
  • Explore the elements and importance of data cleaning reports.

Course 4 of 8 in the Google Data Analytics Professional Certificate.
What You Will Learn

  • Define data integrity with reference to types of integrity and risk to data integrity
  • Apply basic SQL functions for use in cleaning string variables in a database
  • Develop basic SQL queries for use on databases
  • Describe the process involved in verifying the results of cleaning data

Syllabus

WEEK 1
Before you clean, check for integrity
As you start thinking about how to prepare your data for exploration, this part of the course will highlight why data integrity is so essential to successful decision-making. You’ll learn about how data is generated and the techniques analysts use to decide what data to collect for analysis. And you'll discover structured and unstructured data, data types, and data formats.

WEEK 2
All about clean data
Every data analyst wants clean data to work with when performing an analysis. In this part of the course, you’ll learn the difference between clean and dirty data. You’ll also explore data cleaning techniques using spreadsheets and other tools.

WEEK 3
Cleaning data in SQL
Knowing a variety of ways to clean data can make an analyst’s job much easier. In this part of the course, you’ll check out how to clean your data using SQL. You’ll explore queries and functions that you can use in SQL to clean and transform your data to get it ready for analysis.

WEEK 4
Verify and report your cleaning results
Cleaning your data is an important step in the data analysis process. Verifying and reporting your cleaning is a way to show that your data is ready for the next step. In this part of the course, you'll find out the processes involved with verifying and reporting data cleaning as well as their benefits.

WEEK 5
Optional: Adding data to your resume
Creating an effective resume will help you on your data analytics career path. In this part of the course, you’ll learn all about the job application process with a focus on crafting a resume that highlights your strengths and applicable experience. Even if you aren't applying to jobs yet, it's still a good time to improve your resume. It's like spring training for a first season in a major league–you don't want to miss it!

WEEK 6
Course challenge
Prepare for the course challenge by reviewing terms and definitions in the glossary. Then, demonstrate your knowledge of the importance of sample size, data integrity, and the connection of data to business objectives during the quiz. You will also have an opportunity to apply your skill with data cleaning techniques in both spreadsheets and SQL. Finally, document, report on, and verify your data-cleaning process and results.

Go to Class
MOOC List is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Related Courses

Interprofessional Healthcare Informatics (Coursera) Coursera
University of Minnesota

Interprofessional Healthcare Informatics (Coursera)

Interprofessional Healthcare Informatics is a graduate-level, hands-on interactive exploration of real informatics tools and techniques offered by the University of Minnesota and the University of Minnesota's National Center for Interprofessional Practice and Education. We will be incorporating technology-enabled educational innovations to bring the subject matter to life. Over the 10 modules, we will create a vital online learning community and a working healthcare informatics network.

Jun 22nd 2026
5-12 Weeks
Foundations of strategic business analytics (Coursera) Coursera
ESSEC Business School

Foundations of strategic business analytics (Coursera)

Who is this course for? This course is designed for students, business analysts, and data scientists who want to apply statistical knowledge and techniques to business contexts. For example, it may be suited to experienced statisticians, analysts, engineers who want to move more into a business role. You will find this course exciting and rewarding if you already have a background in statistics, can use R or another programming language and are familiar with databases and data analysis techniques such as regression, classification, and clustering.

Jun 22nd 2026
4 Weeks
Introduction to Probability and Data with R (Coursera) Coursera
Duke University

Introduction to Probability and Data with R (Coursera)

This course introduces you to sampling and exploring data, as well as basic probability theory and Bayes' rule. You will examine various types of sampling methods, and discuss how such methods can impact the scope of inference. A variety of exploratory data analysis techniques will be covered, including numeric summary statistics and basic data visualization.

Jun 22nd 2026
5-12 Weeks
Framework for Data Collection and Analysis (Coursera) Coursera
University of Maryland, College Park

Framework for Data Collection and Analysis (Coursera)

This course will provide you with an overview over existing data products and a good understanding of the data collection landscape. With the help of various examples you will learn how to identify which data sources likely matches your research question, how to turn your research question into measurable pieces, and how to think about an analysis plan.

Jun 22nd 2026
4 Weeks
Effective Problem-Solving and Decision-Making (Coursera) Coursera
University of California, Irvine

Effective Problem-Solving and Decision-Making (Coursera)

Critical thinking – the application of scientific methods and logical reasoning to problems and decisions – is the foundation of effective problem solving and decision making. Critical thinking enables us to avoid common obstacles, test our beliefs and assumptions, and correct distortions in our thought processes. Gain confidence in assessing problems accurately, evaluating alternative solutions, and anticipating likely risks. Learn how to use analysis, synthesis, and positive inquiry to address individual and organizational problems and develop the critical thinking skills needed in today’s turbulent times. Using case studies and situations encountered by class members, explore successful models and proven methods that are readily transferable on-the-job.

Jun 22nd 2026
4 Weeks
Exploratory Data Analysis (Coursera) Coursera
Johns Hopkins University

Exploratory Data Analysis (Coursera)

This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data.

Jun 22nd 2026
4 Weeks
Managing Big Data with MySQL (Coursera) Coursera
Duke University

Managing Big Data with MySQL (Coursera)

This course is an introduction to how to use relational databases in business analysis. You will learn how relational databases work, and how to use entity-relationship diagrams to display the structure of the data held within them. This knowledge will help you understand how data needs to be collected in business contexts, and help you identify features you want to consider if you are involved in implementing new data collection efforts.

Jun 22nd 2026
5-12 Weeks
Big Data Modeling and Management Systems (Coursera) Coursera
University of California, San Diego

Big Data Modeling and Management Systems (Coursera)

Once you’ve identified a big data issue to analyze, how do you collect, store and organize your data using Big Data solutions? In this course, you will experience various data genres and management tools appropriate for each. You will be able to describe the reasons behind the evolving plethora of new big data platforms from the perspective of big data management systems and analytical tools.

Jun 22nd 2026
5-12 Weeks
Relational Database Support for Data Warehouses (Coursera) Coursera
University of Colorado System

Relational Database Support for Data Warehouses (Coursera)

Relational Database Support for Data Warehouses is the third course in the Data Warehousing for Business Intelligence specialization. In this course, you'll use analytical elements of SQL for answering business intelligence questions. You'll learn features of relational database management systems for managing summary data commonly used in business intelligence reporting. Because of the importance and difficulty of managing implementations of data warehouses, we'll also delve into storage architectures, scalable parallel processing, data governance, and big data impacts. In the assignments in this course, you can use either Oracle or PostgreSQL.

Jun 22nd 2026
5-12 Weeks
Text Retrieval and Search Engines (Coursera) Coursera
University of Illinois at Urbana-Champaign

Text Retrieval and Search Engines (Coursera)

Recent years have seen a dramatic growth of natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media such as blog articles, forum posts, product reviews, and tweets. Text data are unique in that they are usually generated directly by humans rather than a computer system or sensors, and are thus especially valuable for discovering knowledge about people’s opinions and preferences, in addition to many other kinds of knowledge that we encode in text.

Jun 22nd 2026
5-12 Weeks
Pattern Discovery in Data Mining (Coursera) Coursera
University of Illinois at Urbana-Champaign

Pattern Discovery in Data Mining (Coursera)

Learn the general concepts of data mining along with basic methodologies and applications. Then dive into one subfield in data mining: pattern discovery. Learn in-depth concepts, methods, and applications of pattern discovery in data mining. We will also introduce methods for data-driven phrase mining and some interesting applications of pattern discovery. This course provides you the opportunity to learn skills and content to practice and engage in scalable pattern discovery methods on massive transactional data, discuss pattern evaluation measures, and study methods for mining diverse kinds of patterns, sequential patterns, and sub-graph patterns.

Jun 22nd 2026
4 Weeks