Serverless Data Processing with Dataflow: Develop Pipelines (Coursera)

Offered by Google Cloud

In this second installment of the Dataflow course series, we dive deeper into developing pipelines using the Beam SDK. We start with a review of Apache Beam concepts. Next, we discuss processing streaming data using windows, watermarks, and triggers. We then cover options for sources and sinks in your pipelines, schemas to express your structured data, and how to do stateful transformations using the State and Timer APIs.

We then move on to reviewing best practices that help maximize your pipeline performance. Towards the end of the course, we introduce SQL and DataFrames to represent your business logic in Beam, and show how to iteratively develop pipelines using Beam notebooks.

Syllabus

WEEK 1
Introduction
This module covers the course outline.
Beam Concepts Review
Review main concepts of Apache Beam, and how to apply them to write your own data processing pipelines.
Windows, Watermarks, and Triggers
In this module, you will learn how to process streaming data with Dataflow. There are three main concepts to learn: how to group data into windows, how the watermark indicates when a window is ready to produce results, and how you can control when and how many times a window emits output.
Sources & Sinks
In this module, you will learn what constitutes sources and sinks in Google Cloud Dataflow. The module goes over examples of TextIO, FileIO, BigQueryIO, PubsubIO, KafkaIO, BigtableIO, AvroIO, and Splittable DoFn, and points out some useful features associated with each IO.
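To make the windowing ideas from the Windows, Watermarks, and Triggers module more concrete, here is a minimal plain-Python sketch. It does not use the Beam SDK and is not taken from the course material: the names `WINDOW_SIZE`, `window_start`, and `process` are hypothetical, the watermark is simplified to the largest event time seen so far, and each window emits exactly once, with late data dropped. In Beam itself, the runner estimates the watermark, and triggers can make a window emit more than once.

```python
from collections import defaultdict

WINDOW_SIZE = 60  # seconds; hypothetical fixed-window length


def window_start(event_time, size=WINDOW_SIZE):
    """Return the start of the fixed window that contains event_time."""
    return event_time - (event_time % size)


def process(events, size=WINDOW_SIZE):
    """Sum values per fixed window, emitting a window once the watermark passes its end.

    events is an iterable of (event_time, value) pairs in arrival order.
    Assumption: the watermark is the largest event time seen so far.
    """
    open_windows = defaultdict(int)
    emitted = []
    watermark = float("-inf")

    for event_time, value in events:
        start = window_start(event_time, size)
        if start + size <= watermark:
            # The watermark already passed this window: treat the event as late.
            continue
        open_windows[start] += value
        watermark = max(watermark, event_time)

        # Emit every open window whose end is at or before the watermark.
        for s in sorted(open_windows):
            if s + size <= watermark:
                emitted.append((s, open_windows.pop(s)))

    # End of input: flush the windows that are still open.
    for s in sorted(open_windows):
        emitted.append((s, open_windows.pop(s)))
    return emitted


events = [(5, 1), (30, 2), (65, 4), (20, 8), (130, 16)]
results = process(events)
print(results)
```

With 60-second windows, the event at time 20 arrives after the watermark has reached 65, so it is treated as late and left out of the first window's sum.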

WEEK 2
Schemas
This module will introduce schemas, which give developers a way to express structured data in their Beam pipelines.
State and Timers
This module covers State and Timers, two powerful features that you can use in your DoFn to implement stateful transformations.
Best Practices
This module will discuss best practices and review common patterns that maximize performance for your Dataflow pipelines.

WEEK 3
Dataflow SQL & DataFrames
This module introduces two new APIs to represent your business logic in Beam: SQL and DataFrames.
Beam Notebooks
This module will cover Beam notebooks, an interface for Python developers to onboard onto the Beam SDK and develop their pipelines iteratively in a Jupyter notebook environment.
Summary
This module provides a recap of the course.

Related Courses

Windows OS Forensics (Coursera)
Infosec

The Windows OS Forensics course covers the Windows file systems FAT32, exFAT, and NTFS. You will learn how these systems store data, what happens when a file is written to disk, what happens when a file is deleted from disk, and how to recover deleted files. You will also learn how to correctly interpret the information in the file system data structures, giving you a better understanding of how these file systems work. This knowledge will enable you to properly validate the information from multiple forensic tools.

Jun 15th 2026
5-12 Weeks
Process Data from Dirty to Clean (Coursera)
Google

This is the fourth course in the Google Data Analytics Certificate. These courses will equip you with the skills needed to apply to introductory-level data analyst jobs. In this course, you’ll continue to build your understanding of data analytics and the concepts and tools that data analysts use in their work. You’ll learn how to check and clean your data using spreadsheets and SQL as well as how to verify and report your data cleaning results. Current Google data analysts will continue to instruct and provide you with hands-on ways to accomplish common data analyst tasks with the best tools and resources.

Jun 16th 2026
5-12 Weeks
Developing Applications with SQL, Databases, and Django (Coursera)
IBM

The essentials of application development are accessing, processing, and presenting data. Data is stored in various databases, either on-premise or on the cloud, and developers will need to learn how to talk to them via programming languages. In this course, you will be introduced to some fundamental database concepts. You will learn the basics of SQL, a simple and powerful programming language for querying and managing data. And you will learn about cloud database fundamentals and get hands-on cloud database experiences.

Jun 15th 2026
5-12 Weeks
Cloud Data Security (Coursera)
University of Minnesota

This course gives learners an opportunity to explore data security in the cloud. In this course, learners will: dive into the data services offered by cloud providers and compare their security features; analyze a data breach and trace it back to the vulnerability that made it possible; learn about database injection and aggregation attacks; follow the life cycle of a data item and its relationship to privacy and integrity; associate modern privacy requirements with US and European laws.

Jun 15th 2026
4 Weeks
Prepare Data for Exploration (Coursera)
Google

This is the third course in the Google Data Analytics Certificate. These courses will equip you with the skills needed to apply to introductory-level data analyst jobs. As you continue to build on your understanding of the topics from the first two courses, you’ll also be introduced to new topics that will help you gain practical data analytics skills. You’ll learn how to use tools like spreadsheets and SQL to extract and make use of the right data for your objectives and how to organize and protect your data. Current Google data analysts will continue to instruct and provide you with hands-on ways to accomplish common data analyst tasks with the best tools and resources.

Jun 15th 2026
5-12 Weeks
Data Collection and Processing with Python (Coursera)
University of Michigan

This course teaches you to fetch and process data from services on the Internet. It covers Python list comprehensions and provides opportunities to practice extracting from and processing deeply nested data. You'll also learn how to use the Python requests module to interact with REST APIs and what to look for in the documentation of those APIs. For the final project, you will construct a “tag recommender” for the Flickr photo sharing site.

Jun 15th 2026
3 Weeks
Microsoft Azure SQL (Coursera)
Microsoft

In this course, you will learn the fundamentals of database concepts in a cloud environment, gain basic skills in cloud data services, and build your foundational knowledge of cloud data services within Microsoft Azure. You will explore relational data offerings, provisioning and deploying relational databases, and querying relational data through cloud data solutions with Microsoft Azure. You'll learn about SQL, how it's used to query and maintain data in a database, and the different dialects that are available.

Jun 15th 2026
2 Weeks
Data Science Companion (Coursera)
MathWorks

The Data Science Companion provides an introduction to data science. You will gain a quick background in data science and core machine learning concepts, such as regression and classification. You’ll be introduced to the practical knowledge of data processing and visualization using low-code solutions, as well as an overview of the ways to integrate multiple tools effectively to solve data science problems.

Jun 19th 2026
4 Weeks
SQL for Data Science Capstone Project (Coursera)
University of California, Davis

Data science is a dynamic and growing career field that demands SQL knowledge and skills to be successful. This course is designed to provide you with a solid foundation in applying SQL skills to analyze data and solve real business problems. Whether you have successfully completed the other courses in the Learn SQL Basics for Data Science Specialization or are taking just this course, this project is your chance to apply the knowledge and skills you have acquired to practice important SQL querying and solve problems with data. You will participate in your own personal or professional journey to create a portfolio-worthy piece from start to finish.

Jun 15th 2026
4 Weeks