Serverless Data Processing with Dataflow: Foundations (edX)

Offered by Google Cloud

This course is part 1 of a 3-course series on Serverless Data Processing with Dataflow. In this first course, we start with a refresher on what Apache Beam is and how it relates to Dataflow.

Next, we talk about the Apache Beam vision and the benefits of the Beam Portability framework. The Beam Portability framework achieves the vision that a developer can use their favorite programming language with their preferred execution backend. We then show you how Dataflow allows you to separate compute and storage while saving money, and how identity and access management (IAM) tools interact with your Dataflow pipelines. Lastly, we look at how to implement the right security model for your use case on Dataflow.
This course is part of the Google Cloud Data Engineer Learning Path Professional Certificate.

What you'll learn

  • Demonstrate how Apache Beam and Cloud Dataflow work together to fulfill your organization’s data processing needs
  • Summarize the benefits of the Beam Portability Framework and enable it for your Dataflow pipelines
  • Enable Dataflow Shuffle for batch pipelines and Streaming Engine for streaming pipelines for maximum performance
  • Enable Flexible Resource Scheduling for more cost-efficient performance
  • Select the right combination of IAM permissions for your Dataflow job
  • Implement best practices for a secure data processing environment
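
As a rough illustration of the "enable" objectives above, the sketch below assembles Apache Beam Python SDK pipeline flags for a Dataflow job. The function name `dataflow_args` and the project, region, and bucket values are hypothetical placeholders, not part of the course material. Depending on SDK version and region, Runner v2 and the Shuffle service may already be on by default, so treat the explicit flags as an assumption about an older or non-default setup.

```python
from typing import List


def dataflow_args(project: str, region: str, temp_location: str,
                  streaming: bool) -> List[str]:
    """Build Beam Python pipeline flags for a Dataflow job (illustrative only)."""
    args = [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
        # Opt in to Runner v2 (the portability-based worker); may be a default already.
        "--experiments=use_runner_v2",
    ]
    if streaming:
        # Streaming Engine moves streaming state and shuffle off the worker VMs.
        args += ["--streaming", "--enable_streaming_engine"]
    else:
        # Service-based Dataflow Shuffle for batch jobs; may be a default already.
        args.append("--experiments=shuffle_mode=service")
        # Flexible Resource Scheduling (FlexRS) applies to batch jobs only.
        args.append("--flexrs_goal=COST_OPTIMIZED")
    return args


if __name__ == "__main__":
    # Hypothetical project, region, and bucket names.
    for flag in dataflow_args("my-project", "us-central1",
                              "gs://my-bucket/tmp", streaming=False):
        print(flag)
```

The resulting list can be passed to `PipelineOptions` from `apache_beam.options.pipeline_options` when constructing a pipeline. FlexRS is kept in the batch branch and Streaming Engine in the streaming branch because each feature applies to only one pipeline type.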

Syllabus

  1. Introduction

This module covers the course outline and gives a quick refresher on the Apache Beam programming model and Google’s Dataflow managed service.

  2. Beam Portability

This module has four sections: Beam Portability, Runner v2, Container Environments, and Cross-Language Transforms.

  3. Separating Compute and Storage with Dataflow

In this module, we discuss how to separate compute and storage with Dataflow. This module has four sections: Dataflow, Dataflow Shuffle Service, Dataflow Streaming Engine, and Flexible Resource Scheduling.

  4. IAM, Quotas, and Permissions

In this module, we talk about the different IAM roles, quotas, and permissions required to run Dataflow.

  5. Security

In this module, we look at how to implement the right security model for your use case on Dataflow.

  6. Summary

In this course, we started with a refresher on what Apache Beam is and its relationship with Dataflow.

Related Courses

Building Resilient Streaming Systems on GCP em Português Brasileiro (Coursera)
Google Cloud

This one-week, on-demand accelerated course builds on Google Cloud Platform Big Data and Machine Learning Fundamentals. Through video lectures, demonstrations, and hands-on labs, participants learn to build streaming data pipelines using Google Cloud Pub/Sub and Dataflow for real-time decision making. You will also learn to build dashboards that render tailored output for various stakeholder audiences.

Jun 15th 2026
1 Week
Microservices and Serverless (edX)
IBM

Design, develop, deploy, manage and secure applications and solutions on public, private or hybrid cloud platforms. This course will introduce you to 12-factor apps and microservices, concepts that emerged to help organizations work better and faster in a cloud-native manner. You’ll then learn about serverless computing: how it works, what value it brings, and which specific serverless technologies exist. You’ll get hands-on with IBM Cloud Functions, a serverless platform on IBM Cloud that lets you develop serverless apps with ease. Finally, you will learn to build and deploy applications using container images on Code Engine.

Self-Paced
Hacking PostgreSQL: Data Access Methods (edX)
Ural Federal University, UrFUx

Learn the science, engineering practices and hacking techniques of data access, a core aspect of information processing in a database. This course is about data storage and data processing technologies with examples from PostgreSQL. It is geared toward database core developers, operating system developers, system architects, and all those who want to understand databases in more detail.

No sessions available
13-24 Weeks
Predictive Modeling and Machine Learning with MATLAB (Coursera)
MathWorks

In this course, you will build on the skills learned in Exploratory Data Analysis with MATLAB and Data Processing and Feature Engineering with MATLAB to increase your ability to harness the power of MATLAB to analyze data relevant to the work you do. These skills are valuable for those who have domain knowledge and some exposure to computational tools, but no programming background.

Jun 22nd 2026
4 Weeks
Introduction to Serverless on Kubernetes (edX)
Linux Foundation, LinuxFoundationX

Learn how to build serverless functions that can be run on any cloud, without being restricted by limits on the execution duration, languages available, or the size of your code. With the advent of systems like AWS Lambda, the term serverless gained much popularity. However, many people are still unsure what it is for, and how it can help them build applications faster than traditional approaches. Other potential users are turned off by the arbitrary limits and lock-in of cloud-based serverless products.

Self-Paced
Data Processing and Analysis with Excel (edX)
Rochester Institute of Technology, RITx

Learn to use Excel to organize and clean data so it can be manipulated and analyzed. In this course, you will learn how to organize your data within the Microsoft Office Excel software tool. Once organized, we will discuss data cleaning. You will learn how to identify outliers and anomalies in the data, and how to identify and change data-types. Together we will develop a data analysis plan, after which we will apply analysis methods and tools, including exploratory analysis, evaluation of results, and comparison with other findings.

Self-Paced
Basic Data Processing and Visualization (Coursera)
University of California, San Diego

This is the first course in the four-course specialization Python Data Products for Predictive Analytics, introducing the basics of reading and manipulating datasets in Python. In this course, you will learn what a data product is and go through several Python libraries to perform data retrieval, processing, and visualization.

Jun 22nd 2026
5-12 Weeks
Data Collection and Processing with Python (Coursera)
University of Michigan

This course teaches you to fetch and process data from services on the Internet. It covers Python list comprehensions and provides opportunities to practice extracting from and processing deeply nested data. You'll also learn how to use the Python requests module to interact with REST APIs and what to look for in the documentation of those APIs. For the final project, you will construct a “tag recommender” for the Flickr photo sharing site.

Jun 15th 2026
3 Weeks