Fundamentals of Software Architecture for Big Data (Coursera)

This course is intended for individuals who want to understand the basics of software engineering as they relate to building large software systems that leverage big data. You will be introduced to the software engineering concepts needed to build and scale large, data-intensive, distributed systems. Starting with software engineering best practices and loosely coupled, highly cohesive data microservices, the course takes you through the evolution of a distributed system over time.

Fundamentals of Software Architecture for Big Data can be taken for academic credit as part of CU Boulder’s Master of Science in Data Science (MS-DS) degree offered on the Coursera platform. The MS-DS is an interdisciplinary degree that brings together faculty from CU Boulder’s departments of Applied Mathematics, Computer Science, Information Science, and others. With performance-based admissions and no application process, the MS-DS is ideal for individuals with a broad range of undergraduate education and/or professional experience in computer science, information science, mathematics, and statistics.

What You Will Learn

  • Practice software engineering fundamentals: test-first development, refactoring, continuous integration, and continuous delivery.
  • Architect and create a big data or distributed system using REST collaboration, event collaboration, and batch processing.
  • Create a performant, scalable distributed system that handles big data.

Syllabus

WEEK 1
Software Engineering Overview
In this module you will learn the basics of modern software engineering. You will learn how our industry progresses over time, practice test-driven development, and implement widely used data structures.

WEEK 2
Fundamentals of Software Architecture
In this module you will learn the fundamentals of software architecture. You will learn how to evolve an architecture over time, how to work within a large codebase, and a bit about blockchain.

WEEK 3
Fundamentals of Production Software
In this module you will learn the fundamentals of monitoring software in production. You will learn how to create reliable background jobs, how to calculate and communicate service availability, and how to implement production metrics and monitoring.
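As a rough illustration of the availability calculation this module mentions (a minimal sketch, not taken from the course materials), availability is commonly expressed as the fraction of a time window during which a service was up. The function names and the 30-day window below are assumptions chosen for the example.

```python
# Hypothetical helpers for illustration only; not from the course.

THIRTY_DAYS_SECONDS = 30 * 24 * 60 * 60  # assumed reporting window


def availability(total_seconds: float, downtime_seconds: float) -> float:
    """Return the fraction of the window during which the service was up."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    if not 0 <= downtime_seconds <= total_seconds:
        raise ValueError("downtime_seconds must be within the window")
    return (total_seconds - downtime_seconds) / total_seconds


def allowed_downtime_seconds(target: float, total_seconds: float) -> float:
    """Return the downtime budget implied by an availability target."""
    if not 0 <= target <= 1:
        raise ValueError("target must be between 0 and 1")
    return (1.0 - target) * total_seconds


if __name__ == "__main__":
    observed = availability(THIRTY_DAYS_SECONDS, downtime_seconds=2592)
    budget = allowed_downtime_seconds(0.999, THIRTY_DAYS_SECONDS)
    print(f"observed availability: {observed:.4%}")
    print(f"downtime budget at 99.9%: {budget / 60:.1f} minutes")
```

Stating the target as a downtime budget (for example, minutes per 30 days) is one common way to communicate availability to people who do not think in percentages.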

WEEK 4
Fundamentals of Software Architecture for Big Data
In this module you will learn the fundamentals of production-quality databases and messaging systems. You will learn the tradeoffs between consistency and availability, how to implement database transactions to improve consistency, and how to implement messaging systems to improve availability.
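To make the idea of using a database transaction to improve consistency concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the transfer function, and the sample balances are hypothetical and are not taken from the course. Both updates either commit together or are rolled back together.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: int, dst: int, amount: int) -> bool:
    """Move amount from src to dst atomically; return False if rolled back."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn: sqlite3.Connection) -> dict:
    """Return a mapping of account id to balance."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


if __name__ == "__main__":
    db = make_demo_db()
    print(transfer(db, 1, 2, 30), balances(db))
    print(transfer(db, 1, 2, 500), balances(db))
```

Without the transaction, a failure between the two updates could leave money debited from one account and never credited to the other.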

Related Courses

Deploying Machine Learning Models (Coursera)
University of California, San Diego

In this course we will learn about Recommender Systems (which we will study for the Capstone project), and also look at deployment issues for data products. By the end of this course, you should be able to implement a working recommender system (e.g. to predict ratings, or generate lists of related products), and you should understand the tools and techniques required to deploy such a working system on real-world, large-scale datasets.

Jun 22nd 2026
4 Weeks

Managing Big Data in Clusters and Cloud Storage (Coursera)
Cloudera

In this course, you'll learn how to manage big datasets, how to load them into clusters and cloud storage, and how to apply structure to the data so that you can run queries on it using distributed SQL engines like Apache Hive and Apache Impala. You’ll learn how to choose the right data types, storage systems, and file formats based on which tools you’ll use and what performance you need.

Jun 22nd 2026
5-12 Weeks

Managing Big Data with MySQL (Coursera)
Duke University

This course is an introduction to how to use relational databases in business analysis. You will learn how relational databases work, and how to use entity-relationship diagrams to display the structure of the data held within them. This knowledge will help you understand how data needs to be collected in business contexts, and help you identify features you want to consider if you are involved in implementing new data collection efforts.

Jun 22nd 2026
5-12 Weeks

Excel Power Tools for Data Analysis (Coursera)
Macquarie University

Welcome to Excel Power Tools for Data Analysis. In this four-week course, we introduce Power Query, Power Pivot and Power BI, three power tools for transforming, analysing and presenting data. Excel's ease and flexibility have long made it a tool of choice for data analysis, but it does have some inherent limitations: for one, truly "big" data simply does not fit in a spreadsheet, and for another, the process of importing and cleaning data can be repetitive, time-consuming and error-prone.

Jun 22nd 2026
4 Weeks

Engineering Practices for Building Quality Software (Coursera)
University of Minnesota

Agile embraces change, which means that a team should be able to make changes to the system effectively as it learns about users and the market. To be good at making such changes, teams need engineering rigor and excellence; otherwise, embracing change becomes very painful and expensive. In this course, you will learn about the engineering practices and processes that agile and traditional teams use to make sure the team is prepared for change. In addition, you will learn about practices, techniques and processes that can help a team build high-quality software. You will also learn how to calculate a variety of quantitative metrics related to software quality.

Jun 22nd 2026
4 Weeks

Big Data, Genes, and Medicine (Coursera)
The State University of New York

This course distills for you expert knowledge and skills mastered by professionals in Health Big Data Science and Bioinformatics. You will learn exciting facts about the human body biology and chemistry, genetics, and medicine that will be intertwined with the science of Big Data and skills to harness the avalanche of data openly available at your fingertips and which we are just starting to make sense of.

Jun 22nd 2026
5-12 Weeks

Software Engineering: Software Design and Project Management (Coursera)
The Hong Kong University of Science and Technology - HKUST

Software Development Life Cycle (SDLC) is the process of developing software through planning, requirement analysis, design, implementation, testing, and maintenance. This course focuses on the project planning and analysis/design phases of SDLC, and you will learn about different architectural patterns and design patterns to solve common problems in software design. It covers project planning, scheduling, and cost estimating, which are the principal tasks of software project managers.

Jun 22nd 2026
3 Weeks

Cloud Computing Applications, Part 2: Big Data and Applications in the Cloud (Coursera)
University of Illinois at Urbana-Champaign

Welcome to the Cloud Computing Applications course, the second part of a two-course series designed to give you a comprehensive view of the world of Cloud Computing and Big Data! In this second course we continue Cloud Computing Applications by exploring how the Cloud opens up data analytics of huge volumes of data that are static or streamed at high velocity and represent an enormous variety of information. Cloud applications and data analytics represent a disruptive change in the ways that society is informed by, and uses, information.

Jun 22nd 2026
4 Weeks

Scalable Machine Learning on Big Data using Apache Spark (Coursera)
IBM

This course will empower you with the skills to scale data science and machine learning (ML) tasks on Big Data sets using Apache Spark. Most real-world machine learning work involves very large data sets that go beyond the CPU, memory and storage limitations of a single computer. Apache Spark is an open-source framework that leverages cluster computing and distributed storage to process extremely large data sets in an efficient and cost-effective manner. An applied knowledge of working with Apache Spark is therefore a great asset and potential differentiator for a Machine Learning engineer.

Jun 22nd 2026
4 Weeks

Big Data Analysis with Scala and Spark (Scala 2 version) (Coursera)
École Polytechnique Fédérale de Lausanne

Manipulating big data distributed over a cluster using functional concepts is rampant in industry, and is arguably one of the first widespread industrial uses of functional ideas. This is evidenced by the popularity of MapReduce and Hadoop, and most recently Apache Spark, a fast, in-memory distributed collections framework written in Scala. In this course, we'll see how the data parallel paradigm can be extended to the distributed case, using Spark throughout.

Jun 22nd 2026
4 Weeks

Software Engineering: Implementation and Testing (Coursera)
The Hong Kong University of Science and Technology - HKUST

Software Development Life Cycle (SDLC) is the process of developing software through planning, requirement analysis, design, implementation, testing, and maintenance. This course focuses on the implementation and testing phases of SDLC. You will examine different software development processes for developing large software systems and understand the strengths and weaknesses of each.

Jun 22nd 2026
5-12 Weeks