Big Data Computing with Spark (edX)

Learn the theory and gain hands-on experience of big data systems, using Spark as the exemplary platform. Big data systems such as Hadoop and Spark have emerged as enabling technologies for managing massive amounts of data across hundreds or even thousands of computing nodes. Meanwhile, cloud computing platforms have made these technologies easily accessible to individuals as well as large enterprises.

This course exposes students to both the theory and hands-on experience of big data systems, using Spark as the exemplary platform.

What you'll learn

  • Spark programming using both RDD and DataFrame APIs
  • Useful packages including ML, GraphX/GraphFrames, and Spark Streaming
  • Spark internals and performance optimizations
  • Algorithm design for big data systems

Syllabus

Week 1: Overview, MapReduce, and Hadoop
Weeks 2-3: Spark Basics and RDD
Week 4: Spark SQL and MLlib
Week 5: Spark internals
Week 6: Algorithm design for big data
Week 7: GraphX/GraphFrames
Week 8: Spark Streaming

Related Courses

Industry 4.0: How to Revolutionize your Business (edX)
The Hong Kong Polytechnic University, HKPolyUx

An introduction to the fourth industrial revolution, its major systems and technologies, and how new products and services will impact business and society. We have witnessed the power of mechanization in the early nineteenth century, automation in the seventies, and information and the internet in the last decades. But now, the adoption of connected intelligence into the business and social fabric is advancing at an astonishing speed, which will completely change the way we conduct business.

Self-Paced

Biostatistics for Big Data Applications (edX)
University of Texas Medical Branch

Learn data analysis basics for working with biomedical big data with practical hands-on examples using R. This course provides a broad foundation of statistical terms and concepts as well as an introduction to the R statistical software package. The topics covered are fundamental components of biostatistical methods used in both omics and population health research.

No sessions available
5-12 Weeks

Analytics in Python (edX)
Columbia University, ColumbiaX

Learn the fundamentals of programming in Python and develop the ability to analyze data and make data-driven decisions. Data is the lifeblood of an organization. Competency in programming is an essential skill for successfully extracting information and knowledge from data. The goal of this course is to introduce learners to the basics of programming in Python and to give a working knowledge of how to use programs to deal with data.

This course is archived
5-12 Weeks

Knowledge Management and Big Data in Business (edX)
The Hong Kong Polytechnic University, HKPolyUx

Learn why and how knowledge management and Big Data are vital to the new business era. The business landscape is changing so rapidly that traditional management, business, and computing courses do not meet the needs of the next generation of workers in the business world. Most traditional methods are repetitive and rule-based, and will gradually be replaced by Artificial Intelligence.

Self-Paced

Introduction to Apache Spark (edX)
University of California, Berkeley

Learn the fundamentals and architecture of Apache Spark, the leading cluster-computing framework among professionals. Spark is rapidly becoming the compute engine of choice for big data. Spark programs are more concise and often run 10-100 times faster than Hadoop MapReduce jobs. As companies realize this, Spark developers are becoming increasingly valued.

Course Not Available

Enabling Technologies for Data Science and Analytics: The Internet of Things (edX)
Columbia University, ColumbiaX

Discover the relationship between Big Data and the Internet of Things (IoT). The Internet of Things is rapidly growing. It is predicted that more than 25 billion devices will be connected by 2020. In this data science course, you will learn about the major components of the Internet of Things and how data is acquired from sensors. You will also examine ways of analyzing event data, sentiment analysis, facial recognition software and how data generated from devices can be used to make decisions.

Self-Paced

Big Data and Education (edX)
University of Pennsylvania, PennX

Learn the methods and strategies for using large-scale educational data to improve education and make discoveries about learning. Online and software-based learning tools have been used increasingly in education. This movement has resulted in an explosion of data, which can now be used to improve educational effectiveness and support basic research on learning.

Self-Paced

Introducción a la Ciencia de Datos y el Big Data (edX)
Tecnológico de Monterrey, TecdeMonterreyX

Get an overview of what Data Science is and how to apply it in organizations. Learn to make data-driven decisions. The future belongs to data science and to those who understand it. Just as oil and gas powered the economies of the 20th and 21st centuries, data increasingly drives innovation and the global economy as we move toward a new era known as the digital revolution.

Self-Paced

UX Data Analysis (edX)
HEC Montréal, HECMontrealX

Become a UX data scientist! From qualitative data analysis to big data Web analytics, you will be able to leverage insights from data to make empirically based recommendations. Do big data and UX speak to you? This MOOC will give you the methods and tools to analyze the whole spectrum of data we handle in UX, from qualitative user research and quantitative user testing data analysis to big data Web analytics.

Self-Paced

Introducción a los Sistemas de Información Gerencial (MIS): Una guía de supervivencia (edX)
Universidad Carlos III de Madrid - UC3M, UC3Mx

Get the skills and knowledge you need to succeed in a corporate world dominated by management information systems (MIS). Ubiquitous management information systems play a critical role in today's professional landscape. From customer relationship management systems, which handle daily interactions with current and potential customers, to managerial and financial systems that issue and pay invoices, day-to-day working life is increasingly controlled by these management systems, which dictate what to do and how to do it.

Self-Paced