EdX

Introduction to Apache Spark (edX)

Introduction to Apache Spark (edX)

Learn the fundamentals and architecture of Apache Spark, the leading cluster-computing framework among professionals. Spark is rapidly becoming the compute engine of choice for big data. Spark programs are more concise and often run 10-100 times faster than Hadoop MapReduce jobs. As companies realize this, Spark developers are becoming increasingly valued.

Class Deals by MOOC List - Click here and see EdX's Active Discounts, Deals, and Promo Codes.

This statistics and data analysis course will teach you the basics of working with Spark and will provide you with the necessary foundation for diving deeper into Spark. You’ll learn about Spark’s architecture and programming model, including commonly used APIs. After completing this course, you’ll be able to write and debug basic Spark applications. This course will also explain how to use Spark’s web user interface (UI), how to recognize common coding errors, and how to proactively prevent errors. The focus of this course will be Spark Core and Spark SQL.
This course covers advanced undergraduate-level material. It requires a programming background and experience with Python (or the ability to learn it quickly). All exercises will use PySpark (the Python API for Spark), but previous experience with Spark or distributed computing is NOT required. Students should take this Python mini-quiz before the course and take this Python mini-course if they need to learn Python or refresh their Python knowledge.
What you'll learn:

  • Basic Spark architecture
  • Common operations
  • How to avoid coding mistakes
  • How to debug your Spark program
Go to Class
MOOC List is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Related Courses

Introduction to Java Programming - Part 2 (edX) EdX
The Hong Kong University of Science and Technology - HKUST,HKUSTx

Introduction to Java Programming - Part 2 (edX)

The first MOOC to teach the fundamental elements of Java programming and data abstraction. Do you want to become a better problem solver? This Java course will provide you with a strong understanding of basic Java programming elements and data abstraction using problem representation and the object-oriented framework.

Self Paced
Self-Paced
The Beauty and Joy of Computing - AP® CS Principles Part 2 (edX) EdX
University of California, Berkeley,BerkeleyX

The Beauty and Joy of Computing - AP® CS Principles Part 2 (edX)

A computer science principles course for anyone who wants to learn how to translate ideas into code. Discover the big ideas and thinking practices in computer science plus learn how to code using one of the friendliest programming languages, Snap! (based on Scratch).

No sessions available
13-24 Weeks
Programming in Scratch (edX) EdX
Harvey Mudd College,HarveyMuddX

Programming in Scratch (edX)

See how easy learning computer science can be. Use Scratch to create games, animations, stories and more. Want to learn computer programming, but unsure where to begin? This is the course for you! Scratch is the computer programming language that makes it easy and fun to create interactive stories, games and animations and share them online.

No sessions available
5-12 Weeks
HTML5 Apps and Games (edX) EdX
World Wide Web Consortium - W3C,W3Cx

HTML5 Apps and Games (edX)

Today, developers are increasingly moving from native to HTML5-based apps. Increase your ability to design and deliver innovative services on the Web! Want to learn advanced HTML5 tips and techniques? This is the course for you! Find out more about the powerful Web features that will help you create great content and apps.

Self Paced
Self-Paced
Wiretaps to Big Data: Privacy and Surveillance in the Age of Interconnection (edX) EdX
Cornell University

Wiretaps to Big Data: Privacy and Surveillance in the Age of Interconnection (edX)

Explore the privacy issues of an interconnected world. How does cellular technology enable massive surveillance? Do users have rights against surveillance? How does surveillance affect how we use cellular and other technologies? How does it affect our democratic institutions? Do you know that the metadata collected by a cellular network speaks volumes about its users? In this course you will explore all of these questions while investigating related issues in WiFi and Internet surveillance.

No sessions available
5-12 Weeks
HTML5 Coding Essentials and Best Practices (edX) EdX
World Wide Web Consortium - W3C,W3Cx

HTML5 Coding Essentials and Best Practices (edX)

Learn how to write Web pages and Web sites by mastering HTML5 coding techniques and best practices. HTML5 is the standard language of the Web, developed by W3C. For application developers and industry, HTML5 represents a set of features that people will be able to rely on for years to come. HTML5 is supported on a wide variety of devices, lowering the cost of creating rich applications to reach users everywhere.

Self Paced
Self-Paced
Machine Learning with Python: from Linear Models to Deep Learning (edX) EdX
MIT,MITx

Machine Learning with Python: from Linear Models to Deep Learning (edX)

An in-depth introduction to the field of machine learning, from linear models to deep learning and reinforcement learning, through hands-on Python projects. Machine learning methods are commonly used across engineering and sciences, from computer systems to physics. Moreover, commercial sites such as search engines, recommender systems (e.g., Netflix, Amazon), advertisers, and financial institutions employ machine learning algorithms for content recommendation, predicting customer behavior, compliance, or risk.

May 27th 2024
13-24 Weeks
Enabling Technologies for Data Science and Analytics: The Internet of Things (edX) EdX
Columbia University,ColumbiaX

Enabling Technologies for Data Science and Analytics: The Internet of Things (edX)

Discover the relationship between Big Data and the Internet of Things (IoT). The Internet of Things is rapidly growing. It is predicted that more than 25 billion devices will be connected by 2020. In this data science course, you will learn about the major components of the Internet of Things and how data is acquired from sensors. You will also examine ways of analyzing event data, sentiment analysis, facial recognition software and how data generated from devices can be used to make decisions.

Self Paced
Self-Paced
Introduction to Management Information Systems (MIS): A Survival Guide (edX) EdX
Universidad Carlos III de Madrid - UC3M,UC3Mx

Introduction to Management Information Systems (MIS): A Survival Guide (edX)

Gain the skills and knowledge needed to succeed in an MIS-dominated corporate world. This MIS course will cover supporting tech infrastructures (Cloud, Databases, Big Data), the MIS development/ procurement process, and the main integrated systems, ERPs, such as SAP®, Oracle® or Microsoft Dynamics Navision®, as well as their relationship with Business Process Redesign.

Self Paced
Self-Paced