Real-Time Analytics with Apache Storm (Udacity)

Offered by Udacity, Twitter

The world is trending in real time! Learn from Twitter how to scalably process tweets, or any big data stream, in real time to drive d3 visualizations using Apache Storm, the "Hadoop of Real Time." Storm is free, open source, and fun to use! Learn from Karthik Ramasamy about the distributed, fault-tolerant, and flexible technology used to power Twitter’s real-time data flow pipeline. Twitter open sourced Storm in 2011, and it graduated to a top-level Apache project in September 2014.

Starting from basic distributed concepts presented during our first Udacity-Twitter Storm Hackathon, link Storm concepts to Storm syntax to scalably drive Word Cloud visualizations with Vagrant, Ubuntu, Maven, Flask, Redis, and d3. Link to the public Twitter gardenhose stream to process live tweets, parse embedded URLs, and calculate Top worldwide hashtags. Extend beyond Storm basics by exploring multi-language capabilities in Python, integrating open source components, and implementing real-time streaming joins.

In your final project, follow real-time trending topics by implementing the data pipeline to visualize only tweets that contain Top worldwide hashtags. Extend your project by exploring the Twitter API, or any data source, alongside Hackathon participants as they design their own ideas, receive feedback from Karthik, and open source a final project calculating real-time tweet sentiment and geolocation to drive a U.S. Map.

Learn by doing! The world is going real time. Batch processing, popularized by Hadoop, has latency that exceeds what modern mobile, connected, always-on users demand. Stream processing with response times measured in seconds is necessary to meet this demand. Twitter is a world leader in real-time processing at scale. Learn the future from the company defining it.

What You Will Learn

Lesson 1
Basic Storm Topologies

  • Link to a real-time d3 Word Cloud Visualization using Redis, Flask, and d3

Lesson 2
Storm Basics

  • Program Bolts, link Spouts, and connect to the live Twitter API to process real-time tweets
  • Explore open source components by connecting a Rolling Count Bolt to your topology to visualize Rolling Top Tweeted Words
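
The rolling count idea in Lesson 2 can be illustrated outside Storm. What follows is a minimal Python sketch, not the course's Rolling Count Bolt: it keeps per-word counts in a fixed number of time slots and drops the oldest slot as the window advances. The class name `RollingCounter`, its method names, and the slot count are hypothetical choices made for this example.

```python
from collections import Counter, deque


class RollingCounter:
    """Counts words over the most recent `num_slots` time slots (illustrative only)."""

    def __init__(self, num_slots=3):
        # Each slot holds the counts seen during one time interval.
        self.slots = deque([Counter()], maxlen=num_slots)

    def add(self, word):
        # Record one occurrence in the current (newest) slot.
        self.slots[-1][word] += 1

    def advance(self):
        # Start a new slot; the deque discards the oldest one when full.
        self.slots.append(Counter())

    def totals(self):
        # Sum the counts across every slot still inside the window.
        total = Counter()
        for slot in self.slots:
            total.update(slot)
        return total


counter = RollingCounter(num_slots=2)
for word in ["storm", "tweet", "storm"]:
    counter.add(word)
counter.advance()
counter.add("tweet")
window_now = counter.totals()    # both slots are inside the window
counter.advance()                # the first slot falls out of the window
window_later = counter.totals()
```

In a real topology the window would typically advance on a timer rather than by an explicit call; the explicit `advance()` here just keeps the sketch deterministic.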

Lesson 3
Beyond Storm Basics

  • Explore multi-language capabilities to download and parse real-time Tweeted URLs in Python using Beautiful Soup
  • Integrate complex open source bolts to calculate Top-N words to visualize real-time Top-N Hashtags
  • Use stream grouping concepts to easily create a streaming join that connects and dynamically processes multiple streams
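
As a rough illustration of the Top-N hashtag calculation mentioned in Lesson 3, the sketch below counts hashtags across a handful of tweet texts and returns the most frequent ones. It is plain Python rather than a Storm bolt, and the function names, the regular expression, and the sample tweets are all assumptions made for this example.

```python
import re
from collections import Counter

# Assumed hashtag shape: '#' followed by word characters.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def extract_hashtags(tweet_text):
    # Return the lowercase hashtags (without '#') found in the tweet text.
    return [tag.lower() for tag in HASHTAG_PATTERN.findall(tweet_text)]


def top_n_hashtags(tweets, n):
    # Count every hashtag across all tweets and keep the n most frequent.
    counts = Counter()
    for text in tweets:
        counts.update(extract_hashtags(text))
    return counts.most_common(n)


sample_tweets = [
    "Learning #Storm with #Udacity",
    "Real-time analytics with #storm",
    "#d3 word clouds driven by #Storm",
    "Charts in #D3",
]
top_two = top_n_hashtags(sample_tweets, 2)
```

Lowercasing the tags means `#Storm` and `#storm` are counted together, which is one possible design choice; a streaming version would update the counts incrementally instead of scanning a list.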

Lesson 4
Final Project

  • Work on your final project while we cover additional questions and topics brought up by Hackathon participants
  • Explore Vagrant, VirtualBox, Redis, Flask, and d3 further if you are interested!

Lesson 5
Final Project: Construct a Storm Topology

  • Design a Storm Topology and new bolt that uses streaming joins to dynamically calculate Top-N Hashtags and display real-time tweets that contain trending Top Hashtags
  • Post your visualization to the forum and tweet it to your Twitter followers
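
One way to picture the streaming join described in this lesson is a single component that receives two kinds of events: updates to the current Top-N hashtags and incoming tweets. The Python sketch below is a simplified stand-in for such a bolt, not the course's solution; the class `TrendingTweetFilter`, its methods, and the event format are hypothetical.

```python
import re

# Assumed hashtag shape: '#' followed by word characters.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


class TrendingTweetFilter:
    """Joins a tweet stream with a stream of Top-N hashtag updates (illustrative only)."""

    def __init__(self):
        self.top_hashtags = set()

    def on_top_hashtags(self, hashtags):
        # Replace the current trending set with the latest update.
        self.top_hashtags = {tag.lower() for tag in hashtags}

    def on_tweet(self, text):
        # Pass the tweet through only if it carries a trending hashtag.
        tags = {tag.lower() for tag in HASHTAG_PATTERN.findall(text)}
        return text if tags & self.top_hashtags else None


# Interleaved events standing in for the two input streams.
events = [
    ("top", ["storm"]),
    ("tweet", "Trying out #Storm today"),
    ("tweet", "Lunch time #food"),
    ("top", ["food"]),
    ("tweet", "More #food pictures"),
]

join = TrendingTweetFilter()
displayed = []
for kind, payload in events:
    if kind == "top":
        join.on_top_hashtags(payload)
    else:
        result = join.on_tweet(payload)
        if result is not None:
            displayed.append(result)
```

Note that "Lunch time #food" is dropped because `food` was not yet trending when that tweet arrived; the join reflects whatever the trending set is at the moment each tweet is processed.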

Lesson 6
Project Extensions

  • Use additional features of the real-time Twitter sample stream or use any data source to drive your real-time d3 visualization

Prerequisites and Requirements
Programming language required: Java
To be successful, you'll need intermediate knowledge of Java. Specifically, this means experience and comfort with Java syntax, diagnosing and debugging compile-time and run-time errors, using Javadocs as needed, and intermediate data structures including Arrays, HashMaps, and LinkedLists. If you need to build these skills, a good starting point is Udacity’s Introduction to Java, with additional practice identifying and debugging compile-time and run-time errors. No prior experience is assumed with Ubuntu, git, Maven, Redis, Flask (Python), or d3 (JavaScript). Python is useful but optional; a basic course such as CS101 or OO in Python would be helpful.

