Managing Big Data with R and Hadoop (FutureLearn)

Learn how to manage and analyse big data using the R programming language and the Hadoop framework. This course gives you access to a virtual environment with Hadoop, R and RStudio installed, so you can get hands-on experience with big data management. Several unique examples from statistical learning, with the related R code for map-reduce operations, are available for testing and learning.

Learners with basic knowledge of statistical learning and R will gain a better understanding of the underlying methods and of how to run them in parallel using map-reduce functions and Hadoop data storage. At the end of the course you will get access to RHadoop on a supercomputer at the University of Ljubljana.

Syllabus

Week 1: Welcome to BIG DATA
Week 2: Working with Hadoop
Week 3: First steps in R and RHadoop
Week 4: Statistical learning with RHadoop: clustering
Week 5: Statistical learning with RHadoop: regression and classification

By the end of the course, you will:

  • Explore the basic functionality of Apache Hadoop and RHadoop
  • Experiment with how to achieve the performance of modern supercomputing
  • Experiment with regression, clustering and classification using RHadoop
  • Investigate the basic functionality of the Bash terminal window
  • Apply knowledge of statistical learning to data instances provided by the educators
  • Manage big data with RHadoop on a real supercomputer provided by the University of Ljubljana

Who is the course for?
This course is designed for people who are interested in data science, computational statistics and machine learning and have basic experience with them. It will also be useful for advanced undergraduate students and first-year PhD students in data analysis, statistics or bioinformatics who wish to understand how to manage big data with Hadoop using the R programming language.
We expect that learners will also have basic experience with Linux and Bash and working experience with R and matrix operations. They should also be able to download and run a virtual machine.

Related Courses

Mathematical Biostatistics Boot Camp 1 (Coursera)
Johns Hopkins University

This class presents the fundamental probability and statistical concepts used in elementary data analysis. It will be taught at an introductory level for students with junior or senior college-level mathematical training including a working knowledge of calculus. A small amount of linear algebra and programming are useful for the class, but not required.

Jun 22nd 2026
4 Weeks

Introduction to R for Data Science (FutureLearn)
Purdue University

Work with airline data to learn the fundamentals of the R platform. We live in a data-driven world. This course is relevant to learners who are interested in analyzing data that is pervasive across disciplines. Have you ever wondered how data-driven decisions are made across airlines? This course will use airline data to demonstrate key concepts involved in the analysis of big data.

No sessions available
4 Weeks

Genomic Medicine: Harnessing the Power of the Human Genome (FutureLearn)
University of Glasgow

Using the latest genomics research, discover how genomic technologies are changing how we understand and treat medical conditions. Explore cutting-edge genomics data analysis tools and technology. This course will advance your understanding of the rapidly growing use of genomics in the research, diagnosis, and treatment of clinical conditions.

Available now
4 Weeks

Data to Insight: An Introduction to Data Analysis and Visualisation (FutureLearn)
University of Auckland

A hands-on introduction emphasizing key ideas, computer skills and statistical thinking. Data is everywhere and the lessons it contains can be the key to making good decisions. We want to give you skills and the confidence to dive into data using computer software and start making discoveries. You will learn key elements of data science and to start thinking like a statistician.

No sessions available
5-12 Weeks

Data Tells a Story: Reading Data in the Social Sciences and Humanities (FutureLearn)
Loughborough University

Learn about the role of data in a range of disciplines and about some fundamental tools for extracting knowledge from data. How can we answer questions about the world around us? How can we make decisions about what to do? Over the past years, more and more people have turned to data for help. Huge amounts of data are collected every day from millions of sources. This data has a lot to tell us! But data by itself is mute—it can only help us if we learn to make it speak and tell its story.

No sessions available
2 Weeks

Big Data: Measuring and Predicting Human Behaviour (FutureLearn)
The University of Warwick

Join us to explore how the vast amounts of data generated today can help us understand and even predict how humans behave. We increasingly rely on networked computer systems and smart cards to support our everyday activities, and everything we do generates data – whether buying bread at the supermarket, taking a ride on public transport, or calling a friend for a chat.

No sessions available
5-12 Weeks

Business Ethics: Exploring Big Data and Tax Avoidance (FutureLearn)
University of Leeds

Learn why big data and tax avoidance are some of the biggest ethical issues facing businesses today and how they can be addressed. Explore the ethical complexities of big data and tax avoidance. Ethical behaviour brings significant benefits to businesses such as attracting employees, customers and investors. But failure to manage it properly can create huge challenges. On this course, you’ll discover big data and tax avoidance.

Feb 16th 2026
2 Weeks

Business Analytics Using Forecasting (FutureLearn)
National Tsing Hua University

Discover how businesses can harness the power of big data to improve their predictive analysis. Learn how to use data to create powerful business forecasts. Organisations currently collect a vast quantity of data about suppliers, clients, employees, citizens, transactions, and much more. However, many are unaware of the predictive power this ‘big data’ has if analysed correctly. On this course, you’ll learn about forecasting using big data, exploring how it’s used by businesses as an important component of decision making.

Jul 15th 2024
5-12 Weeks

Interprofessional Healthcare Informatics (Coursera)
University of Minnesota

Interprofessional Healthcare Informatics is a graduate-level, hands-on interactive exploration of real informatics tools and techniques offered by the University of Minnesota and the University of Minnesota's National Center for Interprofessional Practice and Education. We will be incorporating technology-enabled educational innovations to bring the subject matter to life. Over the 10 modules, we will create a vital online learning community and a working healthcare informatics network.

Jun 22nd 2026
5-12 Weeks