Big Data: adquisición y almacenamiento de datos (Coursera)

Big Data: adquisición y almacenamiento de datos (Coursera)

¿Estás interesado en tener un conocimiento más detallado sobre las herramientas y aplicaciones Big Data? En este curso aprenderás los principios para comprender la terminología, conceptos básicos y herramientas más importantes para resolver problemas de análisis de datos enfocándonos en los problemas y las aplicaciones. El objetivo es proporcionar una visión de sistema para entender los retos más importantes que nos encontramos cuando trabajamos en entornos con grandes volúmenes de datos.

Class Deals by MOOC List - Click here and see Coursera's Active Discounts, Deals, and Promo Codes.

En el curso se plantea una introducción a diversas herramientas utilizadas de forma común en la comunidad como Hadoop, Spark o Hive y tendrás que resolver diferentes retos de análisis de datos mediante su uso.
Al terminar el curso habrás adquirido conocimientos sobre el ecosistema de herramientas Big Data incluyendo ejemplos de uso con problemas industriales y científicos. Tendrás una serie de recursos sobre cómo un análisis a realizar se traduce en una serie de operaciones de recolección de datos, monitorización, almacenamiento, análisis y creación de informes sobre los resultados obtenidos. También adquirirás un criterio para elegir cuál es la herramienta más adecuada para resolver un cierto problema de análisis de datos a partir de los requerimientos de uso de las herramientas.
El curso está orientado tanto a estudiantes universitarios de primeros cursos de estudios universitarios relacionados con la informática, la ingeniería o las matemáticas, como a otros estudiantes con conocimientos de programación, interesados en aprender cómo utilizar de análisis de datos con herramientas de código abierto. Para realizar los ejercicios es necesario utilizar una máquina virtual que deberá ser instalada en tu ordenador.
Course 2 of 5 in the Big Data – Introducción al uso práctico de datos masivos Specialization.

Syllabus

WEEK 1
Introducción
La máquina virtual
A lo largo de estos cursos vamos a trabajar con un conjunto de herramientas contenidas en la máquina virtual Cloudera. En este apartado te explicamos cómo descargar e instalar dicha máquina virtual en tu ordenador. La MV-Cloudera requiere disponer de un equipo con las siguientes características: (1) máquina de 64 bits, (2) mínimo 6G de memoria (recomendable 8G), y (3) 20G disponibles en disco.
Ten en cuenta que bajar e instalar la máquina virtual te llevará tiempo dado el tamaño y complejidad de la misma
Introducción al ecosistema Apache Hadoop
En este módulo se van a introducir los conceptos básicos sobre el uso de Apache Hadoop y su utilización para plantear análisis de grandes conjuntos de datos. Se van a presentar las herramientas principales y la arquitectura del sistema.Visualiza los vídeos, contesta el cuestionario tantas veces como quieras, realiza el ejercicio práctico sobre Hadoop y HDFS, y accede a los foros para discutir los temas que te parezcan más interesantes.

WEEK 2
Tecnologías SQL y NoSQL. Consistencia, fiabilidad y escalabilidad
En este módulo se introducen conceptos básicos sobre la naturaleza de los datos a tratar y de qué forma los sistemas NoSQL se diferencian de las bases de datos relacionales. Se presenta el teorema CAP y se muestra su importancia en el contexto de los sistemas distribuidos. Finalmente, se muestran una serie de sistemas junto con su uso en la industria actual. Visualiza los vídeos, contesta el cuestionario tantas veces como quieras, y accede a los foros para discutir los temas que te parezcan más interesantes.

WEEK 3
Adquisición de datos
En este módulo se presentan los desafíos que hay que resolver a la hora de incorporar datos a los sistemas NoSQL y una breve introducción a las herramientas asociadas al ecosistema Hadoop más importantes. Visualiza los vídeos, contesta el cuestionario tantas veces como quieras, realiza el ejercicio práctico sobre Apache Scoop, y accede a los foros para discutir los temas que te parezcan más interesantes.

WEEK 4
Herramientas para el análisis de datos industrial
En este módulo se presenta el análisis industrial de grandes volúmenes de datos y se introducen una serie de herramientas y sistemas de segunda generación dedicados a resolver necesidades específicas de la industria.Visualiza los vídeos, contesta el cuestionario tantas veces como quieras, realiza los ejercicios prácticos sobre Apache Hive y Sparck, y accede a los foros para discutir los temas que te parezcan más interesantes.

Go to Class
MOOC List is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Related Courses

Machine Learning With Big Data (Coursera) Coursera
University of California, San Diego

Machine Learning With Big Data (Coursera)

Want to make sense of the volumes of data you have collected? Need to incorporate data-driven decisions into your process? This course provides an overview of machine learning techniques to explore, analyze, and leverage data. You will be introduced to tools and algorithms you can use to create machine learning models that learn from data, and to scale those models up to big data problems.

Jun 8th 2026
5-12 Weeks
Problem Solving with Excel (Coursera) Coursera
PwC

Problem Solving with Excel (Coursera)

This course explores Excel as a tool for solving business problems. In this course you will learn the basic functions of excel through guided demonstration. Each week you will build on your excel skills and be provided an opportunity to practice what you’ve learned. Finally, you will have a chance to put your knowledge to work in a final project. This course was created by PricewaterhouseCoopers LLP with an address at 300 Madison Avenue, New York, New York, 10017.

Jun 8th 2026
4 Weeks
Managing Big Data with MySQL (Coursera) Coursera
Duke University

Managing Big Data with MySQL (Coursera)

This course is an introduction to how to use relational databases in business analysis. You will learn how relational databases work, and how to use entity-relationship diagrams to display the structure of the data held within them. This knowledge will help you understand how data needs to be collected in business contexts, and help you identify features you want to consider if you are involved in implementing new data collection efforts.

Jun 8th 2026
5-12 Weeks
Big Data Modeling and Management Systems (Coursera) Coursera
University of California, San Diego

Big Data Modeling and Management Systems (Coursera)

Once you’ve identified a big data issue to analyze, how do you collect, store and organize your data using Big Data solutions? In this course, you will experience various data genres and management tools appropriate for each. You will be able to describe the reasons behind the evolving plethora of new big data platforms from the perspective of big data management systems and analytical tools.

Jun 8th 2026
5-12 Weeks
Marketing Analytics (Coursera) Coursera
University of Virginia

Marketing Analytics (Coursera)

Organizations large and small are inundated with data about consumer choices. But that wealth of information does not always translate into better decisions. Knowing how to interpret data is the challenge -- and marketers in particular are increasingly expected to use analytics to inform and justify their decisions. Marketing analytics enables marketers to measure, manage and analyze marketing performance to maximize its effectiveness and optimize return on investment (ROI). Beyond the obvious sales and lead generation applications, marketing analytics can offer profound insights into customer preferences and trends, which can be further utilized for future marketing and business decisions.

Jun 8th 2026
5-12 Weeks
Infonomics II: Business Information Management and Measurement (Coursera) Coursera
University of Illinois at Urbana-Champaign

Infonomics II: Business Information Management and Measurement (Coursera)

Even decades into the Information Age, accounting practices yet fail to recognize the financial value of information. Moreover, traditional asset management practices fail to recognize information as an asset to be managed with earnest discipline. This has led to a business culture of complacence, and the inability for most organizations to fully leverage available information assets. This second course in the two-part Infonomics series explores how and why to adapt well-honed asset management principles and practices to information, and how to apply accepted and new valuation models to gauge information’s potential and realized economic benefits.

Jun 10th 2026
4 Weeks
Cloud Computing Applications, Part 2: Big Data and Applications in the Cloud (Coursera) Coursera
University of Illinois at Urbana-Champaign

Cloud Computing Applications, Part 2: Big Data and Applications in the Cloud (Coursera)

Welcome to the Cloud Computing Applications course, the second part of a two-course series designed to give you a comprehensive view on the world of Cloud Computing and Big Data! In this second course we continue Cloud Computing Applications by exploring how the Cloud opens up data analytics of huge volumes of data that are static or streamed at high velocity and represent an enormous variety of information. Cloud applications and data analytics represent a disruptive change in the ways that society is informed by, and uses information.

Jun 8th 2026
4 Weeks
Business Analytics for Decision Making (Coursera) Coursera
University of Colorado Boulder

Business Analytics for Decision Making (Coursera)

In this course you will learn how to create models for decision making. We will start with cluster analysis, a technique for data reduction that is very useful in market segmentation. You will then learn the basics of Monte Carlo simulation that will help you model the uncertainty that is prevalent in many business decisions.

Jun 8th 2026
4 Weeks
Business Intelligence Concepts, Tools, and Applications (Coursera) Coursera
University of Colorado System

Business Intelligence Concepts, Tools, and Applications (Coursera)

This is the fourth course in the Data Warehouse for Business Intelligence specialization. Ideally, the courses should be taken in sequence. In this course, you will gain the knowledge and skills for using data warehouses for business intelligence purposes and for working as a business intelligence developer. You’ll have the opportunity to work with large data sets in a data warehouse environment and will learn the use of MicroStrategy's Online Analytical Processing (OLAP) and Visualization capabilities to create visualizations and dashboards.

Jun 8th 2026
5-12 Weeks
Exploratory Data Analysis (Coursera) Coursera
Johns Hopkins University

Exploratory Data Analysis (Coursera)

This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data.

Jun 8th 2026
4 Weeks
Introduction to Spreadsheets and Models (Coursera) Coursera
University of Pennsylvania

Introduction to Spreadsheets and Models (Coursera)

The simple spreadsheet is one of the most powerful data analysis tools that exists, and it’s available to almost anyone. Major corporations and small businesses alike use spreadsheet models to determine where key measures of their success are now, and where they are likely to be in the future. But in order to get the most out of a spreadsheet, you have know how to use it. This course is designed to give you an introduction to basic spreadsheet tools and formulas so that you can begin harness the power of spreadsheets to map the data you have now and to predict the data you may have in the future.

Jun 8th 2026
4 Weeks
Statistical Inference (Coursera) Coursera
Johns Hopkins University

Statistical Inference (Coursera)

Statistical inference is the process of drawing conclusions about populations or scientific truths from data. There are many modes of performing inference including statistical modeling, data oriented strategies and explicit use of designs and randomization in analyses. Furthermore, there are broad theories (frequentists, Bayesian, likelihood, design based, …) and numerous complexities (missing data, observed and unobserved confounding, biases) for performing inference.

Jun 8th 2026
4 Weeks