Big Data: procesamiento y análisis (Coursera)

This course presents the basic methods and techniques for processing and analyzing data in a Big Data context. It is not intended to be an exhaustive course on Machine Learning or statistical methods. It simply shows the main characteristics of these techniques, so that students gain an overview of the options data analysis offers for exploring data, confirming evidence and, ultimately, drawing conclusions.

The course is aimed at students and professionals who want an introduction to data processing and analysis in Big Data. Experience in data analysis or in Big Data environments is not a strict requirement. The course may be especially interesting to students with some knowledge of data analysis who want to get started in the Big Data environment, and also to students with some experience in Big Data environments who want to gain a stronger analytical perspective.
To that end, the course aims to offer realistic resources in the Big Data context. For this reason the work is done from a virtual machine, using the Jupyter application as the interface for developing the models and techniques with PySpark.

Course 3 of 5 in the Big Data – Introducción al uso práctico de datos masivos Specialization

Syllabus

WEEK 1
Introduction
The virtual machine
NOTE: If you already installed the virtual machine in the previous course of the Specialization, you do not need to do it again. Otherwise, read on.
The exercises and practical sessions are meant to show a practical case of data processing and analysis in a Big Data context. For this you will need to work with a virtual machine that comes with a set of components commonly used with Big Data already installed and configured. This section explains how to download and install the Cloudera virtual machine on your computer. The Cloudera VM requires a computer with the following characteristics: (1) a 64-bit machine, (2) at least 6 GB of memory (8 GB recommended), and (3) 20 GB of available disk space.
Keep in mind that downloading and installing the virtual machine will take time, given its size and complexity.
Practice material and working files
To follow the applied part of the course, answer the quizzes and work with the tools we explain, you will need access to a set of code files and the working datasets, which we have collected and compressed. Some videos carry a code in parentheses that matches the name of one of these files. This means that the corresponding video works with that file.
Below we explain how to add them to the virtual machine.
Exploratory Data Analysis
The first week introduces the course and the tools that will be used. It also presents the tasks related to Exploratory Data Analysis.

WEEK 2
Regression models
Module 2 introduces general modeling concepts (calibration and validation) and, in particular, linear regression and logistic regression models. From the Big Data perspective, it covers aspects of model regularization used to simplify models. As in the previous module, watch the videos, take the quizzes as many times as you like, and use the forums to discuss the topics you find most interesting.
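
As a rough illustration of the ideas named in this module (calibration on one part of the data, validation on a held-out part, and regularization that shrinks a model's coefficients), here is a minimal sketch in plain Python. It is not taken from the course material, which works with PySpark. The synthetic data, the function names `fit_ridge_1d`, `mse` and `make_data`, and the penalty values are all assumptions made for the example.

```python
import random


def fit_ridge_1d(xs, ys, lam):
    """Fit y ~ intercept + slope * x, minimizing squared error + lam * slope**2.

    The intercept is left unpenalized. lam = 0 gives ordinary least squares.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + lam)
    intercept = y_mean - slope * x_mean
    return intercept, slope


def mse(xs, ys, intercept, slope):
    """Mean squared error of the fitted line on the given points."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_data(n, seed=0):
    """Hypothetical synthetic data: y = 2 + 3x plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys


xs, ys = make_data(200)
# Calibrate (train) on the first 150 points, validate on the remaining 50.
x_train, y_train = xs[:150], ys[:150]
x_val, y_val = xs[150:], ys[150:]

results = {}
for lam in (0.0, 10.0, 1000.0):
    intercept, slope = fit_ridge_1d(x_train, y_train, lam)
    results[lam] = (slope, mse(x_val, y_val, intercept, slope))
    print(f"lam={lam:7.1f}  slope={slope:.3f}  validation MSE={results[lam][1]:.3f}")
```

A larger penalty pulls the slope toward zero. Comparing the validation error across penalty values is one common way to choose how much to regularize.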

WEEK 3
Regression and classification trees
Module 3 introduces the family of tree-based models (classification, regression, forests) and general aspects of uncertainty and overfitting. After each topic, or every few topics, you will find a quiz to check your understanding. Watch the videos, take the quizzes as many times as you like, and use the forums to discuss the topics you find most interesting.

WEEK 4
Neural networks and unsupervised techniques
Module 4 introduces the family of models based on neural networks, along with the basic unsupervised techniques, both for automatic classification (clustering) and for dimensionality reduction. In this module, in addition to the regular quizzes, you will complete a practical assignment applying the techniques learned so far. Watch the videos, take the quizzes as many times as you like, complete the practical exercise, and use the forums to discuss the topics you find most interesting.
