AI Skills for Engineers: Data Engineering and Data Pipelines (edX)

Good data is central to effective AI applications. This course teaches the basics of data for AI, covering what data is needed, how to extract data from existing databases and basic data skills including setup of a Python notebook environment, basic data exploration and simple data visualizations.

Artificial Intelligence and Machine Learning have become central techniques for most services and products, ranging from web-based systems to medical procedures, self-driving cars – even intelligent coffee makers.
Alongside algorithms, data is central to AI applications. Without solid data management, AI projects typically underperform or even fail. Unfortunately, the relevance and complexity of handling data is frequently underestimated.
That’s why we developed this course, which covers foundational questions like “Why is data important to AI?” and “What data does AI need?” as well as more application-oriented topics and skills, such as how to extract, load, and query data using an SQL pipeline.
In the second part of the course, you will learn basic data engineering skills, including how to set up your Python notebook environment, explore data with advanced pandas functions, and create simple and clear data visualizations.
This introductory course is aimed at learners who have little experience with data management, particularly in Python, and who want to develop Python-based AI applications in the future. The course covers a brief introduction to data management for AI, relational data management (e.g., SQL), and practical data handling skills in Python, pandas, and Jupyter.
This allows you to build a foundation to prepare for future AI and Machine Learning development with Python.

What you'll learn

  • Why Data Management is central to AI applications
  • What kind of data these applications need
  • How to obtain data for AI applications
  • How to extract and query data from existing databases using SQL
  • How to set up your Python notebooks
  • How to use the pandas library to work with tabular data
  • How to visualize data using the Seaborn library

Syllabus

Week 1:
We ask why we should care about data management for Artificial Intelligence and Machine Learning (ML) systems.
We examine which data are needed in the ML lifecycle and what properties that data should have.
We discuss the effort and time needed for data management activities, and look at possible data sources.

Week 2:
Key concepts of data management, such as databases, data models, and data schemas, are introduced.
The Relational Data Model is explained and contrasted with the Single-Table Model (like CSV and Excel) and Document Models.

Week 3:
We show how to extract data from existing relational databases using SQL queries and how to convert the query results into CSV files for further processing with pandas in Python notebooks.
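As a rough illustration of the Week 3 workflow, the sketch below runs a SQL query and writes the result to a CSV file using only the Python standard library. It is not taken from the course materials: the SQLite in-memory database, the `sensor_readings` table, its columns, and the `export_query_to_csv` helper are all hypothetical stand-ins for whatever existing database the course works with.

```python
import csv
import os
import sqlite3
import tempfile


def export_query_to_csv(conn, query, csv_path):
    """Run a SQL query and write its result set, with a header row, to a CSV file."""
    cursor = conn.execute(query)
    # cursor.description holds one tuple per result column; the name is element 0.
    header = [column[0] for column in cursor.description]
    rows = cursor.fetchall()
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)


# Hypothetical in-memory database standing in for an existing relational database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sensor_readings (sensor_id INTEGER, temperature REAL)")
conn.executemany(
    "INSERT INTO sensor_readings VALUES (?, ?)",
    [(1, 20.5), (1, 21.0), (2, 19.0)],
)

# Write the CSV into a temporary directory so the example leaves no stray files behind.
csv_path = os.path.join(tempfile.mkdtemp(), "readings.csv")
n_rows = export_query_to_csv(
    conn,
    "SELECT sensor_id, temperature FROM sensor_readings "
    "WHERE temperature > 20 ORDER BY temperature",
    csv_path,
)
conn.close()
```

The resulting file can then be loaded in a notebook with `pandas.read_csv(csv_path)` for further processing.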

Week 4:
The different ways of setting up and running Python notebooks are covered, including cloud-based notebooks and local notebooks.
We will take you step by step through the process of setting up your conda environment and installing the Jupyter and pandas libraries.
You will learn how to run notebooks in VS Code.

Week 5:
Become a pandas expert.
Explore the essential functionalities of pandas and, most importantly, write elegant and efficient Python pandas code to process and engineer tabular data.
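To give a flavor of the kind of tabular processing Week 5 refers to, here is a minimal pandas sketch that chains cleaning, feature creation, and aggregation. The data and the column names (`sensor_id`, `temperature_c`, `temperature_f`) are invented for illustration and are not taken from the course.

```python
import pandas as pd

# Hypothetical readings; the last row has a missing temperature.
df = pd.DataFrame(
    {
        "sensor_id": [1, 1, 2, 2],
        "temperature_c": [20.5, 21.0, 19.0, None],
    }
)

summary = (
    df.dropna(subset=["temperature_c"])  # drop rows with a missing reading
    .assign(temperature_f=lambda d: d["temperature_c"] * 9 / 5 + 32)  # derived column
    .groupby("sensor_id", as_index=False)
    .agg(mean_c=("temperature_c", "mean"), max_f=("temperature_f", "max"))
)
```

Method chaining like this keeps each transformation step visible on its own line, which is one common way to keep pandas code readable.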

Week 6:
You will learn how to make simple and clear scientific figures in Python using the Seaborn library.
Use the core functions provided by Seaborn to make beautiful statistical plots.

Related Courses

Excel for Everyone: Data Management (edX)
The University of British Columbia, UBCx

Further your Excel skills to manage larger datasets and more complex data wrangling, management and modelling. This intermediate Excel course builds on the teachings of the introductory Core Foundations course, teaching you to leverage the power of data calculations and reports to make informed personal or organizational decisions.

Self-Paced

Impacto de la Inteligencia Artificial en la Innovación de Negocios (edX)
Universidad Anáhuac, AnahuacX

In this course you will learn about the key factors required to found a digital business, from the importance of the business's purpose to implementations of AI across different industries. Artificial intelligence is a discipline of computer science that seeks to emulate human thought processes in order to create intelligent machines that can make decisions based on the data presented to them.

Self-Paced

Leading Digital and Data Decision Making (edX)
Arizona State University, ASUx

In this course, you will learn how leaders make managerial and relevant decisions based on data across multiple global industries. You will also explore how companies benefit from a digital ecosystem including sensors (IoT), Blockchain, artificial intelligence (AI), and augmented reality (AR) that move data-driven insights from the data scientist to the boardroom.

This course is archived
5-12 Weeks

Robotic process and intelligent automation for finance (edX)
ACCA

In this course we explain how automation can play a key role in delivering the requirement to have robust processes and clean data. By using automation tools and machine learning, finance leaders can identify, implement and configure the right solutions for their organisation. It also shows how tools, such as Python, can be applied to finance processes and the benefits this will bring.

Self-Paced

Understanding Artificial Intelligence through Algorithmic Information Theory (edX)
Institut Mines-Telecom, IMTx

Can we characterize intelligent behavior? Are there theoretical foundations on which Artificial Intelligence can be grounded? This course on Algorithmic Information will offer you such a theoretical framework. You will be able to see machine learning, reasoning, mathematics, and even human intelligence as abstract computations aiming at compressing information. This new power of yours will not only help you understand what AI does (or can’t do!) but also serve as a guide to design AI systems.

Self-Paced

Excel: Gestión de datos (edX)
Universitat Politècnica de València, UPValenciaX

In this course you will delve deeper into the tools Excel offers for processing and handling data. This is an intermediate-level course that will enable you to work with data and draw conclusions by grouping data, using pivot tables and pivot charts, performing what-if analysis, and linking data from other spreadsheets.

Self-Paced

Computer Vision and Image Processing Fundamentals (edX)
IBM

Learn about computer vision, one of the most exciting fields in machine learning, artificial intelligence, and computer science. It has applications in many industries, such as self-driving cars, robotics, augmented reality, and face detection in law enforcement agencies.

Self-Paced

Making Evidence-Based Strategic Decisions (edX)
University of Maryland, College Park, University System of Maryland - USM, USMx, UMD

Drive alignment among managers, employees and the organizational goals through data analytics and data products. This course on digital transformation will show you how to turn your organization into a decision-making factory. What makes a good business decision? How can we combine effective data analytics and feed robust foresight and scenario planning processes?

Self-Paced

The Beauty and Joy of Computing - AP® CS Principles Part 2 (edX)
University of California, Berkeley, BerkeleyX

A computer science principles course for anyone who wants to learn how to translate ideas into code. Discover the big ideas and thinking practices in computer science plus learn how to code using one of the friendliest programming languages, Snap! (based on Scratch).

No sessions available
13-24 Weeks

Mejora tu Negocio con Inteligencia Artificial (edX)
Universidad Anáhuac, AnahuacX

Optimize your business with new artificial intelligence tools to provide a better experience for your customers. AI (artificial intelligence), the Internet of Things, Big Data, virtual assistants, and digital technologies have changed the rules of the game in the business world.

Self-Paced