Data Science with NumPy, Sets, and Dictionaries (Coursera)

Offered by Duke University

Become proficient in NumPy, a fundamental Python package crucial for careers in data science. This comprehensive course is tailored to novice programmers aspiring to become data scientists, software developers, data analysts, machine learning engineers, data engineers, or database administrators.

Starting with foundational computer science concepts, such as object-oriented programming and data organization using sets and dictionaries, you'll progress to more intricate data structures like arrays, vectors, and matrices. Hands-on practice with NumPy will equip you with the skills to work with large datasets and solve data problems effectively. You'll write Python programs to manipulate and filter data and to draw useful insights from large datasets.
By the end of the course, you'll be able to summarize datasets by calculating statistics such as averages, minimums, and maximums. You'll also learn to speed up data analysis with vectorization and to randomize data.
Throughout the course, you'll apply a range of data structures and analytic techniques to data science tasks, including mathematical operations, text file analysis, and image processing. Guided, step-by-step assignments each week will reinforce your skills, enabling you to solve problems and draw data-driven conclusions independently.
Prepare yourself for a rewarding career in data science by mastering NumPy and honing your programming prowess. Start this transformative learning experience today!

Syllabus

Sets and Dictionaries: Storing and Working with Data
Module 1
This week, you will learn the basics of object-oriented programming and how to use sets and dictionaries to store and work with data in Python. You will apply these concepts to mathematical and analytical tasks, including solving geometric problems with circles and counting words in a document.
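As a rough illustration of the kind of exercise this module describes, the sketch below pairs a small class with a dictionary-based word count. The `Circle` class, the `count_words` function, and the sample sentence are hypothetical names and data chosen for this example; they are not taken from the course materials.

```python
import math


class Circle:
    """A hypothetical circle type, used only for illustration."""

    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2

    def circumference(self):
        return 2 * math.pi * self.radius


def count_words(text):
    """Map each lowercase word to the number of times it appears."""
    counts = {}
    for word in text.lower().split():
        word = word.strip(".,;:!?\"'")  # drop surrounding punctuation
        if word:
            counts[word] = counts.get(word, 0) + 1
    return counts


# Made-up sample text (an assumption for this sketch).
sample = "The cat saw the dog. The dog saw the cat!"
word_counts = count_words(sample)   # dictionary: word -> count
unique_words = set(word_counts)     # set: each distinct word once
unit_circle = Circle(1.0)

print(word_counts)
print(unique_words)
print(unit_circle.area(), unit_circle.circumference())
```

The dictionary keeps a count per word, while the set only records which words occur, which is the distinction the module description draws between the two structures.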

NumPy and Vectors
Module 2
This week, you will learn how to use NumPy, one of the most useful Python packages in data science, and meet a new data structure, the array, beginning with its simplest form, the vector. With NumPy and your new understanding of vectors, you will build histograms and analyze household income distribution data for the United States, drawing your own data-driven conclusions.
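A minimal sketch of what building a histogram from a vector might look like with NumPy is shown here. The income values and bin edges are invented for illustration and are not real United States data or course data.

```python
import numpy as np

# Made-up household incomes in thousands of dollars (not real data).
incomes = np.array([12, 25, 31, 38, 47, 52, 58, 75, 90, 140])

# Arithmetic on a vector applies to every element at once.
incomes_in_dollars = incomes * 1000

# Bin edges are an arbitrary choice for this sketch.
bin_edges = np.array([0, 25, 50, 100, 150])

# np.histogram counts how many values fall into each bin.
counts, edges = np.histogram(incomes, bins=bin_edges)

# Share of households in each bin.
shares = counts / counts.sum()

for low, high, share in zip(edges[:-1], edges[1:], shares):
    print(f"{low}-{high}k: {share:.0%}")
```

Each bin includes its lower edge and excludes its upper edge, except the last bin, which includes both.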

Matrices and Arrays
Module 3
This week, you will first learn how NumPy handles data in your program using views and copies of your data. You will then learn how to work with more complex arrays called matrices, as well as how you can subset, filter, and modify data in matrices. Finally, you will write your own programs to manipulate data matrices and report your results for a given dataset.
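The difference between views and copies, and the idea of filtering a matrix, can be sketched as follows. The matrix values and variable names are assumptions made for this example only.

```python
import numpy as np

# A 3x4 matrix holding the integers 0 through 11.
matrix = np.arange(12).reshape(3, 4)

# Basic indexing returns a view: it shares memory with the original.
first_row_view = matrix[0]
first_row_view[0] = 100            # also changes matrix[0, 0]

# .copy() returns independent data.
second_row_copy = matrix[1].copy()
second_row_copy[0] = -1            # matrix[1, 0] is unchanged

# Subsetting: all rows, columns 1 and 2.
middle_columns = matrix[:, 1:3]

# Filtering with a boolean mask returns a copy of the selected values.
large_values = matrix[matrix > 8]

# Assigning through a mask modifies the matrix in place.
matrix[matrix > 8] = 0

print(matrix)
print(large_values)
```

Because `first_row_view` is a view, writing to it changes `matrix`, whereas writing to `second_row_copy` does not.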

Summarizing Datasets, Performance Optimization, and Data Randomization
Module 4
This week, you will learn how to use NumPy to summarize data in matrices (for example, by calculating averages, minimums, and maximums) and how to begin analyzing and manipulating image data. You will also explore two new data science techniques: vectorization, which makes analysis of data matrices more computationally efficient, and randomization of data.
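The topics in this module can be sketched in a few lines of NumPy. The data, the tiny "image", the `scale_loop` helper, and the seed below are all assumptions chosen for illustration, not course materials.

```python
import numpy as np

data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

# Summaries along an axis: axis=0 works down columns, axis=1 across rows.
column_means = data.mean(axis=0)
row_maxes = data.max(axis=1)
overall_min = data.min()


def scale_loop(values, factor):
    """Element-by-element scaling with explicit Python loops."""
    result = np.empty_like(values)
    for i in range(values.shape[0]):
        for j in range(values.shape[1]):
            result[i, j] = values[i, j] * factor
    return result


# Vectorized equivalent: one expression, no Python-level loop.
scaled_vectorized = data * 2.0

# A tiny grayscale "image" is just a matrix of pixel intensities.
image = np.array([[0, 128], [255, 64]], dtype=np.uint8)
inverted = 255 - image

# Randomization: shuffle the rows with a seeded generator.
rng = np.random.default_rng(seed=0)
shuffled_rows = rng.permutation(data)   # returns a shuffled copy

print(column_means, row_maxes, overall_min)
print(inverted)
print(shuffled_rows)
```

The loop and the vectorized expression produce the same result; on large arrays the vectorized form is typically much faster because the work happens inside NumPy's compiled code.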

Related Courses

Machine Learning Foundations: A Case Study Approach (Coursera) Coursera
University of Washington

Do you have data and wonder what it can tell you? Do you need a deeper understanding of the core ways in which machine learning can improve your business? Do you want to be able to converse with specialists about anything from regression and classification to deep learning and recommender systems? In this course, you will get hands-on experience with machine learning from a series of practical case-studies.

Jun 22nd 2026
5-12 Weeks
Hypothesis Testing with Python and Excel (Coursera) Coursera
Tufts University

In today's job market, leaders need to understand the fundamentals of data to be competitive. An essential procedure to understand business and analytics is hypothesis testing. This short course, designed by Tufts University expert faculty, will teach the fundamentals of hypothesis testing of a population mean and a population proportion, using Excel and Python for calculations. You'll also discover the central limit theorem, which is essential for hypothesis testing. To conclude the course, you will apply your newfound skills by creating a plan for an experiment in your own workplace that uses hypothesis testing.

Jun 23rd 2026
1 Week
Using Python to Interact with the Operating System (Coursera) Coursera
Google

By the end of this course, you’ll be able to manipulate files and processes on your computer’s operating system. You’ll also have learned about regular expressions -- a very powerful tool for processing text files -- and you’ll get practice using the Linux command line on a virtual machine. And, this might feel like a stretch right now, but you’ll also write a program that processes a bunch of errors in an actual log file and then generates a summary file. That’s a super useful skill for IT Specialists to know.

Jun 23rd 2026
5-12 Weeks
Crash Course on Python (Coursera) Coursera
Google

This course is designed to teach you the foundations in order to write simple programs in Python using the most common structures. No previous exposure to programming is needed. By the end of this course, you'll understand the benefits of programming in IT roles; be able to write simple programs using Python; figure out how the building blocks of programming fit together; and combine all of this knowledge to solve a complex programming problem.

Jun 23rd 2026
5-12 Weeks
Effective Problem-Solving and Decision-Making (Coursera) Coursera
University of California, Irvine

Critical thinking – the application of scientific methods and logical reasoning to problems and decisions – is the foundation of effective problem solving and decision making. Critical thinking enables us to avoid common obstacles, test our beliefs and assumptions, and correct distortions in our thought processes. Gain confidence in assessing problems accurately, evaluating alternative solutions, and anticipating likely risks. Learn how to use analysis, synthesis, and positive inquiry to address individual and organizational problems and develop the critical thinking skills needed in today’s turbulent times. Using case studies and situations encountered by class members, explore successful models and proven methods that are readily transferable on-the-job.

Jun 22nd 2026
4 Weeks
Regression Models (Coursera) Coursera
Johns Hopkins University

Linear models, as their name implies, relate an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares and inference using regression models.

Jun 22nd 2026
4 Weeks
Machine Learning: Regression (Coursera) Coursera
University of Washington

Case Study - Predicting Housing Prices. In our first case study, predicting house prices, you will create models that predict a continuous value (price) from input features (square footage, number of bedrooms and bathrooms, and so on). This is just one of the many places where regression can be applied. Other applications range from predicting health outcomes in medicine, stock prices in finance, and power usage in high-performance computing, to analyzing which regulators are important for gene expression.

Jun 22nd 2026
5-12 Weeks