EdX

Big Data for Agri-Food: Principles and Tools (edX)

As the big data era unfolds, sensor and information technologies are evolving quickly. As a result, science and businesses are generating enormous amounts of data. Yet, to reap the actionable business solutions data can unveil, we must learn to ask the right questions. Join Wageningen University & Research as the data team bridges the gap between the complexity of computer science and its practical application. Decipher your unsampled big data set.

Demystify complex big data technologies
The sheer volume of a typical data set doesn’t fit on even the largest computer. And the tools that can handle big data seem too complex to grasp. To tackle these challenges, principles – such as immutability and pure functions – will help you understand big data technology. This makes big data management accessible, regardless of the programming language.
Specifically, you will consider whether to scale up or scale out, and how to process big data with map-reduce on clusters. In short, learn to recognise and put into practice the scalable solution that’s right for your situation. To illustrate, we will use current tools – such as Hadoop HDFS and Apache Spark – on user-friendly, hands-on examples from the agri-food sector. However, these principles can also be applied to other sectors.

Complexity of data collection and processing
Agri-food deserves special focus when it comes to choosing robust commercial data management technologies due to its inherent variability and uncertainty. Ranked the #1 university in Animal Sciences and Agriculture, Wageningen University & Research specialises in the interdisciplinarity between its knowledge domain of healthy food and living environment on the one hand and data science, artificial intelligence (AI) and robotics on the other.
Combining data from the latest sensing technologies (e.g. weather data) with machine learning/deep learning methodologies allows us to unlock insights we didn’t have access to before. In the areas of smart farming and precision agriculture, this allows us to:

  • Better manage dairy cattle by combining animal-level data on behaviour, health and feed with milk production and composition from milking machines.
  • Reduce the amount of fertilisers (nitrogen), pesticides (chemicals) and water used on crops by monitoring individual plants with a robot or drone.
  • More accurately predict crop yields on a continental scale by combining current with historic data on soil, weather patterns and crop yields.

In short, big data will allow us to bring forth effective solutions for smarter, innovative products. The possibilities are seemingly endless!

For whom?
You are a manager or researcher with a big data set on your hands, perhaps considering investing in big data tools. You’ve done some programming before, but your skills are a bit rusty. You want to learn how to effectively and efficiently manage very large datasets. This course will enable you to see and evaluate opportunities for the application of big data technologies within your domain. Enrol now.
This course has been partially supported by the European Union Horizon 2020 Research and Innovation program (Grant #810 775, “Dragon”).

What you'll learn

  • Recognize big data characteristics (volume, velocity, variety, veracity)
  • The difference between scaling up and scaling out
  • Big data principles: immutability and pure functions
  • Processing big data with map-reduce, using clusters
  • Understand technologies: distributed file systems, Hadoop
  • How dataframes and wrapper technology (Apache Spark) make life easier
  • The big data workflow and pipeline
  • How data is organized in datalakes, using lazy evaluation
  • Develop insight into how to apply this to your own case

Syllabus

Module 1: Big data definition and characteristics
In module 1, you will learn how to recognize the characteristics of a big data problem in agriculture, to see where its biggest challenge lies. Should the solution focus on size, speed, various formats or uncertainty of data? Should you scale up or scale out?

Module 2: Big data principles: what are they and why do we need them
In module 2, you'll learn the principles that are required for scaling out: immutability and pure functions, and map-reduce. What are these and why do we need them?
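To make these principles concrete, here is a minimal sketch in plain Python, not taken from the course material. The readings, the `Reading` type and all function names are hypothetical. Each tuple in `PARTITIONS` stands in for a block of data that would live on a separate node when scaling out. Because the data is immutable and the functions are pure, every partition can be processed independently and the partial results merged in any order.

```python
from functools import reduce
from typing import NamedTuple


class Reading(NamedTuple):
    """One immutable milk-yield record (hypothetical example data)."""
    cow_id: str
    milk_litres: float


# Made-up readings. Tuples cannot be modified after creation.
PARTITIONS = (
    (Reading("cow-1", 12.0), Reading("cow-2", 9.5)),
    (Reading("cow-1", 11.0), Reading("cow-3", 14.0)),
)


def map_reading(reading):
    """Map step: a pure function turning one record into a (key, value) pair."""
    return (reading.cow_id, reading.milk_litres)


def map_partition(partition):
    """Sum the values per key within a single partition."""
    totals = {}
    for key, value in map(map_reading, partition):
        totals[key] = totals.get(key, 0.0) + value
    return totals


def merge_totals(left, right):
    """Reduce step: return a new dict and leave both inputs untouched."""
    merged = dict(left)
    for key, value in right.items():
        merged[key] = merged.get(key, 0.0) + value
    return merged


def total_milk(partitions):
    """Map every partition, then reduce the partial results into one total."""
    return reduce(merge_totals, map(map_partition, partitions), {})


TOTALS = total_milk(PARTITIONS)
print(TOTALS)
```

Reversing the order of the partitions gives the same totals, which is the property a cluster relies on when nodes finish their work at different times.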

Module 3: Bring those principles to practice
Module 3 shows you how to bring those principles into practice. You will learn what a cluster is, and how a distributed file system in a client-server architecture works, with Hadoop. You will understand why such a system is indeed scalable.

Module 4: Big data technologies that make implementation so much easier
Module 4 goes further into the application of big data technology, the “big data stack of technologies”. The main message here is that if you know what you want to do, these technologies can take the work out of your hands. For example, you will see Apache Spark, a big data technology platform, that applies map-reduce for you.

Module 5: The big data workflow and pipeline; the how and why of datalakes
Module 5 dives deeper into the data. You'll learn about datalakes and why a datalake is different from a traditional database. You'll understand what a big data workflow looks like and what a pipeline is.
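As a rough illustration of a pipeline with lazy evaluation, the sketch below chains Python generators over a few made-up crop-yield lines. It is an assumption-laden stand-in, not the course's own tooling: the raw lines, the field names and every function name are hypothetical, and a real datalake would hold files on a distributed file system rather than a tuple in memory. The point it shows is that building the pipeline reads nothing. Records are pulled through the stages one at a time, only when a result is requested.

```python
# Hypothetical raw records, kept exactly as they arrived (one of them is malformed).
RAW_LINES = (
    "field-a,2021,7.2",
    "field-b,2021,not-a-number",
    "field-a,2022,8.1",
)


def parse(lines):
    """Stage 1: split each raw line into (field, year, raw_yield)."""
    for line in lines:
        field, year, raw_yield = line.split(",")
        yield field, int(year), raw_yield


def keep_valid(rows):
    """Stage 2: convert the yield to a number and skip rows that cannot be converted."""
    for field, year, raw_yield in rows:
        try:
            yield field, year, float(raw_yield)
        except ValueError:
            continue


def yields_for(rows, wanted_field):
    """Stage 3: keep only the yield values of one field."""
    for field, _year, value in rows:
        if field == wanted_field:
            yield value


def build_pipeline(lines, wanted_field):
    """Chain the stages. No line is read until the result is iterated."""
    return yields_for(keep_valid(parse(lines)), wanted_field)


RESULT = list(build_pipeline(RAW_LINES, "field-a"))
print(RESULT)
```

Calling `next()` on the pipeline instead of `list()` would read just enough input to produce the first value, which matters when the input is too large to load at once.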

Related Courses

Introduction to Computer Science and Programming (edX) EdX
Tokyo Institute of Technology,TokyoTechX

The term “Computation” refers to the action performed by a computer. A computation can be a basic operation, or it can be a sophisticated computer simulation requiring a large amount of data and substantial resources. This course introduces learners with no prior knowledge to the basics and key concepts of computer science. By following the lectures and exercises of this course, you will gain an understanding of algorithms and real experience of programming in the Ruby language.

Self-Paced
Programming for Data Science (edX) EdX
University of Adelaide,AdelaideX

Learn how to apply fundamental programming concepts, computational thinking and data analysis techniques to solve real-world data science problems. There is a rising demand for people with the skills to work with Big Data sets and this course can start you on your journey through our Big Data MicroMasters program towards a recognised credential in this highly competitive area. Using practical activities you will learn how digital technologies work and will develop your coding skills through engaging and collaborative assignments.

Self-Paced
Excel avanzado: importación y análisis de datos (edX) EdX
Universitat Politècnica de València,UPValenciaX

Learn advanced techniques and strategies for importing, consolidating and visualizing data from any source with Excel. In this data analysis and interpretation course, we present advanced data import techniques and a range of strategies for consolidating and preparing the data once imported, so that you can draw the conclusions you need (based on our experience with Microsoft Excel and demonstrated with real cases).

Self-Paced
Knowledge Management and Big Data in Business (edX) EdX
The Hong Kong Polytechnic University,HKPolyUx

Learn why and how knowledge management and Big Data are vital to the new business era. The business landscape is changing so rapidly that traditional management, business and computing courses do not meet the needs for the next generation of workers in the business world. Most traditional methods are of a repetitive, rule-based nature and will be gradually replaced by Artificial Intelligence.

Self-Paced
Distributed Machine Learning with Apache Spark (edX) EdX
University of California, Berkeley,BerkeleyX

Learn the underlying principles required to develop scalable machine learning pipelines and gain hands-on experience using Apache Spark. Machine learning aims to extract knowledge from data, relying on fundamental concepts in computer science, statistics, probability and optimization.

No sessions available
4 Weeks
Enabling Technologies for Data Science and Analytics: The Internet of Things (edX) EdX
Columbia University,ColumbiaX

Discover the relationship between Big Data and the Internet of Things (IoT). The Internet of Things is rapidly growing. It is predicted that more than 25 billion devices will be connected by 2020. In this data science course, you will learn about the major components of the Internet of Things and how data is acquired from sensors. You will also examine ways of analyzing event data, sentiment analysis, facial recognition software and how data generated from devices can be used to make decisions.

Self-Paced
Big Data, Hadoop, and Spark Basics (edX) EdX
IBM

This course provides foundational big data practitioner knowledge and analytical skills using popular big data tools, including Hadoop and Spark. Learn and practice your big data skills hands-on. Organizations need skilled, forward-thinking Big Data practitioners who can apply their business and technical skills to unstructured data such as tweets, posts, pictures, audio files, videos, sensor data, satellite imagery, and more, to identify behaviors and preferences of prospects, clients, competitors, and others.

Self-Paced
Big Data Strategies to Transform Your Business (edX) EdX
Delft University of Technology,DelftX

Make your organization’s business strategy and model, as well as your own career path, future-proof by using big data’s disruptive power. While big data infiltrates all walks of life, most firms have not changed sufficiently to meet the challenges that come with it. In this course, you will learn how to develop a big data strategy, transform your business model and your organization. This course will enable professionals to take their organization and their own career to the next level, regardless of their background and position.

Self-Paced