Beginning Llamafile for Local Large Language Models (LLMs) (Coursera)

Offered by Duke University,
Beginning Llamafile for Local Large Language Models (LLMs) (Coursera)

Learners will gain the skills to serve powerful language models as practical and scalable web APIs. They will learn how to use the llama.cpp example server to expose a large language model through a set of REST API endpoints for tasks like text generation, tokenization, and embedding extraction.

Class Deals by MOOC List - Click here and see Coursera's Active Discounts, Deals, and Promo Codes.

The course dives into the technical details of running the llama.cpp server, configuring various options to customize model behavior, and efficiently handling requests. Learners will understand how to interact with the API using tools like curl and Python, allowing them to integrate language model capabilities into their own applications.
Throughout the course, hands-on exercises and code examples reinforce the concepts and provide learners with practical experience in setting up and using the llama.cpp server. By the end, participants will be equipped to deploy robust language model APIs for a variety of natural language processing tasks.
The course stands out by focusing on the practical aspects of serving large language models in production environments using the efficient and flexible llama.cpp framework. It empowers learners to harness the power of state-of-the-art NLP models in their projects through a convenient and performant API interface.

What you'll learn

  • Learn how to serve large language models as production-ready web APIs using the llama.cpp framework
  • Understand the architecture and capabilities of the llama.cpp example server for text generation, tokenization, and embedding extraction
  • Gain hands-on experience in configuring and customizing the server using command line options and API parameters

Syllabus

Getting Started with Mozilla Llamafile
This week, you run language models locally. Keep data private. Avoid latency and fees. Use Mixtral model and llamafile.

Go to Class
MOOC List is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Related Courses

Cloud Computing Applications, Part 2: Big Data and Applications in the Cloud (Coursera) Coursera
University of Illinois at Urbana-Champaign

Cloud Computing Applications, Part 2: Big Data and Applications in the Cloud (Coursera)

Welcome to the Cloud Computing Applications course, the second part of a two-course series designed to give you a comprehensive view on the world of Cloud Computing and Big Data! In this second course we continue Cloud Computing Applications by exploring how the Cloud opens up data analytics of huge volumes of data that are static or streamed at high velocity and represent an enormous variety of information. Cloud applications and data analytics represent a disruptive change in the ways that society is informed by, and uses information.

Jun 22nd 2026
4 Weeks
Matrix Methods (Coursera) Coursera
University of Minnesota

Matrix Methods (Coursera)

Mathematical Matrix Methods lie at the root of most methods of machine learning and data analysis of tabular data. Learn the basics of Matrix Methods, including matrix-matrix multiplication, solving linear equations, orthogonality, and best least squares approximation. Discover the Singular Value Decomposition that plays a fundamental role in dimensionality reduction, Principal Component Analysis, and noise reduction.

Jun 22nd 2026
5-12 Weeks
Introduction to C# Programming and Unity (Coursera) Coursera
University of Colorado System

Introduction to C# Programming and Unity (Coursera)

This course is all about starting to learn how to develop video games using the C# programming language and the Unity game engine on Windows or Mac. Why use C# and Unity instead of some other language and game engine? Well, C# is a really good language for learning how to program and then programming professionally. Also, the Unity game engine is very popular with indie game developers; Unity games were downloaded 16,000,000,000 times in 2016! Finally, C# is one of the programming languages you can use in the Unity environment.

Jun 22nd 2026
4 Weeks
Practical Predictive Analytics: Models and Methods (Coursera) Coursera
University of Washington

Practical Predictive Analytics: Models and Methods (Coursera)

Statistical experiment design and analytics are at the heart of data science. In this course you will design statistical experiments and analyze the results using modern methods. You will also explore the common pitfalls in interpreting statistical arguments, especially those associated with big data. Collectively, this course will help you internalize a core set of practical and effective machine learning methods and concepts, and apply them to solve some real world problems.

Jun 22nd 2026
4 Weeks
Practical Machine Learning (Coursera) Coursera
Johns Hopkins University

Practical Machine Learning (Coursera)

One of the most common tasks performed by data scientists and data analysts are prediction and machine learning. This course will cover the basic components of building and applying prediction functions with an emphasis on practical applications. The course will provide basic grounding in concepts such as training and tests sets, overfitting, and error rates.

Jun 22nd 2026
4 Weeks
Getting started with TensorFlow 2 (Coursera) Coursera
Imperial College London

Getting started with TensorFlow 2 (Coursera)

Welcome to this course on Getting started with TensorFlow 2! In this course you will learn a complete end-to-end workflow for developing deep learning models with Tensorflow, from building, training, evaluating and predicting with models using the Sequential API, validating your models and including regularisation, implementing callbacks, and saving and loading models.

Jun 22nd 2026
5-12 Weeks
Machine Learning: Regression (Coursera) Coursera
University of Washington

Machine Learning: Regression (Coursera)

Case Study - Predicting Housing Prices. In our first case study, predicting house prices, you will create models that predict a continuous value (price) from input features (square footage, number of bedrooms and bathrooms,...). This is just one of the many places where regression can be applied. Other applications range from predicting health outcomes in medicine, stock prices in finance, and power usage in high-performance computing, to analyzing which regulators are important for gene expression.

Jun 22nd 2026
5-12 Weeks
API Design and Fundamentals of Google Cloud's Apigee API Platform (Coursera) Coursera
Google Cloud

API Design and Fundamentals of Google Cloud's Apigee API Platform (Coursera)

This course, API Design and Fundamentals of Google Cloud's Apigee API Platform, is the first in a series of three courses in the Developing APIs for Google Cloud's Apigee API Platform specialization. This course introduces you to API design and the fundamentals of the Apigee platform. The second course focuses on API security. The third course focuses on additional API development topics.

Jun 22nd 2026
2 Weeks
Machine Learning: Clustering & Retrieval (Coursera) Coursera
University of Washington

Machine Learning: Clustering & Retrieval (Coursera)

Case Studies: Finding Similar Documents. A reader is interested in a specific news article and you want to find similar articles to recommend. What is the right notion of similarity? Moreover, what if there are millions of other documents? Each time you want to a retrieve a new document, do you need to search through all other documents? How do you group similar documents together? How do you discover new, emerging topics that the documents cover?

Jun 22nd 2026
5-12 Weeks
Machine Learning With Big Data (Coursera) Coursera
University of California, San Diego

Machine Learning With Big Data (Coursera)

Want to make sense of the volumes of data you have collected? Need to incorporate data-driven decisions into your process? This course provides an overview of machine learning techniques to explore, analyze, and leverage data. You will be introduced to tools and algorithms you can use to create machine learning models that learn from data, and to scale those models up to big data problems.

Jun 22nd 2026
5-12 Weeks
Introduction to Artificial Intelligence (AI) (Coursera) Coursera
IBM

Introduction to Artificial Intelligence (AI) (Coursera)

In this course you will learn what Artificial Intelligence (AI) is, explore use cases and applications of AI, understand AI concepts and terms like machine learning, deep learning and neural networks. You will be exposed to various issues and concerns surrounding AI such as ethics and bias, & jobs, and get advice from experts about learning and starting a career in AI. You will also demonstrate AI in action with a mini project.

Jun 22nd 2026
4 Weeks
Neural Networks and Deep Learning (Coursera) Coursera
DeepLearning.AI

Neural Networks and Deep Learning (Coursera)

If you want to break into cutting-edge AI, this course will help you do so. Deep learning engineers are highly sought after, and mastering deep learning will give you numerous new career opportunities. Deep learning is also a new "superpower" that will let you build AI systems that just weren't possible a few years ago. In this course, you will learn the foundations of deep learning.

Jun 22nd 2026
4 Weeks