Learn to serve powerful language models as practical, scalable web APIs using the llama.cpp server. Keep your data private and avoid cloud latency and fees.
Class Deals by MOOC List - Click here and see EdX's Active Discounts, Deals, and Promo Codes.
In this course, you will:
- Gain the skills to expose large language models through REST API endpoints
- Learn how to configure the llama.cpp server to customize model behavior
- Understand how to efficiently handle requests and integrate language model capabilities into applications
- Reinforce concepts through hands-on exercises and code examples using tools like curl and Python
- Be equipped to deploy robust language model APIs for various NLP tasks
The course empowers you to harness state-of-the-art NLP models in your projects through a convenient and performant API interface, focusing on the practical aspects of serving large language models in production environments using the efficient and flexible llama.cpp framework.
This course is part of the Generative AI Fundamentals Professional Certificate.
What you'll learn
- Installing and using the Cosmopolitan Libc toolkit
- Running language models locally with llamafile
- Understanding the Mixtral model license and llamafile packaging
- Developing portable command-line interfaces with Cosmopolitan
- Interacting with the llamafile API for NLP tasks
Syllabus
Module 1: Getting Started with Mozilla Llamafile (2 hours)
- Video: Meet your instructor: Alfredo Deza (1 minute) [Preview module]
- Reading: Meet your instructor: Noah Gift (1 minute)
- Reading: Connect with your instructors (1 minute)
- Reading: Course structure and etiquette (1 minute)
- Reading: Key Terms (5 minutes)
- Reading: What is Llamafile? (5 minutes)
- Video: Llamafile overview by Mozilla (5 minutes)
- Video: Using the Llamafile API (2 minutes)
- Video: Creating a Llamafile (5 minutes)
- Reading: Cosmopolitan (5 minutes)
- Video: Building portable binaries with Cosmopolitan (4 minutes)
- Video: Building a phrase generator with cosmopolitan (3 minutes)
- Reading: Lesson Reflection (5 minutes)
- Assignment: Quiz-Key Components of Llamafile (10 minutes)
- Reading: Key Terms (1 minute)
- Reading: Bash Phrase Generator (5 minutes)
- Ungraded Lab: Cosmopolitan (10 minutes)
- Reading: Lesson Reflection (5 minutes)
- Assignment: Quiz-Portable CLI with Cosmopolitan (10 minutes)
- Reading: Key Terms (5 minutes)
- Reading: What are LLMs? (5 minutes)
- Video: Getting Started with Llamafile (3 minutes)
- Video: Llamafile local system metrics (3 minutes)
- Ungraded Lab: Portable CLI (10 minutes)
- Reading: Lesson Reflection (5 minutes)
- Assignment: Quiz-Running Llamafile (10 minutes)
- Reading: Key Terms (1 minute)
- Reading: Llamafile server (5 minutes)
- Ungraded Lab: Local Llamafile API (10 minutes)
- Reading: Course Conclusion (5 minutes)
- Reading: Next Steps (1 minute)
- Assignment: Final Quiz-Llamafile (10 minutes)
- Discussion Prompt: Meet and Greet (optional) (1 minute)