Beginning Llamafile for Local Large Language Models (LLMs)
Instructor: Noah Gift
Sponsored by the Coursera Learning Team
Learners will gain the skills to serve powerful language models as practical, scalable web APIs. They will learn how to use the llama.cpp example server to expose a large language model through a set of REST API endpoints for tasks such as text generation, tokenization, and embedding extraction.
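As a rough illustration of the kind of API the course covers, the sketch below queries a llama.cpp example server assumed to be running at http://localhost:8080. The endpoint paths (/completion, /tokenize, /embedding) and payload fields follow the server's documented REST interface, but exact names and defaults can vary by version, so treat this as a hedged sketch rather than a definitive reference.

# Hedged sketch: query a llama.cpp example server assumed to be
# listening on http://localhost:8080 (endpoint names may vary by version).
import requests

BASE = "http://localhost:8080"

# Text generation via the /completion endpoint.
completion = requests.post(
    f"{BASE}/completion",
    json={"prompt": "Explain what a llamafile is in one sentence.", "n_predict": 64},
    timeout=120,
).json()
print(completion.get("content"))

# Tokenization via the /tokenize endpoint.
tokens = requests.post(
    f"{BASE}/tokenize",
    json={"content": "Hello, llama.cpp!"},
    timeout=30,
).json()
print(tokens.get("tokens"))

# Embedding extraction via /embedding (the server typically must be
# started with an embedding-enabling flag for this to work).
embedding = requests.post(
    f"{BASE}/embedding",
    json={"content": "Hello, llama.cpp!"},
    timeout=30,
).json()
print(len(embedding.get("embedding", [])))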
What you'll learn
Learn how to serve large language models as production-ready web APIs using the llama.cpp framework
Understand the architecture and capabilities of the llama.cpp example server for text generation, tokenization, and embedding extraction
Gain hands-on experience configuring and customizing the server with command line options and API parameters (see the sketch after this list)
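For a sense of what configuring and customizing looks like in practice, the sketch below pairs an assumed server launch command with a completion request that overrides common sampling parameters. The flag names and parameter names are drawn from llama.cpp's server documentation, but they are assumptions here and should be checked against the build used in the course.

# Hedged sketch: customizing generation through API parameters.
# The launch command in this comment is an assumption; consult
# `./server --help` (or `./model.llamafile --help`) for your build, e.g.:
#   ./server -m models/mixtral.gguf -c 2048 --host 127.0.0.1 --port 8080
import requests

payload = {
    "prompt": "Write a haiku about local inference.",
    "n_predict": 48,      # cap the number of generated tokens
    "temperature": 0.7,   # lower values give more deterministic output
    "top_k": 40,          # sample from the 40 most likely tokens
    "top_p": 0.95,        # nucleus sampling threshold
    "stop": ["\n\n"],     # stop generation at a blank line
}

response = requests.post("http://127.0.0.1:8080/completion", json=payload, timeout=120)
response.raise_for_status()
print(response.json().get("content"))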
Details to know
4 assignments
Earn a career certificate
There is 1 module in this course
This week, you will run large language models locally to keep your data private and avoid network latency and API fees, using the Mixtral model with llamafile.
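As a hedged sketch of that workflow, the snippet below assumes a Mixtral llamafile has already been downloaded, made executable, and started locally (recent releases serve on http://localhost:8080 by default), and then sends a chat request to its OpenAI-compatible endpoint. The file name and flags are placeholders, not the course's exact instructions.

# Hedged sketch: talking to a locally running Mixtral llamafile.
# Assumed setup (names and flags are placeholders; check the llamafile docs):
#   chmod +x mixtral-8x7b-instruct.llamafile
#   ./mixtral-8x7b-instruct.llamafile --server --nobrowser
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",  # OpenAI-compatible endpoint
    json={
        "model": "mixtral",  # largely informational when a single local model is loaded
        "messages": [
            {"role": "user", "content": "Why run an LLM locally instead of calling a cloud API?"}
        ],
        "temperature": 0.7,
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])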
What's included
8 videos, 17 readings, 4 assignments, 1 discussion prompt, 4 ungraded labs