Quantization Fundamentals with Hugging Face
Instructor: Younes Belkada

Generative AI models, such as large language models, often exceed the capabilities of consumer-grade hardware and are expensive to run. Compressing models through methods such as quantization makes them more efficient, faster, and more accessible while minimizing performance degradation, so they can run on a wide variety of devices, including smartphones, personal computers, and edge devices.
What you'll learn
- Learn how to compress models with the Hugging Face Transformers library and the Quanto library (illustrative sketches follow this list).
- Learn about linear quantization, a simple yet effective method for compressing models.
- Practice quantizing open-source multimodal and language models.
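To give a concrete sense of the technique the course covers, here is a minimal sketch of linear (affine) quantization in plain PyTorch. It is illustrative code written for this summary, not the course's notebook material, and the function names are made up for the example:

```python
import torch

def linear_quantize(x: torch.Tensor, n_bits: int = 8):
    """Asymmetric linear (affine) quantization of a float tensor to n-bit integers."""
    qmin, qmax = 0, 2 ** n_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)            # step size between quantized levels
    zero_point = int(round(qmin - (x.min() / scale).item()))  # integer that represents 0.0
    q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax).to(torch.uint8)
    return q, scale, zero_point

def linear_dequantize(q: torch.Tensor, scale: torch.Tensor, zero_point: int):
    """Map the integers back to approximate float values."""
    return scale * (q.float() - zero_point)

w = torch.randn(4, 4)
q, scale, zp = linear_quantize(w)
print((w - linear_dequantize(q, scale, zp)).abs().max())  # small rounding error, bounded by ~scale/2
```

And here is a rough sketch of weight-only 8-bit quantization of a Transformers model using the Quanto library's quantize/freeze workflow. The checkpoint name is just an illustrative small model, and import paths and argument names may differ between quanto and optimum-quanto releases:

```python
# pip install transformers quanto  (newer releases expose a similar API as optimum-quanto)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from quanto import quantize, freeze, qint8

model_name = "EleutherAI/pythia-410m"  # illustrative small causal LM; any checkpoint could be used
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Replace the model's float weights with 8-bit quantized versions (activations left in float).
quantize(model, weights=qint8, activations=None)
freeze(model)  # materialize the quantized weights so the float originals can be discarded

inputs = tokenizer("Quantization makes models", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```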
Details to know
Only available on desktop
Learn, practice, and apply job-ready skills in less than 2 hours
- Receive training from industry experts
- Gain hands-on experience solving real-world job tasks
How you'll learn
- Hands-on, project-based learning: practice new skills by completing job-related tasks with step-by-step instructions.
- No downloads or installation required: access the tools and resources you need in a cloud environment.
- Available only on desktop: this project is designed for laptops or desktop computers with a reliable Internet connection, not mobile devices.