What Are Program-Aided Language Models?

Written by Coursera Staff

Learn how program-aided language models use Python interpreters to solve mathematical problems with higher accuracy than an LLM's reasoning alone.

[Featured Image] Two colleagues look at a tablet while one explains the program-aided language models used to create a solution.

Large language models (LLMs) can struggle to reason their way to a correct solution, and program-aided language models (PAL) aim to alleviate this issue. Typical LLMs use deep neural networks trained on large amounts of data to generate responses. Examples of LLMs you may have interacted with include ChatGPT and Google Gemini. Many of an LLM's limitations occur in its “reasoning” stage, especially in mathematics, leading it to produce wrong or misleading information. To address this issue, program-aided language models offload that step to an interpreter, such as the Python interpreter: the LLM translates the problem into code, and the interpreter runs the code to generate the solution.

PAL is a novel approach that combines LLMs, which encode and decode language, with an interpreter, such as Python's, that can take the reasoning from the LLM and compute a solution. This makes PAL a more effective model for solving mathematical, logical, and algorithmic problems.

Explore how PALs work and how they differ from chain-of-thought prompting. Discover more about their capabilities and the potential benefits of using PALs.


guided project

ChatGPT Prompt Engineering for Developers

Go beyond the chat box. Use API access to leverage LLMs into your own applications, and learn to build a custom chatbot. In ChatGPT Prompt Engineering for ...

4.7

(1,373 ratings)

95,064 already enrolled

Beginner level

Average time: 1 hour(s)

Learn at your own pace

How do program-aided language models work?

Program-aided language models work by having the LLM interpret the prompt through natural language processing (NLP) and, during its reasoning step, write a program instead of a final answer. A Python interpreter then runs that program to produce the solution. This bypasses the typical LLM approach of computing an answer from its own training alone: PAL leverages just the language-interpretation and programming abilities of the LLM while passing the final step to an interpreter.

To understand exactly how this works, let's break the model down into four phases:

  1. You prompt the PAL with your question. 

  2. The PAL uses NLP to decode the language. 

  3. It translates the language into a program so that the Python interpreter, or any interpreter of your choice, performs the reasoning instead of the LLM itself. 

  4. It runs the code through a Python interpreter to print the answer. 

The critical aspect of PAL is that it does not depend on the reasoning abilities of the LLM itself. It delegates this responsibility to a programming language, such as Python, which is better suited to handle the logic of mathematics.
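As an illustration, the following sketch shows the kind of program a PAL-style LLM might emit for a classic word problem. The problem and variable names here are illustrative; the key point is that the Python interpreter, not the LLM, computes the final number.

```python
# Word problem: "Roger has 5 tennis balls. He buys 2 cans of tennis balls.
# Each can has 3 balls. How many tennis balls does he have now?"
# In PAL, the LLM emits its reasoning as Python statements like these,
# and the interpreter computes the answer.
tennis_balls = 5          # Roger starts with 5 balls
bought_balls = 2 * 3      # 2 cans of 3 balls each
answer = tennis_balls + bought_balls
print(answer)  # prints 11
```

Even if the LLM would have miscalculated the arithmetic on its own, the interpreter evaluates the expressions exactly.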

Program-aided language model reasoning vs. LLM chain-of-thought reasoning

PAL differs from popular chain-of-thought (CoT) prompting because it utilizes a programming language to help LLMs perform mathematical computations. CoT prompting is a prompting technique that uses intermediate reasoning steps within the prompt to help the LLM reason its way through the actual question you want it to answer.

PAL prompting is not a separate type of reasoning chain. Rather, it improves the computation done with CoT (or any other reasoning approach). By incorporating a language like Python, PAL breaks the CoT reasoning into programming statements rather than relying on the LLM's own computation with CoT alone.

PAL prompting

While the majority of the research with PAL uses CoT, PAL prompts can, in theory, aid other forms of reasoning with LLMs. A PAL prompt using CoT follows the same intermediate reasoning steps before asking the question; however, it also tells the LLM to break that reasoning into programming statements. Another aspect of a PAL prompt is a step that renames the variables in a problem to clear, easy-to-read names so that the program can run correctly. 
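Here is a minimal sketch of what such a prompt might look like, assuming a simple templated few-shot exemplar. The wording and the new question are illustrative, not the researchers' exact prompt; the exemplar interleaves natural-language reasoning (as comments) with Python statements, then poses the new question for the LLM to complete in the same style.

```python
# Illustrative PAL-style few-shot prompt (a sketch, not an official prompt).
PAL_PROMPT = """Q: Olivia has $23. She bought five bagels for $3 each. \
How much money does she have left?

# solution in Python:
money_initial = 23               # Olivia starts with $23
bagels = 5
bagel_cost = 3
money_spent = bagels * bagel_cost
answer = money_initial - money_spent
print(answer)

Q: {question}

# solution in Python:
"""

# Fill in the question you actually want answered.
prompt = PAL_PROMPT.format(
    question="A train travels 60 miles per hour for 3 hours. How far does it go?"
)
print(prompt)
```

Note how the exemplar's variables use descriptive names (`money_initial`, `bagel_cost`); the LLM tends to imitate that structure when it writes code for the new question.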

What LLMs work with program-aided language models?

If an LLM has satisfactory coding ability, program-aided language models can work well with it. In the original PAL experiments, the researchers used code-davinci-002 from OpenAI, a version of GPT-3 further trained on dozens of programming languages and optimized to work best in Python. Even though code-davinci-002 is an advanced code-based language model, PAL can also work with LLMs trained chiefly in natural language rather than code, as long as they have high coding ability. The PAL way of prompting with CoT improves the accuracy of both code-based and text-based LLMs.


specialization

Prompt Engineering

Become a Prompt Engineering Expert. Master prompt engineering patterns, techniques, and approaches to effectively leverage Generative AI

4.8

(2,342 ratings)

71,769 already enrolled

Beginner level

Average time: 1 month(s)

Learn at your own pace

Skills you'll build:

Prompt Engineering, ChatGPT, Generative AI, ChatGPT Advanced Data Analysis, Problem Formulation for Generative AI, chain of thought prompting, prompt patterns, Large Language Models, Visualize complex data trapped in PDFs, Amplify your presentations by having ChatGPT critique your slides or other important documents, Automate the extraction of structured data from documents and the creation of other documents and PowerPoint, Automate the editing and management of images and video while cataloging it, Use ChatGPT to read and understand documents

Benefits and limitations of using program-aided language models

Since PAL is a novel style of reasoning, many of its benefits and limitations are still being discovered. One of the primary benefits of these models is their ability to help LLMs solve complex problems more accurately. The following lists some of the known benefits you might appreciate when using PAL:

  • PAL improves CoT reasoning within LLMs by creating more accurate results.

  • PAL improves the accuracy of even weaker language models, making its benefits scalable across models of different sizes. 

  • PAL works with any language model developed for natural language as long as that model has a sufficient ability to code, which means PAL can work with models other than those trained strictly for coding.

Program-aided language models and AI

PAL shows how LLMs and other neural networks can leverage tools like a Python interpreter to perform tasks better than the LLM can alone. PAL allows LLMs to reason through programming, such as Python, instead of relying on CoT or the neural network by itself. As a result, large language models using PAL produce more accurate answers, giving your AI models more tools to leverage.


How to use program-aided language models

While the original research on program-aided language models used the now-defunct Codex from OpenAI, a GPT-3-based LLM also trained on programming, PAL is an approach you could apply to other LLMs like ChatGPT, Copilot, or Google Gemini. To start using PAL, you will need a basic understanding of Python and how to execute code using an interpreter, as well as access to a chat-based LLM or the ability to call an application programming interface (API), such as OpenAI's, from within Python. 

The researchers of PAL have made their entire research available on GitHub. You can use it to reproduce their findings or apply the PAL style of reasoning to your own questions using the code provided. The following steps will allow you to implement the PAL style of reasoning using their chat-based version:

  1. Download Python for your operating system.

  2. Interact with an LLM of your choice, such as ChatGPT or Copilot.

  3. Use a script from the PAL researchers, and add your own question when prompted at the bottom. 

  4. Copy the code the LLM provides you into a new Python script.

  5. Run the code using a Python interpreter of your choosing.

  6. Fact-check the answer to ensure it’s correct.
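The core of steps 4 and 5 above can be sketched as a small helper that runs LLM-generated Python and captures whatever it prints. This is a minimal sketch: the generated snippet here is a hard-coded stand-in for a real model reply, and in practice you should sandbox any code an LLM produces before executing it.

```python
import io
import contextlib

def run_generated_code(code: str) -> str:
    """Execute LLM-generated Python and capture what it prints.
    Caution: exec runs arbitrary code; sandbox this in real use."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {})  # fresh, empty globals: the snippet must stand alone
    return buffer.getvalue().strip()

# In a real PAL loop, `code_from_llm` would be the model's reply to a
# PAL prompt; a hard-coded stub stands in for it here.
code_from_llm = "answer = 23 - 5 * 3\nprint(answer)"
print(run_generated_code(code_from_llm))  # prints 8
```

Whatever runner you use, step 6 still applies: spot-check the printed answer against the original question, since the LLM can still translate the problem into the wrong program.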

8 possible careers in large language models or machine learning

As the use of artificial intelligence grows, the need for people with relevant skills also increases. You will likely need a bachelor’s degree in computer science or a related subject for these positions. If you’re interested in pursuing a career that might incorporate PAL into large language models or one that works with machine learning, you might consider one of the following options, listed with its average base salary. 

  1. Prompt engineer: $135,709

  2. Machine learning engineer: $122,930

  3. Data scientist: $118,193

  4. Deep learning engineer: $105,331

  5. Big data engineer: $101,116

  6. AI researcher: $99,434

  7. Natural language processing engineer: $93,937

  8. Data analyst: $85,950

*All average annual salary data sourced from Glassdoor as of February 2025. It does not include bonuses, commissions, benefits, or other sources of additional pay.

Getting started with Coursera

Program-aided language models help an LLM with the reasoning stage of its process by having programming languages solve mathematical, symbolic, and algorithmic problems with greater accuracy. To expand your understanding of generative AI and LLMs, try a beginner-friendly Guided Project like ChatGPT Prompt Engineering for Developers from DeepLearning.AI to learn core concepts. Delve further into the field with the Prompt Engineering Specialization from Vanderbilt University to learn how to interact with generative AI and leverage the abilities of LLMs.



This content has been made available for informational purposes only. Learners are advised to conduct additional research to ensure that courses and other credentials pursued meet their personal, professional, and financial goals.
