Teaching Large Language Models (LLMs) to do Math Correctly

Published at: 1/15/2025
Categories: llm, ai, node, typescript
Author: shayy

Large language models often struggle with math. One way to help is to build an AI agent and give it a tool that evaluates mathematical expressions in code, so the numbers in its answers come from an actual calculation. Let's walk through a piece of code that does this and explain each part.

Setting Up the Code

First, we need to bring in some tools that will help our AI agent do math. We use these lines to get the tools ready:

import { generateText, tool } from "ai";
import { createOpenAI } from "@ai-sdk/openai";
import { z } from "zod";
import { evaluate } from "mathjs";

These lines import the functions and libraries we need. generateText and tool come from the "ai" package. createOpenAI from "@ai-sdk/openai" sets up the connection to the OpenAI service. z comes from "zod" and lets us describe the shape of the data the tool expects. evaluate from "mathjs" does the actual math.

To use these, you need to install them. You can do this by running this command in your terminal:

npm install ai @ai-sdk/openai zod mathjs

Connecting to OpenAI

Next, we set up our connection to OpenAI. We do this with this piece of code:

const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

This code creates a connection to OpenAI using an API key. The key is read from an environment variable called OPENAI_API_KEY, which keeps it out of the source code and out of version control.
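If the environment variable is not set, the problem may only surface later as a failed API request. A minimal sketch of failing fast at startup is shown here; requireEnv is a hypothetical helper, not part of the original code or of any library.

```typescript
// Hypothetical helper: read a required environment variable or fail fast
// with a clear message instead of failing later inside an API call.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Possible usage with the setup from this article:
// const openai = createOpenAI({ apiKey: requireEnv("OPENAI_API_KEY") });
```

Whether to throw or fall back to a default is a design choice; throwing is assumed here because the agent cannot work without a key.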

Creating the Math Tool

Now, we create a tool that will do the math for us. We define it like this:

const math = tool({
  description: "A tool for evaluating mathematical expressions",
  parameters: z.object({
    expression: z
      .string()
      .describe(
        "The mathematical expression to evaluate in format supported by mathjs"
      ),
  }),
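  execute: async (params) => {
    // The second argument of evaluate() is a variable scope, not a config
    // object, so no options are passed here. BigNumber precision would need
    // a configured mathjs instance: create(all, { number: "BigNumber" }).
    const result = evaluate(params.expression);
    return result.toString();
  },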
});

This tool is called math. It has a description that says it's for evaluating math expressions. The parameters part tells us what kind of data the tool needs. In this case, it needs a string called expression, which is the math problem we want to solve.

The execute part is where the tool does its work. It passes the expression to the evaluate function from mathjs and returns the result as a string, which is handed back to the model as the tool's result.
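One detail worth checking: in mathjs, the second argument to evaluate is a scope of variables, so number precision is configured on an instance built with create and all instead. The sketch below assumes that setup. The names mathjs and evaluateExpression are illustrative and not from the original code, and returning errors as text (so the model can see what went wrong and retry) is an assumption about how failures should be handled.

```typescript
import { create, all } from "mathjs";

// Assumption: a dedicated mathjs instance configured for BigNumber arithmetic.
const mathjs = create(all, { number: "BigNumber", precision: 64 });

// Hypothetical helper: evaluate an expression and always return a string,
// so a tool's execute function can pass the outcome straight back to the model.
function evaluateExpression(expression: string): string {
  try {
    return String(mathjs.evaluate(expression));
  } catch (error) {
    // Invalid expressions throw; report the message instead of crashing.
    const message = error instanceof Error ? error.message : String(error);
    return `Error: ${message}`;
  }
}

console.log(evaluateExpression("0.1 + 0.2")); // prints "0.3"
console.log(evaluateExpression("2 +")); // prints a message starting with "Error:"
```

Inside the tool, execute could then be written as async ({ expression }) => evaluateExpression(expression).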

Using the Math Tool

Finally, we use the math tool in a function called main. Here's how it works:

async function main() {
  const result = await generateText({
    model: openai("gpt-4o"),
    system:
      "You are a math expert. When you are asked to evaluate a mathematical expression, use the math tool to evaluate it. Finally, once you have the result, return it as a string.",
    tools: { math },
    maxSteps: 10,
    prompt:
      "Calculate the volume of a cylinder with a radius of 5 meters and a height of 10 meters.",
  });

  console.log(`Output: ${result.text}`);
}

main();

In this function, we tell the AI to use the gpt-4o model and act as a math expert. The system message tells it to use the math tool whenever it needs to evaluate an expression, and the prompt is the math problem to solve. Setting maxSteps to 10 allows several rounds of model output and tool calls, so the model can call the tool, read the result, and then write its final answer.

The model sends an expression to the math tool, gets the result back, and returns the answer as text. We then print this text to the console.

Output

When we run this code, we get the following output:

Output: The volume of the cylinder is approximately 785.4 cubic meters.
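As a sanity check on this output, the same volume can be computed directly from the formula V = πr²h. This is a small standalone sketch, separate from the agent code; cylinderVolume is a hypothetical helper name.

```typescript
// Volume of a cylinder: V = pi * r^2 * h
function cylinderVolume(radius: number, height: number): number {
  return Math.PI * radius ** 2 * height;
}

// Radius 5 m and height 10 m, as in the prompt above.
const volume = cylinderVolume(5, 10);
console.log(volume.toFixed(1)); // prints "785.4"
```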

With this code, the arithmetic in our AI agent's answers is computed by mathjs instead of being guessed by the model. The model still has to write the correct expression, but once it does, the calculation itself is reliable, which is very helpful when working with LLMs.
