ReasoningAgent Update - Beam Search, MCTS, and LATS for LLM Reasoning

Published at
1/7/2025
Categories
llm
agents
reasoning
mcts
Author
ag2blogger

Authors: BabyCNM, Hrushikesh Dokala, Chi Wang, Qingyun Wu


Key Updates in this Release:

1. Configuration Changes

  • All reasoning parameters are now configured through a single reason_config dictionary

  • Breaking Change: Parameters like max_depth, beam_size, and answer_approach have moved from constructor arguments into reason_config

2. New Search Strategies

  • Added Monte Carlo Tree Search (MCTS) as an alternative to Beam Search

  • Introduced Language Agent Tree Search (LATS) - an enhancement to MCTS that incorporates reflection prior to the next round of simulation.

3. Enhanced Features

  • New forest_size parameter enables maintaining multiple independent reasoning trees

  • Support for ground truth answers in prompts to generate training data for LLM fine-tuning
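
To make the breaking change concrete, here is a minimal migration sketch. The helper `migrate_kwargs` and the `MOVED_KEYS` tuple are hypothetical names introduced for illustration and are not part of the AG2 API; only the three parameter names come from this post.

```python
# Parameters this post says moved from the constructor into reason_config.
MOVED_KEYS = ("max_depth", "beam_size", "answer_approach")


def migrate_kwargs(old_kwargs):
    """Return constructor kwargs with the moved parameters nested under reason_config."""
    # Keep every argument that did not move.
    new_kwargs = {k: v for k, v in old_kwargs.items() if k not in MOVED_KEYS}
    # Start from an existing reason_config, if the caller already had one.
    reason_config = dict(old_kwargs.get("reason_config", {}))
    for key in MOVED_KEYS:
        if key in old_kwargs:
            reason_config[key] = old_kwargs[key]
    new_kwargs["reason_config"] = reason_config
    return new_kwargs


# Old-style arguments (illustrative values).
old_kwargs = {"name": "reason_agent", "max_depth": 4, "beam_size": 3, "answer_approach": "pool"}
new_kwargs = migrate_kwargs(old_kwargs)
print(new_kwargs)
```

The resulting dictionary can then be unpacked into the constructor, for example `ReasoningAgent(llm_config=..., **new_kwargs)`.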

[Figure: MCTS example]


Introduction

In our previous post, we introduced the ReasoningAgent, which used Beam Search for systematic reasoning. This update adds Monte Carlo Tree Search (MCTS) and Language Agent Tree Search (LATS) as alternative search strategies, each of which has advantages in different scenarios.

Our previous ReasoningAgent drew inspiration from OpenAI’s 2023 paper, Let’s Verify Step by Step, as well as the 2024 o1 model. Contemporary research in this area is rich, with notable works such as DeepSeek-R1, Marco-o1, and OpenR.


Quick Start Guide

Let’s start with a simple example using MCTS:

import os
from autogen import UserProxyAgent, ReasoningAgent

# Configure the model
config_list = [{"model": "gpt-4o-mini", "api_key": os.environ.get("OPENAI_API_KEY")}]

# Create a reasoning agent with MCTS
mcts_agent = ReasoningAgent(
    name="mcts_agent",
    llm_config={"config_list": config_list},
    reason_config={
        "method": "mcts",  # Use MCTS instead of beam search
        "nsim": 5,  # Number of MCTS simulations
        "exploration_constant": 1.41  # UCT exploration parameter
    }
)

# Create a user proxy agent
user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config={"use_docker": False}
)

prompt = "What is the expected maximum dice value if you can roll a 6-sided dice three times?"
response = user_proxy.initiate_chat(mcts_agent, message=prompt)
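
The `exploration_constant` above is described as the UCT exploration parameter. For reference, the sketch below computes the standard UCT score (mean value plus an exploration bonus). It is a generic illustration with made-up statistics; AG2's internal implementation may differ in its details, and `uct_score` is a name chosen here.

```python
import math


def uct_score(total_value, visits, parent_visits, exploration_constant=1.41):
    """Standard UCT: average value of a child plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    exploitation = total_value / visits
    exploration = exploration_constant * math.sqrt(math.log(parent_visits) / visits)
    return exploitation + exploration


# Hypothetical child statistics: name -> (total_value, visits)
children = {"a": (2.4, 3), "b": (0.9, 1), "c": (0.0, 0)}
parent_visits = 4

best = max(children, key=lambda name: uct_score(*children[name], parent_visits))
print(best)  # "c": the unvisited child wins because its score is infinite
```

A larger `exploration_constant` weights the bonus term more heavily, so rarely visited branches are selected more often.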

Configuring a Separate Grader Model

In addition to the main reasoning model, you can now specify a different model for the grader by using the grader_llm_config parameter. This allows for more flexibility in evaluating the reasoning paths generated by the agent. If this parameter is not provided, the grader will use the same model as the reasoning agent. Here’s how you can set it up:

# Configure the model
config_list = [{"model": "gpt-4o-mini", "api_key": os.environ.get("OPENAI_API_KEY")}]
config_list_larger = [{"model": "gpt-4o", "api_key": os.environ.get("OPENAI_API_KEY")}]

# Create a reasoning agent with MCTS
mcts_agent = ReasoningAgent(
    name="mcts_agent",
    llm_config={"config_list": config_list},
    grader_llm_config={"config_list": config_list_larger},
    reason_config={
        "method": "mcts",
        "nsim": 5
    }
)

Key Features in the New Version

1. Multiple Search Methods

ReasoningAgent now supports three search strategies:

As in the previous post, the default method is beam search.

# Beam Search (default)
beam_agent = ReasoningAgent(
    name="beam_agent",
    llm_config={"config_list": config_list},
    reason_config={
        "method": "beam_search",
        "beam_size": 3,
        "answer_approach": "pool"  # or "best"
    }
)
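
For readers new to the technique, here is a generic beam search over a toy problem. It is a sketch of the general algorithm, not AG2's implementation: `expand` and `score` stand in for the thinker and grader, and the `answer_approach` option is not modelled.

```python
from typing import Callable, List, Tuple


def beam_search(
    root: str,
    expand: Callable[[str], List[str]],
    score: Callable[[str], float],
    beam_size: int = 3,
    max_depth: int = 3,
) -> List[Tuple[float, str]]:
    """Keep only the beam_size highest-scoring states at every depth."""
    beams = [(score(root), root)]
    for _ in range(max_depth):
        candidates = []
        for _, state in beams:
            for child in expand(state):
                candidates.append((score(child), child))
        if not candidates:
            break
        # Stable sort: ties keep their generation order.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_size]
    return beams


# Toy problem: build a 3-digit string from the digits 0-2, maximizing the digit sum.
def toy_expand(state: str) -> List[str]:
    return [state + digit for digit in "012"]


def toy_score(state: str) -> int:
    return sum(int(ch) for ch in state)


beams = beam_search("", toy_expand, toy_score, beam_size=2, max_depth=3)
print(beams)  # [(6, '222'), (5, '221')]
```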

MCTS is also included as a common approach.

# Monte Carlo Tree Search
mcts_agent = ReasoningAgent(
    name="mcts_agent",
    llm_config={"config_list": config_list},
    reason_config={
        "method": "mcts",
        "nsim": 5 # number of simulations
    }
)

It is important to note that our reasoning agent operates on the reasoning “process” and has no direct access to an environment, whereas the original LATS approach relies on feedback from the environment. To bridge this gap, we use our existing grader agent to generate pseudo-rewards and provide feedback. The main difference between our LATS and MCTS implementations is that LATS incorporates a reflection into the prompt context before the next round of simulation. You can define an agent that uses LATS as follows:

# Language Agent Tree Search
lats_agent = ReasoningAgent(
    name="lats_agent",
    llm_config={"config_list": config_list},
    reason_config={
        "method": "lats",
        "nsim": 5
    }
)
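
The reflection step can be pictured with the toy loop below. It only illustrates the idea of feeding reflections back into the prompt before each new simulation; the prompt wording, `build_prompt`, `run_simulations`, and the stub functions are all assumptions made for this sketch, not AG2 internals.

```python
def build_prompt(question, reflections):
    """Append reflections gathered so far to the question."""
    if not reflections:
        return question
    notes = "\n".join(f"- {r}" for r in reflections)
    return f"{question}\n\nReflections from earlier simulations:\n{notes}"


def run_simulations(question, simulate, reflect, nsim=3):
    """Run nsim simulations, adding one reflection to the context after each."""
    reflections, prompts = [], []
    for _ in range(nsim):
        prompt = build_prompt(question, reflections)
        prompts.append(prompt)
        trajectory, reward = simulate(prompt)
        reflections.append(reflect(trajectory, reward))
    return prompts, reflections


# Stubs standing in for the LLM-backed thinker and grader.
def fake_simulate(prompt):
    return "step 1 -> step 2", 0.4


def fake_reflect(trajectory, reward):
    return f"Reward {reward}: double-check '{trajectory}'"


prompts, reflections = run_simulations("What is 2 + 2?", fake_simulate, fake_reflect)
print(prompts[-1])
```

In plain MCTS, by contrast, every simulation would see the same prompt; only the tree statistics change between rounds.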

2. Incorporating Ground Truth for Enhanced Training Data Synthesis

You can now include the ground truth in your prompts so that the grader can evaluate reasoning steps more precisely. This lets you use the reasoning agent to generate diverse thinking trajectories, which can then serve as training data for fine-tuning the base LLM.

prompt = """Solve this calculus problem: ∫x²dx

GROUND_TRUTH:
The integral of x² is (x³/3) + C
Steps:
1. Use power rule: increase power by 1
2. Divide by new power
3. Add constant of integration
"""

response = user_proxy.initiate_chat(mcts_agent, message=prompt)

# After running queries...
sft_data = extract_sft_dataset(mcts_agent._root)
rlhf_data = extract_rlhf_preference_dataset(mcts_agent._root)
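
This post does not show what the two extraction helpers return. As a rough mental model only, the sketch below derives a best trajectory (SFT-style) and sibling preference pairs (RLHF-style) from a toy tree. The `Node` class, `best_trajectory`, `preference_pairs`, and the `min_gap` threshold are hypothetical and are not the library's functions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    content: str
    value: float = 0.0
    children: List["Node"] = field(default_factory=list)


def best_trajectory(root: Node) -> List[str]:
    """Follow the highest-valued child at each level (SFT-style example)."""
    path, node = [root.content], root
    while node.children:
        node = max(node.children, key=lambda c: c.value)
        path.append(node.content)
    return path


def preference_pairs(root: Node, min_gap: float = 0.2) -> List[Tuple[str, str, str]]:
    """Collect (context, preferred, rejected) triples from sibling steps."""
    pairs, stack = [], [root]
    while stack:
        node = stack.pop()
        ranked = sorted(node.children, key=lambda c: c.value, reverse=True)
        for i, better in enumerate(ranked):
            for worse in ranked[i + 1:]:
                if better.value - worse.value >= min_gap:
                    pairs.append((node.content, better.content, worse.content))
        stack.extend(node.children)
    return pairs


# Toy tree with made-up grader scores.
tree = Node("Integrate x^2", 0.0, [
    Node("Use the power rule", 0.9, [Node("x^3/3 + C", 1.0)]),
    Node("Use integration by parts", 0.3),
])
print(best_trajectory(tree))
print(preference_pairs(tree))
```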

3. Forest of Trees

Enable ensemble reasoning with multiple independent trees:

forest_agent = ReasoningAgent(
    name="forest_agent",
    llm_config={"config_list": config_list},
    reason_config={
        "method": "mcts",
        "forest_size": 5  # Run 5 independent trees
    }
)
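
How the answers from the individual trees are combined is not described in this post. One common ensemble strategy is a majority vote, sketched below; treat it as an assumption about one possible aggregation, with `aggregate_forest` being a name chosen here.

```python
from collections import Counter


def aggregate_forest(answers):
    """Return the most common answer and the fraction of trees that agree with it."""
    # Normalize whitespace and case so trivially different strings count together.
    counts = Counter(a.strip().lower() for a in answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers)


# Hypothetical final answers from 5 independent trees.
tree_answers = ["119/24", "119/24 ", "5", "119/24", "4.5"]
answer, agreement = aggregate_forest(tree_answers)
print(answer, agreement)  # 119/24 0.6
```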

When to Use Each Method

Use Beam Search when:

  • You want a deterministic search process

  • You can reliably evaluate intermediate steps

  • You need fast, memory-efficient search

  • The solution space is relatively small and structured

  • Early decisions strongly influence final outcomes

Use MCTS when:

  • You need stochastic exploration of solution paths

  • Final outcome evaluation is more reliable than intermediate steps

  • The solution space is large or complex

  • You want to balance exploration vs exploitation

  • You have computational budget for multiple simulations

Use LATS when:

  • You want reflection feedback incorporated before the next simulation

  • You want to identify poor reasoning paths early so that later simulations can improve on them

  • The task involves complex multi-step reasoning


Advanced Features

1. Visualization

Visualize the reasoning tree using graphviz:

from autogen.agentchat.contrib.reasoning_agent import visualize_tree

# After running queries...
visualize_tree(mcts_agent._root)

2. Custom Evaluation

Modify the rating scale and evaluation criteria:

custom_agent = ReasoningAgent(
    name="custom_agent",
    llm_config={"config_list": config_list},
    reason_config={
        "rating_scale": 100,  # Use 1-100 scale instead of default 1-10 for grading
    }
)
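
Changing the scale matters mostly if you compare scores across agents. A simple way to put ratings from different scales on a common footing is to map them to the range 0 to 1, as in this sketch. The function name and the clipping behaviour are assumptions for illustration, not a description of how AG2 processes ratings internally.

```python
def normalize_rating(rating, rating_scale=10):
    """Map a rating on a 1..rating_scale scale to the range 0..1."""
    # Clip out-of-range ratings, which an LLM grader may occasionally produce.
    clipped = min(max(rating, 1), rating_scale)
    return (clipped - 1) / (rating_scale - 1)


print(normalize_rating(10, rating_scale=10))   # 1.0
print(normalize_rating(1, rating_scale=100))   # 0.0
```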

3. Save and Load Trees

Save reasoning trees for later analysis:

import json

# Save tree
data = mcts_agent._root.to_dict()
with open("reasoning_tree.json", "w") as f:
    json.dump(data, f)

# Load tree
from autogen.agentchat.contrib.reasoning_agent import ThinkNode

with open("reasoning_tree.json") as f:
    loaded_tree = ThinkNode.from_dict(json.load(f))

Performance Comparison

Variables

  • d: Maximum depth of the reasoning tree

  • b: Beam size (number of parallel paths maintained)

  • w: Branching factor (number of child nodes per parent)

  • n: Number of MCTS simulations

Time Complexity

Each algorithm has different computational costs:

  • Beam Search: O(d × b × (w + 1))

    • At each depth level d, evaluates w options for each of b beams
    • Plus 1 for generating the options
  • MCTS and LATS: O(n × d)

    • Each simulation traverses down to depth d
    • Performs n total simulations
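
As a quick sanity check, the sketch below plugs concrete numbers into the two formulas above. It treats each unit of work as one LLM call, which is a simplifying assumption; the function names are illustrative.

```python
def beam_search_cost(d, b, w):
    """d levels, b beams per level, w evaluations plus 1 generation per beam."""
    return d * b * (w + 1)


def mcts_cost(n, d):
    """n simulations, each traversing at most d levels."""
    return n * d


print(beam_search_cost(d=4, b=3, w=3))  # 48
print(mcts_cost(n=5, d=4))              # 20
```

With these example values, MCTS is cheaper, but its cost grows linearly with the number of simulations, so a large `nsim` can easily reverse the comparison.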

Memory Usage

Storage requirements vary by approach:

  • Beam Search: O(b × d)

    • Fixed memory proportional to beam size and depth
    • Only stores active beams
  • MCTS and LATS: O(w^d)

    • Worst case stores complete tree
    • In practice much smaller due to selective expansion
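
The gap between the worst case and typical usage can be made concrete with node counts. The bound in `mcts_nodes_upper_bound` assumes that each simulation expands at most one node per level and that each expansion adds `w` children; that assumption, like the function names, is introduced here for illustration.

```python
def beam_nodes(b, d):
    """Beam search keeps b paths of length at most d."""
    return b * d


def full_tree_nodes(w, d):
    """Complete tree: the root plus w^1 + ... + w^d nodes."""
    return sum(w ** level for level in range(d + 1))


def mcts_nodes_upper_bound(n, w, d):
    """At most n * d expansions of w children each, capped by the full tree."""
    return min(full_tree_nodes(w, d), 1 + n * d * w)


print(beam_nodes(b=3, d=6))                   # 18
print(full_tree_nodes(w=3, d=6))              # 1093
print(mcts_nodes_upper_bound(n=5, w=3, d=6))  # 91
```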

Conclusion

The new ReasoningAgent offers a flexible toolkit for systematic reasoning with LLMs. Choose between Beam Search, MCTS, and LATS based on your specific needs regarding:

  • Evaluation cost and availability

  • Time and resource constraints

  • Desired exploration vs exploitation balance

  • Training data generation requirements


Next Steps

  • Async Client Call: parallelize LLM calling to speed up searching

  • Swarm Agent implementation

  • Efficient Mode: merging thinker and grader

  • Batch Norm: normalizing scores for MCTS



Finding this useful?

The AG2 team is working hard to create content like this, not to mention building a powerful, open-source, end-to-end platform for multi-agent automation.

The easiest way to show your support is to star the AG2 repo. You are also welcome to contribute to it or simply give it a try.

Also, let us know if you have any interesting use cases for ReasoningAgent, or if there are features or improvements you would like to see. Join our Discord server to discuss.
