Building a (somewhat) intelligent agent

Published at
11/9/2024
Categories
machinelearning
ai
shell
bash
Author
aawgit

This started on a Saturday night. If you are very social like me, you would know that there is no better time to do some coding than a peaceful Saturday night. So I opened up a pet project I've been working on and realized that it wasn't pushed to GitHub yet. I didn't remember the commands to set a remote repo and could have easily Googled or "ChatGPTed" them. But wouldn't it be cooler to add another layer of abstraction and just tell the computer to "set this project's remote as such and such", especially in this era of intelligent agents? And wouldn't it be even cooler to build that agent? And that's exactly what I did, instead of spending a few seconds on finding the commands to set the remote repo.

I started solving the problem backwards. I would need a way to run shell commands from a program. That's easy: Python's subprocess module.

import subprocess

def run_shell_command(command):
    try:
        # Run the shell command
        result = subprocess.run(command, shell=True, check=True, text=True, capture_output=True)
        # Return the command output
        return result.stdout
    except subprocess.CalledProcessError as e:
        # Return the error output if the command fails
        return e.stderr

print(run_shell_command('pwd'))

Now I need a way to decide what commands to run. That's where the intelligence comes in. It needs to take the natural language input and convert it to shell commands. Large Language Models (LLMs) are good at this sort of thing. So I tried the following prompt on ChatGPT.

I'm computer program created to assist a human. Currently the human is working on a DevOps task. He asked me to "ssh to the development server and install python". Please tell me what shell commands to run to do the activity. If you need more information please let me know what information you need. Please respond in the following manner and don't include anything else in the response.
Type: [Can be either "Commands" or "More information"]
Response: [List of shell commands or list of more information needed, separated by commas]

Example response 1:
Type: More information
Response: user id, key path

Example response 2:
Type: Commands
Response: ssh -i 'keyfile.pem' user1@server
It worked surprisingly well most of the time.

This was the response.

Type: More information
Response: user id, server IP or hostname, key path or password, operating system type (e.g., Ubuntu, CentOS)

And after passing the inputs, it returned the list of commands as,

Type: Commands
Response: ssh -i 'key.pem' [email protected], sudo yum install python3 -y

Not exactly production ready, but this is a promising start. On this high note I had to stop for day one, or rather hour one, since I'm no longer the young man I once was and it was already 10 PM.

A week later...

Zooming out a little bit: how would I use this? I would open up a project in the terminal and type "set the remote repo for this project as ". Then the agent will ask the LLM for the commands to run. If the LLM needs more information, the agent will prompt me. After getting the information, the agent will send it to the LLM, which will either give the commands or ask for more information. This will repeat until a command runs. If the command is successful, the agent will stop. But if it returns errors, the agent will prompt the LLM for commands to resolve the issue. Also, with each request to the LLM, the agent will send the conversation history in a window of a suitable size. This will provide the context to the LLM.
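The loop described above could be sketched roughly like this. It is only an illustration, not Shelly's actual code: `agent_loop`, `ask_llm`, `ask_human`, `max_turns` and `window` are names made up for this example, and `ask_llm` is assumed to return a dict with a "type" and a list under "response".

```python
import subprocess
from collections import deque


def run_shell_command(command):
    """Run a shell command and return (succeeded, output_or_error)."""
    try:
        result = subprocess.run(command, shell=True, check=True,
                                text=True, capture_output=True)
        return True, result.stdout
    except subprocess.CalledProcessError as e:
        return False, e.stderr


def agent_loop(task, ask_llm, ask_human, max_turns=10, window=6):
    """Talk to the LLM until a batch of commands succeeds or turns run out.

    ask_llm(task, history) is a hypothetical callable that returns e.g.
    {"type": "Commands", "response": ["echo hi"]} or
    {"type": "More information", "response": ["user id"]}.
    """
    # deque(maxlen=...) keeps only the most recent messages: the history window
    history = deque(maxlen=window)
    for _ in range(max_turns):
        reply = ask_llm(task, list(history))
        history.append(("llm", str(reply)))
        if reply["type"] == "More information":
            # Relay each question to the human and record the answer
            for item in reply["response"]:
                history.append(("human", f"{item}: {ask_human(item)}"))
            continue
        output = ""
        for command in reply["response"]:
            ok, output = run_shell_command(command)
            if not ok:
                # Feed the error back so the LLM can suggest a fix next turn
                history.append(("error", output))
                break
        else:
            return output  # every command in the batch succeeded
    return None  # gave up after max_turns
```

The task is passed separately from the history so that it never falls out of the window, and `max_turns` is there so that a confused LLM can't keep the loop running forever.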

We would need to make the queries to the LLM a little abstract so that the agent can handle a wider range of tasks. After all, it wouldn't be very useful if it's only capable of setting remote repo URLs. At the same time, we need to clearly define its scope. In this case it would be an agent for running shell commands. To help handle a range of commands, we can parameterize the prompt. Those parameters would be:

  1. The natural language input from the human.
  2. Context: This is a little tricky; I will use the conversation history for now.
  3. Any errors returned by running a command.
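A parameterized prompt along those lines might look like the sketch below. The wording of `PROMPT_TEMPLATE` and the `build_prompt` helper are assumptions for illustration, loosely based on the ChatGPT prompt shown earlier, and not the template Shelly actually uses.

```python
# Hypothetical template with one slot per parameter listed above
PROMPT_TEMPLATE = """I'm a computer program created to assist a human with shell tasks.
The human asked me to "{request}".
Conversation so far:
{context}
Errors from the last command:
{errors}
Please tell me what shell commands to run, or what extra information you need."""


def build_prompt(request, history=(), errors=""):
    """Fill the template from the request, the history window and any errors."""
    # history is a sequence of (role, text) pairs, oldest first
    context = "\n".join(f"{role}: {text}" for role, text in history) or "(none)"
    return PROMPT_TEMPLATE.format(request=request,
                                  context=context,
                                  errors=errors or "(none)")


print(build_prompt("set the remote repo for this project",
                   history=[("human", "the repo is hosted on GitHub")]))
```

Keeping the three parameters as separate slots means the same template can serve the first request, a follow-up with more information, and a retry after a failed command.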

In addition to that, we will have to maintain the state, such as executing a command or getting more info.

Let's code it. I've changed the LLM's output to a JSON-formatted string since it's easier to write the processing part that way.
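To show why JSON makes the processing easier, here is a small sketch of parsing and validating such a reply. The field names "type" and "response" are assumed to mirror the earlier plain-text format, and `parse_llm_reply` is a made-up helper, so the real schema may well differ.

```python
import json

VALID_TYPES = {"Commands", "More information"}


def parse_llm_reply(raw):
    """Parse the LLM's JSON reply into (reply_type, items) or raise ValueError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"Reply is not valid JSON: {e}") from e
    if not isinstance(data, dict):
        raise ValueError("Reply must be a JSON object")
    reply_type = data.get("type")
    items = data.get("response")
    if reply_type not in VALID_TYPES:
        raise ValueError(f"Unexpected reply type: {reply_type!r}")
    if isinstance(items, str):
        items = [items]  # tolerate a single string instead of a list
    if not isinstance(items, list) or not all(isinstance(i, str) for i in items):
        raise ValueError("'response' must be a list of strings")
    return reply_type, items


reply_type, items = parse_llm_reply(
    '{"type": "Commands", "response": ["git remote -v"]}'
)
print(reply_type, items)
```

Rejecting anything that doesn't match the expected shape seems safer than guessing, since whatever comes out of this step may end up being executed.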

Image description

I tested it with a few simple commands and they worked as expected.

Image description

Seems alright. Let's try another one.

Image description

That's not what I asked for. Maybe we need to be more specific.

Image description

That's more like it. Although I should definitely add a mechanism to verify the commands before running them. That should prevent the agent from doing something crazy. Also, explaining a command before it runs would be a good feature - but not for now.

answer = input(f" Shall I run '{command}'? (Yes/ No) ")
if answer.lower() == 'yes':  # Execute the command

So, it kind of works, but we need to make it easily accessible. Creating an alias did the trick. I added the following to ~/.bashrc.

alias shelly='/home/akalanka/projects/shelly/venv/bin/python3 /home/akalanka/projects/shelly/main.py'

Let's see how well "Shelly" fulfills her purpose. First I told Shelly to create the remote repo, but it didn't work because she was trying to set up authentication for the gh CLI tool, which was too complex for a simple tool like this. So I created the remote repo myself and then asked her to set it as the origin of the local repo, which also failed the first time. But after improving the prompt template, I asked her to correct the mistake, which actually worked.

Image description

Then I went ahead and asked her to commit and push her own code, which also was done nicely enough (ignoring the fact that she ignored the instruction about the commit message).

Image description

It's not very useful for commands I use frequently and remember, because it's quicker and more reliable to run the shell command directly. But for other cases this actually seems to help.

So about a week later, I was finally able to set the remote repo for the project. Great success! What a way to spend weekend evenings!

Obviously, a lot can be done to improve this. To start, some way of persisting the user inputs between invocations could smooth things out. Using LangChain could be a good idea. Let me know what you think. Also feel free to check out the source code and open a PR to make it more intelligent. It could use some help. Hey, you can use Shelly to push your feature, hopefully.
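As one possible take on persisting user inputs between invocations, the answers could simply be kept in a small JSON file. This is a sketch of the idea and not something Shelly does today; the file name and both helper functions are invented for the example.

```python
import json
from pathlib import Path


def load_inputs(path):
    """Return previously saved inputs, or an empty dict on the first run."""
    path = Path(path)
    if not path.exists():
        return {}
    return json.loads(path.read_text())


def remember_input(path, key, value):
    """Save one answer so the next invocation doesn't have to ask again."""
    inputs = load_inputs(path)
    inputs[key] = value
    Path(path).write_text(json.dumps(inputs, indent=2))
    return inputs


# Hypothetical usage: the file name here is only an example
store = Path("shelly_inputs.json")
remember_input(store, "server hostname", "dev.example.com")
print(load_inputs(store))
```

Anything sensitive, such as passwords or key contents, should probably stay out of a plain-text file like this.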

P.S. This was entirely written by a human. Absolutely no intelligence, artificial or otherwise, was involved in the writing.
