Logo

dev-resources.site

for different kinds of informations.

DBChat Part 3 - Configure , Connect & Dump Databases

Published at
1/11/2025
Categories
ai
machinelearning
webdev
programming
Author
Shrijith Venkatramana
DBChat Part 3 - Configure , Connect & Dump Databases

Hi there! I'm Shrijith Venkatrama, the founder of Hexmos. Right now, Iā€™m building LiveAPI, a super-convenient tool that simplifies engineering workflows by generating awesome API docs from your code in minutes.

In this tutorial series, I am on a journey to build for myself DBChat - a simple tool for using AI chat to explore and evolve databases.

See previous posts to get more context:

  1. Building DBChat - Explore and Evolve Your DB with Simple Chat (Part 1)
  2. DBChat: Getting a Toy REPL Going in Golang (Part 2)

Problem - How Best to Specify Databases That We Want To Deal With?

My initial thought was that - we'd have a connect <connection_string> format supported in our REPL.

User can start the REPL and enter the configuration.

But on further consideration - it looked like having friendly names for databases was a better approach.

So, I settled on the following approach for now:

  1. ~/.dbchat.toml - A configuration file for dbchat in home folder. Initially, it will be simple - just a "connections" section in it, listing various database URLs.
  2. A new connect command within the shell. One can connect to both stored database connections or literal database connections. That is connect <name from .dbchat.toml> or connect <connection literal> will both be supported

In the next sections, I will explain how the above (2) were implemented

Configuring database connections in ~/.dbchat.toml

The sample configuration is initially assumed to be something like this:

# DBChat Sample Configuration File
# Copy this file to ~/.dbchat.toml and modify as needed

[connections]
# Format: name = "connection_string"
local = "postgresql://postgres:postgres@localhost:5432/postgres"
liveapi = "postgresql://user:password@ip:port/database_name" 

I implemented cmd/dbchat/utils/config.go to read and list connections from the configuration like so:

package utils

import (
    "fmt"
    "os"
    "path/filepath"
    "strings"

    "github.com/BurntSushi/toml"
)

// Config holds the application configuration
type Config struct {
    Connections map[string]string `toml:"connections"`
}

// LoadConfig reads the configuration from ~/.dbchat.toml
func LoadConfig() (*Config, error) {
    home, err := os.UserHomeDir()
    if err != nil {
        return nil, fmt.Errorf("error getting home directory: %v", err)
    }

    configPath := filepath.Join(home, ".dbchat.toml")
    var config Config

    if _, err := toml.DecodeFile(configPath, &config); err != nil {
        // Return empty config if file doesn't exist
        if os.IsNotExist(err) {
            return &Config{Connections: make(map[string]string)}, nil
        }
        return nil, fmt.Errorf("error reading config file: %v", err)
    }

    return &config, nil
}

// ListConnections returns a formatted string of all configured connections
func ListConnections(config *Config) string {
    if len(config.Connections) == 0 {
        return "No connections configured in ~/.dbchat.toml"
    }

    var sb strings.Builder
    for name := range config.Connections {
        sb.WriteString(fmt.Sprintf("- %s\n", name))
    }
    return sb.String()
}

The connect Command Implementation

There are two variations supported in the connect command:

  1. connect <connection name>
  2. connect <literal connection>

So in the REPL Eval handling, I added a new case for handling connect:

    case cmd == "connect":
        if len(fields) == 1 {
            return `Usage: connect <connection_string | connection_name>

You can either provide a full connection string:
Example: connect postgresql://username:password@localhost:5432/dbname
Format:  postgresql://[user]:[password]@[host]:[port]/[dbname]?[params]

Or use a connection name from ~/.dbchat.toml:
Example: connect local

Available connections in config:
` + utils.ListConnections(h.config)
        }

        connectionStr := strings.Join(fields[1:], " ")

        // Check if the argument matches a configured connection name
        if configStr, exists := h.config.Connections[connectionStr]; exists {
            connectionStr = configStr
        }

        // Close existing connection if it exists
        if h.db != nil {
            h.db.Close()
        }

        // Connect using the connection string
        newDb, err := db.Connect(connectionStr)
        if err != nil {
            return fmt.Sprintf("Failed to connect: %v", err)
        }

        h.db = newDb
        return "Successfully connected to PostgreSQL database! šŸŽ‰"

dump schema Command To Get Database Context

The most important action for us is to get the whole schema of the connected database.

The schema is important because: We can use the schema to inform the LLM layer later. And this information will help the LLM generate useful and accurate SQL queries.

So I created cmd/dbchat/db/schema.go:

package db

import (
    "database/sql"
    "fmt"
    "strings"
)

const schemaQuery = `
WITH tables AS (
    SELECT 
        t.table_schema,
        t.table_name,
        t.table_type,
        obj_description((quote_ident(t.table_schema)||'.'||quote_ident(t.table_name))::regclass, 'pg_class') as table_comment
    FROM information_schema.tables t
    WHERE t.table_schema NOT IN ('pg_catalog', 'information_schema')
    ORDER BY t.table_schema, t.table_name
),
columns AS (
    SELECT 
        c.table_schema,
        c.table_name,
        c.column_name,
        c.data_type,
        c.is_nullable,
        c.column_default,
        col_description((quote_ident(c.table_schema)||'.'||quote_ident(c.table_name))::regclass, 
                       c.ordinal_position) as column_comment
    FROM information_schema.columns c
    WHERE c.table_schema NOT IN ('pg_catalog', 'information_schema')
    ORDER BY c.table_schema, c.table_name, c.ordinal_position
)
SELECT 
    t.table_schema,
    t.table_name,
    t.table_type,
    t.table_comment,
    string_agg(
        format(
            '  %s %s%s%s%s',
            c.column_name,
            c.data_type,
            CASE WHEN c.is_nullable = 'NO' THEN ' NOT NULL' ELSE '' END,
            CASE WHEN c.column_default IS NOT NULL THEN ' DEFAULT ' || c.column_default ELSE '' END,
            CASE WHEN c.column_comment IS NOT NULL THEN ' -- ' || c.column_comment ELSE '' END
        ),
        E'\n'
    ) as columns
FROM tables t
LEFT JOIN columns c 
    ON c.table_schema = t.table_schema 
    AND c.table_name = t.table_name
GROUP BY t.table_schema, t.table_name, t.table_type, t.table_comment
ORDER BY t.table_schema, t.table_name;
`

// DumpSchema returns a formatted string containing the database schema
func DumpSchema(db *sql.DB) (string, error) {
    if db == nil {
        return "", fmt.Errorf("database connection required. Use 'connect' command first")
    }

    rows, err := db.Query(schemaQuery)
    if err != nil {
        return "", fmt.Errorf("error querying schema: %v", err)
    }
    defer rows.Close()

    var sb strings.Builder
    var currentSchema string

    for rows.Next() {
        var (
            schema, tableName, tableType, columns string
            tableComment                          sql.NullString
        )

        if err := rows.Scan(&schema, &tableName, &tableType, &tableComment, &columns); err != nil {
            return "", fmt.Errorf("error scanning row: %v", err)
        }

        // Print schema header if we're entering a new schema
        if schema != currentSchema {
            if currentSchema != "" {
                sb.WriteString("\n")
            }
            sb.WriteString(fmt.Sprintf("Schema: %s\n", schema))
            sb.WriteString(strings.Repeat("=", len(schema)+8) + "\n\n")
            currentSchema = schema
        }

        // Print table information
        sb.WriteString(fmt.Sprintf("Table: %s (%s)\n", tableName, tableType))
        if tableComment.Valid {
            sb.WriteString(fmt.Sprintf("Comment: %s\n", tableComment.String))
        }
        sb.WriteString("Columns:\n")
        sb.WriteString(columns)
        sb.WriteString("\n\n")
    }

    if err = rows.Err(); err != nil {
        return "", fmt.Errorf("error iterating rows: %v", err)
    }

    return sb.String(), nil
}

The Demo: Configure, Connect and Dump Schema From A PostgreSQL Database

Using Predefine Connection in DBChat

Next Steps

Since we have database connection & schema dump mechanism - we are ready to move onto the next stage: LLM integration.

In the next stage, we will combine user queries in natural language and database schema to generate SQL queries.

In the final stage - the query must be executed to produce results as well - but that is not of immediate concerned.

Like/Subscribe/Share to encourage developing this series of posts. Thanks for reading.

Featured ones: