
Building AI-Powered Apps with SvelteKit: Managing HTTP Streams from Ollama Server

Published: 12/4/2024
Categories: svelte, sveltekit, ollama, ai
Author: robertobutti

Svelte 5 has taken modern web development by storm with its elegant, declarative approach to building user interfaces. Paired with Ollama, a lightweight server for running large language models locally, SvelteKit unlocks seamless integration of HTTP streams into dynamic web apps. This article demonstrates how you can use Svelte's reactivity alongside Ollama's API to create an interactive, real-time application for generating AI responses.

Before diving into the technical integration of SvelteKit and Ollama and dissecting the essential parts of the example code, let's cover the initial setup process. If you're new to either technology, don't worry: this step-by-step guide will help you get started quickly.

Installing SvelteKit

SvelteKit is a modern framework for building fast and interactive web applications. It offers powerful tools, including reactive stores, server-side rendering, and API routes, making it an excellent choice for integrating with AI-powered APIs like Ollama.

Make sure you have Bun or Node.js installed on your machine.

Open a terminal and create a new SvelteKit project:

bunx sv create sveltekit-ollama

Then, follow the prompts to configure your project:

  • Which template would you like? SvelteKit minimal
  • Add type checking with TypeScript? Yes, using TypeScript syntax
  • What would you like to add to your project? none
  • Which package manager do you want to install dependencies with? bun

Navigate into the project directory:

cd sveltekit-ollama

Start the development server:

bun dev

You should see your SvelteKit app running at http://localhost:5173 (or a similar port). This confirms that your SvelteKit environment is ready.

Installing Ollama

Ollama is a framework for running large language models (LLMs) locally and generating responses from them. To use it, you must download and install Ollama itself and then pull the Llama model.

Installation Steps:

  1. Visit the Ollama website and download the appropriate installer for your operating system (macOS, Linux, or Windows).

  2. Once downloaded, follow the installation instructions specific to your platform.

  3. Verify that Ollama is installed by running the following command in your terminal: ollama --version. You should see the version number printed, confirming that Ollama is installed correctly.


Downloading the Llama model

The Llama model is one of the popular open-source large language models supported by Ollama. To use it, you’ll need to download and configure it.

Open your terminal and run:

ollama pull llama3.2

Replace llama3.2 with the model you want to use (if different). You can explore other available models on the Ollama website.

Confirm that the model is downloaded successfully by listing the installed models:

ollama list

You should see llama3.2 or the model you downloaded in the list.

Configuring the Ollama API server

Ollama provides a local API server to interact with the model. Start the server with:

ollama serve

This will run the server on http://localhost:11434 by default. You can send prompts to the server using an HTTP client, such as fetch() in your SvelteKit app.
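
For a quick sanity check before writing any SvelteKit code, you can send a prompt to the server straight from the terminal; for example, with curl (the prompt here is just a placeholder):

curl http://localhost:11434/api/generate -d '{ "model": "llama3.2", "prompt": "Why is the sky blue?" }'

You should see a stream of JSON lines arriving, one small chunk of generated text per line.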

Next steps: connecting SvelteKit and Ollama

Now that both SvelteKit and Ollama are set up, you’re ready to integrate them. The next section will cover:

  • Managing the application state with Svelte 5's $state rune.
  • Sending HTTP POST requests to the Ollama API server.
  • Streaming and processing the AI-generated responses.

Continue following the article for details on the full implementation.

Understanding the code

Create the src/routes/+page.svelte file with this code:

<script lang="ts">
import { page } from "$app/state";
import { onMount } from "svelte";

// Reactive state: any change re-renders the UI wherever these are read.
let text = $state("");
let status = $state("");
let statusInvalid = $state(false);
let question = $state("");

// Pre-fill the question from the ?question= query parameter, if present.
onMount(() => {
    question = page.url.searchParams.get("question") ?? "";
});

function resetData() {
    question = "";
    text = "";
    status = "";
    statusInvalid = false;
}

function translateData(language: string) {
    const myquestion = `Help me to translate this text into ${language} language: \n${question}`;
    text = "";
    status = "";
    statusInvalid = false;
    askQuestion(myquestion);
}

function reviewData() {
    const myquestion = `Help me to review this text in a better English form, provide me only the reviewed text: \n${question}`;
    text = "";
    status = "";
    statusInvalid = false;
    askQuestion(myquestion);
}

function readData() {
    const myquestion = question;
    text = "";
    status = "";
    statusInvalid = false;
    askQuestion(myquestion);
}

async function askQuestion(myquestion: string) {
    try {
        if (question === "") {
            throw new Error("Question is empty");
        }
        const url = "http://localhost:11434/api/generate";
        const response = await fetch(url, {
            method: "POST",
            headers: { "Content-Type": "application/json" },
            body: JSON.stringify({
                model: "llama3.2",
                prompt: myquestion,
            }),
        });
        if (!response.ok) {
            throw new Error(
                `HTTP error! Status: ${response.status} ${response.statusText}`,
            );
        }
        if (!response.body) {
            throw new Error("Readable stream not found in the response.");
        }
        // Ollama streams newline-delimited JSON objects. A network chunk
        // may contain several lines or end mid-line, so keep a buffer and
        // only parse complete lines.
        const reader = response.body.getReader();
        const decoder = new TextDecoder();
        let buffer = "";
        while (true) {
            const { done, value } = await reader.read();
            if (done) {
                status = "";
                statusInvalid = false;
                return;
            }
            buffer += decoder.decode(value, { stream: true });
            const lines = buffer.split("\n");
            buffer = lines.pop() ?? ""; // keep any incomplete trailing line
            for (const line of lines) {
                if (line.trim() === "") continue;
                const myresponse = JSON.parse(line);
                text = text + myresponse.response;
            }
        }
    } catch (error: unknown) {
        if (error instanceof Error) {
            status = error.message;
            statusInvalid = true;
            console.error("An error occurred:", error.message);
        } else {
            console.error("An unknown error occurred:", error);
        }
    }
}
</script>

<main class="container">
    <form>
        <div class="grid">
            <div>
                <textarea
                    bind:value={question}
                    name="question"
                    placeholder="Write your question to Robertito AI"
                    aria-label="Your question"
                    aria-invalid={statusInvalid}
                    aria-describedby="invalid-helper"
                ></textarea>
                <small id="invalid-helper">{ status }</small>
            </div>
            <div>
                <textarea bind:value={text} aria-label="AI response"></textarea>
            </div>
        </div>

        <!-- type="button" keeps these buttons from submitting the form
             and reloading the page. -->
        <div role="group">
            <button type="button" class="lg" onclick={() => resetData()}> Reset </button>
            <button type="button" class="lg" onclick={() => reviewData()}>
                Review the text
            </button>
            <button type="button" class="lg" onclick={() => translateData("Italian")}>
                Translate to 🇮🇹
            </button>
            <button type="button" class="lg" onclick={() => translateData("English (British)")}>
                Translate to 🇬🇧
            </button>
            <button type="button" class="danger lg" onclick={() => readData()}>
                Ask me!
            </button>
        </div>
    </form>
</main>

<style>
    textarea {
        width: 100%;
        height: 40vh;
    }
</style>


The provided SvelteKit code demonstrates a web app that interacts with an Ollama server to send prompts and receive AI-generated text responses. Here's the breakdown of the key parts.

Reactive state variables with $state

Svelte 5's $state() rune declares reactive state variables, simplifying the management of mutable data that affects your UI.

let text = $state("");
let status = $state("");
let statusInvalid = $state(false);
let question = $state("");

onMount(() => {
    question = page.url.searchParams.get("question") ?? "";
});

Each variable is initialized with a default value, and any changes automatically update the UI where the variable is used. This eliminates the need for explicit event emitters or update calls.
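
As a stand-alone illustration (a minimal sketch, separate from the app in this article), clicking the button below mutates count, and every place that reads count re-renders automatically:

<script lang="ts">
    // Hypothetical example component: `count` is reactive state.
    let count = $state(0);
</script>

<button type="button" onclick={() => count++}>
    Clicked {count} times
</button>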

Key Actions:

  • resetData(): Resets all state variables.
  • translateData(language): Translates question into the specified language.
  • reviewData(): Asks for a language review of the question.
  • readData(): Sends the question to the server for AI processing.

Handling HTTP streaming from Ollama

The askQuestion() function manages the interaction with Ollama's API, utilizing a streaming HTTP response. Here's how it works:

async function askQuestion(myquestion: string) {
  try {
    if (question === "") {
      throw new Error("Question is empty");
    }
    const url = "http://localhost:11434/api/generate";
    const response = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: "llama3.2",
        prompt: myquestion,
      }),
    });
    if (!response.ok) {
      throw new Error(
        `HTTP error! Status: ${response.status} ${response.statusText}`,
      );
    }
    if (!response.body) {
      throw new Error("Readable stream not found in the response.");
    }
    // Ollama streams newline-delimited JSON objects. A network chunk
    // may contain several lines or end mid-line, so keep a buffer and
    // only parse complete lines.
    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    let buffer = "";
    while (true) {
      const { done, value } = await reader.read();
      if (done) {
        status = "";
        statusInvalid = false;
        return;
      }
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split("\n");
      buffer = lines.pop() ?? ""; // keep any incomplete trailing line
      for (const line of lines) {
        if (line.trim() === "") continue;
        const myresponse = JSON.parse(line);
        text = text + myresponse.response;
      }
    }
  } catch (error: unknown) {
    if (error instanceof Error) {
      status = error.message;
      statusInvalid = true;
      console.error("An error occurred:", error.message);
    } else {
      console.error("An unknown error occurred:", error);
    }
  }
}

How It Works:

  1. Input validation: ensures the question isn't empty.
  2. API call: sends a POST request with the user's myquestion prompt.
  3. Response streaming: reads chunks of data (value) from the server as they arrive, decodes them with TextDecoder, and parses each complete newline-delimited JSON line, appending its response field to text.
  4. Error Handling: captures potential errors in the HTTP call, parsing, or streaming process.

This incremental rendering provides a smooth experience where users see the response as it’s being generated.
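
For reference, each line streamed by /api/generate is a self-contained JSON object. An intermediate line looks roughly like this (fields abbreviated, values illustrative):

{"model":"llama3.2","created_at":"2024-12-04T10:00:00Z","response":"Hello","done":false}

The final line has "done": true and carries summary metadata instead of more text.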

Interactive UI: reactive state in action

The UI is tightly coupled with the reactive state variables, ensuring any changes are immediately reflected. For instance:

<div class="grid">
    <div>
        <textarea
            bind:value={question}
            name="question"
            placeholder="Write your question to Robertito AI"
            aria-label="Your question"
            aria-invalid={statusInvalid}
            aria-describedby="invalid-helper"
        ></textarea>
        <small id="invalid-helper">{ status }</small>
    </div>
    <div>
        <textarea bind:value={text} aria-label="AI response"></textarea>
    </div>
</div>

Here, the question and text states directly bind to the <textarea> elements, providing seamless two-way data binding. When the API response updates text, the changes instantly appear in the UI.

The buttons trigger state updates or server interactions:

<div role="group">
    <button type="button" class="lg" onclick={() => resetData()}> Reset </button>
    <button type="button" class="lg" onclick={() => reviewData()}>
        Review the text
    </button>
    <button type="button" class="lg" onclick={() => translateData("Italian")}>
        Translate to 🇮🇹
    </button>
    <button type="button" class="lg" onclick={() => translateData("English (British)")}>
        Translate to 🇬🇧
    </button>
    <button type="button" class="danger lg" onclick={() => readData()}>
        Ask me!
    </button>
</div>

Each button invokes a specific function that manipulates state or sends an API request, giving the user instant feedback. Note the type="button" attribute: a button inside a <form> defaults to type="submit", and without it a click would submit the form and reload the page.

Enhancing error handling

Errors are displayed dynamically to the user using the status and statusInvalid states:

<small id="invalid-helper">{ status }</small>

This ensures users are informed of any issues, such as an empty prompt or server errors, without disrupting their workflow.

Conclusion

By combining Svelte 5's $state for state management with careful handling of HTTP streams, this integration demonstrates the power of reactive programming for modern applications. The real-time feedback and incremental rendering ensure a delightful user experience, making it ideal for AI-driven applications powered by Ollama.

With further customization and optimization, this setup can form the foundation for a wide range of interactive, AI-powered web applications.
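
For example, one natural refinement is to stop calling Ollama directly from the browser and instead proxy it through a SvelteKit server endpoint. The sketch below assumes a hypothetical src/routes/api/ask/+server.ts route; it simply forwards the prompt and passes Ollama's streamed body through unchanged:

import type { RequestHandler } from "./$types";

export const POST: RequestHandler = async ({ request }) => {
    const { prompt } = await request.json();
    const upstream = await fetch("http://localhost:11434/api/generate", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ model: "llama3.2", prompt }),
    });
    // Stream the newline-delimited JSON through to the client,
    // so the browser never talks to Ollama directly.
    return new Response(upstream.body, {
        headers: { "Content-Type": "application/x-ndjson" },
    });
};

With this in place, the client-side fetch in askQuestion() would target /api/ask instead of http://localhost:11434/api/generate, keeping the model server private.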
