
Building AI-Powered Apps with SvelteKit: Managing HTTP Streams from Ollama Server

Published: 12/4/2024
Categories: svelte, sveltekit, ollama, ai
Author: robertobutti

Svelte 5 has taken modern web development by storm with its elegant, declarative approach to building user interfaces. Paired with Ollama, a lightweight server for running large language models locally, SvelteKit unlocks seamless integration of HTTP streams into dynamic web apps. This article demonstrates how you can use Svelte's reactivity alongside Ollama's API to create an interactive, real-time application for generating AI responses.

Before diving into the technical integration of SvelteKit and Ollama and dissecting the essential parts of the example code, let's cover the initial setup process. If you're new to either technology, don't worry: this step-by-step guide will help you get started quickly.

Installing SvelteKit

SvelteKit is a modern framework for building fast and interactive web applications. It offers powerful tools, including reactive stores, server-side rendering, and API routes, making it an excellent choice for integrating with AI-powered APIs like Ollama.

Make sure you have Bun or Node.js installed on your machine.

Open a terminal and create a new SvelteKit project:

bunx sv create sveltekit-ollama

Then, follow the prompts to configure your project:

  • Which template would you like? SvelteKit minimal
  • Add type checking with TypeScript? Yes, using TypeScript syntax
  • What would you like to add to your project? none
  • Which package manager do you want to install dependencies with? bun

Navigate into the project directory:

cd sveltekit-ollama

Start the development server:

bun dev

You should see your SvelteKit app running at http://localhost:5173 (or a similar port). This confirms that your SvelteKit environment is ready.

Installing Ollama

Ollama is a framework for running large language models (LLMs) locally and generating responses from them. To use it, you must download and install Ollama itself and then pull the Llama model.

Installation Steps:

  1. Visit the Ollama website and download the appropriate installer for your operating system (macOS, Linux, or Windows).

  2. Once downloaded, follow the installation instructions specific to your platform.

  3. Verify that Ollama is installed by running the following command in your terminal: ollama --version. You should see the version number printed, confirming that Ollama is installed correctly.


Downloading the Llama model

The Llama model is one of the popular open-source large language models supported by Ollama. To use it, you’ll need to download and configure it.

Open your terminal and run:

ollama pull llama3.2

Replace llama3.2 with the model you want to use (if different). You can explore other available models on the Ollama website.

Confirm that the model is downloaded successfully by listing the installed models:

ollama list

You should see llama3.2 or the model you downloaded in the list.

Configuring the Ollama API server

Ollama provides a local API server to interact with the model. Start the server with:

ollama serve

This will run the server on http://localhost:11434 by default. You can send prompts to the server using an HTTP client, such as fetch() in your SvelteKit app.
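
For a quick sanity check before writing any SvelteKit code, you can send a prompt to the server straight from the terminal; for example, with curl (the prompt here is just a placeholder):

curl http://localhost:11434/api/generate -d '{ "model": "llama3.2", "prompt": "Why is the sky blue?" }'

You should see a stream of JSON lines arriving, one small chunk of generated text per line.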

Next steps: connecting SvelteKit and Ollama

Now that both SvelteKit and Ollama are set up, you’re ready to integrate them. The next section will cover:

  • Managing the application state with Svelte 5's $state rune.
  • Sending HTTP POST requests to the Ollama API server.
  • Streaming and processing the AI-generated responses.

Continue following the article for details on the full implementation.

Understanding the code

Create the src/routes/+page.svelte file with this code:

<script lang="ts">
import { page } from "$app/state";
import { onMount } from "svelte";

// Reactive state: any change re-renders the UI wherever these are read.
let text = $state("");
let status = $state("");
let statusInvalid = $state(false);
let question = $state("");

// Pre-fill the question from the ?question= query parameter, if present.
onMount(() => {
    question = page.url.searchParams.get("question") ?? "";
});

function resetData() {
    question = "";
    text = "";
    status = "";
    statusInvalid = false;
}

function translateData(language: string) {
    const myquestion = `Help me to translate this text into ${language} language: \n${question}`;
    text = "";
    status = "";
    statusInvalid = false;
    askQuestion(myquestion);
}

function reviewData() {
    const myquestion = `Help me to review this text in a better English form, provide me only the reviewed text: \n${question}`;
    text = "";
    status = "";
    statusInvalid = false;
    askQuestion(myquestion);
}

function readData() {
    const myquestion = question;
    text = "";
    status = "";
    statusInvalid = false;
    askQuestion(myquestion);
}

async function askQuestion(myquestion: string) {
    try {
        if (question === "") {
            throw new Error("Question is empty");
        }
        const url = "http://localhost:11434/api/generate";
        const response = await fetch(url, {
            method: "POST",
            headers: { "Content-Type": "application/json" },
            body: JSON.stringify({
                model: "llama3.2",
                prompt: myquestion,
            }),
        });
        if (!response.ok) {
            throw new Error(
                `HTTP error! Status: ${response.status} ${response.statusText}`,
            );
        }
        if (!response.body) {
            throw new Error("Readable stream not found in the response.");
        }
        // Ollama streams newline-delimited JSON objects. A network chunk
        // may contain several lines or end mid-line, so keep a buffer and
        // only parse complete lines.
        const reader = response.body.getReader();
        const decoder = new TextDecoder();
        let buffer = "";
        while (true) {
            const { done, value } = await reader.read();
            if (done) {
                status = "";
                statusInvalid = false;
                return;
            }
            buffer += decoder.decode(value, { stream: true });
            const lines = buffer.split("\n");
            buffer = lines.pop() ?? ""; // keep any incomplete trailing line
            for (const line of lines) {
                if (line.trim() === "") continue;
                const myresponse = JSON.parse(line);
                text = text + myresponse.response;
            }
        }
    } catch (error: unknown) {
        if (error instanceof Error) {
            status = error.message;
            statusInvalid = true;
            console.error("An error occurred:", error.message);
        } else {
            console.error("An unknown error occurred:", error);
        }
    }
}
</script>

<main class="container">
    <form>
        <div class="grid">
            <div>
                <textarea
                    bind:value={question}
                    name="question"
                    placeholder="Write your question to Robertito AI"
                    aria-label="Your question"
                    aria-invalid={statusInvalid}
                    aria-describedby="invalid-helper"
                ></textarea>
                <small id="invalid-helper">{ status }</small>
            </div>
            <div>
                <textarea bind:value={text} aria-label="AI response"></textarea>
            </div>
        </div>

        <!-- type="button" keeps these buttons from submitting the form
             and reloading the page. -->
        <div role="group">
            <button type="button" class="lg" onclick={() => resetData()}> Reset </button>
            <button type="button" class="lg" onclick={() => reviewData()}>
                Review the text
            </button>
            <button type="button" class="lg" onclick={() => translateData("Italian")}>
                Translate to 🇮🇹
            </button>
            <button type="button" class="lg" onclick={() => translateData("English (British)")}>
                Translate to 🇬🇧
            </button>
            <button type="button" class="danger lg" onclick={() => readData()}>
                Ask me!
            </button>
        </div>
    </form>
</main>

<style>
    textarea {
        width: 100%;
        height: 40vh;
    }
</style>


The provided SvelteKit code demonstrates a web app that interacts with an Ollama server to send prompts and receive AI-generated text responses. Here's the breakdown of the key parts.

Reactive state variables with $state

Svelte 5's $state() rune declares reactive state variables, simplifying the management of mutable data that affects your UI.

let text = $state("");
let status = $state("");
let statusInvalid = $state(false);
let question = $state("");

onMount(() => {
    question = page.url.searchParams.get("question") ?? "";
});

Each variable is initialized with a default value, and any changes automatically update the UI where the variable is used. This eliminates the need for explicit event emitters or update calls.
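
As a stand-alone illustration (a minimal sketch, separate from the app in this article), clicking the button below mutates count, and every place that reads count re-renders automatically:

<script lang="ts">
    // Hypothetical example component: `count` is reactive state.
    let count = $state(0);
</script>

<button type="button" onclick={() => count++}>
    Clicked {count} times
</button>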

Key Actions:

  • resetData(): Resets all state variables.
  • translateData(language): Translates question into the specified language.
  • reviewData(): Asks for a language review of the question.
  • readData(): Sends the question to the server for AI processing.

Handling HTTP streaming from Ollama

The askQuestion() function manages the interaction with Ollama's API, utilizing a streaming HTTP response. Here's how it works:

async function askQuestion(myquestion: string) {
  try {
    if (question === "") {
      throw new Error("Question is empty");
    }
    const url = "http://localhost:11434/api/generate";
    const response = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: "llama3.2",
        prompt: myquestion,
      }),
    });
    if (!response.ok) {
      throw new Error(
        `HTTP error! Status: ${response.status} ${response.statusText}`,
      );
    }
    if (!response.body) {
      throw new Error("Readable stream not found in the response.");
    }
    // Ollama streams newline-delimited JSON objects. A network chunk
    // may contain several lines or end mid-line, so keep a buffer and
    // only parse complete lines.
    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    let buffer = "";
    while (true) {
      const { done, value } = await reader.read();
      if (done) {
        status = "";
        statusInvalid = false;
        return;
      }
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split("\n");
      buffer = lines.pop() ?? ""; // keep any incomplete trailing line
      for (const line of lines) {
        if (line.trim() === "") continue;
        const myresponse = JSON.parse(line);
        text = text + myresponse.response;
      }
    }
  } catch (error: unknown) {
    if (error instanceof Error) {
      status = error.message;
      statusInvalid = true;
      console.error("An error occurred:", error.message);
    } else {
      console.error("An unknown error occurred:", error);
    }
  }
}

How It Works:

  1. Input validation: ensures the question isn't empty.
  2. API call: sends a POST request with the user's myquestion prompt.
  3. Response streaming: reads chunks of data (value) from the server as they arrive, decodes them with TextDecoder, and parses each complete newline-delimited JSON line, appending its response field to text.
  4. Error Handling: captures potential errors in the HTTP call, parsing, or streaming process.

This incremental rendering provides a smooth experience where users see the response as it’s being generated.
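
For reference, each line streamed by /api/generate is a self-contained JSON object. An intermediate line looks roughly like this (fields abbreviated, values illustrative):

{"model":"llama3.2","created_at":"2024-12-04T10:00:00Z","response":"Hello","done":false}

The final line has "done": true and carries summary metadata instead of more text.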

Interactive UI: reactive state in action

The UI is tightly coupled with the reactive state variables, ensuring any changes are immediately reflected. For instance:

<div class="grid">
    <div>
        <textarea
            bind:value={question}
            name="question"
            placeholder="Write your question to Robertito AI"
            aria-label="Your question"
            aria-invalid={statusInvalid}
            aria-describedby="invalid-helper"
        ></textarea>
        <small id="invalid-helper">{ status }</small>
    </div>
    <div>
        <textarea bind:value={text} aria-label="AI response"></textarea>
    </div>
</div>

Here, the question and text states directly bind to the <textarea> elements, providing seamless two-way data binding. When the API response updates text, the changes instantly appear in the UI.

The buttons trigger state updates or server interactions:

<div role="group">
    <button type="button" class="lg" onclick={() => resetData()}> Reset </button>
    <button type="button" class="lg" onclick={() => reviewData()}>
        Review the text
    </button>
    <button type="button" class="lg" onclick={() => translateData("Italian")}>
        Translate to 🇮🇹
    </button>
    <button type="button" class="lg" onclick={() => translateData("English (British)")}>
        Translate to 🇬🇧
    </button>
    <button type="button" class="danger lg" onclick={() => readData()}>
        Ask me!
    </button>
</div>

Each button invokes a specific function that manipulates state or sends an API request, giving the user instant feedback. Note the type="button" attribute: a button inside a <form> defaults to type="submit", and without it a click would submit the form and reload the page.

Enhancing error handling

Errors are displayed dynamically to the user using the status and statusInvalid states:

<small id="invalid-helper">{ status }</small>

This ensures users are informed of any issues, such as an empty prompt or server errors, without disrupting their workflow.

Conclusion

By combining Svelte 5's $state for state management with careful handling of HTTP streams, this integration demonstrates the power of reactive programming for modern applications. The real-time feedback and incremental rendering ensure a delightful user experience, making it ideal for AI-driven applications powered by Ollama.

With further customization and optimization, this setup can form the foundation for a wide range of interactive, AI-powered web applications.
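
For example, one natural refinement is to stop calling Ollama directly from the browser and instead proxy it through a SvelteKit server endpoint. The sketch below assumes a hypothetical src/routes/api/ask/+server.ts route; it simply forwards the prompt and passes Ollama's streamed body through unchanged:

import type { RequestHandler } from "./$types";

export const POST: RequestHandler = async ({ request }) => {
    const { prompt } = await request.json();
    const upstream = await fetch("http://localhost:11434/api/generate", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ model: "llama3.2", prompt }),
    });
    // Stream the newline-delimited JSON through to the client,
    // so the browser never talks to Ollama directly.
    return new Response(upstream.body, {
        headers: { "Content-Type": "application/x-ndjson" },
    });
};

With this in place, the client-side fetch in askQuestion() would target /api/ask instead of http://localhost:11434/api/generate, keeping the model server private.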
