
Blossoming Intelligence: How to Run Spring AI Locally with Ollama

Published: 5/11/2024
Categories: spring, ai, ollama, llama3
Author: daasrattale

Nobody can dispute that AI is here to stay. Among its many benefits, developers are using it to boost their productivity.
Much of that capability is sold for a fee, as SaaS or other managed services, once it has earned the necessary trust of enterprises.
Still, we can run pre-trained models locally and incorporate them into our current apps.

In this short article, we'll look at how easy it is to build a chatbot backend powered by Spring and Ollama, using the llama3 model.

Tech Stack

This project is built using:

  • Java 21.
  • Spring Boot 3.2.5 with WebFlux.
  • Spring AI 3.2.5.
  • Ollama 0.1.36.

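For these pieces to work together, the Spring AI Ollama starter and WebFlux must be on the classpath. A minimal pom.xml excerpt might look like the sketch below; the coordinates are the Ollama starter as published by the Spring AI project, and ${spring-ai.version} stands in for whatever Spring AI version your build resolves:

<!-- Hypothetical pom.xml excerpt: adjust versions to your build -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    <version>${spring-ai.version}</version>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-webflux</artifactId>
</dependency>
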
Ollama Setup

To install Ollama locally, simply head to https://ollama.com/download and install it using the executable appropriate for your OS.
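
On Linux, for example, the download page offers a one-line install script:

curl -fsSL https://ollama.com/install.sh | sh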

You can check that it is installed by running the following command:



ollama --version


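If the install succeeded, this prints the installed version, for example (matching the version listed in the tech stack above):

ollama version is 0.1.36
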

You can directly pull a model from Ollama Models and run it using the Ollama CLI; in my case, I used the llama3 model:



ollama pull llama3 # Should take a while.
ollama run llama3



Let's test it out with a simple prompt:

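An interactive session looks roughly like this (the prompt and the truncated answer below are hypothetical; your output will differ):

>>> Why is the sky blue?
The sky appears blue because sunlight scatters off molecules in the
atmosphere, and shorter blue wavelengths scatter the most...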

To exit, use the command:



/bye


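Under the hood, Ollama also serves a local HTTP API on port 11434, which is exactly what Spring AI will talk to in the next section. You can sanity-check it with curl (the /api/generate endpoint is part of Ollama's documented API):

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
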

Talking Spring

The Spring application will have the following properties:



spring.ai.ollama.base-url=http://localhost:11434
spring.ai.ollama.chat.model=llama3



Then, in our chat package, we will have a chat config bean:



package io.daasrattale.generalknowledge.chat;

import org.springframework.ai.chat.ChatClient;
import org.springframework.ai.ollama.OllamaChatClient;
import org.springframework.ai.ollama.api.OllamaApi;
import org.springframework.ai.ollama.api.OllamaOptions;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ChatConfig {

    @Bean
    public ChatClient chatClient() {
        // new OllamaApi() targets Ollama's default base URL, http://localhost:11434
        return new OllamaChatClient(new OllamaApi())
                // 0.9 is a fairly high temperature: more creative, less predictable answers
                .withDefaultOptions(OllamaOptions.create()
                        .withTemperature(0.9f));
    }
}



FYI, model temperature is a parameter that controls how random a language model's output is.
The temperature is set to 0.9 here to make the model more creative and willing to take more risks in its answers.
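
Conversely, if you want more deterministic answers, or want to pin the model in code rather than through the property above, the same fluent options API can be used. A sketch, assuming a withModel setter is available on OllamaOptions in this version:

// Hypothetical variant: pin the model and lower the temperature
@Bean
public ChatClient chatClient() {
    return new OllamaChatClient(new OllamaApi())
            .withDefaultOptions(OllamaOptions.create()
                    .withModel("llama3")     // assumed setter; pins the model in code
                    .withTemperature(0.2f)); // closer to 0 => more deterministic
}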

The last step is to create a simple chat REST controller:



package io.daasrattale.generalknowledge.chat;

import org.springframework.ai.chat.ChatClient;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Mono;

@RestController
@RequestMapping("/v1/chat")
public class ChatController {

    // Inject the ChatClient interface, matching the bean type declared in ChatConfig
    private final ChatClient chatClient;

    public ChatController(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    @GetMapping
    public Mono<ResponseEntity<String>> generate(@RequestParam(defaultValue = "Tell me to add a proper prompt in a funny way") String prompt) {
        // chatClient.call(...) blocks, so defer it until subscription instead of
        // evaluating it eagerly with Mono.just(...)
        return Mono.fromCallable(
                () -> ResponseEntity.ok(chatClient.call(prompt))
        );
    }
}



Let's try calling GET /v1/chat with an empty prompt:

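With curl, that request looks roughly like this (assuming Spring Boot's default port 8080):

# No prompt parameter: the controller falls back to its funny defaultValue
curl "http://localhost:8080/v1/chat"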

What about a simple general knowledge question:

(screenshot: the model's answer to a general-knowledge question)

Of course, let's ask for some code:

(screenshot: the code sample generated by the model)

The API URL it provided is not valid, but the rest of the generated code works.

Finally

Being able to use models locally with such ease and simplicity is a true added value; still, the models you use must be heavily vetted.

You can find the source code in this GitHub repository; make sure to star it if you find it useful. :)

Resources

https://spring.io/projects/spring-ai

https://docs.spring.io/spring-ai/reference/api/clients/ollama-chat.html
