
Blossoming Intelligence: How to Run Spring AI Locally with Ollama

Published: 5/11/2024
Categories: spring, ai, ollama, llama3
Author: daasrattale

Nobody can dispute that AI is here to stay. Among its many benefits, developers are using it to boost their productivity.
Much of that capability is sold for a fee, as SaaS or other managed services, once it has earned the necessary trust of enterprises.
Still, we can run pre-trained models locally and incorporate them into our current apps.

In this short article, we'll look at how easy it is to build a chatbot backend powered by Spring and Ollama, using the llama3 model.

Tech Stack

This project is built using:

  • Java 21.
  • Spring Boot 3.2.5 with WebFlux.
  • Spring AI 3.2.5.
  • Ollama 0.1.36.

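For these pieces to work together, the Spring AI Ollama starter and WebFlux must be on the classpath. A minimal pom.xml excerpt might look like the sketch below; the coordinates are the Ollama starter as published by the Spring AI project, and ${spring-ai.version} stands in for whatever Spring AI version your build resolves:

<!-- Hypothetical pom.xml excerpt: adjust versions to your build -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    <version>${spring-ai.version}</version>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-webflux</artifactId>
</dependency>
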
Ollama Setup

To install Ollama locally, simply head to https://ollama.com/download and install it using the executable appropriate for your OS.
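
On Linux, for example, the download page offers a one-line install script:

curl -fsSL https://ollama.com/install.sh | sh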

You can check that it is installed by running the following command:



ollama --version


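If the install succeeded, this prints the installed version, for example (matching the version listed in the tech stack above):

ollama version is 0.1.36
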

You can directly pull a model from Ollama Models and run it using the Ollama CLI; in my case, I used the llama3 model:



ollama pull llama3 # Should take a while.
ollama run llama3



Let's test it out with a simple prompt:

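An interactive session looks roughly like this (the prompt and the truncated answer below are hypothetical; your output will differ):

>>> Why is the sky blue?
The sky appears blue because sunlight scatters off molecules in the
atmosphere, and shorter blue wavelengths scatter the most...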

To exit, use the command:



/bye


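Under the hood, Ollama also serves a local HTTP API on port 11434, which is exactly what Spring AI will talk to in the next section. You can sanity-check it with curl (the /api/generate endpoint is part of Ollama's documented API):

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
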

Talking Spring

The Spring application will have the following properties:



spring.ai.ollama.base-url=http://localhost:11434
spring.ai.ollama.chat.model=llama3



Then, in our chat package, we will have a chat config bean:



package io.daasrattale.generalknowledge.chat;

import org.springframework.ai.chat.ChatClient;
import org.springframework.ai.ollama.OllamaChatClient;
import org.springframework.ai.ollama.api.OllamaApi;
import org.springframework.ai.ollama.api.OllamaOptions;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ChatConfig {

    @Bean
    public ChatClient chatClient() {
        // new OllamaApi() targets Ollama's default base URL, http://localhost:11434
        return new OllamaChatClient(new OllamaApi())
                // 0.9 is a fairly high temperature: more creative, less predictable answers
                .withDefaultOptions(OllamaOptions.create()
                        .withTemperature(0.9f));
    }
}



FYI, model temperature is a parameter that controls how random a language model's output is.
The temperature is set to 0.9 here to make the model more creative and willing to take more risks in its answers.
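
Conversely, if you want more deterministic answers, or want to pin the model in code rather than through the property above, the same fluent options API can be used. A sketch, assuming a withModel setter is available on OllamaOptions in this version:

// Hypothetical variant: pin the model and lower the temperature
@Bean
public ChatClient chatClient() {
    return new OllamaChatClient(new OllamaApi())
            .withDefaultOptions(OllamaOptions.create()
                    .withModel("llama3")     // assumed setter; pins the model in code
                    .withTemperature(0.2f)); // closer to 0 => more deterministic
}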

The last step is to create a simple chat REST controller:



package io.daasrattale.generalknowledge.chat;

import org.springframework.ai.chat.ChatClient;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Mono;

@RestController
@RequestMapping("/v1/chat")
public class ChatController {

    // Inject the ChatClient interface, matching the bean type declared in ChatConfig
    private final ChatClient chatClient;

    public ChatController(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    @GetMapping
    public Mono<ResponseEntity<String>> generate(@RequestParam(defaultValue = "Tell me to add a proper prompt in a funny way") String prompt) {
        // chatClient.call(...) blocks, so defer it until subscription instead of
        // evaluating it eagerly with Mono.just(...)
        return Mono.fromCallable(
                () -> ResponseEntity.ok(chatClient.call(prompt))
        );
    }
}



Let's try calling GET /v1/chat with an empty prompt:

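With curl, that request looks roughly like this (assuming Spring Boot's default port 8080):

# No prompt parameter: the controller falls back to its funny defaultValue
curl "http://localhost:8080/v1/chat"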

What about a simple general knowledge question:

(screenshot: the model's answer to a general-knowledge question)

Of course, let's ask for some code:

(screenshot: the code sample generated by the model)

The API URL it provided is not valid, but the rest of the generated code works.

Finally

Being able to use models locally with such ease and simplicity is a true added value; still, the models you use must be heavily vetted.

You can find the source code in this GitHub repository; make sure to star it if you find it useful. :)

Resources

https://spring.io/projects/spring-ai

https://docs.spring.io/spring-ai/reference/api/clients/ollama-chat.html
