How to get an LLM up & running locally - in 5 minutes

Published at: 3/23/2024
Categories: llm, chatgpt, mistral, ollama
Author: hayerhans

Video Version:
https://youtube.com/shorts/y0NWVUsfLiU?si=x16bKEoHLfk87nC2

What is Ollama?

Ollama is a lightweight framework for people who want to experiment with, customize, and deploy large language models on their own machine, without the hassle of cloud platforms. It puts that capability into a simple, local package, so developers and hobbyists alike can explore what these models can do.

Setting Up Ollama: A Step-by-Step Approach

First, download Ollama for your OS:
https://ollama.com/download

Second, run the model you want:

ollama run llama2
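
Before the first run, it can help to confirm that the installer actually put the `ollama` binary on your PATH. The following is a minimal shell sketch, not part of the official setup; `have_cmd` and `first_run` are hypothetical helper names introduced here for illustration.

```shell
#!/bin/sh
# have_cmd: succeed if the given command name is on PATH (hypothetical helper).
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# first_run: start a model after confirming that ollama is installed.
# The model name defaults to llama2, the model used in this post.
first_run() {
    if ! have_cmd ollama; then
        echo "ollama not found; install it from https://ollama.com/download" >&2
        return 1
    fi
    # `ollama run` downloads the model on first use, then opens an interactive prompt.
    ollama run "${1:-llama2}"
}
```

Calling `first_run` starts llama2, and `first_run mistral` starts Mistral instead.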

Model library

Ollama supports the models listed at ollama.com/library.

Here are some example models that can be downloaded:

| Model | Parameters | Size | Download Command |
| --- | --- | --- | --- |
| Llama 2 | 7B | 3.8GB | ollama run llama2 |
| Mistral | 7B | 4.1GB | ollama run mistral |
| Dolphin Phi | 2.7B | 1.6GB | ollama run dolphin-phi |
| Phi-2 | 2.7B | 1.7GB | ollama run phi |
| Neural Chat | 7B | 4.1GB | ollama run neural-chat |
| Starling | 7B | 4.1GB | ollama run starling-lm |
| Code Llama | 7B | 3.8GB | ollama run codellama |
| Llama 2 Uncensored | 7B | 3.8GB | ollama run llama2-uncensored |
| Llama 2 13B | 13B | 7.3GB | ollama run llama2:13b |
| Llama 2 70B | 70B | 39GB | ollama run llama2:70b |
| Orca Mini | 3B | 1.9GB | ollama run orca-mini |
| Vicuna | 7B | 3.8GB | ollama run vicuna |
| LLaVA | 7B | 4.5GB | ollama run llava |
| Gemma | 2B | 1.4GB | ollama run gemma:2b |
| Gemma | 7B | 4.8GB | ollama run gemma:7b |
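
Each download command names a model, optionally followed by a tag after the colon (for example `llama2:13b`). If you would rather fetch several models up front without opening a chat, `ollama pull` downloads a model without running it. Here is a small sketch of that idea; `pull_models` is a hypothetical helper, and `OLLAMA_BIN` is a variable invented for this example (it is not an Ollama setting) so that you can swap in `echo` for a dry run.

```shell
#!/bin/sh
# pull_models: download each model named in the arguments, stopping at the first failure.
# OLLAMA_BIN is a hypothetical override introduced for this sketch; it defaults to `ollama`.
pull_models() {
    for model in "$@"; do
        "${OLLAMA_BIN:-ollama}" pull "$model" || return 1
    done
}
```

For example, `pull_models mistral gemma:2b` fetches both models, and `ollama list` then shows what is on disk. With `OLLAMA_BIN=echo` set, the helper only prints the arguments it would have passed.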

Memory Requirements:
Keep in mind that these models are not light on resources. Make sure you have at least 8 GB of RAM available for 7B models, and more for larger ones (the Ollama README suggests 16 GB for 13B models and 32 GB for 33B models).
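
As a rough pre-flight check, you can compare your machine's total RAM against this guidance before pulling a large model. The sketch below assumes Linux (`/proc/meminfo`) or macOS (`sysctl hw.memsize`). All helper names are hypothetical, and the 16 GB and 32 GB figures for 13B and 33B models come from the Ollama README, so treat them as a rule of thumb.

```shell
#!/bin/sh
# min_ram_gb: print the suggested minimum RAM in GB for a parameter size.
# 8 GB for 7B is from this post; 16 GB (13B) and 32 GB (33B) follow the Ollama README.
min_ram_gb() {
    case "$1" in
        7b|7B)   echo 8 ;;
        13b|13B) echo 16 ;;
        33b|33B) echo 32 ;;
        *)       return 1 ;;   # no guidance known for this size
    esac
}

# total_ram_gb: print total physical memory in GB.
total_ram_gb() {
    if [ -r /proc/meminfo ]; then
        # Linux reports MemTotal in kB, slightly below the installed amount,
        # so round to the nearest GB instead of truncating.
        awk '/^MemTotal:/ { print int($2 / 1024 / 1024 + 0.5) }' /proc/meminfo
    else
        # Assumption: macOS, where sysctl reports the installed memory in bytes.
        echo $(( $(sysctl -n hw.memsize) / 1024 / 1024 / 1024 ))
    fi
}

# enough_ram: succeed if this machine meets the suggestion for the given size.
# Returns 2 when no suggestion is known for that size.
enough_ram() {
    need=$(min_ram_gb "$1") || return 2
    [ "$(total_ram_gb)" -ge "$need" ]
}
```

A call such as `enough_ram 13b && ollama run llama2:13b` only starts the larger model when the check passes.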

Customization

With Ollama, you're not just running models; you're tailoring them. You can import models and customize prompts to fit your specific needs. Fancy a model that responds as Mario? Ollama makes that possible with a few simple commands:

Customize a prompt

Models from the Ollama library can be customized with a prompt. For example, to customize the llama2 model:

ollama pull llama2

Create a Modelfile:

FROM llama2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""

Next, create and run the model:

ollama create mario -f ./Modelfile
ollama run mario

>>> hi
Hello! It's your friend Mario.


If you liked this content, also have a look at my YouTube channel.
