What it's used for

Ollama is the simplest way to run large language models locally on your Mac, Linux, or Windows machine. It packages models with their runtime into a single binary, so you can run Llama, Gemma, Mistral, Phi, and dozens of other open models with a single terminal command.

One-command setup — ollama run llama3.2 downloads and starts the model
Local API server — OpenAI-compatible API at localhost:11434
Model library — 100+ models pre-configured and ready to run
GPU acceleration — automatically uses Apple Silicon, NVIDIA, or AMD GPUs
Modelfile — customize system prompts, parameters, and model behavior

Developers use Ollama for local development, privacy-sensitive applications, offline use, and as a cost-free alternative to API calls during prototyping.

Getting started

Install Ollama:
```
curl -fsSL https://ollama.com/install.sh | sh
```
Or download from ollama.com/download for Mac/Windows.
Run a model:
```
ollama run llama3.2
```

Use the API from your code:

curl http://localhost:11434/api/generate -d '{"model": "llama3.2", "prompt": "Hello"}'

Ollama is completely free and open-source. No API keys, no accounts, no usage limits.

Ollama

What it's used for

Getting started

Commonly paired with

No case studies yet

Related tools in General

Need a Ollama expert?