Running large language models (LLMs) directly on Linux systems gives users full control over their data, eliminates recurring API costs, and allows for offline access. Two standout tools for accomplishing this are Ollama and LM Studio. Each offers a distinct approach: Ollama provides a powerful command-line interface and API, while LM Studio delivers a streamlined graphical desktop experience. Both support a wide range of open-source models, including Llama, Mistral, DeepSeek, and more.
Using Ollama to Run Local LLMs
Ollama operates as a lightweight command-line tool that manages, downloads, and runs LLMs locally. It is cross-platform, but Linux users benefit from its straightforward installation and robust performance, especially when leveraging GPU acceleration.
Install Ollama with the official installation script:
curl -fsSL https://ollama.com/install.sh | sh
This script downloads and installs Ollama. After installation, verify by running:
ollama --version
This command displays the installed version, confirming a successful setup.
Next, download a model. For example, to pull the Llama 2 7B chat model:
ollama pull llama2:7b-chat
This fetches the model weights to your machine. Download times vary depending on model size and internet speed.
Start a chat session with the downloaded model:
ollama run llama2:7b-chat
This launches an interactive prompt where you can enter questions and receive responses. For one-off queries, you can append the prompt directly:
ollama run llama2:7b-chat "What is the capital of Poland?"
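One-off queries lend themselves to scripting. The following is a minimal sketch, not an official Ollama feature: `ask_all` and the `OLLAMA_BIN` variable are hypothetical names introduced here, and the prompts file is assumed to hold one prompt per line.

```shell
#!/bin/sh
# Sketch: run every prompt in a text file through a local model.
# OLLAMA_BIN and ask_all are hypothetical names used for this example.
OLLAMA_BIN="${OLLAMA_BIN:-ollama}"

# ask_all MODEL PROMPTS_FILE
ask_all() {
    model="$1"
    prompts_file="$2"
    # Read line by line; the extra test keeps a last line without a newline.
    while IFS= read -r prompt || [ -n "$prompt" ]; do
        [ -n "$prompt" ] || continue          # skip blank lines
        printf '### %s\n' "$prompt"
        # Redirect stdin so the command cannot swallow the remaining prompts.
        "$OLLAMA_BIN" run "$model" "$prompt" < /dev/null
    done < "$prompts_file"
}
```

With Ollama installed, `ask_all llama2:7b-chat prompts.txt` would print each prompt followed by the model's answer.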
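Ollama also exposes an HTTP API on port 11434. If the server is not already running in the background, start it:
ollama serve
Then send requests using curl or any HTTP client. For example: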
curl http://localhost:11434/api/generate -d '{
"model": "llama2:7b-chat",
"prompt": "List three Linux distributions.",
"stream": false
}'
The API returns generated responses as JSON. This setup allows developers to build custom chatbots, assistants, or integrate LLMs into other software without relying on external services.
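For repeated use, the request can be wrapped in small shell functions. This is a sketch under stated assumptions: `OLLAMA_URL`, `json_escape`, `build_payload`, and `generate` are hypothetical names, and the escaping covers only backslashes and double quotes, not newlines or control characters.

```shell
#!/bin/sh
# Sketch: reusable wrapper around the /api/generate request shown above.
# All function and variable names here are hypothetical.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Escape backslashes and double quotes so text can sit inside a JSON string.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# build_payload MODEL PROMPT: print the body of a non-streaming request.
build_payload() {
    printf '{"model": "%s", "prompt": "%s", "stream": false}' \
        "$(json_escape "$1")" "$(json_escape "$2")"
}

# generate MODEL PROMPT: send the request and print the raw JSON reply.
generate() {
    curl -fsS "$OLLAMA_URL/api/generate" -d "$(build_payload "$1" "$2")"
}
```

Calling `generate "llama2:7b-chat" "List three Linux distributions."` prints the JSON reply, with the generated text in its `response` field. If `jq` happens to be installed, piping the output through `jq -r .response` extracts just that text.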
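To check whether Ollama is using your GPU, run a model with debug logging enabled:
OLLAMA_DEBUG=true ollama run llama2:7b-chat
Look for log messages indicating CUDA usage (on NVIDIA GPUs). GPU acceleration significantly speeds up inference for larger models.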
To build a customized model, describe it in a Modelfile and create it:
ollama create my-custom-model -f /path/to/Modelfile
This feature is valuable for automating workflows or tailoring models to specific tasks.
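For illustration, a minimal Modelfile might look like the sketch below. The base model tag comes from the earlier examples; the temperature value, the system prompt, and the `write_modelfile` helper are assumptions chosen for this example.

```shell
#!/bin/sh
# Sketch: write an example Modelfile to the given path.
# write_modelfile is a hypothetical helper; the settings are illustrative.
write_modelfile() {
    cat > "$1" <<'EOF'
FROM llama2:7b-chat
PARAMETER temperature 0.3
SYSTEM You are a concise assistant that answers questions about Linux.
EOF
}

# Usage (needs Ollama installed and the base model already pulled):
#   write_modelfile ./Modelfile
#   ollama create my-custom-model -f ./Modelfile
#   ollama run my-custom-model
```

`FROM` selects the base model, `PARAMETER` overrides a sampling setting, and `SYSTEM` sets the system prompt applied to every conversation.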
Using LM Studio to Run Local LLMs
LM Studio offers a graphical desktop application for running LLMs locally on Linux, Windows, and macOS. It’s especially suitable for users who prefer not to use the command line or want a more interactive experience.
The Linux download is an AppImage named LM-Studio-0.3.9-6-x64.AppImage and is about 1GB in size. Make it executable, then launch it:
chmod u+x LM-Studio-0.3.9-6-x64.AppImage
./LM-Studio-0.3.9-6-x64.AppImage
This command starts the application, which automatically opens the graphical interface.
Once a model is loaded, you can adjust how it runs:
- Set system prompts to guide model behavior.
- Customize parameters such as temperature, top-p, top-k, and max tokens.
- Control GPU offload settings to balance speed and memory usage. For example:
  - 4GB–8GB VRAM: Use partial offload (10–50 layers).
  - 10GB–16GB VRAM: Use higher offload (50–80%).
  - 24GB+ VRAM: Use full GPU offload if available.
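The offload guidance can be expressed as a small lookup. This is a rough sketch: `offload_hint` is a hypothetical helper, and its handling of VRAM sizes that fall between or below the listed ranges (for example 9GB, 20GB, or under 4GB) is an assumption, not a recommendation from LM Studio.

```shell
#!/bin/sh
# Sketch: map whole gigabytes of VRAM to the offload guidance listed above.
# offload_hint is a hypothetical helper. Sizes between the listed ranges fall
# back to the nearest lower tier, and the "below 4GB" answer is an assumption.
offload_hint() {
    vram_gb="$1"
    if [ "$vram_gb" -ge 24 ]; then
        echo "full GPU offload"
    elif [ "$vram_gb" -ge 10 ]; then
        echo "higher offload (50-80%)"
    elif [ "$vram_gb" -ge 4 ]; then
        echo "partial offload (10-50 layers)"
    else
        echo "little or no offload"
    fi
}
```

On NVIDIA hardware, `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits` reports total VRAM in MiB, which can be divided by 1024 and passed to the function.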
LM Studio can also serve the loaded model through an OpenAI-compatible API on localhost, allowing other programs to interact with your chosen model. Developers can point their OpenAI API clients to this endpoint, making integration with existing tools simple.
Managing Model Storage and Sharing Between Ollama and LM Studio
Ollama and LM Studio store models in separate directories and lay them out differently. Ollama keeps model weights as GGUF data in hash-named blob files alongside manifests, while LM Studio works with plain GGUF files compatible with llama.cpp. To avoid redundant downloads and conserve disk space, users can:
- Identify where each application stores its models (~/.ollama/models for Ollama, ~/.cache/lm-studio/models/ for LM Studio).
- If a model is in GGUF format, create a symbolic link from the shared model location to the other application’s directory. For example:
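mkdir -p /store/MyModels
cd /store/MyModels
# Download or move the model here
# Use an absolute target path; a relative one would resolve from the link's own directory and break
ln -s /store/MyModels/the-model-file ~/.cache/lm-studio/models/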
Alternatively, use community tools like gollama or llamalink to automate symlinking models between Ollama and LM Studio. Note that not all models are cross-compatible; conversion to GGUF may be required for LM Studio.
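For a handful of models, the manual approach can be scripted. The sketch below is not how gollama or llamalink work internally: `link_gguf_models` is a hypothetical helper that links every `.gguf` file from a shared directory into a destination directory using absolute paths.

```shell
#!/bin/sh
# Sketch: symlink every .gguf file in a shared directory into another directory.
# link_gguf_models is a hypothetical helper written for this example.
# Usage: link_gguf_models SRC_DIR DEST_DIR
link_gguf_models() {
    # Resolve the source to an absolute path so the links stay valid.
    src_dir=$(cd "$1" && pwd) || return 1
    dest_dir="$2"
    mkdir -p "$dest_dir"
    for model in "$src_dir"/*.gguf; do
        [ -e "$model" ] || continue   # the glob matched nothing
        ln -sf "$model" "$dest_dir/$(basename "$model")"
    done
}
```

For example, `link_gguf_models /store/MyModels ~/.cache/lm-studio/models` would link the shared models into the LM Studio directory named above. Whether the application picks them up can depend on the folder layout it expects.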
Choosing the Right Tool for Your Needs
LM Studio is ideal for users who want a graphical, plug-and-play experience with minimal setup. Its interface simplifies model discovery, parameter tuning, and multi-model management, making it suitable for beginners and those preferring a visual workflow.
Ollama excels for developers and advanced users who need automation, scripting, or integration into larger workflows. Its command-line and API features provide granular control over model usage, customization, and performance optimization.
Both tools empower users to run powerful LLMs locally, maintaining privacy, reducing costs, and enabling experimentation without reliance on cloud services. By selecting the approach that matches your technical comfort and project goals, you can deploy advanced AI on your own hardware efficiently.
Running LLMs with Ollama and LM Studio on Linux streamlines local AI deployment, giving you flexibility and privacy without sacrificing performance. Try both methods to see which workflow fits your needs best.






