[How To] Run DeepSeek-R1 Locally: Ubuntu & Ollama Setup

This guide provides a comprehensive walkthrough on how to run the powerful DeepSeek-R1 large language model (LLM) locally on your Ubuntu system using Ollama. Running LLMs locally offers enhanced privacy, offline availability, and the freedom to experiment without cloud dependencies or per-request API costs.

Table of Contents

  • Prerequisites
  • Step 1: Install Ollama on Ubuntu
  • Step 2: Start the Ollama Service
  • Step 3: Pull the DeepSeek-R1 Model
  • Step 4: Run DeepSeek-R1 Locally and Interact
  • System Requirements and Best Practices
  • Conclusion

Prerequisites

Before you begin, ensure your Ubuntu system meets the following requirements:

  • An Ubuntu operating system (latest LTS version recommended).
  • A stable internet connection for downloading Ollama and the DeepSeek-R1 model.
  • Sufficient hardware resources:
    • RAM: At least 16GB of RAM is recommended for running 7B parameter models effectively. Larger models will require significantly more.
    • GPU: A dedicated GPU with ample VRAM is highly recommended for faster inference. While Ollama can run on CPU, performance will be considerably slower for larger models. A few quick commands for checking these resources are shown after this list.
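
If you are unsure whether your machine meets these requirements, the following commands give a quick overview of available RAM, disk space, and the graphics adapter. Note that nvidia-smi is only present if the NVIDIA drivers are installed:

lc-root@ubuntu:~$ free -h                 # total and available RAM
lc-root@ubuntu:~$ df -h ~                 # free disk space for model downloads
lc-root@ubuntu:~$ lspci | grep -i vga     # identify the graphics adapter
lc-root@ubuntu:~$ nvidia-smi              # VRAM and driver details (NVIDIA GPUs only)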

Step 1: Install Ollama on Ubuntu

Ollama provides a convenient installation script that automates the setup process on Linux systems. For a more detailed guide on installing Ollama, refer to our article on Installing Ollama on Ubuntu. Open your terminal and execute the following command:

lc-root@ubuntu:~$ curl -fsSL https://ollama.com/install.sh | sh

This script downloads and installs the Ollama binary, sets up the necessary services, and adds Ollama to your system’s PATH. The output will show the installation progress. Once completed, Ollama is ready to use.
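
Before moving on, you can confirm that the binary is on your PATH and see which version was installed:

lc-root@ubuntu:~$ ollama --version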

Step 2: Start the Ollama Service

After installation, the Ollama service typically starts automatically in the background. If it is not running, you can start the server manually in the foreground by running:

lc-root@ubuntu:~$ ollama serve

If you started the server this way, keep the terminal window open: the Ollama server must be running in order to download and interact with models, and closing the window stops it. If the background service is already active, ollama serve will simply report that the default port is in use, and you can skip this step.
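
On Linux, the install script typically registers Ollama as a systemd service, so the background server can also be checked and controlled with systemctl. This sketch assumes the installer created the default ollama unit:

lc-root@ubuntu:~$ systemctl status ollama     # check whether the service is running
lc-root@ubuntu:~$ systemctl start ollama      # start it if it is inactive
lc-root@ubuntu:~$ systemctl enable ollama     # start it automatically at boot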

Step 3: Pull the DeepSeek-R1 Model

DeepSeek-R1 is available in various sizes (e.g., 1.5B, 7B, 8B, 14B, 32B, 70B). The model size you choose depends on your system’s capabilities and your specific use case. For demonstration purposes, we will pull the 7B parameter version. Execute the command below in a new terminal window:

lc-root@ubuntu:~$ ollama pull deepseek-r1:7b

This command initiates the download of the DeepSeek-R1 7B model. The download time will vary based on your internet connection speed and the model’s size. Ollama displays the download progress.
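
Once the download completes, you can list the locally stored models to confirm that DeepSeek-R1 is available and see how much disk space it occupies. If your hardware is limited, a smaller tag such as deepseek-r1:1.5b can be pulled instead:

lc-root@ubuntu:~$ ollama list                    # show downloaded models and their sizes
lc-root@ubuntu:~$ ollama pull deepseek-r1:1.5b   # optional: smaller variant for low-RAM systems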

Step 4: Run DeepSeek-R1 Locally and Interact

Once the DeepSeek-R1 model has been successfully downloaded, you can start an interactive session directly from your terminal. In the same terminal window where you pulled the model (or a new one, ensuring the Ollama service is still running), type:

lc-root@ubuntu:~$ ollama run deepseek-r1:7b

You can now interact with the DeepSeek-R1 model. Type your prompts, and the model will generate responses directly in your terminal. To exit the session, type `/bye` or press `Ctrl + D`.

lc-root@ubuntu:~$ ollama run deepseek-r1:7b
>>> Explain quantum computing in simple terms.
Quantum computing is a new type of computing that uses the principles of quantum mechanics to solve problems that are too complex for classical computers. It's like a super-powered computer that can do things that regular computers can't.
>>>
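
Interactive chat is not the only way to use the model. While the Ollama server is running, it also exposes a local HTTP API on port 11434, which is handy for scripting. The example below sends a single non-streaming prompt with curl; adjust the model tag if you pulled a different size:

lc-root@ubuntu:~$ curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:7b",
  "prompt": "Explain quantum computing in simple terms.",
  "stream": false
}'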

System Requirements and Best Practices

Running large language models locally can be resource-intensive. Consider these best practices for optimal performance:

  • Monitor Resources: Keep an eye on your system’s RAM and GPU usage during inference.
  • Choose Model Size Wisely: Select a model size that aligns with your hardware specifications to avoid performance bottlenecks.
  • Update Ollama Regularly: Ensure you are using the latest version of Ollama for new features, bug fixes, and performance improvements.
  • Utilize GPU Acceleration: If you have a compatible GPU, verify that Ollama is actually using it for accelerated inference, as shown in the commands below.
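
A quick way to confirm GPU usage and keep an eye on memory while a model is loaded is sketched below. ollama ps reports whether each loaded model is running on the CPU or the GPU; nvidia-smi applies to NVIDIA cards only:

lc-root@ubuntu:~$ ollama ps                # loaded models and their CPU/GPU split
lc-root@ubuntu:~$ watch -n 1 free -h       # live view of RAM usage during inference
lc-root@ubuntu:~$ watch -n 1 nvidia-smi    # live view of VRAM usage (NVIDIA GPUs only)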

Conclusion

You have successfully set up and run the DeepSeek-R1 LLM locally on your Ubuntu system using Ollama. This powerful combination allows you to leverage advanced AI capabilities for various tasks, from content generation to coding assistance, all within your local environment. Experiment with different prompts and explore the potential of local LLM inference.
