
Run ALL Your AI Locally in Minutes (LLMs, RAG, and more)


Introduction

Are you looking for a simple, comprehensive package to set up local AI? This article introduces a starter kit developed by the n8n team that includes everything you need for local AI infrastructure. The package combines Ollama for running large language models (LLMs), Qdrant as the vector database, PostgreSQL for SQL storage, and n8n for workflow automation, all rolled into one easy-to-install solution. Let's dive into how you can set it up in just a few minutes and even extend it into a fully functional Retrieval-Augmented Generation (RAG) AI agent.

Why Local AI is Important

Running your own AI infrastructure is not just a trend; it's becoming increasingly accessible. With open-source models like Llama becoming more powerful, local AI can now compete with proprietary models. Setting up your local AI stack is straightforward with the tools provided in this package.

Installation Step-by-Step Guide

Prerequisites

Before you get started, make sure you have the following installed on your system:

  • Git: Needed to clone the starter kit repository.
  • Docker: Essential for containerization, along with Docker Compose.

Cloning the Repository

First, clone the GitHub repository for the self-hosted AI starter kit using the following command:

git clone <repository-url>

After cloning, change into the directory:

cd self-hosted-ai-starter-kit

Configure Environment Variables

You need to customize your environment settings. Open the .env file to set your PostgreSQL credentials, database name, and define secrets for n8n.
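For example, a minimal .env might look like the sketch below. The variable names follow the starter kit's .env.example, but double-check them against your copy of the repository, and substitute your own secrets:

```env
POSTGRES_USER=root
POSTGRES_PASSWORD=change-me
POSTGRES_DB=n8n

N8N_ENCRYPTION_KEY=your-random-encryption-key
N8N_USER_MANAGEMENT_JWT_SECRET=your-random-jwt-secret
```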

Editing the Docker Compose File

The next step is to make necessary edits in the docker-compose.yml file:

  1. PostgreSQL Port Exposure: Add the following lines under the PostgreSQL service to expose the port:

    ports:
      - 5432:5432
    
  2. Add Ollama Embedding Model: Add a line to the Ollama init service so that it also pulls an embedding model alongside the chat model.
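Taken together, the two edits might look like this sketch of docker-compose.yml. The service names and the nomic-embed-text embedding model are illustrative assumptions; match them to the services actually defined in your file:

```yaml
services:
  postgres:
    # ...existing PostgreSQL settings...
    ports:
      - 5432:5432            # expose Postgres to the host

  ollama-pull-llama:
    # ...existing init-container settings...
    # pull an embedding model in addition to the chat model
    command: ["sh", "-c", "ollama pull llama3.1 && ollama pull nomic-embed-text"]
```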

Running Docker Compose

Once you have configured the docker-compose.yml file, execute the following command to launch all the components:

docker-compose up -d --build

Verifying Installation

Open Docker Desktop to confirm that all four containers are running: PostgreSQL, Qdrant, Ollama, and n8n. You can also run commands inside each container to verify functionality.
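A few commands for spot-checking the stack from a terminal (the container names here are assumptions; check the `docker-compose ps` output for the real ones):

```shell
# List the containers and their status
docker-compose ps

# Ask Ollama which models it has pulled
docker exec -it ollama ollama list

# Check that Postgres accepts connections with the user from .env
docker exec -it postgres pg_isready -U root
```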

Creating a RAG AI Agent in n8n

To access the n8n interface, navigate to http://localhost:5678. From there, you can create a workflow that uses PostgreSQL for chat memory, Qdrant as a vector store, and Ollama for inference.

Setting Up the Agent

  1. Chat Interaction: Use a chat widget for user interaction.
  2. Integrating Llama 3.1: Set up Ollama to serve Llama 3.1 as your LLM.
  3. PostgreSQL for Memory: Configure PostgreSQL for storing chat memory.

Setting Up RAG with Qdrant

Qdrant will act as your vector store for knowledge retrieval. When creating the Qdrant credential in n8n, set the host to http://host.docker.internal and use the port exposed in docker-compose.yml so n8n can reach the container.
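In the n8n Qdrant credential form, that translates to something like the following (6333 is Qdrant's default REST port; adjust it if you mapped a different one):

```
Qdrant URL: http://host.docker.internal:6333
API Key:    (leave blank for a local, unauthenticated instance)
```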

Testing the Agent

Interact with your newly created agent using the chat widget. Ask it a question that requires the ingested knowledge to confirm everything is functioning as expected.

Future Enhancements

As the local AI stack is a continuously evolving system, there are plans to enhance it further with additional tools like Redis for caching and a self-hosted Supabase for more robust functionality, including authentication.


Keywords

  • Local AI
  • LLMs
  • RAG
  • n8n
  • PostgreSQL
  • Docker
  • Ollama
  • Qdrant
  • AI Infrastructure

FAQ

Q1: What prerequisites do I need to install this local AI stack?
A1: You will need Git and Docker installed on your system.

Q2: Can I customize the PostgreSQL settings?
A2: Yes, you need to configure the .env file to set your own PostgreSQL username, password, and database name.

Q3: How do I verify that all containers are running?
A3: Open Docker Desktop, and you should be able to see all four containers running. You can also check logs for any issues.

Q4: How can I test my RAG AI agent?
A4: You can test your agent by using the chat widget in the n8n interface to ask questions that the model should be able to answer based on your ingested knowledge.

Q5: Are there plans for future enhancements to this setup?
A5: Yes, future enhancements may include incorporating Redis for caching and Supabase for authentication, along with more general improvements to the stack.