Graph RAG: Improving RAG with Knowledge Graphs

Introduction

Graph-based Retrieval Augmented Generation (Graph RAG) is a system introduced by Microsoft that extends conventional Retrieval Augmented Generation (RAG) with knowledge graphs. The goal is to overcome some inherent limitations of traditional RAG systems: limited contextual understanding, scalability issues, and the complexity of integrating external knowledge sources.

The Traditional RAG Approach

How It Works

Document Processing:
- Documents are divided into subdocuments using a chunking strategy.
- Embeddings are computed for each chunk and stored in a vector store that acts as a knowledge base.

Query Phase:
- User inquiries are transformed into embeddings.
- Similarity searches within the vector database retrieve the most relevant subdocuments.
- These relevant chunks are combined with the user query to generate a response via a large language model.
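
The two phases above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the bag-of-words `embed` function stands in for a real embedding model, the "vector store" is a plain list, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text, size=40, overlap=10):
    """Split text into overlapping word windows (one simple chunking strategy)."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text):
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents, size=40, overlap=10):
    """Document processing: chunk every document and store (chunk, embedding) pairs."""
    chunks = [c for doc in documents for c in chunk(doc, size, overlap)]
    return [(c, embed(c)) for c in chunks]


def retrieve(index, query, k=2):
    """Query phase: embed the query and return the k most similar chunks."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query, chunks):
    """Combine retrieved chunks with the user query for the language model."""
    context = "\n---\n".join(chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Note that the model only ever sees the `k` chunks returned by `retrieve`, which is the root of the limitations discussed next.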

Limitations

Limited Contextual Understanding: The model sees only the few chunks retrieved for a query, so it may miss nuances and connections that are spread across the corpus.
Scalability Issues: As the corpus grows, the retrieval process becomes less efficient.
Complexity: Integrating external knowledge sources meaningfully can be cumbersome.

Introduction to Graph RAG

Microsoft's Graph RAG integrates knowledge graphs to address these limitations. The source code is available on GitHub, and it can be used with models like GPT-4 or local models like Llama 3. Here is an outline of how it works:

Indexing Phase

Subdocument Creation: Original documents are divided into smaller chunks.
Entity and Relationship Extraction: Identify entities (e.g., people, places) within each chunk and extract relationships between them.
Knowledge Graph Creation: Use the extracted entities and relationships to form a knowledge graph.
Community Detection and Summarization: Group related entities into communities and summarize each community at various levels.
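
To make the indexing steps concrete, here is a toy sketch under heavy simplifying assumptions. Capitalized words stand in for LLM-based entity extraction, co-occurrence within a chunk stands in for relationship extraction, and connected components stand in for a proper graph clustering algorithm. It also produces a single level of communities rather than a hierarchy. All function names are hypothetical and none of this is Graph RAG's actual code.

```python
import re
from collections import defaultdict
from itertools import combinations


def extract_entities(chunk):
    """Toy extractor: every capitalized word counts as an entity.

    This also picks up ordinary sentence-initial words; a real system
    would prompt an LLM instead.
    """
    return sorted(set(re.findall(r"\b[A-Z][a-z]+\b", chunk)))


def build_graph(chunks):
    """Link entities that co-occur in a chunk; edge weight counts co-occurrences."""
    graph = defaultdict(lambda: defaultdict(int))
    for chunk in chunks:
        entities = extract_entities(chunk)
        for entity in entities:
            graph[entity]  # touch the key so isolated entities become nodes
        for a, b in combinations(entities, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def detect_communities(graph):
    """Group entities by connected component (a stand-in for graph clustering)."""
    seen, communities = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], []
        while stack:
            node = stack.pop()
            members.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        communities.append(sorted(members))
    return communities


def summarize(community, chunks):
    """Placeholder summary; a real system would have an LLM write prose here."""
    support = [c for c in chunks if set(community) & set(extract_entities(c))]
    return f"Community of {', '.join(community)} ({len(support)} supporting chunks)"
```

For the chunks "Alice met Bob in Paris.", "Bob wrote to Carol." and "Dave lives in Tokyo.", this groups Alice, Bob, Carol and Paris into one community and Dave and Tokyo into another.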

Query Phase

User Query Processing: Process the user query and select the required community level.
Retrieval and Response Generation: Retrieve summaries of the relevant communities and generate a final response.
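
One way to picture the query phase is as a map-and-combine pass over community summaries. The sketch below is an assumption-laden illustration: `llm` is any caller-supplied function that maps a prompt string to a response string, relevance is scored by simple word overlap rather than by the model, and the function names and prompts are hypothetical.

```python
import re


def tokens(text):
    """Lowercased word set used for a crude relevance score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_summaries(summaries_by_level, level):
    """Pick the community summaries stored for the requested community level."""
    return summaries_by_level[level]


def map_step(query, summaries, llm):
    """Get one partial answer per relevant summary, paired with its overlap score."""
    query_tokens = tokens(query)
    partials = []
    for summary in summaries:
        score = len(query_tokens & tokens(summary))
        if score:  # skip communities that share no words with the query
            partials.append((score, llm(f"Context: {summary}\nQuestion: {query}")))
    return partials


def reduce_step(query, partials, llm, top_n=3):
    """Combine the highest-scoring partial answers into a final response."""
    best = sorted(partials, key=lambda p: p[0], reverse=True)[:top_n]
    notes = "\n".join(answer for _, answer in best)
    return llm(f"Combine these notes into one answer.\n{notes}\nQuestion: {query}")


def answer(query, summaries_by_level, level, llm):
    """Full query phase: select a level, map over summaries, then reduce."""
    summaries = select_summaries(summaries_by_level, level)
    return reduce_step(query, map_step(query, summaries, llm), llm)
```

Because every relevant community triggers its own model call before the final combining call, the number of calls grows with the number of communities at the chosen level.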

Technical Details

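During indexing, entity recognition, relationship mapping, and community summarization turn the raw chunks into a knowledge graph plus a set of summaries. At query time the system works from those summaries rather than from isolated chunks, which is what makes its responses more robust and contextually aware.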

Setting Up Graph RAG

Step-by-Step Instructions

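The commands below use the python -m entry points from the early graphrag releases. Newer releases provide a graphrag command-line tool for the same steps, so check the documentation for your installed version.

Create a Virtual Environment (GraphRAG does not support Python 3.8; 3.10 is a safe choice):
```
conda create -n graph_rag python=3.10
conda activate graph_rag
```

Install the Package (published on PyPI as graphrag):
```
pip install graphrag
```

Run Indexing:
- Prepare the data directory, for instance:
```
mkdir -p rag_test/input
wget -O rag_test/input/book.txt [link_to_text_file]
```
- Initialize the workspace, then provide API keys and configuration details in the generated .env and settings.yaml files:
```
python -m graphrag.index --init --root ./rag_test
```
- Run the indexing process:
```
python -m graphrag.index --root ./rag_test
```

Run Queries:

Set --method to global or local and pass the question as the final argument:
```
python -m graphrag.query --root ./rag_test --method global "<your question>"
```

Example Queries

Global Inquiry: "What are the main themes in this story?"
```
python -m graphrag.query --root ./rag_test --method global "What are the main themes in this story?"
```

Local Inquiry: "Describe character X."
```
python -m graphrag.query --root ./rag_test --method local "Describe character X."
```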

Cost Implications

Graph RAG improves contextual understanding, but it costs more to run than traditional RAG. Indexing and querying both involve multiple API calls and substantial token processing, and the cost rises further with advanced models like GPT-4.
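
A back-of-the-envelope estimate helps when budgeting an indexing run. The function below is a rough sketch: every input, including the per-token prices, is a placeholder you supply yourself, and it ignores prompt overhead, community summarization, and query-time calls.

```python
def estimate_indexing_cost(corpus_tokens, chunk_tokens, calls_per_chunk,
                           output_tokens_per_call, price_in_per_1k, price_out_per_1k):
    """Rough LLM cost of processing a corpus chunk by chunk.

    All arguments are caller-supplied assumptions; prices are per 1,000 tokens.
    """
    chunks = -(-corpus_tokens // chunk_tokens)  # ceiling division
    calls = chunks * calls_per_chunk
    input_tokens = calls * chunk_tokens
    output_tokens = calls * output_tokens_per_call
    cost = (input_tokens / 1000 * price_in_per_1k
            + output_tokens / 1000 * price_out_per_1k)
    return {"chunks": chunks, "calls": calls, "cost": round(cost, 2)}


# Hypothetical numbers for illustration only; check your provider's current pricing.
estimate = estimate_indexing_cost(
    corpus_tokens=100_000, chunk_tokens=1_000, calls_per_chunk=2,
    output_tokens_per_call=500, price_in_per_1k=0.01, price_out_per_1k=0.03,
)
```

With these placeholder inputs the corpus splits into 100 chunks and 200 calls. The point is the scaling: cost grows linearly with both corpus size and the number of model calls per chunk.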

Alternative Implementations

Other tools, such as LlamaIndex and Neo4j, offer their own implementations of knowledge graph RAG. These may provide different benefits and cost structures.

Keywords

Graph RAG
Knowledge Graph
Retrieval Augmented Generation (RAG)
GPT-4
Llama 3
Entity Extraction
Relationship Mapping
Community Detection

FAQ

What is Graph RAG?

Graph RAG is a system that integrates knowledge graphs with traditional Retrieval Augmented Generation techniques to improve contextual understanding and scalability.

How does Graph RAG improve over traditional RAG systems?

Graph RAG builds a knowledge graph and community summaries from the corpus, which helps address limitations such as limited contextual understanding, scalability issues, and complexity in integrating external knowledge sources.

What are the primary components of Graph RAG?

Graph RAG involves an indexing phase (subdocument creation, entity and relationship extraction, knowledge graph creation, community detection and summarization) and a query phase (user query processing, retrieval, and response generation).

How can I set up Graph RAG?

You can set up Graph RAG by creating a virtual environment, installing the necessary package, running the indexing process, and executing queries. Detailed instructions are in the Setting Up Graph RAG section above.

What are the cost implications of running Graph RAG?

Running Graph RAG can be expensive due to multiple API calls and token processing, particularly when using advanced models like GPT-4.