You want to understand what a Knowledge Graph is and how it helps in RAG (Retrieval-Augmented Generation) — without any scary computer words.
Let’s do it step by step, like a story 🪄
What is RAG?
Imagine you’re doing homework, and you have two friends:
- Alex — who is great at explaining things (that’s the AI).
- Sam — who keeps a huge bookshelf of facts (that’s the retrieval part).
When you ask a question, Alex doesn’t guess;
he first asks Sam to find the right book or page,
then reads it and gives you an answer.
That’s RAG:
“Retrieval” = finding the right info,
“Augmented Generation” = using that info to write the answer.
So, RAG = an AI that double-checks before answering.
What is a Knowledge Graph?
Now imagine Sam (the librarian friend) is messy.
Books are everywhere!
He can’t remember which book connects to which topic.
So, we give Sam a magic map that shows how everything connects —
that map is a Knowledge Graph (KG).
It’s like:
[Harry Potter] → [Written by] → [J.K. Rowling]
[J.K. Rowling] → [Born in] → [UK]
[Harry Potter] → [Type] → [Book]
Each bubble (node) is a thing,
and each arrow (edge) shows how things are related.
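If you like seeing it in code, those bubbles and arrows are just a list of (node, edge, node) triples. Here's a tiny sketch using the same toy facts from above:

```python
# A knowledge graph is just a set of (node, relation, node) triples.
triples = [
    ("Harry Potter", "written by", "J.K. Rowling"),
    ("J.K. Rowling", "born in", "UK"),
    ("Harry Potter", "type", "Book"),
]

# Ask: which arrows start at "Harry Potter"?
for subject, relation, obj in triples:
    if subject == "Harry Potter":
        print(f"{subject} -> {relation} -> {obj}")
```

Each tuple is one arrow in the map; real graph databases store millions of these, but the idea is the same.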
Why do we mix Knowledge Graph with RAG?
Now imagine Alex (AI) wants to answer:
“Who wrote Harry Potter and where was that person born?”
Instead of searching thousands of pages,
Sam (the librarian) looks at his Knowledge Graph:
Harry Potter → Written by → J.K. Rowling → Born in → UK
Boom! Found the path in one glance
So, in RAG + Knowledge Graph:
- The Knowledge Graph helps the AI understand relationships between facts.
- The Retrieval part pulls real text or documents for proof.
- The AI (Alex) uses both to give a clear, correct answer.
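The two-part question above is a "multi-hop" lookup: follow one arrow, then follow another from where you landed. A minimal sketch, using the same toy triples (the `follow` helper is hypothetical, not a real library call):

```python
triples = [
    ("Harry Potter", "written by", "J.K. Rowling"),
    ("J.K. Rowling", "born in", "UK"),
    ("Harry Potter", "type", "Book"),
]

def follow(start, relation):
    """Follow one arrow labelled `relation` out of the node `start`."""
    for s, r, o in triples:
        if s == start and r == relation:
            return o
    return None

# Hop 1: who wrote it?  Hop 2: where was that person born?
author = follow("Harry Potter", "written by")
birthplace = follow(author, "born in")
print(f"{author} wrote it, and was born in the {birthplace}.")
```

Notice how the answer to hop 1 becomes the starting point of hop 2; that chaining is exactly what plain text search struggles with.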
Comparing the two
| Thing | Normal RAG | RAG + Knowledge Graph |
|---|---|---|
| Works like | AI + search engine | AI + smart map of facts |
| Finds info | By matching similar text | By following logical links |
| Example | Finds paragraphs that mention “Harry Potter” | Knows that “Harry Potter → written by → J.K. Rowling → born in → UK” |
| Result | “Maybe J.K. Rowling” | “It’s J.K. Rowling, who was born in the UK — here’s the source.” |
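The table's two "Finds info" rows can be sketched side by side. This is a toy contrast with made-up paragraphs and triples, not a real retriever:

```python
paragraphs = [
    "Harry Potter is a famous book series loved worldwide.",
    "J.K. Rowling wrote Harry Potter while living in the UK.",
]
triples = [
    ("Harry Potter", "written by", "J.K. Rowling"),
    ("J.K. Rowling", "born in", "UK"),
]

question = "who wrote harry potter"

# Normal RAG: keep any paragraph that shares words with the question,
# then hope the AI reads the right one.
matches = [p for p in paragraphs if "harry potter" in p.lower()]
print(matches)

# RAG + KG: follow the exact "written by" edge instead of guessing.
author = next(o for s, r, o in triples
              if s == "Harry Potter" and r == "written by")
print(author)  # J.K. Rowling
```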
Simple real-world example
Let’s say we build a small graph for animals 🐾
[Dog] → [is a] → [Animal]
[Dog] → [has sound] → [Bark]
[Cat] → [has sound] → [Meow]
Now you ask the AI:
“Which animal makes a meow sound?”
Without the graph, it has to search all text to find “cat = meow”.
With the Knowledge Graph, it just follows the link:
? → has sound → Meow ⇒ Cat
That’s how KG helps RAG think logically instead of just guessing words.
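That "? → has sound → Meow" lookup is just walking an arrow backwards. A tiny sketch with the animal triples above (`who_makes` is a hypothetical helper):

```python
triples = [
    ("Dog", "is a", "Animal"),
    ("Dog", "has sound", "Bark"),
    ("Cat", "has sound", "Meow"),
]

def who_makes(sound):
    # Walk the "has sound" edges backwards: find every node
    # whose arrow points at `sound`.
    return [s for s, r, o in triples if r == "has sound" and o == sound]

print(who_makes("Meow"))  # ['Cat']
```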
Why it’s powerful
Knowledge Graph in RAG helps the AI:
- Find accurate facts
- Understand connections (who, what, where, how)
- Explain answers clearly (show the path)
- Avoid confusion (like “Apple” the fruit vs. “Apple” the company)
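The "Apple vs. Apple" trick works because each meaning is a *separate node* with its own arrows. A toy disambiguation sketch (the node names and `disambiguate` helper are made up for illustration):

```python
triples = [
    ("Apple (fruit)",   "is a",  "Food"),
    ("Apple (company)", "is a",  "Company"),
    ("Apple (company)", "makes", "iPhone"),
]

def disambiguate(word, clue):
    """Pick the node whose neighbouring arrows mention the clue."""
    candidates = {s for s, _, _ in triples if s.startswith(word)}
    for s, r, o in triples:
        if s in candidates and clue.lower() in o.lower():
            return s
    return None

print(disambiguate("Apple", "iPhone"))  # Apple (company)
```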
Simple summary
| Concept | Meaning |
|---|---|
| Knowledge Graph | A smart map that shows how things are related |
| RAG | AI that finds facts before answering |
| RAG + KG | AI that uses a smart map and real info to answer logically and accurately |
Imagine a school version
If RAG is like:
“I’ll read all the textbooks to find the answer,”
Then RAG + Knowledge Graph is like:
“I’ll use the mind map I made in my notebook to quickly find which topic connects to which — then look up just that page.”
Why it might be slower
Here’s what happens inside the computer (in kid-friendly terms):
| Step | What happens | Speed impact |
|---|---|---|
| Entity linking | AI figures out “who/what” you’re talking about (like “Apple” = company, not fruit) | adds a few milliseconds |
| Graph search | It follows the lines (edges) in the knowledge graph to find connected facts | can be slower if graph is huge |
| Document retrieval | Finds the right paragraphs or documents | usually fast |
| LLM generation | Writes the answer using the info | same speed as before |
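To get a feel for the speed table, here's a toy timing harness. The `timed` helper is a sketch, and the lambda "stages" are placeholders (a real pipeline would call a model, a graph database, and a vector store here):

```python
import time

def timed(label, fn, *args):
    """Run one pipeline stage and report how long it took."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{label:20s} {elapsed_ms:8.2f} ms")
    return result

# Placeholder stages standing in for the steps in the table above.
entity = timed("entity linking", lambda q: "Apple (company)", "Is Apple hiring?")
facts  = timed("graph search",   lambda e: [(e, "is a", "Company")], entity)
docs   = timed("doc retrieval",  lambda e: ["Apple careers page"], entity)
```

In a real system you'd wrap the actual entity linker, graph query, and retriever in `timed` to see which stage dominates.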
demo_holiday_policy_rag_vs_graph.py

```python
import os, asyncio, textwrap
from datetime import datetime, timezone

from openai import OpenAI
import chromadb
from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction

# ---------- SAMPLE DOC (school holiday policy) ----------
DOC = textwrap.dedent("""
Title: Springdale High – Holiday Policy (2025)
1) School is closed on all national holidays and second Saturdays.
2) Winter break: Dec 20–Jan 2 (classes resume Jan 3).
3) Festival allowances: Each student may take 2 optional cultural holidays per year with prior approval.
4) Make-up exams are available one week after any official holiday.
5) Office hours remain 9am–1pm during winter break (no classes).
""").strip()

QUESTION = "Are classes running on October 3rd, and can a student take an extra day for a festival?"

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# ---------- PLAIN VECTOR RAG (ChromaDB) ----------
def plain_rag(doc: str, question: str) -> str:
    # 1a) Build a tiny vector store
    chroma = chromadb.Client()
    embed = OpenAIEmbeddingFunction(api_key=os.environ["OPENAI_API_KEY"],
                                    model_name="text-embedding-3-small")
    coll = chroma.create_collection(name="school_policy", embedding_function=embed)
    # 1b) Naive chunking (one chunk per line) and upsert
    chunks = [c.strip() for c in doc.split("\n") if c.strip()]
    coll.upsert(ids=[f"c{i}" for i in range(len(chunks))], documents=chunks)
    # 1c) Retrieve the most similar chunks
    hits = coll.query(query_texts=[question], n_results=4)
    context = "\n".join(hits["documents"][0])
    # 1d) Generate an answer grounded in those chunks
    prompt = f"Use ONLY this policy to answer.\n\nPOLICY:\n{context}\n\nQ: {question}\nA:"
    resp = client.responses.create(model="gpt-4o-mini", input=prompt)
    return resp.output_text.strip()

# ---------- GRAPH RAG (Graphiti) ----------
# - Ingest the document as an "episode" -> Graphiti extracts entities & relations
# - Hybrid search follows graph + text to fetch precise facts
from graphiti_core import Graphiti
from graphiti_core.nodes import EpisodeType

async def graph_rag(doc: str, question: str) -> str:
    g = Graphiti(os.environ["NEO4J_URI"], os.environ["NEO4J_USER"],
                 os.environ["NEO4J_PASSWORD"])
    try:
        # one-time init (safe to call repeatedly)
        await g.build_indices_and_constraints()
        # add the document as a text "episode" (auto triplet extraction + chunking)
        await g.add_episode(
            name="Springdale_Holiday_Policy_2025",
            episode_body=doc,
            source=EpisodeType.text,
            source_description="school policy",
            reference_time=datetime.now(timezone.utc),
        )
        # graph-aware search (semantic + BM25 + graph traversal)
        facts = await g.search(question)  # ranked fact edges w/ source/validity
        # stitch the top facts together as grounded context
        top = "\n".join([f"- {f.fact}" for f in facts[:6]])
        prompt = f"Answer strictly from these facts.\nFACTS:\n{top}\n\nQ: {question}\nA:"
        resp = client.responses.create(model="gpt-4o-mini", input=prompt)
        return resp.output_text.strip()
    finally:
        await g.close()

# ---------- RUN BOTH & COMPARE ----------
if __name__ == "__main__":
    print("\n--- Plain RAG ---")
    print(plain_rag(DOC, QUESTION))
    print("\n--- Graph RAG (Graphiti) ---")
    print(asyncio.run(graph_rag(DOC, QUESTION)))
```