LightRAG is an improvement on GraphRAG that focuses on:
- Simplicity: lower computational cost during retrieval and generation.
- Speed: new data can be integrated without rebuilding the entire graph.
- Efficiency.
During the processing stage:
- First, entities and relations are extracted with LLMs: long texts are split into chunks, and an LLM is applied to each chunk. Entities include names, locations, events, etc.; relations capture semantic links such as “belongs to”, “contains”, and “depends on”.
- After extraction, LLMs generate structured key-value pairs, where the key is the entity and the value is its descriptive text.
- Finally, duplicates are removed: after graph construction, the system reduces redundancy by merging duplicate nodes and their values, as well as duplicate edges (relations).
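
The chunking and deduplication steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LightRAG's actual implementation: the LLM extraction step is replaced by hand-written sample data, the helper names (`chunk_text`, `normalize`, `merge_graph`) are hypothetical, chunks are cut by a fixed word count, and two names are treated as the same entity when they match after trimming whitespace and case-folding.

```python
from collections import defaultdict


def chunk_text(text, max_words=50):
    """Split a long text into chunks of at most max_words words (assumed strategy)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def normalize(name):
    """Assumed identity rule: collapse whitespace and ignore case."""
    return " ".join(name.split()).casefold()


def merge_graph(entities, relations):
    """Merge duplicate nodes and edges, keeping each distinct description once.

    entities:  iterable of (name, description)
    relations: iterable of (source, relation, target, description)
    """
    nodes = defaultdict(list)
    for name, description in entities:
        key = normalize(name)
        if description not in nodes[key]:
            nodes[key].append(description)

    edges = defaultdict(list)
    for source, relation, target, description in relations:
        key = (normalize(source), relation, normalize(target))
        if description not in edges[key]:
            edges[key].append(description)

    return dict(nodes), dict(edges)


# Hand-written stand-in for what an LLM might extract from several chunks.
entities = [
    ("LightRAG", "A graph-based RAG system."),
    ("lightrag", "Supports incremental updates."),
    ("GraphRAG", "An earlier graph-based RAG approach."),
    ("LightRAG ", "A graph-based RAG system."),
]
relations = [
    ("LightRAG", "improves on", "GraphRAG", "LightRAG lowers retrieval cost."),
    ("lightrag", "improves on", "graphrag", "LightRAG lowers retrieval cost."),
]

nodes, edges = merge_graph(entities, relations)
chunks = chunk_text("a b c d e", max_words=2)
```

Here the four extracted entity records collapse into two nodes, and the two relation records collapse into a single edge. A real system would likely need a stronger notion of "duplicate" than case-insensitive name matching.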
During retrieval, LightRAG uses LLMs to extract both low-level and high-level keywords from the query. Retrieval is likewise split into two levels:
- Low-level retrieval locates specific entities and relations.
- High-level retrieval captures broader thematic context.
The retrieved contents are then passed to an LLM for final generation.
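
As a rough sketch of the two retrieval levels, the toy example below matches low-level keywords against entity names and high-level keywords against relation descriptions, then joins the hits into one context string for the generator. Everything here is an assumption made for illustration: the function names are hypothetical, the keywords are supplied by hand instead of by an LLM, and plain substring matching stands in for whatever matching the real system uses.

```python
def low_level_retrieve(keywords, nodes, edges):
    """Find entities whose name contains a keyword, plus the edges touching them."""
    wanted = [k.casefold() for k in keywords]
    hit_nodes = {name: desc for name, desc in nodes.items()
                 if any(k in name for k in wanted)}
    hit_edges = {key: desc for key, desc in edges.items()
                 if key[0] in hit_nodes or key[2] in hit_nodes}
    return hit_nodes, hit_edges


def high_level_retrieve(keywords, edges):
    """Find edges whose description mentions a broader theme keyword."""
    wanted = [k.casefold() for k in keywords]
    return {key: desc for key, desc in edges.items()
            if any(k in desc.casefold() for k in wanted)}


def build_context(low_keywords, high_keywords, nodes, edges):
    """Combine both retrieval levels into one text block for the final LLM call."""
    hit_nodes, low_edges = low_level_retrieve(low_keywords, nodes, edges)
    high_edges = high_level_retrieve(high_keywords, edges)
    all_edges = {**low_edges, **high_edges}  # union; shared edges appear once
    lines = [f"{name}: {desc}" for name, desc in hit_nodes.items()]
    lines += [f"{s} -[{r}]-> {t}: {desc}" for (s, r, t), desc in all_edges.items()]
    return "\n".join(lines)


# Toy graph: node names are lowercase keys, edges are (source, relation, target).
nodes = {
    "lightrag": "A graph-based RAG system with dual-level retrieval.",
    "graphrag": "An earlier graph-based RAG approach.",
    "entity extraction": "An LLM step that finds entities in text chunks.",
}
edges = {
    ("lightrag", "improves on", "graphrag"): "Efficiency: LightRAG lowers retrieval cost.",
    ("lightrag", "uses", "entity extraction"): "Indexing: entities are extracted from chunks.",
    ("graphrag", "uses", "entity extraction"): "Indexing: GraphRAG also extracts entities.",
}

# Keywords an LLM might produce for a query such as "How does GraphRAG index data?"
low_keywords = ["GraphRAG"]
high_keywords = ["indexing"]

hit_nodes, low_edges = low_level_retrieve(low_keywords, nodes, edges)
high_edges = high_level_retrieve(high_keywords, edges)
context = build_context(low_keywords, high_keywords, nodes, edges)
```

The resulting `context` string is what would be placed in the prompt for final generation. Taking the union of both edge sets keeps an edge from being repeated when both levels retrieve it.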