LightRAG is an improvement to GraphRAG that focuses on

  • Simplicity. Reduces the computational cost of retrieval and generation.
  • Speed. New data can be integrated without rebuilding the entire graph.
  • Efficiency.

During the processing stage,

  1. we first extract entities and relations with LLMs. Long texts are split into chunks, and an LLM is applied to each chunk for extraction. Entities include names, locations, events, etc.; relations capture semantic links such as “belongs to”, “contains”, and “depends on”.
  2. After extraction, LLMs generate structured key-value pairs, where each key is an entity and each value is its descriptive text.
  3. Finally, duplicates are removed. After graph construction, the system removes redundancy by merging duplicate nodes and their values, as well as duplicate edges (relations).
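
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LightRAG's actual code: the names `split_into_chunks`, `extract`, and `build_graph` are hypothetical, chunking is done per sentence for simplicity, and the LLM call is replaced by a toy rule-based stub so the example runs offline.

```python
from collections import defaultdict

def split_into_chunks(text):
    """Split text into chunks; here one chunk per sentence (a simplification)."""
    return [s.strip() for s in text.split(".") if s.strip()]

def extract(chunk):
    """Stand-in for the LLM extraction call (assumption).

    Recognizes only the toy pattern "<A> contains <B>" and returns
    (entity, description) pairs plus (source, relation, target) triples.
    """
    entities, relations = [], []
    words = chunk.split()
    for i, word in enumerate(words):
        if word == "contains" and 0 < i < len(words) - 1:
            src, dst = words[i - 1], words[i + 1]
            description = f"mentioned in: {chunk}"
            entities += [(src, description), (dst, description)]
            relations.append((src, "contains", dst))
    return entities, relations

def build_graph(text):
    """Build key-value nodes and deduplicated edges from raw text."""
    nodes = defaultdict(list)  # key: entity name, value: list of descriptions
    edges = set()              # a set drops duplicate relations automatically
    for chunk in split_into_chunks(text):
        entities, relations = extract(chunk)
        for name, description in entities:
            key = name.lower()  # normalize so "Graph" and "graph" merge
            if description not in nodes[key]:
                nodes[key].append(description)
        for src, rel, dst in relations:
            edges.add((src.lower(), rel, dst.lower()))
    return dict(nodes), edges

nodes, edges = build_graph(
    "Graph contains Node. graph contains Edge. Graph contains Node."
)
print(sorted(nodes))  # three merged entity keys
print(sorted(edges))  # two unique relations
```

The repeated sentence contributes nothing new, and "Graph" and "graph" collapse into a single node whose value holds both descriptions. A real system would need a more careful notion of "duplicate" than case-insensitive name matching.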

During retrieval, LightRAG uses LLMs to extract both low-level and high-level keywords from the query. Retrieval is likewise split into two levels:

  • Low-level retrieval locates specific entities and relations
  • High-level retrieval captures broader thematic context

The retrieved content is then passed to an LLM for final generation.
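
A minimal sketch of this dual-level retrieval is shown here, using a hand-written toy graph. Everything in it is an assumption made for illustration: the function names (`low_level_retrieve`, `high_level_retrieve`, `build_context`), the theme tags attached to relations, and the matching rules (exact name match and tag overlap). The keywords are passed in by hand rather than extracted by an LLM, and a real system would likely match by embedding similarity instead.

```python
# Toy knowledge graph (hypothetical data): entity -> description.
ENTITIES = {
    "paris": "Capital city of France.",
    "france": "A country in Western Europe.",
    "eu": "The European Union.",
}
# (source, relation, target, theme keywords); the theme tags are an assumption.
RELATIONS = [
    ("paris", "capital of", "france", {"geography"}),
    ("france", "member of", "eu", {"politics", "economy"}),
]

def low_level_retrieve(keywords):
    """Locate specific entities and the relations that touch them."""
    names = {k.lower() for k in keywords}
    entities = {n: d for n, d in ENTITIES.items() if n in names}
    relations = [r[:3] for r in RELATIONS if r[0] in names or r[2] in names]
    return entities, relations

def high_level_retrieve(keywords):
    """Return relations whose theme tags overlap the query's themes."""
    themes = {k.lower() for k in keywords}
    return [r[:3] for r in RELATIONS if r[3] & themes]

def build_context(low_keywords, high_keywords):
    """Merge both retrieval levels into the text passed to the generator."""
    entities, relations = low_level_retrieve(low_keywords)
    for rel in high_level_retrieve(high_keywords):
        if rel not in relations:  # avoid repeating a relation found twice
            relations.append(rel)
    lines = [f"{name}: {desc}" for name, desc in entities.items()]
    lines += [f"{s} {r} {t}" for s, r, t in relations]
    return "\n".join(lines)

# Low-level keyword "Paris" finds the entity; high-level keyword "economy"
# adds a relation that does not mention Paris at all.
print(build_context(["Paris"], ["economy"]))
```

The resulting context string would then be placed in the prompt of the generation LLM alongside the original query.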

RAG