Building and Querying Temporally Aware Knowledge Graphs with Graphiti


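1 Introduction

1.1 Overview

Graphiti is a Python framework for building temporally aware knowledge graphs, designed for AI agents. It supports real-time incremental updates to the graph without batch recomputation, which makes it well suited to dynamic environments where relationships and information change over time.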

# GitHub repository
https://github.com/getzep/graphiti

# Official documentation
https://help.getzep.com/graphiti

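1.2 Comparison with GraphRAG

Graphiti is designed specifically for the challenges posed by dynamic, frequently updated datasets, and is particularly suited to applications that need real-time interaction and precise historical queries.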

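Aspect | GraphRAG | Graphiti
Primary use | Static document summarization | Dynamic, evolving context for agents
Data handling | Batch-oriented | Continuous, incremental updates
Knowledge structure | Entity clusters and community summaries | Temporal knowledge graph: entities, facts with validity windows, episodes, communities
Retrieval method | Sequential LLM summarization | Hybrid semantic, keyword, and graph-based retrieval
Adaptability | |
Temporal handling | Basic timestamp tracking | Explicit bi-temporal tracking with automatic fact invalidation
Contradiction handling | LLM-driven summarization judgments | Automatic fact invalidation that preserves temporal history
Query latency | Seconds to tens of seconds | Typically sub-second
Custom entity types | Not supported | Supported, via Pydantic models
Scalability | Moderate | High, optimized for large datasets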
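
The "facts with validity windows" and "automatic fact invalidation" entries in the comparison can be illustrated with a toy model. The sketch below is not Graphiti's implementation; the `Fact` class and the `invalidate`, `facts_at`, and `utc` helpers are hypothetical names chosen for illustration. It only shows the idea: a superseded fact is closed by setting its `invalid_at` instead of being deleted, so historical queries still work.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Fact:
    """Toy fact with a validity window (hypothetical, not a Graphiti class)."""
    text: str
    valid_at: datetime
    invalid_at: Optional[datetime] = None  # None means "still valid"

    def holds_at(self, when: datetime) -> bool:
        """Return True if the fact was valid at the given moment."""
        if when < self.valid_at:
            return False
        return self.invalid_at is None or when < self.invalid_at


def invalidate(fact: Fact, when: datetime) -> None:
    """Close the validity window instead of deleting the fact."""
    fact.invalid_at = when


def facts_at(facts: List[Fact], when: datetime) -> List[str]:
    """Historical query: which facts held at a given moment?"""
    return [f.text for f in facts if f.holds_at(when)]


def utc(year: int, month: int, day: int) -> datetime:
    """Build a timezone-aware UTC datetime at midnight."""
    return datetime(year, month, day, tzinfo=timezone.utc)


ag = Fact('Harris is Attorney General of California', valid_at=utc(2011, 1, 3))
facts = [ag]

# New information arrives: the term ended on January 3, 2017.
invalidate(ag, utc(2017, 1, 3))

print(facts_at(facts, utc(2015, 6, 1)))  # the fact still answers a 2015 query
print(facts_at(facts, utc(2020, 6, 1)))  # but it no longer holds in 2020
```

The dates mirror the term used in the example episodes later in this article.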

1.3 Environment setup

Install the dependency

pip install graphiti-core -i https://pypi.tuna.tsinghua.edu.cn/simple

Install the graph database

docker run -itd \
--name neo4j \
-p 7474:7474 \
-p 7687:7687 \
-v /home/neo4j/data:/data \
-v /home/neo4j/logs:/logs \
-v /home/neo4j/plugins:/plugins \
-e NEO4J_AUTH=neo4j/secretgraph \
neo4j:5.26.18
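
Once the container is running, it can help to confirm that the mapped ports accept connections before running any Graphiti code. The following is a minimal sketch using only the standard library; `is_port_open` is a hypothetical helper, and `localhost` is a placeholder for the machine where Docker is running.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == '__main__':
    # Placeholder host; replace with the machine running the Neo4j container.
    host = 'localhost'
    for port in (7474, 7687):  # HTTP and Bolt ports mapped in the docker command
        state = 'open' if is_port_open(host, port) else 'closed'
        print(f'{host}:{port} is {state}')
```

A successful TCP connection only shows that the port is reachable; it does not verify the Neo4j credentials.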

2 Official example code

Models used: Kimi for text generation, Qwen for embeddings, and Qwen for reranking.

import asyncio
import json
import logging
import os
from datetime import datetime, timezone
from logging import INFO

from graphiti_core import Graphiti
from graphiti_core.cross_encoder import OpenAIRerankerClient
from graphiti_core.driver.neo4j_driver import Neo4jDriver
from graphiti_core.embedder import OpenAIEmbedder, OpenAIEmbedderConfig
from graphiti_core.llm_client import LLMConfig
from graphiti_core.llm_client.openai_generic_client import OpenAIGenericClient
from graphiti_core.nodes import EpisodeType
from graphiti_core.search.search_config_recipes import NODE_HYBRID_SEARCH_RRF

# Configure logging
logging.basicConfig(
    level=INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S',
)
logger = logging.getLogger(__name__)


neo4j_uri = 'bolt://192.168.108.147:7687'
neo4j_user = 'neo4j'
neo4j_password = 'secretgraph'


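# Limit the number of concurrent operations.
# If graphiti_core reads SEMAPHORE_LIMIT at import time, this assignment has to move above the graphiti_core imports to take effect.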
os.environ['SEMAPHORE_LIMIT'] = '2'


async def main():
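    # ---
    # 1 Initialize a Graphiti client backed by OpenAI-compatible services
    graphiti = Graphiti(
        # Graph driver
        graph_driver=Neo4jDriver(
            # Graph database connection
            uri=neo4j_uri,
            user=neo4j_user,
            password=neo4j_password
        ),

        # OpenAI-compatible LLM client
        # In my tests, the hosted Qwen and DeepSeek models and the newer Kimi models all failed; only the older Kimi model below worked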
        llm_client=OpenAIGenericClient(
            config=LLMConfig(
                api_key="sk-XXXX",
                model="moonshot-v1-32k",
                base_url="https://api.moonshot.cn/v1"
            )
        ),
        # OpenAI-compatible embedding model
        embedder=OpenAIEmbedder(
            config=OpenAIEmbedderConfig(
                api_key="sk-XXXX",
                embedding_model="text-embedding-v4",
                base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
            )
        ),
        # OpenAI-compatible reranker model
        cross_encoder=OpenAIRerankerClient(
            config=LLMConfig(
                api_key="sk-XXXX",
                model="qwen3-rerank",
                base_url="https://dashscope.aliyuncs.com/compatible-api/v1/reranks",
            )
        )
    )

    try:
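        # ---
        # 2 Add episodes
        # Episodes are the primary unit of information in Graphiti. They can be text or structured JSON, and are processed automatically to extract entities and relationships.

        # A list of episodes containing both text and JSON objects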
        episodes = [
            {
                'content': 'Kamala Harris is the Attorney General of California. She was previously the district attorney for San Francisco.',
                'type': EpisodeType.text,
                'description': 'podcast transcript',
            },
            {
                'content': 'As AG, Harris was in office from January 3, 2011 – January 3, 2017',
                'type': EpisodeType.text,
                'description': 'podcast transcript',
            },
            {
                'content': {
                    'name': 'Gavin Newsom',
                    'position': 'Governor',
                    'state': 'California',
                    'previous_role': 'Lieutenant Governor',
                    'previous_location': 'San Francisco',
                },
                'type': EpisodeType.json,
                'description': 'podcast metadata',
            },
            {
                'content': {
                    'name': 'Gavin Newsom',
                    'position': 'Governor',
                    'term_start': 'January 7, 2019',
                    'term_end': 'Present',
                },
                'type': EpisodeType.json,
                'description': 'podcast metadata',
            },
        ]

        # Add the episodes to the graph
        for i, episode in enumerate(episodes):
            print(episode)

            # Use string content as-is; serialize JSON content to a string
            if isinstance(episode['content'], str):
                episode_body_str = episode['content']
            else:
                episode_body_str = json.dumps(episode['content'])

            await graphiti.add_episode(
                name=f'Freakonomics Radio {i}',
                episode_body=episode_body_str,
                source=episode['type'],
                source_description=episode['description'],
                reference_time=datetime.now(timezone.utc),
            )
            print(f'Added episode: Freakonomics Radio {i} ({episode["type"].value})')

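        # ---
        # 3 Basic search
        # The simplest way to retrieve relationships (edges) from Graphiti is the search method, which performs a hybrid search combining semantic similarity and BM25 text retrieval.
        print("\nSearching for: 'Who was the California Attorney General?'")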
        results = await graphiti.search('Who was the California Attorney General?')

        # Print search results
        print('\nSearch Results:')
        for result in results:
            print(f'UUID: {result.uuid}')
            print(f'Fact: {result.fact}')
            if hasattr(result, 'valid_at') and result.valid_at:
                print(f'Valid from: {result.valid_at}')
            if hasattr(result, 'invalid_at') and result.invalid_at:
                print(f'Valid until: {result.invalid_at}')
            print('---')

        # ---
        # 4 Center node search
        # For more contextually relevant results, pass a center node so that results are reranked by their graph distance to that node

        # Use the source node of the top search result as the center node
        if results:
            # Get the source node UUID from the top search result
            center_node_uuid = results[0].source_node_uuid

            print('\nReranking search results based on graph distance:')
            print(f'Using center node UUID: {center_node_uuid}')

            reranked_results = await graphiti.search(
                'Who was the California Attorney General?', center_node_uuid=center_node_uuid
            )

            # Print reranked search results
            print('\nReranked Search Results:')
            for result in reranked_results:
                print(f'UUID: {result.uuid}')
                print(f'Fact: {result.fact}')
                if hasattr(result, 'valid_at') and result.valid_at:
                    print(f'Valid from: {result.valid_at}')
                if hasattr(result, 'invalid_at') and result.invalid_at:
                    print(f'Valid until: {result.invalid_at}')
                print('---')
        else:
            print('No results found in the initial search, so no center node could be selected.')

        # ---
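        # 5 Node search with a predefined search recipe
        # Graphiti provides predefined search recipes optimized for different scenarios. NODE_HYBRID_SEARCH_RRF retrieves nodes directly instead of edges (relationships).
        # Example: node search using the _search method with a predefined recipe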
        print(
            '\nPerforming node search using _search method with standard recipe NODE_HYBRID_SEARCH_RRF:'
        )

        # Copy the predefined recipe and adjust its limit
        node_search_config = NODE_HYBRID_SEARCH_RRF.model_copy(deep=True)
        # Return at most 5 results
        node_search_config.limit = 5

        # Run the node search
        node_search_results = await graphiti._search(
            query='California Governor',
            config=node_search_config,
        )

        # Print the node search results
        print('\nNode Search Results:')
        for node in node_search_results.nodes:
            print(f'Node UUID: {node.uuid}')
            print(f'Node Name: {node.name}')
            node_summary = node.summary[:100] + '...' if len(node.summary) > 100 else node.summary
            print(f'Content Summary: {node_summary}')
            print(f'Node Labels: {", ".join(node.labels)}')
            print(f'Created At: {node.created_at}')
            if hasattr(node, 'attributes') and node.attributes:
                print('Attributes:')
                for key, value in node.attributes.items():
                    print(f'  {key}: {value}')
            print('---')

    finally:
        # Clean up
        # Always close the Neo4j connection when finished so that resources are released properly
        await graphiti.close()
        print('\nConnection closed')


if __name__ == '__main__':
    asyncio.run(main())
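
Step 3 describes `search` as a hybrid of semantic similarity and BM25, and the recipe name `NODE_HYBRID_SEARCH_RRF` suggests that the separate rankings are merged with reciprocal rank fusion (RRF). The sketch below shows the generic technique on made-up result lists. It is not Graphiti's internal code: the function name, the sample IDs, and the constant `k = 60` (a commonly used default, not necessarily the value Graphiti uses) are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists: each item scores the sum of 1 / (k + rank) over all lists."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by item ID for a deterministic order.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Made-up rankings from two retrievers over the same edge IDs.
semantic_ranking = ['edge_a', 'edge_b', 'edge_c']
bm25_ranking = ['edge_b', 'edge_d', 'edge_a']

fused = reciprocal_rank_fusion([semantic_ranking, bm25_ranking])
print(fused)
```

Items that appear near the top of both lists accumulate the highest score, so `edge_b` and `edge_a` end up ahead of items found by only one retriever.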

3 Results

3.1 Execution results

(1) Console output

……
Added episode: Freakonomics Radio 0 (text)
2026-03-20 18:13:39 - neo4j.notifications - INFO - Received notification from DBMS server: <GqlStatusObject gql_status='00NA0', status_description="note: successful completion - index or constraint already exists. The command 'CREATE RANGE INDEX community_uuid IF NOT EXISTS FOR (e:Community) ON (e.uuid)' has no effect. The index or constraint specified by 'RANGE INDEX community_uuid FOR (e:Community) ON (e.uuid)' already exists.", position=None, raw_classification='SCHEMA', classification=<NotificationClassification.SCHEMA: 'SCHEMA'>, raw_severity='INFORMATION', severity=<NotificationSeverity.INFORMATION: 'INFORMATION'>, diagnostic_record={'_classification': 'SCHEMA', '_severity': 'INFORMATION', 'OPERATION': '', 'OPERATION_CODE': '0', 'CURRENT_SCHEMA': '/'}> for query: 'CREATE INDEX community_uuid IF NOT EXISTS FOR (n:Community) ON (n.uuid)'
……
2026-03-20 18:13:42 - httpx - INFO - HTTP Request: POST https://api.moonshot.cn/v1/chat/completions "HTTP/1.1 200 OK"
2026-03-20 18:13:43 - httpx - INFO - HTTP Request: POST https://dashscope.aliyuncs.com/compatible-mode/v1/embeddings "HTTP/1.1 200 OK"
……
Added episode: Freakonomics Radio 1 (text)
……
Added episode: Freakonomics Radio 2 (json)
……
Added episode: Freakonomics Radio 3 (json)


Searching for: 'Who was the California Attorney General?'

Search Results:
UUID: 341b9eff-343e-4129-a41a-a2176ff1c897
Fact: Kamala Harris was the district attorney for San Francisco before becoming the Attorney General of California.
---
UUID: 54bc7574-5610-487d-92c3-ed9f7094e9c6
Fact: Kamala Harris served as the Attorney General of California.
Valid from: 2026-03-20 10:13:39.601668+00:00
Valid until: 2017-01-04 00:00:00+00:00
---
UUID: da7c9b7d-6b9f-44a1-a9ce-9a227055cb44
Fact: Kamala Harris is the Attorney General of California.
Valid from: 2026-03-20 08:09:37.882078+00:00
---
UUID: 753abd55-1c30-4315-a1c4-9c6070086e62
Fact: Gavin Newsom is currently the Governor of California.
Valid from: 2026-03-20 10:14:27.766084+00:00
---
UUID: 35e9fbc6-8fda-44c4-8842-f51046a52a01
Fact: Kamala Harris was previously the district attorney for San Francisco.
---
UUID: 5885816b-8454-423d-9d8e-2bcc62ae1e5d
Fact: Harris held the position of Attorney General from January 3, 2011, to January 3, 2017.
Valid from: 2011-01-03 00:00:00+00:00
Valid until: 2017-01-03 00:00:00+00:00
---
UUID: 7fc71156-4255-4eac-aed3-ff8f20b2cf4e
Fact: Gavin Newsom previously held the position of Lieutenant Governor.
---
UUID: eff5d784-8bdc-494a-ba38-f36912e04616
Fact: Kamala Harris was the district attorney for San Francisco prior to her current position.
---
UUID: 2d82ef48-7252-49d6-b964-2e66ab3898c2
Fact: The position of district attorney that Kamala Harris held was in San Francisco.
---
UUID: 5361c1a2-68f5-4a9a-9fb6-154338a3729d
Fact: Gavin Newsom was previously based in San Francisco.
---

Reranking search results based on graph distance:
Using center node UUID: 2c26b541-1d44-4366-8335-76705c326b0c

Reranked Search Results:
UUID: 341b9eff-343e-4129-a41a-a2176ff1c897
Fact: Kamala Harris was the district attorney for San Francisco before becoming the Attorney General of California.
---
UUID: 54bc7574-5610-487d-92c3-ed9f7094e9c6
Fact: Kamala Harris served as the Attorney General of California.
Valid from: 2026-03-20 10:13:39.601668+00:00
Valid until: 2017-01-04 00:00:00+00:00
---
UUID: da7c9b7d-6b9f-44a1-a9ce-9a227055cb44
Fact: Kamala Harris is the Attorney General of California.
Valid from: 2026-03-20 08:09:37.882078+00:00
---
UUID: 35e9fbc6-8fda-44c4-8842-f51046a52a01
Fact: Kamala Harris was previously the district attorney for San Francisco.
---
UUID: 5885816b-8454-423d-9d8e-2bcc62ae1e5d
Fact: Harris held the position of Attorney General from January 3, 2011, to January 3, 2017.
Valid from: 2011-01-03 00:00:00+00:00
Valid until: 2017-01-03 00:00:00+00:00
---
UUID: eff5d784-8bdc-494a-ba38-f36912e04616
Fact: Kamala Harris was the district attorney for San Francisco prior to her current position.
---
UUID: 2d82ef48-7252-49d6-b964-2e66ab3898c2
Fact: The position of district attorney that Kamala Harris held was in San Francisco.
---
UUID: 753abd55-1c30-4315-a1c4-9c6070086e62
Fact: Gavin Newsom is currently the Governor of California.
Valid from: 2026-03-20 10:14:27.766084+00:00
---
UUID: 7fc71156-4255-4eac-aed3-ff8f20b2cf4e
Fact: Gavin Newsom previously held the position of Lieutenant Governor.
---
UUID: 5361c1a2-68f5-4a9a-9fb6-154338a3729d
Fact: Gavin Newsom was previously based in San Francisco.
---

Performing node search using _search method with standard recipe NODE_HYBRID_SEARCH_RRF:

Node Search Results:
Node UUID: c95b4158-c5bf-4556-81e2-b24fe2dfc159
Node Name: California
Content Summary: Kamala Harris served as the Attorney General of California.
Gavin Newsom is currently the Governor o...
Node Labels: Entity
Created At: 2026-03-20 10:13:42.973291+00:00
---
Node UUID: b22f03ae-986f-46ff-b1ee-d49657190735
Node Name: Gavin Newsom
Content Summary: Gavin Newsom is currently the Governor of California.
Gavin Newsom previously held the position of L...
Node Labels: Entity
Created At: 2026-03-20 10:14:34.540802+00:00
---
Node UUID: 89c7c583-5cb3-4d62-99b1-9588f78acbd4
Node Name: Governor
Content Summary: Gavin Newsom is the Governor of California, previously serving as Lieutenant Governor in San Francis...
Node Labels: Entity
Created At: 2026-03-20 10:14:34.540802+00:00
---
Node UUID: 5c34f7f4-602f-4452-ae0e-3cbef1f934cf
Node Name: Lieutenant Governor
Content Summary: Gavin Newsom previously held the position of Lieutenant Governor.
Node Labels: Entity
Created At: 2026-03-20 10:14:34.540802+00:00
---
Node UUID: fa5e67c7-dd3c-4eb9-8ceb-bb4e64564f8f
Node Name: Attorney General of California
Content Summary: Kamala Harris is the Attorney General of California.
Harris held the position of Attorney General fr...
Node Labels: Entity
Created At: 2026-03-20 08:09:49.618825+00:00
---

Connection closed
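
In the reranked list above, the Gavin Newsom facts move behind the Kamala Harris facts, which is consistent with ordering by distance from the center node. As a rough illustration of that idea (not Graphiti's implementation), the sketch below computes hop distances with a breadth-first search over a small hand-written graph and sorts facts by the distance of their source node. The graph, the fact list, and both helper functions are made up for the example.

```python
from collections import deque
from typing import Dict, List, Tuple


def hop_distances(adjacency: Dict[str, List[str]], center: str) -> Dict[str, int]:
    """Breadth-first search: number of hops from the center to each reachable node."""
    distances = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def rerank_by_distance(
    facts: List[Tuple[str, str]], adjacency: Dict[str, List[str]], center: str
) -> List[str]:
    """Sort (fact, source_node) pairs by hop distance; unreachable nodes go last."""
    distances = hop_distances(adjacency, center)
    # sorted() is stable, so facts at equal distance keep their original order.
    ranked = sorted(facts, key=lambda pair: distances.get(pair[1], float('inf')))
    return [fact for fact, _ in ranked]


# Hand-written undirected toy graph (each edge listed in both directions).
adjacency = {
    'Kamala Harris': ['California', 'San Francisco'],
    'California': ['Kamala Harris', 'Gavin Newsom'],
    'San Francisco': ['Kamala Harris', 'Gavin Newsom'],
    'Gavin Newsom': ['California', 'San Francisco'],
}

facts = [
    ('Harris served as Attorney General', 'Kamala Harris'),
    ('Newsom is Governor of California', 'Gavin Newsom'),
    ('Harris was district attorney', 'Kamala Harris'),
]

reranked = rerank_by_distance(facts, adjacency, center='Kamala Harris')
print(reranked)
```

With `Kamala Harris` as the center, both Harris facts sit at distance 0 and the Newsom fact at distance 2, so the Newsom fact drops to the end.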

(2) Results in Neo4j

[Figure: Neo4j screenshot]

[Figure: detailed view of the graph]

(3) Properties of a data node

{
    "n": {
      "identity": 0,
      "labels": [
        "Episodic"
      ],
      "properties": {
        "entity_edges": [
          "da7c9b7d-6b9f-44a1-a9ce-9a227055cb44",
          "35e9fbc6-8fda-44c4-8842-f51046a52a01"
        ],
        "group_id": "",
        "name": "Freakonomics Radio 0",
        "created_at": "2026-03-20T08:09:37.882078000Z",
        "source": "text",
        "uuid": "2006447a-2971-454a-97cb-7e0cf7a580f9",
        "content": "Kamala Harris is the Attorney General of California. She was previously the district attorney for San Francisco.",
        "source_description": "podcast transcript",
        "valid_at": "2026-03-20T08:09:37.882078000Z"
      },
      "elementId": "4:d48ecc53-79cb-424c-aa6e-0eba75e0ea5a:0"
    }
}
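
The `created_at` and `valid_at` values in the node above carry nine fractional digits (nanoseconds), while Python's `datetime` stores microseconds at most. If such exported strings need to be compared in Python, one option is to trim the fraction before parsing. The helper below is a hypothetical sketch using only the standard library; it is not part of Graphiti or the Neo4j driver, and it assumes the UTC `Z` suffix shown in the node.

```python
import re
from datetime import datetime, timezone


def parse_neo4j_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 UTC string such as '2026-03-20T08:09:37.882078000Z'.

    Fractional seconds beyond microsecond precision are truncated.
    """
    match = re.fullmatch(r'(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z', value)
    if match is None:
        raise ValueError(f'Unsupported timestamp format: {value!r}')
    base, fraction = match.groups()
    # Keep at most six digits and right-pad shorter fractions with zeros.
    micros = int((fraction or '')[:6].ljust(6, '0'))
    parsed = datetime.strptime(base, '%Y-%m-%dT%H:%M:%S')
    return parsed.replace(microsecond=micros, tzinfo=timezone.utc)


created_at = parse_neo4j_timestamp('2026-03-20T08:09:37.882078000Z')
print(created_at.isoformat())
```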

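3.2 Troubleshooting

The error below is caused by an incompatible LLM version, and the only fix I found was to switch models. In my tests, Kimi's moonshot-v1-32k runs successfully.

The error is raised at the call below. The likely cause is that the model's output cannot be mapped onto the expected schema: as the error output shows, the model returns `entity_name` where `ExtractedEntities` requires `name`.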

await graphiti.add_episode(
    name=f'Freakonomics Radio {i}',
    episode_body=episode_body_str,
    source=episode['type'],
    source_description=episode['description'],
    reference_time=datetime.now(timezone.utc),
)

Error output

pydantic_core._pydantic_core.ValidationError: 3 validation errors for ExtractedEntities
extracted_entities.0.name
  Field required [type=missing, input_value={'entity_name': 'Kamala H...s', 'entity_type_id': 0}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.12/v/missing
extracted_entities.1.name
  Field required [type=missing, input_value={'entity_name': 'Attorney...a', 'entity_type_id': 0}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.12/v/missing
extracted_entities.2.name
  Field required [type=missing, input_value={'entity_name': 'San Fran...o', 'entity_type_id': 0}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.12/v/missing
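
The validation errors above show the mismatch directly: each extracted entity arrives with an `entity_name` key, while the schema reports `name` as a missing required field. The sketch below reproduces that check in plain Python and shows one conceivable workaround, renaming the key before validation. It is an illustration only: `missing_fields` and `normalize_entity` are hypothetical helpers, nothing here is a Graphiti API, and the sample payload is made up to match the shape of the `input_value` in the error message.

```python
from typing import Any, Dict, List

REQUIRED_FIELDS = ('name', 'entity_type_id')  # field names seen in the error output


def missing_fields(entity: Dict[str, Any]) -> List[str]:
    """Return the required fields that are absent from an entity dict."""
    return [field for field in REQUIRED_FIELDS if field not in entity]


def normalize_entity(entity: Dict[str, Any]) -> Dict[str, Any]:
    """Copy the entity, renaming 'entity_name' to 'name' when 'name' is absent."""
    fixed = dict(entity)
    if 'name' not in fixed and 'entity_name' in fixed:
        fixed['name'] = fixed.pop('entity_name')
    return fixed


# Made-up payload shaped like the input_value in the error message.
raw = {'entity_name': 'Example Entity', 'entity_type_id': 0}

print(missing_fields(raw))                    # 'name' is reported as missing
print(missing_fields(normalize_entity(raw)))  # nothing is missing after renaming
```

Applying such a rename would mean intercepting the model response before Graphiti validates it, which this article does not attempt; switching to a model that follows the schema remains the fix that was actually tested here.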

Open bug report on GitHub

https://github.com/getzep/graphiti/issues/912