The ADK Tool System: Implementation from FunctionTool to AgentTool

星航夜空的帆舟

Published 2026-07-05 18:24:46

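1. Tool System Architecture Overview

The ADK tool system is the bridge between an agent and the outside world. It uses a layered design, supports several tool types, and is built to be extended.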

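2. The Tool Interface: The Base Abstraction for Tools

2.1 The Tool Interface and runnableTool

The ADK tool system follows the interface segregation principle. The Tool interface defines only a tool's metadata:

type Tool interface {
    Name() string          // returns the tool's name
    Description() string   // returns the tool's description
    IsLongRunning() bool   // reports whether the tool is long-running
}

The interface is deliberately small. A tool that can actually be executed must also implement the unexported runnableTool interface:

type runnableTool interface {
    Tool                                                  // embeds Tool and so inherits the metadata methods
    Declaration() *genai.FunctionDeclaration               // returns the function declaration the LLM reads
    Run(ctx Context, args any) (result map[string]any, err error) // executes the tool
}

A tool can additionally implement RequestProcessor to modify an LLM request before it is sent:

type RequestProcessor interface {
    ProcessRequest(ctx tool.Context, req *model.LLMRequest) error // called before the LLM request is sent
}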
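
The split between metadata and execution can be illustrated with a small standard-library sketch. The names runnable, echoTool, metaOnlyTool, and CanRun are hypothetical, and Run takes a plain map instead of a tool.Context so that the example stays self-contained; it shows the type-assertion pattern only, not ADK's actual dispatch code.

```go
package main

import "fmt"

// Tool mirrors the metadata-only interface described above.
type Tool interface {
	Name() string
	Description() string
	IsLongRunning() bool
}

// runnable is a hypothetical stand-in for the unexported runnableTool interface.
type runnable interface {
	Tool
	Run(args map[string]any) (map[string]any, error)
}

// echoTool has both metadata and a Run method.
type echoTool struct{}

func (echoTool) Name() string        { return "echo" }
func (echoTool) Description() string { return "Echoes its arguments." }
func (echoTool) IsLongRunning() bool { return false }
func (echoTool) Run(args map[string]any) (map[string]any, error) {
	return args, nil
}

// metaOnlyTool has metadata but cannot be executed.
type metaOnlyTool struct{}

func (metaOnlyTool) Name() string        { return "meta_only" }
func (metaOnlyTool) Description() string { return "Has metadata but no Run method." }
func (metaOnlyTool) IsLongRunning() bool { return false }

// CanRun reports whether t also satisfies the execution interface.
func CanRun(t Tool) bool {
	_, ok := t.(runnable)
	return ok
}

func main() {
	for _, t := range []Tool{echoTool{}, metaOnlyTool{}} {
		fmt.Printf("%s: runnable=%v\n", t.Name(), CanRun(t))
	}
}
```

Keeping Tool small means that code which only lists or describes tools never depends on how a tool executes.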

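2.2 Tool Context: The Execution Context for Tools

A running tool needs access to a rich set of context information:

type Context interface {
    agent.CallbackContext                                      // embeds the callback context interface

    FunctionCallID() string                                   // unique ID of the current function call
    Actions() *session.EventActions                           // event actions: modify state, skip summarization, and so on
    SearchMemory(context.Context, string) (*memory.SearchResponse, error) // semantic search over memory
    ToolConfirmation() *toolconfirmation.ToolConfirmation     // current human-confirmation state
    RequestConfirmation(hint string, payload any) error       // ask the user for confirmation (HITL)
}

FunctionCallID() returns the unique identifier of the current function call, which is useful when several tool calls have to be tracked. Actions() returns an EventActions object through which a tool can influence what the agent does next, for example by skipping summarization (SkipSummarization) or escalating to the parent (Escalate). SearchMemory() lets a tool run a semantic search against the memory service to retrieve context from other sessions. ToolConfirmation() reports whether the user has already answered a confirmation request, and RequestConfirmation() issues a new one; together they implement the human-in-the-loop (HITL) approval flow.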

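3. FunctionTool: Turning Go Functions into Tools

FunctionTool is the most commonly used tool type in ADK. It uses Go generics and reflection to wrap an ordinary Go function as a tool that an agent can call. The main benefit is a lower barrier to writing tools: instead of learning a complex interface or configuration format, a developer writes a standard Go function, and FunctionTool takes care of argument conversion, JSON Schema generation, and building the function declaration.

FunctionTool works in three phases. In the creation phase, the framework inspects the function's argument and result types through reflection and infers the input and output JSON Schemas. In the registration phase, the tool's function declaration is packed into the LLM request so that the LLM knows which tools are available. In the execution phase, when the LLM decides to call a tool, the framework converts the LLM-generated arguments into the types the Go function expects, runs the function, and converts the result back into a form the LLM can read.

Because this pipeline is automatic, developers can concentrate on business logic and leave the type conversion between the LLM and Go to the framework.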

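3.1 Creating a FunctionTool

New() is the entry point for creating a FunctionTool:

func New[TArgs, TResults any](cfg Config, handler Func[TArgs, TResults]) (tool.Tool, error)
// TArgs: the handler's input type (must be a struct, a map, or a pointer to one)
// TResults: the handler's result type
// cfg: the tool's configuration
// handler: the user-defined function that holds the business logic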

[Figure: the full flow for creating a FunctionTool]

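3.2 Configuration Options

The Config struct exposes the following options:

type Config struct {
    Name                        string             // tool name; the LLM calls the tool by this name
    Description                 string             // tool description; the LLM uses it to decide whether to call
    InputSchema                 *jsonschema.Schema // JSON Schema of the input (optional, inferred by default)
    OutputSchema                *jsonschema.Schema // JSON Schema of the output (optional, inferred by default)
    IsLongRunning               bool               // whether the tool is long-running; default false
    RequireConfirmation         bool               // whether human confirmation is always required; default false
    RequireConfirmationProvider any                // function that decides dynamically; default nil
}

Name and Description are required. The name is the identifier the LLM uses to call the tool, and the description is the main basis on which the LLM decides whether to call it, so a clear and accurate description matters a great deal for tool selection. InputSchema and OutputSchema constrain the structure of the input and output; they rarely need to be set by hand, because ADK infers them from the handler's argument and result types through reflection. IsLongRunning marks a long-running tool; when it is true, the framework appends a note to the tool description telling the LLM that the operation takes a long time. RequireConfirmation and RequireConfirmationProvider configure human confirmation: the former is a static flag, and the latter is a function that decides from the input arguments whether confirmation is needed.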

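3.3 Validating the Argument Type

New() first checks that the input type is acceptable:

var zeroArgs TArgs                         // zero value of TArgs
argsType := reflect.TypeOf(zeroArgs)       // its reflection type (nil if TArgs is an interface type)
for argsType != nil && argsType.Kind() == reflect.Pointer {
    argsType = argsType.Elem()             // dereference pointers to reach the element type
}
if argsType == nil || (argsType.Kind() != reflect.Struct && argsType.Kind() != reflect.Map) {
    // The input type must be a struct, a map, or a pointer to one of them.
    return nil, fmt.Errorf("input must be a struct or a map or a pointer to those types")
}

This guarantees that the input type is a struct, a map, or a pointer to one of them. A nil type, which is what reflect.TypeOf returns when TArgs is an interface such as any, is rejected.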
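
The check can be tried in isolation. The sketch below lifts it into a hypothetical helper named ValidateArgsType that uses only the standard library; weatherArgs is an illustrative type, and the helper is not part of ADK.

```go
package main

import (
	"fmt"
	"reflect"
)

// ValidateArgsType returns an error unless T is a struct, a map, or a pointer
// (of any depth) to one of those.
func ValidateArgsType[T any]() error {
	var zero T
	t := reflect.TypeOf(zero) // nil when T is an interface type such as any
	for t != nil && t.Kind() == reflect.Pointer {
		t = t.Elem()
	}
	if t == nil || (t.Kind() != reflect.Struct && t.Kind() != reflect.Map) {
		return fmt.Errorf("input must be a struct or a map or a pointer to those types, got %v", t)
	}
	return nil
}

// weatherArgs is an illustrative argument struct.
type weatherArgs struct {
	City string `json:"city"`
}

func main() {
	fmt.Println("struct:        ", ValidateArgsType[weatherArgs]())
	fmt.Println("pointer:       ", ValidateArgsType[*weatherArgs]())
	fmt.Println("map:           ", ValidateArgsType[map[string]any]())
	fmt.Println("string:        ", ValidateArgsType[string]())
	fmt.Println("any (nil type):", ValidateArgsType[any]())
}
```

Running the check when the tool is created means that an unsupported argument type is reported at startup rather than on the first call from the LLM.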

3.4 Generating the JSON Schema Automatically

A central feature of FunctionTool is generating a JSON Schema from a Go type:

func resolvedSchema[T any](override *jsonschema.Schema) (*jsonschema.Resolved, error) {
    if override != nil {
        return override.Resolve(nil)       // a user-supplied schema takes precedence
    }
    schema, err := jsonschema.For[T](nil)  // otherwise infer the schema from the generic type T
    if err != nil {
        return nil, err                    // inference failed
    }
    return schema.Resolve(nil)             // resolve the schema and return it
}

Inference reads struct tags. For example:

type WeatherArgs struct {
    City string `json:"city" jsonschema:"city name"` // json sets the field name; jsonschema sets the schema description
}

3.5 The Dynamic Confirmation Provider

RequireConfirmationProvider decides at call time whether confirmation is needed:

if cfg.RequireConfirmationProvider != nil {
    fn, ok := cfg.RequireConfirmationProvider.(func(TArgs) bool) // type assertion checks the signature
    if !ok {
        // The signature does not match.
        return nil, fmt.Errorf("error RequireConfirmationProvider must be a function with signature func(%T) bool", *new(TArgs))
    }
    confirmWrapper = fn // keep the typed provider
}

The provider must have the signature func(TArgs) bool; returning true means that confirmation is required.
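
Because the Config field is typed any, the signature can only be checked at run time. The following sketch reproduces that check with a hypothetical helper asProvider and an illustrative transferArgs type; it is a simplified model, not ADK's code.

```go
package main

import "fmt"

// asProvider converts an untyped provider into func(TArgs) bool.
// nil means "no provider"; any other signature is an error.
func asProvider[TArgs any](p any) (func(TArgs) bool, error) {
	if p == nil {
		return nil, nil
	}
	fn, ok := p.(func(TArgs) bool)
	if !ok {
		return nil, fmt.Errorf("provider must have signature func(%T) bool, got %T", *new(TArgs), p)
	}
	return fn, nil
}

// transferArgs is an illustrative argument struct.
type transferArgs struct {
	Amount int
}

// largeTransfer requires confirmation only for amounts above 1000.
func largeTransfer(a transferArgs) bool { return a.Amount > 1000 }

// wrongSignature takes a string, so it does not match func(transferArgs) bool.
func wrongSignature(s string) bool { return true }

func main() {
	fn, err := asProvider[transferArgs](largeTransfer)
	fmt.Println(err, fn(transferArgs{Amount: 5000}), fn(transferArgs{Amount: 10}))

	_, err = asProvider[transferArgs](wrongSignature)
	fmt.Println(err)
}
```

The trade-off of an any-typed field is that a mismatched signature is caught when the tool is created rather than at compile time.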

3.6 The FunctionTool Execution Flow

Run() is the core of tool execution:

func (f *functionTool[TArgs, TResults]) Run(ctx tool.Context, args any) (result map[string]any, err error) {
    // 1. Recover from panics so that a crashing handler cannot take down the whole flow.
    defer func() {
        if r := recover(); r != nil {
            err = fmt.Errorf("panic in tool %q: %v\nstack: %s", f.Name(), r, debug.Stack())
        }
    }()

    // 2. Convert the arguments from map[string]any to the generic type TArgs.
    m, ok := args.(map[string]any)
    if !ok {
        return nil, fmt.Errorf("unexpected args type, got: %T", args)
    }
    input, err := typeutil.ConvertToWithJSONSchema[map[string]any, TArgs](m, f.inputSchema)
    if err != nil {
        return nil, err // conversion failed
    }

    // 3. Confirmation check (HITL).
    if confirmation := ctx.ToolConfirmation(); confirmation != nil {
        if !confirmation.Confirmed {
            // A confirmation exists but the user rejected the call.
            return nil, fmt.Errorf("error tool %q %w", f.Name(), tool.ErrConfirmationRejected)
        }
    } else {
        requireConfirmation := f.requireConfirmation // static flag
        if f.requireConfirmationProvider != nil {
            requireConfirmation = f.requireConfirmationProvider(input) // dynamic provider
        }
        if requireConfirmation {
            // Issue the confirmation request and stop if it cannot be issued.
            if err := ctx.RequestConfirmation("Please approve...", nil); err != nil {
                return nil, err
            }
            ctx.Actions().SkipSummarization = true // skip summarization
            return nil, fmt.Errorf("error tool %q %w", f.Name(), tool.ErrConfirmationRequired)
        }
    }

    // 4. Call the user-defined handler.
    output, err := f.handler(ctx, input)
    if err != nil {
        return nil, err // the business logic failed
    }

    // 5. Convert the result from TResults to map[string]any.
    resp, err := typeutil.ConvertToWithJSONSchema[TResults, map[string]any](output, f.outputSchema)
    if err == nil {
        return resp, nil // conversion succeeded
    }

    // 6. Conversion failed (for example, a basic type): wrap the value as {"result": output}.
    wrappedOutput := map[string]any{"result": output}
    return wrappedOutput, nil
}

[Figure: the six stages of the execution flow]

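3.7 Confirmation Check Logic

The confirmation check is an important FunctionTool feature:

Existing confirmation: if the context already carries a confirmation, check whether the call was approved
Static confirmation: if RequireConfirmation = true, confirmation is needed
Dynamic confirmation: if RequireConfirmationProvider is set, call it to decide whether confirmation is needed
Request: if confirmation is needed, call ctx.RequestConfirmation() and return ErrConfirmationRequired

Note: when a confirmation is requested, SkipSummarization is set to true so that the LLM does not summarize the confirmation request.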
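
The four rules can be condensed into one decision function. The sketch below is a simplified model in which Decision, Confirmation, and decide are hypothetical names; it follows the order used in Run(), where a non-nil provider overrides the static flag.

```go
package main

import "fmt"

// Decision names the three possible outcomes of the confirmation check.
type Decision int

const (
	Proceed  Decision = iota // run the handler
	Rejected                 // the user already declined
	AskUser                  // a confirmation request must be issued
)

func (d Decision) String() string {
	switch d {
	case Proceed:
		return "proceed"
	case Rejected:
		return "rejected"
	default:
		return "ask user"
	}
}

// Confirmation stands in for the confirmation record carried by the context.
type Confirmation struct{ Confirmed bool }

// decide applies the rules in order. existing is nil when the user has not
// answered yet; provider may be nil.
func decide[T any](existing *Confirmation, static bool, provider func(T) bool, input T) Decision {
	if existing != nil {
		if existing.Confirmed {
			return Proceed
		}
		return Rejected
	}
	need := static
	if provider != nil {
		need = provider(input) // the dynamic provider overrides the static flag
	}
	if need {
		return AskUser
	}
	return Proceed
}

// overHundred is an illustrative provider.
func overHundred(n int) bool { return n > 100 }

func main() {
	fmt.Println(decide[int](nil, false, nil, 1))                     // nothing configured
	fmt.Println(decide[int](nil, true, nil, 1))                      // static flag only
	fmt.Println(decide(nil, true, overHundred, 5))                   // provider overrides the flag
	fmt.Println(decide(&Confirmation{Confirmed: false}, true, overHundred, 500))
}
```

On the second pass, after the user has answered, the existing confirmation short-circuits both the static flag and the provider.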

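3.8 Converting and Wrapping the Result

The result is converted as follows:

Try to convert TResults to map[string]any
If the conversion succeeds, return the map directly
If the conversion fails (for example, the handler returns a basic type), wrap the value as {"result": output}

This matches the behaviour of the Python version.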
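
The fallback can be sketched with encoding/json. Note the assumption: a JSON round trip stands in for typeutil.ConvertToWithJSONSchema here, and toResultMap and poemOutput are hypothetical names, so this approximates the behaviour rather than reproducing ADK's conversion.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toResultMap tries to express output as a JSON object. Values that do not
// marshal to an object (strings, numbers, slices, nil) are wrapped under "result".
func toResultMap(output any) map[string]any {
	data, err := json.Marshal(output)
	if err == nil {
		var m map[string]any
		if json.Unmarshal(data, &m) == nil && m != nil {
			return m
		}
	}
	return map[string]any{"result": output}
}

// poemOutput is an illustrative struct result.
type poemOutput struct {
	Poem string `json:"poem"`
}

func main() {
	fmt.Println(toResultMap(poemOutput{Poem: "roses are red"})) // struct: used as is
	fmt.Println(toResultMap("sunny"))                           // string: wrapped
	fmt.Println(toResultMap(42))                                // number: wrapped
}
```

Wrapping guarantees that the function response is always an object, whatever the handler returns.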

3.9 ProcessRequest and Declaration

FunctionTool implements the RequestProcessor interface and registers its declaration before each LLM request:

func (f *functionTool[TArgs, TResults]) ProcessRequest(ctx tool.Context, req *model.LLMRequest) error {
    return toolutils.PackTool(req, f) // pack the tool declaration into the LLM request
}

Declaration() builds the function declaration the LLM needs:

func (f *functionTool[TArgs, TResults]) Declaration() *genai.FunctionDeclaration {
    decl := &genai.FunctionDeclaration{
        Name:        f.Name(),        // function name
        Description: f.Description(), // function description
    }
    if f.inputSchema != nil {
        decl.ParametersJsonSchema = f.inputSchema.Schema() // JSON Schema of the input
    }
    if f.outputSchema != nil {
        decl.ResponseJsonSchema = f.outputSchema.Schema()  // JSON Schema of the output
    }

    if f.cfg.IsLongRunning {
        // Note appended for long-running tools.
        instruction := "NOTE: This is a long-running operation..."
        decl.Description += "\n\n" + instruction
    }

    return decl
}

For a long-running tool, the note is appended to the description automatically.

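4. AgentTool: Calling an Agent as a Tool

AgentTool is a special tool that lets one agent call another. It is the core mechanism for composing agents and one of the key design decisions in ADK's multi-agent architecture.

The design follows the idea that an agent is a tool: every agent can be treated as something callable, which makes cooperation between agents very flexible. The parent agent does not need to know how the child agent works internally; it calls the child through the AgentTool interface exactly as it would call an ordinary tool.

This brings three advantages. The first is separation of concerns: each agent focuses on its own domain without caring how other agents are implemented. The second is composability: with AgentTool, several specialized agents can be combined into a more capable composite agent. The third is isolation: each child agent has its own session and context and does not pollute the parent's state.

AgentTool also resolves tool type conflicts. Because of a genai API restriction, some tool types (such as the native Google Search tool and custom function tools) cannot be mixed in one agent. Placing each type of tool in its own child agent and dispatching to those agents through AgentTool works around the restriction.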

4.1 Creating an AgentTool

func New(agent agent.Agent, cfg *Config) tool.Tool
// agent: the child agent being wrapped
// cfg: AgentTool configuration

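4.2 How AgentTool Executes

When the LLM calls an AgentTool, the tool creates a new Runner to execute the child agent. The listing below is abridged: schema lookup, the run options, and the extraction of outputText are elided.

func (t *agentTool) Run(toolCtx tool.Context, args any) (map[string]any, error) {
    // 1. Validate the arguments and build the user message.
    margs, ok := args.(map[string]any)
    if !ok {
        return nil, fmt.Errorf("unexpected args type, got: %T", args)
    }
    var content *genai.Content
    if agentInputSchema != nil {
        jsonData, err := json.Marshal(margs) // serialize the arguments as JSON
        if err != nil {
            return nil, err
        }
        content = genai.NewContentFromText(string(jsonData), genai.RoleUser)
    } else {
        inputText, ok := margs["request"].(string) // default schema: a single "request" string
        if !ok {
            return nil, fmt.Errorf("request must be a string, got: %T", margs["request"])
        }
        content = genai.NewContentFromText(inputText, genai.RoleUser)
    }

    // 2. Create the services and the Runner (the child agent gets its own environment).
    sessionService := session.InMemoryService()
    r, err := runner.New(runner.Config{
        AppName:         t.agent.Name(),             // the app name is the child agent's name
        Agent:           t.agent,                    // the child agent
        SessionService:  sessionService,             // separate session service
        ArtifactService: artifact.InMemoryService(), // separate artifact service
        MemoryService:   memory.InMemoryService(),   // separate memory service
    })
    if err != nil {
        return nil, err
    }

    // 3. Inherit the parent's state, filtering out internal keys.
    stateMap := make(map[string]any)
    for k, v := range toolCtx.State().All() {
        if !strings.HasPrefix(k, "_adk") { // skip internal keys that start with _adk
            stateMap[k] = v                // copy business state only
        }
    }

    // 4. Create the child session and run the agent.
    subSession, err := sessionService.Create(toolCtx, &session.CreateRequest{
        AppName: t.agent.Name(),   // app name
        UserID:  toolCtx.UserID(), // inherit the user ID
        State:   stateMap,         // pass the filtered state
    })
    if err != nil {
        return nil, err
    }

    // Run the child agent and obtain the event stream.
    eventCh := r.Run(toolCtx, subSession.Session.UserID(), subSession.Session.ID(), content, ...)

    // 5. Collect the result: keep the last event that has content.
    var lastEvent *session.Event
    for event, err := range eventCh {
        if err != nil {
            return nil, err
        }
        if event.LLMResponse.Content != nil {
            lastEvent = event
        }
    }
    // outputText holds the text of lastEvent (extraction elided).

    // 6. Validate the result if the agent declares an output schema.
    if agentOutputSchema != nil {
        parsedOutput, err := utils.ValidateOutputSchema(outputText, agentOutputSchema)
        if err != nil {
            return nil, err
        }
        return parsedOutput, nil
    }

    return map[string]any{"result": outputText}, nil
}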

[Figure: AgentTool execution flow]

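4.3 State Inheritance

AgentTool inherits the parent agent's state but filters out ADK-internal state (keys that start with _adk):

for k, v := range toolCtx.State().All() {
    if !strings.HasPrefix(k, "_adk") { // skip internal keys that start with _adk
        stateMap[k] = v                // copy business state to the child agent
    }
}

The child agent can therefore read the parent's business state without being affected by internal implementation details.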
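
The filter is easy to test in isolation. In the sketch below, filterState is a hypothetical helper and the sample keys are made up; only the "_adk" prefix comes from the listing above.

```go
package main

import (
	"fmt"
	"strings"
)

// internalPrefix marks framework-internal state keys.
const internalPrefix = "_adk"

// filterState returns a copy of parent without internal keys. The copy is
// shallow: map and slice values are still shared with the parent.
func filterState(parent map[string]any) map[string]any {
	child := make(map[string]any, len(parent))
	for k, v := range parent {
		if strings.HasPrefix(k, internalPrefix) {
			continue
		}
		child[k] = v
	}
	return child
}

func main() {
	parent := map[string]any{
		"user_city":     "Paris", // illustrative business key
		"_adk_internal": 42,      // illustrative internal key
	}
	fmt.Println(filterState(parent))
}
```

Because the child receives a copy, adding or replacing keys in the child's state does not alter the parent's map.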

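4.4 AgentTool vs Agent Transfer

The two approaches differ fundamentally. Agent transfer hands over control: when the parent agent transfers a task to a child agent, the target agent takes over completely until it finishes or transfers again, and all agents share one session and one conversation history. AgentTool is a tool-style call: when the parent agent calls a child through AgentTool, the child runs independently with its own session and context. When the call completes, the child's result comes back to the parent as a tool response, and control stays with the parent throughout. The two modes suit different situations: agent transfer fits task delegation and division of labour, and AgentTool fits calls that need a specific result back.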

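4.5 AgentTool Input and Output Schemas

AgentTool generates its function declaration from the wrapped agent's InputSchema:

If the agent has an InputSchema: it becomes the tool's parameter schema, and the input is validated against it
If the agent has no InputSchema: the default {"request": string} schema is used

Likewise, if the agent has an OutputSchema, AgentTool validates the output against it and returns the parsed, structured data.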
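
How the two input cases turn into the text sent to the child agent can be sketched as follows. buildInput is a hypothetical helper, and the boolean flag stands in for the presence of an InputSchema; schema validation itself is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildInput returns the user message text for the child agent.
// With a schema the whole argument object is sent as JSON; without one,
// only the "request" string is used.
func buildInput(args map[string]any, hasInputSchema bool) (string, error) {
	if hasInputSchema {
		data, err := json.Marshal(args)
		if err != nil {
			return "", fmt.Errorf("marshal args: %w", err)
		}
		return string(data), nil
	}
	text, ok := args["request"].(string)
	if !ok {
		return "", fmt.Errorf("request must be a string, got %T", args["request"])
	}
	return text, nil
}

func main() {
	fmt.Println(buildInput(map[string]any{"city": "Paris"}, true))
	fmt.Println(buildInput(map[string]any{"request": "weather in Paris"}, false))
	fmt.Println(buildInput(map[string]any{"request": 7}, false))
}
```

Checking the type of "request" with the comma-ok form returns an error to the caller instead of panicking when the LLM sends an unexpected value.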

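4.6 Combining Multiple Tools with AgentTool in Practice

Background: the genai API's tool type restriction

The genai API has a restriction: Google Search (a Gemini native tool) and custom function tools cannot be mixed in the same agent. The restriction comes from the design of the Gemini API. Google Search is a server-side native tool (of type genai.Tool), whereas custom functions are declared through FunctionDeclaration. The two are declared differently and cannot coexist in a single request.

Solution: componentize with AgentTool

The idea is to put each type of tool in its own child agent and let the root agent dispatch to those agents through AgentTool.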

Create the search child agent (it contains only the native Google Search tool):

searchAgent, err := llmagent.New(llmagent.Config{
    Name:        "search_agent",                          // child agent name
    Model:       model,                                   // model to use
    Description: "Does google search.",                   // description
    Instruction: "You're a specialist in Google Search.", // system instruction
    Tools: []tool.Tool{
        geminitool.GoogleSearch{},                        // Gemini native Google Search tool
    },
})
if err != nil {
    log.Fatalf("failed to create search agent: %v", err)
}

Create the poem child agent (it contains only a custom FunctionTool):

type Input struct {
    LineCount int `json:"lineCount"` // number of lines in the poem
}
type Output struct {
    Poem string `json:"poem"`        // the generated poem
}

handler := func(ctx tool.Context, input Input) (Output, error) {
    return Output{
        Poem: strings.Repeat("A line of a poem,", input.LineCount) + "\n", // repeat one line
    }, nil
}

poemTool, err := functiontool.New(functiontool.Config{
    Name:        "poem",         // tool name
    Description: "Returns poem", // tool description
}, handler)
if err != nil {
    log.Fatalf("failed to create poem tool: %v", err)
}

poemAgent, err := llmagent.New(llmagent.Config{
    Name:        "poem_agent",          // child agent name
    Model:       model,                 // model to use
    Description: "returns poem",        // description
    Instruction: "You return poems.",   // system instruction
    Tools:       []tool.Tool{poemTool}, // only the poem tool
})
if err != nil {
    log.Fatalf("failed to create poem agent: %v", err)
}

Create the root agent (the dispatcher):

a, err := llmagent.New(llmagent.Config{
    Name:        "root_agent",                                     // root agent name
    Model:       model,                                            // model to use
    Description: "You can do a google search and generate poems.", // description
    Instruction: "Answer questions about weather based on google search unless asked for a poem," +
        " for a poem generate it with a tool.",                     // system instruction
    Tools: []tool.Tool{
        agenttool.New(searchAgent, nil), // wrap the search agent as a tool
        agenttool.New(poemAgent, nil),   // wrap the poem agent as a tool
    },
})
if err != nil {
    log.Fatalf("failed to create root agent: %v", err)
}

[Figure: execution flow, using a weather query as the example]

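What AgentTool offers in general:

Tool type isolation: different types of tools live in different child agents
Separation of responsibilities: each child agent focuses on one domain
Independent reasoning: each child agent can have its own system instruction and reasoning strategy
State isolation: each child agent has its own session and does not pollute the parent's context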

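5. MCPToolset: Model Context Protocol Integration

MCP (Model Context Protocol) is a standard protocol that lets agents call external tools. ADK implements the client side through MCPToolset, so an agent can use any tool that conforms to the MCP standard.

The value of MCP lies in standardization: it defines a common interface so that tool providers and tool consumers can evolve independently. Much like ODBC in the database world, it offers one standard way to connect to many different backend services. Tool developers implement a tool once and it works with any framework that supports MCP; agent developers can use any MCP tool without caring how it is implemented.

ADK's MCPToolset covers the MCP client side: connection management, automatic reconnection, tool discovery, and tool conversion. It works with two transports in the examples below: a local in-memory transport for development and testing, and a remote HTTP transport for external services. The usage is the same in both cases: create an MCPToolset, inject it into the agent, and the agent can use every MCP tool that is available.

5.1 MCP Protocol Overview

MCP is an open protocol initiated by Anthropic. Its goal is to let AI applications connect to external data sources and tools safely through one standardized interface.

The protocol involves three core parts:

MCP server: the side that provides tools and data sources
MCP client: the side that consumes tools (in ADK, the agent)
Transport: how the client and the server communicate

ADK's mcptoolset package hides the MCP client details, so agent developers can use remote MCP tools the same way they use local tools.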

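5.2 Two Integration Modes

ADK's MCP example shows two integration modes, selected through the AGENT_MODE environment variable:

Mode 1, local in-memory MCP (local): starts an MCP server in the same process and talks to it over an in-memory transport. It suits local tool development and testing.

Mode 2, remote HTTP MCP (github): connects over HTTP to GitHub's remote MCP server. It suits external services and requires an authentication token.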

5.3 Local In-Memory MCP Mode in Detail

First define the MCP tool function. Its signature follows the MCP SDK conventions:

// Input and output structs.
type Input struct {
    City string `json:"city" jsonschema:"city name"` // city name argument
}

type Output struct {
    // Weather summary result.
    WeatherSummary string `json:"weather_summary" jsonschema:"weather summary in the given city"`
}

// GetWeather implements the MCP tool.
// It returns three values: CallToolResult (MCP protocol metadata), Output (the business result), and an error.
func GetWeather(ctx context.Context, req *mcp.CallToolRequest, input Input) (*mcp.CallToolResult, Output, error) {
    return nil, Output{
        WeatherSummary: fmt.Sprintf("Today in %q is sunny\n", input.City), // build the weather summary
    }, nil
}

Then create the in-memory transports and start the MCP server:

func localMCPTransport(ctx context.Context) mcp.Transport {
    // Create a pair of in-memory transports, one end for the client and one for the server.
    clientTransport, serverTransport := mcp.NewInMemoryTransports()

    // Create the MCP server and register the tool.
    server := mcp.NewServer(
        &mcp.Implementation{Name: "weather_server", Version: "v1.0.0"}, // server implementation info
        nil,                                                              // optional settings
    )
    mcp.AddTool(server,
        &mcp.Tool{Name: "get_weather", Description: "returns weather in the given city"}, // tool metadata
        GetWeather, // tool handler
    )

    // Start the server on the server-side transport.
    _, err := server.Connect(ctx, serverTransport, nil)
    if err != nil {
        log.Fatal(err) // connection failed: stop the program
    }

    // Return the client-side transport for use when creating the MCPToolset.
    return clientTransport
}

5.4 Remote HTTP MCP Mode in Detail

Connecting to GitHub's remote MCP server requires a token. The example passes a GitHub Personal Access Token as a bearer token through the oauth2 package:

func githubMCPTransport(ctx context.Context) mcp.Transport {
    // Read the GitHub Personal Access Token from the environment.
    ts := oauth2.StaticTokenSource(
        &oauth2.Token{AccessToken: os.Getenv("GITHUB_PAT")}, // static token source
    )
    // Create an HTTP transport that points at the GitHub MCP endpoint.
    return &mcp.StreamableClientTransport{
        Endpoint:   "https://api.githubcopilot.com/mcp/",   // remote MCP endpoint URL
        HTTPClient: oauth2.NewClient(ctx, ts),               // HTTP client that attaches the token
    }
}

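5.5 Creating the Toolset with mcptoolset.New()

The code that creates the MCP toolset is the same for both transports:

mcpToolSet, err := mcptoolset.New(mcptoolset.Config{
    Transport: transport, // local in-memory transport or remote HTTP transport
})

Internally, mcptoolset.New() creates a struct that contains a connectionRefresher, which manages the MCP client connection.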

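5.6 connectionRefresher: Automatic Reconnection

connectionRefresher is a key part of the MCP toolset. It is responsible for:

Lazy connection: the MCP session is created the first time a tool is used, not when the toolset is initialized
Automatic reconnection: when the connection drops, it tries to establish a new session
Safe reconnection: before reconnecting, it pings to verify that the connection is really down, which avoids concurrent reconnects

type connectionRefresher struct {
    client    *mcp.Client          // MCP client
    transport mcp.Transport        // transport
    mu        sync.Mutex           // mutex guarding concurrent access
    session   *mcp.ClientSession   // current MCP session
}

The errors that trigger a reconnect are ErrConnectionClosed, ErrSessionMissing, io.ErrClosedPipe, and io.EOF.

The generic withRetry function wraps the "run, fail, reconnect, retry" logic:

func withRetry[T any](ctx context.Context, c *connectionRefresher, fn func(*mcp.ClientSession) (T, error)) (T, bool, error) {
    var zero T
    session, err := c.getSession(ctx) // get the current session
    if err != nil {
        return zero, false, err
    }
    // First attempt.
    result, err := fn(session)
    if err == nil {
        return result, false, nil
    }
    if !shouldRefreshConnection(err) {
        return zero, false, err // not a connection error: return it as is
    }
    // Refresh the connection and retry once.
    session, err = c.refreshConnection(ctx)
    if err != nil {
        return zero, false, err
    }
    result, err = fn(session)
    return result, true, err
}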
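
The retry pattern can be run without an MCP server. The sketch below is a simplified model: session, refresher, retryable, flakyCall, and brokenCall are hypothetical stand-ins, the mutex and the ping check are left out, and only two standard-library errors are treated as connection failures.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// session is a stand-in for an MCP client session.
type session struct{ id int }

// refresher hands out a session lazily and can replace it with a fresh one.
type refresher struct {
	current *session
	dials   int // how many sessions have been created
}

func (r *refresher) get() *session {
	if r.current == nil { // lazy connect on first use
		r.dials++
		r.current = &session{id: r.dials}
	}
	return r.current
}

func (r *refresher) refresh() *session {
	r.current = nil
	return r.get()
}

// retryable reports whether err looks like a dropped connection.
func retryable(err error) bool {
	return errors.Is(err, io.EOF) || errors.Is(err, io.ErrClosedPipe)
}

// withRetry runs fn once and, if the failure is retryable, once more on a
// fresh session. The bool reports whether a retry happened.
func withRetry[T any](r *refresher, fn func(*session) (T, error)) (T, bool, error) {
	result, err := fn(r.get())
	if err == nil {
		return result, false, nil
	}
	if !retryable(err) {
		var zero T
		return zero, false, err
	}
	result, err = fn(r.refresh())
	return result, true, err
}

var errBadRequest = errors.New("bad request")

// flakyCall fails with io.EOF on the first session and succeeds afterwards.
func flakyCall(s *session) (string, error) {
	if s.id == 1 {
		return "", io.EOF
	}
	return fmt.Sprintf("ok from session %d", s.id), nil
}

// brokenCall always fails with an error that a reconnect cannot fix.
func brokenCall(*session) (string, error) { return "", errBadRequest }

func main() {
	fmt.Println(withRetry(&refresher{}, flakyCall))
	fmt.Println(withRetry(&refresher{}, brokenCall))
}
```

Retrying only once keeps the worst case bounded: a server that is really down produces one extra connection attempt and then an error.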

5.7 Tool Discovery and Conversion

When the agent needs the tool list, set.Tools() is called:

func (s *set) Tools(ctx agent.ReadonlyContext) ([]tool.Tool, error) {
    mcpTools, err := s.mcpClient.ListTools(ctx) // fetch every tool the MCP server offers
    if err != nil {
        return nil, err
    }
    var adkTools []tool.Tool
    for _, mcpTool := range mcpTools {
        // Convert the MCP tool to ADK's tool.Tool.
        t, err := convertTool(mcpTool, s.mcpClient, s.requireConfirmation, s.requireConfirmationProvider)
        if err != nil {
            return nil, err
        }
        if s.toolFilter != nil && !s.toolFilter(ctx, t) {
            continue // a filter is configured and it excludes this tool
        }
        adkTools = append(adkTools, t)
    }
    return adkTools, nil
}

The conversion from an MCP tool to an ADK tool:

func convertTool(t *mcp.Tool, client MCPClient, requireConfirmation bool, requireConfirmationProvider tool.ConfirmationProvider) (tool.Tool, error) {
    // The local variable is named mt so that it does not shadow the mcp package.
    mt := &mcpTool{
        name:        t.Name,                           // tool name
        description: t.Description,                    // tool description
        funcDeclaration: &genai.FunctionDeclaration{
            Name:        t.Name,                       // declaration name
            Description: t.Description,                // declaration description
        },
        mcpClient: client,                             // MCP client reference
        requireConfirmation: requireConfirmation,      // static confirmation flag
        requireConfirmationProvider: requireConfirmationProvider, // dynamic confirmation provider
    }

    // Beware of nil pointers wrapped in interfaces: InputSchema is a pointer,
    // while the JsonSchema fields of the declaration are interface-typed.
    // Assigning a nil pointer directly yields a non-nil interface that wraps nil,
    // which defeats omitempty. So assign only when the pointer is non-nil.
    if t.InputSchema != nil {
        mt.funcDeclaration.ParametersJsonSchema = t.InputSchema
    }
    if t.OutputSchema != nil {
        mt.funcDeclaration.ResponseJsonSchema = t.OutputSchema
    }

    return mt, nil
}
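
The nil-pointer pitfall mentioned in the comment can be reproduced with encoding/json alone. In this sketch, schema, declaration, Guarded, and Unguarded are hypothetical names; the declaration struct only imitates an interface-typed field tagged omitempty.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// schema is a stand-in for a JSON Schema type.
type schema struct {
	Type string `json:"type"`
}

// declaration imitates a struct with an interface-typed, omitempty field.
type declaration struct {
	Name   string `json:"name"`
	Schema any    `json:"schema,omitempty"`
}

// encode marshals d and returns the JSON text.
func encode(d declaration) string {
	b, err := json.Marshal(d)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

// Unguarded assigns the pointer directly, even when it is nil.
func Unguarded(name string, s *schema) declaration {
	return declaration{Name: name, Schema: s}
}

// Guarded assigns only a non-nil pointer, so the interface stays truly nil.
func Guarded(name string, s *schema) declaration {
	d := declaration{Name: name}
	if s != nil {
		d.Schema = s
	}
	return d
}

func main() {
	fmt.Println(encode(Unguarded("get_weather", nil))) // the field is emitted as null
	fmt.Println(encode(Guarded("get_weather", nil)))   // the field is omitted
	fmt.Println(Unguarded("get_weather", nil).Schema == nil)
}
```

An interface value holding a typed nil pointer is not equal to nil, so omitempty does not consider it empty; this is why the conversion code checks the pointer before assigning.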

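5.8 Executing an MCP Tool

After the confirmation check, the MCP tool's Run method calls the remote tool through the MCP client. The listing is abridged: the confirmation check and the collection of errMsg and textResponse are elided.

func (t *mcpTool) Run(ctx tool.Context, args any) (map[string]any, error) {
    // Confirmation check...

    // Call the MCP tool.
    res, err := t.mcpClient.CallTool(ctx, &mcp.CallToolParams{
        Name:      t.name, // tool name
        Arguments: args,   // tool arguments
    })
    if err != nil {
        return nil, err
    }

    // Handle an error response.
    if res.IsError {
        // Collect the error text into errMsg...
        return nil, errors.New(errMsg)
    }

    // Prefer structured content.
    if res.StructuredContent != nil {
        return map[string]any{"output": res.StructuredContent}, nil
    }

    // Otherwise return the text content.
    return map[string]any{"output": textResponse.String()}, nil
}

An MCP tool can respond in two formats:

Structured content (StructuredContent): a JSON object, returned as is
Text content (Content): text fragments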

5.9 Complete Usage Example

func main() {
    // Create a cancellable context that listens for the interrupt signal.
    ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt)
    defer stop() // release the signal handler on exit

    model, err := gemini.NewModel(ctx, "gemini-2.5-flash", &genai.ClientConfig{
        APIKey: os.Getenv("GOOGLE_API_KEY"), // read the API key from the environment
    })
    if err != nil {
        log.Fatalf("failed to create model: %v", err)
    }

    // Choose the transport.
    var transport mcp.Transport
    if strings.ToLower(os.Getenv("AGENT_MODE")) == "github" {
        transport = githubMCPTransport(ctx) // remote HTTP mode
    } else {
        transport = localMCPTransport(ctx)  // local in-memory mode
    }

    // Create the MCP toolset.
    mcpToolSet, err := mcptoolset.New(mcptoolset.Config{
        Transport: transport,
    })
    if err != nil {
        log.Fatalf("failed to create MCP toolset: %v", err)
    }

    // Create the agent and inject the MCP toolset.
    a, err := llmagent.New(llmagent.Config{
        Name:        "helper_agent",                                                    // agent name
        Model:       model,                                                             // model to use
        Description: "Helper agent.",                                                   // description
        Instruction: "You are a helpful assistant that helps users with various tasks.", // system instruction
        Toolsets:    []tool.Toolset{mcpToolSet},                                        // inject the MCP toolset
    })
    if err != nil {
        log.Fatalf("failed to create agent: %v", err)
    }

    // Launch.
    config := &launcher.Config{AgentLoader: agent.NewSingleLoader(a)}
    l := full.NewLauncher()
    if err := l.Execute(ctx, config, os.Args[1:]); err != nil {
        log.Fatalf("run failed: %v", err)
    }
}

[Figure: MCP integration architecture]

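6. SkillToolset: The Skill System

SkillToolset is a special toolset that lets an agent load and use "skills" from the file system. A skill is a body of expert knowledge packaged as a folder, which the agent loads when it needs it.

6.1 Structure of a Skill

A skill is a folder with the following contents:

SKILL.md (required): the main instruction file, containing the skill's metadata and detailed instructions
references/ (optional): additional documents or examples
assets/ (optional): templates, scripts, and other resources
scripts/ (optional): executable scripts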
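
The layout rules can be expressed as a small check. The sketch below uses io/fs with an in-memory tree from testing/fstest; InspectSkill, SkillLayout, and exampleFS are hypothetical names, and the check is an illustration of the folder rules rather than the validation ADK performs.

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// optionalDirs lists the optional sub-folders of a skill.
var optionalDirs = []string{"references", "assets", "scripts"}

// SkillLayout summarises what was found in one skill folder.
type SkillLayout struct {
	Name     string
	Optional []string // optional sub-folders that are present
}

// InspectSkill checks that dir contains SKILL.md and reports which optional
// sub-folders exist.
func InspectSkill(fsys fs.FS, dir string) (SkillLayout, error) {
	info, err := fs.Stat(fsys, dir+"/SKILL.md")
	if err != nil || info.IsDir() {
		return SkillLayout{}, fmt.Errorf("skill %q: SKILL.md is required", dir)
	}
	layout := SkillLayout{Name: dir}
	for _, sub := range optionalDirs {
		if st, err := fs.Stat(fsys, dir+"/"+sub); err == nil && st.IsDir() {
			layout.Optional = append(layout.Optional, sub)
		}
	}
	return layout, nil
}

// exampleFS is an in-memory tree with one valid and one invalid skill.
var exampleFS = fstest.MapFS{
	"pdf/SKILL.md":            {Data: []byte("# PDF skill")},
	"pdf/references/guide.md": {Data: []byte("guide")},
	"pdf/scripts/run.sh":      {Data: []byte("#!/bin/sh")},
	"broken/notes.txt":        {Data: []byte("no SKILL.md here")},
}

func main() {
	fmt.Println(InspectSkill(exampleFS, "pdf"))
	fmt.Println(InspectSkill(exampleFS, "broken"))
}
```

Writing the check against fs.FS lets the same function run on a real directory through os.DirFS.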

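6.2 Creating a SkillToolset and Its Three Core Tools

func New(ctx context.Context, cfg Config) (*SkillToolset, error) {
    // Abridged: error handling and the instruction text are elided.
    listTool, _ := skilltool.ListSkills(cfg.Source)                // tool that lists skills
    loadTool, _ := skilltool.LoadSkill(cfg.Source)                 // tool that loads a skill
    loadResourceTool, _ := skilltool.LoadSkillResource(cfg.Source) // tool that loads a resource

    return &SkillToolset{
        tools: []tool.Tool{listTool, loadTool, loadResourceTool}, // the three core tools
        source: cfg.Source,                                        // skill source
        systemInstruction: instruction,                            // system instruction text
    }, nil
}

The list_skills tool lists every available skill so that the LLM knows what expertise it can draw on. The load_skill tool loads the full instructions of a given skill, that is, the whole content of its SKILL.md. The load_skill_resource tool loads other files in the skill folder, such as reference documents, templates, or scripts. Together the three tools cover discovery, loading, and resource access.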

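6.3 Injecting the System Instruction

SkillToolset injects a system instruction into every LLM request:

func (ts *SkillToolset) ProcessRequest(ctx tool.Context, req *model.LLMRequest) error {
    skills, err := ts.source.ListFrontmatters(ctx) // list the frontmatter of every skill
    if err != nil || len(skills) == 0 {
        return nil // no skills, or listing failed: skip
    }
    // Inject the system instruction and the skill list into the request.
    utils.AppendInstructions(req, ts.systemInstruction, skilltool.SkillsToXML(skills))
    return nil
}

The system instruction tells the LLM:

If a skill is relevant to the current query, it must first read the full instructions with load_skill
It must follow those instructions strictly
It should use load_skill_resource to access files in the skill folder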
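6.4 Workflow and Usage

[Figure: SkillToolset workflow]

Usage:

import (
    "google.golang.org/adk/tool/skilltoolset"
    "google.golang.org/adk/tool/skilltoolset/skill"
)

// Load skills from a local directory.
source := skill.NewFileSystemSource("./skills") // file-system skill source

skillToolset, err := skilltoolset.New(ctx, skilltoolset.Config{
    Source: source,     // skill source
    Name:   "MySkills", // toolset name
})
if err != nil {
    log.Fatalf("failed to create skill toolset: %v", err)
}

// llmagent.New returns an agent and an error. Name, Model, and the other fields are omitted here.
a, err := llmagent.New(llmagent.Config{
    Toolsets: []tool.Toolset{skillToolset}, // inject the skill toolset into the agent
})
if err != nil {
    log.Fatalf("failed to create agent: %v", err)
}

Advantages of the skill mechanism:

Modular knowledge: expertise is packaged as independent skills that are easy to manage
Load on demand: a skill is loaded only when it is needed, which saves context
Reusable: the same skill can be used by different agents
Extensible: adding a skill means adding a folder, with no code changes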
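7. Special Tools

ADK ships several special-purpose tools, each with its own role.

7.1 exitlooptool: The Switch That Ends a Loop

exitlooptool is how a LoopAgent leaves its loop. It wraps a simple Go function with functiontool.New:

func exitLoop(ctx tool.Context, myArgs struct{}) (map[string]string, error) {
    ctx.Actions().Escalate = true          // tell the Runner to leave the current loop
    ctx.Actions().SkipSummarization = true // skip the summary and exit directly
    return map[string]string{}, nil        // empty result
}

func New() (tool.Tool, error) {
    exitLoopTool, err := functiontool.New(functiontool.Config{
        Name:        "exit_loop", // tool name
        // The description carries a constraint that keeps the LLM from exiting at will.
        Description: "Exits the loop.\n\nCall this function only when you are instructed to do so.\n",
    }, exitLoop)
    if err != nil {
        return nil, fmt.Errorf("error creating exit loop tool: %w", err)
    }
    return exitLoopTool, nil
}

Key design points:

Two action flags: Escalate = true tells the Runner to bubble the exit signal upward; SkipSummarization = true avoids generating an unnecessary summary
functiontool wrapper: the tool is created with functiontool.New, which handles JSON Schema generation, argument conversion, and the confirmation check
Empty argument struct: myArgs struct{} means the tool takes no arguments
Constraint in the description: "Call this function only when you are instructed to do so" keeps the LLM from exiting when it has not been told to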
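
How a loop might react to the Escalate flag can be modelled in a few lines. This is a sketch under stated assumptions, not LoopAgent's implementation: actions, step, runLoop, and exitAfter are hypothetical names, and a real loop body would run agents rather than a callback.

```go
package main

import "fmt"

// actions stands in for the two EventActions flags used by exit_loop.
type actions struct {
	Escalate          bool
	SkipSummarization bool
}

// step is one iteration of the loop body; it may set flags on a.
type step func(iteration int, a *actions)

// runLoop repeats body until it escalates or maxIterations is reached and
// returns the number of iterations that ran.
func runLoop(maxIterations int, body step) int {
	for i := 1; i <= maxIterations; i++ {
		var a actions
		body(i, &a)
		if a.Escalate {
			return i
		}
	}
	return maxIterations
}

// exitAfter returns a body that behaves like a call to exit_loop on iteration n.
func exitAfter(n int) step {
	return func(iteration int, a *actions) {
		if iteration == n {
			a.Escalate = true
			a.SkipSummarization = true
		}
	}
}

func main() {
	fmt.Println(runLoop(10, exitAfter(3))) // stops early
	fmt.Println(runLoop(2, exitAfter(5)))  // the iteration cap applies first
}
```

The iteration cap matters because the exit depends on the LLM deciding to call the tool; without a cap, a model that never calls exit_loop would loop forever.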

7.2 preloadmemorytool: Automatic Memory Injection Before a Request

preloadmemorytool is an "invisible" tool: it implements the RequestProcessor interface, so its ProcessRequest method runs automatically before every LLM request, but the LLM never calls it directly. The core logic is in ProcessRequest:

func (t *preloadMemoryTool) ProcessRequest(ctx tool.Context, req *model.LLMRequest) error {
    // 1. Get the text of the user's current query.
    userContent := ctx.UserContent()
    if userContent == nil || len(userContent.Parts) == 0 ||
        userContent.Parts[0] == nil || userContent.Parts[0].Text == "" {
        return nil // no user text: skip
    }
    userQuery := userContent.Parts[0].Text

    // 2. Search memory with the user's query.
    searchResponse, err := ctx.SearchMemory(ctx, userQuery)
    if err != nil {
        return fmt.Errorf("preload memory search failed: %w", err)
    }

    // 3. No memories: skip.
    if searchResponse == nil || len(searchResponse.Memories) == 0 {
        return nil
    }

    // 4. Format the memories and inject them into the system instruction.
    memoryText := formatMemories(searchResponse.Memories)
    if memoryText == "" {
        return nil // nothing left after formatting: skip
    }
    utils.AppendInstructions(req, fmt.Sprintf(preloadInstructions, memoryText))
    return nil
}

The memories are injected into the system instruction in this format:

const preloadInstructions = `The following content is from your previous conversations with the user.
They may be useful for answering the user's current query.
<PAST_CONVERSATIONS>
%s
</PAST_CONVERSATIONS>`
// %s is replaced with the formatted memory content

formatMemories formats the memory entries as text with a timestamp and an author:

func formatMemories(memories []memory.Entry) string {
    var lines []string
    for _, mem := range memories {
        memText := extractText(mem) // extract the memory text
        if memText == "" {
            continue // skip empty text
        }
        if !mem.Timestamp.IsZero() {
            // Add a timestamp line.
            lines = append(lines, fmt.Sprintf("Time: %s", mem.Timestamp.Format(time.RFC3339)))
        }
        if mem.Author != "" {
            memText = fmt.Sprintf("%s: %s", mem.Author, memText) // prefix with the author
        }
        lines = append(lines, memText) // add the memory content line
    }
    return strings.Join(lines, "\n") // join with newlines
}

7.3 loadmemorytool: Memory Search Called by the LLM

Unlike preloadmemorytool, loadmemorytool is a tool the LLM can call on its own initiative. It implements both the ProcessRequest and the Run methods:

type loadMemoryTool struct {
    name        string // tool name
    description string // tool description
}

// Declaration returns the tool's FunctionDeclaration.
func (t *loadMemoryTool) Declaration() *genai.FunctionDeclaration {
    return &genai.FunctionDeclaration{
        Name:        t.name,           // "load_memory"
        Description: t.description,    // "Loads the memory for the current user."
        Parameters: &genai.Schema{
            Type: "OBJECT",            // the parameters form an object
            Properties: map[string]*genai.Schema{
                "query": {
                    Type:        "STRING",                          // the query is a string
                    Description: "The query to search memory for.", // parameter description
                },
            },
            Required: []string{"query"}, // query is required
        },
    }
}

// Run performs the memory search.
func (t *loadMemoryTool) Run(toolCtx tool.Context, args any) (map[string]any, error) {
    m, ok := args.(map[string]any)
    if !ok {
        return nil, fmt.Errorf("unexpected args type, got: %T", args)
    }
    query, ok := m["query"].(string)
    if !ok {
        return nil, fmt.Errorf("query must be a string, got: %T", m["query"])
    }
    searchResponse, err := toolCtx.SearchMemory(toolCtx, query)
    if err != nil {
        return nil, fmt.Errorf("failed to search memory: %w", err)
    }
    if searchResponse == nil || searchResponse.Memories == nil {
        return map[string]any{"memories": []memory.Entry{}}, nil // no results: return an empty list
    }
    return map[string]any{"memories": searchResponse.Memories}, nil
}

// ProcessRequest packs the tool declaration and appends an instruction.
func (t *loadMemoryTool) ProcessRequest(ctx tool.Context, req *model.LLMRequest) error {
    if err := toolutils.PackTool(req, t); err != nil {
        return err // packing the declaration failed
    }
    // Tell the LLM how to use memory.
    utils.AppendInstructions(req, `You have memory. You can use it to answer questions.
If any questions need you to look up the memory, you should call load_memory function with a query.`)
    return nil
}

The two tools differ in how they are triggered and where they fit. preloadmemorytool is triggered automatically: it runs before every LLM request and performs the memory search and injection in the background through ProcessRequest. It exposes no function declaration to the LLM, and the search results go straight into the system instruction, which suits cases where every request needs context. loadmemorytool is called by the LLM: it has a full function declaration (load_memory), and the LLM can call it during reasoning whenever it needs to. The search results come back to the LLM as a function response, which suits on-demand lookups of specific memories. The two can be combined: preloadmemorytool supplies baseline context, and loadmemorytool adds flexible retrieval on demand.

7.4 loadartifactstool: Loading Artifacts in Parallel

loadartifactstool is the most complex of the special tools. Its ProcessRequest method performs three steps:

func (t *artifactsTool) ProcessRequest(ctx tool.Context, req *model.LLMRequest) error {
    // Step 1: pack the tool declaration into the request.
    if err := toolutils.PackTool(req, t); err != nil {
        return err
    }
    // Step 2: list the available artifacts and inject an instruction.
    if err := t.appendInitialInstructions(ctx, req); err != nil {
        return err
    }
    // Step 3: handle a load_artifacts function call that has already happened.
    return t.processLoadArtifactsFunctionCall(ctx, req)
}

Step 1: pack the tool declaration

func (t *artifactsTool) Declaration() *genai.FunctionDeclaration {
    return &genai.FunctionDeclaration{
        Name:        "load_artifacts",                                    // tool name
        Description: "Loads the artifacts and adds them to the session.", // tool description
        Parameters: &genai.Schema{
            Type: "OBJECT",                                  // the parameters form an object
            Properties: map[string]*genai.Schema{
                "artifact_names": {
                    Type:  "ARRAY",                          // array of artifact names
                    Items: &genai.Schema{Type: "STRING"},    // each element is a string
                },
            },
        },
    }
}

Step 2: inject the artifact list instruction

func (t *artifactsTool) appendInitialInstructions(ctx tool.Context, req *model.LLMRequest) error {
    resp, err := ctx.Artifacts().List(ctx) // list every available artifact
    if err != nil {
        return fmt.Errorf("failed to list artifacts: %w", err)
    }
    if len(resp.FileNames) == 0 {
        return nil // no artifacts: inject nothing
    }
    artifactNamesJSON, err := json.Marshal(resp.FileNames) // serialize the artifact names
    if err != nil {
        return fmt.Errorf("failed to marshal artifact names: %w", err)
    }
    instructions := fmt.Sprintf(
        "You have a list of artifacts:\n  %s\n\nWhen the user asks questions about"+
            " any of the artifacts, you should call the `load_artifacts` function"+
            " to load the artifact. Do not generate any text other than the"+
            " function call. Whenever you are asked about artifacts, you"+
            " should first load it. You must always load an artifact to access its"+
            " content, even if it has been loaded before.",
        string(artifactNamesJSON))
    utils.AppendInstructions(req, instructions) // inject the instruction
    return nil
}

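Step 3: load artifacts in parallel (core logic)

func (t *artifactsTool) processLoadArtifactsFunctionCall(ctx tool.Context, req *model.LLMRequest) error {
    // Only act when the last content is a load_artifacts function response.
    if len(req.Contents) == 0 {
        return nil
    }
    lastContent := req.Contents[len(req.Contents)-1]
    if len(lastContent.Parts) == 0 {
        return nil
    }
    functionResponse := lastContent.Parts[0].FunctionResponse
    if functionResponse == nil || functionResponse.Name != "load_artifacts" {
        return nil // not a load_artifacts response, nothing to do
    }

    // Collect the artifact names without risking a panic on the type assertion.
    var artifactNames []string
    switch names := functionResponse.Response["artifact_names"].(type) {
    case []string:
        artifactNames = names
    case []any: // JSON-decoded arrays arrive as []any
        for _, raw := range names {
            if name, ok := raw.(string); ok {
                artifactNames = append(artifactNames, name)
            }
        }
    }

    // Load all artifacts in parallel with errgroup; each goroutine writes only its own slot.
    artifactsService := ctx.Artifacts()
    results := make([]*genai.Content, len(artifactNames))
    group, childCtx := errgroup.WithContext(ctx)
    for i, artifactName := range artifactNames {
        group.Go(func() error { // relies on Go 1.22+ per-iteration loop variables
            content, err := t.loadIndividualArtifact(childCtx, artifactsService, artifactName)
            if err != nil {
                return fmt.Errorf("failed to load artifact %s: %w", artifactName, err)
            }
            results[i] = content
            return nil
        })
    }

    // Wait for every goroutine; the first error, if any, is returned.
    if err := group.Wait(); err != nil {
        return err
    }

    // Append the loaded artifact contents to the request.
    req.Contents = append(req.Contents, results...)
    return nil
}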
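The index-per-goroutine pattern used in step 3 can be tried in isolation. The following is a minimal sketch that uses only the standard library: sync.WaitGroup stands in for errgroup, and loadOne and loadAll are hypothetical names that are not part of ADK.

```go
package main

import (
	"fmt"
	"sync"
)

// loadOne is a hypothetical stand-in for loading a single artifact.
func loadOne(name string) (string, error) {
	if name == "" {
		return "", fmt.Errorf("empty artifact name")
	}
	return "content of " + name, nil
}

// loadAll loads every name concurrently and returns the results in input order.
// Each goroutine writes only to its own index, so the slices need no mutex.
func loadAll(names []string) ([]string, error) {
	results := make([]string, len(names))
	errs := make([]error, len(names))
	var wg sync.WaitGroup
	for i, name := range names {
		wg.Add(1)
		go func(i int, name string) {
			defer wg.Done()
			results[i], errs[i] = loadOne(name)
		}(i, name)
	}
	wg.Wait()
	// Report the first failure in input order, wrapped with the artifact name.
	for i, err := range errs {
		if err != nil {
			return nil, fmt.Errorf("failed to load artifact %q: %w", names[i], err)
		}
	}
	return results, nil
}

func main() {
	out, err := loadAll([]string{"a.txt", "b.png"})
	fmt.Println(out, err)
}
```

Unlike errgroup.WithContext, this sketch does not cancel the remaining loads once one of them fails; it only shows why the results keep their order.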

Each artifact is loaded as a user message:

func (t *artifactsTool) loadIndividualArtifact(ctx context.Context, artifactsService agent.Artifacts, artifactName string) (*genai.Content, error) {
    resp, err := artifactsService.Load(ctx, artifactName)    // load the artifact
    if err != nil {
        return nil, err // the caller wraps this with the artifact name
    }
    return &genai.Content{
        Parts: []*genai.Part{
            genai.NewPartFromText("Artifact " + artifactName + " is:"), // label with the artifact name
            resp.Part,                                                   // the artifact's actual content
        },
        Role: genai.RoleUser, // added with the user role
    }, nil
}

7.5 exampletool: few-shot example injection

exampletool is a tool the LLM never calls directly. Through the ProcessRequest interface it injects few-shot examples before every LLM request:

type Example struct {
    Input  *genai.Content   `json:"input"`   // example user input
    Output []*genai.Content `json:"output"`  // expected model output
}

type exampleTool struct {
    examples []*Example // list of examples
}

func (s exampleTool) ProcessRequest(ctx tool.Context, req *model.LLMRequest) error {
    parts := ctx.UserContent().Parts               // parts of the user message
    if len(parts) == 0 || parts[0].Text == "" {
        return nil // no user text, skip
    }
    instruction := buildExamplesSystemInstruction(s.examples, req.Model) // build the examples instruction
    utils.AppendInstructions(req, instruction)                           // inject the instruction
    return nil
}

buildExamplesSystemInstruction renders the examples as a structured system instruction in the following format:

<EXAMPLES>
Begin few-shot
The following are examples of user queries and model responses using the available tools.

EXAMPLE 1:
Begin example
[user]
What's the weather like in Beijing today?
[model]
```tool_code
get_weather(city='Beijing')
```
```tool_outputs
{"weather_summary": "Today in Beijing is sunny"}
```
[model]
It is sunny in Beijing today.
End example
End few-shot
<EXAMPLES>

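Key design details:

Gemini 2 adaptation: the builder checks whether the model name contains "gemini-2" and uses the tool_code/tool_outputs format for the Gemini 2 family
Tag protection: strings.ReplaceAll(part.Text, "End few-shot", "[PROTECTED]") keeps a substring inside example content from terminating the block early
Role merging: consecutive contents with the same role do not repeat the role prefix, which keeps the examples compact and natural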
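Role merging and tag protection can be sketched in a few lines. The code below is an illustrative reduction, not the ADK implementation: the turn type and the buildExamples function are hypothetical, content is reduced to plain text, and the Gemini 2 tool_code/tool_outputs branch is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// turn is a hypothetical, simplified stand-in for one piece of example content.
type turn struct {
	Role string
	Text string
}

const endMarker = "End few-shot"

// buildExamples renders the examples between the opening and closing markers.
func buildExamples(examples [][]turn) string {
	var b strings.Builder
	b.WriteString("<EXAMPLES>\nBegin few-shot\n")
	for i, ex := range examples {
		fmt.Fprintf(&b, "EXAMPLE %d:\nBegin example\n", i+1)
		prevRole := ""
		for _, t := range ex {
			// Role merging: print the role prefix only when the role changes.
			if t.Role != prevRole {
				fmt.Fprintf(&b, "[%s]\n", t.Role)
				prevRole = t.Role
			}
			// Tag protection: example text must not contain the closing marker.
			b.WriteString(strings.ReplaceAll(t.Text, endMarker, "[PROTECTED]"))
			b.WriteString("\n")
		}
		b.WriteString("End example\n")
	}
	b.WriteString(endMarker + "\n<EXAMPLES>")
	return b.String()
}

func main() {
	fmt.Println(buildExamples([][]turn{{
		{Role: "user", Text: "hi"},
		{Role: "model", Text: "a"},
		{Role: "model", Text: "say End few-shot"},
	}}))
}
```

In the rendered text the two consecutive model turns share a single [model] prefix, and the closing marker appears only once, at the very end.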

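8. Human confirmation (HITL)

Some tools can have irreversible consequences, such as deleting files, sending email, or executing payments. For operations like these, you want a human user to confirm before the tool actually runs. This is the HITL (Human-in-the-Loop) confirmation mechanism.

8.1 Confirmation flow

8.2 Two ways to trigger confirmation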

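Option 1: static confirmation flag

Set RequireConfirmation to true directly in the FunctionTool configuration, so that every call requires confirmation:

deleteTool, err := functiontool.New(functiontool.Config{
    Name:                "delete_file",               // tool name
    Description:         "Delete the specified file", // tool description
    RequireConfirmation: true,                        // every call requires confirmation
}, deleteFileHandler)                                 // business handler
if err != nil {
    log.Fatalf("failed to create tool: %v", err)
}

Option 2: dynamic confirmation provider

Supply a function through RequireConfirmationProvider to decide at run time, based on the arguments, whether confirmation is needed:

transferTool, err := functiontool.New(functiontool.Config{
    Name:        "transfer_money",              // tool name
    Description: "Transfer money",              // tool description
    RequireConfirmationProvider: func(args TransferArgs) bool {
        // Only amounts above 1000 require confirmation.
        return args.Amount > 1000
    },
}, transferHandler)                             // business handler
if err != nil {
    log.Fatalf("failed to create tool: %v", err)
}

The dynamic provider takes precedence over the static flag. If both are set, the provider's return value wins.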
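The precedence rule can be written down as a small decision function. This is a sketch under stated assumptions: confirmConfig and needsConfirmation are hypothetical names that mirror only the two settings discussed here, and TransferArgs is reduced to a single Amount field.

```go
package main

import "fmt"

// TransferArgs is a reduced version of the tool arguments used in the text.
type TransferArgs struct {
	Amount float64
}

// confirmConfig is a hypothetical reduction of the two confirmation settings.
type confirmConfig struct {
	RequireConfirmation bool
	Provider            func(TransferArgs) bool
}

// needsConfirmation applies the precedence rule: when a provider is set,
// its answer wins; otherwise the static flag decides.
func needsConfirmation(cfg confirmConfig, args TransferArgs) bool {
	if cfg.Provider != nil {
		return cfg.Provider(args)
	}
	return cfg.RequireConfirmation
}

func main() {
	cfg := confirmConfig{
		RequireConfirmation: true, // overridden by the provider below
		Provider:            func(a TransferArgs) bool { return a.Amount > 1000 },
	}
	fmt.Println(needsConfirmation(cfg, TransferArgs{Amount: 500}))
	fmt.Println(needsConfirmation(cfg, TransferArgs{Amount: 1500}))
}
```

One consequence worth noting: with both settings present, a small transfer skips confirmation even though the static flag is true.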

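8.3 Manual confirmation control (worked example: vacation request approval)

Manual confirmation control offers the most flexibility. The vacation approval example below walks through the full lifecycle.

Core data structures:

// Input arguments of the tool function
type RequestVacationArgs struct {
    Days   int    `json:"days"`               // number of days requested
    UserID string `json:"user_id"`            // user ID
}

// Custom payload carried in the confirmation request
type ConfirmationPayload struct {
    DaysApproved int `json:"days_approved"`   // number of days approved
}

// Result returned by the tool function
type RequestVacationResults struct {
    Status       string `json:"status"`        // approval status
    DaysApproved int    `json:"days_approved"` // number of days approved
    RequestID    string `json:"request_id"`    // request ID
}

// Vacation request record used for internal bookkeeping
type VacationRequest struct {
    ID           string                              // request ID
    UserID       string                              // user ID
    Days         int                                 // number of days requested
    Status       string                              // status: PENDING, APPROVED, REJECTED
    CallID       string                              // function call ID
    DaysApproved int                                 // number of days approved
    Confirmation *toolconfirmation.ToolConfirmation   // confirmation data
}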

The requestVacationDays tool function:

func requestVacationDays(ctx tool.Context, args RequestVacationArgs) (*RequestVacationResults, error) {
    // Validate the arguments.
    if args.Days <= 0 {
        return nil, fmt.Errorf("invalid days to request %d", args.Days)
    }

    // Step 1: check whether a confirmation already exists.
    confirmation := ctx.ToolConfirmation()
    if confirmation == nil {
        // No confirmation yet: this is the first call, so request one.
        requestID := fmt.Sprintf("req-%d", requestCounter) // generate a request ID
        requestCounter++

        // Record the pending request.
        req := &VacationRequest{
            ID:     requestID,
            UserID: args.UserID,
            Days:   args.Days,
            Status: "PENDING",
        }
        requestsByReqID[requestID] = req                    // index by request ID
        requestsByCallID[ctx.FunctionCallID()] = req        // index by call ID

        // Request confirmation and attach the custom payload.
        err := ctx.RequestConfirmation(
            "Please approve or reject the tool call request_time_off()...", // hint
            ConfirmationPayload{DaysApproved: 0},                           // initial payload
        )
        if err != nil {
            return nil, fmt.Errorf("failed to request confirmation: %w", err)
        }

        return &RequestVacationResults{
            Status:    "Manager approval is required.",    // waiting for approval
            RequestID: requestID,
        }, nil
    }

    // Step 2: a confirmation exists, so apply the user's decision.
    req, ok := requestsByCallID[ctx.FunctionCallID()]     // look up the request by call ID
    if !ok {
        return nil, fmt.Errorf("no pending request for call %s", ctx.FunctionCallID())
    }
    req.Confirmation = confirmation                        // keep the confirmation data

    if !confirmation.Confirmed {
        // The user rejected the request.
        req.Status = "REJECTED"
        return &RequestVacationResults{
            Status:    "The time off request is rejected.",
            RequestID: req.ID,
        }, nil
    }

    // The user approved: decode the untyped payload through a JSON round trip.
    var payload ConfirmationPayload
    jsonBytes, err := json.Marshal(confirmation.Payload)
    if err != nil {
        return nil, fmt.Errorf("failed to encode payload: %w", err)
    }
    if err := json.Unmarshal(jsonBytes, &payload); err != nil {
        return nil, fmt.Errorf("failed to decode payload: %w", err)
    }
    approvedDays := min(payload.DaysApproved, args.Days)   // never approve more than requested
    req.Status = "APPROVED"
    req.DaysApproved = approvedDays                        // record the days actually approved
    return &RequestVacationResults{
        Status:       "The time off request is accepted.",
        DaysApproved: approvedDays,
        RequestID:    req.ID,
    }, nil
}

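This function shows the full lifecycle of the confirmation flow:

First call (confirmation == nil): the tool finds no confirmation, creates a pending request, and calls ctx.RequestConfirmation() to ask for one
Second call (confirmation != nil): after the ADK framework receives the user's confirmation response, it runs the tool again, and confirmation.Confirmed decides between approval and rejection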

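Building the user's confirmation response:

func processApproval(ctx context.Context, r *runner.Runner, sessionID, requestID string, approved bool, reader *bufio.Reader) {
    req, ok := requestsByReqID[requestID]                // look up the request by request ID
    if !ok {
        fmt.Printf("unknown request %s\n", requestID)
        return
    }

    // Build the payload of the confirmation response.
    // Simplification: approve all requested days; an application could read the number from reader instead.
    daysApproved := 0
    if approved {
        daysApproved = req.Days
    }
    payload := ConfirmationPayload{DaysApproved: daysApproved}

    // Key point: build a FunctionResponse whose name is adk_request_confirmation.
    funcResponse := &genai.FunctionResponse{
        Name: toolconfirmation.FunctionCallName,          // "adk_request_confirmation"
        ID:   req.CallID,                                  // must match the ID of the confirmation request
        Response: map[string]any{
            "confirmed": approved,                         // approve or reject
            "payload":   payload,                          // custom payload
        },
    }

    // Send it back as a user message.
    appResponse := &genai.Content{
        Role:  string(genai.RoleUser),                     // user role
        Parts: []*genai.Part{{FunctionResponse: funcResponse}}, // function response
    }
    runTurn(ctx, r, sessionID, appResponse)                // continue the conversation
}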

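Three requirements matter here:

Name must be "adk_request_confirmation"
ID must exactly match the FunctionCall.ID in the confirmation request event that was received
Response must contain a "confirmed": bool field

[Figure: complete sequence diagram of the confirmation flow]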

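8.4 The confirmation tool wrapper and WithConfirmation

Besides configuring individual tools, ADK provides the WithConfirmation function, which injects confirmation logic into an entire toolset:

func WithConfirmation(ts Toolset, requireConfirmation bool,
    requireConfirmationProvider ConfirmationProvider) Toolset
// ts: the original toolset
// requireConfirmation: static confirmation flag
// requireConfirmationProvider: dynamic confirmation provider

The function returns a new confirmationToolset. It wraps every tool in the toolset that implements the runnableTool interface into a confirmationTool.

confirmationTool wraps the original tool and inserts a confirmation check before execution: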

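func (t *confirmationTool) Run(ctx Context, args any) (map[string]any, error) {
    // hint (the confirmation prompt) and ft (the wrapped runnable tool) are set up in code omitted from this excerpt.

    // 1. Check whether a confirmation result already exists.
    if confirmation := ctx.ToolConfirmation(); confirmation != nil {
        if !confirmation.Confirmed {
            // Already rejected: return the rejection error.
            return nil, fmt.Errorf("error tool %q %w", t.Name(), ErrConfirmationRejected)
        }
        // The user confirmed; fall through to execution.
    } else {
        // 2. Decide whether confirmation is needed.
        requireConfirmation := t.requireConfirmation        // static confirmation flag
        if t.provider != nil {
            requireConfirmation = t.provider(t.Name(), args) // dynamic confirmation provider
        }

        // 3. If confirmation is needed, request it.
        if requireConfirmation {
            err := ctx.RequestConfirmation(hint, nil)
            if err != nil {
                return nil, err                              // the request itself failed
            }
            ctx.Actions().SkipSummarization = true           // skip summarization
            // Return the "confirmation required" error as a pause signal.
            return nil, fmt.Errorf("error tool %q %w", t.Name(), ErrConfirmationRequired)
        }
    }

    // 4. No confirmation needed, or already confirmed: run the original tool.
    return ft.Run(ctx, args)
}

The key point: when confirmation is needed, the tool returns the ErrConfirmationRequired error and sets SkipSummarization = true. This tells the Flow not to summarize the error, because it is not a real failure but a pause signal that waits for the user's confirmation.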
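Because both errors are wrapped with %w, a caller can tell a pause from a real failure with errors.Is. The sketch below illustrates that idea with the standard library only; the two sentinel variables are local stand-ins that borrow the names from the text, and runGuarded and classify are hypothetical helpers, not ADK functions.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the sentinel errors named in the text.
var (
	ErrConfirmationRequired = errors.New("confirmation required")
	ErrConfirmationRejected = errors.New("confirmation rejected")
)

// runGuarded mimics the wrapper's decision. confirmed is nil while the user
// has not decided yet; needs says whether the call requires confirmation.
func runGuarded(name string, confirmed *bool, needs bool) (string, error) {
	switch {
	case confirmed != nil && !*confirmed:
		return "", fmt.Errorf("error tool %q %w", name, ErrConfirmationRejected)
	case confirmed == nil && needs:
		return "", fmt.Errorf("error tool %q %w", name, ErrConfirmationRequired)
	}
	return "done", nil
}

// classify shows how a caller can separate a pause signal from a failure.
func classify(err error) string {
	switch {
	case err == nil:
		return "completed"
	case errors.Is(err, ErrConfirmationRequired):
		return "paused"
	case errors.Is(err, ErrConfirmationRejected):
		return "rejected"
	default:
		return "failed"
	}
}

func main() {
	_, err := runGuarded("delete_file", nil, true)
	fmt.Println(classify(err), "-", err)
}
```

Wrapping keeps the tool name in the message for logs while errors.Is still matches the sentinel underneath.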

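8.5 The adk_request_confirmation event protocol

A confirmation request is not a pop-up dialog; it travels through a special event mechanism. ADK uses a special function call name, adk_request_confirmation:

const FunctionCallName = "adk_request_confirmation"
// Name of the function call that carries a confirmation request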

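8.6 Responsibilities of the client

The client application (the front-end UI) needs to do the following:

Listen for events that contain an adk_request_confirmation function call
Extract originalFunctionCall (the original tool call) from the arguments
Show the user a clear confirmation prompt
Capture the user's decision
Send a FunctionResponse back to ADK, which must:
- use the same id as the received event
- set the name to adk_request_confirmation
- include a {"confirmed": bool} response payload

ADK provides the OriginalCallFrom helper to extract the original tool call from a confirmation event:

originalCall, err := toolconfirmation.OriginalCallFrom(functionCall)
// Extract the original tool call from the confirmation event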

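The ToolConfirmation struct:

type ToolConfirmation struct {
    Hint      string `json:"hint"`       // hint shown to the user
    Confirmed bool   `json:"confirmed"`  // the user's decision: true = approved, false = rejected
    Payload   any    `json:"payload"`    // custom business data
}

Hint: explains to the user why confirmation is needed
Confirmed: false (the zero value) until the user decides, set to true on approval
Payload: the application can place any custom data in this field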
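Since Payload is typed as any, a value that has travelled through JSON typically arrives as a map[string]any rather than as the original struct. One common way to recover a typed value is a JSON round trip, sketched below with the standard library; decodePayload is a hypothetical helper, and ConfirmationPayload repeats the struct from section 8.3.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ConfirmationPayload mirrors the payload struct from the vacation example.
type ConfirmationPayload struct {
	DaysApproved int `json:"days_approved"`
}

// decodePayload converts an untyped payload into a typed struct by
// marshalling it to JSON and unmarshalling it into the target type.
func decodePayload(raw any) (ConfirmationPayload, error) {
	var p ConfirmationPayload
	data, err := json.Marshal(raw)
	if err != nil {
		return p, fmt.Errorf("marshal payload: %w", err)
	}
	if err := json.Unmarshal(data, &p); err != nil {
		return p, fmt.Errorf("unmarshal payload: %w", err)
	}
	return p, nil
}

func main() {
	// A payload as it might look after JSON decoding on the receiving side.
	p, err := decodePayload(map[string]any{"days_approved": 3})
	fmt.Println(p.DaysApproved, err)
}
```

The same helper accepts the struct itself, so the tool does not need to know whether the payload crossed a process boundary.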

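Comparison of the three ways to trigger confirmation:

Approach	Where it is configured	When to use it	Flexibility
Static confirmation flag	`Config.RequireConfirmation = true`	Every call needs confirmation (for example, deletions)	Low
Dynamic confirmation provider	`Config.RequireConfirmationProvider`	Confirmation depends on the arguments (for example, an amount threshold)	Medium
Manual confirmation control	Calling `ctx.RequestConfirmation()` inside the tool function	Complex confirmation logic, custom payloads	High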

9. GeminiTool: Gemini native tools

In addition to custom FunctionTools, ADK supports Gemini's native tools. The Gemini API provides these tools directly, so you do not implement any business logic yourself.

9.1 What Gemini native tools are

Gemini native tools are built-in tools that Google provides for Gemini models. geminitool is a wrapper around genai.Tool:

type geminiTool struct {
    name        string       // tool name
    description string       // tool description
    value       *genai.Tool  // the Gemini native tool object
}

func New(name, description string, t *genai.Tool) tool.Tool {
    return &geminiTool{
        name:        name,         // tool name
        description: description,  // tool description
        value:       t,            // native tool object
    }
}

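9.2 How it works

GeminiTool and FunctionTool work in completely different ways:

FunctionTool: the LLM generates a tool call -> the Flow engine runs the Go function -> the result is returned to the LLM -> the LLM continues reasoning
GeminiTool: the LLM request carries the tool configuration -> the Gemini API runs the tool on the server side -> the response already contains the tool result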

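9.3 Google Search tool example

The most commonly used Gemini native tool is Google Search:

import (
    "log"

    "google.golang.org/adk/agent/llmagent"
    "google.golang.org/adk/tool"
    "google.golang.org/adk/tool/geminitool"
    "google.golang.org/genai"
)

searchTool := geminitool.New(
    "google_search",                                     // tool name
    "Use Google Search to fetch up-to-date information", // tool description
    &genai.Tool{
        GoogleSearch: &genai.GoogleSearch{},             // Google Search configuration
    },
)

// llmagent.New also returns an error; other Config fields are omitted here.
searchAgent, err := llmagent.New(llmagent.Config{
    Tools: []tool.Tool{searchTool},                      // register the search tool
})
if err != nil {
    log.Fatalf("failed to create agent: %v", err)
}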

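9.4 Other Gemini tools

Tool type	Purpose	When to use it
`GoogleSearch`	Google Search	Up-to-date information, real-time data
`Retrieval`	Retrieval-augmented generation (RAG)	Question answering over a private knowledge base
`CodeExecution`	Code execution	Math, data analysis
`FunctionDeclarations`	Function calling	The layer underneath FunctionTool

Note: different Gemini models may support different tools, so check the official documentation before use.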

9.5 ProcessRequest implementation

GeminiTool uses ProcessRequest to add the tool configuration to the LLM request:

func (t *geminiTool) ProcessRequest(ctx tool.Context, req *model.LLMRequest) error {
    return setTool(req, t.value)                // set the native tool on the request
}

func setTool(req *model.LLMRequest, t *genai.Tool) error {
    if req.Config == nil {
        req.Config = &genai.GenerateContentConfig{} // make sure the config is not nil
    }
    req.Config.Tools = append(req.Config.Tools, t)  // append the tool to the config's list
    return nil
}

It simply appends the genai.Tool to the Tools list of the request config and leaves the rest to the Gemini API.

10. Toolset filtering and long-running tools

10.1 Toolset filtering

ADK can filter tools dynamically based on context:

type Predicate func(ctx agent.ReadonlyContext, tool tool.Tool) bool
// Filter predicate: receives the context and a tool, returns whether to keep the tool

func AllowedToolsPredicate(allowedTools []string) Predicate {
    m := make(map[string]bool)                   // set of allowed tool names
    for _, t := range allowedTools {
        m[t] = true                              // add each allowed name to the set
    }
    return func(ctx agent.ReadonlyContext, tool tool.Tool) bool {
        return m[tool.Name()]                    // keep the tool only if its name is allowed
    }
}

func FilterToolset(toolset Toolset, predicate Predicate) Toolset {
    return &filteredToolset{toolset: toolset, predicate: predicate}
    // Create the filtered toolset
}

This is useful when the available tools must change with user permissions or other context.
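The allow-list predicate is easy to exercise on its own. The sketch below is a simplified model, not the ADK types: Tool, namedTool, allowed, and filter are hypothetical, and the context parameter of the real Predicate is dropped to keep the example self-contained.

```go
package main

import "fmt"

// Tool is a minimal stand-in that exposes only the name.
type Tool interface {
	Name() string
}

// namedTool is a hypothetical tool whose only property is its name.
type namedTool string

func (n namedTool) Name() string { return string(n) }

// Predicate decides whether a tool is kept (context parameter omitted here).
type Predicate func(t Tool) bool

// allowed builds a predicate that keeps only tools whose names are listed.
func allowed(names ...string) Predicate {
	set := make(map[string]bool, len(names))
	for _, n := range names {
		set[n] = true
	}
	return func(t Tool) bool { return set[t.Name()] }
}

// filter returns the tools accepted by the predicate, in their original order.
func filter(tools []Tool, keep Predicate) []Tool {
	var out []Tool
	for _, t := range tools {
		if keep(t) {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	tools := []Tool{namedTool("get_weather"), namedTool("delete_file"), namedTool("load_memory")}
	for _, t := range filter(tools, allowed("get_weather", "load_memory")) {
		fmt.Println(t.Name())
	}
}
```

A permission-aware variant would build the name list per user and pass it to the same predicate constructor.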

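10.2 Long-running tools

Some tools take a long time to finish. ADK marks them through the IsLongRunning() method.

When a tool is marked as long-running:

The tool returns a resource ID immediately after it is invoked
The client can use this ID to query progress
The final result arrives through a later event

FunctionTool also appends a note to the function declaration so that the LLM knows the operation is long-running:

func (f *functionTool[TArgs, TResults]) Declaration() *genai.FunctionDeclaration {
    decl := &genai.FunctionDeclaration{...}    // build the function declaration

    if f.cfg.IsLongRunning {
        // Note that explains the long-running behavior to the LLM.
        instruction := "NOTE: This is a long-running operation..."
        if decl.Description != "" {
            decl.Description += "\n\n" + instruction // append to the existing description
        } else {
            decl.Description = instruction            // use the note as the description
        }
    }

    return decl
}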

10.3 Tool callbacks

Like agents, tools support callbacks. They are invoked in the Flow.callTool() method:

func (f *Flow) callTool(toolCtx tool.Context, tool toolinternal.FunctionTool, fArgs map[string]any) map[string]any {
    var (
        response map[string]any
        err      error
    )

    // Before callbacks run ahead of the tool and may short-circuit it.
    response, err = f.invokeBeforeToolCallbacks(toolCtx, tool, fArgs)

    // Run the tool only if no before callback produced a response or an error.
    if response == nil && err == nil {
        response, err = tool.Run(toolCtx, fArgs)
    }

    // Error callbacks run when the tool (or a before callback) failed.
    if err != nil {
        response, err = f.invokeOnToolErrorCallbacks(toolCtx, tool, fArgs, err)
    }

    // After callbacks run last and see the final response and error.
    response, err = f.invokeAfterToolCallbacks(toolCtx, tool, fArgs, response, err)

    return response
}
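The ordering of the three callback stages can be modelled without ADK. The following sketch makes several simplifying assumptions: hooks, callTool, and failingTool are hypothetical, each stage has at most one callback, and a callback's return value always replaces the current result, which may differ from how the framework merges callback results.

```go
package main

import (
	"errors"
	"fmt"
)

type result = map[string]any

// hooks holds one optional callback per stage.
type hooks struct {
	before  func(args result) (result, error)
	onError func(args result, err error) (result, error)
	after   func(args result, res result, err error) (result, error)
}

// callTool runs before -> tool -> onError -> after, in that order.
func callTool(h hooks, run func(args result) (result, error), args result) (result, error) {
	var res result
	var err error
	if h.before != nil {
		res, err = h.before(args)
	}
	// The tool runs only when the before hook produced nothing.
	if res == nil && err == nil {
		res, err = run(args)
	}
	// The error hook may turn a failure into a usable response.
	if err != nil && h.onError != nil {
		res, err = h.onError(args, err)
	}
	// The after hook sees the final response and error.
	if h.after != nil {
		res, err = h.after(args, res, err)
	}
	return res, err
}

// failingTool is a hypothetical tool that always fails.
func failingTool(args result) (result, error) {
	return nil, errors.New("boom")
}

func main() {
	recover := hooks{onError: func(args result, err error) (result, error) {
		return result{"error": err.Error()}, nil
	}}
	res, err := callTool(recover, failingTool, result{})
	fmt.Println(res, err)
}
```

A before hook that returns a cached response is one practical use: the tool body never runs, yet the after hook still observes the response.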

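11. Best practices

11.1 Write clear tool descriptions

The tool description is what the LLM relies on when deciding whether to call a tool:

// Good description: states the function and the meaning of each parameter
// SearchWeather looks up the current weather for a given city
// city: city name (for example: Beijing, Shanghai, Guangzhou)

// Poor description: too little information
// Weather gets the weather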

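11.2 Use generics to define tool parameters

Define tool parameters as structs so that the JSON Schema can be generated automatically:

type WeatherArgs struct {
    City string `json:"city" jsonschema:"description=City name"` // argument struct
}

type WeatherResult struct {
    Temperature float64 `json:"temperature"` // temperature
    Condition   string  `json:"condition"`   // weather condition
}

weatherTool, err := functiontool.New(functiontool.Config{
    Name:        "SearchWeather",                                 // tool name
    Description: "Look up the current weather for a given city", // tool description
}, func(ctx tool.Context, args WeatherArgs) (WeatherResult, error) {
    // Implement the business logic here; a zero value keeps the sketch compilable.
    return WeatherResult{}, nil
})
if err != nil {
    log.Fatalf("failed to create tool: %v", err)
}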

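11.3 Protect sensitive operations with HITL

For operations that involve money or modify data, require human confirmation:

transferTool, err := functiontool.New(functiontool.Config{
    Name:                "TransferMoney",          // tool name
    Description:         "Transfer money",         // tool description
    RequireConfirmation: true,                     // human confirmation required
}, func(ctx tool.Context, args TransferArgs) (TransferResult, error) {
    // Implement the transfer logic here; a zero value keeps the sketch compilable.
    return TransferResult{}, nil
})
if err != nil {
    log.Fatalf("failed to create tool: %v", err)
}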

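11.4 Compose agents with AgentTool

When a task needs several specialized agents to cooperate, use AgentTool:

researchAgent, err := llmagent.New(...)           // research agent
if err != nil {
    log.Fatal(err)
}
analysisAgent, err := llmagent.New(...)           // analysis agent
if err != nil {
    log.Fatal(err)
}

researchTool := agenttool.New(researchAgent, nil) // wrap the research agent as a tool
analysisTool := agenttool.New(analysisAgent, nil) // wrap the analysis agent as a tool

mainAgent, err := llmagent.New(llmagent.Config{
    Tools: []tool.Tool{researchTool, analysisTool}, // the main agent calls both as tools
})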

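12. Summary

ADK's tool system is a carefully designed subsystem that supports:

FunctionTool: turns Go functions into tools automatically, with generics and automatic schema generation
AgentTool: calls an agent as a tool, which enables agent composition
MCPToolset: integrates external tool services in both in-memory local and remote HTTP modes
SkillToolset: loads skills from the file system through three core tools: list_skills, load_skill, and load_skill_resource
Special tools: exitlooptool (exit a loop), preloadmemorytool (automatic memory injection), loadmemorytool (on-demand memory search), loadartifactstool (parallel artifact loading), exampletool (few-shot example injection)
GeminiTool: a wrapper for Gemini native tools, covering server-side tools such as Google Search and code execution
HITL confirmation: protects sensitive operations through static, dynamic, and manual confirmation
Tool filtering: adjusts the available tools dynamically based on context
Long-running tools: support asynchronous execution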

Summary of tool types:

Tool type	Package	Runs where	Typical use
FunctionTool	`tool/functiontool`	Local (your code)	Custom business logic
AgentTool	`tool/agenttool`	Local (sub-agent)	Agent composition
GeminiTool	`tool/geminitool`	Gemini server side	Google Search, code execution
MCPTool	`tool/mcptoolset`	MCP server	Connecting external services
SkillToolset	`tool/skilltoolset`	Local (file system)	Loading expert knowledge
ExitLoopTool	`tool/exitlooptool`	Local	Exiting a loop
PreloadMemoryTool	`tool/preloadmemorytool`	Local	Preloading memory
LoadMemoryTool	`tool/loadmemorytool`	Local	Searching memory
LoadArtifactsTool	`tool/loadartifactstool`	Local	Loading artifact files
