AutoGPT Mastery Tutorial

Based on the latest AutoGPT release (2025)
Covers AutoGPT Platform (the new architecture) and AutoGPT Classic

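Chapter 1: AutoGPT Overview

1.1 What Is AutoGPT?

AutoGPT is an open-source project started by the Significant Gravitas team with the goal of building "an AI Agent platform anyone can use". It is not a single AI assistant but a complete platform for developing, deploying, and running Agents.

As of 2025, AutoGPT has two product lines:

Product line	Description	Use case
AutoGPT Platform	Next-generation web platform for building Workflows visually from Blocks	Quickly building production-grade Agents
AutoGPT Classic	The original command-line Agent, including the Forge development kit	Deep customization of Agent logic

Both live in the same repository: https://github.com/Significant-Gravitas/AutoGPT

Stars: 185k+ · Forks: 46k+ · Languages: Python + TypeScript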

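1.2 Core Features

AutoGPT Platform (new architecture)

Visual Agent Builder: drag-and-drop construction of Agent Workflows
Block system: each Block is a self-contained functional unit (LLM call, web search, file processing, and so on)
Workflow Engine: executes Workflows as a DAG (directed acyclic graph)
Built-in marketplace (Store): publish and share Agent templates
REST API: every feature is available through the API
Supabase integration: user authentication, data storage, real-time sync
Multi-model support: OpenAI, Claude, Llama, Gemini, and others

AutoGPT Classic

Command-line Agent: makes decisions autonomously and loops until the goal is reached
Forge framework: a toolkit for quickly developing custom Agents
Benchmark: standardized tests of Agent performance
CLI management: create, start, and stop Agents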

1.3 Project Structure

AutoGPT/
├── autogpt_platform/          # New platform (core)
│   ├── backend/               # Python FastAPI backend
│   │   ├── backend/           # Core logic
│   │   │   ├── blocks/        # Block implementations
│   │   │   ├── agents/        # Agent execution logic
│   │   │   └── api/           # API routes
│   │   ├── agents/            # Agent template definitions
│   │   └── migrations/        # Database migrations
│   ├── frontend/              # Next.js frontend
│   └── autogpt_libs/          # Shared Python libraries
├── classic/                   # Classic version
│   ├── forge/                 # Forge development framework
│   ├── benchmark/             # Agent benchmarking tool
│   ├── frontend/              # Classic UI
│   └── cli.py                 # Command-line entry point
└── docs/                      # Documentation

Chapter 2: AutoGPT Platform Architecture

2.1 Overall Architecture

AutoGPT Platform uses a microservice, event-driven architecture:

graph TB
    %% 样式定义 (双模式自适应：高饱和度底色 + 纯白文字)
    classDef frontend fill:#0078d4,stroke:#005a9e,stroke-width:2px,color:#ffffff;
    classDef gateway fill:#107c41,stroke:#0b5930,stroke-width:2px,color:#ffffff;
    classDef service fill:#7a24db,stroke:#5c1baf,stroke-width:1px,color:#ffffff;
    classDef queue fill:#d83b01,stroke:#a82e00,stroke-width:2px,color:#ffffff;
    classDef database fill:#2b2b2b,stroke:#111111,stroke-width:2px,color:#ffffff;

    %% 1. Frontend layer
    Frontend["💻 Frontend (Next.js) <br> Agent Builder · Workflow management · Store"]:::frontend

    %% 2. Gateway layer
    Gateway["🛡️ API gateway (FastAPI) <br> Auth · Routing · Rate limiting · Cache protection"]:::gateway

    %% 3. Core service layer
    subgraph Services["Core services"]
        Executor["⚙️ Agent Executor"]:::service
        Registry["📦 Block Registry"]:::service
        Integration["🔌 Integration Manager <br> (OAuth / API key management)"]:::service
    end

    %% 4. Async and messaging layer
    Queue["📥 Message queue (RabbitMQ) <br> + task scheduling"]:::queue

    %% 5. Persistence layer
    DB["🗄️ Database (PostgreSQL + Supabase) <br> Users · Agents · Workflows · Execution records · Files"]:::database

    %% Connections
    Frontend ===> |"HTTP / REST"| Gateway
    
    Gateway ---> Executor
    Gateway ---> Registry
    Gateway ---> Integration
    
    Executor ---> Queue
    Registry ---> Queue
    
    Queue ===> DB

    %% 容器与全局线条样式微调
    style Services fill:none,stroke:#888888,stroke-width:2px,stroke-dasharray: 5 5
    linkStyle default stroke:#888888,stroke-width:2px;

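Core Components in Detail

1. Frontend
Built with Next.js. Provides the Agent Builder (visual canvas), Agent management, the Store marketplace, and execution monitoring.

2. API server (Backend)
Built with Python FastAPI on an asynchronous, non-blocking architecture. It includes:

REST API routes
User authentication (Supabase Auth)
Block registration and execution
Agent scheduling and management
File upload with virus scanning (ClamAV)

3. Agent Executor
Parses the Workflow DAG, runs Blocks in topological order, and handles data passing and error recovery.

4. Message queue (RabbitMQ)
Schedules tasks asynchronously so that long-running Agents do not block the API.

5. Database (PostgreSQL/Supabase)
Stores users, Agent definitions, Workflow configuration, execution logs, and file metadata.

6. Cache-protection middleware
By default every API endpoint is served with Cache-Control: no-store, no-cache, must-revalidate, private so that sensitive data is never cached. Only allowlisted paths such as static assets and health checks may be cached.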
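
The allowlist rule for the cache-protection middleware can be sketched as a small pure function. This is only an illustration under stated assumptions: the prefixes in CACHEABLE_PREFIXES and the function name cache_control_for are hypothetical, since the text does not list the platform's real allowlisted paths or show the middleware code.

```python
from typing import Optional

# Hypothetical allowlist: the real paths are not given in the text.
CACHEABLE_PREFIXES = ("/static/", "/health")

# Header value quoted from the description above.
NO_CACHE_VALUE = "no-store, no-cache, must-revalidate, private"


def cache_control_for(path: str) -> Optional[str]:
    """Return the Cache-Control value to set, or None to leave the response alone."""
    # str.startswith accepts a tuple and matches if any prefix matches.
    if path.startswith(CACHEABLE_PREFIXES):
        return None
    return NO_CACHE_VALUE
```

A middleware would call this once per request and set the header only when the return value is not None, which keeps the default restrictive: a new endpoint is uncacheable unless someone adds it to the allowlist.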

2.2 Block Architecture

A Block is the smallest functional unit in AutoGPT Platform. Each Block wraps one self-contained operation.

Block Lifecycle

graph TD
    %% 样式定义 (双模式自适应：高饱和度底色 + 纯白文字)
    classDef init fill:#0078d4,stroke:#005a9e,stroke-width:2px,color:#ffffff;
    classDef trigger fill:#107c41,stroke:#0b5930,stroke-width:2px,color:#ffffff;
    classDef core fill:#7a24db,stroke:#5c1baf,stroke-width:2px,color:#ffffff;
    classDef data fill:#ffaa00,stroke:#cc8800,stroke-width:1px,color:#ffffff;
    classDef storage fill:#2b2b2b,stroke:#111111,stroke-width:1px,color:#ffffff;

    %% 1. Configuration and preparation
    subgraph Phase_Config["1. Design Time"]
        Drag["🖱️ User drags a Block"]:::init
        Instance["📦 Block instantiated"]:::init
        Param["⚙️ Input parameters configured"]:::init
    end

    %% 2. Trigger condition
    subgraph Phase_Trigger["2. Runtime Condition"]
        Upstream["🔗 Upstream Block finished"]:::trigger
    end

    %% 3. Core execution
    Execute["🚀 Block.execute(input)"]:::core

    %% 4. Output and flow
    subgraph Phase_Output["3. Output & Persistence"]
        OutData["📊 Output data"]:::data
        Downstream["⏭️ Downstream Block"]:::data
        Save["💾 Cache / persist result"]:::storage
    end

    %% Connections
    Drag ===> Instance
    Instance ===> Param
    
    Param ===> |"armed, waiting"| Execute
    Upstream ===> |"inputs ready"| Execute
    
    Execute ===> OutData
    OutData ===> Downstream
    Execute ---> Save

    %% 容器与全局线条样式微调
    style Phase_Config fill:none,stroke:#888888,stroke-width:1.5px,stroke-dasharray: 5 5
    style Phase_Trigger fill:none,stroke:#888888,stroke-width:1.5px,stroke-dasharray: 5 5
    style Phase_Output fill:none,stroke:#888888,stroke-width:1.5px,stroke-dasharray: 5 5
    
    linkStyle default stroke:#888888,stroke-width:2px;

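Block Interface

Every Block implements the following core interface:

class Block:
    # Block metadata
    id: str              # Unique identifier
    name: str            # Display name
    description: str     # What the Block does
    categories: list     # Category tags
    
    # Input/output definitions
    input_schema: dict   # JSON Schema describing the input
    output_schema: dict  # JSON Schema describing the output
    
    # Core execution method
    async def execute(self, input_data: dict) -> dict:
        """Run the Block logic and return its output."""
        raise NotImplementedError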
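
To make the interface concrete, here is a minimal sketch of one Block that follows it. Everything in the sketch is a stand-in: the Block base class is redefined locally so the example runs on its own, and check_required and UppercaseBlock are hypothetical names, not part of the platform.

```python
import asyncio


class Block:
    """Local stand-in for the interface above, not the platform's real base class."""
    id: str = ""
    input_schema: dict = {}
    output_schema: dict = {}

    async def execute(self, input_data: dict) -> dict:
        raise NotImplementedError

    def check_required(self, input_data: dict) -> None:
        # Hypothetical helper: enforce only the "required" list of the schema.
        missing = [k for k in self.input_schema.get("required", []) if k not in input_data]
        if missing:
            raise ValueError(f"missing required input(s): {missing}")


class UppercaseBlock(Block):
    id = "uppercase_text"
    input_schema = {
        "type": "object",
        "required": ["text"],
        "properties": {"text": {"type": "string"}},
    }
    output_schema = {
        "type": "object",
        "properties": {"result": {"type": "string"}},
    }

    async def execute(self, input_data: dict) -> dict:
        self.check_required(input_data)
        return {"result": input_data["text"].upper()}


if __name__ == "__main__":
    print(asyncio.run(UppercaseBlock().execute({"text": "hello"})))
```

The Block declares its contract as data (the two schemas) and keeps the behavior in execute, so a builder UI can render the input form without running any Block code.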

Built-in Block Categories

Category	Example Blocks
AI/LLM	LLM Call, Chat, Text Completion, Embedding
Network	Web Search, Web Scrape, HTTP Request
Data processing	Text Split, JSON Parse, CSV Read/Write
Files	File Read, File Write, Image Processing
Storage	Database Query, Redis Get/Set
Communication	Send Email, Slack Message, Discord Webhook
Utilities	Calculator, Code Execute, Shell Command
Input/Output	User Input, Display Result, Generate Report

2.3 Workflow Engine

A Workflow defines an Agent's behavior; at its core it is a directed acyclic graph (DAG).

JSON Representation of a Workflow

{
  "id": "wf_xxxx",
  "name": "Trending News Digest Agent",
  "description": "Fetches trending news and summarizes it with AI",
  "nodes": [
    {
      "id": "node_1",
      "block_id": "web_search",
      "config": {
        "query": "today's trending news",
        "max_results": 10
      }
    },
    {
      "id": "node_2",
      "block_id": "web_scrape",
      "config": {
        "max_content_length": 5000
      }
    },
    {
      "id": "node_3",
      "block_id": "llm_call",
      "config": {
        "model": "gpt-4o",
        "system_prompt": "You are a news editor. Write a summary in Chinese, no longer than 200 characters, of the following article.",
        "temperature": 0.3
      }
    },
    {
      "id": "node_4",
      "block_id": "display_result",
      "config": {}
    }
  ],
  "edges": [
    {"from": "node_1", "to": "node_2"},
    {"from": "node_2", "to": "node_3"},
    {"from": "node_3", "to": "node_4"}
  ]
}

DAG Execution Algorithm

# Simplified execution logic
async def execute_workflow(workflow: dict):
    graph = build_dag(workflow["nodes"], workflow["edges"])
    sorted_nodes = topological_sort(graph)
    node_outputs = {}
    
    for node_id in sorted_nodes:
        node = graph[node_id]
        # Collect the outputs of upstream nodes
        inputs = {}
        for upstream_id in node.dependencies:
            inputs[upstream_id] = node_outputs[upstream_id]
        
        # Run the Block
        block_instance = resolve_block(node.block_id)
        output = await block_instance.execute(inputs)
        node_outputs[node_id] = output
    
    return node_outputs
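
The execution logic above relies on a topological sort it does not define. A minimal sketch using Kahn's algorithm is shown below. It takes the "nodes" and "edges" lists of the Workflow JSON directly, which is an assumption made for illustration: the signature of the platform's own topological_sort is not shown in the text, and this sketch is named topological_order to keep the two apart.

```python
from collections import deque


def topological_order(nodes: list, edges: list) -> list:
    """Order node ids so every node comes after all of its upstream nodes."""
    indegree = {n["id"]: 0 for n in nodes}
    downstream = {n["id"]: [] for n in nodes}
    for edge in edges:
        downstream[edge["from"]].append(edge["to"])
        indegree[edge["to"]] += 1

    # Start from the nodes that have no upstream dependency.
    ready = deque(nid for nid, degree in indegree.items() if degree == 0)
    order = []
    while ready:
        nid = ready.popleft()
        order.append(nid)
        for nxt in downstream[nid]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Nodes left unvisited sit on a cycle, so the graph is not a DAG.
    if len(order) != len(nodes):
        raise ValueError("workflow contains a cycle")
    return order
```

A node enters the ready queue only when all of its upstream nodes have been emitted, which is exactly the "inputs ready" condition the executor needs before running a Block.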

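Data Passing

Data moves between Blocks through named pipes
Automatic type conversion is supported (text, JSON, file references, and so on)
Large files are stored in the Workspace and passed as file reference paths
When images cross Block boundaries, the for_block_output() utility adapts them to the receiving context automatically

# Emitting an image from a Block
async def execute(self, input_data):
    # ... image generation logic ...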
    from autogpt_libs.utils import for_block_output
    result_url = for_block_output("image_url", image_data)
    yield "image_url", result_url

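2.4 Authentication and Integrations

User Authentication

Built on Supabase Auth, with support for:

Email and password login
OAuth (Google, GitHub)
JWT token verification
Row-level security (RLS) policies

Third-Party Integrations

The Platform provides an Integration Manager that handles authentication for external services in one place:

# Integration registration example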
integration = {
    "id": "int_xxxx",
    "provider": "slack",
    "user_id": "usr_xxx",
    "credentials": {
        "access_token": "xoxb-xxx",
        "team_id": "Txxx"
    }
}

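Supported integrations include OpenAI, Claude, Slack, Discord, Gmail, Google Drive, GitHub, Notion, and others.

Chapter 3: AutoGPT Classic Architecture

3.1 Classic Architecture

AutoGPT Classic is the project's original version. Its architecture is as follows: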

graph LR
    %% 样式定义 (双模式自适应：高饱和度底色 + 纯白文字)
    classDef loopNode fill:#0078d4,stroke:#005a9e,stroke-width:2px,color:#ffffff;
    classDef memNode fill:#7a24db,stroke:#5c1baf,stroke-width:2px,color:#ffffff;

    subgraph Agent_Main_Loop["🔄 Agent Main Loop"]
        %% Three core stages (laid out horizontally)
        Perceive["👁️ Perceive <br> environment input / receiving information"]:::loopNode
        Think["🧠 Think (Cognition) <br> planning / decisions / LLM calls"]:::loopNode
        Act["⚡ Act (Action) <br> tool calls / output"]:::loopNode
        
        %% Memory sits underneath
        Memory["💾 Memory <br> short-term conversation history / long-term knowledge base"]:::memNode
    end

    %% Main loop flow (horizontal)
    Perceive ===> Think
    Think ===> Act

    %% Memory feedback loop
    Think ---> |"retrieve / update"| Memory
    Act ---> |"store results"| Memory
    Memory ---> |"informs perception"| Perceive

    %% 容器与全局线条样式微调
    style Agent_Main_Loop fill:none,stroke:#888888,stroke-width:2px,stroke-dasharray: 5 5
    linkStyle default stroke:#888888,stroke-width:2px;

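Core Loop

At the heart of AutoGPT Classic is a perceive-think-act loop:

Perceive: receive the user's goal and gather information about the environment
Think: the LLM reasons about the current state and decides the next action
Act: call a tool (browser, file system, API, and so on)
Memory: store the result in short-term or long-term memory

On every iteration the LLM receives the full context (history, current state, list of available tools) and outputs its decision as JSON: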

{
  "thoughts": {
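    "text": "I need to search for the latest AI news...",
    "reasoning": "The user wants an overview of developments in AI; searching is the first step",
    "plan": "- Search with Google\n- Open the search results\n- Extract the key information",
    "criticism": "Make sure the search keywords are accurate"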
  },
  "tool": "web_search",
  "tool_args": {
    "query": "2025 AI breakthroughs"
  }
}
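
The "act" step that consumes such a decision can be sketched as a lookup in a tool table followed by a call with the supplied arguments. This is an illustrative sketch, not Classic's actual dispatcher: the TOOLS table, the placeholder web_search function, and dispatch are hypothetical names; only the "tool" and "tool_args" keys come from the JSON above.

```python
import json


def web_search(query: str) -> str:
    # Placeholder tool: a real one would call a search API.
    return f"results for {query!r}"


# Hypothetical registry mapping tool names to callables.
TOOLS = {"web_search": web_search}


def dispatch(decision_json: str) -> str:
    """Parse the LLM's JSON decision and run the tool it names."""
    decision = json.loads(decision_json)
    tool_name = decision["tool"]
    tool = TOOLS.get(tool_name)
    if tool is None:
        # Refuse names outside the registry rather than guessing.
        raise KeyError(f"unknown tool: {tool_name}")
    return tool(**decision.get("tool_args", {}))
```

Restricting execution to names present in the registry means a malformed or hallucinated tool name fails loudly instead of running something unintended.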

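3.2 The Forge Framework

Forge is a lightweight Agent development framework that simplifies building custom Agents.

Creating an Agent Quickly

# Create a new Agent project with the CLI
./run agent create my_agent

Core Agent Template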

from forge.agent import BaseAgent
from forge.llm import LLMProvider
from forge.memory import Memory
from forge.tools import ToolRegistry

class MyAgent(BaseAgent):
    def __init__(self, llm_provider: LLMProvider, memory: Memory):
        super().__init__(llm_provider, memory)
        self.tools = ToolRegistry()
        self.tools.register(web_search_tool)
        self.tools.register(file_read_tool)
        self.tools.register(code_execute_tool)
    
    async def execute_step(self, prompt: str) -> str:
        # Custom execution logic
        context = self.build_context(prompt)
        decision = await self.llm_provider.analyze(context)
        result = await self.tools.execute(decision.tool, decision.tool_args)
        self.memory.store(result)
        return result

Chapter 4: Deployment and Installation

4.1 Deploying AutoGPT Platform

Requirements

Resource	Minimum	Recommended
CPU	4 cores	8 cores
Memory	8 GB	16 GB
Storage	10 GB	50 GB
OS	Linux/macOS/WSL2	Linux (Ubuntu 20.04+)

Required software:

Docker Engine 20.10+
Docker Compose 2.0+
Git 2.30+
Node.js 16.x+
npm 8.x+

One-Step Install (recommended)

macOS / Linux:

curl -fsSL https://setup.agpt.co/install.sh -o install.sh && bash install.sh

Windows (PowerShell):

powershell -c "iwr https://setup.agpt.co/install.bat -o install.bat; ./install.bat"

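The script automatically:

Clones the repository
Sets up the Docker environment
Copies the .env configuration files
Starts all services (database, Redis, RabbitMQ, ClamAV)
Runs the database migrations
Builds the frontend and backend
Starts the Platform

Manual Install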

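# 1. Clone the repository
git clone https://github.com/Significant-Gravitas/AutoGPT.git
cd AutoGPT/autogpt_platform

# 2. Configure environment variables
cp .env.default .env
# Edit .env and fill in the required API keys

# 3. Start the infrastructure
docker compose up -d

# 4. Install backend dependencies
cd backend
cp .env.default .env
poetry install
poetry run prisma migrate dev

# 5. Install frontend dependencies
cd ../frontend
cp .env.default .env
npm install

# 6. Start the development servers (each in its own terminal, starting from AutoGPT/autogpt_platform)
# Terminal 1 - backend
cd backend && poetry run uvicorn backend.main:app --reload --port 8000

# Terminal 2 - frontend
cd frontend && npm run dev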

Environment Variables

Core environment variables (autogpt_platform/.env):

# LLM API Keys
OPENAI_API_KEY=sk-xxxxxxxx
ANTHROPIC_API_KEY=sk-ant-xxxxx  # Only needed for Claude

# Supabase (authentication and database)
SUPABASE_URL=https://xxxx.supabase.co
SUPABASE_SERVICE_KEY=eyJxxx

# Database
DATABASE_URL=postgresql://user:pass@localhost:5432/autogpt

# Redis
REDIS_URL=redis://localhost:6379

# RabbitMQ
RABBITMQ_URL=amqp://guest:guest@localhost:5672

# Storage
S3_ENDPOINT=http://localhost:9000
S3_ACCESS_KEY=minioadmin
S3_SECRET_KEY=minioadmin
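
A missing variable tends to surface late, as a confusing runtime error, so it can help to check the environment before starting the services. The sketch below is one way to do that. Which variables count as required is an assumption made here for illustration (the text lists the variables but does not say which are mandatory), and missing_env_vars is a hypothetical helper, not part of AutoGPT.

```python
import os

# Assumed to be mandatory for this sketch; adjust to your deployment.
REQUIRED_VARS = (
    "OPENAI_API_KEY",
    "SUPABASE_URL",
    "SUPABASE_SERVICE_KEY",
    "DATABASE_URL",
    "REDIS_URL",
    "RABBITMQ_URL",
)


def missing_env_vars(env=None) -> list:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
    print("Environment looks complete.")
```

Passing a plain dict as env makes the check easy to unit-test without touching the real process environment.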

Starting with Docker Compose

# Start all services
docker compose up -d

# Check status
docker compose ps

# Follow the logs
docker compose logs -f backend

# Stop
docker compose down

Once the services are up, open:

Frontend: http://localhost:3000
API docs: http://localhost:8000/docs
Supabase Studio: http://localhost:8001

4.2 Installing AutoGPT Classic

# Clone the repository
git clone https://github.com/Significant-Gravitas/AutoGPT.git
cd AutoGPT/classic

# Install dependencies
pip install -r requirements.txt

# Configure environment variables
cp .env.example .env
# Edit .env and set OPENAI_API_KEY

# Run the Agent
python -m autogpt --gpt3only --continuous

CLI Command Reference

./run
Usage: cli.py [OPTIONS] COMMAND [ARGS]...

Commands:
  agent        Create, start, and stop Agents
  benchmark    Run the benchmark
  setup        Install system dependencies

Chapter 5: Block Development in Practice

5.1 Block Development Conventions

AutoGPT Platform Blocks live in the autogpt_platform/backend/backend/blocks/ directory.

Block Directory Layout

graph LR
    %% 样式定义 (双模式自适应：高饱和度底色 + 纯白文字)
    classDef root fill:#2b2b2b,stroke:#111111,stroke-width:2px,color:#ffffff;
    classDef ai fill:#0078d4,stroke:#005a9e,stroke-width:1px,color:#ffffff;
    classDef network fill:#107c41,stroke:#0b5930,stroke-width:1px,color:#ffffff;
    classDef storage fill:#7a24db,stroke:#5c1baf,stroke-width:1px,color:#ffffff;
    classDef comm fill:#d83b01,stroke:#a82e00,stroke-width:1px,color:#ffffff;
    classDef data fill:#ffaa00,stroke:#cc8800,stroke-width:1px,color:#ffffff;
    classDef custom fill:#666666,stroke:#444444,stroke-width:1px,color:#ffffff;

    %% Root node
    Root["📁 blocks/ <br> (Block root directory)"]:::root

    %% 1. AI module
    subgraph Sub_AI["🧠 AI"]
        AI_LLM["📄 llm_call.py <br> (LLM call)"]:::ai
        AI_Chat["📄 chat.py <br> (chat management)"]:::ai
        AI_Embed["📄 embedding.py <br> (embeddings)"]:::ai
    end

    %% 2. Network module
    subgraph Sub_Network["🌐 Network"]
        Net_Search["📄 web_search.py <br> (web search)"]:::network
        Net_Scrape["📄 web_scrape.py <br> (web scraping)"]:::network
        Net_HTTP["📄 http_request.py <br> (HTTP requests)"]:::network
    end

    %% 3. Storage module
    subgraph Sub_Storage["💾 Storage"]
        Store_Read["📄 file_read.py <br> (file read)"]:::storage
        Store_Write["📄 file_write.py <br> (file write)"]:::storage
    end

    %% 4. Communication module
    subgraph Sub_Comm["💬 Communication"]
        Comm_Email["📄 send_email.py <br> (send email)"]:::comm
        Comm_Slack["📄 slack_message.py <br> (Slack message)"]:::comm
    end

    %% 5. Data processing module
    subgraph Sub_Data["📊 Data processing"]
        Data_Split["📄 text_split.py <br> (text splitting)"]:::data
        Data_JSON["📄 json_parse.py <br> (JSON parsing)"]:::data
    end

    %% 6. Custom extensions
    subgraph Sub_Custom["⚙️ Custom extensions"]
        Cust_Block["📄 my_block.py <br> (user-defined)"]:::custom
    end

    %% Horizontal connections
    Root ---> Sub_AI
    Root ---> Sub_Network
    Root ---> Sub_Storage
    Root ---> Sub_Comm
    Root ---> Sub_Data
    Root ---> Sub_Custom

    %% 容器与全局线条样式微调（去背景色，自适应深浅模式）
    style Sub_AI fill:none,stroke:#888888,stroke-width:1px,stroke-dasharray: 3 3
    style Sub_Network fill:none,stroke:#888888,stroke-width:1px,stroke-dasharray: 3 3
    style Sub_Storage fill:none,stroke:#888888,stroke-width:1px,stroke-dasharray: 3 3
    style Sub_Comm fill:none,stroke:#888888,stroke-width:1px,stroke-dasharray: 3 3
    style Sub_Data fill:none,stroke:#888888,stroke-width:1px,stroke-dasharray: 3 3
    style Sub_Custom fill:none,stroke:#888888,stroke-width:1px,stroke-dasharray: 3 3
    
    linkStyle default stroke:#888888,stroke-width:2px;

5.2 Building Your First Custom Block

Let's build a weather-query Block that calls an external weather API and returns a formatted result.

Step 1: Create the Block File

# backend/backend/blocks/custom/weather_block.py

from typing import Any
import httpx
from backend.blocks.base import Block, BlockCategory, BlockInput, BlockOutput

class WeatherBlock(Block):
    """
    Weather query Block.
    Looks up current weather conditions by city name.
    """
    
    # === Metadata ===
    id = "weather_query"                    # Globally unique ID
    name = "Weather Query"                  # Display name
    description = "Looks up current weather conditions by city name"  # Description
    categories = [BlockCategory.TOOLS]      # Category: tools
    
    # === Input definition ===
    # Input parameters are described with JSON Schema
    input_schema = {
        "type": "object",
        "required": ["city"],
        "properties": {
            "city": {
                "type": "string",
                "title": "City name",
                "description": "For example: Beijing, Shanghai, London"
            },
            "units": {
                "type": "string",
                "title": "Temperature units",
                "enum": ["metric", "imperial"],
                "default": "metric",
                "description": "metric = Celsius, imperial = Fahrenheit"
            }
        }
    }
    
    # === Output definition ===
    output_schema = {
        "type": "object",
        "properties": {
            "temperature": {
                "type": "number",
                "title": "Current temperature"
            },
            "feels_like": {
                "type": "number",
                "title": "Feels-like temperature"
            },
            "humidity": {
                "type": "integer",
                "title": "Humidity (%)"
            },
            "description": {
                "type": "string",
                "title": "Weather description"
            },
            "wind_speed": {
                "type": "number",
                "title": "Wind speed (m/s)"
            },
            "city": {
                "type": "string",
                "title": "City name"
            },
            "country": {
                "type": "string",
                "title": "Country code"
            },
            "raw_response": {
                "type": "string",
                "title": "Raw response"
            }
        }
    }
    
    # === Execution logic ===
    async def execute(self, input_data: BlockInput) -> BlockOutput:
        """
        Run the weather query.
        """
        city = input_data.get("city")
        if not city:
            # Raise ValueError (not KeyError) so a missing city gives a clear message
            raise ValueError("The 'city' input is required")
        units = input_data.get("units", "metric")
        api_key = self.get_integration("openweathermap")  # Fetch the API key from the integration
        
        if not api_key:
            raise ValueError("An OpenWeatherMap API key must be configured")
        
        # Call the external API
        async with httpx.AsyncClient() as client:
            response = await client.get(
                "https://api.openweathermap.org/data/2.5/weather",
                params={
                    "q": city,
                    "appid": api_key,
                    "units": units,
                    "lang": "zh_cn"
                }
            )
            response.raise_for_status()
            data = response.json()
        
        # Parse and return the result
        return {
            "temperature": data["main"]["temp"],
            "feels_like": data["main"]["feels_like"],
            "humidity": data["main"]["humidity"],
            "description": data["weather"][0]["description"],
            "wind_speed": data["wind"]["speed"],
            "city": data["name"],
            "country": data["sys"]["country"],
            "raw_response": response.text  # Raw JSON text, matching the "string" type in output_schema
        }

Step 2: Register the Block

Register it in backend/backend/blocks/__init__.py:

# Import the custom Block
from backend.blocks.custom.weather_block import WeatherBlock

# Add it to the Block registry
__all__ = [
    # ... existing Blocks ...
    "WeatherBlock",
]

Step 3: Add the Integration Configuration

Add the API key to backend/.env:

OPENWEATHERMAP_API_KEY=your_api_key_here

Step 4: Configure the Integration Manager

# backend/backend/integrations/openweathermap.py
import httpx
from backend.integrations.base import Integration

class OpenWeatherMapIntegration(Integration):
    provider = "openweathermap"
    title = "OpenWeatherMap"
    description = "Weather data API"
    
    @classmethod
    async def validate_credentials(cls, credentials: dict) -> bool:
        api_key = credentials.get("api_key")
        # Send a test request to check that the key is valid
        async with httpx.AsyncClient() as client:
            response = await client.get(
                "https://api.openweathermap.org/data/2.5/weather",
                params={"q": "London", "appid": api_key}
            )
            return response.status_code == 200

5.3 Advanced Block Techniques

Handling Large File Output

When a Block outputs a large file (such as a generated image), use the Workspace:

import uuid

class ImageGenerationBlock(Block):
    async def execute(self, input_data):
        prompt = input_data["prompt"]
        
        # Generate the image with an AI model
        image_data = await generate_image(prompt)
        
        # Save to the Workspace and get back a reference path
        workspace_path = await self.save_to_workspace(
            filename=f"generated_{uuid.uuid4()}.png",
            data=image_data,
            mime_type="image/png"
        )
        
        # for_block_output adapts the value to the receiving context
        from autogpt_libs.utils import for_block_output
        return {
            "image_url": for_block_output("image_url", workspace_path),
            "prompt_used": prompt
        }

Streaming Output Block

class StreamingLLMBlock(Block):
    """LLM Block with streaming output"""
    
    async def execute(self, input_data):
        prompt = input_data["prompt"]
        model = input_data.get("model", "gpt-4o")
        
        # Use the streaming API
        async for chunk in self.stream_llm(prompt, model):
            # Each yield emits one chunk downstream
            yield {"chunk": chunk, "is_final": False}
        
        # Emit a final marker so consumers know the stream has ended
        yield {"chunk": "", "is_final": True}

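Conditional Branch Block

class ConditionBlock(Block):
    """Choose an execution path based on a condition"""
    
    input_schema = {
        "type": "object",
        "properties": {
            "value": {"type": "number"},
            "operator": {
                "type": "string",
                "enum": [">", "<", ">=", "<=", "==", "!="]
            },
            "threshold": {"type": "number"}
        }
    }
    
    # Multiple output ports
    outputs = {
        "true": "Runs when the condition holds",
        "false": "Runs when the condition does not hold"
    }
    
    async def execute(self, input_data):
        import operator as op  # Aliased because "operator" is also a local name below
        
        # Map each allowed operator string to a function; never eval() user input
        comparisons = {
            ">": op.gt, "<": op.lt, ">=": op.ge,
            "<=": op.le, "==": op.eq, "!=": op.ne,
        }
        value = input_data["value"]
        operator = input_data["operator"]
        threshold = input_data["threshold"]
        
        if operator not in comparisons:
            raise ValueError(f"Unsupported operator: {operator}")
        result = comparisons[operator](value, threshold)
        
        if result:
            return {"true": value, "false": None}
        else:
            return {"true": None, "false": value}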

5.4 Testing Blocks

AutoGPT Platform recommends test-driven development (TDD):

# tests/blocks/test_weather_block.py
import pytest
from backend.blocks.custom.weather_block import WeatherBlock

@pytest.mark.asyncio
async def test_weather_block_success():
    block = WeatherBlock()
    
    # Sample input (this test calls the live API and needs a configured key)
    result = await block.execute({
        "city": "London",
        "units": "metric"
    })
    
    # Check the output
    assert result["city"] == "London"
    assert isinstance(result["temperature"], (int, float))
    assert -50 < result["temperature"] < 60  # Plausible temperature range
    assert isinstance(result["description"], str)
    assert len(result["description"]) > 0

@pytest.mark.asyncio
async def test_weather_block_missing_city():
    block = WeatherBlock()
    
    # A missing required parameter should raise an error
    with pytest.raises(ValueError):
        await block.execute({"units": "metric"})

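Chapter 6: Building a Complete Agent

6.1 Scenario: An Automated Social Media Content Assistant

Build an Agent that does the following:

Fetches trending topics from Reddit
Analyzes the trends with AI
Generates social media posts
Publishes them automatically to Twitter/LinkedIn

Workflow Design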

graph LR
    %% 样式定义 (双模式自适应：高饱和度底色 + 纯白文字)
    classDef trigger fill:#107c41,stroke:#0b5930,stroke-width:2px,color:#ffffff;
    classDef ingest fill:#0078d4,stroke:#005a9e,stroke-width:1px,color:#ffffff;
    classDef ai fill:#7a24db,stroke:#5c1baf,stroke-width:2px,color:#ffffff;
    classDef publish fill:#d83b01,stroke:#a82e00,stroke-width:1px,color:#ffffff;
    classDef log fill:#666666,stroke:#444444,stroke-width:1px,color:#ffffff;

    %% Stage 1: data source input
    subgraph Phase_Ingest["1. Data input"]
        Timer["⏱️ Scheduled trigger <br> (Cron / Schedule)"]:::trigger
        Scrape["🕸️ Reddit fetch <br> (Web Scraper / API)"]:::ingest
    end

    %% Stage 2: AI processing
    subgraph Phase_AI["2. AI processing"]
        Analyze["🧠 AI trend analysis <br> (Trend Analysis)"]:::ai
        Generate["✍️ Post generation <br> (LLM Content Gen)"]:::ai
    end

    %% Stage 3: multi-channel publishing and audit
    subgraph Phase_Out["3. Publishing and logging"]
        Twitter["🐦 Publish to Twitter"]:::publish
        LinkedIn["💼 Publish to LinkedIn"]:::publish
        Log["📝 Record execution log"]:::log
    end

    %% Flow connections (thick lines for the main path, standard lines for fan-out)
    Timer ===> Scrape
    Scrape ===> Analyze
    Analyze ===> Generate
    
    %% One-to-many asynchronous fan-out
    Generate ---> Twitter
    Generate ---> LinkedIn
    
    %% Converge on the log
    Twitter ---> Log
    LinkedIn ---> Log

    %% 容器边框透明自适应（无背景色干扰，完美两栖）
    style Phase_Ingest fill:none,stroke:#888888,stroke-width:1px,stroke-dasharray: 3 3
    style Phase_AI fill:none,stroke:#888888,stroke-width:1px,stroke-dasharray: 3 3
    style Phase_Out fill:none,stroke:#888888,stroke-width:1px,stroke-dasharray: 3 3
    
    %% 全局连线颜色自适应
    linkStyle default stroke:#888888,stroke-width:2px;

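Configuring in the Agent Builder

In the AutoGPT Platform frontend:

Create a new Agent: click "New Agent"
Add a trigger node: choose the Schedule Block and set it to run every 6 hours
Add the Reddit Block: set subreddit = "technology", limit = 10
Add the LLM Block:

System prompt:
You are a social media marketing expert. Analyze the following hot Reddit posts,
pick the 3 topics most worth discussing, and write one promotional post for each.
Requirements: lively tone, include emoji, no more than 280 characters.

Add the Twitter/LinkedIn Blocks: configure credentials
Add the Log Block: record the run results in the database
Connect all the nodes to form a DAG

Exported Configuration (JSON)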

{
  "name": "Automated social media operations",
  "trigger": {
    "type": "schedule",
    "config": {
      "interval": "0 */6 * * *",
      "timezone": "Asia/Shanghai"
    }
  },
  "nodes": [
    {
      "id": "reddit",
      "block": "reddit_fetch",
      "config": {
        "subreddit": "technology",
        "sort": "hot",
        "limit": 10
      }
    },
    {
      "id": "analyze",
      "block": "llm_call",
      "config": {
        "model": "gpt-4o",
        "temperature": 0.7,
        "system_prompt": "... (same as above) ..."
      }
    },
    {
      "id": "twitter",
      "block": "twitter_post",
      "config": {
        "integration": "twitter_oauth"
      }
    },
    {
      "id": "linkedin",
      "block": "linkedin_post",
      "config": {
        "integration": "linkedin_oauth"
      }
    },
    {
      "id": "logger",
      "block": "database_log",
      "config": {
        "collection": "social_media_logs"
      }
    }
  ],
  "edges": [
    {"from": "trigger", "to": "reddit"},
    {"from": "reddit", "to": "analyze"},
    {"from": "analyze", "to": "twitter"},
    {"from": "analyze", "to": "linkedin"},
    {"from": "twitter", "to": "logger"},
    {"from": "linkedin", "to": "logger"}
  ]
}
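
Because the export refers to nodes by id in its "edges" list, a quick structural check before importing it can catch typos early. The sketch below reports edges whose endpoints are not defined. It treats "trigger" as a valid endpoint whenever the export has a top-level "trigger" section, which is an assumption read off the JSON above; dangling_edges is a hypothetical helper, not a platform API.

```python
def dangling_edges(workflow: dict) -> list:
    """Return the edges that reference an id not defined in the workflow."""
    known = {node["id"] for node in workflow.get("nodes", [])}
    # Assumption: the top-level trigger is addressed by the id "trigger".
    if "trigger" in workflow:
        known.add("trigger")
    return [
        edge
        for edge in workflow.get("edges", [])
        if edge["from"] not in known or edge["to"] not in known
    ]


if __name__ == "__main__":
    sample = {
        "trigger": {"type": "schedule"},
        "nodes": [{"id": "reddit"}, {"id": "analyze"}],
        "edges": [
            {"from": "trigger", "to": "reddit"},
            {"from": "reddit", "to": "analyse"},  # typo on purpose
        ],
    }
    print(dangling_edges(sample))
```

An empty result means every edge connects two known nodes; it does not prove the graph is acyclic, which needs a separate topological check.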

6.2 Managing Agents Through the REST API

AutoGPT Platform provides a full REST API:

# Set the API base URL
BASE_URL="http://localhost:8000/api"
TOKEN="your_jwt_token"

# 1. Create an Agent ("workflow" takes the Workflow JSON shown above; it is elided here)
curl -X POST "$BASE_URL/agents" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My first Agent",
    "description": "Generates a daily digest automatically",
    "workflow": { ... }
  }'

# 2. Start the Agent
curl -X POST "$BASE_URL/agents/agent_xxx/start" \
  -H "Authorization: Bearer $TOKEN"

# 3. Check run status
curl -X GET "$BASE_URL/agents/agent_xxx/runs" \
  -H "Authorization: Bearer $TOKEN"

# 4. View run logs
curl -X GET "$BASE_URL/agents/agent_xxx/runs/run_yyy/logs" \
  -H "Authorization: Bearer $TOKEN"

# 5. Stop the Agent
curl -X POST "$BASE_URL/agents/agent_xxx/stop" \
  -H "Authorization: Bearer $TOKEN"

Python SDK Example

# Manage Agents from Python
import httpx
import asyncio

class AutoGPTClient:
    def __init__(self, base_url: str, token: str):
        self.client = httpx.AsyncClient(
            base_url=base_url,
            headers={"Authorization": f"Bearer {token}"}
        )
    
    async def create_agent(self, name: str, workflow: dict):
        response = await self.client.post("/api/agents", json={
            "name": name,
            "workflow": workflow
        })
        return response.json()
    
    async def start_agent(self, agent_id: str):
        response = await self.client.post(f"/api/agents/{agent_id}/start")
        return response.json()
    
    async def wait_for_completion(self, agent_id: str, run_id: str, timeout=300):
        # get_running_loop() is the supported way to reach the loop from inside a coroutine
        loop = asyncio.get_running_loop()
        start = loop.time()
        while True:
            response = await self.client.get(
                f"/api/agents/{agent_id}/runs/{run_id}"
            )
            data = response.json()
            if data["status"] in ["completed", "failed"]:
                return data
            if loop.time() - start > timeout:
                raise TimeoutError("Agent run timed out")
            await asyncio.sleep(2)
    
    async def close(self):
        await self.client.aclose()

# Usage example
async def main():
    client = AutoGPTClient(
        base_url="http://localhost:8000",
        token="your_jwt_token"
    )
    
    # Create the Agent
    agent = await client.create_agent(
        name="Daily news digest",
        workflow={...}  # Workflow JSON
    )
    
    # Start it and wait for the run to finish
    run = await client.start_agent(agent["id"])
    result = await client.wait_for_completion(
        agent["id"], run["id"]
    )
    print(f"Result: {result}")
    
    await client.close()

asyncio.run(main())

Chapter 7: Advanced Development with AutoGPT Classic

7.1 Building a Custom Agent with Forge

Forge provides complete scaffolding for Agent development.

Creating an Agent Project

cd AutoGPT/classic
./run agent create my_custom_agent

Agent Skeleton

# classic/forge/forge/components/agent.py

from typing import Optional
from forge.agent.protocols import Command
from forge.llm import LLMResponse
from forge.memory import Memory, VectorMemory
from forge.planning import TaskPlanner

class CustomAgent:
    """Custom Agent"""
    
    def __init__(self, ai_profile: str, goals: list[str], 
                 llm_provider=None, memory: Optional[Memory] = None):
        self.ai_profile = ai_profile
        self.goals = goals
        self.llm = llm_provider or self._default_llm()
        self.memory = memory or VectorMemory()
        self.planner = TaskPlanner()
        self.commands = self._register_commands()
    
    def _register_commands(self) -> dict[str, Command]:
        """Register the commands available to the Agent"""
        return {
            "web_search": Command(
                name="web_search",
                description="Search the internet",
                method=self.web_search,
                parameters={
                    "query": "Search keywords",
                    "num_results": "Number of results to return (default 5)"
                }
            ),
            "write_file": Command(
                name="write_file",
                description="Write a file",
                method=self.write_file,
                parameters={
                    "file_path": "File path",
                    "content": "File content"
                }
            ),
            "read_file": Command(
                name="read_file",
                description="Read a file",
                method=self.read_file,
                parameters={
                    "file_path": "File path"
                }
            ),
            "execute_python": Command(
                name="execute_python",
                description="Execute Python code",
                method=self.execute_python,
                parameters={
                    "code": "Python code to execute"
                }
            ),
        }
    
    async def execute(self, user_input: str) -> str:
        """Handle a user request"""
        context = self._build_context(user_input)
        
        while True:
            # 1. Let the LLM analyze the current state
            response: LLMResponse = await self.llm.analyze(
                system_prompt=self._system_prompt(),
                context=context
            )
            
            # 2. Check whether the task is finished
            if response.command is None or response.command.name == "finish":
                break
            
            # 3. Run the command
            command = self.commands.get(response.command.name)
            if command:
                result = await command.method(**response.command.args)
                context.append(f"Ran {response.command.name}: {result}")
                
                # 4. Store in memory
                self.memory.store({
                    "thought": response.thoughts.text,
                    "action": response.command.name,
                    "result": result
                })
            else:
                context.append(f"Unknown command: {response.command.name}")
            
            # 5. Stop once the context has grown past 50 entries (a rough iteration cap)
            if len(context) > 50:
                return "Iteration limit exceeded; stopping"
        
        return context[-1] if context else "Task complete"

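7.2 Agent Protocol and Communication

Classic Agents support the Agent Protocol, a standardized interface for talking to Agents:

# Agent Protocol standard endpoints
POST /agent/tasks          # Create a task
GET  /agent/tasks/{id}     # Get task status
POST /agent/tasks/{id}/steps  # Execute the next step
GET  /agent/tasks/{id}/steps  # List steps

The benefit of the protocol is that the frontend (UI), the Benchmark tool, and the CLI all talk to the Agent through the same interface, which decouples the components from one another.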
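
To show what a client of these endpoints looks like, the sketch below builds the two POST requests with the standard library and leaves sending them to the caller. The paths are the ones listed above. The request body shape ({"input": ...}) and both function names are assumptions made for illustration; check them against the Agent Protocol specification before relying on them.

```python
import json
from urllib.request import Request


def create_task_request(base_url: str, task_input: str) -> Request:
    """Build (but do not send) the request that creates a task."""
    # Assumed body shape: a JSON object with an "input" field.
    body = json.dumps({"input": task_input}).encode("utf-8")
    return Request(
        f"{base_url}/agent/tasks",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def next_step_request(base_url: str, task_id: str) -> Request:
    """Build (but do not send) the request that executes the next step."""
    return Request(
        f"{base_url}/agent/tasks/{task_id}/steps",
        data=b"{}",
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

Passing either object to urllib.request.urlopen sends it. Since a UI, a benchmark harness, and a CLI can all produce these same requests, swapping one Agent implementation for another requires no client changes.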

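Chapter 8: Benchmark Evaluation

8.1 Why Is a Benchmark Needed?

AutoGPT ships with a Benchmark evaluation system that is used to:

Objectively evaluate Agent performance
Compare how different LLMs perform as the Agent's "brain"
Run regression tests during development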

8.2 Running the Benchmark

cd AutoGPT/classic

# Run every test
./run benchmark

# Run one category
./run benchmark --category "information_retrieval"

# Run a single test
./run benchmark --test "web_search_basic"

# Evaluate a custom Agent
./run benchmark --agent-path ./my_agent

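Benchmark Test Categories

Category	Description	Example tests
information_retrieval	Information retrieval	Searching a specific web page, extracting data
code_generation	Code generation	Writing functions, debugging code
reasoning	Reasoning	Logical reasoning, arithmetic
memory	Memory	Long-text comprehension, multi-turn dialogue
safety	Safety	Refusing harmful requests, protecting privacy
planning	Planning	Decomposing multi-step tasks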

8.3 Writing Custom Tests

# tests/custom/my_test.py
from benchmark import BaseTest, TestResult

class WebSearchAccuracyTest(BaseTest):
    name = "web_search_accuracy"
    category = "information_retrieval"
    description = "Checks the accuracy of search results"
    
    async def run(self, agent) -> TestResult:
        query = "Who won the 2024 Nobel Prize in Physics?"
        
        result = await agent.execute(query)
        
        # Verify that the answer contains the key facts
        checks = {
            # Matches both "John Hopfield" and "John J. Hopfield"
            "contains_winner": "Hopfield" in result,
            "contains_year": "2024" in result,
            # Rough sanity check only: confirms the answer is on topic,
            # it does not actually detect hallucinated content
            "no_hallucination": "Nobel" in result
        }
        
        # Percentage of checks that passed
        score = sum(checks.values()) / len(checks) * 100
        
        return TestResult(
            test_name=self.name,
            passed=score >= 80,  # with three checks, all three must pass
            score=score,
            details=checks
        )
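
The scoring arithmetic in the test above is easy to misread, so here is a small standalone sketch of it. `score_checks` is a hypothetical helper written for this example and is not part of the benchmark package. It shows that with three equally weighted checks and an 80 percent threshold, a single failed check (a score of roughly 66.7) fails the whole test.

```python
from typing import Dict, Tuple


def score_checks(checks: Dict[str, bool], pass_threshold: float = 80.0) -> Tuple[float, bool]:
    """Return (score in percent, passed) for a dict of named boolean checks."""
    if not checks:
        raise ValueError("at least one check is required")
    # True counts as 1 and False as 0, so sum() is the number of passed checks.
    score = sum(checks.values()) / len(checks) * 100
    return score, score >= pass_threshold


all_ok = {"contains_winner": True, "contains_year": True, "no_hallucination": True}
one_miss = {**all_ok, "contains_winner": False}

score, passed = score_checks(all_ok)
print(round(score, 1), passed)

score, passed = score_checks(one_miss)
print(round(score, 1), passed)
```

If partial credit should be enough to pass, either lower the threshold or add more checks so that one miss costs less than 20 percentage points.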

Chapter 9: Production Deployment and Operations

9.1 Production Architecture

# docker-compose.prod.yml
version: '3.8'

services:
  traefik:
    image: traefik:v3.0
    command:
      - "--providers.docker=true"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.letsencrypt.acme.tlschallenge=true"
    ports:
      - "443:443"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock"
  
  backend:
    build: ./backend
    environment:
      - ENVIRONMENT=production
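      # Credentials must match the db service below; DB_PASSWORD comes from the shell environment or an .env file
      - DATABASE_URL=postgresql://postgres:${DB_PASSWORD}@db:5432/autogpt
    deploy:
      replicas: 3
      resources:
        limits:
          memory: 2G
  
  frontend:
    build: ./frontend
    deploy:
      replicas: 2
  
  db:
    image: postgres:15
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: autogpt  # create the database that DATABASE_URL refers to
      POSTGRES_PASSWORD: ${DB_PASSWORD}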
  
  redis:
    image: redis:7-alpine
  
  rabbitmq:
    image: rabbitmq:3-management-alpine
  
  minio:
    image: minio/minio
    command: server /data  # the image needs an explicit server command and data path
    volumes:
      - miniodata:/data
  
  clamav:
    image: clamav/clamav:latest

volumes:
  pgdata:
  miniodata:

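9.2 Monitoring and Logging

# Follow the Agent execution logs
docker compose logs -f backend | grep "agent_run"

# Monitor with Prometheus + Grafana
# The backend exposes a /metrics endpoint

Key Monitoring Metrics

Metric	Description	Alert threshold
`agent_runs_total`	Total number of Agent runs	-
`agent_run_duration_seconds`	Run duration	> 300s
`block_execution_errors`	Number of Block execution errors	Error rate > 5%
`queue_depth`	Task queue depth	> 100
`api_response_time`	API response time	> 2s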
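
The alert thresholds in the table can be expressed as a small check. This is an illustrative sketch in plain Python rather than a Prometheus alerting rule: the threshold values come from the table, while `block_error_rate`, `api_response_time_seconds`, `error_rate`, and `breached` are hypothetical names introduced here. Since the error metric is a count, the sketch derives a rate from it before comparing against 5%.

```python
from typing import Dict, List

# Thresholds copied from the table above. The keys name derived values
# used by this sketch; they are not metric names emitted by the backend.
THRESHOLDS: Dict[str, float] = {
    "agent_run_duration_seconds": 300.0,
    "block_error_rate": 0.05,
    "queue_depth": 100.0,
    "api_response_time_seconds": 2.0,
}


def error_rate(errors: int, executions: int) -> float:
    """Fraction of Block executions that failed (0.0 when nothing ran)."""
    return errors / executions if executions else 0.0


def breached(sample: Dict[str, float]) -> List[str]:
    """Sorted names of the metrics in `sample` that exceed their threshold."""
    return sorted(
        name
        for name, limit in THRESHOLDS.items()
        if sample.get(name, 0.0) > limit
    )


sample = {
    "agent_run_duration_seconds": 412.0,
    "block_error_rate": error_rate(errors=3, executions=200),
    "queue_depth": 100.0,  # exactly at the limit, so not breached
    "api_response_time_seconds": 0.8,
}
print(breached(sample))  # ['agent_run_duration_seconds']
```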

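9.3 Security Best Practices

API key management: use the Integration Manager; never hardcode keys
File scanning: all uploaded files are scanned by ClamAV
Cache protection: caching is disabled by default on sensitive endpoints
Rate limiting: configure throttling at the API gateway
Audit logging: record every Agent execution
Network isolation: isolate services with Docker networks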
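
To make the rate-limiting item concrete, here is a minimal token-bucket sketch. It illustrates the general technique only; it is not the platform's gateway configuration, and `TokenBucket` and its parameters are names chosen for this example.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(
        self,
        rate: float,
        capacity: int,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def allow(self) -> bool:
        """Consume one token if available and report whether the request may proceed."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=5.0, capacity=10)
print(bucket.allow())  # True: the bucket starts full
```

The injectable `clock` keeps the limiter deterministic under test. In a multi-replica deployment the counters would need to live in shared storage such as Redis rather than in process memory.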

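Chapter 10: Common Problems and Solutions

10.1 Platform Deployment Issues

Q: A port is already in use at startup?

# Find the process that is using the port
netstat -ano | findstr :3000  # Windows
lsof -i :3000                 # Linux/macOS

# Then change the port mapping in docker-compose.yml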
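
When a cross-platform check is more convenient than `netstat` or `lsof`, the same question can be answered from Python. This is a small sketch using only the standard library; `port_in_use` is a hypothetical helper name, and port 3000 is taken from the commands above.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds, i.e. a listener exists.
        return sock.connect_ex((host, port)) == 0


print(port_in_use(3000))
```

This only reports whether a listener is present. Identifying the owning process still requires the operating-system tools shown above.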

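Q: Database migration fails?

# Reset the database (destructive: this drops all existing data)
poetry run prisma migrate reset --force

# Re-run migrations (development only; use `prisma migrate deploy` in production)
poetry run prisma migrate dev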

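10.2 Block Development Issues

Q: A Block does not show up in the UI?

Confirm the Block is registered in __init__.py
Confirm block.id is globally unique
Restart the backend service

Q: A Block execution times out?

Check that execute() uses non-blocking asyncio calls rather than blocking ones
Increase the Block's timeout setting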
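
The two timeout suggestions above can be illustrated with plain asyncio. The sketch below does not use the Block API: `blocking_work` and `run_with_timeout` are hypothetical stand-ins showing how a blocking call can be moved off the event loop with `asyncio.to_thread` (Python 3.9+) and bounded with `asyncio.wait_for`.

```python
import asyncio
import time


def blocking_work(seconds: float) -> str:
    """Stand-in for a blocking call, such as a synchronous SDK or file parser."""
    time.sleep(seconds)
    return "done"


async def run_with_timeout(seconds: float, timeout: float) -> str:
    try:
        # to_thread keeps the event loop responsive while the blocking call runs;
        # wait_for bounds how long the caller waits for the result.
        return await asyncio.wait_for(
            asyncio.to_thread(blocking_work, seconds), timeout
        )
    except asyncio.TimeoutError:
        # The worker thread itself cannot be cancelled and runs to completion.
        return "timed out"


print(asyncio.run(run_with_timeout(0.01, timeout=1.0)))  # done
print(asyncio.run(run_with_timeout(0.5, timeout=0.05)))  # timed out
```

Calling `time.sleep` or another blocking function directly inside a coroutine would stall every other task on the same event loop, which is a common cause of timeouts that look unrelated to the Block that triggers them.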

10.3 Classic Issues

Q: The Agent keeps looping and never stops?

# Cap the number of iterations in continuous mode
python -m autogpt --continuous --continuous-limit 50

Q: LLM token usage is too high?

# Reduce the context size
agent = CustomAgent(
    max_context_length=4000,  # cap on the number of tokens
    summary_on_limit=True     # summarize automatically once the cap is hit
)
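
One way a limit like `max_context_length` combined with `summary_on_limit` could behave is sketched below. This is an assumption about the behavior, not the Classic Agent's implementation: `count_tokens` is a crude whitespace-based estimate, and `trim_context` and the `summarize` callback are hypothetical.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    """Crude token estimate: whitespace-separated words. Real tokenizers differ."""
    return len(text.split())


def trim_context(
    messages: List[str],
    max_tokens: int,
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Keep the newest messages that fit in max_tokens and summarize the rest."""
    kept: List[str] = []
    used = 0
    # Walk from newest to oldest so the most recent messages survive.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()
    dropped = messages[: len(messages) - len(kept)]
    # The summary line is not counted against the budget in this sketch.
    return ([summarize(dropped)] if dropped else []) + kept


history = ["read file a.txt", "file has ten lines", "wrote summary to b.txt"]
trimmed = trim_context(
    history,
    max_tokens=8,
    summarize=lambda old: f"[{len(old)} earlier messages omitted]",
)
print(trimmed)
```

In practice the `summarize` callback would itself call an LLM, so the savings only pay off when the dropped history is much longer than its summary.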

Appendix

A. Resource Links

Resource	Link
GitHub repository	https://github.com/Significant-Gravitas/AutoGPT
Official documentation	https://docs.agpt.co
Discord community	https://discord.gg/autogpt
Official website	https://agpt.co

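B. Supported Models

Provider	Models	Notes
OpenAI	GPT-4o, GPT-4o-mini, GPT-4-turbo	Recommended default
Anthropic	Claude 3.5 Sonnet, Claude 3 Haiku	Strong long-context handling
Meta	Llama 3.1 8B/70B/405B	Open-source models
Google	Gemini 1.5 Pro/Flash	Multimodal
Mistral	Mistral Large, Mixtral 8x7B	Efficient inference
Nous Research	Hermes 3	Instruction following
NVIDIA	Nemotron	Enterprise-grade
Vercel	v0	Frontend generation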

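C. Glossary

Term	Definition
Agent	An AI program that carries out tasks autonomously
Block	The smallest functional unit in the Platform
Workflow	A DAG that defines an Agent's behavior
DAG	Directed acyclic graph; the execution structure of a Workflow
Forge	The Agent development framework in Classic
Benchmark	The Agent performance evaluation tool
Workspace	The file storage workspace
Integration	Authenticated integration with a third-party service
Store	The Agent template marketplace
Agent Protocol	A standardized protocol for communicating with Agents
RLS	Row-level security policies (Supabase)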

D. Version History

Version	Date	Main changes
1.0	2023-03	Initial AutoGPT release
2.0	2024-06	Forge framework released
3.0	2024-12	AutoGPT Platform released
3.1+	2025	Ongoing updates; Block ecosystem expansion
