Anthropic Announces Its PTC Breakthrough; a Chinese Developer Had Built It a Year Earlier

2025-12-05 12:36:22　Source: Xinzhiyuan (新智元)

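Xinzhiyuan report

Editor: LRST

[Xinzhiyuan Digest] Anthropic has released Programmatic Tool Calling (PTC), which lets Claude orchestrate tool execution through code, cutting token consumption, reducing latency, and improving accuracy. The Chinese-developed minion framework, however, has used a similar architecture from the start: the LLM plans and decides, a code environment executes, and only the final result is returned. Whereas PTC must be explicitly enabled, minion treats this as its base architecture and also supports the Python ecosystem, state management, error handling, and more, showing greater efficiency and flexibility in real-world use.

On November 24, 2025, Anthropic officially released Programmatic Tool Calling (PTC), which allows Claude to orchestrate tool execution through code rather than through individual API calls.

The feature is regarded as an important breakthrough for agent development: it can significantly reduce token consumption, cut latency, and improve accuracy.

However, the creator of the minion framework recently shared an interesting fact: minion has followed this architectural idea from the very beginning.

Code: https://github.com/femto/minion

Before the PTC concept was formally introduced, minion had already demonstrated the value of this approach in production.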

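What problems does PTC solve?

In its blog post, Anthropic identifies two core problems with traditional tool calling:

1. Context pollution

In the traditional approach, the result of every tool call is returned into the LLM's context. When analyzing a 10MB log file, for example, the entire file enters the context window even if the LLM only needs a summary of error frequencies.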
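The log-file case can be sketched in a few lines. This is a minimal illustration and not code from Anthropic or minion: the log line format, the `summarize_errors` helper, and the sample data are all assumptions made for the example. The point is that the raw lines stay in the execution environment and only the small summary dictionary would be handed back to the model.

```python
from collections import Counter


def summarize_errors(log_lines):
    """Count ERROR lines per component; only this summary leaves the sandbox."""
    counts = Counter()
    for line in log_lines:
        # Assumed line format: "<timestamp> <LEVEL> <component>: <message>"
        parts = line.split(maxsplit=3)
        if len(parts) >= 3 and parts[1] == "ERROR":
            counts[parts[2].rstrip(":")] += 1
    return dict(counts)


# Hypothetical sample data standing in for a much larger log file.
sample_log = [
    "2025-11-24T10:00:00 INFO api: request ok",
    "2025-11-24T10:00:01 ERROR db: connection reset",
    "2025-11-24T10:00:02 ERROR db: connection reset",
    "2025-11-24T10:00:03 ERROR auth: token expired",
]

summary = summarize_errors(sample_log)
print(summary)
```

However large the input grows, the value returned to the model is bounded by the number of distinct components, not by the size of the file.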

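2. Inference overhead and manual synthesis

Every tool call requires a full model inference pass. The LLM has to "eyeball" the data, extract the relevant information, reason about how the pieces fit together, and then decide on the next step. The process is both slow and error-prone.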

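Minion's solution

A natively PTC-style architecture

From the outset, the minion framework was designed around a fundamentally different architecture: the LLM focuses on planning and decision-making, and the actual execution is handed to a code environment.

A typical minion workflow:
1. The LLM analyzes the user's request and draws up an execution plan.
2. The LLM generates Python code to orchestrate the tool calls.
3. The code runs in an isolated environment and handles all data operations.
4. Only the final result is returned to the LLM.

This is exactly the effect PTC aims for, but minion makes it the base architecture rather than an optional feature.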
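The four steps of that workflow can be sketched as a small loop. This is a simplified illustration, not minion's actual implementation: `fake_llm_generate_code`, `execute_generated`, and `handle` are hypothetical names, the "LLM" is a stub that returns fixed code, and `exec()` is used only to keep the example short. It is not a sandbox, and a real system would need proper isolation.

```python
import asyncio


async def fake_llm_generate_code(task: str) -> str:
    """Stand-in for a real LLM call; returns orchestration code as text."""
    return (
        "async def run(tools):\n"
        "    team = await tools['get_team_members']('engineering')\n"
        "    return {'team_size': len(team)}\n"
    )


async def get_team_members(department: str):
    # Hypothetical tool returning canned data.
    return [{"id": i, "department": department} for i in range(20)]


async def execute_generated(code: str, tools: dict):
    # exec() is NOT a sandbox; it only illustrates the control flow.
    namespace = {}
    exec(code, namespace)
    return await namespace["run"](tools)


async def handle(task: str):
    # Steps 1-2: the model plans and emits orchestration code.
    code = await fake_llm_generate_code(task)
    # Step 3: the code runs outside the model and calls the tools itself.
    result = await execute_generated(code, {"get_team_members": get_team_members})
    # Step 4: only this small result would be passed back to the model.
    return result


final_result = asyncio.run(handle("How large is the engineering team?"))
print(final_result)
```

The twenty team records never appear in the value returned by `handle`; the model would see only the aggregate.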

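A real-world comparison

Consider the budget compliance check from Anthropic's blog post.

Task: find the team members whose Q3 travel spending exceeded budget

With traditional tool calling:

Fetch the team members → 20 people
Fetch Q3 expenses for each person → 20 tool calls, each returning 50-100 expense line items
Fetch the budget limit for each level
All of the data enters the context: 2,000+ expense records (50KB+)
The LLM manually sums each person's expenses, looks up the budget, and compares to find overspending

With PTC:

Claude writes a Python script that orchestrates the whole flow
The script runs in the Code Execution environment
The LLM sees only the final result: the 2-3 people who overspent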
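To make the difference in context size concrete, here is a minimal sketch using synthetic data. None of the numbers come from Anthropic's post beyond the shape of the scenario (20 people, up to 100 records each); the record fields, amounts, and limits are invented for illustration. It compares the size of the payload that would enter the context in each mode.

```python
import json

# Synthetic data: 20 members, 100 expense records each (upper end of the range above).
members = [{"id": i, "name": f"member-{i}", "limit": 5000} for i in range(20)]
expenses = {
    m["id"]: [{"amount": 60 if m["id"] < 2 else 40, "category": "travel"} for _ in range(100)]
    for m in members
}

# Traditional tool calling: every record is serialized into the context.
traditional_payload = json.dumps(expenses)

# Code orchestration: aggregate locally, return only those over the limit.
totals = {m["id"]: sum(e["amount"] for e in expenses[m["id"]]) for m in members}
exceeded = [
    {"name": m["name"], "spent": totals[m["id"]], "limit": m["limit"]}
    for m in members
    if totals[m["id"]] > m["limit"]
]
orchestrated_payload = json.dumps(exceeded)

record_count = sum(len(records) for records in expenses.values())
print(f"records: {record_count}")
print(f"traditional payload: {len(traditional_payload)} characters")
print(f"orchestrated payload: {len(orchestrated_payload)} characters")
```

Character counts are only a rough proxy for tokens, but the gap between the two payloads is the effect the comparison above describes.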

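In minion this pattern is the default behavior; the LLM generates code such as:

# Implementation in minion (pseudocode; the get_* tools are provided by the runtime)
import asyncio

async def check_budget_compliance():
    # Plan code generated by the LLM
    team = await get_team_members("engineering")

    # Fetch budgets and expenses concurrently
    levels = sorted({m["level"] for m in team})
    budget_list = await asyncio.gather(*(get_budget_by_level(lv) for lv in levels))
    budgets = dict(zip(levels, budget_list))
    all_expenses = await asyncio.gather(*(get_expenses(m["id"], "Q3") for m in team))

    # Data processing happens locally
    exceeded = []
    for member, expenses in zip(team, all_expenses):
        total = sum(e["amount"] for e in expenses)
        budget = budgets[member["level"]]
        if total > budget["travel_limit"]:
            exceeded.append({
                "name": member["name"],
                "spent": total,
                "limit": budget["travel_limit"]
            })
    return exceeded  # return only the key result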

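The key difference is that in minion this is the core design of the framework, and every complex task is handled this way,

whereas PTC must be explicitly enabled and comes with several architectural constraints:

You must explicitly mark which tools may be called programmatically (the allowed_callers setting)
Code runs in a restricted Claude container environment where arbitrary packages cannot be freely installed
Files must be uploaded through the separate Files API (with a 500MB limit per file)
Tools must return results before the container's 4.5-minute inactivity timeout
Web tools and MCP tools cannot be called programmatically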

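Minion's advantages

Going a step further

Minion not only implements the core idea of PTC but also offers further advantages:

The full Python ecosystem

The code execution environment in minion has full access to the Python ecosystem:

# minion can use any Python library directly
import pandas as pd
import numpy as np
from sklearn.cluster import KMeans

# Powerful data processing (expense_data: records with 'category' and 'amount')
df = pd.DataFrame(expense_data)
analysis = df.groupby("category")["amount"].agg(["sum", "mean", "std", "count"])

# More complex data science tasks
model = KMeans(n_clusters=3)
clusters = model.fit_predict(spending_patterns)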

State management and persistence

Minion natively supports complex state management:

class BudgetAnalyzer:
    def __init__(self):
        self.cache = {}
        self.history = []

    async def analyze_department(self, dept):
        # State persists across the whole analysis
        if dept in self.cache:
            return self.cache[dept]
        result = await self._deep_analysis(dept)
        self.cache[dept] = result
        self.history.append(result)
        return result

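Error handling and retry logic

Edge cases are handled explicitly in code:

import asyncio

# get_expenses, RateLimitError and DataNotFoundError are assumed to come from the tool layer
async def robust_fetch(user_id, max_retries=3):
    for attempt in range(max_retries):
        try:
            return await get_expenses(user_id, "Q3")
        except RateLimitError:
            await asyncio.sleep(2 ** attempt)  # exponential backoff
        except DataNotFoundError:
            return []  # a sensible default
    raise RuntimeError(f"Failed after {max_retries} attempts")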

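Parallel and asynchronous operations

Python's async capabilities are used to the full:

import asyncio

# Efficient parallel processing
async def analyze_all_departments():
    departments = ["eng", "sales", "marketing", "ops"]
    # Analyze all departments concurrently
    results = await asyncio.gather(*[
        analyze_department(dept)
        for dept in departments
    ])
    # Consolidate the analysis results
    return consolidate_results(results)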

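Performance comparison

According to Anthropic's internal testing, PTC brings significant improvements:

Token savings: complex research tasks dropped from 43,588 to 27,297 tokens (a 37% reduction)
Lower latency: multiple model inference round trips are eliminated
Higher accuracy:
- Internal knowledge retrieval: 25.6% → 28.5%
- GAIA benchmark: 46.5% → 51.2%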
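The figures above can be checked with simple arithmetic. The short sketch below only recomputes numbers already quoted; the helper name `relative_change` is introduced here for illustration.

```python
def relative_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a percentage."""
    return (after - before) / before * 100


# Token consumption on complex research tasks: 43,588 -> 27,297
token_reduction = -relative_change(43_588, 27_297)
print(f"Token reduction: {token_reduction:.1f}%")  # → Token reduction: 37.4%

# Accuracy gains, expressed in percentage points
knowledge_gain = 28.5 - 25.6
benchmark_gain = 51.2 - 46.5
print(f"Internal knowledge retrieval: +{knowledge_gain:.1f} points")
print(f"Benchmark: +{benchmark_gain:.1f} points")
```

Note that the accuracy improvements are absolute percentage-point gains, while the token figure is a relative reduction, so the two are not directly comparable.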

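In production use of minion, the author reports observing similar or even better metrics, because:

Fewer model calls: the LLM is involved only in the planning stage and the final summary
More efficient resource use: local data processing consumes no API tokens
More predictable performance: the code execution path is explicit, which reduces the LLM's nondeterminism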

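Architectural philosophy

Who should do what?

Minion's design rests on one core belief:

LLMs are good at understanding, planning, and reasoning; Python is good at executing, processing, and transforming.

This separation of responsibilities yields a clean architecture:

User request
    ↓
[LLM: understand intent, draw up a plan]
    ↓
[Generate Python code]
    ↓
[Code execution environment: call tools, process data, control flow]
    ↓
[Return structured result]
    ↓
[LLM: interpret the result, produce a user-friendly response]

This is not merely an optimization; it is a rethink at the architectural level.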

Tool Search Tool

Dynamic tool discovery in minion

Another new Anthropic feature is the Tool Search Tool, which addresses the context cost of large tool libraries. Minion has corresponding mechanisms:

Tiered tool exposure

# minion's tool tiering strategy
class MinionToolRegistry:
    def __init__(self):
        self.core_tools = []      # always loaded
        self.domain_tools = {}    # loaded on demand
        self.rare_tools = {}      # found via search

    def get_tools_for_task(self, task_description):
        # Smart tool selection
        tools = self.core_tools.copy()
        # Add relevant tools based on the task description
        if "database" in task_description:
            tools.extend(self.domain_tools.get("database", []))
        if "visualization" in task_description:
            tools.extend(self.domain_tools.get("plotting", []))
        return tools

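Tool discovery via vector search

# Tool search using embeddings
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

class SemanticToolSearch:
    def __init__(self, tool_descriptions):
        self.tool_descriptions = list(tool_descriptions)
        self.model = SentenceTransformer("all-MiniLM-L6-v2")
        self.tool_embeddings = self.model.encode(self.tool_descriptions)

    def find_tools(self, query, top_k=5):
        query_embedding = self.model.encode([query])
        similarities = cosine_similarity(query_embedding, self.tool_embeddings)[0]
        return self.get_top_tools(similarities, top_k)

    def get_top_tools(self, similarities, top_k):
        # Indices of the top_k highest similarity scores, best first
        top = np.argsort(similarities)[::-1][:top_k]
        return [self.tool_descriptions[i] for i in top]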

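Real-world applications

Minion in production

The minion framework has demonstrated the value of this architecture in several real-world scenarios:

Case 1: Large-scale data analysis

A fintech company uses minion to analyze millions of transaction records in search of anomalous patterns:

import pandas as pd

async def detect_anomalies(start_date, end_date):
    # LLM plan: fetch data, clean it, engineer features, detect anomalies
    # The executing code works on the large dataset directly
    transactions = await fetch_all_transactions(start_date, end_date)
    # 1M+ records, none of which enter the LLM context
    df = pd.DataFrame(transactions)
    df = clean_data(df)
    features = engineer_features(df)

    # Detect anomalies with machine learning
    anomalies = detect_with_isolation_forest(features)

    # Return only a summary of the anomalies to the LLM
    return {
        "total_transactions": len(df),
        "anomalies_found": len(anomalies),
        "top_anomalies": anomalies.head(10).to_dict("records")
    }

Results:

1 million records processed
The LLM consumed only ~5K tokens (versus 500K+ with the traditional approach)
End-to-end latency: 30 seconds (versus 5+ minutes with the traditional approach)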

Case 2: Multi-source data integration

A SaaS company uses minion to integrate customer data from multiple APIs:

import asyncio

async def comprehensive_customer_analysis(customer_id):
    # Fetch all data sources concurrently
    crm_data, support_tickets, usage_logs, billing_history = await asyncio.gather(
        fetch_crm_data(customer_id),
        fetch_support_tickets(customer_id),
        fetch_usage_logs(customer_id),
        fetch_billing_history(customer_id)
    )

    # Merge and analyze the data locally
    customer_profile = {
        "health_score": calculate_health_score(...),
        "churn_risk": predict_churn_risk(...),
        "upsell_opportunities": identify_opportunities(...),
        "support_sentiment": analyze_ticket_sentiment(support_tickets)
    }
    return customer_profile

Case 3: Automated workflows

A DevOps team uses minion to automate a complex deployment process:

import asyncio

async def deploy_with_validation():
    # Multi-step workflow with conditional logic at every step
    # 1. Run the tests
    test_results = await run_test_suite()
    if test_results.failed > 0:
        return {"status": "blocked", "reason": "tests failed"}

    # 2. Build and push the image
    image = await build_docker_image()
    await push_to_registry(image)

    # 3. Canary deployment
    canary = await deploy_canary(image, percentage=10)
    await asyncio.sleep(300)  # monitor for 5 minutes
    metrics = await get_canary_metrics(canary)
    if metrics.error_rate > 0.01:
        await rollback_canary(canary)
        return {"status": "rolled_back", "metrics": metrics}

    # 4. Full deployment
    await deploy_full(image)
    return {"status": "success", "image": image.tag}

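Beyond PTC

Future directions for minion

PTC is an important step forward, but minion's architecture lets us explore further possibilities:

Hybrid reasoning mode

Switch intelligently within a single session:

# Simple task: direct tool call
if task.complexity < THRESHOLD:
    result = await simple_tool_call(task)
# Complex task: generate orchestration code
else:
    orchestration_code = await llm.generate_code(task)
    result = await execute_code(orchestration_code)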

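Incremental computation and caching

Reuse intermediate results intelligently:

# Memoized data fetching
# (functools.lru_cache would cache the coroutine object, which can be awaited
# only once, so results are stored in a plain, unbounded dict instead)
_user_cache = {}

async def cached_get_user_data(user_id):
    if user_id not in _user_cache:
        _user_cache[user_id] = await fetch_user_data(user_id)
    return _user_cache[user_id]

# Incremental updates rather than full recomputation
async def update_analysis(new_data):
    previous_state = load_checkpoint()
    delta = compute_delta(previous_state, new_data)
    updated_state = apply_delta(previous_state, delta)
    return updated_state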

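Multi-model collaboration

Different models handle different stages:

# Planning with a strong model
plan = await claude_opus.create_plan(user_request)
# Code generation with a specialized model
code = await codegen_model.generate(plan)
# Execution and monitoring
result = await execute_with_monitoring(code)
# User interaction with a fast model
response = await claude_haiku.format_response(result)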

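The power of open source

Community-driven innovation

As an open-source project (300+ GitHub stars), minion benefits from community contributions and feedback. This openness brings:

Rapid iteration: the community surfaces problems and use cases, driving fast improvement
Diverse applications: users apply minion in scenarios we never imagined

PTC, by contrast, is powerful but:

Requires explicit configuration (allowed_callers, defer_loading, and so on)
Depends on specific API versions and beta features
Is tightly coupled to the Claude ecosystem

Minion is designed to be provider-agnostic: you can use any LLM backend (Claude, GPT-4, open-source models) and the architectural advantages still hold.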
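One common way to achieve this kind of provider independence is to have the orchestration layer depend on a small interface rather than on a vendor SDK. The sketch below is a generic illustration of that idea, not minion's actual API: `LLMBackend`, `EchoBackend`, and `plan_task` are hypothetical names, and the toy backend returns a formatted string instead of calling a model.

```python
from typing import Protocol


class LLMBackend(Protocol):
    """Minimal interface the orchestration layer relies on (assumed for this sketch)."""

    def complete(self, prompt: str) -> str: ...


class EchoBackend:
    """Toy backend for illustration; a real one would call a model API."""

    def __init__(self, name: str):
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] plan for: {prompt}"


def plan_task(backend: LLMBackend, task: str) -> str:
    # The planner sees only the interface, so backends are interchangeable.
    return backend.complete(task)


plans = [plan_task(EchoBackend(name), "check Q3 budgets") for name in ("backend-a", "backend-b")]
for plan in plans:
    print(plan)
```

Swapping providers then means adding another class with a `complete` method; the planning and execution code stays unchanged.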

Technical details

Implementation comparison

A closer look at how the two are implemented.

How PTC is implemented

# Anthropic's PTC requires specific configuration
{
    "tools": [
        {
            "type": "code_execution_20250825",
            "name": "code_execution"
        },
        {
            "name": "get_team_members",
            "allowed_callers": ["code_execution_20250825"],
            ...
        }
    ]
}

# Claude generates the tool call
{
    "type": "server_tool_use",
    "id": "srvtoolu_abc",
    "name": "code_execution",
    "input": {
        "code": "team = get_team_members('engineering')\n..."
    }
}

How minion is implemented

# Tool definitions in minion are standard Python
class MinionTools:
    @tool
    async def get_team_members(self, department: str):
        """Get all members of a department"""
        return await self.db.query(...)

    @tool
    async def get_expenses(self, user_id: str, quarter: str):
        """Get expense records"""
        return await self.expenses_api.fetch(...)

# What the LLM generates is a complete Python function
async def analyze_budget():
    # Call the tool functions directly
    team = await tools.get_team_members("engineering")

    # The full power of the Python language
    expenses_by_user = {
        member.id: await tools.get_expenses(member.id, "Q3")
        for member in team
    }

    # Data processing of arbitrary complexity
    analysis = perform_complex_analysis(expenses_by_user)
    return analysis

Key differences:

PTC: tool calls go through a special API mechanism with a caller/callee relationship
Minion: tools are just ordinary Python async functions, and the LLM generates standard code
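The listing above uses a `@tool` decorator without showing its definition. One plausible way such a decorator could work is sketched below. This is an assumption for illustration, not minion's actual code: `TOOL_REGISTRY`, the decorator body, and the canned data are all hypothetical. The decorator records each async function in a registry so that its name and docstring can be shown to the model, while generated code calls the function like any other coroutine.

```python
import asyncio
import inspect

# Hypothetical registry mapping tool names to coroutine functions.
TOOL_REGISTRY = {}


def tool(func):
    """Register an async function as a tool, keyed by its name."""
    if not inspect.iscoroutinefunction(func):
        raise TypeError(f"{func.__name__} must be an async function")
    TOOL_REGISTRY[func.__name__] = func
    return func


@tool
async def get_team_members(department: str):
    """Get all members of a department (canned data for illustration)."""
    return [
        {"id": "u1", "department": department},
        {"id": "u2", "department": department},
    ]


async def generated_code():
    # Generated code awaits the tool directly; no special call protocol is involved.
    team = await get_team_members("engineering")
    return [member["id"] for member in team]


member_ids = asyncio.run(generated_code())
# Docstrings double as the tool descriptions that could be shown to the model.
tool_docs = {name: inspect.getdoc(fn) for name, fn in TOOL_REGISTRY.items()}
print(member_ids)
print(tool_docs)
```

Because the decorator returns the original function unchanged, the tools stay testable as plain Python, independent of any LLM.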

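Why does this architecture matter so much?

As AI agents move into production, the industry faces several core challenges:

Scale: millions of records have to be processed and cannot all be stuffed into the context

Reliability: production systems need deterministic error handling

Cost: token consumption directly affects commercial viability

Performance: user experience demands sub-second responses

The traditional one-call-at-a-time tool model hits a bottleneck on every one of these dimensions. The code orchestration model, whether PTC or minion, offers a way through:

Traditional:  LLM <-> Tool <-> LLM <-> Tool <-> LLM
              (slow)   (expensive)   (fragile)
Orchestrated: LLM -> [Code: Tool+Tool+Tool+Processing] -> LLM
              (fast)   (cheap)   (reliable)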
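The difference between the two patterns can be expressed as a count of model inference passes. The sketch below uses a deliberately simplified model that is an assumption of this example, not a measurement: in the traditional pattern each tool call is requested by its own inference pass (no batched or parallel tool calls), and in both patterns one final pass produces the answer.

```python
def inference_passes_traditional(tool_calls: int) -> int:
    """One pass to request each tool call, plus a final pass to answer."""
    return tool_calls + 1


def inference_passes_orchestrated(code_blocks: int = 1) -> int:
    """One pass to write each code block, plus a final pass to answer."""
    return code_blocks + 1


# Budget example: 1 team lookup + 20 expense lookups + 1 budget lookup (assumed breakdown).
tool_calls = 1 + 20 + 1
print(inference_passes_traditional(tool_calls))  # → 23
print(inference_passes_orchestrated())           # → 2
```

Under this model the number of passes grows linearly with the number of tool calls in the traditional pattern, but stays constant when a single code block orchestrates them.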

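A validated architecture: the release of PTC confirms that the architectural choice was right. This is not a speculative design but a conclusion an industry leader reached independently.
First-mover advantage: before PTC became an official feature, minion had already accumulated experience and best practices in production.
Broader applicability:

Support for multiple LLM backends (Claude, GPT-4, open-source models);
Flexible deployment options (cloud, on-premises, hybrid);
Rich integration with the Python ecosystem.

Community and ecosystem: 300+ stars represent not only recognition but also a potential user base and contributor community.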

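Conclusion

An inevitable architectural convergence

Anthropic's launch of PTC is no accident; it is where agent architecture is inevitably heading. When you need to build production-grade agents that handle complex tasks, large-scale data, and multi-step processes, you naturally arrive at this conclusion:

The LLM should focus on what it does best (understanding and planning) and let code handle what code does best (executing and transforming).

Minion has embraced this idea from the start and will keep pushing in this direction:

Today: a complete PTC-style architecture, validated in production
Tomorrow: smarter tool discovery and more efficient state management
Future: hybrid reasoning, incremental computation, multi-model collaboration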

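About the author

Zheng Bingnan (鄭炳南) graduated from the Department of Physics at Fudan University. He has more than 20 years of software development experience spanning both traditional software and artificial intelligence, and is an active open-source contributor who has contributed to metagpt, the Hugging Face project smolagents, mem0, crystal, and other projects. He is one of the authors of the ICLR 2025 oral paper "AFlow: Automating Agentic Workflow Generation".

References: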

https://github.com/femto/minion

https://github.com/femto/minion/blob/main/docs/advanced_tool_use.md
