The demo is impressive, but it has moved from answering to acting.
Hook: if it only answers wrongly, the damage is limited; if it acts wrongly, who is responsible?
Follow a single agent incident story and watch the risk escalate step by step.
Not slogans, but practice: workflow fit, tool contracts, approval surfaces and trace design.
Each chapter includes a light exercise; you leave with an evidence kit ready for internal review.
Separate read, draft, write, send and admin so that access is not all-or-nothing.
Decide whether approval is required based on amount, data sensitivity, number of people affected and reversibility.
Log the request, source, tool call, approval, final action and rollback path.
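The approval rule above can be sketched as a small function. This is a minimal illustration, not a recommended policy: the `ProposedAction` fields, the 0 to 2 sensitivity scale and all three limits are assumptions invented for the example, and each organization would set its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedAction:
    amount: float         # money at stake
    sensitivity: int      # assumed scale: 0 public, 1 internal, 2 confidential
    people_affected: int
    reversible: bool


# Hypothetical limits for actions the agent may take without a human.
MAX_AUTO_AMOUNT = 100.0
MAX_AUTO_SENSITIVITY = 1
MAX_AUTO_PEOPLE = 10


def approval_reasons(action: ProposedAction) -> list[str]:
    """Return every reason this action must go to a human approver."""
    reasons = []
    if action.amount > MAX_AUTO_AMOUNT:
        reasons.append("amount")
    if action.sensitivity > MAX_AUTO_SENSITIVITY:
        reasons.append("data sensitivity")
    if action.people_affected > MAX_AUTO_PEOPLE:
        reasons.append("people affected")
    if not action.reversible:
        reasons.append("irreversible")
    return reasons


def needs_approval(action: ProposedAction) -> bool:
    """An action needs approval as soon as any single threshold is crossed."""
    return bool(approval_reasons(action))
```

Returning the list of reasons, not just a yes or no, lets the approval screen tell the reviewer why the action was escalated.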
Exercise: judge the use case on job clarity, context variation, tool risk and eval feasibility.
Exercise: split broad access into read, draft, approval and send tools.
Exercise: redesign the approval screen and find the hidden prompt injection.
Exercise: write evals, then run a board gate that ends in ship, delay or block.
Permission matrix, tool contract, approval threshold, injection tests, audit schema and incident mini-runbook.
Use a workflow for fixed paths; upgrade to an agent only when the task needs judgement, tool choice and situational flexibility.
Every added tool or autonomy level needs success, refusal, attack and recovery tests.
Tool interfaces should be narrow, verifiable, dry-run capable and logged; do not hand raw shell, SQL or admin access to the model.
Trace is not just for debugging; it is a product feature for security, compliance, management trust and incident investigation.
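A narrow, verifiable, dry-run capable and logged tool might look like the sketch below. Everything in it is hypothetical: the `refund_order` name, the `ORD-123456` ID format, the refund cap and the in-memory `call_log` stand in for whatever a real system would use, and the "executed" branch only records the call instead of contacting a payment backend.

```python
import re

ORDER_ID = re.compile(r"ORD-\d{6}")   # assumed order ID format
MAX_REFUND_CENTS = 50_000             # hypothetical per-call cap

call_log = []  # stand-in for a durable audit log


def refund_order(order_id: str, amount_cents: int, dry_run: bool = True) -> dict:
    """Refund one order. The model never sees SQL, a shell or admin rights."""
    # Verifiable: reject anything that is not a well-formed, bounded request.
    if not isinstance(order_id, str) or not ORDER_ID.fullmatch(order_id):
        raise ValueError("order_id must look like ORD-123456")
    if type(amount_cents) is not int or not 0 < amount_cents <= MAX_REFUND_CENTS:
        raise ValueError("amount_cents must be an int within the refund cap")

    result = {
        "tool": "refund_order",
        "order_id": order_id,
        "amount_cents": amount_cents,
        "dry_run": dry_run,
        # Dry-run is the default, so a side effect has to be asked for explicitly.
        "status": "preview" if dry_run else "executed",
    }
    call_log.append(result)  # Logged: previews and executions alike.
    return result
```

Making `dry_run=True` the default means the preview shown to an approver and the eventual action go through the same validation path.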
The core of Build is not writing a more polite prompt. It is placing the agent inside an operating environment that can authorize, constrain, record and recover. The model reasons; policy, tool gateways, sandboxes, approvals and audits determine the blast radius.
Who asked, what goal, which authority.
Plans and requests capabilities.
Checks scope, role, data and risk.
Shows exact side effect before action.
Limits browser, files, code and network.
Executes through narrow tools only.
Logs request, sources, tool call and recovery path.
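The policy step in the chain above (the agent requests a capability, the policy layer checks scope and role) can be sketched as a lookup over the read, draft, write, send and admin scopes. The role names and grants here are invented for illustration, and treating every side-effecting scope as approval-gated is an assumption of this sketch.

```python
from enum import Enum


class Scope(Enum):
    READ = "read"
    DRAFT = "draft"
    WRITE = "write"
    SEND = "send"
    ADMIN = "admin"


# Hypothetical role-to-scope grants; nothing is granted by default.
ROLE_GRANTS = {
    "support_agent": {Scope.READ, Scope.DRAFT},
    "support_lead": {Scope.READ, Scope.DRAFT, Scope.SEND},
}

# Scopes that change something outside the agent's own workspace.
SIDE_EFFECT_SCOPES = {Scope.WRITE, Scope.SEND, Scope.ADMIN}


def check_request(role: str, scope: Scope) -> str:
    """Return 'deny', 'needs_approval' or 'allow' for a capability request."""
    if scope not in ROLE_GRANTS.get(role, set()):
        return "deny"
    if scope in SIDE_EFFECT_SCOPES:
        return "needs_approval"
    return "allow"
```

An unknown role falls through to "deny", so a missing entry fails closed.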
Before enterprise deployment, the team must be able to answer the following questions. If it cannot, do not connect the agent to real customers, money, permissions or production systems.
Who does the agent represent, and what role or token does it use?
Which tools, data classes, systems and users are in scope?
Can the same run combine sensitive reads with external writes?
Which thresholds require human approval: money, users affected, data sensitivity, reversibility?
Are prompts, sources, tool calls, approvals, outputs and final actions traceable?
Can the organization stop, roll back, alert and investigate when the agent goes wrong?
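One of the questions above, whether a single run can combine sensitive reads with external writes, lends itself to a mechanical check. The sketch below is one possible approach under stated assumptions: the data class labels and the `RunGuard` name are hypothetical, and a real deployment might route such a write to approval instead of blocking it outright.

```python
SENSITIVE_CLASSES = {"pii", "financial", "credentials"}  # assumed labels


class RunGuard:
    """Tracks what one agent run has read, to gate later external writes."""

    def __init__(self) -> None:
        self.sensitive_reads: list[str] = []

    def record_read(self, source: str, data_class: str) -> None:
        # Remember every sensitive source touched during this run.
        if data_class in SENSITIVE_CLASSES:
            self.sensitive_reads.append(source)

    def may_write_externally(self) -> bool:
        # Once sensitive data has entered the run, external writes are blocked.
        return not self.sensitive_reads
```

Keeping the list of sources, not just a flag, gives the investigator something concrete to look at after a blocked write.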
Request → intent → approved context → draft → risk check → approval → action → log
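The flow above can double as a trace schema. The following sketch records one event per stage and refuses out-of-order stages, so an action cannot be logged before its risk check and approval. The `RunTrace` class, the stage identifiers and the JSON shape are assumptions made for the example.

```python
import json

# Stage names taken from the flow above, in order.
STAGES = (
    "request", "intent", "approved_context", "draft",
    "risk_check", "approval", "action", "log",
)


class RunTrace:
    def __init__(self, run_id: str) -> None:
        self.run_id = run_id
        self.events: list[dict] = []

    def record(self, stage: str, detail: str) -> None:
        """Append the next stage; skipping or reordering stages is an error."""
        if self.is_complete():
            raise ValueError("trace is already complete")
        expected = STAGES[len(self.events)]
        if stage != expected:
            raise ValueError(f"expected stage {expected!r}, got {stage!r}")
        self.events.append({"stage": stage, "detail": detail})

    def is_complete(self) -> bool:
        return len(self.events) == len(STAGES)

    def to_json(self) -> str:
        """Serialize the run for storage or incident review."""
        return json.dumps({"run_id": self.run_id, "events": self.events})
```

A trace that stops before "action" is still useful evidence: it shows where the run was halted and why.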
Agent-ready does not mean the agent can complete the task. It means the organization can prove what the agent was allowed to do, who approved it and how to recover.