JARY Academy · Build

90-Min Agent Safety Intensive

A 90-minute story-led intensive: from an impressive agent demo to permissions, tools, approvals, attacks, traces and the production gate.

Build is a corporate agentic security playbook for professionals. Rather than reciting a security checklist item by item, it follows one customer-success agent from demo to authority, tools, approval, prompt injection, sandboxing and a board-style go/no-go launch decision.

Story-Led Learning

Follow one agent incident story as risk escalates step by step.

Implementation Practice

Not slogans, but working practice for workflow fit, tool contracts, approval surfaces and trace design.

Interactive Artifacts

Each chapter includes a light exercise, ending in an evidence kit that is ready for internal review.

What You Take Away

Not just a talk: you leave with a working agent security review toolkit.

Permission Matrix

Separate read, draft, write, send and admin so access is not all-or-nothing.
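A permission matrix along these lines can be sketched in Python. This is a minimal illustration under stated assumptions: the role names (`support_agent`, `billing_agent`) and the `is_allowed` helper are hypothetical, and only the five permission levels come from the text above.

```python
from enum import Enum


class Permission(Enum):
    READ = "read"
    DRAFT = "draft"
    WRITE = "write"
    SEND = "send"
    ADMIN = "admin"


# Hypothetical roles; each one is granted only the levels it needs.
PERMISSION_MATRIX: dict[str, frozenset[Permission]] = {
    "support_agent": frozenset({Permission.READ, Permission.DRAFT}),
    "billing_agent": frozenset({Permission.READ, Permission.DRAFT, Permission.WRITE}),
}


def is_allowed(role: str, permission: Permission) -> bool:
    """Deny by default: unknown roles and ungranted levels return False."""
    return permission in PERMISSION_MATRIX.get(role, frozenset())
```

Because the lookup denies by default, adding a new role grants nothing until its row is written explicitly.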

Approval Thresholds

Decide whether approval is required based on the amount of money, data sensitivity, number of people affected and reversibility.
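One possible way to encode such thresholds is sketched below. The numeric limits, the sensitivity labels and the `needs_approval` function are assumptions made for illustration; real values would come from the organisation's own policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedAction:
    amount: float            # money at stake
    data_sensitivity: str    # assumed labels: "public", "internal", "confidential"
    people_affected: int
    reversible: bool


# Illustrative limits only; real values belong in reviewed policy.
MAX_AUTO_AMOUNT = 100.0
MAX_AUTO_PEOPLE = 10


def needs_approval(action: ProposedAction) -> bool:
    """Return True if any single factor crosses its threshold."""
    return (
        action.amount > MAX_AUTO_AMOUNT
        or action.data_sensitivity == "confidential"
        or action.people_affected > MAX_AUTO_PEOPLE
        or not action.reversible
    )
```

The factors are combined with `or`, so one risky dimension is enough to require a human decision.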

Audit Schema

Log request, source, tool call, approval, final action and rollback path.
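The six fields above could be captured in a record like the following. This is a sketch, not a prescribed schema: the field types, the `AuditRecord` name and the JSON-lines serialisation are assumptions.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    request: str
    source: str
    tool_call: str
    approval: Optional[str]   # approver identity, or None if no approval was needed
    final_action: str
    rollback_path: str


def to_log_line(record: AuditRecord) -> str:
    """Serialise one record as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)
```

Keeping `rollback_path` as a required field forces the question of recovery to be answered before the action is logged.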

Story Arc

Not a checklist: the full story before an agent goes live.

Act 1 · The Wow Moment

The demo is impressive, but it has moved from answering to acting.

Hook: a wrong answer is a minor problem; a wrong action raises the question of who is responsible.

Act 2 · Agent or Workflow

Use a workflow for fixed paths; move up to an agent only when judgement matters.

Exercise: score job clarity, context variation, tool risk and eval feasibility.

Act 3 · Authority and Tools

Define identity, token and scope first, then design narrow tools.

Exercise: split broad access into read, draft, approval and send tools.

Act 4 · Approval and Injection

Every side effect needs an owner; untrusted content cannot grant authority.

Exercise: rewrite the approval screen and find the hidden prompt injection.

Act 5 · Trace, Evals, Gate

No trace, no proof of control.

Exercise: write evals, then make a ship / delay / block decision at the board gate.

Evidence Kit

Leave with a review-ready agent readiness kit.

Permission matrix, tool contract, approval threshold, injection tests, audit schema and incident mini-runbook.

Product Lead Lens Today

If this were redesigned to the standard of a leading AI product team today, the focus would be evals, traces, guardrails and tool contracts.

Workflow Before Agent

Use a workflow for fixed paths; move up to an agent only when judgement, tool choice and situational flexibility matter.

Evals Before Autonomy

Every new tool or autonomy level needs success, refusal, attack and recovery tests.
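A coverage check for this rule might look like the sketch below. The eval-case layout (a dict with `tool` and `category` keys) and the function name are assumptions; only the four categories come from the text.

```python
REQUIRED_CATEGORIES = frozenset({"success", "refusal", "attack", "recovery"})


def missing_eval_categories(evals: list[dict]) -> dict[str, set[str]]:
    """Map each tool to the required eval categories it still lacks."""
    covered: dict[str, set[str]] = {}
    for case in evals:
        covered.setdefault(case["tool"], set()).add(case["category"])
    return {
        tool: set(REQUIRED_CATEGORIES - cats)
        for tool, cats in covered.items()
        if REQUIRED_CATEGORIES - cats
    }
```

An empty result means every tool that appears in the suite has at least one case in each category; it says nothing about whether those cases pass.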

Tool Contracts

Tool interfaces should be narrow, verifiable, dry-run capable and logged; do not hand raw shell, SQL or admin access to the model.
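As a rough sketch of such a contract, the hypothetical `issue_refund` tool below does one thing, validates its inputs, defaults to a dry run and logs every call. The tool name, the 200 cap and the logger name are illustrative assumptions, and no real payment API is called.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("tool_calls")


@dataclass(frozen=True)
class RefundResult:
    order_id: str
    amount: float
    executed: bool   # False when the call was only a dry run


def issue_refund(order_id: str, amount: float, dry_run: bool = True) -> RefundResult:
    """Hypothetical narrow tool: one action, validated inputs, dry run by default."""
    if not order_id.isalnum():
        raise ValueError("order_id must be alphanumeric")
    if not 0 < amount <= 200:   # illustrative cap, not a recommended value
        raise ValueError("amount outside the allowed range")
    logger.info("issue_refund order=%s amount=%.2f dry_run=%s", order_id, amount, dry_run)
    # A real implementation would call the payment system here when dry_run is False.
    return RefundResult(order_id, amount, executed=not dry_run)
```

Making `dry_run=True` the default means a caller has to opt in to the side effect rather than opt out of it.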

Trace As Product UX

Trace is not just debugging; it is a product feature for security, compliance, executive trust and incident review.

Industrial Control Plane

The model proposes; the control plane decides whether the proposal becomes an action.

Build is not about prettier prompts. It places the agent inside an operating environment where actions can be authorised, constrained, recorded and rolled back. The model reasons; policy, tool gateways, sandboxes, approvals and audits determine the blast radius.

User Intent

Who asked, what goal, which authority.

Agent Runtime

Plans and requests capabilities.

Policy Gateway

Checks scope, role, data and risk.

Approval

Shows exact side effect before action.

Sandbox

Limits browser, files, code and network.

Action

Executes through narrow tools only.

Trace

Logs request, sources, tool call and recovery path.
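The split between proposing and deciding in the stages above can be sketched as a small policy gateway. Everything named here (`ToolRequest`, `ROLE_SCOPES`, the tool names, the three-way `Decision`) is a hypothetical illustration of the idea, not a specific product's API.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


@dataclass(frozen=True)
class ToolRequest:
    role: str
    tool: str
    has_side_effect: bool


# Hypothetical scope table: which tools each role may request at all.
ROLE_SCOPES = {"customer_success": {"read_ticket", "draft_reply", "send_reply"}}


def policy_gateway(request: ToolRequest) -> Decision:
    """Decide whether a proposed tool call may become an action."""
    if request.tool not in ROLE_SCOPES.get(request.role, set()):
        return Decision.DENY
    if request.has_side_effect:
        return Decision.NEEDS_APPROVAL
    return Decision.ALLOW
```

The agent runtime only ever constructs a `ToolRequest`; the decision is made by code the model cannot rewrite.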

Approval Matrix

When can approval be skipped?

Action | Default | Control
Read-only / summarise | Usually autonomous | Scope, source, log
Draft / prepare | Can be autonomous | No send, no commitment
Internal reversible action | Policy-based | Low risk, reversible, thresholds
External send/share/payment/delete/deploy | Approval required | Show impact, approver, time and final action

Deployment Gate

Demo-ready is not deployment-ready.

Before enterprise deployment, the team must be able to answer the following questions. If it cannot, do not connect the agent to real customers, money, permissions or production systems.

Identity

Who does the agent represent, and what role or token does it use?

Scope

Which tools, data classes, systems and users are in scope?

Tool Sink Gates

Can the same run combine sensitive reads with external writes?

Approval

Which thresholds require human approval: money, users affected, data sensitivity, reversibility?

Professional Evidence

Are prompts, sources, tool calls, approvals, outputs and final actions traceable?

Recovery

Can the organisation stop, roll back, alert and investigate when the agent goes wrong?
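The Tool Sink Gates question above can be made concrete with a per-run check. The sketch below assumes a simple rule (no external write once the run has read confidential data) and hypothetical names (`RunContext`, `SinkGateError`); a real gate would likely be more nuanced.

```python
class SinkGateError(Exception):
    """Raised when a run tries to write externally after a sensitive read."""


class RunContext:
    """Tracks, per run, whether sensitive data has been read."""

    def __init__(self) -> None:
        self.read_sensitive = False

    def record_read(self, data_class: str) -> None:
        # Assumed label; "confidential" stands for any sensitive data class.
        if data_class == "confidential":
            self.read_sensitive = True

    def check_external_write(self, destination: str) -> None:
        """Call before any external write; raises if the run is tainted."""
        if self.read_sensitive:
            raise SinkGateError(
                f"external write to {destination} blocked after sensitive read"
            )
```

The flag is never reset within a run, so the order of operations cannot be used to slip past the gate.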

Controlled Workspace Pattern

Request → intent → approved context → draft → risk check → approval → action → log
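One way to enforce this ordering is a run object that refuses out-of-order stages. This sketch treats every stage as mandatory, which is a simplification; the `WorkspaceRun` class and the stage identifiers are assumptions derived from the pattern above.

```python
STAGES = ("request", "intent", "approved_context", "draft",
          "risk_check", "approval", "action", "log")


class WorkspaceRun:
    """Allows stages only in the fixed order; skipping a stage raises."""

    def __init__(self) -> None:
        self.completed: list[str] = []

    def advance(self, stage: str) -> None:
        if len(self.completed) == len(STAGES):
            raise RuntimeError("run already finished")
        expected = STAGES[len(self.completed)]
        if stage != expected:
            raise RuntimeError(f"expected stage {expected!r}, got {stage!r}")
        self.completed.append(stage)
```

With this shape, an agent that tries to jump from `draft` straight to `action` fails before any tool is called.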

Closing Principle

Agent-ready is not when it can complete the task. It is when the organisation can prove what it was allowed to do, who approved it and how to recover.
