/v1/scrape · /v1/browse · /v1/document · /v1/execute

Tools — substrate, not resold APIs.

Open-source primitives for the agent's hands. Static scrape and URL-document parsing fetch public HTTP(S) URLs through a bounded, DNS-pinned transport; local base64 parsing remains available. Playwright browse remains gated behind an unsafe flag, unfiltered, unsandboxed, and dependent on Redis. Host execute has a separate unsafe opt-in and no tenant boundary.

We charge for the infra surface: storage, compute time, queue, and network egress. We do not charge a markup on a third-party SaaS that we resell.

Scrape — static HTML

POST /v1/scrape Bearer required

Available through the bounded static-fetch path. Cheerio extraction accepts public HTTP(S) destinations, rejects any non-public DNS answer, pins the validated answers to the connection, checks the connected peer, and re-runs destination checks at every redirect. At most 1 MB of response bytes is read.

A shared process gate admits 16 active safe-net requests, queues at most 64 more for up to one second each, and holds each permit from before DNS through redirects; saturation returns a retryable 503. Admission, DNS, redirects, and response transfer share a 15-second safe-net deadline. The gate is capacity protection shared with federation and custom-facilitator traffic, not a per-project limiter or fairness guarantee.

HTML parsing then runs in a fresh, terminable child process: the wait for a parser slot is at most two seconds, and the admitted process has a two-second wall timeout, bounded framing, and tag-count, depth, and tag-size ceilings. These phases do not add up to one whole-request deadline.

HTTPS verifies the remote certificate; HTTP is cleartext. AgentTool reads the remote bytes, and returned page content remains untrusted. The returned `content` is normalized parsed-body text and `extracted` is normalized matching-node DOM text, not browser layout-derived text.
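The destination rule ("reject any non-public DNS answer") can be illustrated with a minimal sketch. This is not AgentTool's implementation: the function name `all_answers_public` is hypothetical, and Python's `ipaddress.is_global` is used here as a stand-in for whatever definition of "public" the service actually applies.

```python
import ipaddress


def all_answers_public(answers: list[str]) -> bool:
    """Return True only if every resolved address is globally routable.

    Assumption: "public" is approximated by ipaddress.is_global. A single
    private, loopback, or link-local answer rejects the whole lookup.
    """
    if not answers:
        # No answers means there is nothing safe to connect to.
        return False
    return all(ipaddress.ip_address(a).is_global for a in answers)
```

Rejecting the whole answer set, rather than filtering out the bad entries, matches the wording above: the check fails on any non-public answer. A real transport would also have to connect only to the validated addresses and repeat the check on every redirect hop, which this sketch does not attempt.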

Billing: the default is 1 project credit per schema-valid admitted attempt and operators can override it with CREDIT_SCRAPE; the deployment's configured value is published by GET /public/plans and in its OpenAPI operation. The debit is reserved before destination-policy, transport, representation, or parser work, so those failures retain the charge. Schema-invalid and insufficient-credit requests do not debit.
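The debit rule can be summarized as a small decision function. This is a sketch of the rule as stated, not the service's billing code; the `Outcome` names and `credits_debited` are hypothetical labels for the cases the paragraph lists.

```python
from enum import Enum


class Outcome(Enum):
    # Hypothetical labels for the cases named in the billing paragraph.
    SCHEMA_INVALID = "schema_invalid"
    INSUFFICIENT_CREDIT = "insufficient_credit"
    DESTINATION_REJECTED = "destination_rejected"
    TRANSPORT_FAILED = "transport_failed"
    PARSER_FAILED = "parser_failed"
    SUCCESS = "success"


def credits_debited(outcome: Outcome, configured_cost: int = 1) -> int:
    """Credits charged for one attempt under the stated rule."""
    # These two cases are rejected before the debit is reserved.
    if outcome in (Outcome.SCHEMA_INVALID, Outcome.INSUFFICIENT_CREDIT):
        return 0
    # Every admitted attempt keeps its charge, including later failures.
    return configured_cost
```

`configured_cost` defaults to 1 to mirror the scrape default; a deployment that overrides CREDIT_SCRAPE would pass its own published value.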

Field	Required	Type	Description
url	required	string	Public HTTP(S) URL to fetch through the bounded transport.
selector	optional	string	One CSS selector whose parsed-DOM subtree union is returned as `extracted`; nested matches do not duplicate descendant text.
extract_links	optional	boolean	Return up to 100 canonical absolute HTTP(S) links, deduplicated after URL parsing. Relative, malformed, and non-HTTP(S) values are omitted.
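A request using these fields might be assembled as follows. This is a sketch under assumptions: the base URL `https://agenttool.example` is a placeholder, and a JSON body with an `Authorization: Bearer` header is assumed from "Bearer required" rather than confirmed by this page. The function only builds the request; it does not send it.

```python
import json
import urllib.request

# Placeholder host: substitute the deployment's real base URL.
BASE_URL = "https://agenttool.example"


def build_scrape_request(token, url, selector=None, extract_links=False):
    """Build (but do not send) a POST /v1/scrape request."""
    body = {"url": url}
    if selector is not None:
        body["selector"] = selector
    if extract_links:
        body["extract_links"] = True
    return urllib.request.Request(
        BASE_URL + "/v1/scrape",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed header format
            "Content-Type": "application/json",  # assumed JSON body
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it. Because the debit is reserved once the request is admitted, a caller may want to validate the URL locally before sending.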

Browse — JS-rendered pages

POST /v1/browse Bearer required

Disabled by default. This Playwright path does not use the bounded static-fetch transport: browser navigation and subresources remain unfiltered, Chromium runs unsandboxed, and the worker is not a tenant boundary. It requires AGENTTOOL_ENABLE_UNSAFE_OUTBOUND_TOOLS=1 plus BullMQ/Redis workers; disabled workers return 503 redis_disabled.

Field	Required	Type	Description
url	required	string	Target URL.
actions	optional	array	Ordered `click`, `type`, `scroll`, `wait`, or `select` actions.
screenshot	optional	boolean	Include a full-page screenshot in the response (base64).
extract	optional	string	CSS selector, `text`, or `html`.
timeout	optional	1000..60000	Browser timeout in milliseconds.

The route waits up to five seconds for completion. Faster jobs return inline; longer jobs return a job_id to poll or stream at /v1/jobs/:id.
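A client therefore has to handle two response shapes. The sketch below assumes the response is a JSON object and that a queued job is signalled by a top-level `job_id` key; the function name `browse_follow_up` is hypothetical.

```python
from urllib.parse import quote


def browse_follow_up(response_body: dict):
    """Return the jobs path to poll, or None if the result came back inline."""
    job_id = response_body.get("job_id")
    if job_id is None:
        # Finished within the wait window: the body already holds the result.
        return None
    # Percent-encode the id so it is safe to place in a URL path segment.
    return "/v1/jobs/" + quote(str(job_id), safe="")
```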

Document — Readability + plain-text

POST /v1/document Bearer required

Mozilla Readability + plain-text conversion. Supply exactly one input.

Local base64 has a 1,400,000-character request envelope, must be canonical padded base64, and must decode to at most 1,000,000 bytes; a usable canonical encoding therefore never exceeds 1,333,336 characters.

URL input uses the same public-address checks, DNS pinning, connected-peer check, per-hop redirect validation, 1 MB byte limit, shared 16-active/64-queued one-second admission gate, and 15-second safe-net deadline as static scrape. Saturation returns retryable 503; this is shared process capacity, not per-project rate limiting.

HTML/XHTML from either input is handed to the same fresh parser process with a bounded two-second slot wait, two-second process wall timeout, and structural ceilings; plain text does not build a DOM. These phases do not form one whole-request deadline.

HTTPS verifies the certificate; HTTP is cleartext. Remote document bytes are server-readable and untrusted.
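The 1,333,336-character figure follows from padded base64 emitting 4 characters per 3-byte group, rounded up: 4 × ⌈1,000,000 / 3⌉ = 4 × 333,334 = 1,333,336. The sketch below reproduces that arithmetic and shows one way a client could pre-check its input. The helper names and the order of the checks are assumptions, not the server's validation code.

```python
import base64
import binascii

MAX_DECODED_BYTES = 1_000_000
MAX_ENVELOPE_CHARS = 1_400_000


def canonical_b64_length(n_bytes: int) -> int:
    """Length of padded base64 for n_bytes: 4 characters per 3-byte group."""
    return 4 * ((n_bytes + 2) // 3)


def check_base64_input(text: str) -> bytes:
    """Client-side pre-check mirroring the documented limits."""
    if len(text) > MAX_ENVELOPE_CHARS:
        raise ValueError("request envelope exceeded")
    try:
        raw = base64.b64decode(text, validate=True)
    except binascii.Error as exc:
        raise ValueError("not valid padded base64") from exc
    # Re-encoding catches non-canonical forms such as stray trailing bits.
    if base64.b64encode(raw).decode("ascii") != text:
        raise ValueError("not canonical padded base64")
    if len(raw) > MAX_DECODED_BYTES:
        raise ValueError("decoded payload too large")
    return raw
```

`canonical_b64_length(1_000_000)` evaluates to 1,333,336, so the 1,400,000-character envelope is only an outer bound; the decoded-size limit is the one that binds.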

Billing: the default is 3 project credits per schema-valid admitted attempt and operators can override it with CREDIT_DOCUMENT; the deployment's configured value is published by GET /public/plans and in its OpenAPI operation. The debit is reserved before destination-policy, transport, representation, or parser work, so those failures retain the charge. Schema-invalid and insufficient-credit requests do not debit.

Field	Required	Type	Description
url	optional	string	Public HTTP(S) URL to fetch through the bounded transport and parse.
base64	optional	string	Canonical padded base64 parsed locally; decoded bytes must not exceed 1,000,000. Supply exactly one of `url` or `base64`.
content_type	optional	string	Valid only with `base64`: `text/plain`, `text/html`, or `application/xhtml+xml`, optionally followed by MIME parameters such as `charset=utf-8`. Omitted base64 input defaults to `text/plain`. URL mode always uses the bounded upstream response header and rejects an override.
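The field rules (exactly one input, `content_type` only alongside `base64`) can be checked before a request is sent, which avoids a wasted round trip. This is a minimal sketch: `build_document_body` is a hypothetical helper, and whether the server matches media types case-sensitively is not stated here, so the comparison below is exact.

```python
ALLOWED_TYPES = {"text/plain", "text/html", "application/xhtml+xml"}


def build_document_body(url=None, base64_data=None, content_type=None) -> dict:
    """Build a /v1/document body that follows the documented field rules."""
    if (url is None) == (base64_data is None):
        raise ValueError("supply exactly one of url or base64")
    if url is not None:
        if content_type is not None:
            # URL mode takes the type from the upstream response header.
            raise ValueError("content_type is valid only with base64")
        return {"url": url}
    body = {"base64": base64_data}
    if content_type is not None:
        # Compare only the media type; MIME parameters such as charset may follow.
        media_type = content_type.split(";", 1)[0].strip()
        if media_type not in ALLOWED_TYPES:
            raise ValueError(f"unsupported content_type: {media_type}")
        body["content_type"] = content_type
    return body
```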

Execute — disabled-by-default legacy host runtime

POST /v1/execute Bearer required

Disabled by default. The route returns 503 unless the operator sets AGENTTOOL_ENABLE_UNSAFE_EXECUTE=1. That opt-in only enables the legacy path: JavaScript uses Node vm; Python and bash use same-container child processes. It does not create a tenant or hostile-code security boundary.

Field	Required	Type	Description
language	required	"javascript" · "python" · "bash"	Interpreter to run.
code	required	string	The script source.
timeout_ms	optional	int	Default 10000, max 30000.
stdin	optional	string	Stdin to feed the process.

Current boundary. Production is fail-closed. If an operator opts in, vault auto-injection remains unavailable and child processes can reach the host filesystem and network. Do not send a bearer token, private key, or untrusted code through this route.

Jobs (queued execution)

GET /v1/jobs/:id Bearer required

Poll or stream a queued browse job. Returns status, progress, and the result on completion. Execute is synchronous and does not use this queue.
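Polling can be sketched as a loop with a deadline. The response shape is not documented here beyond "status, progress, and the result", so the terminal status names `completed` and `failed` are assumptions (borrowed from common BullMQ job states) and are left configurable. `fetch_status` is a caller-supplied function that performs the authenticated GET and returns the decoded JSON body.

```python
import time
from typing import Callable


def poll_job(
    fetch_status: Callable[[], dict],
    terminal=("completed", "failed"),  # assumed status names
    interval_s: float = 1.0,
    max_wait_s: float = 60.0,
    sleep=time.sleep,
    clock=time.monotonic,
) -> dict:
    """Call fetch_status until a terminal status appears or time runs out."""
    deadline = clock() + max_wait_s
    while True:
        body = fetch_status()
        if body.get("status") in terminal:
            return body
        if clock() >= deadline:
            raise TimeoutError("job did not reach a terminal status in time")
        sleep(interval_s)
```

`sleep` and `clock` are injectable so the loop can be exercised without real waiting. A fixed interval keeps the sketch short; a production client would more likely back off between polls or use the streaming option instead.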

What we dropped

Earlier versions of agenttool included /v1/search (a Brave / SerpAPI proxy) and a Bright Data proxy network. Both amounted to reselling paid third-party APIs, so both were dropped. Static scrape and URL/local document parsing now use the bounded transport above. Full Playwright browse remains an unfiltered, unsandboxed, Redis-backed opt-in; unisolated execute remains a separate disabled-by-default opt-in.

Provider-key injection is not available in /v1/execute. When explicitly enabled, Python or bash calls originate on AgentTool infrastructure and are not an end-to-end-private path.
