Extract
See how to use extract() to extract structured data from web pages
Method Signatures
- TypeScript
Parameters
Natural language description of what data to extract. If omitted with no schema, returns raw page text.
Zod schema defining the structure of data to extract. Ensures type safety and validation. The return type is automatically inferred from the schema.
Configure the AI model to use for this action. Can be either:
- A string in the format "provider/model" (e.g., openai/gpt-5, google/gemini-2.5-flash)
- An object with detailed configuration
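The string form can be checked before it is passed along. The following is a minimal sketch, assuming the provider is everything before the first slash and the model is everything after it; parseModelString and ParsedModel are hypothetical names for illustration and are not part of Stagehand.

```typescript
// Hypothetical helper, not part of Stagehand: splits a "provider/model" string.
interface ParsedModel {
  provider: string;
  model: string;
}

function parseModelString(value: string): ParsedModel {
  const slash = value.indexOf("/");
  // Require a non-empty provider before the first slash
  // and a non-empty model name after it.
  if (slash <= 0 || slash === value.length - 1) {
    throw new Error(`Expected "provider/model", got "${value}"`);
  }
  return {
    provider: value.slice(0, slash),
    model: value.slice(slash + 1),
  };
}
```

Validating early like this keeps a malformed string from surfacing later as a model-format error at extraction time.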
Maximum time in milliseconds to wait for the extraction to complete. Default varies by configuration.
Optional selector (XPath, CSS selector, etc.) to limit extraction scope to a specific part of the page. Reduces token usage and improves accuracy.
Optional list of selectors to exclude from the extracted snapshot before extraction runs. Each selector removes all matching elements and their descendants.
ignoreSelectors applies to all matches for each selector. selector keeps its single-target scoping behavior.
Optional: Specify which page to perform the extraction on. Supports multiple browser automation libraries:
- Stagehand Page: Native Stagehand Page objects
- Playwright: Playwright Page objects
- Puppeteer: Puppeteer Page objects
- Patchright: Patchright Page objects
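To make the difference between selector and ignoreSelectors described above concrete, here is a toy model of the two behaviors. It is only an illustration under simplifying assumptions: ToyNode, scopeToFirst, pruneAll, and collectText are hypothetical names, tag names stand in for real XPath or CSS selectors, and none of this reflects how Stagehand builds its snapshot internally.

```typescript
// Toy DOM node for illustration only; Stagehand works on real pages, not this type.
interface ToyNode {
  tag: string;
  text?: string;
  children?: ToyNode[];
}

// Single-target scoping: return the first node with the given tag (depth-first).
function scopeToFirst(root: ToyNode, tag: string): ToyNode | undefined {
  if (root.tag === tag) return root;
  for (const child of root.children ?? []) {
    const found = scopeToFirst(child, tag);
    if (found) return found;
  }
  return undefined;
}

// Exclusion: drop every node whose tag is in `ignored`, along with its descendants.
function pruneAll(root: ToyNode, ignored: string[]): ToyNode {
  const children = (root.children ?? [])
    .filter((child) => !ignored.includes(child.tag))
    .map((child) => pruneAll(child, ignored));
  return { ...root, children };
}

// Collect text in document order so the effect of each operation is visible.
function collectText(node: ToyNode): string[] {
  const own = node.text ? [node.text] : [];
  return own.concat(...(node.children ?? []).map(collectText));
}

const samplePage: ToyNode = {
  tag: "main",
  children: [
    { tag: "article", text: "First post", children: [{ tag: "aside", text: "Ad" }] },
    { tag: "article", text: "Second post", children: [{ tag: "aside", text: "Ad" }] },
  ],
};

// Excluding "aside" removes both ads; scoping to "article" keeps only the first article.
console.log(collectText(pruneAll(samplePage, ["aside"])));
console.log(collectText(scopeToFirst(samplePage, "article") ?? samplePage));
```

In this model, exclusion affects every match while scoping picks exactly one subtree, which is the distinction the parameter descriptions draw.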
Override the instance-level serverCache setting for this request. When true, enables server-side caching. When false, disables it. Defaults to the value set on the Stagehand constructor (which itself defaults to true).
Only applies when env is "BROWSERBASE". Has no effect in local environments.
Built-in Support
Iframe and Shadow DOM interactions are supported out of the box. Stagehand automatically handles iframe traversal and shadow DOM elements without requiring additional configuration or flags.
Response Types
- With Schema
- String Only
- No Parameters
Returns:
Promise<z.infer<T> & { cacheStatus?: "HIT" | "MISS" }> where T is your schema
The returned object will be strictly typed according to your Zod schema definition. The optional cacheStatus field indicates whether the result was served from the server-side cache ("HIT") or computed fresh ("MISS"). It is only present when running with env: "BROWSERBASE" and server-side caching is enabled.
Code Examples
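Because cacheStatus is optional on the returned object, calling code has to handle three cases: "HIT", "MISS", and absent. The sketch below mirrors the documented return shape with a local type; WithCacheStatus and describeCache are hypothetical names, and the sample value stands in for a real extract() result rather than coming from the library.

```typescript
// Mirrors the documented return shape: the schema's data plus an optional cacheStatus.
type WithCacheStatus<T> = T & { cacheStatus?: "HIT" | "MISS" };

// Hypothetical helper: turns the optional field into a log-friendly label.
function describeCache(result: { cacheStatus?: "HIT" | "MISS" }): string {
  switch (result.cacheStatus) {
    case "HIT":
      return "served from server-side cache";
    case "MISS":
      return "computed fresh";
    default:
      // Field absent, e.g. local environment or caching disabled.
      return "no cache information";
  }
}

// Stand-in for a value returned by extract() with a schema of { price: number }.
const sampleResult: WithCacheStatus<{ price: number }> = {
  price: 19.99,
  cacheStatus: "HIT",
};

console.log(describeCache(sampleResult));
```

Treating the absent case explicitly avoids assuming a cache lookup happened when running locally.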
- Single Object
- Arrays
- URLs
- Scoped
- Schema-less
- Advanced
Additional Examples
- Custom Model
- Multi-Page
Error Types
The following errors may be thrown by the extract() method:
- StagehandError - Base class for all Stagehand-specific errors
- ZodSchemaValidationError - Extracted data does not match the provided Zod schema
- StagehandDomProcessError - Error occurred while processing the DOM
- StagehandEvalError - Error occurred while evaluating JavaScript in the page context
- StagehandIframeError - Unable to resolve iframe for the target element
- ContentFrameNotFoundError - Unable to obtain content frame for the selector
- XPathResolutionError - XPath does not resolve in the current page or frames
- StagehandShadowRootMissingError - No shadow root present on the resolved host element
- LLMResponseError - Error in LLM response processing
- MissingLLMConfigurationError - No LLM API key or client configured
- UnsupportedModelError - The specified model is not supported for this operation
- InvalidAISDKModelFormatError - Model string does not follow the required provider/model format
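Since StagehandError is described as the base class for the others, a catch block can test for specific subclasses first and fall back to the base class. The sketch below uses local stand-in classes so that it runs without the library installed; it assumes real code would import the identically named classes from the Stagehand package, and classifyExtractError and its return labels are hypothetical.

```typescript
// Local stand-ins so the sketch runs on its own.
// Assumption: real code would import these classes from the Stagehand package.
class StagehandError extends Error {}
class ZodSchemaValidationError extends StagehandError {}
class MissingLLMConfigurationError extends StagehandError {}
class XPathResolutionError extends StagehandError {}

type Advice =
  | "check-schema"
  | "check-config"
  | "check-selector"
  | "stagehand-other"
  | "not-stagehand";

// Hypothetical helper: maps a thrown value to a coarse next step.
function classifyExtractError(err: unknown): Advice {
  // Check subclasses before the base class, since instanceof also matches ancestors.
  if (err instanceof ZodSchemaValidationError) return "check-schema";
  if (err instanceof MissingLLMConfigurationError) return "check-config";
  if (err instanceof XPathResolutionError) return "check-selector";
  if (err instanceof StagehandError) return "stagehand-other";
  return "not-stagehand";
}

try {
  // Simulated failure in place of a real extract() call.
  throw new XPathResolutionError("selector did not resolve");
} catch (err) {
  console.log(classifyExtractError(err));
}
```

Ordering the checks from most specific to least specific matters here, because every stand-in subclass is also an instance of StagehandError.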

