Owl Browser 1.0.11: WebMCP Support for Web Pages

Akram H. S., Founder & CTO

Today we're shipping WebMCP support in Owl Browser 1.0.11. Web pages can now declare structured tools that your AI agent discovers and calls directly. No more reverse-engineering forms, clicking buttons, or scraping HTML to get structured data back.

What Is WebMCP?

WebMCP is a draft specification under development at the W3C (February 2026) that defines a browser API called `navigator.modelContext`. It lets web pages register callable tools (functions with a name, description, and JSON Schema input) that AI agents can discover and invoke. Think of it as the Model Context Protocol (MCP), but for web pages instead of servers.

There are two ways a page can declare tools:

  • Imperative API: JavaScript calls `navigator.modelContext.registerTool()` with a name, description, input schema, and execute callback.
  • Declarative HTML: A `<form>` element with a `toolname` attribute is automatically detected and converted into a callable tool. Form inputs become the tool's parameters.
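
As a minimal sketch of the imperative path, the snippet below registers a hypothetical `search_flights` tool. The descriptor fields (name, description, input schema, execute callback) follow the shape described above. The `lookupFlights` helper, its sample data, and the value returned by `execute` are assumptions for illustration; the draft spec may require a different result format.

```javascript
// Hypothetical data source standing in for a real flight-search backend.
function lookupFlights({ origin, destination, date }) {
  return [{ airline: "United", price: 450, departure: "08:30", origin, destination, date }];
}

// Tool descriptor: name, description, JSON Schema input, and execute callback.
const searchFlightsTool = {
  name: "search_flights",
  description: "Search for available flights between two airports",
  inputSchema: {
    type: "object",
    properties: {
      origin: { type: "string" },
      destination: { type: "string" },
      date: { type: "string" },
    },
    required: ["origin", "destination", "date"],
  },
  // Assumption: the callback receives the parsed input object and may be async.
  async execute(input) {
    return { flights: lookupFlights(input) };
  },
};

// Register only when the API exists, so the page still works in browsers
// without WebMCP support. Returns true if the tool was registered.
function registerSearchFlights(modelContext) {
  if (!modelContext || typeof modelContext.registerTool !== "function") {
    return false;
  }
  modelContext.registerTool(searchFlightsTool);
  return true;
}

// In a page script: pass the browser-provided object if there is one.
registerSearchFlights(typeof navigator !== "undefined" ? navigator.modelContext : undefined);
```

The feature check keeps the page usable in browsers that do not expose `navigator.modelContext`; in that case nothing is registered and the regular UI remains the only interface.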

The Problem WebMCP Solves

Without WebMCP, an AI agent interacting with a flight booking site has to: find the origin input field, type "SFO", find the destination field, type "JFK", find the date picker, navigate it, click "Search", wait for results to load, then scrape the HTML to extract flight data. Every site has different selectors, different DOM structures, and different loading patterns. It's fragile and slow.

With WebMCP, the same site exposes a `search_flights` tool. The agent calls it with `{"origin": "SFO", "destination": "JFK", "date": "2026-03-15"}` and gets clean JSON back: `{"flights": [{"airline": "United", "price": 450, "departure": "08:30"}, ...]}`. One tool call instead of a dozen DOM interactions.

How It Works in Owl Browser

Owl Browser acts as a WebMCP-to-MCP bridge. When a page loads, the browser injects a polyfill that implements `navigator.modelContext`. As pages register tools (via JavaScript or `<form toolname>` elements), the browser captures the tool definitions. Your agent then discovers and calls them through four new API tools.
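
To make the capture step concrete, here is a toy registry that stands in for `navigator.modelContext`. It is not Owl Browser's actual polyfill; `createModelContext`, `listTools`, and `callTool` are hypothetical names used only to illustrate how registrations could be recorded and later dispatched.

```javascript
// A toy stand-in for navigator.modelContext that records registered tools.
function createModelContext() {
  const tools = new Map();

  return {
    // Pages call this to declare a tool.
    registerTool(tool) {
      if (!tool || typeof tool.name !== "string" || typeof tool.execute !== "function") {
        throw new TypeError("tool needs a string name and an execute function");
      }
      tools.set(tool.name, tool);
    },

    // What a bridge could expose to the agent: definitions without callbacks.
    listTools() {
      return [...tools.values()].map(({ name, description, inputSchema }) => ({
        name,
        description,
        inputSchema,
      }));
    },

    // Dispatch a call by name and resolve to whatever the page's callback returns.
    async callTool(name, input) {
      const tool = tools.get(name);
      if (!tool) {
        throw new Error(`unknown tool: ${name}`);
      }
      return tool.execute(input);
    },
  };
}

// Usage: register a trivial tool, then list what was captured.
const demoContext = createModelContext();
demoContext.registerTool({
  name: "echo",
  description: "Return the input unchanged",
  inputSchema: { type: "object" },
  execute: (input) => ({ echoed: input }),
});
console.log(demoContext.listTools());
```

Keeping the callbacks inside the page and handing the agent only the definitions mirrors the split described above: discovery returns metadata, and execution happens in the page's JavaScript context.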

Step 1: Navigate and Discover

Navigate to a page with `wait_until` set. The navigate response now includes a `webmcp_tools` array containing the full tool definitions (names, descriptions, and input schemas):

curl -X POST http://localhost:8080/execute/browser_navigate \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"context_id": "ctx_1", "url": "https://airline.com", "wait_until": "load"}'

# Response:
# {"success": true, "status": "ok",
#  "message": "Navigated to: https://airline.com",
#  "url": "https://airline.com",
#  "webmcp_tools": [
#    {"name": "search_flights",
#     "description": "Search for available flights between two airports",
#     "inputSchema": {
#       "type": "object",
#       "properties": {
#         "origin": {"type": "string"},
#         "destination": {"type": "string"},
#         "date": {"type": "string"}
#       },
#       "required": ["origin", "destination", "date"]
#     }},
#    {"name": "book_seat", ...}
#  ]}

One call, zero extra steps. Your agent navigates, and immediately knows what tools the page offers and how to call them. If `webmcp_tools` is absent, it's a regular page with no WebMCP support.
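
On the agent side, handling this response can be as small as the sketch below. It builds a request body with the same fields as the curl example and indexes `webmcp_tools` by name, treating a missing field as "no tools". The helper names `buildNavigateBody` and `indexWebmcpTools` are hypothetical, and the sample response is a trimmed copy of the one shown above.

```javascript
// Build the JSON body for browser_navigate, matching the curl example.
function buildNavigateBody(contextId, url, waitUntil = "load") {
  return { context_id: contextId, url, wait_until: waitUntil };
}

// Index webmcp_tools by name; an absent field means the page declared no tools.
function indexWebmcpTools(navigateResponse) {
  const tools = Array.isArray(navigateResponse.webmcp_tools)
    ? navigateResponse.webmcp_tools
    : [];
  return new Map(tools.map((tool) => [tool.name, tool]));
}

// Sample response, trimmed to a single tool.
const sampleNavigateResponse = {
  success: true,
  status: "ok",
  url: "https://airline.com",
  webmcp_tools: [
    {
      name: "search_flights",
      description: "Search for available flights between two airports",
      inputSchema: {
        type: "object",
        properties: {
          origin: { type: "string" },
          destination: { type: "string" },
          date: { type: "string" },
        },
        required: ["origin", "destination", "date"],
      },
    },
  ],
};

const toolsByName = indexWebmcpTools(sampleNavigateResponse);
console.log(toolsByName.has("search_flights"));
```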

Step 2: Call a Tool

Execute a page-declared tool with structured input and get structured output:

curl -X POST http://localhost:8080/execute/browser_webmcp_call_tool \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"context_id": "ctx_1", "tool_name": "search_flights",
       "input": {"origin": "SFO", "destination": "JFK", "date": "2026-03-15"}}'

# Response:
# {"flights": [{"airline": "United", "price": 450}, ...]}

The tool executes inside the page's JavaScript context. The page's callback runs, and the result flows back to your agent as clean JSON. No DOM manipulation, no scraping.
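
Before sending a call, an agent can cheaply check its input against the `required` list in the tool's schema. The sketch below does only that (it is not a full JSON Schema validator) and then builds a request body with the same fields as the curl example. `missingRequired` and `buildCallToolBody` are hypothetical helper names.

```javascript
// Return the names of required properties that are missing from the input.
function missingRequired(tool, input) {
  const required = (tool.inputSchema && tool.inputSchema.required) || [];
  return required.filter(
    (key) => !Object.prototype.hasOwnProperty.call(input, key)
  );
}

// Build the JSON body for browser_webmcp_call_tool, or throw if input is incomplete.
function buildCallToolBody(contextId, tool, input) {
  const missing = missingRequired(tool, input);
  if (missing.length > 0) {
    throw new Error(`missing required input for ${tool.name}: ${missing.join(", ")}`);
  }
  return { context_id: contextId, tool_name: tool.name, input };
}

// Usage with a tool definition shaped like the one in the navigate response.
const flightTool = {
  name: "search_flights",
  inputSchema: { type: "object", required: ["origin", "destination", "date"] },
};
const callBody = buildCallToolBody("ctx_1", flightTool, {
  origin: "SFO",
  destination: "JFK",
  date: "2026-03-15",
});
console.log(JSON.stringify(callBody));
```

Failing fast on the client avoids a round trip for calls the page's callback would reject anyway.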

The New Tools

Four new tools are available in the API, MCP server, and SDKs:

  • `browser_webmcp_get_tools`: List all WebMCP tools declared by the page for a specific context. Returns tool names, descriptions, and input schemas.
  • `browser_webmcp_call_tool`: Execute a page-declared tool by name with JSON input. Returns the tool's JSON output.
  • `browser_webmcp_refresh_tools`: Re-scan the page for WebMCP tools. Useful after SPAs update their content or forms are dynamically added.
  • `browser_webmcp_get_all_tools`: List WebMCP tools across all active contexts. Useful for multi-tab workflows where different pages expose different tools.
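
If you call the HTTP API directly rather than through an SDK, a thin wrapper over the four tools might look like the sketch below. Several things here are assumptions: that every tool lives under `/execute/<tool_name>` like the two endpoints shown in the curl examples, that `browser_webmcp_get_tools` and `browser_webmcp_refresh_tools` take a `context_id`, and that `browser_webmcp_get_all_tools` takes no parameters. `createOwlClient` is a hypothetical name, not part of any official SDK.

```javascript
// Small client over the HTTP API. fetchImpl is injectable so the wrapper
// can be exercised without a running browser.
function createOwlClient({ baseUrl, token, fetchImpl = globalThis.fetch }) {
  async function execute(toolName, params = {}) {
    const response = await fetchImpl(`${baseUrl}/execute/${toolName}`, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(params),
    });
    if (!response.ok) {
      throw new Error(`${toolName} failed with HTTP ${response.status}`);
    }
    return response.json();
  }

  return {
    execute,
    // Assumed parameter: context_id.
    getTools: (contextId) =>
      execute("browser_webmcp_get_tools", { context_id: contextId }),
    // Parameters taken from the curl example.
    callTool: (contextId, toolName, input) =>
      execute("browser_webmcp_call_tool", {
        context_id: contextId,
        tool_name: toolName,
        input,
      }),
    // Assumed parameter: context_id.
    refreshTools: (contextId) =>
      execute("browser_webmcp_refresh_tools", { context_id: contextId }),
    // Assumed to take no parameters.
    getAllTools: () => execute("browser_webmcp_get_all_tools"),
  };
}
```

A typical call would be `createOwlClient({ baseUrl: "http://localhost:8080", token }).getTools("ctx_1")`, which resolves to the parsed JSON response.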

Updated Navigate Response

The `browser_navigate` tool now returns a `webmcp_tools` array when `wait_until` is set to `load`, `domcontentloaded`, `networkidle`, or `fullscroll`. This array contains the full tool definitions (name, description, and input schema), so your agent can immediately call any tool without a separate discovery step. The field is omitted when the page has no WebMCP tools, keeping responses clean for regular pages.

Both Imperative and Declarative

Owl Browser supports both WebMCP registration methods. Pages using JavaScript can call `navigator.modelContext.registerTool()` to register tools with custom execute callbacks. Pages using plain HTML can add `toolname` and `tooldescription` attributes to `<form>` elements. The browser automatically converts form inputs into a JSON Schema and generates an execute callback that fills and submits the form.

A MutationObserver watches for dynamically added forms, so single-page applications that render forms after initial load are fully supported.
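
To illustrate the form-to-schema conversion described above, the sketch below turns a plain description of a form into a tool definition. It works on simple field objects rather than DOM nodes so it can run anywhere. The type mapping and the `formFieldsToTool` name are assumptions for illustration; Owl Browser's actual conversion rules may differ.

```javascript
// Assumed mapping from HTML input types to JSON Schema types;
// anything not listed falls back to "string".
const INPUT_TYPE_TO_SCHEMA = new Map([
  ["number", "number"],
  ["range", "number"],
  ["checkbox", "boolean"],
]);

// fields: [{ name, type, required }], e.g. collected from a form's inputs.
function formFieldsToTool({ toolname, tooldescription, fields }) {
  const properties = {};
  const required = [];
  for (const field of fields) {
    if (!field.name) continue; // unnamed controls are not submitted with a form
    properties[field.name] = {
      type: INPUT_TYPE_TO_SCHEMA.get(field.type) || "string",
    };
    if (field.required) required.push(field.name);
  }
  return {
    name: toolname,
    description: tooldescription,
    inputSchema: { type: "object", properties, required },
  };
}

// Usage: a flight-search form with three required inputs and a submit button.
const formTool = formFieldsToTool({
  toolname: "search_flights",
  tooldescription: "Search for available flights between two airports",
  fields: [
    { name: "origin", type: "text", required: true },
    { name: "destination", type: "text", required: true },
    { name: "date", type: "date", required: true },
    { type: "submit" },
  ],
});
console.log(JSON.stringify(formTool.inputSchema));
```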

Agent Workflow

Here's how a typical AI agent workflow looks with WebMCP:

  1. Create a context and navigate to the target page with `wait_until: "load"`
  2. Check the response: if `webmcp_tools` is present, the page supports WebMCP and the tools are ready to use
  3. Call `browser_webmcp_call_tool` with the tool name and structured input
  4. Receive structured JSON output, with no HTML parsing needed
  5. If the page changes (SPA navigation), call `browser_webmcp_refresh_tools` to re-scan

For pages without WebMCP support, your agent continues using the existing browser automation tools: click, type, extract text, screenshots. WebMCP is additive; it doesn't replace the existing toolkit. It extends it for pages that opt in.
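
The workflow above, including the fallback, can be sketched as a single function. It assumes the context already exists and takes an `execute(toolName, params)` function that calls the Owl Browser API and resolves to the parsed JSON response. Both `execute` and `runWithWebmcp` are hypothetical names, and the re-scan step is left out for brevity.

```javascript
// Navigate, use a WebMCP tool if the page offers it, otherwise signal fallback.
async function runWithWebmcp(execute, { contextId, url, toolName, input }) {
  const nav = await execute("browser_navigate", {
    context_id: contextId,
    url,
    wait_until: "load",
  });

  // The field is omitted on pages without WebMCP tools.
  const tools = nav.webmcp_tools || [];
  if (!tools.some((tool) => tool.name === toolName)) {
    // No matching tool: the caller should fall back to click/type/extract automation.
    return { usedWebmcp: false, result: null };
  }

  const result = await execute("browser_webmcp_call_tool", {
    context_id: contextId,
    tool_name: toolName,
    input,
  });
  return { usedWebmcp: true, result };
}
```

Returning a `usedWebmcp` flag instead of throwing keeps the decision about fallback automation with the caller, which matches the additive design: the structured path is tried first and the existing toolkit stays available.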

[Image: Owl Browser connecting web pages to AI agents]

What's Next

WebMCP is still a draft specification at the W3C, and it continues to evolve. As more websites adopt it, your AI agents will be able to interact with the web through structured APIs instead of brittle DOM scraping. We'll continue tracking the spec and updating Owl Browser's implementation as it matures. If you're building a website that AI agents interact with, consider adding WebMCP support. Your users' agents will thank you.
