How WebMCP Lets Developers Control AI Agents With JavaScript

By Chris J. Preimesberger, Richard MacManus

Copyright thenewstack

This year, MCP (Model Context Protocol) has become the glue that connects AI to the web. Following MCP-UI and NLWeb, we now have another new MCP-related open standard: WebMCP, an initiative of Microsoft and Google.

To find out more about WebMCP, I conducted an email interview with Kyle Pflug, group product manager for the web platform at Microsoft Edge.

What Is WebMCP?

I first heard about WebMCP from Pflug’s colleague Patrick Brosset, who introduced WebMCP in a blog post as “a proposal to let you, web developers, control how AI agents interact with your web pages.” He went on to explain that WebMCP is a set of JavaScript functions: “Web MCP lets you list the actions (called ‘tools’) that an AI agent can perform on a page, as JavaScript functions registered via a browser API.”

Essentially, it’s like setting up an MCP server for your website or app, except that the tools run client-side, in the page’s own script, rather than on a server. As the project’s README puts it: “Web pages that use WebMCP can be thought of as Model Context Protocol (MCP) servers that implement tools in client-side script instead of on the backend.”
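
To make the idea concrete, here is a minimal sketch of a page registering one tool as a JavaScript function. The browser API is still being designed, so `createModelContext`, `registerTool`, `listTools` and `callTool` are hypothetical stand-ins invented for this illustration, not the proposal's actual surface.

```javascript
// Hypothetical stand-in for the browser-provided API (an assumption, not the real WebMCP API).
function createModelContext() {
  const tools = new Map();
  return {
    // Register one tool: a name, a description for the agent, and a plain JavaScript function.
    registerTool({ name, description, execute }) {
      if (tools.has(name)) throw new Error(`Tool already registered: ${name}`);
      tools.set(name, { name, description, execute });
    },
    // What an agent would see: the available tools, without their implementations.
    listTools() {
      return [...tools.values()].map(({ name, description }) => ({ name, description }));
    },
    // How an agent would invoke a tool by name.
    callTool(name, args) {
      const tool = tools.get(name);
      if (!tool) throw new Error(`Unknown tool: ${name}`);
      return tool.execute(args);
    },
  };
}

// Page state that lives in client-side script, not on a backend.
const todos = [];

const modelContext = createModelContext();
modelContext.registerTool({
  name: "add-todo",
  description: "Add an item to the to-do list shown on this page.",
  execute({ text }) {
    todos.push(text);
    return { count: todos.length };
  },
});

const result = modelContext.callTool("add-todo", { text: "Buy milk" });
console.log(modelContext.listTools(), result, todos);
```

The point of the sketch is the division of labor: the page owns the function and its state, while the agent only sees a name and a description.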

I asked Pflug what inspired the project.

“As AI assistants from many providers increasingly pursue ‘agentic’ use cases that take actions on web pages, we saw a need for web developers to be able to partner with these agents and facilitate their interactions with pages more intentionally,” he replied. “MCP provided a natural starting point, but requires new platform-layer tools to cover human-in-the-loop agentic browsing scenarios, which we increasingly see browser and agent providers investing in.”

W3C Involvement and A Second WebMCP Protocol

The project on GitHub is listed as a subgroup of the W3C Web Machine Learning Working Group. So where did the idea originate: within the W3C, or was it brought there after the fact by Microsoft and/or Google?

“Like many proposals, WebMCP started as a set of independent explainers across the community,” Pflug replied. “Microsoft proposed ‘WebModel Context’ in our public Microsoft Edge Explainers repository, and the Chrome team had a very similar proposal for ‘Script Tools’. Following early discussions with the Chrome team and the W3C WebML Working Group, we agreed to move forward with a single unified WebMCP proposal at the W3C.”

It’s worth noting that there is another related protocol, predating the W3C project, which also uses the name WebMCP. It was created independently by Alex Nahas — his project is called MCP-B, but the underlying protocol is named WebMCP.

Essentially, MCP-B is a Chrome extension. Nahas’ WebMCP protocol sounds very similar to the one being proposed by Microsoft and Google. But when I contacted Nahas, he confirmed that he is now working with the new W3C-affiliated WebMCP group and intends to support their version of WebMCP.

Pflug confirmed that Nahas has joined their group.

“Community projects like MCP-B are integral to development of web standards and the MCP-B team has been actively involved in WebML WG discussions,” he said, “and the overlap in use cases and technologies between ideas like MCP-B, Script Tools, and Web Model Context helps validate the need for these kinds of capabilities on the web.”

Pflug added that with WebMCP, “we’re proposing that these capabilities are ‘built in’ to the browser without requiring extensions, which is essential to widespread adoption on the web.”

Is the Website or the Browser the MCP Server?

In his initial blog post, Brosset wrote that the idea is “to make the browser be an MCP server too.” But how will this functionality work for a browser user — for example, will the MCP server have to be tied to a user’s browser profile?

“The core concept is to allow web developers to define ‘tools’ for their website in JavaScript, analogous to the tools that would be provided by a traditional MCP server,” Pflug replied. “You could imagine these being exposed to agents in the browser or host operating system, or even for first-party agents hosted on the same site. Our proposal is primarily focused on enabling developers to define these tools. How this will be implemented by specific user agents and exposed to AI assistants will likely vary across browsers, and is not specified by the current proposals.”

So is it the browser that is the MCP server (as Brosset wrote), or the website? Or both? The GitHub project states it’s the webpage: “Web pages that use WebMCP can be thought of as Model Context Protocol (MCP) servers.”

“The MCP terminology can be a bit confusing here,” Pflug said. “Our proposal enables web pages to expose MCP tools to agents, analogous to the tools exposed by a traditional MCP server, but without requiring a separate server component. In addition to simplifying implementation and allowing some code reuse, this is a natural fit for human-in-the-loop scenarios since it runs within the browsing context and can simplify things like state and auth — which can be tricky in more traditional CUA approaches for browsing agents.”
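
A rough sketch of why running in the browsing context can simplify state and auth: the tool below reuses a session the page already holds and asks the user to confirm before acting. Everything here (`session`, `makeCheckoutTool`, `confirmWithUser`) is a hypothetical name made up for illustration; the proposal does not specify how confirmation or sessions are handled.

```javascript
// Hypothetical page state: the user is already signed in within this tab.
const session = { user: "ada", cart: ["keyboard"] };

// Hypothetical human-in-the-loop hook; a real page might show a dialog here.
function makeCheckoutTool(confirmWithUser) {
  return {
    name: "checkout",
    description: "Place an order for the items in the current cart.",
    execute() {
      // The agent passes no token or credential: the tool reuses the page's session.
      if (!session.user) return { ok: false, reason: "not signed in" };
      if (!confirmWithUser(`Order ${session.cart.length} item(s) as ${session.user}?`)) {
        return { ok: false, reason: "declined by user" };
      }
      const ordered = session.cart.splice(0); // empties the cart and returns its contents
      return { ok: true, ordered };
    },
  };
}

// First the user declines, then the user accepts.
const declined = makeCheckoutTool(() => false).execute();
const accepted = makeCheckoutTool(() => true).execute();
console.log(declined, accepted);
```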

He added that the group expects “some sites may use both WebMCP and MCP servers,” because they serve different scenarios. “Traditional MCP servers are excellent for when browsing context is not needed or when the agent will be mostly interacting with cloud endpoints.”

Understanding AI Agent Vendor Involvement

So for WebMCP to work, will the AI agent vendors (OpenAI, Anthropic, et al.) need to add any functionality to their products, or does it all just work via MCP, which they all support already?

“Our intent is that any agent that can call MCP tools would be able to leverage WebMCP tools exposed by sites,” Pflug replied. “WebMCP is not opinionated about how user agents or operating systems might expose these tools to agents, but our intent is that this could be implemented by any browser for use with any combination of built-in or third-party agents, in addition to the developer’s own use (such as an in-page agent).”

Since WebMCP is a JavaScript API, I wondered whether it resembles any existing web APIs.

“We’re not aware of any directly comparable web APIs today,” said Pflug. “While it is possible today for web developers to write tools in JavaScript, there is no standardized approach, which means AI agents would need to build website-specific implementations to interact with the site. WebMCP proposes a standardized approach for the web so agents can reliably call tools provided by developers.”
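
The value of a shared shape can be sketched as follows: two unrelated, hypothetical sites describe their tools the same way, so one generic caller serves both. The `name`, `description` and `inputSchema` fields are loosely modeled on MCP tool definitions; the sites, the `execute` field and the `callTool` helper are assumptions for illustration only.

```javascript
// Two hypothetical sites expose tools in the same shape.
const recipeSiteTools = [{
  name: "search-recipes",
  description: "Search recipes by ingredient.",
  inputSchema: { type: "object", required: ["ingredient"] },
  execute: ({ ingredient }) => [`${ingredient} soup`],
}];

const airlineSiteTools = [{
  name: "search-flights",
  description: "Search flights by destination.",
  inputSchema: { type: "object", required: ["destination"] },
  execute: ({ destination }) => [`Flight to ${destination}`],
}];

// One generic caller works for both sites: no site-specific scraping or click logic.
function callTool(tools, name, args) {
  const tool = tools.find((t) => t.name === name);
  if (!tool) throw new Error(`Unknown tool: ${name}`);
  // Minimal check of required arguments (not a full JSON Schema validator).
  for (const field of tool.inputSchema.required ?? []) {
    if (!(field in args)) throw new Error(`Missing argument: ${field}`);
  }
  return tool.execute(args);
}

const recipes = callTool(recipeSiteTools, "search-recipes", { ingredient: "leek" });
const flights = callTool(airlineSiteTools, "search-flights", { destination: "Lisbon" });
console.log(recipes, flights);
```

Without an agreed shape, the agent would need a separate adapter for each site, which is the website-specific work Pflug describes.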

WebMCP vs. NLWeb: Which Protocol Should You Use?

As mentioned above, there have been a few new open standards this year that use MCP to bridge LLMs to the web. NLWeb seems to be the closest to WebMCP, at least in that both are meant to be used with a website or web application (MCP-UI is specifically for use within AI agents). Also, Microsoft is heavily involved in both WebMCP and NLWeb.

Trying to get a sense of how widely WebMCP might be deployed on the web, I asked Pflug whether a content website (e.g. for media) would be more likely to use WebMCP or NLWeb.

“It really depends on what the site is trying to do,” he replied. “Depending on the conversation experiences you want to enable, you might choose one, the other, or both. NLWeb is a full-stack framework that helps you reimagine your site with a conversational interface and includes MCP server capabilities, structured search, schema-based grounding, etc. WebMCP proposes a lightweight, client-side enhancement that standardizes how websites expose JavaScript tools to agents in the browsing context.”

He added that WebMCP “might be a particularly great fit for highly interactive experiences, where agents naively navigating complex or multi-step UI can be less efficient or more brittle,” whereas the need “may be less clear for a content website that’s not trying to actively facilitate agentic interactions.”

That said, Pflug can also imagine cases “where the site might want to, for example, use WebMCP to enable agents to navigate a subscription sign-up flow more efficiently.”
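
As a sketch of that sign-up scenario, the example below models a three-step wizard as plain state and wraps it in a single tool. The wizard, its step names and `makeSubscribeTool` are all hypothetical, invented here to illustrate the idea rather than taken from the proposal.

```javascript
// Hypothetical three-step sign-up wizard, modeled as plain state instead of real DOM.
function createSignupFlow() {
  const state = { step: "plan", plan: null, email: null, subscribed: false };
  return {
    state,
    choosePlan(plan) {
      if (state.step !== "plan") throw new Error("Wrong step");
      state.plan = plan;
      state.step = "email";
    },
    enterEmail(email) {
      if (state.step !== "email") throw new Error("Wrong step");
      state.email = email;
      state.step = "confirm";
    },
    confirm() {
      if (state.step !== "confirm") throw new Error("Wrong step");
      state.subscribed = true;
      state.step = "done";
    },
  };
}

// The site wraps the whole flow in one tool, so an agent makes a single call
// instead of finding and clicking through each screen in the right order.
function makeSubscribeTool(flow) {
  return {
    name: "subscribe",
    description: "Sign up for a subscription plan with an email address.",
    execute({ plan, email }) {
      flow.choosePlan(plan);
      flow.enterEmail(email);
      flow.confirm();
      return { subscribed: flow.state.subscribed, plan: flow.state.plan };
    },
  };
}

const flow = createSignupFlow();
const outcome = makeSubscribeTool(flow).execute({ plan: "monthly", email: "reader@example.com" });
console.log(outcome);
```

An agent driving the wizard step by step would fail if it acted on the wrong screen, which is the brittleness the single tool call avoids.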

The Future of WebMCP and Next Steps

Finally, I asked what the main priority is for the WebMCP project for the rest of 2025.

“We’re in the early stages of design and will be focusing on deeper conversations with web developers to flesh out our understanding of their use cases for WebMCP, and iterating based on community discussions and feedback,” Pflug replied.

He said they’re interested in hearing from developers and publishers not just on the technical design, “but also in terms of tooling, business model concerns, and other factors that may need solutions in WebMCP — or other platform approaches to how agents browse and take actions on websites.”

The group’s goal is to work towards an early developer preview in Chromium, so that developers can try it out and provide feedback.