about coding agents is that they’ll only be used for coding or programming. However, they’re much more generalized agents and are capable of doing office tasks in general, though with varying degrees of success.
One area, however, that has received a lot of attention is web browsing with coding agents such as Claude Code and OpenAI’s Codex.
The agents have become highly proficient at navigating the web, which is super useful for a lot of different tasks.
Web browsing can, of course, be useful in many different situations, such as fetching information on the Internet or filling in forms for you. However, some of these use cases can violate a site’s terms of service, so you should definitely be aware of this. The main use case I’ll cover today is fully legal: having coding agents navigate applications you’re developing yourself in order to test and verify implementations.
Previously, I’ve talked a lot about creating verifiable tasks whenever you ask coding agents to perform actions for you. Giving coding agents access to your browser to test implementations is an important part of this verifiability.

Why coding agents should use your browser
First of all, I’d like to cover why you should care about running browsers with your coding agents. Browsers are an important interface humans use to interact with the world. Through your browser, you can perform a lot of different actions, such as reading up on information, filling in applications, and so on.
Given that this is such an important interface for humans to interact with the world, a lot of attention and research has been directed towards navigating browsers effectively. There are numerous companies specializing in browser navigation, and all the frontier labs also offer such an integration in their products, such as OpenAI’s Codex and Anthropic’s Claude Code.
Imagine you’re telling a coding agent to implement a design following an HTML design file. The coding agent is, of course, good at front-end code and can start implementing it right away; however, if the coding agent can’t navigate the browser, it’s impossible for it to verify its own work.
This greatly increases the chance that the coding agent will make mistakes and not implement the exact design you wanted.
Luckily, there’s a very simple fix to this problem: give your coding agent access to the browser. Allow it to take screenshots of the design it has implemented and compare them to screenshots of the design you wanted it to implement. The coding agent can then continue iterating until the implemented code looks exactly like the design file.
This saves you, as the programmer, a lot of time, since you don’t have to repeatedly verify the agent’s work and point out the mistakes it made when implementing the design. That in turn frees you up to perform other tasks and be more productive as an engineer.
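To make the "compare screenshots" idea concrete, here is a minimal sketch of one way such a comparison could be scored. It is an illustration only: in practice the agent compares screenshots visually with its own vision capabilities, and the `mismatch_ratio` function, the nested-list image representation, and the `tolerance` parameter are all assumptions made for this example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each channel 0-255
Image = List[List[Pixel]]      # rows of pixels (hypothetical in-memory format)


def mismatch_ratio(actual: Image, target: Image, tolerance: int = 8) -> float:
    """Return the fraction of pixels where any channel differs by more than `tolerance`."""
    same_shape = len(actual) == len(target) and all(
        len(row_a) == len(row_t) for row_a, row_t in zip(actual, target)
    )
    if not same_shape:
        raise ValueError("screenshots must have identical dimensions")

    total = sum(len(row) for row in actual)
    if total == 0:
        return 0.0

    differing = sum(
        1
        for row_a, row_t in zip(actual, target)
        for px_a, px_t in zip(row_a, row_t)
        if any(abs(ca - ct) > tolerance for ca, ct in zip(px_a, px_t))
    )
    return differing / total
```

A score of 0.0 means the implemented page matches the target within the tolerance; an agent-driven workflow could keep iterating while the score stays above some threshold of your choosing.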
How it works
Before moving on to how to navigate browsers with Claude Code, I also want to include a short section covering how it works.
In theory, navigating the browser is pretty simple. The coding agent opens the browser, where it has access to a few actions:
- Take screenshot
- Click (coordinate-based)
- Enter text
These are the three main actions the coding agent performs, and they are basically all the actions you need to interact with a browser:
- The coding agent needs to take screenshots because that’s how it finds out what’s on each page and figures out where to click.
- The coding agent also needs to be able to click different places on the website, for example buttons or input fields. Clicking is coordinate-based.
So if the coding agent wants to click in a specific location, it outputs the following text:
click(x=0.754, y=0.328)
It basically calls the click function and gives the coordinates where it wants to click. The coordinates are usually normalized to a fixed range, such as between 0 and 1.
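As a rough illustration of what happens with such an output, the sketch below parses the click text and maps the normalized coordinates onto a viewport. The function names `parse_click` and `to_pixels`, the exact text format, and the 1280x720 viewport are assumptions for this example, not the actual internals of any particular agent.

```python
import re
from typing import Tuple

# Matches text such as "click(x=0.754, y=0.328)" (assumed format).
_CLICK_RE = re.compile(r"^click\(x=([0-9]*\.?[0-9]+),\s*y=([0-9]*\.?[0-9]+)\)$")


def parse_click(action: str) -> Tuple[float, float]:
    """Extract normalized (x, y) coordinates from a click action string."""
    match = _CLICK_RE.match(action.strip())
    if match is None:
        raise ValueError(f"not a click action: {action!r}")
    x, y = float(match.group(1)), float(match.group(2))
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("normalized coordinates must lie in [0, 1]")
    return x, y


def to_pixels(x: float, y: float, width: int, height: int) -> Tuple[int, int]:
    """Convert normalized coordinates to pixel coordinates in a width x height viewport."""
    # Clamp so that 1.0 maps to the last valid pixel rather than one past the edge.
    px = min(int(x * width), width - 1)
    py = min(int(y * height), height - 1)
    return px, py


x, y = parse_click("click(x=0.754, y=0.328)")
print(to_pixels(x, y, 1280, 720))
```

Normalizing to a fixed range means the same model output works regardless of the actual window size; only the final conversion step needs to know the viewport dimensions.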
Then, once the agent has clicked a specific location, it can enter text, which covers everything it needs to do in the browser. The coding agent can, of course, also perform different kinds of clicks, such as a right-click to get more options on the page.
This loop then iterates: the coding agent takes a screenshot, chooses which action to perform, checks whether it has achieved its goal, and repeats. The agent simply continues like this until it has achieved its goal in the browser.
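The loop described above can be sketched in a few lines. This is a toy model under stated assumptions: `FakeBrowser` is a hypothetical stand-in for a real browser, its "screenshot" is a text description instead of an image, and the hard-coded decision rule in `run_agent` replaces the model that would normally choose the next action.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a browser showing a page with one text field."""
    focused: bool = False
    text: str = ""
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent receives an image; a string keeps this sketch runnable.
        return f"focused={self.focused} text={self.text!r}"

    def click(self, x: float, y: float) -> None:
        self.focused = True
        self.log.append(f"click(x={x}, y={y})")

    def enter_text(self, value: str) -> None:
        if self.focused:
            self.text += value
        self.log.append(f"enter_text({value!r})")


def run_agent(browser: FakeBrowser, goal_text: str, max_steps: int = 10) -> bool:
    """Repeat screenshot -> action -> goal check until done or out of steps."""
    for _ in range(max_steps):
        observation = browser.screenshot()           # 1. look at the page
        if "focused=False" in observation:           # 2. choose and perform an action
            browser.click(0.754, 0.328)
        else:
            browser.enter_text(goal_text)
        if f"text={goal_text!r}" in browser.screenshot():  # 3. check the goal
            return True
    return False


browser = FakeBrowser()
print(run_agent(browser, "hello"), browser.log)
```

The `max_steps` cap is a design choice for the sketch: it keeps the loop from running forever when the goal cannot be reached.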
How to navigate browsers with Claude Code
Next, I want to cover exactly how to navigate browsers using Claude Code. The principles I’ll cover here apply to basically any coding agent; I’m not going to cover techniques that can’t easily be generalized to other coding agents.
First, if you’re using Claude Code, it has a built-in Chrome integration which you can enable by writing the command below while you’re in the Claude Code window.
/chrome
Codex also has a corresponding command.
This simply gives Claude access to open Chrome on your computer and use it to verify tasks.
I think the Chrome integration in Claude works alright, but it’s not optimal.
I’ve had a better experience using the Playwright MCP, which you can install in Claude Code simply by telling Claude Code to install it:
Install the Playwright MCP to interact with the browser
After Claude has installed it, you need to restart Claude Code, and you’ll have access to the Playwright MCP. In my experience, Claude is more effective at completing tasks when it uses the Playwright MCP instead of the /chrome integration that ships with baseline Claude Code.
Of course, if you have another coding agent, you can do exactly the same: tell it to install the Playwright MCP. The agent will install the MCP, you restart the agent, and it’ll have access to Playwright.
How do I make my agent test my implementation
Now that you’ve installed the Playwright MCP and given your agent access to the browser, you can use it to test your implementations.
Whenever your agent has implemented something (for example, a new design from a design file), simply tell the agent to verify its work end-to-end by going through it in Chrome with the Playwright MCP.
It’s also useful to tell the agent not to stop and come back to you before it has verified its work end-to-end. Verifying the work end-to-end, in this case, means actually interacting with the browser and seeing whether something works.
I often also use the /goal feature, which is available in both Codex and Claude Code, and which basically makes the agent keep working towards a task until it’s achieved. I’ll then typically write something like:
/goal continue working on the task, implementing until you have
fully implemented it and tested and verified it end to end by interacting
with the browser using the playwright MCP, taking screenshots, and
verifying your work, only come back to me once you have both implemented
and fully tested the implementation successfully.
This will make the agent continue working towards the goal and verifying it, and only come back to you once it has verified its work. This has saved me an incredible amount of time and is especially useful if you only want the agent to implement designs.
Conclusion
In this article, I covered how to use Claude Code to verify work in your browser. I first discussed why coding agents can and should interact with your browser. Then I took you through how browser navigation actually works with coding agents, which is a pretty simple concept. Finally, I went specifically into how you can navigate browsers using Claude Code or other coding agents.
I believe browser navigation will remain important because a lot of the ways humans interact with the world are through a browser. However, it’s worth noting that coding agents are still far more effective at using APIs and MCPs, so if you can interact with a service through those means instead, you should basically always do that.