Why every other browser automation is broken for agents
Here's the dirty secret about Playwright, Selenium, and even Anthropic's Computer Use: they have no memory. Every single run is a blank slate. Your agent shows up at amazon.com, has to figure out where the search bar is, has to discover that the "Add to Cart" button has a delayed render, has to learn that the captcha shows up on the third refresh. Then the run ends. The next run? Same agent. Same site. Same discoveries. Same wasted tokens.
That's why your "simple" agent task takes 4 minutes and burns $0.40 in LLM calls. The agent isn't slow. The framework forces it to be a goldfish.
Computer Use is even worse. It works off screenshots, which means every action requires a vision call, which means latency stacks. Watching it click through a 5-step form feels like watching paint dry while someone reads you the paint can ingredient list.
You don't need a smarter model. You need a browser that remembers.
What Browser Harness actually is
Browser Harness is a thin layer between your LLM and Chrome. That's it. One websocket connection straight to Chrome via the Chrome DevTools Protocol (CDP). No middleman framework. No abstraction tax. The entire core is about 1,000 lines of code across four files.
The architecture matters because every layer you add between the agent and the browser is another place latency hides. Playwright wraps CDP. Frameworks wrap Playwright. Agent libraries wrap frameworks. By the time your LLM "clicks" a button, you've gone through five layers of translation. Browser Harness rips all of that out.
What's left is the agent talking to Chrome almost directly, with a small helper module the agent can edit itself during the run. When it discovers something useful — a selector, a wait condition, a multi-step flow — it writes that knowledge into agent_helpers.py. Next run, that code is already there. The harness literally improves itself while you watch.
The self-healing trick (the whole reason this exists)
This is the part that makes Browser Harness different from every other tool in the space.
When the agent hits a missing capability — say, "I need to click the third product card on this search page" — it doesn't fail. It doesn't ask you. It writes a helper function on the spot. Something like:
That function gets saved into the workspace. The next time the agent goes to Amazon search, the function is already there. It doesn't re-discover the selector. It doesn't re-burn tokens figuring it out. It just calls the helper and moves on.
Stack this across hundreds of runs and the agent gets sharper every single time. Errors become skills. Edge cases become handled. The framework you start with on day one is not the framework you have on day thirty.
This is what "self-healing" actually means. Not retries. Not error catching. Permanent learning.
The community skills library (you start with everyone else's wins)
You're not starting from zero. The agent-workspace/domain-skills/folder is full of pre-built playbooks for the most common sites on the internet — LinkedIn, Amazon, GitHub, and dozens more. These weren't hand-written. They were generated by other people's agents during real runs, then submitted as pull requests.
That's a big deal. Hand-written browser automation rots fast. Sites change selectors, redesign flows, swap captcha providers, and your script breaks. Community skills here are agent-generated against the current version of each site, so they reflect what actually works in the browser today, not what worked when some engineer wrote a tutorial 18 months ago.
When you start a fresh project, you inherit all of that. Your agent shows up at LinkedIn already knowing where the search bar is, how the infinite scroll loads, where the rate limits kick in. You get a Day 100 agent on Day 1.
Install it in 10 seconds
Open Claude Code (or Cursor, or any agent IDE) and paste this:
Claude reads the install.md, configures the harness, sets up the agent workspace, and gets Chrome talking to your LLM over remote debugging. Whole thing is 2-3 minutes.
Prefer to do it manually? Paste this into your terminal:
Then open the repo, read install.md, and point your agent at SKILL.md for daily usage.
Browser Use Cloud (the free tier is actually useful)
The same team runs a hosted version called Browser Use Cloud. Free tier gives you 3 concurrent browsers, stealth mode (so sites don't flag you as a bot), captcha solving built in, and proxy rotation. No credit card to start.
You don't need it to use Browser Harness — local Chrome works fine. But if you're running multiple agents in parallel, or hitting sites that block headless browsers, the cloud version skips a week of infrastructure pain. The stealth + captcha combo alone is worth it for anything in the "scrape behind a login wall" category.
For local builds and prototypes, run it on your machine. When you're ready to scale or run agents on a schedule without your laptop being on, flip it to the cloud.
When to use it, when to skip it
Use it for:
- Agents that hit the same sites repeatedly (your CRM, LinkedIn, an e-commerce dashboard)
- Long-running workflows where speed compounds (10 minutes saved per run × 1,000 runs)
- Anything where Computer Use felt too slow or too expensive in tokens
- Building your own private skill library for niche sites the community hasn't covered
Skip it for:
- One-off scrapes you'll never run again (overkill, just use Playwright)
- Tasks where the "thinking" matters more than the clicking (pure research, summarization)
- Sites that aggressively rotate selectors and DOM structure (the skill library decays faster than it improves)
Rule of thumb: if your agent is going to visit a site more than 5 times in its life, Browser Harness pays for itself in saved tokens and saved minutes by run 3.