ArgentOS Docs

Browser Automation

Use the browser tool to navigate, interact with, and extract content from web pages.

Overview

The browser tool gives your agent web browsing capabilities: it can navigate to URLs, interact with page elements, extract content, and take screenshots. It is powered by Playwright, which drives a headless browser on the host system.

Capabilities

Action      Description
navigate    Open a URL in the browser
click       Click on an element
type        Type text into an input field
extract     Extract text content from the page
screenshot  Take a screenshot of the page
scroll      Scroll the page
wait        Wait for an element or condition
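
The action names in the table can be collected into a small helper that builds a well-formed tool call and rejects unknown actions. This is a minimal sketch, assuming the payload shape used in the usage examples on this page (`tool` plus `input.action`); the helper name `browser_call` is hypothetical and not part of ArgentOS.

```python
import json

# Action names taken from the capabilities table.
BROWSER_ACTIONS = frozenset(
    {"navigate", "click", "type", "extract", "screenshot", "scroll", "wait"}
)


def browser_call(action: str, **params: object) -> dict:
    """Build a browser tool call, rejecting actions the table does not list.

    Hypothetical helper for illustration. Parameter names are passed through
    unchecked because this page documents only some of them.
    """
    if action not in BROWSER_ACTIONS:
        raise ValueError(f"unknown browser action: {action!r}")
    return {"tool": "browser", "input": {"action": action, **params}}


call = browser_call("navigate", url="https://news.ycombinator.com")
payload = json.dumps(call, indent=2)  # JSON text ready to hand to the agent
```

Validating the action name before serializing catches typos such as `"navigat"` locally, before the call ever reaches the browser.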

Usage Examples

The agent can browse a website and extract relevant information:

{
  "tool": "browser",
  "input": {
    "action": "navigate",
    "url": "https://news.ycombinator.com"
  }
}

Then extract the text of the elements that match a selector:

{
  "tool": "browser",
  "input": {
    "action": "extract",
    "selector": ".titleline"
  }
}
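
The two calls above form a sequence: `extract` operates on whatever page the preceding `navigate` loaded, so order matters. A rough sketch of issuing calls in order follows. The `execute` callback is a hypothetical stand-in for whatever delivers a tool call to the agent runtime, since this page does not show that interface, and the stub used here only records what it receives.

```python
from typing import Any, Callable, Dict, List

ToolCall = Dict[str, Any]


def run_sequence(calls: List[ToolCall], execute: Callable[[ToolCall], Any]) -> List[Any]:
    """Send calls one at a time, in order, and collect each result.

    `execute` is an assumed transport function, not an ArgentOS API.
    """
    results = []
    for call in calls:
        results.append(execute(call))
    return results


calls = [
    {"tool": "browser", "input": {"action": "navigate", "url": "https://news.ycombinator.com"}},
    {"tool": "browser", "input": {"action": "extract", "selector": ".titleline"}},
]

# Stub executor: records the action names instead of driving a browser.
seen: List[str] = []


def record(call: ToolCall) -> None:
    seen.append(call["input"]["action"])


run_sequence(calls, record)
```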

Form Interaction

{
  "tool": "browser",
  "input": {
    "action": "type",
    "selector": "#search-input",
    "text": "ArgentOS documentation"
  }
}
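
Typing into a field usually has to be followed by a click on a submit control. The sketch below builds that pair of steps. It assumes that `click` accepts a `selector` in the same way `type` and `extract` do, which this page does not document; the function name and the `#search-button` selector are made up for illustration.

```python
from typing import Dict, List


def fill_and_submit(fields: Dict[str, str], submit_selector: str) -> List[dict]:
    """Return one `type` call per field, then a `click` on the submit control.

    Assumption: `click` takes a `selector` parameter like `type` does.
    """
    calls = [
        {"tool": "browser", "input": {"action": "type", "selector": sel, "text": text}}
        for sel, text in fields.items()
    ]
    calls.append(
        {"tool": "browser", "input": {"action": "click", "selector": submit_selector}}
    )
    return calls


# "#search-button" is a hypothetical selector, not taken from a real page.
calls = fill_and_submit({"#search-input": "ArgentOS documentation"}, "#search-button")
```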

Configuration

{
  "agents": {
    "defaults": {
      "browser": {
        "headless": true,
        "timeout": 30000,
        "viewport": { "width": 1280, "height": 720 }
      }
    }
  }
}
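
These settings resemble options that Playwright itself exposes. To illustrate what each one controls, the sketch below maps the `browser` block onto Playwright's Python API. It is an assumption about intent, not a description of how ArgentOS applies the settings internally: the function names are hypothetical, and `timeout` is assumed to be in milliseconds, the unit Playwright uses.

```python
from typing import Any, Dict

config = {
    "agents": {
        "defaults": {
            "browser": {
                "headless": True,
                "timeout": 30000,
                "viewport": {"width": 1280, "height": 720},
            }
        }
    }
}


def playwright_options(cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Split the `browser` settings by the Playwright call each would feed."""
    browser_cfg = cfg["agents"]["defaults"]["browser"]
    return {
        "launch": {"headless": browser_cfg["headless"]},
        "page": {"viewport": browser_cfg["viewport"]},
        "default_timeout_ms": browser_cfg["timeout"],  # assumed milliseconds
    }


def open_page_title(cfg: Dict[str, Any], url: str) -> str:
    """Open `url` with the mapped options and return the page title.

    Needs the `playwright` Python package; it is imported lazily so the
    mapping above can be used without it.
    """
    from playwright.sync_api import sync_playwright

    opts = playwright_options(cfg)
    with sync_playwright() as p:
        browser = p.chromium.launch(**opts["launch"])
        page = browser.new_page(**opts["page"])
        page.set_default_timeout(opts["default_timeout_ms"])
        page.goto(url)
        title = page.title()
        browser.close()
        return title


opts = playwright_options(config)
```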

Prerequisites

The browser tool requires Playwright's Chromium browser to be installed on the host:

npx playwright install chromium

Security Considerations

  • The browser runs with the same permissions as the ArgentOS process
  • Be cautious with agents that have both the browser and exec tools: combined, they let an agent fetch arbitrary web content and run commands on the host
  • Consider restricting allowed domains in production environments
  • Cookie and session data persists between browser actions within a session
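
One way to approach the domain restriction suggested above is to check each URL against an allowlist before issuing a `navigate` call. The following is a minimal sketch of such a pre-check, not a built-in ArgentOS feature; the function name and the example allowlist are hypothetical.

```python
from urllib.parse import urlsplit


def is_allowed(url: str, allowed_domains: set) -> bool:
    """Return True if `url` is http(s) and its host is an allowed domain
    or a subdomain of one."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    # Match on a dot boundary so "example.com.evil.test" cannot pass
    # as "example.com".
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


# Example allowlist for illustration only.
ALLOWED = {"news.ycombinator.com"}
ok = is_allowed("https://news.ycombinator.com/newest", ALLOWED)
```

Comparing the parsed hostname, rather than searching the raw URL string, avoids accepting URLs that merely contain an allowed domain in their path or as a prefix of another host.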

Limitations

  • Browser state does not persist across gateway restarts
  • JavaScript-heavy SPAs may require explicit wait conditions
  • File downloads through the browser require separate handling
  • The browser runs headless by default, so there is no visible window to observe; use the screenshot action to inspect page state
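
For the SPA limitation above, a common pattern is to place a `wait` step between `navigate` and `extract` so that client-side rendering has finished before content is read. The sketch below builds that three-step sequence. It assumes `wait` accepts a `selector`, which this page does not document; the function name, URL, and selector are placeholders.

```python
from typing import List


def extract_when_ready(url: str, selector: str) -> List[dict]:
    """Build navigate -> wait -> extract calls for a client-rendered page."""

    def call(action: str, **params: str) -> dict:
        return {"tool": "browser", "input": {"action": action, **params}}

    return [
        call("navigate", url=url),
        # Assumption: `wait` takes a `selector`; its parameters are not
        # documented on this page.
        call("wait", selector=selector),
        call("extract", selector=selector),
    ]


# Placeholder URL and selector for illustration.
steps = extract_when_ready("https://example.com/app", "#app .result")
```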