feat(skill): add agent browser cdp flow skills and helpers

gameloader
2026-03-15 14:02:55 +08:00
commit 50e16615a8
6 changed files with 977 additions and 0 deletions

3
.gitignore vendored Normal file

@@ -0,0 +1,3 @@
__pycache__/
*.pyc
.DS_Store


@@ -0,0 +1,117 @@
---
name: agent-browser-browserless-openai-signup-smoke
description: Connect Agent Browser to Browserless or Browser Use CDP and automate OpenAI web flows. Use when Codex needs to (1) smoke-test ChatGPT signup from `chatgpt.com` through email/password/email-verification, or (2) drive an OpenAI OAuth authorize URL through login, email verification, Codex consent, and capture the final `localhost` callback URL. Includes Browserless troubleshooting notes, a Browser Use fallback workflow, and reusable Python helper scripts.
---
# Agent Browser CDP Flows
## Overview
Use this skill for two validated Agent Browser workflows:
- Browserless stealth CDP for ChatGPT signup smoke tests.
- Browser Use cloud browser + Agent Browser CDP attachment for OpenAI OAuth callback capture.
Prefer the Browser Use workflow for the OAuth callback task. It was more stable than Browserless for the final `localhost` redirect.
## Quick Start
### Browserless signup smoke
- Export `BROWSERLESS_TOKEN`.
- Run `python3 skills/agent-browser-browserless-openai-signup-smoke/scripts/openai_signup_smoke.py`.
- Use a non-deliverable but syntactically valid test email such as an `example.com` address.
- Prefer a fresh email alias on each run; reusing the same fake address can route the flow to `log-in/password` instead of the signup password page.
- Stop after the email-verification page unless the user explicitly asks for more.
Example:
```bash
export BROWSERLESS_TOKEN='...'
python3 skills/agent-browser-browserless-openai-signup-smoke/scripts/openai_signup_smoke.py \
--email 'agent-browser-smoke@example.com' \
--password 'TempPass!20260315' \
--output /tmp/openai-signup-smoke.json
```
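The advice above about using a fresh alias on each run can be sketched as a small generator. This is a minimal illustration, not part of the shipped helpers: the function name `fresh_test_email` and the random suffix are assumptions added here so that two runs in the same second still differ.

```python
import secrets
from datetime import datetime, timezone


def fresh_test_email(prefix: str = "agent-browser-smoke", domain: str = "example.com") -> str:
    """Return a syntactically valid, non-deliverable address that differs per call."""
    # Timestamp keeps runs sortable; the random suffix avoids same-second collisions.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
    suffix = secrets.token_hex(3)
    return f"{prefix}-{stamp}-{suffix}@{domain}"


if __name__ == "__main__":
    print(fresh_test_email())
```

The result can be passed to the smoke script through `--email`.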
### Browser Use OAuth callback capture
- Install `browser-use-sdk` in a local environment.
- Export `BROWSER_USE_API_KEY`.
- Run `python3 skills/agent-browser-browserless-openai-signup-smoke/scripts/browseruse_oauth_callback.py` with the OAuth authorize URL, login email, and password.
- Omit `--code` to stop at the email-verification page.
- Add `--code ... --approve-consent` to continue through the Codex consent page and capture the final `localhost` callback URL.
Example: stop at verification
```bash
export BROWSER_USE_API_KEY='...'
python3 skills/agent-browser-browserless-openai-signup-smoke/scripts/browseruse_oauth_callback.py \
--oauth-url 'https://auth.openai.com/oauth/authorize?...' \
--email 'user@example.com' \
--password 'secret' \
--output /tmp/browseruse-oauth.json
```
Example: capture callback after receiving the email code
```bash
export BROWSER_USE_API_KEY='...'
python3 skills/agent-browser-browserless-openai-signup-smoke/scripts/browseruse_oauth_callback.py \
--oauth-url 'https://auth.openai.com/oauth/authorize?...' \
--email 'user@example.com' \
--password 'secret' \
--code '123456' \
--approve-consent \
--trace-path /tmp/browseruse-oauth-trace.zip \
--output /tmp/browseruse-oauth.json
```
## Workflow A: Browserless Signup Smoke
1. Build the Browserless stealth websocket URL:
`wss://production-sfo.browserless.io/chrome/stealth?token=...`
2. Connect Agent Browser to that websocket with a dedicated session name.
3. Open `https://chatgpt.com` and wait briefly for the homepage to stabilize.
4. Click `Sign up for free` with a role-based locator instead of brittle element references.
5. Fill a format-valid test email and submit.
6. Fill a test password and submit.
7. Read back the resulting URL, title, and interactive snapshot.
8. Treat `https://auth.openai.com/email-verification` / `Check your inbox - OpenAI` as a successful smoke-test result.
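The steps above can be sketched as a plain command plan. The snippet below only builds the argument lists and does not launch anything; the `agent-browser` subcommands and selectors mirror the ones used in `scripts/openai_signup_smoke.py`, while the function names and the URL-encoding of the token are illustrative assumptions.

```python
from urllib.parse import urlencode


def build_stealth_ws_url(token: str, host: str = "production-sfo.browserless.io") -> str:
    """Build the Browserless stealth websocket URL for a given token."""
    if not token:
        raise ValueError("a Browserless token is required")
    # urlencode is an added precaution for tokens with reserved characters.
    return f"wss://{host}/chrome/stealth?{urlencode({'token': token})}"


def plan_signup_commands(
    ws_url: str,
    email: str,
    password: str,
    binary: str = "agent-browser",
    session: str = "browserless-signup-smoke",
) -> list[list[str]]:
    """Return the argv lists for the signup smoke flow, in execution order."""
    steps = [
        ("connect", ws_url),
        ("open", "https://chatgpt.com"),
        ("wait", "5000"),
        ("find", "role", "button", "click", "--name", "Sign up for free"),
        ("fill", "input[name='email']", email),
        ("click", "button[type='submit']"),
        ("fill", "input[type='password']", password),
        ("click", "button[type='submit']"),
        ("get", "url"),
        ("get", "title"),
    ]
    return [[binary, "--session", session, "--json", *step] for step in steps]


if __name__ == "__main__":
    url = build_stealth_ws_url("example-token")
    for argv in plan_signup_commands(url, "smoke@example.com", "TempPass!"):
        print(" ".join(argv))
```

Printing the plan first is roughly what the script's `--dry-run` flag does, and it is a cheap way to review selectors before spending a Browserless session.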
## Workflow B: Browser Use OAuth Callback Capture
1. Create a Browser Use cloud browser with the SDK.
2. Read `cdp_url` from the Browser Use session, then fetch `/json/version` to get `webSocketDebuggerUrl`.
3. Connect Agent Browser to that websocket URL.
4. Open the OpenAI OAuth authorize URL.
5. Log in with the provided email and password.
6. If no verification code is available yet, stop on `https://auth.openai.com/email-verification`.
7. When a verification code is available, submit it and wait for `https://auth.openai.com/sign-in-with-chatgpt/codex/consent`.
8. Before pressing the consent `Continue` button, route `http://localhost:1455/*` so the callback URL survives even if the upstream app is not listening.
9. Optionally start `agent-browser trace` before the consent click.
10. Click consent `Continue`, then read the full callback URL from the current page URL.
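Step 2 is the least obvious part of this workflow, so here is a minimal sketch of it using only the standard library. It assumes the `/json/version` response is a JSON object with a `webSocketDebuggerUrl` key, as described above; the function names are illustrative, and the shipped helper uses `httpx` instead of `urllib`.

```python
import json
from urllib.request import urlopen


def version_endpoint(cdp_url: str) -> str:
    """Return the /json/version URL for an HTTP CDP endpoint."""
    return cdp_url.rstrip("/") + "/json/version"


def extract_websocket_url(payload: dict) -> str:
    """Pull the websocket debugger URL out of a /json/version payload."""
    ws_url = payload.get("webSocketDebuggerUrl")
    if not isinstance(ws_url, str) or not ws_url.startswith(("ws://", "wss://")):
        raise ValueError("payload has no usable webSocketDebuggerUrl")
    return ws_url


def resolve_websocket_url(cdp_url: str, timeout: float = 30.0) -> str:
    """Fetch /json/version from the CDP endpoint and return the websocket URL."""
    with urlopen(version_endpoint(cdp_url), timeout=timeout) as response:
        return extract_websocket_url(json.load(response))
```

The returned value is what gets passed to `agent-browser connect` in step 3.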
## Expected Observations
- Browserless plain CDP can stall on `Just a moment...`; Browserless stealth is better for signup smoke tests.
- Browser Use's direct `agent-browser -p browseruse` provider path can fail, while Browser Use SDK session creation plus explicit CDP attachment works.
- Browser Use-backed Agent Browser runs were more stable than Browserless for the OAuth callback flow.
- A successful callback capture can still render a proxy or upstream error page; if the current URL begins with `http://localhost:1455/auth/callback?...`, the callback was captured successfully.
- Reused or stale verification codes show `Incorrect code` on the email-verification page.
See `references/observations.md` for the concrete results captured during testing.
## Troubleshooting
- If the title is `Just a moment...`, reconnect with the Browserless stealth endpoint.
- If Agent Browser falls back to `chrome://new-tab-page/`, discard that run and start a fresh end-to-end session.
- Use fresh verification codes quickly; they expire and cannot be replayed reliably.
- Browser Use browser creation currently accepts `timeout <= 240`.
- If the final page shows `502 Bad Gateway` or `localhost refused to connect`, inspect the current URL before assuming failure.
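The checks above can be folded into one small classifier that looks at the URL before the title or page body. This is a hedged sketch: the function name `diagnose_page` and the returned labels are made up for illustration, and the matching rules are just the ones listed in this section.

```python
def diagnose_page(url: str, title: str) -> str:
    """Map the current URL and title to one of the situations described above."""
    # Check the URL first: a captured callback can still render an error page.
    if url.startswith("http://localhost:1455/auth/callback?"):
        return "callback_captured"
    if url.startswith("chrome://new-tab-page"):
        return "lost_session"  # discard the run and start a fresh session
    if title.strip() == "Just a moment...":
        return "challenge_page"  # reconnect through the stealth endpoint
    if "/email-verification" in url:
        return "verification_reached"
    return "unknown"


if __name__ == "__main__":
    print(diagnose_page("http://localhost:1455/auth/callback?code=x", "502 Bad Gateway"))
```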
## Resources
- `scripts/openai_signup_smoke.py` runs the Browserless signup smoke flow and emits a machine-readable summary.
- `scripts/browseruse_oauth_callback.py` creates a Browser Use cloud browser, attaches Agent Browser via CDP, and captures OpenAI OAuth callback URLs.
- `references/observations.md` records the tested Browserless and Browser Use behavior.


@@ -0,0 +1,4 @@
interface:
  display_name: "Agent Browser CDP Flows"
  short_description: "Signup smoke and OAuth callback flows"
  default_prompt: "Use this skill to drive OpenAI web flows with Agent Browser over Browserless or Browser Use CDP, including ChatGPT signup smoke tests and OAuth callback capture through the Codex consent page."


@@ -0,0 +1,55 @@
# Observations
## Browserless signup smoke
- Browserless plain CDP websocket `wss://production-sfo.browserless.io/?token=...` loaded `chatgpt.com` into `Just a moment...` during Agent Browser testing.
- Browserless stealth websocket `wss://production-sfo.browserless.io/chrome/stealth?token=...` loaded the normal ChatGPT homepage and allowed interaction.
- From `https://chatgpt.com/`, `Sign up for free` opened the signup modal without navigating away from the page.
- The signup modal exposed:
  - `textbox "Email address"`
  - `button "Continue"`
  - social-login buttons
### Validation outcomes
- Invalid string `not-an-email`
  - Field validity failed.
  - Native validation reported a missing `@`.
  - The `Continue` button was still clickable.
  - The flow did not advance.
- Valid fake email `browserless-smoke-20260315@example.com`
  - Field validity passed.
  - Submitting advanced to `https://auth.openai.com/create-account/password`.
  - Title became `Create a password - OpenAI`.
  - Reusing the same fake email on later runs can route to `https://auth.openai.com/log-in/password` instead.
- Valid fake email + fake password `TempPass!20260315`
  - Submitting the password advanced to `https://auth.openai.com/email-verification`.
  - Title became `Check your inbox - OpenAI`.
  - The page exposed a `Code` input, `Continue`, and `Resend email`.
## Browser Use OAuth callback capture
- `agent-browser -p browseruse ...` failed with `Failed to connect to CDP on port 9222` in this environment.
- Creating a Browser Use cloud browser with the SDK and then attaching Agent Browser to the returned CDP websocket worked.
- The Browser Use browser session returned:
  - a `live_url`
  - an HTTP `cdp_url`
  - a websocket debugger URL discoverable from `cdp_url/json/version`
- After OpenAI login and password submission, the flow reached `https://auth.openai.com/email-verification` reliably.
- A valid email code advanced to `https://auth.openai.com/sign-in-with-chatgpt/codex/consent`.
- Routing `http://localhost:1455/*` before the final consent click preserved the callback URL.
- The final page could still show upstream/proxy errors, but the current URL contained the complete callback in the form:
  - `http://localhost:1455/auth/callback?code=...&scope=...&state=...`
## Direct auth URL behavior
- Opening `https://auth.openai.com/log-in-or-create-account` directly did not show the signup form during testing.
- It showed `Your session has ended` plus a `Log in` link instead.
## Recommended stopping points
- For signup smoke tests, treat arrival at `https://auth.openai.com/email-verification` as success.
- For OAuth callback capture, treat arrival at `http://localhost:1455/auth/callback?...` as success even if the page body shows a local proxy or upstream error.
- Do not finish account creation or enter additional secrets unless the user explicitly asks and the action is allowed.


@@ -0,0 +1,406 @@
#!/usr/bin/env python3
from __future__ import annotations

import argparse
import json
import os
import shutil
import subprocess
import sys
import time
from dataclasses import dataclass
from pathlib import Path
from typing import Any

try:
    import httpx
    from browser_use_sdk import BrowserUse
except ImportError as exc:  # pragma: no cover
    raise SystemExit(
        "Missing dependency. Install `browser-use-sdk` first, e.g. `uv pip install browser-use-sdk`."
    ) from exc

DEFAULT_PROXY_COUNTRY = "us"
DEFAULT_BROWSER_TIMEOUT = 240
DEFAULT_WAIT_MS = 7000
DEFAULT_LOCALHOST_PATTERN = "http://localhost:1455/*"
DEFAULT_CALLBACK_FRAGMENT = "http://localhost:1455/auth/callback"
DEFAULT_LOGIN_EMAIL_SELECTOR = "input[name='username'], input[type='email']"
DEFAULT_PASSWORD_SELECTOR = "input[type='password']"
DEFAULT_CODE_SELECTOR = "input"
DEFAULT_SUBMIT_SELECTOR = "button[type='submit']"
DEFAULT_CONSENT_URL_FRAGMENT = "/sign-in-with-chatgpt/codex/consent"
DEFAULT_VERIFICATION_URL_FRAGMENT = "/email-verification"


@dataclass
class BrowserUseSession:
    id: str
    live_url: str | None
    cdp_http_url: str
    websocket_url: str
    timeout_at: str | None


@dataclass
class CommandResult:
    command: list[str]
    returncode: int
    stdout: str
    stderr: str
    parsed: Any | None = None


class AgentBrowserRunner:
    def __init__(self, binary: str, session: str, verbose: bool = False) -> None:
        self.binary = binary
        self.session = session
        self.verbose = verbose

    def run(self, *args: str, expect_json: bool = True, retries: int = 0) -> CommandResult:
        attempt = 0
        while True:
            command = [self.binary, "--session", self.session]
            if expect_json:
                command.append("--json")
            command.extend(args)
            if self.verbose:
                print("+", " ".join(command), file=sys.stderr)
            completed = subprocess.run(
                command,
                check=False,
                capture_output=True,
                text=True,
            )
            stdout = completed.stdout.strip()
            stderr = completed.stderr.strip()
            parsed = None
            if expect_json and stdout:
                try:
                    parsed = json.loads(stdout)
                except json.JSONDecodeError as exc:
                    raise RuntimeError(
                        f"Failed to parse JSON from agent-browser for command {args!r}: {stdout}"
                    ) from exc
            result = CommandResult(
                command=command,
                returncode=completed.returncode,
                stdout=stdout,
                stderr=stderr,
                parsed=parsed,
            )
            failed = completed.returncode != 0 or (
                expect_json and isinstance(parsed, dict) and not parsed.get("success", False)
            )
            if not failed:
                return result
            # Retry only failures that look like dropped CDP connections.
            if attempt < retries and self._is_transient_failure(result):
                attempt += 1
                time.sleep(1.0)
                continue
            raise RuntimeError(self._format_error(result))

    @staticmethod
    def _format_error(result: CommandResult) -> str:
        stdout = f"\nstdout: {result.stdout}" if result.stdout else ""
        stderr = f"\nstderr: {result.stderr}" if result.stderr else ""
        return (
            f"agent-browser command failed ({result.returncode}): {' '.join(result.command)}"
            f"{stdout}{stderr}"
        )

    @staticmethod
    def _is_transient_failure(result: CommandResult) -> bool:
        haystack = f"{result.stdout}\n{result.stderr}".lower()
        transient_markers = (
            "cdp response channel closed",
            "target closed",
            "websocket",
            "socket closed",
            "connection closed",
            "econnreset",
            "broken pipe",
        )
        return any(marker in haystack for marker in transient_markers)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Create a Browser Use cloud browser, attach Agent Browser via CDP, and drive an OpenAI OAuth flow.",
    )
    parser.add_argument("--browser-use-api-key", default=os.getenv("BROWSER_USE_API_KEY"))
    parser.add_argument("--oauth-url", required=True)
    parser.add_argument("--email", required=True)
    parser.add_argument("--password", required=True)
    parser.add_argument("--code")
    parser.add_argument("--approve-consent", action="store_true")
    parser.add_argument("--session", default="browseruse-oauth")
    parser.add_argument("--agent-browser-bin", default=shutil.which("agent-browser") or "agent-browser")
    parser.add_argument("--proxy-country-code", default=DEFAULT_PROXY_COUNTRY)
    parser.add_argument("--browser-timeout", type=int, default=DEFAULT_BROWSER_TIMEOUT)
    parser.add_argument("--wait-ms", type=int, default=DEFAULT_WAIT_MS)
    parser.add_argument("--localhost-pattern", default=DEFAULT_LOCALHOST_PATTERN)
    parser.add_argument("--callback-fragment", default=DEFAULT_CALLBACK_FRAGMENT)
    parser.add_argument("--login-email-selector", default=DEFAULT_LOGIN_EMAIL_SELECTOR)
    parser.add_argument("--password-selector", default=DEFAULT_PASSWORD_SELECTOR)
    parser.add_argument("--code-selector", default=DEFAULT_CODE_SELECTOR)
    parser.add_argument("--submit-selector", default=DEFAULT_SUBMIT_SELECTOR)
    parser.add_argument("--trace-path", type=Path)
    parser.add_argument("--output", type=Path)
    parser.add_argument("--stop-browser", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--verbose", action="store_true")
    return parser


def extract_data(result: CommandResult) -> Any:
    if isinstance(result.parsed, dict):
        return result.parsed.get("data")
    return result.parsed


def create_browser_use_session(api_key: str, proxy_country_code: str, timeout: int) -> BrowserUseSession:
    client = BrowserUse(api_key=api_key)
    browser = client.browsers.create(proxy_country_code=proxy_country_code, timeout=timeout)
    cdp_http_url = browser.cdp_url.rstrip("/")
    # The HTTP CDP endpoint advertises the websocket URL under /json/version.
    version = httpx.get(f"{cdp_http_url}/json/version", timeout=30)
    version.raise_for_status()
    websocket_url = version.json()["webSocketDebuggerUrl"]
    return BrowserUseSession(
        id=browser.id,
        live_url=browser.live_url,
        cdp_http_url=cdp_http_url,
        websocket_url=websocket_url,
        # Keep a missing timeout as None instead of the string "None".
        timeout_at=str(browser.timeout_at) if browser.timeout_at is not None else None,
    )


def run_step(
    runner: AgentBrowserRunner,
    command: tuple[str, ...],
    expect_json: bool = True,
    retries: int | None = None,
) -> CommandResult:
    if retries is None:
        # Only idempotent commands are retried by default.
        retries = 2 if command and command[0] in {"connect", "open", "wait", "get", "snapshot"} else 0
    return runner.run(*command, expect_json=expect_json, retries=retries)


def get_url(runner: AgentBrowserRunner) -> str | None:
    result = run_step(runner, ("get", "url"))
    data = extract_data(result) or {}
    return data.get("url")


def run_until_success(
    runner: AgentBrowserRunner,
    command: tuple[str, ...],
    timeout_ms: int,
    expect_json: bool = True,
) -> CommandResult:
    deadline = time.time() + timeout_ms / 1000
    last_error: Exception | None = None
    while time.time() < deadline:
        try:
            return run_step(runner, command, expect_json=expect_json)
        except RuntimeError as exc:
            last_error = exc
            time.sleep(1.0)
    raise RuntimeError(f"Timed out waiting for command {command!r}. Last error: {last_error}")


def poll_url_contains(runner: AgentBrowserRunner, fragment: str, timeout_ms: int) -> str:
    deadline = time.time() + timeout_ms / 1000
    last_url = None
    while time.time() < deadline:
        last_url = get_url(runner)
        if last_url and fragment in last_url:
            return last_url
        time.sleep(1.0)
    raise RuntimeError(f"Timed out waiting for URL containing {fragment!r}. Last URL: {last_url!r}")


def append_result(results: list[dict[str, Any]], command: tuple[str, ...], result: CommandResult, expect_json: bool) -> None:
    entry: dict[str, Any] = {
        "command": list(command),
        "returncode": result.returncode,
    }
    if expect_json:
        entry["data"] = extract_data(result)
    else:
        entry["stdout"] = result.stdout
    results.append(entry)


def stop_browser_use_session(api_key: str, browser_id: str) -> None:
    client = BrowserUse(api_key=api_key)
    client.browsers.stop(browser_id)


def run_flow(args: argparse.Namespace) -> dict[str, Any]:
    if args.dry_run:
        return {
            "mode": "dry-run",
            "oauth_url": args.oauth_url,
            "session": args.session,
            "email": args.email,
            "approve_consent": args.approve_consent,
            "has_code": bool(args.code),
            "trace_path": str(args.trace_path) if args.trace_path else None,
        }
    if not args.browser_use_api_key:
        raise SystemExit("Pass --browser-use-api-key or set BROWSER_USE_API_KEY.")

    browser_session = create_browser_use_session(
        api_key=args.browser_use_api_key,
        proxy_country_code=args.proxy_country_code,
        timeout=args.browser_timeout,
    )
    runner = AgentBrowserRunner(args.agent_browser_bin, args.session, verbose=args.verbose)
    results: list[dict[str, Any]] = []
    trace_started = False
    try:
        result = run_step(runner, ("connect", browser_session.websocket_url))
        append_result(results, ("connect", browser_session.websocket_url), result, True)
        result = run_step(runner, ("open", args.oauth_url))
        append_result(results, ("open", args.oauth_url), result, True)
        result = run_step(runner, ("wait", str(args.wait_ms)))
        append_result(results, ("wait", str(args.wait_ms)), result, True)

        result = run_until_success(runner, ("fill", args.login_email_selector, args.email), timeout_ms=args.wait_ms)
        append_result(results, ("fill", args.login_email_selector, args.email), result, True)
        result = run_step(runner, ("click", args.submit_selector))
        append_result(results, ("click", args.submit_selector), result, True)
        result = run_step(runner, ("wait", "3000"))
        append_result(results, ("wait", "3000"), result, True)

        # Secrets are masked in the recorded command, never in the real one.
        result = run_until_success(runner, ("fill", args.password_selector, args.password), timeout_ms=args.wait_ms)
        append_result(results, ("fill", args.password_selector, "********"), result, True)
        result = run_step(runner, ("click", args.submit_selector))
        append_result(results, ("click", args.submit_selector), result, True)
        result = run_step(runner, ("wait", str(args.wait_ms)))
        append_result(results, ("wait", str(args.wait_ms)), result, True)

        current_url = get_url(runner)
        status = "unknown"
        callback_url = None
        if current_url and DEFAULT_VERIFICATION_URL_FRAGMENT in current_url:
            status = "verification_reached"
            if args.code:
                result = run_until_success(runner, ("fill", args.code_selector, args.code), timeout_ms=args.wait_ms)
                append_result(results, ("fill", args.code_selector, "******"), result, True)
                result = run_step(runner, ("click", args.submit_selector))
                append_result(results, ("click", args.submit_selector), result, True)
                result = run_step(runner, ("wait", str(args.wait_ms)))
                append_result(results, ("wait", str(args.wait_ms)), result, True)
                current_url = get_url(runner)

        if current_url and DEFAULT_CONSENT_URL_FRAGMENT in current_url:
            status = "consent_reached"
            if args.approve_consent:
                result = run_step(runner, ("network", "requests", "--clear"))
                append_result(results, ("network", "requests", "--clear"), result, True)
                # Stub localhost so the callback URL survives without a listener.
                result = run_step(runner, ("network", "route", args.localhost_pattern, "--body", '{"ok":true}'))
                append_result(results, ("network", "route", args.localhost_pattern, "--body", '{"ok":true}'), result, True)
                if args.trace_path:
                    result = run_step(runner, ("trace", "start"))
                    append_result(results, ("trace", "start"), result, True)
                    trace_started = True
                result = run_step(runner, ("click", args.submit_selector))
                append_result(results, ("click", args.submit_selector), result, True)
                result = run_step(runner, ("wait", str(args.wait_ms)))
                append_result(results, ("wait", str(args.wait_ms)), result, True)
                current_url = get_url(runner)
                if current_url and args.callback_fragment in current_url:
                    callback_url = current_url
                    status = "callback_captured"

        url_result = run_step(runner, ("get", "url"))
        title_result = run_step(runner, ("get", "title"))
        text_result = run_step(runner, ("get", "text", "body"))
        snapshot_result = run_step(runner, ("snapshot", "-i", "-c"), expect_json=False)
        append_result(results, ("get", "url"), url_result, True)
        append_result(results, ("get", "title"), title_result, True)
        append_result(results, ("get", "text", "body"), text_result, True)
        append_result(results, ("snapshot", "-i", "-c"), snapshot_result, False)

        trace_output = None
        if trace_started and args.trace_path:
            trace_stop = run_step(runner, ("trace", "stop", str(args.trace_path)))
            append_result(results, ("trace", "stop", str(args.trace_path)), trace_stop, True)
            trace_output = str(args.trace_path)

        final_url = (extract_data(url_result) or {}).get("url")
        final_title = (extract_data(title_result) or {}).get("title")
        final_text = (extract_data(text_result) or {}).get("text")
        return {
            "mode": "live",
            "status": status,
            "browser_use": {
                "id": browser_session.id,
                "live_url": browser_session.live_url,
                "cdp_http_url": browser_session.cdp_http_url,
                "websocket_url": browser_session.websocket_url,
                "timeout_at": browser_session.timeout_at,
            },
            "session": args.session,
            "email": args.email,
            "oauth_url": args.oauth_url,
            # Parenthesized so an already captured callback_url is never discarded.
            "callback_url": callback_url
            or (final_url if final_url and args.callback_fragment in final_url else None),
            "final_url": final_url,
            "final_title": final_title,
            "final_text_excerpt": final_text[:500] if final_text else None,
            "final_snapshot": snapshot_result.stdout,
            "trace_path": trace_output,
            "results": results,
        }
    finally:
        if args.stop_browser:
            stop_browser_use_session(args.browser_use_api_key, browser_session.id)


def main() -> int:
    args = build_parser().parse_args()
    if not shutil.which(args.agent_browser_bin) and args.agent_browser_bin == "agent-browser":
        raise SystemExit("agent-browser is not installed or not on PATH.")
    summary = run_flow(args)
    if args.output:
        args.output.parent.mkdir(parents=True, exist_ok=True)
        args.output.write_text(json.dumps(summary, indent=2, ensure_ascii=False) + "\n")
    print(json.dumps(summary, indent=2, ensure_ascii=False))
    return 0


if __name__ == "__main__":
    raise SystemExit(main())


@@ -0,0 +1,392 @@
#!/usr/bin/env python3
from __future__ import annotations

import argparse
import json
import os
import shutil
import subprocess
import sys
import time
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path
from typing import Any

DEFAULT_HOST = "production-sfo.browserless.io"
DEFAULT_CHATGPT_URL = "https://chatgpt.com"
DEFAULT_SIGNUP_LABEL = "Sign up for free"
DEFAULT_EMAIL_SELECTOR = "input[name='email']"
DEFAULT_PASSWORD_SELECTOR = "input[type='password']"
DEFAULT_SUBMIT_SELECTOR = "button[type='submit']"
DEFAULT_WAIT_MS = 5000


def build_default_email() -> str:
    return f"agent-browser-smoke-{datetime.now().strftime('%Y%m%d-%H%M%S')}@example.com"


@dataclass
class CommandResult:
    command: list[str]
    returncode: int
    stdout: str
    stderr: str
    parsed: Any | None = None


class AgentBrowserRunner:
    def __init__(self, binary: str, session: str, verbose: bool = False) -> None:
        self.binary = binary
        self.session = session
        self.verbose = verbose

    def run(self, *args: str, expect_json: bool = True, retries: int = 0) -> CommandResult:
        attempt = 0
        while True:
            command = [self.binary, "--session", self.session]
            if expect_json:
                command.append("--json")
            command.extend(args)
            if self.verbose:
                print("+", " ".join(command), file=sys.stderr)
            completed = subprocess.run(
                command,
                check=False,
                capture_output=True,
                text=True,
            )
            stdout = completed.stdout.strip()
            stderr = completed.stderr.strip()
            parsed = None
            if expect_json and stdout:
                try:
                    parsed = json.loads(stdout)
                except json.JSONDecodeError as exc:
                    raise RuntimeError(
                        f"Failed to parse JSON from agent-browser for command {args!r}: {stdout}"
                    ) from exc
            result = CommandResult(
                command=command,
                returncode=completed.returncode,
                stdout=stdout,
                stderr=stderr,
                parsed=parsed,
            )
            failed = completed.returncode != 0 or (
                expect_json and isinstance(parsed, dict) and not parsed.get("success", False)
            )
            if not failed:
                return result
            # Retry only failures that look like dropped CDP connections.
            if attempt < retries and self._is_transient_failure(result):
                attempt += 1
                time.sleep(1.0)
                continue
            raise RuntimeError(self._format_error(result))

    @staticmethod
    def _format_error(result: CommandResult) -> str:
        stdout = f"\nstdout: {result.stdout}" if result.stdout else ""
        stderr = f"\nstderr: {result.stderr}" if result.stderr else ""
        return (
            f"agent-browser command failed ({result.returncode}): {' '.join(result.command)}"
            f"{stdout}{stderr}"
        )

    @staticmethod
    def _is_transient_failure(result: CommandResult) -> bool:
        haystack = f"{result.stdout}\n{result.stderr}".lower()
        transient_markers = (
            "cdp response channel closed",
            "target closed",
            "websocket",
            "socket closed",
            "connection closed",
            "econnreset",
        )
        return any(marker in haystack for marker in transient_markers)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Smoke-test the ChatGPT signup flow through Agent Browser over Browserless stealth CDP.",
    )
    parser.add_argument("--token", default=os.getenv("BROWSERLESS_TOKEN"))
    parser.add_argument("--ws-url", help="Override the Browserless websocket URL.")
    parser.add_argument("--host", default=DEFAULT_HOST)
    parser.add_argument("--session", default="browserless-signup-smoke")
    parser.add_argument("--agent-browser-bin", default=shutil.which("agent-browser") or "agent-browser")
    parser.add_argument("--chatgpt-url", default=DEFAULT_CHATGPT_URL)
    parser.add_argument("--signup-label", default=DEFAULT_SIGNUP_LABEL)
    parser.add_argument("--email", default=build_default_email())
    parser.add_argument(
        "--password",
        default=f"TempPass!{datetime.now().strftime('%Y%m%d')}",
    )
    parser.add_argument("--email-selector", default=DEFAULT_EMAIL_SELECTOR)
    parser.add_argument("--password-selector", default=DEFAULT_PASSWORD_SELECTOR)
    parser.add_argument("--submit-selector", default=DEFAULT_SUBMIT_SELECTOR)
    parser.add_argument("--wait-ms", type=int, default=DEFAULT_WAIT_MS)
    parser.add_argument("--output", type=Path, help="Optional path for a JSON summary.")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--verbose", action="store_true")
    return parser


def build_ws_url(args: argparse.Namespace) -> str:
    if args.ws_url:
        return args.ws_url
    if not args.token:
        raise SystemExit("Pass --token or set BROWSERLESS_TOKEN.")
    return f"wss://{args.host}/chrome/stealth?token={args.token}"


def plan_commands(args: argparse.Namespace, ws_url: str) -> list[tuple[tuple[str, ...], bool]]:
    # Each entry is (command, expect_json); used for the --dry-run preview.
    return [
        (("connect", ws_url), True),
        (("open", args.chatgpt_url), True),
        (("wait", str(args.wait_ms)), True),
        (("find", "role", "button", "click", "--name", args.signup_label), True),
        (("fill", args.email_selector, args.email), True),
        (("click", args.submit_selector), True),
        (("get", "url"), True),
        (("fill", args.password_selector, args.password), True),
        (("click", args.submit_selector), True),
        (("get", "url"), True),
        (("get", "title"), True),
        (("get", "text", "body"), True),
        (("snapshot", "-i", "-c"), False),
    ]


def run_step(
    runner: AgentBrowserRunner,
    command: tuple[str, ...],
    expect_json: bool = True,
) -> CommandResult:
    # Only idempotent commands are retried.
    retries = 2 if command and command[0] in {"connect", "open", "wait", "get", "snapshot"} else 0
    return runner.run(*command, expect_json=expect_json, retries=retries)


def run_until_success(
    runner: AgentBrowserRunner,
    command: tuple[str, ...],
    timeout_ms: int,
    expect_json: bool = True,
    poll_interval_s: float = 1.0,
) -> CommandResult:
    deadline = time.time() + (timeout_ms / 1000)
    last_error: Exception | None = None
    while time.time() < deadline:
        try:
            return run_step(runner, command, expect_json=expect_json)
        except RuntimeError as exc:
            last_error = exc
            time.sleep(poll_interval_s)
    raise RuntimeError(f"Timed out waiting for command {command!r}. Last error: {last_error}")


def extract_json_data(result: CommandResult) -> Any:
    if isinstance(result.parsed, dict):
        return result.parsed.get("data")
    return result.parsed


def get_current_url(runner: AgentBrowserRunner) -> str | None:
    result = run_step(runner, ("get", "url"))
    data = extract_json_data(result) or {}
    return data.get("url")


def poll_url_contains(
    runner: AgentBrowserRunner,
    fragment: str,
    timeout_ms: int,
    poll_interval_s: float = 1.0,
) -> str:
    deadline = time.time() + (timeout_ms / 1000)
    last_url = None
    while time.time() < deadline:
        last_url = get_current_url(runner)
        if last_url and fragment in last_url:
            return last_url
        time.sleep(poll_interval_s)
    raise RuntimeError(f"Timed out waiting for URL containing {fragment!r}. Last URL: {last_url!r}")


def poll_url_contains_any(
    runner: AgentBrowserRunner,
    fragments: tuple[str, ...],
    timeout_ms: int,
    poll_interval_s: float = 1.0,
) -> str:
    deadline = time.time() + (timeout_ms / 1000)
    last_url = None
    while time.time() < deadline:
        last_url = get_current_url(runner)
        if last_url and any(fragment in last_url for fragment in fragments):
            return last_url
        time.sleep(poll_interval_s)
    raise RuntimeError(f"Timed out waiting for URL containing one of {fragments!r}. Last URL: {last_url!r}")


def run_flow(args: argparse.Namespace) -> dict[str, Any]:
    ws_url = build_ws_url(args)
    commands = plan_commands(args, ws_url)
    if args.dry_run:
        return {
            "mode": "dry-run",
            "session": args.session,
            "ws_url": ws_url,
            "commands": [
                {
                    "expect_json": expect_json,
                    "command": [args.agent_browser_bin, "--session", args.session]
                    + (["--json"] if expect_json else [])
                    + list(command),
                }
                for command, expect_json in commands
            ],
        }

    runner = AgentBrowserRunner(
        binary=args.agent_browser_bin,
        session=args.session,
        verbose=args.verbose,
    )
    results: list[dict[str, Any]] = []
    final_url = None
    final_title = None
    final_text = None
    final_snapshot = None

    def append_result(command: tuple[str, ...], result: CommandResult, expect_json: bool = True) -> None:
        entry: dict[str, Any] = {
            "command": command,
            "returncode": result.returncode,
        }
        if expect_json:
            entry["data"] = extract_json_data(result)
        else:
            entry["stdout"] = result.stdout
        results.append(entry)

    connect_result = run_step(runner, ("connect", ws_url))
    append_result(("connect", ws_url), connect_result)
    open_result = run_step(runner, ("open", args.chatgpt_url))
    append_result(("open", args.chatgpt_url), open_result)
    wait_home_result = run_step(runner, ("wait", str(args.wait_ms)))
    append_result(("wait", str(args.wait_ms)), wait_home_result)

    signup_result = run_until_success(
        runner,
        ("find", "role", "button", "click", "--name", args.signup_label),
        timeout_ms=args.wait_ms,
    )
    append_result(("find", "role", "button", "click", "--name", args.signup_label), signup_result)

    email_fill_result = run_until_success(
        runner,
        ("fill", args.email_selector, args.email),
        timeout_ms=args.wait_ms,
    )
    append_result(("fill", args.email_selector, args.email), email_fill_result)
    submit_email_result = run_step(runner, ("click", args.submit_selector))
    append_result(("click", args.submit_selector), submit_email_result)

    # A reused address lands on log-in/password instead of create-account/password.
    password_url = poll_url_contains_any(
        runner,
        ("create-account/password", "log-in/password"),
        timeout_ms=args.wait_ms * 6,
    )
    password_url_result = run_step(runner, ("get", "url"))
    append_result(("get", "url"), password_url_result)

    password_fill_result = run_until_success(
        runner,
        ("fill", args.password_selector, args.password),
        timeout_ms=args.wait_ms,
    )
    # Record a masked password so the JSON summary never contains the real one.
    append_result(("fill", args.password_selector, "********"), password_fill_result)
    submit_password_result = run_step(runner, ("click", args.submit_selector))
    append_result(("click", args.submit_selector), submit_password_result)

    final_url = poll_url_contains(
        runner,
        "email-verification",
        timeout_ms=args.wait_ms * 6,
    )
    url_result = run_step(runner, ("get", "url"))
    append_result(("get", "url"), url_result)
    title_result = run_step(runner, ("get", "title"))
    append_result(("get", "title"), title_result)
    text_result = run_step(runner, ("get", "text", "body"))
    append_result(("get", "text", "body"), text_result)
    snapshot_result = run_step(runner, ("snapshot", "-i", "-c"), expect_json=False)
    append_result(("snapshot", "-i", "-c"), snapshot_result, expect_json=False)

    final_title = (extract_json_data(title_result) or {}).get("title")
    final_text = (extract_json_data(text_result) or {}).get("text")
    final_snapshot = snapshot_result.stdout
    reached_verification = bool(final_url and "email-verification" in final_url)
    return {
        "mode": "live",
        "session": args.session,
        "ws_url": ws_url,
        "chatgpt_url": args.chatgpt_url,
        "email": args.email,
        "password_masked": "*" * len(args.password),
        "reached_email_verification": reached_verification,
        "final_url": final_url,
        "final_title": final_title,
        "final_text_excerpt": final_text[:500] if final_text else None,
        "final_snapshot": final_snapshot,
        "results": results,
    }


def main() -> int:
    args = build_parser().parse_args()
    if not shutil.which(args.agent_browser_bin) and args.agent_browser_bin == "agent-browser":
        raise SystemExit("agent-browser is not installed or not on PATH.")
    summary = run_flow(args)
    if args.output:
        args.output.parent.mkdir(parents=True, exist_ok=True)
        args.output.write_text(json.dumps(summary, indent=2, ensure_ascii=False) + "\n")
    print(json.dumps(summary, indent=2, ensure_ascii=False))
    return 0


if __name__ == "__main__":
    raise SystemExit(main())