Evals

Evidence-backed QA, scorecards, release gates, suites, and decisions.

browser validation lane

Validate Evals in BrowserOps

Preview the plan publicly, or launch the trusted-domain browser check. When no key is entered, the check uses the server-side PLATPHORM_API_KEY configured in Vercel.

live passed

Live Public Connectivity

Evals is reachable for public discovery. Use Browserbase-backed protected runs for browser evidence.

health: 200 passed; trace echo not confirmed
apiDocs: 200 passed; trace echo not confirmed
llms: 200 passed; trace echo not confirmed
mcp: 200 passed; trace echo not confirmed
handoffReceiver: 200 passed; trace echo not confirmed
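
The probe results above can be rolled up into a single status, roughly as follows. This is a minimal sketch, not the page's actual implementation: the `ProbeResult` type, the `summarize` function, the "failed" label, and the rule that a probe passes on HTTP 200 even without a trace echo are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    name: str           # endpoint label as shown on the page, e.g. "health"
    status: int         # HTTP status code observed by the probe
    trace_echoed: bool  # True if a trace echo was confirmed (assumed meaning)


def probe_passed(result: ProbeResult) -> bool:
    # Assumption: a probe passes on HTTP 200, independent of trace echo.
    return result.status == 200


def summarize(results: list[ProbeResult]) -> dict:
    # "passed" only if every probe passed; also list probes lacking a trace echo.
    return {
        "live": "passed" if all(probe_passed(r) for r in results) else "failed",
        "trace_unconfirmed": [r.name for r in results if not r.trace_echoed],
    }


# The five probes reported above: all returned 200, none confirmed a trace echo.
RESULTS = [
    ProbeResult(name, 200, False)
    for name in ("health", "apiDocs", "llms", "mcp", "handoffReceiver")
]
SUMMARY = summarize(RESULTS)
```

Keeping trace confirmation separate from the pass/fail result mirrors how the page reports them: connectivity is "passed" while every trace echo remains unconfirmed.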

Browserbase Runtime

Protected BrowserOps runs for Evals use Browserbase. Browserbase and its project are configured, and session metadata records the BrowserOps run ID, journey ID, and target service without storing target URLs in Browserbase metadata.
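
The metadata rule above could be enforced with a small builder like the one below. It is a sketch under stated assumptions: the key names (`browserops_run_id`, `journey_id`, `target_service`) and the `"://"` check for URL-like values are hypothetical, and nothing here calls the Browserbase API.

```python
def build_session_metadata(run_id: str, journey_id: str, target_service: str) -> dict[str, str]:
    """Build session metadata holding only the three identifiers named above.

    Key names are illustrative assumptions, not a documented schema.
    """
    metadata = {
        "browserops_run_id": run_id,
        "journey_id": journey_id,
        "target_service": target_service,
    }
    # Guard (assumed heuristic): reject any value that looks like a URL,
    # since target URLs must not be stored in session metadata.
    for key, value in metadata.items():
        if "://" in value:
            raise ValueError(f"{key} looks like a URL; target URLs must not be stored")
    return metadata
```

For example, `build_session_metadata("run-1", "journey-1", "evals")` returns the three identifiers, while passing a full URL as the target service raises `ValueError`.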

tight integration contract

Evals Handoff Lane

Evals consumes BrowserOps run evidence for scorecards and release gates. When completed reports or artifacts are missing, the result is flagged as a degraded downstream evidence gap rather than filled in with fake Evals data.

evidence boundary

Evals Evidence Gap Handling

BrowserOps adapter output can be real and still degraded when completed BrowserOps report or artifact evidence is unavailable. BrowserOps treats that as a downstream evidence gap, not fake scorecard data.
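
The gap handling described above might look like this in code. This is a hedged sketch: the `classify_evidence` function, the `"complete"` and `"degraded"` status strings, and the gap labels are assumptions chosen for illustration. The point it demonstrates is that missing evidence is reported as a gap and never replaced with placeholder scorecard values.

```python
def classify_evidence(report_completed: bool, artifacts: list[str]) -> dict:
    """Classify BrowserOps evidence before it is handed to Evals.

    Returns a status plus the list of missing pieces. No scorecard data
    is synthesized when something is missing.
    """
    gaps = []
    if not report_completed:
        gaps.append("completed_report")
    if not artifacts:
        gaps.append("artifacts")
    return {
        "status": "degraded" if gaps else "complete",
        "gaps": gaps,
    }
```

A run with a completed report but no artifacts would therefore come back as degraded with a single `artifacts` gap, which downstream consumers can surface instead of scoring.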

Capability Mapping

BrowserOps may call

  • submit_browser_evidence
  • read_release_gate_decision

BrowserOps may receive

  • eval_browser_task_requested
  • release_gate_browser_check_requested
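
The capability mapping above can be expressed as two allowlists. The capability and event names come from the lists above; the helper names `may_call` and `may_receive` are hypothetical, and how the real service enforces the mapping is not stated here.

```python
# Capabilities BrowserOps may call on Evals (from the list above).
ALLOWED_CALLS = frozenset({
    "submit_browser_evidence",
    "read_release_gate_decision",
})

# Events BrowserOps may receive from Evals (from the list above).
ACCEPTED_EVENTS = frozenset({
    "eval_browser_task_requested",
    "release_gate_browser_check_requested",
})


def may_call(capability: str) -> bool:
    # Anything outside the allowlist is rejected.
    return capability in ALLOWED_CALLS


def may_receive(event: str) -> bool:
    # Unknown event types are rejected rather than silently accepted.
    return event in ACCEPTED_EVENTS
```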

Transport and Boundary

Preferred transports: openapi, mcp, webhook, trace
API docs: https://evals.platphormnews.com/api/docs
MCP status: not advertised
Auth boundary: Public scorecards may be readable; suite execution and release decisions require PLATPHORM_API_KEY.
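
The auth boundary above can be sketched as a simple check. The operation names (`read_public_scorecard`, `suite_execution`, `release_decision`) and the deny-by-default treatment of unknown operations are assumptions made for illustration. The sketch only tests whether a key was supplied; how the real service transmits or validates PLATPHORM_API_KEY is not described here.

```python
from typing import Optional

# Hypothetical operation names. Only public scorecard reads are treated as
# keyless; suite execution and release decisions need a key.
PUBLIC_OPERATIONS = frozenset({"read_public_scorecard"})


def is_authorized(operation: str, api_key: Optional[str]) -> bool:
    """Return True if the operation may proceed with the given key.

    Assumption: any operation not explicitly public requires a non-empty key.
    """
    if operation in PUBLIC_OPERATIONS:
        return True
    return bool(api_key)
```

Defaulting unknown operations to "key required" is the conservative choice for a boundary like this, since a newly added operation stays protected until it is deliberately marked public.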

Handoff Examples

browserops to evals (protected_required)

Evals remains the scorecard and release gate; BrowserOps provides browser proof for scoring and reports missing completed artifacts as degraded evidence gaps.