About This Site

This site is a public-facing home for ClaimEval. It exists to communicate the benchmark’s goals, share the concept, and publish updates as we approach a full release. The structure is intentionally simple and inspired by the clarity that SWE-bench brought to software benchmarks: concise problem framing, transparent methodology, and runnable evaluation artifacts.

Why have a site?

  • Provide a single source of truth for what ClaimEval measures and why it matters.
  • Explain how the Adjudication Sandbox and payer-aligned metrics work.
  • Share minimum deliverables (CLI, harness, inference abstraction, dataset loader, deterministic scorer, leaderboard) so reviewers know what to expect.
  • Announce roadmap milestones (UC v1, ER v1, leaderboard availability) and collect interest.

How it’s structured

  • Home: hero section and calls to action (CTAs) that link into the docs.
  • Concept: public summary of scope, inputs/outputs, metrics, and governance.
  • Benchmark Architecture (coming soon): sandbox, scoring, splits.
  • Deliverables & Roadmap: launch bar and upcoming milestones.
  • Blog: updates, including why ClaimEval matters.

Stack and hosting

  • Docusaurus, deployed via Cloudflare Pages.
  • Install: npm install; build: npm run build; output directory: build/; requires Node 18 or 20.
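
The Node 18 or 20 requirement above could be enforced with a small pre-build check. The following is a minimal sketch, not something taken from the ClaimEval repository: the names `SUPPORTED_NODE_MAJORS`, `parseNodeMajor`, and `isSupportedNode` are hypothetical and exist only for illustration.

```typescript
// Hypothetical helper; not part of the ClaimEval repository.
// Major versions taken from the "Node 18 or 20" note above.
const SUPPORTED_NODE_MAJORS: readonly number[] = [18, 20];

// Extracts the major version from strings such as "v18.19.0" or "20.11.1".
function parseNodeMajor(version: string): number {
  const match = /^v?(\d+)\./.exec(version);
  if (match === null) {
    throw new Error(`Unrecognized Node version string: ${version}`);
  }
  return Number(match[1]);
}

// Returns true only when the major version is one of the supported ones.
function isSupportedNode(version: string): boolean {
  return SUPPORTED_NODE_MAJORS.indexOf(parseNodeMajor(version)) !== -1;
}
```

In a Node script, passing `process.version` to `isSupportedNode` before running the build would be one way to fail early on an unsupported runtime.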

What’s next

  • Add Benchmark Architecture and Deliverables pages.
  • Wire a simple contact CTA (mailto or form handler).
  • Publish leaderboard and harness details as they solidify.
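
For the contact CTA item above, the mailto option can be as small as a function that builds the link target. This is a sketch under assumptions: `buildMailtoHref` is a hypothetical name, and `contact@example.com` is a placeholder rather than a real ClaimEval address.

```typescript
// Hypothetical helper for a mailto-style contact CTA.
// The address is assumed to be a plain email address and is inserted as-is;
// subject and body are percent-encoded so spaces and "&" survive in the URL.
function buildMailtoHref(address: string, subject: string, body: string = ""): string {
  const params: string[] = [`subject=${encodeURIComponent(subject)}`];
  if (body.length > 0) {
    params.push(`body=${encodeURIComponent(body)}`);
  }
  return `mailto:${address}?${params.join("&")}`;
}

// Placeholder address for illustration only.
const contactHref = buildMailtoHref("contact@example.com", "ClaimEval interest");
```

A form handler would need a backend or third-party endpoint, so a mailto link is likely the simpler of the two options for a static site.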