About This Site
This site is the public-facing home for ClaimEval. It exists to communicate the benchmark’s goals, explain the concept, and publish updates as we approach a full release. The structure is intentionally simple, inspired by the clarity that SWE-bench brought to software benchmarks: concise problem framing, transparent methodology, and runnable evaluation artifacts.
Why have a site?
- Provide a single source of truth for what ClaimEval measures and why it matters.
- Explain how the Adjudication Sandbox and payer-aligned metrics work.
- Share minimum deliverables (CLI, harness, inference abstraction, dataset loader, deterministic scorer, leaderboard) so reviewers know what to expect.
- Announce roadmap milestones (UC v1, ER v1, leaderboard availability) and collect interest.
How it’s structured
- Home: hero section and calls to action (CTAs) that link into the docs.
- Concept: public summary of scope, inputs/outputs, metrics, and governance.
- Benchmark Architecture (coming): sandbox, scoring, splits.
- Deliverables & Roadmap: launch bar and upcoming milestones.
- Blog: updates, including why ClaimEval matters.
Stack and hosting
- Docusaurus, deployed via Cloudflare Pages.
- Build settings: install command: npm install; build command: npm run build; output directory: build/; Node version: 18 or 20.
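For a local build, the settings above translate to roughly the following shell session. This is a minimal sketch that assumes it is run from the repository root with a standard Docusaurus package.json; only the commands, the build/ output directory, and the Node versions come from the settings listed here.

```shell
# Confirm the Node version (the site targets Node 18 or 20).
node --version

# Install dependencies, then build the static site.
npm install
npm run build

# The generated site should now be in build/.
ls build/
```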
What’s next
- Add Benchmark Architecture and Deliverables pages.
- Wire a simple contact CTA (mailto or form handler).
- Publish leaderboard and harness details as they solidify.