A11yReady
AI-Powered Document Accessibility Platform
Platform Screenshots

Branded upload landing with drag-and-drop support, document type auto-detection, and per-department logo selection

Reviewer compares the original PDF and the accessible HTML side by side, then approves and exports as HTML+PDF, HTML only, or PDF only

Quality gate surfaces specific structural issues — missing content, light text contrast, table count mismatches, semantic timeline items — with one-click fix workflow

Reviewer types a plain-English fix instruction or clicks a suggestion; system regenerates the affected section without rerunning the whole pipeline

Searchable conversion history with per-document quality score, review status, and full audit trail across all uploads
Overview
An end-to-end platform that replaces 1–4 hours of manual Adobe Acrobat work with a 30–90 second AI pipeline. Documents flow through LangGraph orchestration: a 34-cluster triage classifier routes each PDF to its appropriate path (budget program offers, hearing agendas, election statistics, fillable forms, Zoom transcripts, and 29 more). Known clusters use deterministic Jinja2 templates for consistent output; unknown clusters fall back to LLM-based generation.

A three-agent polish gate runs before review: (1) deterministic HTML fixes, (2) cluster-specific common-sense rule packs that enforce doc-type conventions, and (3) an optional Gemini 2.5 Pro Layout Verifier that compares rendered output against PDF page images for visual fidelity. Validation runs axe-core via Playwright on every document.

Reviewers get a side-by-side compare interface with AI-powered fix suggestions, and one click regenerates a single section without rerunning the full pipeline. After 3+ approvals of similar documents, the system builds structural templates that deterministically correct heading hierarchies on future extractions, so it keeps learning across all 34 document types.

All processing stays inside the County's GCP project; no data leaves the environment. The codebase is roughly 63,000 lines across 95+ Python files.
Validation Pipeline
Triage classification
34-cluster classifier routes the document to the appropriate extraction method, template renderer, and meeting type via a centralized ClusterDispatch registry
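A minimal sketch of how a centralized dispatch registry of this kind could look. Only the ClusterDispatch name and the template-versus-LLM fallback come from the description above; the `ClusterRoute` fields, the cluster keys, the template file names, and the helper functions are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClusterRoute:
    """Where one document cluster is sent. All field values are illustrative."""
    extraction_method: str
    template: Optional[str]  # None means no deterministic template exists
    meeting_type: Optional[str] = None


# Hypothetical entries; the real registry is described as covering 34 clusters.
CLUSTER_DISPATCH = {
    "hearing_agenda": ClusterRoute("layout_parser", "hearing_agenda.html.j2", "hearing"),
    "fillable_form": ClusterRoute("layout_parser", "fillable_form.html.j2"),
    "zoom_transcript": ClusterRoute("text_only", "transcript.html.j2"),
}

# Route used when the classifier returns a cluster with no registry entry.
LLM_FALLBACK = ClusterRoute("vision_plus_text", None)


def dispatch(cluster: str) -> ClusterRoute:
    """Look up the route for a cluster, falling back to the LLM path."""
    return CLUSTER_DISPATCH.get(cluster, LLM_FALLBACK)


def generation_path(route: ClusterRoute) -> str:
    """Known clusters render from a template; everything else goes to the LLM."""
    return "template" if route.template else "llm"
```

Keeping every per-cluster decision in one table means that adding a document type is a single new entry rather than a change spread across extraction, rendering, and review code.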
Vision + content extraction
Document AI Layout Parser runs first; for complex docs, Gemini vision provides full page analysis; Claude handles content extraction merged with PyMuPDF text for accurate heading detection
HTML generation + axe-core validation
Claude generates the accessible HTML; Playwright runs axe-core against it for WCAG 2.1 AA compliance with iterative refinement on failure
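The "iterative refinement on failure" step can be sketched as a bounded validate-and-revise loop. This is an illustration under assumptions, not the platform's code: the validator and the reviser are passed in as plain callables so the loop stays independent of Playwright, axe-core, and the model client, and `max_rounds`, `toy_validate`, and `toy_refine` are hypothetical stand-ins.

```python
from typing import Callable, List, Tuple

Violation = dict  # e.g. {"id": "image-alt"}; the shape is an assumption


def refine_until_clean(
    html: str,
    validate: Callable[[str], List[Violation]],
    refine: Callable[[str, List[Violation]], str],
    max_rounds: int = 3,
) -> Tuple[str, List[Violation]]:
    """Validate, and on failure request a revision, up to max_rounds times.

    Returns the last HTML produced and any violations still outstanding.
    """
    violations = validate(html)
    rounds = 0
    while violations and rounds < max_rounds:
        html = refine(html, violations)
        violations = validate(html)
        rounds += 1
    return html, violations


def toy_validate(html: str) -> List[Violation]:
    # Stand-in for an axe-core run: flags a bare <img> tag with no alt attribute.
    return [{"id": "image-alt"}] if "<img>" in html else []


def toy_refine(html: str, violations: List[Violation]) -> str:
    # Stand-in for the model revision step: adds an empty alt attribute.
    return html.replace("<img>", '<img alt="">')
```

Capping the number of rounds keeps a document that cannot be fixed automatically from looping forever; whatever violations remain can be handed to the human reviewer.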
Three-agent polish gate
Deterministic HTML fixes, cluster-specific common-sense rule packs (e.g. meeting section ordering, form groupings), and an optional Gemini 2.5 Pro Layout Verifier that compares rendered output against source PDF page images, with a quorum merger that boosts corroborated findings
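One way a quorum merger could boost corroborated findings is sketched below. The `Finding` fields, the agent names, the grouping key, and the `boost` value are all assumptions made for illustration; the description only states that findings reported by more than one agent are boosted.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Finding:
    agent: str         # which polish agent reported it (hypothetical names)
    location: str      # e.g. a CSS selector or section id
    issue: str         # normalized issue code
    confidence: float  # 0.0 to 1.0


def merge_with_quorum(findings: Iterable[Finding], boost: float = 0.2) -> List[Finding]:
    """Merge findings that share location and issue, raising confidence
    by `boost` for each additional distinct agent that agrees."""
    groups = defaultdict(list)
    for f in findings:
        groups[(f.location, f.issue)].append(f)

    merged = []
    for (location, issue), group in groups.items():
        agents = sorted({f.agent for f in group})
        best = max(f.confidence for f in group)
        extra = boost * (len(agents) - 1)
        merged.append(Finding("+".join(agents), location, issue, min(1.0, best + extra)))

    # Highest-confidence findings first, so corroborated issues surface at the top.
    return sorted(merged, key=lambda f: f.confidence, reverse=True)
```

Counting distinct agents rather than raw reports means one agent repeating itself does not count as corroboration.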
Human review with AI assistance
Reviewer sees the original PDF and the accessible HTML side by side, gets AI-generated fix suggestions, and can regenerate individual sections without rerunning the full pipeline; approvals feed structural templates back into future extractions
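The feedback from approvals into structural templates could work roughly as follows. This sketch assumes a template is a mapping from heading text to heading level, learned by majority vote over approved outlines from one document cluster; the function names and the `Outline` shape are hypothetical. The threshold of three reflects the "3+ approvals" rule described in the overview.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Outline = List[Tuple[str, int]]  # (heading text, heading level 1-6)

MIN_APPROVALS = 3  # templates are only built after 3+ approvals


def build_heading_template(approved: List[Outline]) -> Dict[str, int]:
    """Return heading text -> majority level, or {} below the approval threshold."""
    if len(approved) < MIN_APPROVALS:
        return {}
    votes = defaultdict(Counter)
    for outline in approved:
        for text, level in outline:
            votes[text.strip().lower()][level] += 1
    return {text: counts.most_common(1)[0][0] for text, counts in votes.items()}


def apply_template(outline: Outline, template: Dict[str, int]) -> Outline:
    """Correct known headings to their learned level; leave unknown ones alone."""
    return [(text, template.get(text.strip().lower(), level)) for text, level in outline]
```

Because the correction is a plain lookup, the same input always produces the same heading hierarchy, which is what makes the learned template deterministic.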
Impact & Results
Key Features
Deliverables & Documentation
Leadership Explainer
Plain-language guide for champions to explain the system to non-technical stakeholders, including objection handling and demo scripts
Operations Security Handbook
Security architecture documentation covering data isolation, access controls, and compliance posture
Champion Quick Reference
One-page reference card with key stats, validation steps, and ready-to-use talking points
Validation Pipeline Infographic
Visual walkthrough of the polish gate and validation pipeline for stakeholder presentations
Reviewer Onboarding Guide
Step-by-step onboarding for human reviewers covering the side-by-side workflow and AI fix-suggestion conventions
Technical Turnover Binder
Complete system documentation for long-term maintenance and knowledge transfer