A11yReady

AI-Powered Document Accessibility Platform

LangGraph, Gemini 2.5, Claude Sonnet 4.5, Vertex AI, Document AI, Playwright, axe-core, Python 3.11, Flask, Google Cloud Run
WCAG Compliance: 95%+
Document Clusters: 34
Time Saved: 95%

Platform Screenshots

multnomah-county-accessibility.app
A11yReady: Upload Interface

Branded upload landing with drag-and-drop support, document type auto-detection, and per-department logo selection

Side-by-Side Human Review with Approve & Download

Reviewer compares the original PDF and the accessible HTML side by side, then approves and exports as HTML+PDF, HTML only, or PDF only

AI-Powered Suggested Fixes

The quality gate surfaces specific structural issues (missing content, low text contrast, table count mismatches, semantic timeline items) with a one-click fix workflow

Reviewer Fix Workflow

The reviewer types a plain-English fix instruction or clicks a suggestion; the system regenerates the affected section without rerunning the whole pipeline

Conversion History

Searchable conversion history with per-document quality score, review status, and full audit trail across all uploads

Overview

A11yReady is an end-to-end platform that replaces 1–4 hours of manual Adobe Acrobat work per document with a 30–90 second AI pipeline. Documents flow through LangGraph orchestration: a 34-cluster triage classifier routes each PDF to its appropriate path (budget program offers, hearing agendas, election statistics, fillable forms, Zoom transcripts, and 29 more). Known clusters use deterministic Jinja2 templates for consistent output; unknown clusters fall back to LLM-based generation.

A three-agent polish gate runs before review: (1) deterministic HTML fixes, (2) cluster-specific common-sense rule packs that enforce doc-type conventions, and (3) an optional Gemini 2.5 Pro Layout Verifier that compares rendered output against PDF page images for visual fidelity. Validation runs axe-core via Playwright on every document.

Reviewers get a side-by-side compare interface with AI-powered fix suggestions and can regenerate a section in one click without rerunning the full pipeline. After 3+ approvals of similar documents, the system builds structural templates that deterministically correct heading hierarchies on future extractions, so learning continues across all 34 document types.

All processing stays inside the County's GCP project; no data leaves the environment. The codebase is ~63,000 lines across 95+ Python files.
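
The routing between deterministic templates and LLM fallback can be pictured with a small registry lookup. The following is a minimal sketch, not the platform's actual code: the `ClusterDispatch` field names, the two cluster keys, the extractor names, the template filenames, and the `route` helper are all hypothetical, and the rendering and generation steps are passed in as plain callables so the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ClusterDispatch:
    """Hypothetical registry entry; field names are illustrative only."""
    cluster: str
    extractor: str            # name of the extraction method to use
    template: Optional[str]   # Jinja2 template file, or None for LLM fallback


# Two made-up entries; the real registry is described as covering 34 clusters.
REGISTRY: dict[str, ClusterDispatch] = {
    "hearing_agenda": ClusterDispatch("hearing_agenda", "layout_parser", "agenda.html.j2"),
    "fillable_form": ClusterDispatch("fillable_form", "form_fields", "form.html.j2"),
}


def route(
    cluster: str,
    render_template: Callable[[str, dict], str],
    generate_with_llm: Callable[[dict], str],
    content: dict,
) -> tuple[str, str]:
    """Return (path_taken, html) for a classified document."""
    entry = REGISTRY.get(cluster)
    if entry is not None and entry.template is not None:
        # Known cluster: deterministic template rendering.
        return "template", render_template(entry.template, content)
    # Unknown cluster (or no template registered): fall back to LLM generation.
    return "llm", generate_with_llm(content)
```

Under this layout, supporting a new document type amounts to adding one entry to the registry; the routing function itself does not change.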

Validation Pipeline

1. Triage classification

34-cluster classifier routes the document to the appropriate extraction method, template renderer, and meeting type via a centralized ClusterDispatch registry

2. Vision + content extraction

Document AI Layout Parser runs first; for complex docs, Gemini vision provides full page analysis; Claude handles content extraction merged with PyMuPDF text for accurate heading detection

3. HTML generation + axe-core validation

Claude generates the accessible HTML; Playwright runs axe-core against it for WCAG 2.1 AA compliance with iterative refinement on failure

4. Three-agent polish gate

Deterministic HTML fixes, cluster-specific common-sense rule packs (e.g. meeting section ordering, form groupings), and an optional Gemini 2.5 Pro Layout Verifier that compares rendered output against source PDF page images; a quorum merger boosts findings that more than one agent corroborates

5. Human review with AI assistance

The reviewer sees the original PDF and the accessible HTML side by side, gets AI-generated fix suggestions, and can regenerate individual sections without rerunning the full pipeline; approvals feed structural templates back into future extractions
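
The quorum merger in the polish gate can be sketched as a grouping step: findings from the three agents that point at the same issue in the same place are merged, and the merged score rises with the number of agents that agree. This is an illustrative sketch under stated assumptions. The `Finding` fields, the agent names, the `boost` value of 0.2, and the scoring rule are hypothetical and are not taken from the platform.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    agent: str         # which polish agent reported it (hypothetical names)
    issue: str         # normalized issue key, e.g. "heading-order"
    location: str      # section or element identifier
    confidence: float  # the agent's own score in [0, 1]


def merge_findings(findings: list[Finding], boost: float = 0.2) -> list[dict]:
    """Group findings by (issue, location) and raise the score when agents agree."""
    groups: dict[tuple[str, str], list[Finding]] = defaultdict(list)
    for f in findings:
        groups[(f.issue, f.location)].append(f)

    merged = []
    for (issue, location), group in groups.items():
        agents = sorted({f.agent for f in group})
        base = max(f.confidence for f in group)
        # Assumed rule: each additional corroborating agent adds `boost`, capped at 1.0.
        score = min(1.0, base + boost * (len(agents) - 1))
        merged.append(
            {"issue": issue, "location": location, "agents": agents, "score": round(score, 2)}
        )
    # Highest-scoring (most corroborated) findings first.
    return sorted(merged, key=lambda m: m["score"], reverse=True)
```

With this rule, a heading-order problem reported by both the rule pack and the layout verifier outranks a single-agent finding of similar confidence, which is one plausible way to decide what the reviewer sees first.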

Impact & Results

Processing Time: 30–90s per PDF vs. 1–4 hours of manual Acrobat work
Cost per Document: $0.50–$2 for AI processing; Excel exports at near-zero
First-Pass WCAG: 95%+ WCAG 2.1 AA compliance per axe-core, before human review
Document Clusters: 34 classified types with deterministic templates
Time Reduction: 95%, from 1–4 hours to 30–90 seconds per document
Data Exposure: zero; all processing within the county Google Cloud project
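
A first-pass compliance figure depends on what counts as a pass. The sketch below shows one possible definition, assuming a document passes when its axe-core result contains no violations tagged for WCAG 2.0/2.1 level A or AA. The `violations`, `id`, and `tags` keys follow the axe-core results object; the helper names and the pass rule are assumptions for illustration, not the platform's documented method.

```python
# axe-core tags rules with the WCAG level they test; these four cover 2.1 AA.
WCAG_21_AA_TAGS = {"wcag2a", "wcag2aa", "wcag21a", "wcag21aa"}


def blocking_violations(axe_results: dict) -> list[str]:
    """Return rule ids of violations tagged as WCAG 2.x level A or AA."""
    return [
        v["id"]
        for v in axe_results.get("violations", [])
        if WCAG_21_AA_TAGS & set(v.get("tags", []))
    ]


def first_pass_rate(results_per_document: list[dict]) -> float:
    """Share of documents with no blocking violations on the first run."""
    if not results_per_document:
        return 0.0
    passed = sum(1 for r in results_per_document if not blocking_violations(r))
    return passed / len(results_per_document)
```

Under this definition, best-practice findings that carry no WCAG tag do not count against the first-pass rate.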

Key Features

30–90 second PDF conversion vs. 1–4 hours of manual Acrobat remediation
34-cluster triage classifier with centralized dispatch registry: adding a new doc type takes one ClusterDispatch entry
Three-agent polish gate: deterministic HTML fixes + cluster rule packs + Gemini 2.5 Pro Layout Verifier with quorum merger
axe-core via Playwright running real automated WCAG 2.1 AA checks on every document
Side-by-side reviewer interface with AI-powered fix suggestions and one-click section regeneration
Continuous learning — 3+ reviewer approvals build structural templates that auto-correct heading hierarchies on future docs
Excel program-offer auto-approval — Questica 3-tab and multi-sheet exports converted at 100% accuracy in seconds
Regeneration mode for image-heavy slides — rebuilds the document semantically rather than tracing source layout
Google Drive Picker integration — single-file or whole-folder upload directly from Drive
No data leaves the county cloud: all processing stays inside the Multnomah County GCP project
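
The continuous-learning feature in the list above (3+ approvals building a structural template) can be sketched as a majority vote over approved heading outlines. This is a minimal sketch under stated assumptions: representing an outline as a mapping from heading text to level, matching headings by exact text, and the function names are all hypothetical; only the threshold of three approvals comes from the description.

```python
from collections import Counter, defaultdict
from typing import Optional


def build_template(
    approved_outlines: list[dict[str, int]], min_approvals: int = 3
) -> Optional[dict[str, int]]:
    """Map heading text to its most common approved level, once enough approvals exist."""
    if len(approved_outlines) < min_approvals:
        return None  # not enough reviewer approvals yet
    votes: dict[str, Counter] = defaultdict(Counter)
    for outline in approved_outlines:
        for text, level in outline.items():
            votes[text][level] += 1
    return {text: counts.most_common(1)[0][0] for text, counts in votes.items()}


def apply_template(
    headings: list[tuple[str, int]], template: dict[str, int]
) -> list[tuple[str, int]]:
    """Overwrite extracted levels for headings the template knows; keep the rest."""
    return [(text, template.get(text, level)) for text, level in headings]
```

Because the correction is a dictionary lookup and does not involve a model call, the same extracted headings always produce the same corrected hierarchy.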

Deliverables & Documentation

Leadership Explainer

Plain-language guide for champions to explain the system to non-technical stakeholders, including objection handling and demo scripts

Operations Security Handbook

Security architecture documentation covering data isolation, access controls, and compliance posture

Champion Quick Reference

One-page reference card with key stats, validation steps, and ready-to-use talking points

Validation Pipeline Infographic

Visual walkthrough of the polish gate and validation pipeline for stakeholder presentations

Reviewer Onboarding Guide

Step-by-step onboarding for human reviewers covering the side-by-side workflow and AI fix-suggestion conventions

Technical Turnover Binder

Complete system documentation for long-term maintenance and knowledge transfer