Chandra | AI Native Landscape

Chandra is a high-accuracy OCR model that handles complex tables, forms, handwriting, and full layout recognition. It converts images and PDFs into structured HTML, Markdown, or JSON outputs while preserving layout information such as headers, footers, tables, checkboxes, and mathematical notation, making it suitable for the most demanding document digitization tasks.

Document Conversion Capabilities

Converts documents to Markdown, HTML, or JSON with detailed layout metadata preserved
Strong support for complex forms with checkboxes and intricate table structures
Handles mathematical notation, headers, footers, and full-page layout recognition
Preserves semantic relationships between document elements during conversion

Recognition Strengths

High-accuracy handwriting recognition for notes, exams, and archival materials
Supports 40+ languages with both local inference via HuggingFace and remote inference using a vLLM server
Robust performance on legal contracts, invoices, and forms with complex layouts
Suitable for the most demanding document digitization tasks

Deployment Options

CLI package via chandra-ocr for scripted and batch processing workflows
Interactive Streamlit demo for quick evaluation and testing
vLLM Docker image for production-grade remote inference deployments
Apache-2.0 license with commercial licensing and hosted API options available through the project website