Comet Lab Atlas

An open-source OCR and layout-analysis toolkit covering text, tables, and reading order in many languages.

Surya is a modern OCR toolkit that performs text recognition, line and layout detection, reading-order analysis, and table recognition across a wide range of languages.

It comes from the same team behind Marker and is built to be both accurate and fast on a GPU.

Where it fits

Surya is a good fit when you want modern, multilingual OCR with strong layout and reading-order detection.

Where it doesn't fit

Like any OCR engine, it is only one stage, not a complete ingestion or RAG pipeline on its own.
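
To illustrate the "one stage" point, here is a minimal sketch of how an OCR step might sit inside a larger ingestion flow. Everything in it is hypothetical: `fake_ocr` is a stand-in that returns canned lines rather than calling Surya, and `Line`, `chunk`, and `ingest` are names invented for this example, not part of any library. The surrounding steps (ordering, chunking) are the parts an OCR engine does not provide.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Line:
    text: str
    order: int  # reading-order position assigned by the OCR stage


def fake_ocr(page_image: bytes) -> List[Line]:
    # Hypothetical stand-in for a real OCR call; it ignores the image and
    # returns lines deliberately out of order.
    return [Line("second line", 1), Line("first line", 0)]


def chunk(text: str, max_chars: int) -> List[str]:
    # Greedy word-based chunking: start a new chunk when adding the next
    # word would push the current one past max_chars.
    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > max_chars and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def ingest(
    page_image: bytes,
    ocr: Callable[[bytes], List[Line]],
    max_chars: int = 16,
) -> List[str]:
    # Stage 1: OCR (pluggable). Stage 2: restore reading order.
    lines = sorted(ocr(page_image), key=lambda line: line.order)
    # Stage 3: join and chunk for downstream indexing or retrieval.
    text = " ".join(line.text for line in lines)
    return chunk(text, max_chars)
```

Because the OCR step is passed in as a callable, a real engine could be swapped in for `fake_ocr` without touching the ordering or chunking code; embedding, indexing, and retrieval would still have to be added separately.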