Nougat is Meta AI's OCR model for transcribing scientific PDFs into an easy-to-use Markdown format.

inuwamobarak/nougat

Nougat: Revolutionizing OCR for Scientific Documents

About Nougat

Nougat is a Transformer-based OCR model that converts complex scientific documents, typically stored as PDFs, into machine-readable Markdown. Developed by Meta AI, Nougat aims to make scientific knowledge more accessible and usable.

Key Features

  • Transformer Architecture: Nougat uses a Swin Transformer as a vision encoder and an mBART-based text decoder, allowing for end-to-end transcription of scientific PDFs.

  • End-to-End Training: With Nougat, there's no need for complex pipelines. The model takes raw pixels as input and generates Markdown text as output, simplifying the entire OCR process.

  • Bridging the Gap: By turning human-readable PDFs into machine-readable text, Nougat makes scientific knowledge easier to access and reuse.

To get a local copy of this repository, clone it:

    git clone https://github.com/inuwamobarak/nougat.git
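
To make the PDF-to-Markdown step concrete, here is a minimal sketch of driving Nougat from Python. It rests on assumptions that are not part of this repository: that the `nougat` command-line tool from the `nougat-ocr` package is installed, and that it writes a `.mmd` file named after the input PDF into the output directory. The helper names `build_nougat_command`, `expected_output_path`, and `transcribe_pdf`, and the example file names, are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_nougat_command(pdf_path: str, out_dir: str) -> List[str]:
    """Build the argument list for `nougat <pdf> -o <out_dir>`."""
    return ["nougat", pdf_path, "-o", out_dir]


def expected_output_path(pdf_path: str, out_dir: str) -> Path:
    """Assumed output location: <out_dir>/<pdf stem>.mmd."""
    return Path(out_dir) / (Path(pdf_path).stem + ".mmd")


def transcribe_pdf(pdf_path: str, out_dir: str) -> Optional[Path]:
    """Run the CLI if it is available; return the assumed output path."""
    if shutil.which("nougat") is None:
        # The CLI is not on PATH, so there is nothing to run.
        return None
    subprocess.run(build_nougat_command(pdf_path, out_dir), check=True)
    return expected_output_path(pdf_path, out_dir)


if __name__ == "__main__":
    result = transcribe_pdf("paper.pdf", "output")
    print(result if result is not None else "nougat CLI not found")
```

Keeping the command construction separate from the subprocess call makes it possible to inspect the exact invocation without having the model installed.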