Getting Started with Handwritten Text Recognition in PyTorch (Python)

A half-day hands-on workshop providing a technical introduction to deep-learning-based handwritten text recognition (HTR).

Time and place: June 11, 2024 12:30 PM – 4:00 PM, DSC Oasen

What This Workshop is About

This workshop provides a technical introduction to deep-learning-based handwritten text recognition (HTR). We will begin with a general introduction to the topic and outline a typical HTR pipeline. Afterwards, each step of the pipeline will be discussed in detail, with hands-on exercises using PyTorch.

The examples will focus on handwritten documents, but the general pipeline and most of its steps apply directly to printed material as well. Differences and possible adaptations will be pointed out along the way.

While transformer-based HTR models are gaining popularity, this workshop will focus on the more classical but still state-of-the-art approach of Connectionist Temporal Classification (CTC). The workshop will conclude with thoughts on reproducibility and experiences from training HTR models on computing centre resources.
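
As a rough illustration of what training with Connectionist Temporal Classification (CTC) can look like in PyTorch, the sketch below computes a CTC loss on random data and decodes a prediction greedily. It is not taken from the workshop material: the tensor sizes, the random inputs standing in for a real network's output, and the helper name greedy_ctc_decode are all assumptions made for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Assumed toy dimensions: time steps, batch size, classes (index 0 = CTC blank).
T, N, C = 50, 2, 20

# Random values stand in for the per-time-step output of a recognition network.
logits = torch.randn(T, N, C, requires_grad=True)
log_probs = logits.log_softmax(dim=2)  # nn.CTCLoss expects log-probabilities

# Padded label sequences (no blanks) and the true length of each sequence.
targets = torch.randint(1, C, (N, 10), dtype=torch.long)
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.tensor([10, 7], dtype=torch.long)

ctc = nn.CTCLoss(blank=0, zero_infinity=True)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()  # gradients flow back to the (stand-in) network output


def greedy_ctc_decode(line_log_probs, blank=0):
    """Best-path decoding for one text line: pick the most likely class per
    time step, collapse repeated classes, then drop blanks.
    line_log_probs has shape (T, C)."""
    best_path = line_log_probs.argmax(dim=-1).tolist()
    decoded, previous = [], blank
    for index in best_path:
        if index != previous and index != blank:
            decoded.append(index)
        previous = index
    return decoded


# A hand-made path: 1 1 blank 1 2 2 blank collapses to the labels 1, 1, 2.
path = torch.tensor([1, 1, 0, 1, 2, 2, 0])
example = F.one_hot(path, num_classes=C).float()
decoded_example = greedy_ctc_decode(example)
```

The blank class is what lets the decoder tell a repeated character (1, blank, 1) apart from one character spread over several time steps (1, 1). Greedy decoding is only the simplest option; beam search or a language model could replace it.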

Instructions and Setup

Find instructions and detailed course information on the GitHub course website.

Learning Outcomes

Knowledge of the technical details of the recognition pipeline and of HTR in general
Hands-on practice implementing an end-to-end HTR pipeline in PyTorch

Prerequisites

Participants should already be familiar with general deep-learning concepts and should have experience with programming in Python.

Target Audience

Researchers, research support staff, research software engineers, and developers who work with handwritten documents and would like to implement their own HTR pipelines, e.g., to cater to specific pre-processing requirements.

Required Material

A laptop with a preconfigured Python environment. A detailed list of requirements will be provided a few weeks before the workshop.

Demo data will be provided, but participants may bring data from their own projects as long as it has been annotated and segmented into text lines (e.g., using Transkribus).

Register

Programme

We will take a fifteen-minute break halfway through the workshop.

Victuals

We will serve hot and cold beverages and nibbles.

About Raphaela Heil

Raphaela Heil is a software engineer at the Popular Movements' Archive Uppsala, working with automatic text recognition and document image processing. She recently defended her thesis "Document Image Processing for Handwritten Text Recognition: Deep Learning-based Transliteration of Astrid Lindgren’s Stenographic Manuscripts" (2023). Raphaela is a Carpentries and CodeRefinery certified instructor.

About BærUt!

BærUt! is a competence hub at the University of Oslo for promoting digital scholarly editions (DSEs). Our ambition is to consolidate expertise and knowledge in the field, gather researchers and practitioners, and, in the long run, create the foundation for a common platform for digital editions. We work closely with researchers, developers, and cultural institutions to digitize historical and cultural text documents and ensure these resources are accessible and useful for academic use.

Questions?

Send an email to BærUt! project leader, Annika Rockenberger.

Published Feb. 29, 2024 11:10 AM - Last modified Feb. 29, 2024 2:25 PM