Digital Humanities Workshop: Introduction to Optical Character Recognition (OCR)

Event Status
Scheduled

This workshop introduces the basics of optical character recognition (OCR), which converts a digitized document into machine-readable text that supports full-text searching and other kinds of text manipulation. Attendees will learn how to use Google Docs to create basic machine-readable text from an image file and will be introduced to Tesseract for OCR through exercises in Google Colab. This workshop is open to researchers interested in OCR for any language. Attendees are strongly encouraged to 1) prepare a digitized, highly legible sample image file for trying out the tools, and 2) have a Google account so they can complete the exercises and save their work.

Presenters: Dale J. Correa, Mercedes Morris, Natalya Stanke

This workshop will be a hybrid event.

PCL Scholars Lab Data Lab or Zoom Registration: https://utexas.zoom.us/meeting/register/tJMkc-mrrDksGtGE9Nk6BaSPaYMQV7gYjuoZ#/registration

Location: Scholars Lab, Perry-Castañeda Library, Virtual

Date and Time
March 22, 2024, noon to 1 p.m.
Location
Perry-Castañeda Library