Document Type

Article

Publication Date

January 2009

Abstract

Procedures were developed to partially automate the captioning process by estimating caption time codes from plain-text transcripts and audio recordings. Signal analysis is performed on the audio to measure pause locations and durations and the zero crossing rate (ZCR), and to obtain frequency-domain data. Algorithms were developed to match pauses in the audio to the ends of sentences in the transcript, based on the observation that pauses are longer at the ends of sentences than within sentences. We have observed that ZCR peaks correspond to consonants in speech and that continuous wavelet transforms (CWT) work well for distinguishing between groups of consonants. These measurements will be used to develop algorithms that match selected phonemes in the audio to text in the transcript, supplementing the pause-matching results.
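
The pause and ZCR measurements described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how pauses are detected or matched, so the short-time energy threshold, the 20 ms frame size, the 0.2 s minimum pause, the "longest pauses are sentence ends" rule, and all function names below are assumptions chosen for illustration.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def find_pauses(samples, sample_rate, frame_ms=20,
                energy_threshold=1e-4, min_pause_s=0.2):
    """Return (start_s, duration_s) for each run of low-energy frames.

    Assumption: a frame is "quiet" when its mean squared amplitude falls
    below energy_threshold; the paper may use a different criterion.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    n_frames = len(samples) // frame_len
    pauses, run_start = [], None
    for i in range(n_frames + 1):
        if i < n_frames:
            frame = samples[i * frame_len:(i + 1) * frame_len]
            quiet = sum(x * x for x in frame) / frame_len < energy_threshold
        else:
            quiet = False  # sentinel frame closes a pause at the end of the audio
        if quiet and run_start is None:
            run_start = i
        elif not quiet and run_start is not None:
            duration = (i - run_start) * frame_len / sample_rate
            if duration >= min_pause_s:
                pauses.append((run_start * frame_len / sample_rate, duration))
            run_start = None
    return pauses


def sentence_boundaries(pauses, n_sentences):
    """Treat the (n_sentences - 1) longest pauses as sentence ends.

    Returns the time at which each of those pauses ends, in time order,
    as a candidate start time code for the following sentence.
    """
    longest = sorted(pauses, key=lambda p: p[1], reverse=True)
    longest = longest[:max(0, n_sentences - 1)]
    return sorted(start + duration for start, duration in longest)


def tone(n, rate=1000, freq=100.0, amp=0.5):
    """Synthetic stand-in for speech: a sine tone of n samples."""
    return [amp * math.sin(2 * math.pi * freq * k / rate) for k in range(n)]


# Synthetic audio at 1 kHz: three "sentences", the second containing
# a short within-sentence pause (0.24 s) between two tone segments.
rate = 1000
signal = (tone(500) + [0.0] * 400 + tone(500) + [0.0] * 240
          + tone(500) + [0.0] * 600 + tone(500))
pauses = find_pauses(signal, rate)
boundaries = sentence_boundaries(pauses, n_sentences=3)
```

Here the transcript is assumed to contain three sentences, so the two longest of the three detected pauses are taken as sentence ends and the shorter one is treated as a within-sentence pause. Applying `zero_crossing_rate` to each frame would give the per-frame ZCR values in which the abstract reports peaks for consonants.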

Comments

This paper from the 2009 PReMI conference is available in final published form in the book Pattern Recognition and Machine Intelligence (2009, ISBN 978-3-642-11164-8).

Recommended Citation

Harvey, Daniel P. II and Liu, Peter Ping, "Algorithms to Automate Estimation of Time Codes for Captioning Digital Media" (2009). Faculty Research & Creative Activity. 12.
https://thekeep.eiu.edu/tech_fac/12

Included in

Technology and Innovation Commons
