Syllabification by Categorization
Faculty Sponsor
Paul De Palma depalma@gonzaga.edu
Session Type
Traditional Paper Presentation
Research Project Abstract
Dividing words into syllables is a difficult problem, and one well suited to machine learning. Syllables play an important role in speech recognition, spoken document retrieval, and speech synthesis, which creates a need to automate the process. Our goal is a low-cost, novel, language-agnostic system for syllabifying words. The approach uses a hybrid genetic algorithm to create fine-tuned categories of phones. These categories are then fed to a sequence classifier, a hidden Markov model, which finds the syllable boundaries. The technique shows promising preliminary results when trained on English words, reaching a high of about 93% in the most recent runs. We discuss the approach we have taken and the research that remains to be done.
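The second stage described in the abstract, a hidden Markov model that reads phone categories and marks syllable boundaries, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the project: the ARPAbet-style phone symbols, the two hand-written categories (`V` and `C`) standing in for the categories the genetic algorithm would tune, the `B`/`I` tagging scheme, the three made-up training words, and all function names.

```python
import math

# Hypothetical hand-written phone categories. In the approach described
# above, a genetic algorithm would search for this mapping instead.
CATEGORY = {
    "AA": "V", "AH": "V", "IH": "V", "IY": "V", "OW": "V", "ER": "V",
    "B": "C", "K": "C", "L": "C", "M": "C", "N": "C", "P": "C",
    "S": "C", "T": "C",
}
STATES = ("B", "I")  # B = phone begins a syllable, I = phone continues one

# Made-up toy training data: (phones, boundary tags) pairs.
TRAINING = [
    (["B", "AH", "T", "ER"], ["B", "I", "B", "I"]),
    (["M", "IH", "N", "IY"], ["B", "I", "B", "I"]),
    (["S", "OW", "L", "OW"], ["B", "I", "B", "I"]),
]


def _log_normalize(counts):
    total = sum(counts.values())
    return {key: math.log(value / total) for key, value in counts.items()}


def train(tagged_words):
    """Estimate HMM log-probabilities by counting, with add-one smoothing."""
    categories = sorted(set(CATEGORY.values()))
    start = {s: 1.0 for s in STATES}
    trans = {s: {t: 1.0 for t in STATES} for s in STATES}
    emit = {s: {c: 1.0 for c in categories} for s in STATES}
    for phones, tags in tagged_words:
        start[tags[0]] += 1
        for i, (phone, tag) in enumerate(zip(phones, tags)):
            emit[tag][CATEGORY[phone]] += 1
            if i > 0:
                trans[tags[i - 1]][tag] += 1
    return (
        _log_normalize(start),
        {s: _log_normalize(trans[s]) for s in STATES},
        {s: _log_normalize(emit[s]) for s in STATES},
    )


def viterbi(phones, model):
    """Return the most likely B/I tag sequence for a list of phones."""
    start, trans, emit = model
    observations = [CATEGORY[p] for p in phones]
    score = {s: start[s] + emit[s][observations[0]] for s in STATES}
    backpointers = []
    for obs in observations[1:]:
        previous, score, pointers = score, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda r: previous[r] + trans[r][s])
            pointers[s] = best
            score[s] = previous[best] + trans[best][s] + emit[s][obs]
        backpointers.append(pointers)
    state = max(STATES, key=lambda s: score[s])
    tags = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        tags.append(state)
    return tags[::-1]


def syllabify(phones, model):
    """Group phones into syllables according to the decoded tags."""
    syllables = []
    for phone, tag in zip(phones, viterbi(phones, model)):
        if tag == "B" or not syllables:
            syllables.append([phone])
        else:
            syllables[-1].append(phone)
    return syllables


model = train(TRAINING)
print(syllabify(["P", "AA", "K", "IY"], model))
```

With only two categories the model can do little more than split before each consonant that follows a vowel, which suggests why finer, tuned categories might matter. The sketch makes no claim about how the project's actual categories or model are structured.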
Session Number
RS3
Location
Weyerhaeuser 205
Abstract Number
RS3-b