Syllabification by Categorization

Session Type

Traditional Paper Presentation

Research Project Abstract

Dividing words into syllables is a difficult problem, and one well suited to machine learning. Syllables play an important role in speech recognition, spoken document retrieval, and speech synthesis, which creates a need to automate the process. Our goal is a low-cost, novel, language-agnostic system for syllabifying words. The approach uses a hybrid genetic algorithm to create fine-tuned categories of phones. A sequence classifier, a hidden Markov model, then uses these categories to find the syllable boundaries. The technique shows promising preliminary results when trained on English words, reaching a high of about 93% in the most recent runs. We discuss the approach we have taken and the research that remains to be done.

Session Number

RS3

Location

Weyerhaeuser 205

Abstract Number

RS3-b

Apr 28th, 9:15 AM to 10:45 AM

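The second stage named in the abstract, a hidden Markov model that reads phone categories and marks syllable boundaries, can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the authors' system: the two-category map `CATEGORY` (the abstract's categories come from a hybrid genetic algorithm, which this sketch does not implement), the two hidden states `B` and `I`, the count-based training with add-one smoothing, the use of letters as stand-ins for phones, and the toy training words.

```python
import math

# Hypothetical phone-to-category map (assumption). In the abstract these
# categories are tuned by a hybrid genetic algorithm; here they are fixed.
CATEGORY = {p: "C" for p in "btnmpkd"}
CATEGORY.update({p: "V" for p in "aeiou"})

STATES = ("B", "I")  # B: phone begins a syllable, I: phone continues one


def categorize(phones):
    """Replace each phone with its category symbol."""
    return [CATEGORY[p] for p in phones]


def train(examples):
    """Estimate HMM log-probabilities by counting, with add-one smoothing.

    examples: list of (phones, tags) pairs, tags drawn from STATES.
    """
    symbols = sorted(set(CATEGORY.values()))
    start = {s: 1.0 for s in STATES}
    trans = {s: {t: 1.0 for t in STATES} for s in STATES}
    emit = {s: {o: 1.0 for o in symbols} for s in STATES}
    for phones, tags in examples:
        obs = categorize(phones)
        start[tags[0]] += 1
        for i, (o, t) in enumerate(zip(obs, tags)):
            emit[t][o] += 1
            if i > 0:
                trans[tags[i - 1]][t] += 1

    def normalize(row):
        total = sum(row.values())
        return {k: math.log(v / total) for k, v in row.items()}

    return (
        normalize(start),
        {s: normalize(trans[s]) for s in STATES},
        {s: normalize(emit[s]) for s in STATES},
    )


def viterbi(obs, model):
    """Return the most probable state sequence for a non-empty observation list."""
    start, trans, emit = model
    score = [{s: start[s] + emit[s][obs[0]] for s in STATES}]
    back = []
    for o in obs[1:]:
        prev = score[-1]
        row, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + trans[p][s])
            row[s] = prev[best] + trans[best][s] + emit[s][o]
            ptr[s] = best
        score.append(row)
        back.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


def syllabify(phones, model):
    """Group phones into syllables using the decoded B/I tags."""
    tags = viterbi(categorize(phones), model)
    syllables = []
    for p, t in zip(phones, tags):
        if t == "B" or not syllables:
            syllables.append([p])
        else:
            syllables[-1].append(p)
    return syllables


# Toy training data (assumption): letters stand in for phones.
model = train([
    (list("banana"), list("BIBIBI")),
    (list("tomato"), list("BIBIBI")),
])
print(syllabify(list("panama"), model))  # [['p', 'a'], ['n', 'a'], ['m', 'a']]
```

With only two categories and two states the model can learn little beyond consonant-vowel alternation. The abstract's emphasis on fine-tuned categories suggests that the choice of categories is where most of the modeling effort goes, and this sketch leaves that part out.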