Databases of the major languages of the Philippines will play a big role as sources of information in developing teaching materials and computer-based classroom applications. With very few spoken corpora available for these languages, researchers from the Electrical and Electronics Engineering Institute of the University of the Philippines Diliman (EEEI-UPD) study the interface of linguistics, engineering and other sciences to enhance mother tongue-based multilingual education.
“We have developed databases of 10 spoken languages in the Philippines aside from Filipino. These include Tagalog, Cebuano, Ilokano, Hiligaynon or Ilonggo, Waray-Waray, Kapampangan, Tausug, Northern Bicolano, Pangasinense, and a code-mixed language of Filipino and English,” they share. “These databases aim to preserve the important heritage of the country and can be used for training and developing speech-based software applications to assist differently-abled speakers as well as those studying foreign languages.”
The outputs of the project include the development of the speech acquisition software, complete translation of materials in 10 major languages, protocols for recording, and documentation of data preparation and data post-processing. The collected speech corpora comprise recordings from 200 speakers each of Tagalog, Cebuano, Hiligaynon, Kapampangan, Bicolano, Waray, and Ilokano, as well as 157 speakers of Tausug and 24 speakers of Pangasinense.
The project is a component of the Interdisciplinary Signal Processing for Pinoys (ISIP) Program, funded by the Philippine Council for Industry, Energy and Emerging Technology Research and Development of the Department of Science and Technology (DOST-PCIEERD) and the University of the Philippines Diliman. For more information, please contact the Program Leader, Dr. Rhandley Cajote of the Digital Signal Processing Laboratory of the EEEI-UPD.