Cybertools, database will help analyze Sinhala language - Cornell
November 1, 2011 07:53 am
Kamal de Abrew, Cornell Ph.D. ’81 and a professor at the American National College in Sri Lanka, tests a child's acquisition of the Sinhala language.
A new generation of cybertools
developed at Cornell will help researchers share and analyze rare Sri Lankan
language recordings important for studying language acquisition in children.
The Sinhala language, spoken
only on the island of Sri Lanka, is “very precious” because of the unique way
it is structured, said project leader Barbara Lust, professor of human
development in the College of Human Ecology and director of the Cornell Language
Acquisition Lab. It provides an invaluable opportunity to research which
aspects of language acquisition are universal or biologically programmed and
which are culturally determined, she said.
The electronic tools, which
were developed in collaboration with Cornell’s language acquisition program and
other U.S. and international researchers, will be particularly helpful to
researchers studying the world’s nearly 7,000 languages, Lust said. “They may
facilitate widespread archiving and research collaboration across languages and
teach a new generation of students the collaborative process of data collection
and data management, aided by these tools.”
Lust’s research team has
partnered with Cornell’s Albert R. Mann Library, where the Sinhala child
language data archive is being used to test Internet-based data sharing tools.
Using specialized software, semantic Web technologies and sound metadata
systems and standards, the library’s DataStaR project is designed to help
researchers store, share and combine their data more easily, making it
available for further analysis.
Of interest not only to
language scholars but also to psychologists, anthropologists and scholars of
South Asian studies, the Sinhala child language data include more than 150
hours of audio recordings from about 450 Sri Lankan children ages 2 to 6,
collected between 1980 and 1989. The audio samples, including both
natural speech and experimentally elicited sentences, are accompanied by
transcriptions in Sinhala and English.
“This project would never have
developed without the world-renowned program of Sinhala linguistic study
developed here at Cornell by Professor Emeritus James Gair and all the students
who were part of that project,” said Lust, also crediting Cornell
undergraduates as being critical to the archiving process, and Maria Blume,
Ph.D. ’02, now assistant professor at the University of Texas at El Paso, who led
much of the cybertool development and is now developing a similar Spanish
database.
The project was supported by
the National Science Foundation and the American Institute of Sri Lankan
Studies, the Cornell Chronicle reports.