A Low-Resourced Peruvian Language Identification Model

Linares A.E.; Oncevay-Marcos A.

dc.contributor.author	Linares A.E.	es_PE
dc.contributor.author	Oncevay-Marcos A.	es_PE
dc.date.accessioned	2024-05-30T23:13:38Z
dc.date.available	2024-05-30T23:13:38Z
dc.date.issued	2017
dc.description.abstract	Due to the linguistic revitalization in Peru over recent years, there is growing interest in reinforcing bilingual education in the country and in increasing research focused on its native languages. From a computer science perspective, one of the first steps to support the study of these languages is the implementation of an automatic language identification tool using machine learning methods. Therefore, this work focuses on two steps: (1) building a digital, annotated corpus for 16 Peruvian native languages extracted from documents in web repositories, and (2) fitting a supervised learning model for the language identification task using features identified in related state-of-the-art studies, such as n-grams. The results obtained were promising (97% average precision), and the corpus and the model are expected to be used for more complex tasks in the future.
dc.description.sponsorship	Consejo Nacional de Ciencia, Tecnología e Innovación Tecnológica - Concytec
dc.identifier.scopus	2-s2.0-85040614941
dc.identifier.uri	https://hdl.handle.net/20.500.12390/488
dc.language.iso	eng
dc.publisher	CEUR-WS
dc.relation.ispartof	CEUR Workshop Proceedings
dc.rights	info:eu-repo/semantics/openAccess
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject	Learning systems
dc.subject	Big data	es_PE
dc.subject	Education	es_PE
dc.subject	Information management	es_PE
dc.subject	Automatic language identification	es_PE
dc.subject	Bilingual education	es_PE
dc.subject	Complex task	es_PE
dc.subject.ocde	https://purl.org/pe-repo/ocde/ford#6.02.00
dc.title	A Low-Resourced Peruvian Language Identification Model
dc.type	info:eu-repo/semantics/conferenceObject
dspace.entity.type	Publication

Collections

1.1 Institutional events
6.1 Scientific research projects
