Data mining of DNA sequences submitted by Peruvian institutions to public genetic databases [Minería de datos de secuencias de DNA enviadas a bases de datos genéticas públicas por instituciones peruanas]

Romero P.E.; Castillo-Vilcahuaman C.

dc.contributor.author	Romero P.E.	es_PE
dc.contributor.author	Castillo-Vilcahuaman C.	es_PE
dc.date.accessioned	2024-05-30T23:13:38Z
dc.date.available	2024-05-30T23:13:38Z
dc.date.issued	2021
dc.description.abstract	Genetic diversity is an important component of biodiversity, and it is crucial for current efforts to protect and sustainably manage several organisms and habitats. As far as we know, there is only one work describing Peruvian genetic information stored in public databases. We aimed to update this previous work by searching four public databases that store digital sequence information: Nucleotide, BioProject, PATRIC, and BOLD. With this information, we comment on the contribution of Peruvian institutions during recent years. In Nucleotide, the largest database, Bacteria are the organisms most sequenced by Peruvian institutions (70.60%); pathogenic bacteria such as Pasteurella multocida, Neisseria meningitidis, and Vibrio parahaemolyticus were the most abundant. We found no sequence records from the Archaea domain. In BioProject, the most common sequences belong to Salmonella enterica subsp. enterica serovar Infantis. In PATRIC, a database of pathogenic agents, Mycobacterium tuberculosis and Yersinia pestis had the highest number of entries. Finally, in BOLD, an exclusively eukaryotic database, Chordata (Aves and Actinopterygii), Angiospermae, and Arthropoda (Insecta and Arachnida) were the most frequent records. Our results suggest the research preferences of Peruvian institutions, which focus on infectious diseases and some eukaryotic phyla. Although there has been a significant increase in the DNA information submitted by Peruvian institutions since the last report, the genetic diversity reflected in these databases remains inconsistent with the diversity in the country. More efforts must be made to obtain genetic information from more underestimated taxonomic groups and to promote more genetic research in regional Peruvian institutions. © The authors.
dc.description.sponsorship	Consejo Nacional de Ciencia, Tecnología e Innovación Tecnológica - Concytec
dc.identifier.doi	https://doi.org/10.15381/RPB.V28I1.17867
dc.identifier.scopus	2-s2.0-85103044937
dc.identifier.uri	https://hdl.handle.net/20.500.12390/2399
dc.language.iso	eng
dc.publisher	Facultad de Ciencias Biologicas, Universidad Nacional Mayor de San Marcos
dc.relation.ispartof	Revista Peruana de Biologia
dc.rights	info:eu-repo/semantics/openAccess
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject	Public databases
dc.subject	Biodiversity	es_PE
dc.subject	Data mining	es_PE
dc.subject	Genetic diversity	es_PE
dc.subject	Peru	es_PE
dc.subject.ocde	http://purl.org/pe-repo/ocde/ford#3.04.03
dc.title	Data mining of DNA sequences submitted by Peruvian institutions to public genetic databases [Minería de datos de secuencias de DNA enviadas a bases de datos genéticas públicas por instituciones peruanas]
dc.type	info:eu-repo/semantics/article
dspace.entity.type	Publication

Collections

1.1 Institutional events
6.1 Scientific research projects
