Publication:
Web Scraping versus Twitter API: A Comparison for a Credibility Analysis

dc.contributor.author Dongo I. es_PE
dc.contributor.author Cardinale Y. es_PE
dc.contributor.author Aguilera A. es_PE
dc.contributor.author Martínez F. es_PE
dc.contributor.author Quintero Y. es_PE
dc.contributor.author Barrios S. es_PE
dc.date.accessioned 2024-05-30T23:13:38Z
dc.date.available 2024-05-30T23:13:38Z
dc.date.issued 2020
dc.description.abstract Twitter is one of the most popular information sources available on the Web. Consequently, many studies focus on analyzing the credibility of the information shared on it. Most proposals use either the Twitter API or web scraping to extract the data for such analysis. Both extraction techniques have advantages and disadvantages. In this work, we present a study that evaluates their performance and behavior. The motivation for this research is the need to identify ways of extracting online information in order to analyze, in real time, the credibility of content posted on the Web. To do so, we develop a framework that offers both data extraction alternatives and implements a previously proposed credibility model. Our framework is implemented as a Google Chrome extension able to analyze tweets in real time. Results show that both methods produce identical credibility values when a robust normalization process is applied to the text (i.e., the tweet). Moreover, in terms of time performance, web scraping is faster than the Twitter API, and it is more flexible in terms of the data it can obtain; however, web scraping is very sensitive to website changes. © 2020 ACM.
dc.description.sponsorship Consejo Nacional de Ciencia, Tecnología e Innovación Tecnológica - Concytec
dc.identifier.doi https://doi.org/10.1145/3428757.3429104
dc.identifier.scopus 2-s2.0-85100336680
dc.identifier.uri https://hdl.handle.net/20.500.12390/2460
dc.language.iso eng
dc.publisher Association for Computing Machinery
dc.relation.ispartof ACM International Conference Proceeding Series
dc.rights info:eu-repo/semantics/openAccess
dc.subject Web Scraping
dc.subject API es_PE
dc.subject Credibility es_PE
dc.subject Twitter es_PE
dc.subject.ocde http://purl.org/pe-repo/ocde/ford#2.02.04
dc.title Web Scraping versus Twitter API: A Comparison for a Credibility Analysis
dc.type info:eu-repo/semantics/article
dspace.entity.type Publication