Web Scraping versus Twitter API: A Comparison for a Credibility Analysis

Dongo I.; Cardinale Y.; Aguilera A.; Martínez F.; Quintero Y.; Barrios S.

Date

2020

Authors

Dongo I.

Cardinale Y.

Aguilera A.

Martínez F.

Quintero Y.

Barrios S.

Publisher

Association for Computing Machinery

Abstract

Twitter is one of the most popular information sources available on the Web. Thus, many studies focus on analyzing the credibility of the information shared on it. Most proposals use either the Twitter API or web scraping to extract the data needed for such analysis. Both extraction techniques have advantages and disadvantages. In this work, we present a study to evaluate their performance and behavior. The motivation for this research comes from the need to know how online information can be extracted in order to analyze, in real time, the credibility of content posted on the Web. To do so, we develop a framework that offers both data extraction alternatives and implements a previously proposed credibility model. Our framework is implemented as a Google Chrome extension able to analyze tweets in real time. Results show that both methods produce identical credibility values when a robust normalization process is applied to the text (i.e., the tweet). Moreover, in terms of time performance, web scraping is faster than the Twitter API, and it is more flexible in terms of obtaining data; however, web scraping is very sensitive to website changes. © 2020 ACM.

Keywords

Web Scraping, API, Credibility, Twitter

URI

https://hdl.handle.net/20.500.12390/2460

Collections

1.1 Institutional events
6.1 Scientific research projects
