With the amount of data available on social networks, new methodologies for the analysis of information are needed. Some methods allow users to combine different types of data in order to extract relevant information. In this context, this paper applies a model, via a platform, to group information generated by Twitter users, thus facilitating the detection of trends and data related to particular pathologies. To implement the model, an analysis tool based on the Levenshtein distance was developed to determine what edits are required to convert a text into each of the following terms: "gripa" (flu), "dolor de cabeza" (headache), "dolor de estomago" (stomachache), "fiebre" (fever) and "tos" (cough), for tweets in the area of Bogotá.
Table of Contents
- Abstract
- Introduction
- Obtaining the information
- Applying the Levenshtein Distance
- Experimentation
- Conclusions
- References
Objectives and Key Themes
This paper explores the potential of Twitter data for analyzing the behavior of specific pathologies in a given geographic location. By combining data mining techniques with a custom Levenshtein distance analyzer, the authors aim to demonstrate the feasibility of using social media data to identify trends and patterns related to common illnesses.
- Social media data analysis for health research
- Application of the Levenshtein distance for text analysis
- Identifying patterns and trends in Twitter data related to specific pathologies
- Exploring the relationship between sentiment analysis and pathology prevalence
- Potential for utilizing artificial intelligence and data mining for predicting pathology behavior
Chapter Summaries
- Abstract: The paper introduces the concept of using social media data, specifically Twitter, to analyze and understand the behavior of pathologies. The authors describe their methodology, which involves applying a model and a custom Levenshtein distance analyzer to group together Twitter information related to specific illnesses. The study focuses on the "gripa" (flu) pathology in Bogotá, Colombia.
- Introduction: This section emphasizes the importance of social media data in understanding public opinion and its implications for various fields, including psychology and economics. The paper introduces the concepts of web mining and gatekeeping functions, which are key elements in analyzing social network data. The authors highlight the need for a dedicated tool that computes the Levenshtein distance and relates its results to sentiment analysis.
- Obtaining the information: This chapter explains the process of collecting Twitter data using a Python script and the Twitter API. The authors detail how they link tweets to a specific city code using GeoPlanet and provide code examples for retrieving tweets related to the pathology "gripa" in Bogotá.
- Applying the Levenshtein Distance: This section focuses on the use of the Levenshtein distance algorithm to analyze the collected Twitter data. The authors explain the algorithm's principles and its application in identifying patterns related to specific pathologies. They provide a snapshot of the Twitter corpus used in their analysis and illustrate the process of cleaning and analyzing the data.
- Experimentation: This chapter describes the experiments conducted using the collected data. The authors applied techniques of clustering, relationship analysis, and sentiment analysis to understand the behavior of the data related to the "gripa" pathology. They present visual representations of their findings, including graphs and diagrams, demonstrating the patterns and relationships identified.
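The Levenshtein distance named in the summaries above counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The following is a minimal Python sketch of that idea, not the authors' analyzer: the function names `levenshtein` and `closest_keyword`, the `KEYWORDS` list as a Python constant, and the `max_distance=1` threshold are all assumptions made for illustration. The five terms themselves come from the abstract.

```python
from typing import Optional, Sequence

# Symptom terms listed in the abstract (Spanish, as tweeted in Bogotá).
KEYWORDS = ["gripa", "dolor de cabeza", "dolor de estomago", "fiebre", "tos"]


def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, or substitutions turning a into b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def closest_keyword(text: str,
                    keywords: Sequence[str] = KEYWORDS,
                    max_distance: int = 1) -> Optional[str]:
    """Return the nearest keyword, or None if it is farther than max_distance.

    The threshold is a hypothetical choice; the paper's value is not stated here.
    """
    text = text.lower()
    best = min(keywords, key=lambda k: levenshtein(text, k))
    return best if levenshtein(text, best) <= max_distance else None
```

With this sketch, a misspelled token such as "grippa" is one insertion away from "gripa" and would be grouped with it, while an unrelated word falls outside the threshold and is ignored. A small threshold keeps short terms like "tos" from matching many unrelated three-letter words.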
Keywords
The main keywords and focus topics of this paper include social media analysis, Twitter data, pathology behavior, Levenshtein distance, sentiment analysis, data mining, artificial intelligence, web mining, gatekeeping functions, and public opinion.
- Cite this paper
- Dennis Salcedo (Author), Alejandro León (Author), 2015, Behavior of Users Talking about Pathologies and Diseases on Twitter, Munich, GRIN Verlag, https://www.hausarbeiten.de/document/302920