Search CORE

2,670 research outputs found

A Semi-automatic Method for Efficient Detection of Stories on Social Media

Author: Roy Deb
Vosoughi Soroush
Publication venue
Publication date: 01/05/2016
Field of study

Twitter has become one of the main sources of news for many people. As real-world events and emergencies unfold, Twitter is abuzz with hundreds of thousands of stories about the events. Some of these stories are harmless, while others could potentially be life-saving or sources of malicious rumors. Thus, it is critically important to be able to efficiently track stories that spread on Twitter during these events. In this paper, we present a novel semi-automatic tool that enables users to efficiently identify and track stories about real-world events on Twitter. We ran a user study with 25 participants, demonstrating that compared to more conventional methods, our tool can increase the speed and the accuracy with which users can track stories about real-world events.Comment: ICWSM'16, May 17-20, Cologne, Germany. In Proceedings of the 10th International AAAI Conference on Weblogs and Social Media (ICWSM 2016). Cologne, German

arXiv.org e-Print Archive

DSpace@MIT

Reading the Source Code of Social Ties

Author: Aiello Luca Maria
Schifanella Rossano
State Bogdan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014
Field of study

Though online social network research has exploded during the past years, not much thought has been given to the exploration of the nature of social links. Online interactions have been interpreted as indicative of one social process or another (e.g., status exchange or trust), often with little systematic justification regarding the relation between observed data and theoretical concept. Our research aims to breach this gap in computational social science by proposing an unsupervised, parameter-free method to discover, with high accuracy, the fundamental domains of interaction occurring in social networks. By applying this method on two online datasets different by scope and type of interaction (aNobii and Flickr) we observe the spontaneous emergence of three domains of interaction representing the exchange of status, knowledge and social support. By finding significant relations between the domains of interaction and classic social network analysis issues (e.g., tie strength, dyadic interaction over time) we show how the network of interactions induced by the extracted domains can be used as a starting point for more nuanced analysis of online social data that may one day incorporate the normative grammar of social interaction. Our methods finds applications in online social media services ranging from recommendation to visual link summarization.Comment: 10 pages, 8 figures, Proceedings of the 2014 ACM conference on Web (WebSci'14

arXiv.org e-Print Archive

CiteSeerX

Crossref

Analysing Human Mobility Patterns of Hiking Activities through Complex Network Theory

Author: Eguíluz Víctor
Guerrero Carlos
Juiz Carlos
Lera Isaac
Pérez Toni
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 12/05/2017
Field of study

The exploitation of high volume of geolocalized data from social sport tracking applications of outdoor activities can be useful for natural resource planning and to understand the human mobility patterns during leisure activities. This geolocalized data represents the selection of hike activities according to subjective and objective factors such as personal goals, personal abilities, trail conditions or weather conditions. In our approach, human mobility patterns are analysed from trajectories which are generated by hikers. We propose the generation of the trail network identifying special points in the overlap of trajectories. Trail crossings and trailheads define our network and shape topological features. We analyse the trail network of Balearic Islands, as a case of study, using complex weighted network theory. The analysis is divided into the four seasons of the year to observe the impact of weather conditions on the network topology. The number of visited places does not decrease despite the large difference in the number of samples of the two seasons with larger and lower activity. It is in summer season where it is produced the most significant variation in the frequency and localization of activities from inland regions to coastal areas. Finally, we compare our model with other related studies where the network possesses a different purpose. One finding of our approach is the detection of regions with relevant importance where landscape interventions can be applied in function of the communities.Comment: 20 pages, 9 figures, accepte

arXiv.org e-Print Archive

Directory of Open Access Journals

Digital.CSIC

FigShare

Knowledge Graph semantic enhancement of input data for improving AI

Author: Bhatt Shreyansh
Shalin Valerie
Sheth Amit
Zhao Jinjin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/03/2020
Field of study

Intelligent systems designed using machine learning algorithms require a large number of labeled data. Background knowledge provides complementary, real world factual information that can augment the limited labeled data to train a machine learning algorithm. The term Knowledge Graph (KG) is in vogue as for many practical applications, it is convenient and useful to organize this background knowledge in the form of a graph. Recent academic research and implemented industrial intelligent systems have shown promising performance for machine learning algorithms that combine training data with a knowledge graph. In this article, we discuss the use of relevant KGs to enhance input data for two applications that use machine learning -- recommendation and community detection. The KG improves both accuracy and explainability

arXiv.org e-Print Archive

CORE

A Multi-label Text Classification Framework: Using Supervised and Unsupervised Feature Selection Strategy

Author: Ma Long
Publication venue: ScholarWorks @ Georgia State University
Publication date: 08/08/2017
Field of study

Text classification, the task of metadata to documents, needs a person to take significant time and effort. Since online-generated contents are explosively growing, it becomes a challenge for manually annotating with large scale and unstructured data. Recently, various state-or-art text mining methods have been applied to classification process based on the keywords extraction. However, when using these keywords as features in the classification task, it is common that the number of feature dimensions is large. In addition, how to select keywords from documents as features in the classification task is a big challenge. Especially, when using traditional machine learning algorithms in big data, the computation time is very long. On the other hand, about 80% of real data is unstructured and non-labeled in the real world. The conventional supervised feature selection methods cannot be directly used in selecting entities from massive data. Usually, statistical strategies are utilized to extract features from unlabeled data for classification tasks according to their importance scores. We propose a novel method to extract key features effectively before feeding them into the classification assignment. Another challenge in the text classification is the multi-label problem, the assignment of multiple non-exclusive labels to documents. This problem makes text classification more complicated compared with a single label classification. For the above issues, we develop a framework for extracting data and reducing data dimension to solve the multi-label problem on labeled and unlabeled datasets. In order to reduce data dimension, we develop a hybrid feature selection method that extracts meaningful features according to the importance of each feature. The Word2Vec is applied to represent each document by a feature vector for the document categorization for the big dataset. The unsupervised approach is used to extract features from real online-generated data for text classification. Our unsupervised feature selection method is applied to extract depression symptoms from social media such as Twitter. In the future, these depression symptoms will be used for depression self-screening and diagnosis

ScholarWorks @ Georgia State University