304 research outputs found

    A Bi-Directional GRU Architecture for the Self-Attention Mechanism: An Adaptable, Multi-Layered Approach with Blend of Word Embedding

    Get PDF
    Sentiment analysis (SA) has become an essential component of natural language processing (NLP) with numerous practical applications to understanding “what other people think”. Various techniques have been developed to tackle SA using deep learning (DL); however, current research lacks comprehensive strategies incorporating multiple-word embeddings. This study proposes a self-attention mechanism that leverages DL and involves the contextual integration of word embedding with a time-dispersed bidirectional gated recurrent unit (Bi-GRU). This work employs word embedding approaches GloVe, word2vec, and fastText to achieve better predictive capabilities. By integrating these techniques, the study aims to improve the classifier’s capability to precisely analyze and categorize sentiments in textual data from the domain of movies. The investigation seeks to enhance the classifier’s performance in NLP tasks by addressing the challenges of underfitting and overfitting in DL. To evaluate the model’s effectiveness, an openly available IMDb dataset was utilized, achieving a remarkable testing accuracy of 99.70%

    INVESTIGATING CRIME-TO-TWITTER RELATIONSHIPS IN URBAN ENVIRONMENTS - FACILITATING A VIRTUAL NEIGHBORHOOD WATCH

    Get PDF
    Social networks offer vast potential for marketing agencies, as members freely provide private information, for instance on their current situation, opinions, tastes, and feelings. The use of social networks to feed into crime platforms has been acknowledged to build a kind of a virtual neighborhood watch. Current attempts that tried to automatically connect news from social networks with crime platforms have concentrated on documentation of past events, but neglected the opportunity to use Twitter data as a decision support system to detect future crimes. In this work, we attempt to unleash the wisdom of crowds materialized in tweets from Twitter. This requires to look at Tweets that have been sent within a vicinity of each other. Based on the aggregated Tweets traffic we correlate them with crime types. Apparently, crimes such as disturbing the peace or homicide exhibit different Tweet patterns before the crime has been committed. We show that these tweet patterns can strengthen the explanation of criminal activity in urban areas. On top of that, we go beyond pure explanatory approaches and use predictive analytics to provide evidence that Twitter data can improve the prediction of crimes

    Accurate and budget-efficient text, image, and video analysis systems powered by the crowd

    Full text link
    Crowdsourcing systems empower individuals and companies to outsource labor-intensive tasks that cannot currently be solved by automated methods and are expensive to tackle by domain experts. Crowdsourcing platforms are traditionally used to provide training labels for supervised machine learning algorithms. Crowdsourced tasks are distributed among internet workers who typically have a range of skills and knowledge, differing previous exposure to the task at hand, and biases that may influence their work. This inhomogeneity of the workforce makes the design of accurate and efficient crowdsourcing systems challenging. This dissertation presents solutions to improve existing crowdsourcing systems in terms of accuracy and efficiency. It explores crowdsourcing tasks in two application areas, political discourse and annotation of biomedical and everyday images. The first part of the dissertation investigates how workers' behavioral factors and their unfamiliarity with data can be leveraged by crowdsourcing systems to control quality. Through studies that involve familiar and unfamiliar image content, the thesis demonstrates the benefit of explicitly accounting for a worker's familiarity with the data when designing annotation systems powered by the crowd. The thesis next presents Crowd-O-Meter, a system that automatically predicts the vulnerability of crowd workers to believe \enquote{fake news} in text and video. The second part of the dissertation explores the reversed relationship between machine learning and crowdsourcing by incorporating machine learning techniques for quality control of crowdsourced end products. In particular, it investigates if machine learning can be used to improve the quality of crowdsourced results and also consider budget constraints. The thesis proposes an image analysis system called ICORD that utilizes behavioral cues of the crowd worker, augmented by automated evaluation of image features, to infer the quality of a worker-drawn outline of a cell in a microscope image dynamically. ICORD determines the need to seek additional annotations from other workers in a budget-efficient manner. Next, the thesis proposes a budget-efficient machine learning system that uses fewer workers to analyze easy-to-label data and more workers for data that require extra scrutiny. The system learns a mapping from data features to number of allocated crowd workers for two case studies, sentiment analysis of twitter messages and segmentation of biomedical images. Finally, the thesis uncovers the potential for design of hybrid crowd-algorithm methods by describing an interactive system for cell tracking in time-lapse microscopy videos, based on a prediction model that determines when automated cell tracking algorithms fail and human interaction is needed to ensure accurate tracking

    Data analytics 2016: proceedings of the fifth international conference on data analytics

    Get PDF

    Sentiment Analysis for Social Media

    Get PDF
    Sentiment analysis is a branch of natural language processing concerned with the study of the intensity of the emotions expressed in a piece of text. The automated analysis of the multitude of messages delivered through social media is one of the hottest research fields, both in academy and in industry, due to its extremely high potential applicability in many different domains. This Special Issue describes both technological contributions to the field, mostly based on deep learning techniques, and specific applications in areas like health insurance, gender classification, recommender systems, and cyber aggression detection

    A Data-driven Neural Network Architecture for Sentiment Analysis

    Get PDF
    The fabulous results of convolution neural networks in image-related tasks attracted attention of text mining, sentiment analysis and other text analysis researchers. It is, however, difficult to find enough data for feeding such networks, optimize their parameters, and make the right design choices when constructing network architectures. The purpose of this paper is to present the creation steps of two big data sets of song emotions. The authors also explore usage of convolution and max-pooling neural layers on song lyrics, product and movie review text data sets. Three variants of a simple and flexible neural network architecture are also compared. The intention was to spot any important patterns that can serve as guidelines for parameter optimization of similar models. The authors also wanted to identify architecture design choices which lead to high performing sentiment analysis models. To this end, the authors conducted a series of experiments with neural architectures of various configurations. The results indicate that parallel convolutions of filter lengths up to 3 are usually enough for capturing relevant text features. Also, max-pooling region size should be adapted to the length of text documents for producing the best feature maps. Top results the authors got are obtained with feature maps of lengths 6–18. An improvement on future neural network models for sentiment analysis could be generating sentiment polarity prediction of documents using aggregation of predictions on smaller excerpt of the entire text

    Development of Context-Aware Recommenders of Sequences of Touristic Activities

    Get PDF
    En els últims anys, els sistemes de recomanació s'han fet omnipresents a la xarxa. Molts serveis web, inclosa la transmissió de pel·lícules, la cerca web i el comerç electrònic, utilitzen sistemes de recomanació per facilitar la presa de decisions. El turisme és una indústria molt representada a la xarxa. Hi ha diversos serveis web (e.g. TripAdvisor, Yelp) que es beneficien de la integració de sistemes recomanadors per ajudar els turistes a explorar destinacions turístiques. Això ha augmentat la investigació centrada en la millora dels recomanadors turístics per resoldre els principals problemes als quals s'enfronten. Aquesta tesi proposa nous algorismes per a sistemes recomanadors turístics que aprenen les preferències dels turistes a partir dels seus missatges a les xarxes socials per suggerir una seqüència d'activitats turístiques que s'ajustin a diversos contextes i incloguin activitats afins. Per aconseguir-ho, proposem mètodes per identificar els turistes a partir de les seves publicacions a Twitter, identificant les activitats experimentades en aquestes publicacions i perfilant turistes similars en funció dels seus interessos, informació contextual i períodes d'activitat. Aleshores, els perfils d'usuari es combinen amb un algorisme de mineria de regles d'associació per capturar relacions implícites entre els punts d'interès de cada perfil. Finalment, es fa un rànquing de regles i un procés de selecció d'un conjunt d'activitats recomanables. Es va avaluar la precisió de les recomanacions i l'efecte del perfil d'usuari. A més, ordenem el conjunt d'activitats mitjançant un algorisme multi-objectiu per enriquir l'experiència turística. També realitzem una segona fase d'anàlisi dels fluxos turístics a les destinacions que és beneficiós per a les organitzacions de gestió de destinacions, que volen entendre la mobilitat turística. En general, els mètodes i algorismes proposats en aquesta tesi es mostren útils en diversos aspectes dels sistemes de recomanació turística.En los últimos años, los sistemas de recomendación se han vuelto omnipresentes en la web. Muchos servicios web, incluida la transmisión de películas, la búsqueda en la web y el comercio electrónico, utilizan sistemas de recomendación para ayudar a la toma de decisiones. El turismo es una industria altament representada en la web. Hay varios servicios web (e.g. TripAdvisor, Yelp) que se benefician de la inclusión de sistemas recomendadores para ayudar a los turistas a explorar destinos turísticos. Esto ha aumentado la investigación centrada en mejorar los recomendadores turísticos y resolver los principales problemas a los que se enfrentan. Esta tesis propone nuevos algoritmos para sistemas recomendadores turísticos que aprenden las preferencias de los turistas a partir de sus mensajes en redes sociales para sugerir una secuencia de actividades turísticas que se alinean con diversos contextos e incluyen actividades afines. Para lograr esto, proponemos métodos para identificar a los turistas a partir de sus publicaciones en Twitter, identificar las actividades experimentadas en estas publicaciones y perfilar turistas similares en función de sus intereses, contexto información y periodos de actividad. Luego, los perfiles de usuario se combinan con un algoritmo de minería de reglas de asociación para capturar relaciones entre los puntos de interés que aparecen en cada perfil. Finalmente, un proceso de clasificación de reglas y selección de actividades produce un conjunto de actividades recomendables. Se evaluó la precisión de las recomendaciones y el efecto de la elaboración de perfiles de usuario. Ordenamos además el conjunto de actividades utilizando un algoritmo multi-objetivo para enriquecer la experiencia turística. También llevamos a cabo un análisis de los flujos turísticos en los destinos, lo que es beneficioso para las organizaciones de gestión de destinos, que buscan entender la movilidad turística. En general, los métodos y algoritmos propuestos en esta tesis se muestran útiles en varios aspectos de los sistemas de recomendación turística.In recent years, recommender systems have become ubiquitous on the web. Many web services, including movie streaming, web search and e-commerce, use recommender systems to aid human decision-making. Tourism is one industry that is highly represented on the web. There are several web services (e.g. TripAdvisor, Yelp) that benefit from integrating recommender systems to aid tourists in exploring tourism destinations. This has increased research focused on improving tourism recommender systems and solving the main issues they face. This thesis proposes new algorithms for tourism recommender systems that learn tourist preferences from their social media data to suggest a sequence of touristic activities that align with various contexts and include affine activities. To accomplish this, we propose methods for identifying tourists from their frequent Twitter posts, identifying the activities experienced in these posts, and profiling similar tourists based on their interests, contextual information, and activity periods. User profiles are then combined with an association rule mining algorithm for capturing implicit relationships between points of interest apparent in each profile. Finally, a rule ranking and activity selection process produces a set of recommendable activities. The recommendations were evaluated for accuracy and the effect of user profiling. We further order the set of activities using a multi-objective algorithm to enrich the tourist experience. We also carry out a second-stage analysis of tourist flows at destinations which is beneficial to destination management organisations seeking to understand tourist mobility. Overall, the methods and algorithms proposed in this thesis are shown to be useful in various aspects of tourism recommender systems

    Big data-driven multimodal traffic management : trends and challenges

    Get PDF
    corecore