8,817 research outputs found

    Peeking into the other half of the glass : handling polarization in recommender systems.

    Get PDF
    This dissertation is about filtering and discovering information online while using recommender systems. In the first part of our research, we study the phenomenon of polarization and its impact on filtering and discovering information. Polarization is a social phenomenon, with serious consequences, in real-life, particularly on social media. Thus it is important to understand how machine learning algorithms, especially recommender systems, behave in polarized environments. We study polarization within the context of the users\u27 interactions with a space of items and how this affects recommender systems. We first formalize the concept of polarization based on item ratings and then relate it to the item reviews, when available. We then propose a domain independent data science pipeline to automatically detect polarization using the ratings rather than the properties, typically used to detect polarization, such as item\u27s content or social network topology. We perform an extensive comparison of polarization measures on several benchmark data sets and show that our polarization detection framework can detect different degrees of polarization and outperforms existing measures in capturing an intuitive notion of polarization. We also investigate and uncover certain peculiar patterns that are characteristic of environments where polarization emerges: A machine learning algorithm finds it easier to learn discriminating models in polarized environments: The models will quickly learn to keep each user in the safety of their preferred viewpoint, essentially, giving rise to filter bubbles and making them easier to learn. After quantifying the extent of polarization in current recommender system benchmark data, we propose new counter-polarization approaches for existing collaborative filtering recommender systems, focusing particularly on the state of the art models based on Matrix Factorization. Our work represents an essential step toward the new research area concerned with quantifying, detecting and counteracting polarization in human-generated data and machine learning algorithms.We also make a theoretical analysis of how polarization affects learning latent factor models, and how counter-polarization affects these models. In the second part of our dissertation, we investigate the problem of discovering related information by recommendation of tags on social media micro-blogging platforms. Real-time micro-blogging services such as Twitter have recently witnessed exponential growth, with millions of active web users who generate billions of micro-posts to share information, opinions and personal viewpoints, daily. However, these posts are inherently noisy and unstructured because they could be in any format, hence making them difficult to organize for the purpose of retrieval of relevant information. One way to solve this problem is using hashtags, which are quickly becoming the standard approach for annotation of various information on social media, such that varied posts about the same or related topic are annotated with the same hashtag. However hashtags are not used in a consistent manner and most importantly, are completely optional to use. This makes them unreliable as the sole mechanism for searching for relevant information. We investigate mechanisms for consolidating the hashtag space using recommender systems. Our methods are general enough that they can be used for hashtag annotation in various social media services such as twitter, as well as for general item recommendations on systems that rely on implicit user interest data such as e-learning and news sites, or explicit user ratings, such as e-commerce and online entertainment sites. To conclude, we propose a methodology to extract stories based on two types of hashtag co-occurrence graphs. Our research in hashtag recommendation was able to exploit the textual content that is available as part of user messages or posts, and thus resulted in hybrid recommendation strategies. Using content within this context can bridge polarization boundaries. However, when content is not available, is missing, or is unreliable, as in the case of platforms that are rich in multimedia and multilingual posts, the content option becomes less powerful and pure collaborative filtering regains its important role, along with the challenges of polarization

    Perceiving University Student's Opinions from Google App Reviews

    Full text link
    Google app market captures the school of thought of users from every corner of the globe via ratings and text reviews, in a multilinguistic arena. The potential information from the reviews cannot be extracted manually, due to its exponential growth. So, Sentiment analysis, by machine learning and deep learning algorithms employing NLP, explicitly uncovers and interprets the emotions. This study performs the sentiment classification of the app reviews and identifies the university student's behavior towards the app market via exploratory analysis. We applied machine learning algorithms using the TP, TF, and TF IDF text representation scheme and evaluated its performance on Bagging, an ensemble learning method. We used word embedding, Glove, on the deep learning paradigms. Our model was trained on Google app reviews and tested on Student's App Reviews(SAR). The various combinations of these algorithms were compared amongst each other using F score and accuracy and inferences were highlighted graphically. SVM, amongst other classifiers, gave fruitful accuracy(93.41%), F score(89%) on bigram and TF IDF scheme. Bagging enhanced the performance of LR and NB with accuracy of 87.88% and 86.69% and F score of 86% and 78% respectively. Overall, LSTM on Glove embedding recorded the highest accuracy(95.2%) and F score(88%).Comment: Accepted in Concurrency and Computation Practice and Experienc

    Analyzing the opinions and emotions of Internet customers using deep ensemble learning based on rbm

    Get PDF
    Background: The emotions and opinions of Internet users are critical, as they directly influence the provision of proper services. The aim of this study was analyzing the opinions and emotions of internet customers using deep ensemble learning based on rbm. Methods: Method of this study was based on the deep ensemble learning technique which uses a deep ensemble neural network based on Gaussian restricted Boltzmann machine and cost-sensitive tree the opinions and emotions of Internet customers were analyzed in terms of semantics and linguistics in virtual shops. To analyze behavior or emotions, existing algorithms were divided into groups of semantic approach, language approach and machine learning. The semantic, linguistic and group learning aspects (machine learning) were considered together. The opinions, feelings, and behaviors of Internet customers were analyzed. The proposed method was implemented in MATLAB software. To evaluate this method, conventional criteria that were /applied in data mining applications have been used including accuracy, recall, and F score. Results: Based on the experiments performed and by evaluating this method against individual and ensemble methods plus the approaches presented in data mining so far, it was revealed that the proposed model outperforms other methods regarding data mining assessment criteria. Conclusion: Based on social engineering, the suggested model is provided to forecast consumer behavior. In addition to analyzing customers' behavior which examined their emotions and feelings based on their opinions. The results of this study can be used by planners in the field of competitive internet markets

    Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation

    Get PDF
    Peer reviewe

    Biases in scholarly recommender systems: impact, prevalence, and mitigation

    Get PDF
    We create a simulated financial market and examine the effect of different levels of active and passive investment on fundamental market efficiency. In our simulated market, active, passive, and random investors interact with each other through issuing orders. Active and passive investors select their portfolio weights by optimizing Markowitz-based utility functions. We find that higher fractions of active investment within a market lead to an increased fundamental market efficiency. The marginal increase in fundamental market efficiency per additional active investor is lower in markets with higher levels of active investment. Furthermore, we find that a large fraction of passive investors within a market may facilitate technical price bubbles, resulting in market failure. By examining the effect of specific parameters on market outcomes, we find that that lower transaction costs, lower individual forecasting errors of active investors, and less restrictive portfolio constraints tend to increase fundamental market efficiency in the market

    Data analytics 2016: proceedings of the fifth international conference on data analytics

    Get PDF
    • …
    corecore