
    Mining User Interests from Social Media

    Social media users readily share their preferences, life events, sentiment and opinions, and implicitly signal their thoughts, feelings, and psychological behavior. This makes social media a viable source of information to accurately and effectively mine users' interests with the hopes of enabling more effective user engagement, better quality delivery of appropriate services and higher user satisfaction. In this tutorial, we cover five important aspects related to the effective mining of user interests: (1) the foundations of social user interest modeling, such as information sources, various types of representation models and temporal features, (2) techniques that have been adopted or proposed for mining user interests, (3) different evaluation methodologies and benchmark datasets, (4) different applications that have been taking advantage of user interest mining from social media platforms, and (5) existing challenges, open research questions and exciting opportunities for further work

    Exploring dynamics and semantics of user interests for user modeling on Twitter for link recommendations

    User modeling for individual users on the Social Web plays an important role and is a fundamental step for personalization as well as recommendations. Recent studies have proposed different user modeling strategies considering various dimensions such as temporal dynamics and semantics of user interests. Although previous work proposed different user modeling strategies considering the temporal dynamics of user interests, there is a lack of comparative studies on those methods and therefore the comparative performance over each other is unknown. In terms of semantics of user interests, background knowledge from DBpedia has been explored to enrich user interest profiles so as to reveal more information about users. However, it is still unclear to what extent different types of information from DBpedia contribute to the enrichment of user interest profiles. In this paper, we propose user modeling strategies which use Concept Frequency - Inverse Document Frequency (CF-IDF) as a weighting scheme and incorporate either or both of the dynamics and semantics of user interests. To this end, we first provide a comparative study on different user modeling strategies considering the dynamics of user interests in previous literature to present their comparative performance. In addition, we investigate different types of information (i.e., categories, classes and connected entities via various properties) for entities from DBpedia and the combination of them for extending user interest profiles. Finally, we build our user modeling strategies incorporating either or both of the best performing methods in each dimension. Results show that our strategies outperform two baseline strategies significantly in the context of link recommendations on Twitter
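
The CF-IDF weighting named in the abstract above can be illustrated with a minimal sketch. It assumes the common TF-IDF-style formulation applied to concepts: a concept's relative frequency in one user's profile multiplied by the log of the number of users divided by the number of users who mention that concept. The function name `cf_idf`, the input layout, and the DBpedia-style identifiers are hypothetical and chosen only for illustration; the paper's exact formula may differ.

```python
import math
from collections import Counter


def cf_idf(profiles):
    """Weight concepts per user with a TF-IDF-style CF-IDF scheme.

    profiles maps a user id to the list of concept mentions extracted
    from that user's posts (illustrative input format, an assumption).
    """
    n_users = len(profiles)

    # Document frequency: in how many user profiles each concept occurs.
    df = Counter()
    for concepts in profiles.values():
        df.update(set(concepts))

    weights = {}
    for user, concepts in profiles.items():
        counts = Counter(concepts)
        total = sum(counts.values())
        weights[user] = {
            # Relative concept frequency times inverse document frequency.
            concept: (count / total) * math.log(n_users / df[concept])
            for concept, count in counts.items()
        }
    return weights


# Hypothetical profiles using DBpedia-style concept identifiers.
example_profiles = {
    "alice": ["dbr:Python", "dbr:Python", "dbr:Twitter"],
    "bob": ["dbr:Twitter"],
}
example_weights = cf_idf(example_profiles)
```

In this toy input, a concept mentioned by every user receives weight zero, so only concepts that distinguish one user from the others contribute to that user's interest profile.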

    Inferring user interests in microblogging social networks: a survey

    With the growing popularity of microblogging services such as Twitter in recent years, an increasing number of users are using these services in their daily lives. The huge volume of information generated by users raises new opportunities in various applications and areas. Inferring user interests plays a significant role in providing personalized recommendations on microblogging services, and also on third-party applications providing social logins via these services, especially in cold-start situations. In this survey, we review user modeling strategies with respect to inferring user interests from previous studies. To this end, we focus on four dimensions of inferring user interest profiles: (1) data collection, (2) representation of user interest profiles, (3) construction and enhancement of user interest profiles, and (4) the evaluation of the constructed profiles. Through this survey, we aim to provide an overview of state-of-the-art user modeling strategies for inferring user interest profiles on microblogging social networks with respect to the four dimensions. For each dimension, we review and summarize previous studies based on specified criteria. Finally, we discuss some challenges and opportunities for future work in this research domain

    An Information Diffusion-Based Recommendation Framework for Micro-Blogging

    Micro-blogging is increasingly evolving from a daily chatting tool into a critical platform for individuals and organizations to seek and share real-time news updates during emergencies. However, seeking and extracting useful information from micro-blogging sites poses significant challenges due to the volume of the traffic and the presence of a large body of irrelevant personal messages and spam. In this paper, we propose a novel recommendation framework to overcome this problem. By analyzing information diffusion patterns among a large set of micro-blogs that play the role of emergency news providers, our approach selects a small subset as recommended emergency news feeds for regular users. We evaluate our diffusion-based recommendation framework on Twitter during the early outbreak of H1N1 Flu. The evaluation results show that our method results in more balanced and comprehensive recommendations compared to benchmark approaches

    Generic adaptation framework for unifying adaptive web-based systems

    The Generic Adaptation Framework (GAF) research project first and foremost creates a common formal framework for describing current and future adaptive hypermedia systems (AHS) and adaptive web-based systems in general. It provides a commonly agreed-upon taxonomy and a reference model that encompasses the most general architectures of the present and future, including conventional AHS and different types of personalization-enabling systems and applications such as recommender systems (RS), personalized web search, semantic-web-enabled applications used in personalized information delivery, adaptive e-Learning applications and many more. At the same time, GAF tries to bring together two seemingly non-intersecting views on adaptation: a classical pre-authored type, with conventional domain and overlay user models, and data-driven adaptation, which includes a set of data mining, machine learning and information retrieval tools. To bring these research fields together, we conducted a number of GAF compliance studies covering RS, AHS, and other applications combining adaptation, recommendation and search. We also performed a number of case studies of real systems to prove the point and to analyze and evaluate the framework in detail. Secondly, it introduces a number of new ideas in the field of AH, such as the Generic Adaptation Process (GAP), which aligns with a layered (data-oriented) architecture and serves as a reference adaptation process. This also helps to understand the compliance features mentioned earlier. Besides that, GAF deals with important and novel aspects of adaptation-enabling and leveraging technologies such as provenance and versioning. The existence of such a reference basis should stimulate AHS research and enable researchers to demonstrate ideas for new adaptation methods much more quickly than if they had to start from scratch. GAF will thus help bootstrap any adaptive web-based system research, design, analysis and evaluation

    FRECOMTWEET: PRODUCT RECOMMENDATION APPLICATION USING FRIENDSHIP CLOSENESS ON TWITTER

    Developments in information and communication technology make it easier for people to interact with each other. This convenience is used to exchange ideas, for example by using the social media platform Twitter to look for product recommendations before buying. This has given rise to a trend in which consumers seek product recommendations from other people on social media. Social media, especially Twitter, offers several features for interacting with other people, such as tweets, retweets and mentions. Users can describe a product, attach a link, and give a positive or negative rating in a tweet. These types of tweets can be used as an alternative source of product recommendations. FrecomTweet is an Android-based product recommendation application that detects close friendships based on the user's retweets and mentions. The application also detects product recommendations that appear in conversations between users. This detection uses a keyword filtering method, which matches the conversation content against markers in the database. If the conversation has a positive rating, the product is recommended to the user's closest friends. The research uses a crawling method based on the Twitter streaming API filter, built with the CodeIgniter framework. The results of the black box test show that Twitter user conversations can be used as product recommendations, with precision and recall values of 0.94 and 0.81, respectively
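
The keyword filtering and the precision/recall evaluation mentioned in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not FrecomTweet's implementation: the marker set, the sample tweets, their labels, and the helper names `matches_marker` and `precision_recall` are all hypothetical.

```python
def matches_marker(text, markers):
    """Return True if any marker keyword appears as a word in the text."""
    words = set(text.lower().split())
    return any(marker in words for marker in markers)


def precision_recall(predicted, actual):
    """Compute precision and recall from parallel lists of booleans."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical marker keywords standing in for the database entries.
markers = {"recommend", "worth"}

# Hypothetical tweets with a manual label: is it a product recommendation?
labeled_tweets = [
    ("I recommend this phone for the battery life", True),
    ("worth buying the new headphones", True),
    ("you should try this laptop", True),
    ("not worth my time today", False),
    ("good morning everyone", False),
]

predicted = [matches_marker(text, markers) for text, _ in labeled_tweets]
actual = [label for _, label in labeled_tweets]
precision, recall = precision_recall(predicted, actual)
```

Precision is the share of flagged tweets that really are recommendations, and recall is the share of real recommendations that the filter flags. A plain keyword match like this one misses recommendations phrased without a marker word and flags unrelated uses of a marker, which is why both measures are reported.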

    Analysis and assessment of a knowledge based smart city architecture providing service APIs

    The main technical issues regarding smart city solutions are related to data gathering, aggregation, reasoning, data analytics, access, and service delivery via Smart City APIs (Application Program Interfaces). Different kinds of Smart City APIs enable smart city services and applications, while their effectiveness depends on the architectural solutions used to pass from data to services for city users and operators, exploiting data analytics and presenting services via APIs. There is therefore strong activity in defining smart city architectures that cope with this complexity and put in place a significant range of different kinds of services and processes. This paper presents the work performed in the context of the Sii-Mobility smart city project on defining a smart city architecture that addresses a wide range of processes and data. To this end, state-of-the-art smart city architectures for data aggregation and for Smart City APIs are compared, highlighting the use of semantic ontologies and knowledge bases in data aggregation and in the production of smart services. The proposed solution aggregates and reconciles data (open and private, static and real time) by using reasoning/smart algorithms to enable sophisticated service delivery via Smart City API. The work presented has been developed in the context of the Sii-Mobility national smart city project on mobility and transport integrated with smart city services, with the aim of reaching more sustainable mobility and transport systems. Sii-Mobility is grounded on the Km4City ontology and tools for smart city data aggregation, analytics support and service production exploiting smart city API. To this end, Sii-Mobility/Km4City APIs have been compared to the state-of-the-art solutions.
Moreover, the proposed architecture has been assessed in terms of performance and of computational and network costs, using measures that can easily be taken on an on-premise private cloud. The computational costs and workloads of the data ingestion and data analytics processes have been assessed to identify suitable measures for estimating the resources needed. Finally, data on API consumption in the recent period are presented

    Data Mining Algorithms for Internet Data: from Transport to Application Layer

    Nowadays we live in a data-driven world. Advances in data generation, collection and storage technology have enabled organizations to gather data sets of massive size. Data mining is a discipline that blends traditional data analysis methods with sophisticated algorithms to handle the challenges posed by these new types of data sets. The Internet is a complex and dynamic system in which new protocols and applications arise at a constant pace. All these characteristics make the Internet a valuable and challenging data source and application domain for research, both at the Transport layer, analyzing network traffic flows, and up at the Application layer, focusing on the ever-growing next generation web services: blogs, micro-blogs, on-line social networks, photo sharing services and many other applications (e.g., Twitter, Facebook, Flickr, etc.). In this thesis work we focus on the study, design and development of novel algorithms and frameworks to support large scale data mining activities over huge and heterogeneous data volumes, with a particular focus on Internet data as data source and targeting network traffic classification, on-line social network analysis, recommendation systems, cloud services and Big data

    Detecting, Modeling, and Predicting User Temporal Intention

    The content of social media has grown exponentially in recent years and its role has evolved from narrating life events to actually shaping them. Unfortunately, content posted and shared in social networks is vulnerable and prone to loss or change, rendering the context associated with it (a tweet, post, status, or others) meaningless. There is an inherent value in maintaining the consistency of such social records, as in some cases they take over the task of being the first draft of history: collections of these social posts narrate the pulse of the street during historic events, protests, riots, elections, wars, disasters, and others, as shown in this work. The user sharing the resource has an implicit temporal intent: either the state of the resource at the time of sharing, or the current state of the resource at the time of the reader "clicking". In this research, we propose a model to detect and predict the temporal intention of the author upon sharing content in the social network and of the reader upon resolving this content. To build this model, we first examine the three aspects of the problem: the resource, time, and the user. For the resource, we start by analyzing the content on the live web and its persistence. We noticed that a portion of the resources shared in social media disappear, and with further analysis we unraveled a relationship between this disappearance and time. We lose around 11% of the resources after one year of sharing and a steady 7% every following year. With this, we turn to the public archives, and our analysis reveals that not all posted resources are archived and that, even if they were, an average of 8% per year disappears from the archives, and in some cases the archived content is heavily damaged. These observations show that the archives are not populated well enough to consistently and reliably reconstruct the missing resource as it existed at the time of sharing.
To analyze the concept of time, we devised several experiments to estimate the creation date of the shared resources. We developed Carbon Date, a tool which successfully estimated the correct creation dates for 76% of the test sets. We also wanted to measure whether and how resources change over time after their creation. We conducted a longitudinal study on a data set of very recently published tweet-resource pairs, recording observations hourly. We found that after just one hour, ~4% of the resources had changed by ≥30%, while after a day the rate of change slowed, with ~12% of the resources having changed by ≥40%. For the third and final component of the problem, we conducted user behavioral analysis experiments and built a data set of 1,124 instances manually assigned by test subjects. Temporal intention proved to be a difficult concept for average users to understand. We developed our Temporal Intention Relevancy Model (TIRM) to transform the highly subjective temporal intention problem into the more easily understood idea of relevancy between a tweet and the resource it links to, and change of the resource through time. On our collected data set, TIRM produced a significant 90.27% success rate. Furthermore, we extended TIRM and used it to build a time-based model to predict temporal intention change or steadiness at the time of posting with 77% accuracy. We built a service API around this model to provide predictions, along with a few prototypes. Future tools could implement TIRM to assist users in pushing copies of shared resources into public web archives to ensure the integrity of the historical record. Additional tools could be used to assist the mining of the existing social media corpus by dereferencing the intended version of the shared resource based on the intention strength and the time between the tweeting and mining