
    Quality of Information in Mobile Crowdsensing: Survey and Research Challenges

    Smartphones have become the most pervasive devices in people's lives, and are clearly transforming the way we live and perceive technology. Today's smartphones benefit from almost ubiquitous Internet connectivity and come equipped with a plethora of inexpensive yet powerful embedded sensors, such as the accelerometer, gyroscope, microphone, and camera. This unique combination has enabled revolutionary applications based on the mobile crowdsensing paradigm, such as real-time road traffic monitoring, air and noise pollution monitoring, crime control, and wildlife monitoring, to name a few. Unlike prior sensing paradigms, humans are now the primary actors of the sensing process, since they are fundamental in retrieving reliable and up-to-date information about the event being monitored. As humans may behave unreliably or maliciously, assessing and guaranteeing Quality of Information (QoI) becomes more important than ever. In this paper, we provide a new framework for defining and enforcing QoI in mobile crowdsensing, and analyze in depth the current state of the art on the topic. We also outline novel research challenges, along with possible directions for future work.
    Comment: To appear in ACM Transactions on Sensor Networks (TOSN)

    Rank Centrality: Ranking from Pair-wise Comparisons

    The question of aggregating pair-wise comparisons to obtain a global ranking over a collection of objects has been of interest for a very long time: be it ranking of online gamers (e.g. MSR's TrueSkill system) and chess players, aggregating social opinions, or deciding which product to sell based on transactions. In most settings, in addition to obtaining a ranking, finding 'scores' for each object (e.g. a player's rating) is of interest for understanding the intensity of the preferences. In this paper, we propose Rank Centrality, an iterative rank aggregation algorithm for discovering scores for objects (or items) from pair-wise comparisons. The algorithm has a natural random walk interpretation over the graph of objects, with an edge present between a pair of objects if they are compared; the score of an object, which we call its Rank Centrality, turns out to be its stationary probability under this random walk. To study the efficacy of the algorithm, we consider the popular Bradley-Terry-Luce (BTL) model (equivalent to the Multinomial Logit (MNL) model for pair-wise comparisons), in which each object has an associated score that determines the probabilistic outcomes of pair-wise comparisons between objects. In terms of the pair-wise marginal probabilities, which are the main subject of this paper, the MNL and BTL models are identical. We bound the finite sample error rates between the scores assumed by the BTL model and those estimated by our algorithm. In particular, the number of samples required to learn the scores well with high probability depends on the structure of the comparison graph. When the Laplacian of the comparison graph has a strictly positive spectral gap, e.g. when each item is compared to a subset of randomly chosen items, the required number of samples is nearly order-optimal.
    Comment: 45 pages, 3 figures
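
    For intuition, here is a compact Python sketch of the Rank Centrality construction, assuming the comparisons arrive as a matrix of win counts; variable names and convergence details are ours, not the authors' code.

    import numpy as np

    def rank_centrality(wins, n_iter=1000, tol=1e-10):
        # wins[i, j] counts how often item i beat item j.
        n = wins.shape[0]
        total = wins + wins.T                                # comparisons per pair
        with np.errstate(divide="ignore", invalid="ignore"):
            p = np.where(total > 0, wins.T / total, 0.0)     # p[i, j] = Pr(j beats i)
        d_max = max(int((total > 0).sum(axis=1).max()), 1)   # max comparison degree
        P = p / d_max                                        # random-walk transition rates
        np.fill_diagonal(P, 0.0)
        np.fill_diagonal(P, 1.0 - P.sum(axis=1))             # self-loops keep rows stochastic
        pi = np.full(n, 1.0 / n)
        for _ in range(n_iter):                              # power iteration to stationarity
            nxt = pi @ P
            if np.abs(nxt - pi).sum() < tol:
                break
            pi = nxt
        return pi / pi.sum()                                 # stationary probabilities = scores

    Items that beat strong opponents inherit their stationary mass, which is the random-walk reading of the score described in the abstract.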

    Econometrics meets sentiment : an overview of methodology and applications

    The advent of massive amounts of textual, audio, and visual data has spurred the development of econometric methodology to transform qualitative sentiment data into quantitative sentiment variables, and to use those variables in an econometric analysis of the relationships between sentiment and other variables. We survey this emerging research field and refer to it as sentometrics, a portmanteau of sentiment and econometrics. We provide a synthesis of the relevant methodological approaches, illustrate with empirical results, and discuss useful software.
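
    A toy sketch of the two-step pipeline the survey describes: lexicon-based tone scores are aggregated into a per-period sentiment variable, which then enters a standard regression. The lexicons, names and aggregation rule here are placeholder assumptions, not the survey's methodology.

    import numpy as np

    POS = {"gain", "growth", "strong"}         # toy lexicons; real work uses
    NEG = {"loss", "decline", "weak"}          # richer, domain-specific ones

    def tone(text):
        # Net fraction of positive minus negative words in a document.
        words = text.lower().split()
        return (sum(w in POS for w in words) - sum(w in NEG for w in words)) / max(len(words), 1)

    def sentiment_index(docs_by_date):
        # Qualitative texts -> one quantitative sentiment variable per period.
        return {d: float(np.mean([tone(t) for t in texts]))
                for d, texts in docs_by_date.items()}

    def ols(s, y):
        # Econometric step: regress an outcome y_t on the sentiment index s_t.
        X = np.column_stack([np.ones(len(s)), np.asarray(s)])
        beta, *_ = np.linalg.lstsq(X, np.asarray(y), rcond=None)
        return beta                            # [intercept, sentiment effect]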

    Probabilistic Graphical Models for Credibility Analysis in Evolving Online Communities

    One of the major hurdles preventing the full exploitation of information from online communities is the widespread concern regarding the quality and credibility of user-contributed content. Prior works in this domain operate on a static snapshot of the community, make strong assumptions about the structure of the data (e.g., relational tables), or consider only shallow features for text classification. To address these limitations, we propose probabilistic graphical models that can leverage the joint interplay between multiple factors in online communities, such as user interactions, community dynamics, and textual content, to automatically assess the credibility of user-contributed online content and the expertise of users and its evolution, with user-interpretable explanations. To this end, we devise new models based on Conditional Random Fields for different settings, such as incorporating partial expert knowledge for semi-supervised learning, and handling discrete labels as well as numeric ratings for fine-grained analysis. This enables applications such as extracting reliable side-effects of drugs from user-contributed posts in health forums, and identifying credible content in news communities. Online communities are dynamic, as users join and leave, adapt to evolving trends, and mature over time. To capture these dynamics, we propose generative models based on Hidden Markov Models, Latent Dirichlet Allocation, and Brownian Motion to trace the continuous evolution of user expertise and language models over time. This allows us to identify expert users and credible content jointly over time, improving state-of-the-art recommender systems by explicitly considering the maturity of users. It also enables applications such as identifying helpful product reviews, and detecting fake and anomalous reviews with limited information.
    Comment: PhD thesis, Mar 201
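
    The thesis's models are full CRFs, but the mutual-reinforcement intuition (expert users lend credibility to content, and agreement with credible content signals expertise) can be sketched with a much simpler alternating fixed-point update. All names are hypothetical, and this is a caricature of the actual models:

    from collections import defaultdict

    def joint_credibility(posts, n_iter=20):
        # posts[user] = list of (statement_id, stance) with stance in {0, 1}
        expertise = {u: 0.5 for u in posts}                # prior user expertise
        cred = defaultdict(lambda: 0.5)                    # prior statement credibility
        for _ in range(n_iter):
            votes, weights = defaultdict(float), defaultdict(float)
            for u, items in posts.items():                 # credibility = expertise-weighted stance
                for s, stance in items:
                    votes[s] += expertise[u] * stance
                    weights[s] += expertise[u]
            for s in votes:
                cred[s] = votes[s] / max(weights[s], 1e-9)
            for u, items in posts.items():                 # expertise = agreement with consensus
                agree = [1.0 - abs(stance - cred[s]) for s, stance in items]
                expertise[u] = sum(agree) / max(len(agree), 1)
        return expertise, dict(cred)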

    Comprehensive Evaluation of Matrix Factorization Models for Collaborative Filtering Recommender Systems

    Matrix factorization models are the core of current commercial collaborative filtering Recommender Systems. This paper tests six representative matrix factorization models on four collaborative filtering datasets. The experiments cover a variety of accuracy and beyond-accuracy quality measures, including prediction, recommendation of ordered and unordered lists, novelty, and diversity. Results show which matrix factorization model is most suitable depending on its simplicity, the required prediction quality, the necessary recommendation quality, the desired recommendation novelty and diversity, the need to explain recommendations, the adequacy of assigning semantic interpretations to hidden factors, the advisability of recommending to groups of users, and the need to obtain reliability values. To ensure the reproducibility of the experiments, an open framework has been used, and the implementation code is provided.
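
    As a reference point for what is being compared, a minimal regularized matrix factorization trained by stochastic gradient descent looks roughly as follows; hyperparameters are illustrative, and the paper's six models add further refinements (biases, implicit feedback, and so on).

    import numpy as np

    def train_mf(ratings, n_users, n_items, k=20, lr=0.01, reg=0.05, epochs=30):
        # ratings: iterable of (user_index, item_index, rating) triples
        rng = np.random.default_rng(0)
        P = rng.normal(scale=0.1, size=(n_users, k))   # user latent factors
        Q = rng.normal(scale=0.1, size=(n_items, k))   # item latent factors
        for _ in range(epochs):
            for u, i, r in ratings:
                err = r - P[u] @ Q[i]                  # prediction error
                pu = P[u].copy()
                P[u] += lr * (err * Q[i] - reg * P[u]) # SGD step with L2 penalty
                Q[i] += lr * (err * pu - reg * Q[i])
        return P, Q

    def predict(P, Q, u, i):
        return P[u] @ Q[i]                             # predicted rating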

    A Trust Management Framework for Decision Support Systems

    In the era of information explosion, it is critical to develop a framework which can extract useful information and help people make “educated” decisions. In our lives, whether or not we are aware of it, trust has turned out to be very helpful in making decisions. At the same time, cognitive trust, especially in large systems such as Facebook, Twitter, and so on, needs support from computer systems. Therefore, we need a framework that can effectively, but also intuitively, let people express their trust, and enable the system to automatically and securely summarize the massive amounts of trust information, so that a user of the system can make “educated” decisions, or at least not blind ones. Inspired by the similarities between human trust and physical measurements, this dissertation proposes a measurement theory based trust management framework. It consists of three phases: trust modeling, trust inference, and decision making. Instead of proposing specific trust inference formulas, this dissertation proposes a fundamental framework which is flexible and can be adapted to many different inference formulas. Validation experiments are done on two data sets: the Epinions.com data set and the Twitter data set. This dissertation also adapts the measurement theory based trust management framework to two decision support applications. In the first application, real stock market data are used as ground truth for the framework: the correlation between the sentiment expressed on Twitter and stock market data is measured. Compared with existing works, which do not differentiate tweets’ authors, this dissertation analyzes trust among stock investors on Twitter and uses the trust network to differentiate tweets’ authors. The results show that, using the measurement theory based trust framework, Twitter sentiment valence reflects abnormal stock returns better than treating all authors as equally important or weighting them by their number of followers. In the second application, the framework is used to help detect and prevent attacks in cloud computing scenarios, treating each single flow as a measurement. The simulation results show that the framework is able to provide guidance for cloud administrators and customers to make decisions, e.g. migrating tasks from suspect nodes to trustworthy nodes, dynamically allocating resources according to trust information, and managing the trade-off between the degree of redundancy and the cost of resources.
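
    Once trust scores exist, the first application reduces to a weighted aggregation. A hedged sketch of the three weighting schemes being compared; the trust scores themselves would come from the dissertation's framework, and all data here are toy values:

    follower_count = {"alice": 10_000, "bob": 50}      # toy data
    trust_score = {"alice": 0.2, "bob": 0.9}           # toy framework output
    tweets = [("alice", 0.8), ("bob", -0.6)]           # (author, sentiment in [-1, 1])

    def weighted_valence(tweets, weight_of):
        # Aggregate sentiment as a weighted mean over authors.
        den = sum(weight_of(a) for a, _ in tweets)
        return sum(weight_of(a) * s for a, s in tweets) / den if den else 0.0

    equal = lambda a: 1.0                              # treat all authors equally
    followers = lambda a: follower_count[a]            # weight by follower count
    trust = lambda a: trust_score[a]                   # weight by trust score

    print(weighted_valence(tweets, equal), weighted_valence(tweets, trust))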

    Textual Analysis of Intangible Information

    Traditionally, equity investors have relied upon the information reported in firms’ financial accounts to make their investment decisions. Due to the conservative nature of accounting standards, firms cannot value intangible assets such as corporate culture, brand value and reputation. Investors’ efforts to collect such information have been hampered by the voluntary nature of Corporate Social Responsibility (CSR) reporting standards, which has resulted in the publication of inconsistent, stale and incomplete information across firms. In short, information on intangible assets is less salient to investors than accounting information because it is more costly to collect, process and analyse. In this thesis we design an automated approach to collect and quantify information on firms’ intangible assets by drawing upon techniques commonly adopted in the fields of Natural Language Processing (NLP) and Information Retrieval. The exploitation of unstructured data available on the Web holds promise for investors seeking to integrate a wider variety of information into their investment processes. The objectives of this research are: 1) to draw upon textual analysis methodologies to measure intangible information from a range of unstructured data sources; 2) to integrate intangible information and accounting information into an investment analysis framework; and 3) to evaluate the merits of unstructured data for the prediction of firms’ future earnings.
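
    Objective 1 can be illustrated with a crude dictionary-based scorer that turns a firm document into per-dimension intangible intensities; the term lists are invented placeholders, not the thesis's lexicons.

    INTANGIBLE_TERMS = {
        "culture":    {"employees", "values", "diversity"},
        "brand":      {"brand", "loyalty", "awareness"},
        "reputation": {"reputation", "trust", "responsibility"},
    }

    def intangible_scores(text):
        # Relative frequency of each dimension's terms in the document.
        words = text.lower().split()
        n = max(len(words), 1)
        return {dim: sum(w in terms for w in words) / n
                for dim, terms in INTANGIBLE_TERMS.items()}

    Scores like these could then sit alongside accounting variables as features in the earnings-prediction step of objective 3.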

    Advertisement Allocation and Trust Mechanisms Design in Social Networks

    Social network sites (SNS), such as Facebook, Google+ and Twitter, have attracted hundreds of millions of daily users since their appearance. Within SNS, users connect to each other, express their identity, disseminate information and cooperate by interacting with their connected peers. The increasing popularity and ubiquity of SNS usage, together with the invaluable user behaviors and connections they capture, have given birth to many applications and business models. We look into several important problems within the social network ecosystem. The first is the SNS advertisement allocation problem; the other two are related to trust mechanism design in the social network setting, namely local trust inference and global trust evaluation. In SNS advertising, we study the problem of advertisement allocation from the ad platform's angle, and discuss how it differs from the advertising model in the search engine setting. By leveraging the connection between social networks and hyperbolic geometry, we propose to solve the problem approximately using hyperbolic embedding and convex optimization. A hyperbolic embedding method is designed for the SNS ad allocation problem, and several components are introduced to realize the optimization formulation. We show the advantages of our new approach compared to the baseline integer programming (IP) formulation. In studying trust mechanisms in social networks, we consider the existence of distrust (i.e. negative trust) relationships, and differentiate between the concepts of local trust and global trust in the social network setting. For local trust inference, we propose a 2-D trust model and, based on it, develop a semiring-based trust inference framework. For global trust evaluation, we consider a general setting with conflicting opinions, and propose a consensus-based approach to solve this complex problem in signed trust networks.
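
    One standard way to read "semiring-based trust inference" is as a path algebra: trust multiplies along a path, and parallel paths combine by max. A minimal sketch under that assumption (the thesis's 2-D model and its handling of distrust are richer than this):

    def infer_trust(edges, src, dst, n_nodes):
        # edges: dict mapping (u, v) -> direct trust in [0, 1]
        best = {u: 0.0 for u in range(n_nodes)}
        best[src] = 1.0
        for _ in range(n_nodes - 1):                   # Bellman-Ford-style relaxation
            for (u, v), t in edges.items():            # under the (max, *) semiring
                best[v] = max(best[v], best[u] * t)
        return best[dst]

    edges = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.5}    # toy example, no distrust edges
    print(infer_trust(edges, 0, 2, 3))                 # max(0.9 * 0.8, 0.5) = 0.72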