29,884 research outputs found

    On Cognitive Preferences and the Plausibility of Rule-based Models

    Get PDF
    It is conventional wisdom in machine learning and data mining that logical models such as rule sets are more interpretable than other models, and that among such rule-based models, simpler models are more interpretable than more complex ones. In this position paper, we question this latter assumption by focusing on one particular aspect of interpretability, namely the plausibility of models. Roughly speaking, we equate the plausibility of a model with the likeliness that a user accepts it as an explanation for a prediction. In particular, we argue that, all other things being equal, longer explanations may be more convincing than shorter ones, and that the predominant bias for shorter models, which is typically necessary for learning powerful discriminative models, may not be suitable when it comes to user acceptance of the learned models. To that end, we first recapitulate evidence for and against this postulate, and then report the results of an evaluation in a crowd-sourcing study based on about 3.000 judgments. The results do not reveal a strong preference for simple rules, whereas we can observe a weak preference for longer rules in some domains. We then relate these results to well-known cognitive biases such as the conjunction fallacy, the representative heuristic, or the recogition heuristic, and investigate their relation to rule length and plausibility.Comment: V4: Another rewrite of section on interpretability to clarify focus on plausibility and relation to interpretability, comprehensibility, and justifiabilit

    Data-driven design of intelligent wireless networks: an overview and tutorial

    Get PDF
    Data science or "data-driven research" is a research approach that uses real-life data to gain insight about the behavior of systems. It enables the analysis of small, simple as well as large and more complex systems in order to assess whether they function according to the intended design and as seen in simulation. Data science approaches have been successfully applied to analyze networked interactions in several research areas such as large-scale social networks, advanced business and healthcare processes. Wireless networks can exhibit unpredictable interactions between algorithms from multiple protocol layers, interactions between multiple devices, and hardware specific influences. These interactions can lead to a difference between real-world functioning and design time functioning. Data science methods can help to detect the actual behavior and possibly help to correct it. Data science is increasingly used in wireless research. To support data-driven research in wireless networks, this paper illustrates the step-by-step methodology that has to be applied to extract knowledge from raw data traces. To this end, the paper (i) clarifies when, why and how to use data science in wireless network research; (ii) provides a generic framework for applying data science in wireless networks; (iii) gives an overview of existing research papers that utilized data science approaches in wireless networks; (iv) illustrates the overall knowledge discovery process through an extensive example in which device types are identified based on their traffic patterns; (v) provides the reader the necessary datasets and scripts to go through the tutorial steps themselves

    A study on text-score disagreement in online reviews

    Get PDF
    In this paper, we focus on online reviews and employ artificial intelligence tools, taken from the cognitive computing field, to help understanding the relationships between the textual part of the review and the assigned numerical score. We move from the intuitions that 1) a set of textual reviews expressing different sentiments may feature the same score (and vice-versa); and 2) detecting and analyzing the mismatches between the review content and the actual score may benefit both service providers and consumers, by highlighting specific factors of satisfaction (and dissatisfaction) in texts. To prove the intuitions, we adopt sentiment analysis techniques and we concentrate on hotel reviews, to find polarity mismatches therein. In particular, we first train a text classifier with a set of annotated hotel reviews, taken from the Booking website. Then, we analyze a large dataset, with around 160k hotel reviews collected from Tripadvisor, with the aim of detecting a polarity mismatch, indicating if the textual content of the review is in line, or not, with the associated score. Using well established artificial intelligence techniques and analyzing in depth the reviews featuring a mismatch between the text polarity and the score, we find that -on a scale of five stars- those reviews ranked with middle scores include a mixture of positive and negative aspects. The approach proposed here, beside acting as a polarity detector, provides an effective selection of reviews -on an initial very large dataset- that may allow both consumers and providers to focus directly on the review subset featuring a text/score disagreement, which conveniently convey to the user a summary of positive and negative features of the review target.Comment: This is the accepted version of the paper. The final version will be published in the Journal of Cognitive Computation, available at Springer via http://dx.doi.org/10.1007/s12559-017-9496-

    StackInsights: Cognitive Learning for Hybrid Cloud Readiness

    Full text link
    Hybrid cloud is an integrated cloud computing environment utilizing a mix of public cloud, private cloud, and on-premise traditional IT infrastructures. Workload awareness, defined as a detailed full range understanding of each individual workload, is essential in implementing the hybrid cloud. While it is critical to perform an accurate analysis to determine which workloads are appropriate for on-premise deployment versus which workloads can be migrated to a cloud off-premise, the assessment is mainly performed by rule or policy based approaches. In this paper, we introduce StackInsights, a novel cognitive system to automatically analyze and predict the cloud readiness of workloads for an enterprise. Our system harnesses the critical metrics across the entire stack: 1) infrastructure metrics, 2) data relevance metrics, and 3) application taxonomy, to identify workloads that have characteristics of a) low sensitivity with respect to business security, criticality and compliance, and b) low response time requirements and access patterns. Since the capture of the data relevance metrics involves an intrusive and in-depth scanning of the content of storage objects, a machine learning model is applied to perform the business relevance classification by learning from the meta level metrics harnessed across stack. In contrast to traditional methods, StackInsights significantly reduces the total time for hybrid cloud readiness assessment by orders of magnitude

    Demographic Inference and Representative Population Estimates from Multilingual Social Media Data

    Get PDF
    Social media provide access to behavioural data at an unprecedented scale and granularity. However, using these data to understand phenomena in a broader population is difficult due to their non-representativeness and the bias of statistical inference tools towards dominant languages and groups. While demographic attribute inference could be used to mitigate such bias, current techniques are almost entirely monolingual and fail to work in a global environment. We address these challenges by combining multilingual demographic inference with post-stratification to create a more representative population sample. To learn demographic attributes, we create a new multimodal deep neural architecture for joint classification of age, gender, and organization-status of social media users that operates in 32 languages. This method substantially outperforms current state of the art while also reducing algorithmic bias. To correct for sampling biases, we propose fully interpretable multilevel regression methods that estimate inclusion probabilities from inferred joint population counts and ground-truth population counts. In a large experiment over multilingual heterogeneous European regions, we show that our demographic inference and bias correction together allow for more accurate estimates of populations and make a significant step towards representative social sensing in downstream applications with multilingual social media.Comment: 12 pages, 10 figures, Proceedings of the 2019 World Wide Web Conference (WWW '19

    Unveiling the multimedia unconscious: implicit cognitive processes and multimedia content analysis

    Get PDF
    One of the main findings of cognitive sciences is that automatic processes of which we are unaware shape, to a significant extent, our perception of the environment. The phenomenon applies not only to the real world, but also to multimedia data we consume every day. Whenever we look at pictures, watch a video or listen to audio recordings, our conscious attention efforts focus on the observable content, but our cognition spontaneously perceives intentions, beliefs, values, attitudes and other constructs that, while being outside of our conscious awareness, still shape our reactions and behavior. So far, multimedia technologies have neglected such a phenomenon to a large extent. This paper argues that taking into account cognitive effects is possible and it can also improve multimedia approaches. As a supporting proof-of-concept, the paper shows not only that there are visual patterns correlated with the personality traits of 300 Flickr users to a statistically significant extent, but also that the personality traits (both self-assessed and attributed by others) of those users can be inferred from the images these latter post as "favourite"
    • …
    corecore