
    Accounting for Availability Biases in Information Visualization

    The availability heuristic is a strategy that people use to make quick decisions, but it often leads to systematic errors. We propose three ways in which visualization could facilitate unbiased decision-making. First, visualizations can alter the way our memory stores events for later recall, thereby improving users' long-term intuitions. Second, known biases could inform new visualization guidelines. Third, we suggest designing decision-making tools inspired by heuristics, e.g., tools that suggest intuitive approximations rather than aiming to present exhaustive comparisons of all possible outcomes or fully automated decisions.

    Methods for Joint Normalization and Comparison of Hi-C data

    The development of chromatin conformation capture technology has opened new avenues of study into the 3D structure and function of the genome. Chromatin structure is known to influence gene regulation, and differences in structure are now emerging as a mechanism of regulation, e.g., between stages of cell differentiation or between diseased and normal states. Hi-C sequencing technology now provides a way to study the 3D interactions of chromatin over the whole genome. However, like all sequencing technologies, Hi-C suffers from several forms of bias stemming from both the technology and the DNA sequence itself. Several methods have been developed for normalizing individual Hi-C datasets, but little work has been done on developing joint normalization methods for comparing two or more Hi-C datasets. To make full use of Hi-C data, joint normalization and statistical comparison techniques are needed to carry out experiments that identify regions where chromatin structure differs between conditions. We develop methods for the joint normalization and comparison of two Hi-C datasets, which we then extend to more complex experimental designs. Our normalization method is novel in that it makes use of the distance-dependent nature of chromatin interactions. Our modification of the Minus vs. Average (MA) plot to the Minus vs. Distance (MD) plot allows for a nonparametric, data-driven normalization technique using loess smoothing. Additionally, we present a simple statistical method using Z-scores for detecting differentially interacting regions between two datasets. Our initial method was published as the Bioconductor R package HiCcompare [http://bioconductor.org/packages/HiCcompare/](http://bioconductor.org/packages/HiCcompare/). We then further extended our normalization and comparison method for use in complex Hi-C experiments with more than two datasets and optional covariates.
We extended the normalization method to jointly normalize any number of Hi-C datasets by using a cyclic loess procedure on the MD plot. The cyclic loess normalization technique can remove between-dataset biases efficiently and effectively, even when several datasets are analyzed at one time. Our comparison method implements a generalized linear model-based approach for comparing complex Hi-C experiments, which may have more than two groups and additional covariates. The extended methods are also available as a Bioconductor R package [http://bioconductor.org/packages/multiHiCcompare/](http://bioconductor.org/packages/multiHiCcompare/). Finally, we demonstrate the use of HiCcompare and multiHiCcompare in several test cases on real data, in addition to comparing them to other similar methods (https://doi.org/10.1002/cpbi.76).
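As a rough illustration of the MD-plot idea described in this abstract (not the packages' actual implementation, which is in R), the following Python sketch builds a Minus vs. Distance comparison from two toy Hi-C interaction-frequency vectors, removes the distance-dependent bias with a lowess fit, and flags candidate differential interactions with Z-scores. All data and parameter choices here are invented for the example.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(0)

# Toy interaction frequencies for two Hi-C datasets at various genomic distances.
D = rng.integers(1, 100, size=500).astype(float)  # interaction distance (bins apart)
IF1 = rng.poisson(50, size=500) + 1.0
# Second dataset carries a simulated distance-dependent bias plus noise.
IF2 = IF1 * 2 ** (0.01 * D) * rng.lognormal(0, 0.1, size=500)

M = np.log2(IF2 / IF1)  # "Minus": log ratio of the two datasets

# Lowess fit of M on D (the MD plot) captures the distance-dependent bias.
fit = lowess(M, D, frac=0.3, return_sorted=False)

# Split the correction between the two datasets (multiplicative on the raw scale),
# so that the normalized log ratio is M - fit.
IF1_norm = IF1 * 2 ** (fit / 2)
IF2_norm = IF2 / 2 ** (fit / 2)
M_norm = np.log2(IF2_norm / IF1_norm)

# Simple Z-score screen for differentially interacting regions.
Z = (M_norm - M_norm.mean()) / M_norm.std()
diff = np.abs(Z) > 2  # candidate differential interactions
```

The key design point mirrored here is that the bias is modeled as a smooth function of interaction distance rather than of average signal, which is what distinguishes the MD plot from the classic MA plot.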

    Global estimation of child mortality using a Bayesian B-spline Bias-reduction model

    Estimates of the under-five mortality rate (U5MR) are used to track progress in reducing child mortality and to evaluate countries' performance related to Millennium Development Goal 4. However, for the great majority of developing countries without well-functioning vital registration systems, estimating the U5MR is challenging due to limited data availability and data quality issues. We describe a Bayesian penalized B-spline regression model for assessing levels and trends in the U5MR for all countries in the world, whereby biases in data series are estimated through the inclusion of a multilevel model to improve upon the limitations of current methods. B-spline smoothing parameters are also estimated through a multilevel model. Improved spline extrapolations are obtained through logarithmic pooling of the posterior predictive distribution of country-specific changes in spline coefficients with observed changes on the global level. The proposed model is able to flexibly capture changes in U5MR over time, gives point estimates and credible intervals reflecting potential biases in data series, and performs reasonably well in out-of-sample validation exercises. It has been adopted by the United Nations Inter-agency Group for Child Mortality Estimation to generate estimates for all member countries. Published in the Annals of Applied Statistics (http://dx.doi.org/10.1214/14-AOAS768) by the Institute of Mathematical Statistics (http://www.imstat.org/aoas/).
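The penalized B-spline regression at the core of this model can be sketched in miniature. The following Python example fits a generic P-spline (cubic B-spline basis with a second-order difference penalty on the coefficients), not the paper's full Bayesian multilevel model; the toy mortality-like data, knot placement, and penalty weight are all invented for illustration.

```python
import numpy as np
from scipy.interpolate import BSpline

rng = np.random.default_rng(1)

# Toy U5MR-like series: noisy observations of a smooth decline over time.
t = np.linspace(1990, 2015, 60)
y = 90 * np.exp(-0.03 * (t - 1990)) + rng.normal(0, 3, size=t.size)

# Cubic B-spline basis on equally spaced knots (endpoints repeated k times).
k = 3
n_basis = 12
knots = np.r_[[t[0]] * k, np.linspace(t[0], t[-1], n_basis - k + 1), [t[-1]] * k]
B = BSpline.design_matrix(t, knots, k).toarray()  # shape (60, n_basis)

# P-spline: a second-order difference penalty shrinks the fit toward a line,
# with lam controlling the amount of smoothing.
D2 = np.diff(np.eye(n_basis), n=2, axis=0)
lam = 10.0
coef = np.linalg.solve(B.T @ B + lam * D2.T @ D2, B.T @ y)
fitted = B @ coef
```

In the paper's setting the analogue of `lam` is not fixed by hand but estimated through a multilevel model, and the coefficients carry full posterior distributions rather than point estimates.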

    Implications to the Audit Process of Auditing that uses Data Analytics Tools and New Business Models

    Paper II is excluded from the dissertation until it is published. New advances in information technology have created a wave of technological innovations that affect audit firms. Audit firms are now investing large sums of money to acquire and adopt data analytics tools. In the three studies of this dissertation, I investigate questions relating to the impact of digital tools on the audit process. These studies are briefly summarized below. The first study investigates whether the audit evidence from a process mining tool provides information that adds to the appropriateness (relevance) of the audit evidence collected by traditional analytical procedures. The results show that auditors do perceive evidence from a process mining tool to provide information that is relevant for both the planning and substantive stages of the audit, even though the auditors' risk assessment was higher in the substantive stage than in the planning stage. In addition, the results show that there was no difference in the auditors' assessment of the relevance of information presented in graph format versus written text format; both are considered equally relevant in the planning and substantive stages. The second study investigates the unintended consequences for auditors' decision making of using digital tools with powerful visualization abilities in the audit process. Specifically, the study investigates whether auditors, when given both visual and textual audit evidence, base their decisions on the relevance of the information to the decision at hand or on a presentation bias. The results show that when auditors are presented with different pieces of information in different formats (visual or text), they are most likely to rely on the information presented visually rather than on the piece of audit evidence that is relevant to the decision.
The third paper analyses the fraud case of the financial technology company Wirecard, using the fraud triangle as the theoretical framework. The results show that, of the three factors identified in the fraud triangle, opportunity was the most prevalent and rationalization the least observable.

    Crisis Analytics: Big Data Driven Crisis Response

    Disasters have long been a scourge for humanity. With advances in technology (in computing, communications, and the ability to process and analyze big data), our ability to respond to disasters is at an inflection point. There is great optimism that big data tools can be leveraged to process the large amounts of crisis-related data (user-generated data in addition to traditional humanitarian data) to provide insight into fast-changing situations and help drive an effective disaster response. This article introduces the history and the future of big crisis data analytics, along with a discussion of its promise, challenges, and pitfalls.

    Reasoning Under Uncertainty: Towards Collaborative Interactive Machine Learning

    In this paper, we present the current state of the art in decision making (DM) and machine learning (ML) and bridge the two research domains to create an integrated approach to complex problem solving based on human and computational agents. We present a novel classification of ML, emphasizing the human-in-the-loop in interactive ML (iML) and more specifically collaborative interactive ML (ciML), which we understand as a deeply integrated version of iML where humans and algorithms work hand in hand to solve complex problems. Both humans and computers have specific strengths and weaknesses, and integrating humans into machine learning processes may be a very efficient way of tackling problems. This approach bears immense research potential for various domains, e.g., health informatics or industrial applications. We outline open questions and name future challenges that must be addressed by the research community to enable the use of collaborative interactive machine learning for problem solving at large scale.