
    Inferring individual attributes from search engine queries and auxiliary information

    Internet data has surfaced as a primary source for investigation of different aspects of human behavior. A crucial step in such studies is finding a suitable cohort (i.e., a set of users) that shares a common trait of interest to researchers. However, direct identification of users sharing this trait is often impossible, as the data available to researchers is usually anonymized to preserve user privacy. To facilitate research on specific topics of interest, especially in medicine, we introduce an algorithm for identifying a trait of interest in anonymous users. We illustrate how a small set of labeled examples, together with statistical information about the entire population, can be aggregated to obtain labels on unseen examples. We validate our approach using labeled data from the political domain. We provide two applications of the proposed algorithm to the medical domain. In the first, we demonstrate how to identify users whose search patterns indicate they might be suffering from certain types of cancer. In the second, we detail an algorithm to predict the distribution of diseases given their incidence in a subset of the population under study, making it possible to predict disease spread from partial epidemiological data.

    AladynPi Adaptive Neural Network Molecular Dynamics Simulation Code with Physically Informed Potential: Computational Materials Mini-Application

    This report provides an overview and description of commands used in the Computational Materials mini-application, AladynPi. AladynPi is an extension of a previously released mini-application, Aladyn (https://github.com/nasa/aladyn; Yamakov, V.I., and Glaessgen, E.H., NASA/TM-2018-220104). Aladyn and AladynPi are basic molecular dynamics codes written in FORTRAN 2003, which are designed to demonstrate the use of adaptive neural networks (ANNs) in atomistic simulations. The role of ANNs is to efficiently reproduce the very complex energy landscape resulting from the atomic interactions in materials with the accuracy of the more expensive quantum mechanics-based calculations. The ANN is trained on a large set of atomic structures calculated using the density functional theory method. An input for the ANN is a set of structure coefficients, characterizing the local atomic environment of each atom, for which the atomic energy is obtained in the ANN inference process. In Aladyn, the ANN directly gives the energy of interatomic interactions. In AladynPi, the ANN gives optimized parameters for a predefined empirical function, known as bond-order-potential (BOP). The parameterized BOP function is then used to calculate the energy. AladynPi code is being released to serve as a training testbed for students and professors in academia to explore possible optimization algorithms for parallel computing on multicore central processing unit (CPU) computers or computers utilizing manycore architectures based on graphics processing units (GPUs). The effort is supported by the High Performance Computing incubator (HPCi) project at NASA Langley Research Center.

    Soft computing for intelligent data analysis

    Intelligent data analysis (IDA) is an interdisciplinary study concerned with the effective analysis of data. The paper briefly looks at some of the key issues in intelligent data analysis, discusses the opportunities for soft computing in this context, and presents several IDA case studies in which soft computing has played key roles. These studies are all concerned with complex real-world problem solving, including consistency checking between mass spectral data and proposed chemical structures, screening for glaucoma and other eye diseases, forecasting of visual field deterioration, and diagnosis in an oil refinery involving multivariate time series. Bayesian networks, evolutionary computation, neural networks, and machine learning in general are among the soft computing techniques used effectively in these studies.

    The Incremental Cooperative Design of Preventive Healthcare Networks

    This document is the Accepted Manuscript version of the following article: Soheil Davari, 'The incremental cooperative design of preventive healthcare networks', Annals of Operations Research, first published online 27 June 2017. Under embargo. Embargo end date: 27 June 2018. The final publication is available at Springer via http://dx.doi.org/10.1007/s10479-017-2569-1. In the Preventive Healthcare Network Design Problem (PHNDP), one seeks to locate facilities in a way that the uptake of services is maximised given certain constraints such as congestion considerations. We introduce the incremental and cooperative version of the problem, IC-PHNDP for short, in which facilities are added incrementally to the network (one at a time), contributing to the service levels. We first develop a general non-linear model of this problem and then present a method to make it linear. As the problem is of a combinatorial nature, an efficient Variable Neighbourhood Search (VNS) algorithm is proposed to solve it. In order to gain insight into the problem, the computational studies were performed with randomly generated instances of different settings. Results clearly show that VNS performs well in solving IC-PHNDP with errors not more than 1.54%. Peer reviewed.
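
    For readers unfamiliar with Variable Neighbourhood Search, the general scheme (shake the incumbent in the k-th neighbourhood, run a local search, then either accept the result and reset k or widen the neighbourhood) can be sketched as follows. This is a minimal illustration, not the article's algorithm: the one-dimensional `coverage` objective, the swap neighbourhoods, and all names and parameter values are assumptions made for the example, and the congestion and incremental features of IC-PHNDP are not modelled.

```python
import random

def coverage(open_sites, demand, radius):
    """Toy objective (assumed): total demand weight within `radius` of an open site."""
    return sum(w for x, w in demand if any(abs(x - s) <= radius for s in open_sites))

def local_search(sol, candidates, objective):
    """Best-improvement 1-swap: repeat until no single swap raises the objective."""
    sol = set(sol)
    improved = True
    while improved:
        improved = False
        best, best_val = sol, objective(sol)
        for out_site in sorted(sol):
            for in_site in sorted(candidates - sol):
                trial = (sol - {out_site}) | {in_site}
                val = objective(trial)
                if val > best_val:
                    best, best_val, improved = trial, val, True
        sol = best
    return sol

def shake(sol, candidates, k, rng):
    """k-th neighbourhood: swap k random open sites for k random closed ones."""
    sol = set(sol)
    k = min(k, len(sol), len(candidates - sol))
    removed = rng.sample(sorted(sol), k)
    added = rng.sample(sorted(candidates - sol), k)
    return (sol - set(removed)) | set(added)

def vns(candidates, p, objective, k_max=3, iterations=20, seed=0):
    """Basic VNS choosing p sites from `candidates` to maximise `objective`."""
    rng = random.Random(seed)
    best = local_search(rng.sample(sorted(candidates), p), candidates, objective)
    for _ in range(iterations):
        k = 1
        while k <= k_max:
            trial = local_search(shake(best, candidates, k, rng), candidates, objective)
            if objective(trial) > objective(best):
                best, k = trial, 1   # improvement: move and restart from k = 1
            else:
                k += 1               # no improvement: try a wider neighbourhood
    return best

# Hypothetical instance: candidate sites and (location, weight) demand points.
candidates = {0, 2, 4, 6, 8, 10}
demand = [(0, 1), (1, 2), (5, 5), (6, 1), (9, 3), (10, 4)]
best = vns(candidates, p=2, objective=lambda s: coverage(s, demand, radius=1))
print(sorted(best), coverage(best, demand, radius=1))
```

    On an instance this small the local search alone already reaches the best pair of sites; the shaking step matters on larger instances where 1-swap local search can stall in a local optimum.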

    Probabilities and health risks: a qualitative approach

    Health risks, defined in terms of the probability that an individual will suffer a particular type of adverse health event within a given time period, can be understood as referencing either natural entities or complex patterns of belief which incorporate the observer's values and knowledge; the present paper adopts the latter position. The subjectivity inherent in judgements about adversity and time frames can be easily recognised, but social scientists have tended to accept uncritically the objectivity of probability. Most commonly in health risk analysis, the term probability refers to rates established by induction, and so requires the definition of a numerator and denominator. Depending upon their specification, many probabilities may be reasonably postulated for the same event, and individuals may change their risks by deciding to seek or avoid information. These apparent absurdities can be understood if probability is conceptualised as the projection of expectation onto the external world. Probabilities based on induction from observed frequencies provide glimpses of the future at the price of acceptance of the simplifying heuristic that statistics derived from aggregate groups can be validly attributed to individuals within them. The paper illustrates four implications of this conceptualisation of probability with qualitative data from a variety of sources, particularly a study of genetic counselling for pregnant women in a U.K. hospital. Firstly, the official selection of a specific probability heuristic reflects organisational constraints and values as well as predictive optimisation. Secondly, professionals and service users must work to maintain the facticity of an established heuristic in the face of alternatives. Thirdly, individuals, both lay and professional, manage probabilistic information in ways which support their strategic objectives. Fourthly, predictively sub-optimum schemata, for example the idea of AIDS as a gay plague, may be selected because they match prevailing social value systems.

    Fast search of sequences with complex symbol correlations using profile context-sensitive HMMs and pre-screening filters

    Recently, profile context-sensitive HMMs (profile-csHMMs) have been proposed which are very effective in modeling the common patterns and motifs in related symbol sequences. Profile-csHMMs are capable of representing long-range correlations between distant symbols, even when these correlations are entangled in a complicated manner. This makes profile-csHMMs a useful tool in computational biology, especially in modeling noncoding RNAs (ncRNAs) and finding new ncRNA genes. However, a profile-csHMM based search is quite slow, hence not practical for searching a large database. In this paper, we propose a practical scheme for making the search speed significantly faster without any degradation in the prediction accuracy. The proposed method utilizes a pre-screening filter based on a profile-HMM, which filters out most sequences that will not be predicted as a match by the original profile-csHMM. Experimental results show that the proposed approach can make the search speed eighty times faster.
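
    The pre-screening idea can be illustrated with a generic two-stage search: a cheap score discards most candidates, and the expensive model runs only on the survivors. The sketch below is an assumption-laden toy, not the paper's method: `composition_bound` and `stem_score` are hypothetical stand-ins for the profile-HMM filter and the profile-csHMM scorer, chosen so that the cheap score is never lower than the expensive one. With a shared threshold, this toy filter therefore cannot discard a sequence the expensive scorer would accept.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def stem_score(seq):
    """'Expensive' toy scorer: fraction of mirrored positions (i, n-1-i) that base-pair.
    Assumes len(seq) >= 2."""
    half = len(seq) // 2
    return sum((seq[i], seq[-1 - i]) in PAIRS for i in range(half)) / half

def composition_bound(seq):
    """'Cheap' toy filter: upper bound on stem_score from base counts alone.
    Each pair consumes one A and one U, or one G and one C."""
    half = len(seq) // 2
    pairable = min(seq.count("A"), seq.count("U")) + min(seq.count("G"), seq.count("C"))
    return min(pairable, half) / half

def prescreened_search(sequences, cheap_score, expensive_score, threshold):
    """Run expensive_score only on sequences whose cheap_score reaches the threshold."""
    hits, expensive_calls = [], 0
    for seq in sequences:
        if cheap_score(seq) < threshold:
            continue                      # filtered out by the pre-screen
        expensive_calls += 1
        if expensive_score(seq) >= threshold:
            hits.append(seq)
    return hits, expensive_calls

# Hypothetical database of short RNA-like strings.
database = ["GGCAUGCC", "AAAAUUUU", "GGGGGGGG", "AAAAAAAA", "AUGCAUGC", "GGGGAAAA"]
hits, expensive_calls = prescreened_search(database, composition_bound, stem_score, 0.75)
brute_force = [s for s in database if stem_score(s) >= 0.75]
print(hits, expensive_calls, hits == brute_force)
```

    In this toy database the filter removes three of the six sequences before the expensive scorer is called, and the hit list is identical to a brute-force scan.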

    The Effects of Total Sleep Deprivation on Bayesian Updating

    Recent evidence suggests that nearly 25% of U.S. adults (47 million) suffer from some level of sleep deprivation. The impact of this sleep deprivation on the U.S. economy includes direct medical expenses related to sleep deprivation and related disorders, the cost of accidents, and the cost of reduced worker productivity. Sleep research has examined the effects of sleep deprivation on a number of performance measures, but the effects of sleep deprivation on decision-making under uncertainty are largely unknown. In this article, subjects perform a decision task (Grether, 1980) in both a well-rested and experimentally sleep-deprived state. The experimental task allows us to explore the extent to which subjects weight prior odds versus new evidence (i.e., information) when forming subjective (posterior) beliefs of a particular event. Well-rested subjects display a tendency to overweight the evidence in forming subjective posterior probability estimates, which is inconsistent with Bayes' rule but possibly consistent with use of a ‘representativeness’ heuristic. In his original Bayes' rule experiment, Grether (1980) also found that typical student-subjects overweighted the evidence relative to the prior odds in making posterior assessments. Ironically, behavior following sleep deprivation is more consistent with the use of Bayes' rule, because this treatment significantly reduces the (over)weight that subjects place on the new evidence. Because choice accuracy is not significantly affected by sleep deprivation, the significant difference in estimated decision-model parameters may indicate that the brain compensates under adversity in certain risky choice decision environments.
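
    The contrast between Bayes' rule and overweighting the evidence can be made concrete with a weighted log-odds update, in which the log likelihood ratio and the log prior odds each carry their own weight and Bayes' rule is the special case where both weights equal 1. The sketch below assumes this functional form for illustration; the function name, the probabilities, and the weight of 2.0 are hypothetical values, not estimates from the article.

```python
import math

def posterior_prob(prior, lik_a, lik_b, w_evidence=1.0, w_prior=1.0):
    """P(A | evidence) under weighted log-odds updating.

    prior: P(A); lik_a: P(evidence | A); lik_b: P(evidence | B).
    w_evidence = w_prior = 1 reproduces Bayes' rule exactly.
    """
    log_odds = (w_evidence * math.log(lik_a / lik_b)
                + w_prior * math.log(prior / (1.0 - prior)))
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical task: prior P(A) = 1/3, evidence 1.5 times likelier under A than B.
bayes = posterior_prob(1 / 3, 0.6, 0.4)
overweighted = posterior_prob(1 / 3, 0.6, 0.4, w_evidence=2.0)
print(round(bayes, 4), round(overweighted, 4))
```

    With these illustrative numbers the Bayesian posterior is 3/7, while doubling the weight on the evidence pushes the reported posterior to 9/17, above one half, even though the prior favours B.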