6 research outputs found

    Distance-based Analysis of Machine Learning Prediction Reliability for Datasets in Materials Science and Other Fields

    Despite successful use in a wide variety of disciplines for data analysis and prediction, machine learning (ML) methods suffer from a poor understanding of the reliability of their predictions, owing to the lack of transparency and black-box nature of ML models. In materials science and other fields, typical ML model results include a significant number of low-quality predictions. This problem is known to be particularly acute for target systems which differ significantly from the data used for ML model training. However, to date, a general method for characterizing the difference between the predicted and training systems has not been available. Here, we show that a simple metric based on Euclidean feature space distance and sampling density allows effective separation of the accurately predicted data points from data points with poor prediction accuracy. We show that the metric effectiveness is enhanced by the decorrelation of the features using Gram-Schmidt orthogonalization. To demonstrate the generality of the method, we apply it to support vector regression models for various small data sets in materials science and other fields. Our method is computationally simple, can be used with any ML method, and enables analysis of the sources of the ML prediction errors. Therefore, it is suitable for use as a standard technique for the estimation of ML prediction reliability for small data sets and as a tool for data set design.
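The abstract does not give the exact form of the metric, so the following is only a minimal sketch of the general idea, under stated assumptions: features are decorrelated with a QR decomposition (which matches Gram-Schmidt orthogonalization of the centered feature columns up to sign), distance is the mean Euclidean distance to the `k` nearest training points, and sampling density is the count of training points within a fixed `radius`. The function names and the parameters `k` and `radius` are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_decorrelation(X_train):
    """Return the feature means and the triangular factor R of the centered data.

    Assumes the centered feature matrix has full column rank.
    """
    mean = X_train.mean(axis=0)
    # QR of the centered matrix: the columns of Q are orthonormal and equal
    # the Gram-Schmidt result up to sign.
    _, R = np.linalg.qr(X_train - mean)
    return mean, R

def decorrelate(X, mean, R):
    """Map rows of X into the decorrelated basis: Z = (X - mean) @ inv(R)."""
    # Solve R.T @ Z.T = (X - mean).T instead of forming the inverse explicitly.
    return np.linalg.solve(R.T, (X - mean).T).T

def distance_and_density(X_train, x_query, k=3, radius=1.0):
    """Illustrative reliability indicators for one query point (assumed form)."""
    X_train = np.asarray(X_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)
    mean, R = fit_decorrelation(X_train)
    # Rescale so each decorrelated feature has unit mean square over the training set.
    scale = np.sqrt(len(X_train))
    Z_train = decorrelate(X_train, mean, R) * scale
    z_query = decorrelate(x_query[None, :], mean, R)[0] * scale
    dists = np.linalg.norm(Z_train - z_query, axis=1)
    mean_knn_distance = float(np.sort(dists)[:k].mean())
    local_density = int((dists <= radius).sum())
    return mean_knn_distance, local_density
```

In this sketch, a query point with a large nearest-neighbour distance and a low local density would be flagged as a likely low-reliability prediction; any thresholds would have to be chosen per data set.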

    Massive Multi-Agent Data-Driven Simulations of the GitHub Ecosystem

    Simulating and predicting planetary-scale techno-social systems poses heavy computational and modeling challenges. The DARPA SocialSim program set the challenge to model the evolution of GitHub, a large collaborative software-development ecosystem, using massive multi-agent simulations. We describe our best performing models and our agent-based simulation framework, which we are currently extending to allow simulating other planetary-scale techno-social systems. The challenge problem measured participants' ability, given 30 months of metadata on user activity on GitHub, to predict the next months' activity as measured by a broad range of metrics applied to ground truth, using agent-based simulation. The challenge required scaling to a simulation of roughly 3 million agents producing a combined 30 million actions, acting on 6 million repositories with commodity hardware. It was also important to use the data optimally to predict the agents' next moves. We describe the agent framework and the data analysis employed by one of the winning teams in the challenge. Six different agent models were tested based on a variety of machine learning and statistical methods. While no single method proved the most accurate on every metric, the most broadly successful agents sampled from a stationary probability distribution of actions and repositories for each agent. Two reasons for the success of these agents were their use of a distinct characterization of each agent, and that GitHub users change their behavior relatively slowly.
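The stationary-distribution agent mentioned in the abstract can be illustrated with a small sketch. This is not the authors' framework: it assumes the history is a list of `(agent, action, repository)` tuples, estimates each agent's distribution from empirical counts, and draws future events independently from it. The function names and the event format are hypothetical.

```python
import random
from collections import Counter, defaultdict

def fit_agent_distributions(events):
    """Count how often each agent performed each (action, repository) pair."""
    counts = defaultdict(Counter)
    for agent, action, repo in events:
        counts[agent][(action, repo)] += 1
    return counts

def simulate_agent(counts, agent, n_steps, rng):
    """Draw n_steps (action, repository) pairs from the agent's empirical distribution.

    The distribution is fixed over time (stationary); the agent must appear
    in the training events.
    """
    pairs = list(counts[agent].keys())
    weights = list(counts[agent].values())
    return rng.choices(pairs, weights=weights, k=n_steps)

# Toy history with hypothetical agents and repositories.
history = [
    ("alice", "push", "repo_a"),
    ("alice", "push", "repo_a"),
    ("alice", "fork", "repo_b"),
    ("bob", "watch", "repo_a"),
]
model = fit_agent_distributions(history)
predicted = simulate_agent(model, "alice", n_steps=5, rng=random.Random(0))
```

Because each agent keeps its own counts, the sketch reflects the two points the abstract credits for the approach's success: a distinct characterization of every agent and behavior that changes slowly enough for past frequencies to remain informative.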

    Prediction of tinnitus treatment outcomes based on EEG sensors and TFI score using deep learning

    Tinnitus is a hearing disorder that is characterized by the perception of sounds in the absence of an external source. Currently, there is no pharmaceutical cure for tinnitus; however, multiple therapies and interventions have been developed that improve or control associated distress and anxiety. We propose a new Artificial Intelligence (AI) algorithm as a digital prognostic health system that models electroencephalographic (EEG) data in order to predict patients’ responses to tinnitus therapies. The EEG data were collected from patients prior to treatment and 3 months following a sound-based therapy. Feature selection techniques were utilised to identify the EEG variables that predict outcomes with the best accuracy. The patients’ EEG features from both the frequency and functional connectivity domains were used as inputs to the AI algorithms for training and predicting therapy outcomes. The AI models differentiated the patients’ outcomes into either therapy responder or non-responder, as defined by their Tinnitus Functional Index (TFI) scores, with accuracies ranging from 98% to 100%. Our findings demonstrate the potential use of AI, including deep learning, for predicting therapy outcomes in tinnitus. The research suggests an optimal configuration of the EEG sensors that are involved in measuring brain functional changes in response to tinnitus treatments. It identified which EEG electrodes are the most informative sensors and how the EEG frequency and functional connectivity can better classify patients into the responder and non-responder groups. This has potential for real-time monitoring of patient therapy outcomes at home.

    Identifying Complex Adaptive Systems Using Quantitative Approaches At A Midsized Biotechnology Firm

    Rapid technological progress is becoming more challenging for organizations to implement and manage. The traditional, hierarchical leadership models are often inadequate to cope with continuous change, and the inability to keep up with the latest advances can quickly imperil a company. In particular, the field of biotechnology is currently experiencing revolutionary advances. Where hierarchical leadership models fall short, complexity theory and complexity leadership theory may provide an alternative leadership model for successful organizational adaptation. However, much of the research surrounding complexity theory remains academic. Historical data from a biotechnology company, covering a computer hardware and software upgrade, were analyzed to detect the presence of a complex adaptive system, the fundamental component of complexity. Results showed that after the upgrade, animal care technicians did not significantly increase their collective efficiency; instead, they appeared to significantly increase their collective accuracy. This might indicate that the animal care technicians acted as a complex adaptive system in response to an environmental change. Insights into aggregate employee behavior through the lens of complexity theory might be useful to leadership seeking to successfully implement organizational change. Additionally, the adoption of complexity leadership doctrines by management might help create enhanced conditions to cultivate increased innovation and growth.