
    t-Exponential Memory Networks for Question-Answering Machines

    Recent advances in deep learning have brought to the fore models that can perform multiple computational steps in the service of completing a task; these are capable of describing long-term dependencies in sequential data. Novel recurrent attention models over possibly large external memory modules constitute the core mechanisms that enable these capabilities. Our work addresses learning subtler and more complex underlying temporal dynamics in language modeling tasks that deal with sparse sequential data. To this end, we improve upon these recent advances by adopting concepts from the field of Bayesian statistics, namely variational inference. Our proposed approach consists of treating the network parameters as latent variables with a prior distribution imposed over them. Our statistical assumptions go beyond the standard practice of postulating Gaussian priors. Indeed, to allow for handling outliers, which are prevalent in long observed sequences of multivariate data, multivariate t-exponential distributions are imposed. On this basis, we proceed to infer the corresponding posteriors; these can be used for inference and prediction at test time, in a way that accounts for the uncertainty in the available sparse training data. Specifically, to allow our approach to best exploit the merits of the t-exponential family, our method considers a new t-divergence measure, which generalizes the concept of the Kullback-Leibler divergence. We perform an extensive experimental evaluation of our approach using challenging language modeling benchmarks, and illustrate its superiority over existing state-of-the-art techniques.
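
The abstract's central technical object is the t-divergence. As a reference point, here is a sketch of the standard definitions from the t-exponential family literature (the symbols log_t, exp_t, and the escort distribution are common usage; the paper's exact formulation may differ):

```latex
% t-logarithm and t-deformed exponential; both reduce to log and exp as t -> 1
\log_t(x) = \frac{x^{1-t} - 1}{1 - t},
\qquad
\exp_t(x) = \bigl[\, 1 + (1 - t)\,x \,\bigr]_{+}^{1/(1-t)}

% t-divergence between densities p and q, defined through the escort
% distribution \tilde{p} of p; recovers the KL divergence as t -> 1
D_t(p \,\|\, q) = \int \tilde{p}(x)\,\bigl(\log_t p(x) - \log_t q(x)\bigr)\,dx,
\qquad
\tilde{p}(x) = \frac{p(x)^t}{\int p(y)^t \, dy}
```

Variational inference would then minimize D_t between the approximate and true posteriors over the network parameters, in place of the usual KL objective.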

    Methods of Technical Prognostics Applicable to Embedded Systems

    The main aim of the thesis is to provide a comprehensive overview of technical prognostics, which is applied in condition-based maintenance: maintenance based on continuous device monitoring and estimation of a system's level of degradation or remaining useful life, especially in the field of complex equipment and machinery. Technical diagnostics is by now fairly well mapped and deployed in real systems; technical prognostics, in contrast, is still an evolving discipline with a limited number of real applications, and not all of its methods are sufficiently accurate or applicable to embedded systems. The thesis provides an overview of the basic methods applicable to predicting remaining useful life, along with metrics for comparing the different approaches both in terms of accuracy and in terms of computational/deployment cost. One of the research cores consists of recommendations and a guide for selecting an appropriate prognostic method with regard to prognostic criteria. The second research core describes the particle filtering framework suitable for model-based prognostics, together with a verification of its implementation and a comparison of approaches. The main research core of the thesis is a case study on the highly topical problem of Li-Ion battery health monitoring and prognostics under continuous monitoring. The case study demonstrates the model-based prognostic process, compares possible approaches for estimating both the remaining runtime before discharge and the capacity fade, and examines factors influencing battery degradation. The thesis further includes a basic verification of the Li-Ion battery model and a proposed prognostic process, and the methodology is verified on real measured data.
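
Since the thesis centers on particle filtering for model-based prognostics, a minimal sketch may help fix ideas. The exponential capacity-fade model, the noise levels, and the end-of-life threshold below are illustrative assumptions, not the thesis's verified battery model:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000                                # number of particles

# State: per-particle capacity estimate and degradation rate
capacity = rng.normal(2.0, 0.05, N)     # Ah, initial belief
lam      = rng.normal(1e-3, 2e-4, N)    # per-cycle fade rate

def step(capacity, lam, measurement, meas_std=0.02, proc_std=1e-4):
    # 1. Propagate: apply the degradation model with process noise
    lam = lam + rng.normal(0.0, proc_std, lam.size)
    capacity = capacity * np.exp(-lam)
    # 2. Weight: likelihood of the observed capacity under Gaussian noise
    w = np.exp(-0.5 * ((measurement - capacity) / meas_std) ** 2)
    w /= w.sum()
    # 3. Resample: multinomial resampling keeps high-likelihood particles
    idx = rng.choice(capacity.size, size=capacity.size, p=w)
    return capacity[idx], lam[idx]

def remaining_useful_life(capacity, lam, eol=1.6, max_cycles=5000):
    # Propagate each particle forward (no further measurements) until it
    # crosses the end-of-life threshold; the spread of crossing cycles
    # quantifies the RUL uncertainty.
    rul = np.full(capacity.size, max_cycles)
    c = capacity.copy()
    for k in range(max_cycles):
        c = c * np.exp(-lam)
        hit = (c <= eol) & (rul == max_cycles)
        rul[hit] = k
    return rul

# Assimilate a few synthetic capacity measurements, then predict RUL
for z in [1.99, 1.97, 1.96, 1.94]:
    capacity, lam = step(capacity, lam, z)
rul = remaining_useful_life(capacity, lam)
print(f"median RUL: {np.median(rul):.0f} cycles, "
      f"90% interval: [{np.percentile(rul, 5):.0f}, {np.percentile(rul, 95):.0f}]")
```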

    Input Prioritization for Testing Neural Networks

    Deep neural networks (DNNs) are increasingly being adopted for sensing and control functions in a variety of safety- and mission-critical systems such as self-driving cars, autonomous air vehicles, medical diagnostics, and industrial robotics. Failures of such systems can lead to loss of life or property, which necessitates stringent verification and validation to provide high assurance. Though formal verification approaches are being investigated, testing remains the primary technique for assessing the dependability of such systems. Due to the nature of the tasks handled by DNNs, the cost of obtaining test oracle data (the expected output, i.e., the label, for a given input) is high, which significantly impacts the amount and quality of testing that can be performed. Thus, prioritizing input data for testing DNNs in meaningful ways to reduce the cost of labeling can go a long way toward increasing testing efficacy. This paper proposes using gauges of the DNN's sentiment, derived from the computation performed by the model, as a means to identify inputs that are likely to reveal weaknesses. We empirically assessed three such sentiment measures for prioritization (confidence, uncertainty, and surprise) and compared their effectiveness in terms of fault-revealing capability and retraining benefit. The results indicate that sentiment measures can effectively flag inputs that expose unacceptable DNN behavior. For MNIST models, the average percentage of inputs correctly flagged ranged from 88% to 94.8%.
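
To make the prioritization idea concrete, here is a minimal sketch that ranks unlabeled inputs by two of the named measures, confidence and uncertainty, computed from a model's output logits (the paper's exact definitions, and its surprise measure, may differ):

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def prioritize(logits, measure="uncertainty"):
    """Return test-input indices, most suspicious first."""
    p = softmax(logits)
    if measure == "confidence":
        # Low maximum class probability -> model is unsure -> label first
        score = 1.0 - p.max(axis=1)
    elif measure == "uncertainty":
        # Predictive entropy: high entropy -> label first
        score = -(p * np.log(p + 1e-12)).sum(axis=1)
    else:
        raise ValueError(measure)
    return np.argsort(-score)

# Toy usage with random logits standing in for a DNN's outputs
logits = np.random.default_rng(1).normal(size=(100, 10))
order = prioritize(logits, "uncertainty")
print("label these inputs first:", order[:10])
```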

    Basic Enhancement Strategies When Using Bayesian Optimization for Hyperparameter Tuning of Deep Neural Networks

    Compared to traditional machine learning models, deep neural networks (DNNs) are known to be highly sensitive to the choice of hyperparameters. While the time and effort required for manual tuning has been decreasing rapidly for well-developed and commonly used DNN architectures, DNN hyperparameter optimization will undoubtedly continue to be a major burden whenever a new DNN architecture needs to be designed, a new task needs to be solved, a new dataset needs to be addressed, or an existing DNN needs to be improved further. For hyperparameter optimization of general machine learning problems, numerous automated solutions have been developed, some of the most popular of which are based on Bayesian Optimization (BO). In this work, we analyze four fundamental strategies for enhancing BO when it is used for DNN hyperparameter optimization: diversification, early termination, parallelization, and cost function transformation. Based on the analysis, we provide a simple yet robust algorithm for DNN hyperparameter optimization, DEEP-BO (Diversified, Early-termination-Enabled, and Parallel Bayesian Optimization). When evaluated over six DNN benchmarks, DEEP-BO mostly outperformed well-known solutions including GP-Hedge, BOHB, and the speed-up variants that use the Median Stopping Rule or Learning Curve Extrapolation. In fact, DEEP-BO consistently provided the top, or close to the top, performance over all the benchmark types we tested, indicating that it is a robust solution compared to the existing ones. The DEEP-BO code is publicly available at https://github.com/snu-adsl/DEEP-BO.
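
As one concrete illustration of the early-termination strategy, below is a sketch of a median stopping rule: a training run is abandoned when its best metric so far is worse than the median of earlier runs' metrics at the same epoch. The class and method names are hypothetical, not the DEEP-BO API:

```python
import numpy as np

class MedianStopper:
    def __init__(self):
        self.histories = []          # metric curves of finished runs

    def should_stop(self, curve):
        """curve: metrics of the current run up to the current epoch
        (lower is better, e.g. validation loss)."""
        epoch = len(curve) - 1
        peers = [h[epoch] for h in self.histories if len(h) > epoch]
        if len(peers) < 3:           # too few completed runs to judge
            return False
        # Kill the run if even its best value so far is worse than
        # the median peer value at this epoch.
        return min(curve) > np.median(peers)

    def report(self, curve):
        self.histories.append(list(curve))
```

Diversification (hedging across acquisition functions) and parallelization then layer on top of the same BO loop, and cost transformation amounts to optimizing a monotone transform (e.g. a log) of the objective.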

    Inferring Regulatory Networks by Combining Perturbation Screens and Steady State Gene Expression Profiles

    Reconstructing transcriptional regulatory networks is an important task in functional genomics. Data obtained from experiments that perturb genes by knockouts or RNA interference contain useful information for addressing this reconstruction problem. However, such data can be limited in size and/or expensive to acquire. On the other hand, observational data of the organism in steady state (e.g. wild-type) are more readily available, but their informational content is inadequate for the task at hand. We develop a computational approach that appropriately utilizes both data sources for estimating a regulatory network. The proposed approach is based on a three-step algorithm to estimate the underlying directed but cyclic network, which uses as input both perturbation screens and steady state gene expression data. In the first step, the algorithm determines causal orderings of the genes that are consistent with the perturbation data, by combining an exhaustive search method with a fast heuristic that in turn couples a Monte Carlo technique with a fast search algorithm. In the second step, for each obtained causal ordering, a regulatory network is estimated using a penalized likelihood based method, while in the third step a consensus network is constructed from the highest-scoring ones. Extensive computational experiments show that the algorithm performs well in reconstructing the underlying network and clearly outperforms competing approaches that rely only on a single data source. Further, it is established that the algorithm produces a consistent estimate of the regulatory network.
    Comment: 24 pages, 4 figures, 6 tables
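
To illustrate how steps two and three might look, here is a simplified sketch under stated assumptions: given a candidate causal ordering, each gene's regulators are estimated by an L1-penalized linear regression on its predecessors, and a consensus network keeps edges supported by a majority of the top-scoring orderings. The paper's penalized-likelihood estimator for cyclic networks is more involved than this linear lasso:

```python
import numpy as np
from sklearn.linear_model import Lasso

def network_for_ordering(X, ordering, alpha=0.1):
    """X: (samples, genes) steady-state expression matrix.
    ordering: permutation of gene indices, causes before effects.
    Returns a (genes, genes) weighted adjacency matrix A with
    A[i, j] = estimated effect of gene i on gene j."""
    n_genes = X.shape[1]
    A = np.zeros((n_genes, n_genes))
    for pos, g in enumerate(ordering):
        parents = list(ordering[:pos])
        if not parents:
            continue                       # first gene has no predecessors
        fit = Lasso(alpha=alpha).fit(X[:, parents], X[:, g])
        A[parents, g] = fit.coef_          # sparse regulator weights
    return A

def consensus(adjacencies, threshold=0.5):
    # Step three: keep edges present in at least `threshold` of the
    # networks estimated from the highest-scoring orderings.
    votes = np.mean([np.abs(A) > 1e-8 for A in adjacencies], axis=0)
    return votes >= threshold
```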

    Learning from accidents : machine learning for safety at railway stations

    In railway systems, station safety is a critical aspect of the overall structure, and yet accidents at stations still occur. It is time to learn from these errors and improve conventional methods by utilizing the latest technology, such as machine learning (ML), to analyse accidents and enhance safety systems. ML has been employed in many fields, including engineering systems, and it interacts with us throughout our daily lives. Thus, we must consider the available technology in general, and ML in particular, in the context of safety in the railway industry. This paper explores the use of the decision tree (DT) method for safety classification and the analysis of accidents at railway stations, to predict the traits of passengers affected by accidents. The critical contribution of this study is the presentation of ML and an explanation of how this technique is applied to ensure safety, utilize automated processes, and gain the benefits of this powerful technology. To apply and explore this method, a case study has been selected that focuses on fatalities caused by accidents at railway stations. An analysis of some of these fatal accidents, as reported by the Rail Safety and Standards Board (RSSB), is performed and presented in this paper to provide a broader summary of the application of supervised ML for improving safety at railway stations. Finally, this research shows the vast potential of the innovative application of ML in safety analysis for the railway industry.
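
As a schematic of the decision tree approach, the sketch below trains a classifier to predict fatal outcomes from coded accident attributes. The feature names and records are invented stand-ins, not RSSB data:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy accident records; each row is one incident
df = pd.DataFrame({
    "age_band":      [1, 2, 3, 2, 1, 3, 2, 3],   # coded passenger age band
    "hour":          [8, 23, 17, 7, 22, 18, 9, 0],
    "platform_edge": [1, 1, 0, 0, 1, 1, 0, 1],   # incident at platform edge?
    "fatal":         [0, 1, 0, 0, 1, 1, 0, 1],   # outcome to predict
})

X, y = df.drop(columns="fatal"), df["fatal"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
print(export_text(tree, feature_names=list(X.columns)))  # human-readable rules
print("held-out accuracy:", tree.score(X_te, y_te))
```

The printed rule text is what makes the DT attractive for safety work: each path from root to leaf is an auditable if-then rule about accident conditions.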