4,382 research outputs found

    NASA-ONERA Collaboration on Human Factors in Aviation Accidents and Incidents

    This is the first annual report jointly prepared by NASA and ONERA on the work performed under the agreement to collaborate on a study of the human factors entailed in aviation accidents and incidents, particularly focused on the consequences of decreases in human performance associated with fatigue. The objective of this agreement is to generate reliable, automated procedures that improve understanding of the levels and characteristics of flight-crew fatigue factors whose confluence will likely result in unacceptable crew performance. This study entails the analysis of numerical and textual data collected during operational flights. NASA and ONERA are collaborating on the development and assessment of automated capabilities for extracting operationally significant information from very large, diverse (textual and numerical) databases, much larger than can be handled practically by human experts.

    First Annual Report: NASA-ONERA Collaboration on Human Factors in Aviation Accidents and Incidents

    This is the first annual report jointly prepared by NASA and ONERA on the work performed under the agreement to collaborate on a study of the human factors entailed in aviation accidents and incidents, particularly focused on the consequences of decreases in human performance associated with fatigue. The objective of this agreement is to generate reliable, automated procedures that improve understanding of the levels and characteristics of flight-crew fatigue factors whose confluence will likely result in unacceptable crew performance. This study entails the analysis of numerical and textual data collected during operational flights. NASA and ONERA are collaborating on the development and assessment of automated capabilities for extracting operationally significant information from very large, diverse (textual and numerical) databases, much larger than can be handled practically by human experts. This report presents the approach that is currently expected to be used in processing and analyzing the data for identifying decrements in aircraft performance and examining their relationships to decrements in crewmember performance due to fatigue. The decisions on the approach were based on samples of both the numerical and textual data that will be collected during the four studies planned under the Human Factors Monitoring Program (HFMP). Results of preliminary analyses of these sample data are presented in this report.

    Open Source Software Evolution and Its Dynamics

    This thesis undertakes an empirical study of software evolution by analyzing open source software (OSS) systems. The main purpose is to aid in understanding OSS evolution. The work centers on collecting large quantities of structural data cost-effectively and analyzing such data to understand software evolution dynamics (the mechanisms and causes of change or growth). We propose a multipurpose systematic approach to extracting program facts (e.g., function calls). This approach is supported by a suite of C and C++ program extractors, which cover different steps in the program build process and handle both source and binary code. We present several heuristics to link facts extracted from individual files into a combined system model of reasonable accuracy. We extract historical sequences of system models to aid software evolution analysis. We propose that software evolution can be viewed as Punctuated Equilibrium (i.e., long periods of small changes interrupted occasionally by large avalanche changes). We develop two approaches to study such dynamical behavior. One approach uses the evolution spectrograph to visualize file-level changes to the implemented system structure. The other approach relies on automated software clustering techniques to recover system design changes. We discuss lessons learned from using these approaches. We present a new perspective on software evolution dynamics. From this perspective, an evolving software system responds to external events (e.g., new functional requirements) according to Self-Organized Criticality (SOC). SOC dynamics are characterized by the following: (1) the probability distribution of change sizes is a power law; and (2) the time series of changes exhibits long-range correlations with power-law behavior. We present empirical evidence that SOC occurs in open source software systems.
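
The first SOC signature named in this abstract, a power-law distribution of change sizes, can be checked numerically. The sketch below is illustrative only and is not taken from the thesis: it assumes change sizes are positive real numbers at or above a hypothetical cutoff `xmin`, and it applies the standard continuous maximum-likelihood estimator for the exponent, alpha = 1 + n / sum(ln(x_i / xmin)). The function names and the synthetic sampler are assumptions made for the example.

```python
import math
import random


def power_law_alpha(sizes, xmin=1.0):
    """Maximum-likelihood exponent for p(x) ~ x**(-alpha), x >= xmin.

    Values below xmin are ignored. xmin is an assumed cutoff, not a value
    reported in the abstract.
    """
    tail = [x for x in sizes if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    if not tail or log_sum == 0.0:
        raise ValueError("need at least one change size strictly above xmin")
    return 1.0 + len(tail) / log_sum


def sample_power_law(n, alpha, xmin=1.0, seed=0):
    """Draw n synthetic change sizes by inverse-transform sampling."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], so the power is always defined.
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]


if __name__ == "__main__":
    synthetic = sample_power_law(20000, alpha=2.5)
    print("estimated alpha:", power_law_alpha(synthetic))
```

On real change histories, the estimate depends strongly on the choice of `xmin`, so a fitted exponent alone should not be read as evidence of SOC.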

    Graph Neural Network-Based Anomaly Detection for River Network Systems

    Water is the lifeblood of river networks, and its quality plays a crucial role in sustaining both aquatic ecosystems and human societies. Real-time monitoring of water quality is increasingly reliant on in-situ sensor technology. Anomaly detection is crucial for identifying erroneous patterns in sensor data, but can be a challenging task due to the complexity and variability of the data, even under normal conditions. This paper presents a solution to the challenging task of anomaly detection for river network sensor data, which is essential for accurate and continuous monitoring. We use a graph neural network model, the recently proposed Graph Deviation Network (GDN), which employs graph attention-based forecasting to capture the complex spatio-temporal relationships between sensors. We propose an alternative anomaly scoring method, GDN+, based on the learned graph. To evaluate the model's efficacy, we introduce new benchmarking simulation experiments with highly sophisticated dependency structures and subsequence anomalies of various types. We further examine the strengths and weaknesses of this baseline approach, GDN, in comparison to other benchmarking methods on complex real-world river network data. Findings suggest that GDN+ outperforms the baseline approach in high-dimensional data, while also providing improved interpretability. We also introduce software called gnnad.
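
Forecasting-based detectors of the kind described in this abstract generally turn forecast errors into anomaly scores. The sketch below shows one generic way to do that and is not the GDN or GDN+ scoring rule, nor the gnnad API: it assumes each sensor already has a forecast, normalizes the absolute error per sensor by its median and interquartile range, takes the maximum across sensors at each time step, and flags steps above a hypothetical threshold. All names and the sample data are assumptions.

```python
import statistics


def normalized_errors(observed, forecast):
    """Robustly scale one sensor's absolute forecast errors."""
    errors = [abs(o - f) for o, f in zip(observed, forecast)]
    median = statistics.median(errors)
    q1, _, q3 = statistics.quantiles(errors, n=4)
    scale = (q3 - q1) or 1.0  # fall back to 1.0 when the error series is flat
    return [(e - median) / scale for e in errors]


def anomaly_scores(observed_by_sensor, forecast_by_sensor):
    """Score each time step by the largest normalized error of any sensor."""
    per_sensor = [
        normalized_errors(observed_by_sensor[name], forecast_by_sensor[name])
        for name in observed_by_sensor
    ]
    return [max(step) for step in zip(*per_sensor)]


def flag_anomalies(scores, threshold):
    """Return the indices of time steps whose score exceeds the threshold."""
    return [t for t, s in enumerate(scores) if s > threshold]


if __name__ == "__main__":
    observed = {
        "upstream": [1.0, 1.1, 0.9, 1.0, 5.0, 1.0, 1.1, 0.9],
        "downstream": [2.0] * 8,
    }
    forecast = {"upstream": [1.0] * 8, "downstream": [2.0] * 8}
    scores = anomaly_scores(observed, forecast)
    print(flag_anomalies(scores, threshold=5.0))
```

Median and interquartile range are used here because a few large errors would inflate a mean and standard deviation and mask the very points the score is meant to expose.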

    Methods of Disambiguating and De-anonymizing Authorship in Large Scale Operational Data

    Operational data from software development, social networks and other domains are often contaminated with incorrect or missing values. Examples include misspelled or changed names, multiple emails belonging to the same person and user profiles that vary in different systems. Such digital traces are extensively used in research and practice to study collaborating communities of various kinds. To achieve a realistic representation of the networks that represent these communities, accurate identities are essential. In this work, we aim to identify, model, and correct identity errors in data from open-source software repositories, which include more than 23M developer IDs and nearly 1B Git commits (developer activity records). Our investigation into the nature and prevalence of identity errors in software activity data reveals that they differ from, and occur at much higher rates than, those in other domains. Existing techniques relying on string comparisons can only disambiguate Synonyms, but not Homonyms, which are common in software activity traces. Therefore, we introduce measures of behavioral fingerprinting to improve the accuracy of Synonym resolution, and to disambiguate Homonyms. Fingerprints are constructed from the traces of developers’ activities, such as the style of writing in commit messages, the patterns in files modified and projects participated in by developers, and the patterns related to the timing of the developers’ activity. Furthermore, to address the lack of training data necessary for the supervised learning approaches that are used in disambiguation, we design a specific active learning procedure that minimizes the manual effort necessary to create training data in the domain of developer identity matching. We extensively evaluate the proposed approach, using over 16,000 OpenStack developers in 1200 projects, against commercial and the most recent research approaches, and further against recent research on a much larger sample of over 2,000,000 IDs. Results demonstrate that our method is significantly better than both the recent research and commercial methods. We also conduct experiments to demonstrate that such erroneous data have significant impact on developer networks. We hope that the proposed approach will expedite research progress in the domain of software engineering, especially in applications for which graphs of social networks are critical.
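
The contrast this abstract draws between string comparison and behavioral fingerprints can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the only fingerprint used is a commit-hour histogram compared by cosine similarity, the name comparison uses `difflib.SequenceMatcher` from the Python standard library, and the record layout, the function names and the equal weighting are all hypothetical.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def name_similarity(name_a, name_b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()


def hour_profile_similarity(hours_a, hours_b):
    """Cosine similarity between two commit-hour histograms."""
    count_a, count_b = Counter(hours_a), Counter(hours_b)
    dot = sum(count_a[h] * count_b[h] for h in count_a)
    norm = (math.sqrt(sum(v * v for v in count_a.values()))
            * math.sqrt(sum(v * v for v in count_b.values())))
    return dot / norm if norm else 0.0


def match_score(id_a, id_b, name_weight=0.5):
    """Blend name and timing similarity; the weight is an assumed default."""
    return (name_weight * name_similarity(id_a["name"], id_b["name"])
            + (1.0 - name_weight)
            * hour_profile_similarity(id_a["hours"], id_b["hours"]))


if __name__ == "__main__":
    a = {"name": "Jane Doe", "hours": [9, 10, 10, 11]}
    b = {"name": "jane doe", "hours": [9, 10, 10, 11]}
    c = {"name": "Jane Doe", "hours": [22, 23, 23, 1]}
    print(match_score(a, b), match_score(a, c))
```

In the usage above, `a` and `c` carry the same name but disjoint activity hours, so the blended score drops even though a pure string comparison would treat them as identical, which is the Homonym case string-only techniques cannot separate.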

    Clinical and molecular investigation of rare congenital defects of the palate

    Cleft palate (CP) affects around 1/1500 live births and, along with cleft lip, is one of the most common forms of birth defect. The studies presented here focus on unusual defects of the palate, especially to understand better the rarely reported but surprisingly common condition called submucous cleft palate (SMCP). The frequency and consequences of SMCP from a surgical perspective were first investigated based on the caseload of the North Thames Cleft Service at Great Ormond Street Hospital and St Andrew's Centre, Broomfield Hospital, Mid Essex Hospitals Trust. It was previously reported that up to 80% of individuals with unrepaired SMCP experience speech difficulties as a consequence of velopharyngeal insufficiency (VPI). Attempted repair of the palatal defect can sometimes give poor results, so controversies still exist about the correct choice of surgical technique to use. Over 23 years, 222 patients at the North Thames Cleft Service underwent operations to manage SMCP. Nearly half of them (42.8%) were diagnosed with 22q11.2 deletion syndrome (22q11.2 DS). The first operation was palate repair, with the exception of one case, followed by a second surgical intervention required in approximately half of the patients. A third procedure to manage VPI was carried out in 6% of patients. To better understand the histological anatomy of the palatal muscles in cleft patients, biopsies were taken from levator veli palatini (LVP) and/or palatopharyngeus (PP) muscles during surgical correction of CP. Muscles from patients with SMCP were compared to those from patients with overt CP and also to controls. The controls consisted of descending PP muscle fibres from healthy children who underwent a tonsillectomy operation for obstructive sleep apnoea or recurrent chronic tonsillitis. Fifty-seven biopsy samples were available from children between 10 months and 9 years of age. Individual biopsy samples were also available from patients with achondroplasia, Apert, Cornelia de Lange and Kabuki syndromes. The study showed a prevalence of fast fibres in both muscles in all CP types. However, in both SMCP LVP and SMCP 22q11.2 DS LVP, this trend was reversed in favour of slow fibres. Single cases with syndromes did not reveal any obvious differences compared to more common cleft types. Mutations in TBX22 are a frequent genetic cause of cleft palate and SMCP. The functional role of the encoded TBX22 transcription factor was investigated in a mouse model with SMCP. Cell lineage-specific fluorescence-activated cell sorting of a conditional allele of Tbx22 was used to look at the RNA-Seq transcriptome in developing palatal shelves, with a view to identifying downstream target genes. Eleven upregulated genes reached statistical significance after multiple testing correction in cranial mesoderm (CM) derived cells when comparing Tbx22null/Y and WT samples (Cspg4, Foxp2, Reln, Bmpr1b, Adgrb3, Sox6, Zim1, Scarna13, Fat1, Notch3, Peg3). Eleven genes were downregulated in the same comparison (Nr2f2, Lars2, Ahr, Aplnr, Emcn, Npnt, Apln, Ccr2, Tll1, Snord34, Snord99). Comparing Tbx22null/Y and WT in cranial neural crest (CNC) derived cells, only Cxcl14 was upregulated, while Tbx22 was downregulated. Osteoclast differentiation, calcium signalling, focal adhesion, Wnt signalling and cell adhesion molecule pathways were the most enriched pathways in functional annotation analysis of the significantly differentially expressed genes. Finally, a family with an unusual velopharyngeal anatomy was investigated in order to determine the likely genetic cause. This involved the implementation of genetic technologies in an autosomal dominant multigeneration Egyptian family with 8 affected individuals who presented with absent uvula, short posterior border of the soft palate and abnormal pillars of the fauces. Using a combination of cytogenetic, linkage analysis and exome sequencing, followed by more detailed segregation and functional analysis, a dominantly acting missense mutation in the activation domain of FOXF2 was revealed. This variant was found to co-segregate with a copy number variant of unknown significance that could not at this stage be causally distinguished from the point mutation.

    Doctor of Philosophy

    Biomedical data are a rich source of information and knowledge. Not only are they useful for direct patient care, but they may also offer answers to important population-based questions. Creating an environment where advanced analytics can be performed against biomedical data is nontrivial, however. Biomedical data are currently scattered across multiple systems with heterogeneous data, and integrating these data is a bigger task than humans can realistically do by hand; therefore, automatic biomedical data integration is highly desirable but has never been fully achieved. This dissertation introduces new algorithms that were devised to support automatic and semiautomatic integration of heterogeneous biomedical data. The new algorithms incorporate both data mining and biomedical informatics techniques to create "concept bags" that are used to compute similarity between data elements in the same way that "word bags" are compared in data mining. Concept bags are composed of controlled medical vocabulary concept codes that are extracted from text using named-entity recognition software. To test the new algorithm, three biomedical text similarity use cases were examined: automatically aligning data elements between heterogeneous data sets, determining degrees of similarity between medical terms using a published benchmark, and determining similarity between ICU discharge summaries. The method is highly configurable and 5 different versions were tested. The concept bag method performed particularly well aligning data elements and outperformed the compared algorithms by more than 5%. Another configuration that included hierarchical semantics performed particularly well at matching medical terms, meeting or exceeding 30 of 31 other published results using the same benchmark. Results for the third scenario of computing ICU discharge summary similarity were less successful. Correlations between multiple methods were low, including between terminologists. The concept bag algorithms performed consistently and comparatively well and appear to be viable options for multiple scenarios. New applications of the method and ideas for improving the algorithm are being discussed for future work, including several performance enhancements, configuration-based enhancements, and concept vector weighting using the TF-IDF formulas.
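
The concept-bag comparison described in this abstract can be illustrated with a small sketch. It is not the dissertation's algorithm: a toy phrase lexicon stands in for the named-entity recognition software, the concept codes (`X001` and so on) are made-up placeholders rather than codes from any real vocabulary, and Jaccard overlap is assumed as the set similarity.

```python
def extract_concepts(text, lexicon):
    """Toy stand-in for NER: collect the codes of lexicon phrases in text."""
    lowered = text.lower()
    return {code for phrase, code in lexicon.items() if phrase in lowered}


def jaccard(bag_a, bag_b):
    """Share of concept codes common to both bags, in [0, 1]."""
    if not bag_a and not bag_b:
        return 0.0
    return len(bag_a & bag_b) / len(bag_a | bag_b)


# Hypothetical lexicon: two phrasings map to the same placeholder code.
LEXICON = {
    "heart attack": "X001",
    "myocardial infarction": "X001",
    "blood pressure": "X002",
    "hypertension": "X003",
}

if __name__ == "__main__":
    bag_a = extract_concepts("History of heart attack; blood pressure elevated", LEXICON)
    bag_b = extract_concepts("Myocardial infarction and hypertension", LEXICON)
    print(sorted(bag_a), sorted(bag_b), jaccard(bag_a, bag_b))
```

The two example strings share no content words, so a plain word-bag comparison would find little in common, while the concept bags overlap on the shared code for the two phrasings of the same condition.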