6,625 research outputs found

    Configuration Management of Distributed Systems over Unreliable and Hostile Networks

    Get PDF
    Economic incentives of large criminal profits and the threat of legal consequences have pushed criminals to continuously improve their malware, especially command and control channels. This thesis applied concepts from successful malware command and control to explore the survivability and resilience of benign configuration management systems. This work expands on existing stage models of malware life cycle to contribute a new model for identifying malware concepts applicable to benign configuration management. The Hidden Master architecture is a contribution to master-agent network communication. In the Hidden Master architecture, communication between master and agent is asynchronous and can operate trough intermediate nodes. This protects the master secret key, which gives full control of all computers participating in configuration management. Multiple improvements to idempotent configuration were proposed, including the definition of the minimal base resource dependency model, simplified resource revalidation and the use of imperative general purpose language for defining idempotent configuration. Following the constructive research approach, the improvements to configuration management were designed into two prototypes. This allowed validation in laboratory testing, in two case studies and in expert interviews. In laboratory testing, the Hidden Master prototype was more resilient than leading configuration management tools in high load and low memory conditions, and against packet loss and corruption. Only the research prototype was adaptable to a network without stable topology due to the asynchronous nature of the Hidden Master architecture. The main case study used the research prototype in a complex environment to deploy a multi-room, authenticated audiovisual system for a client of an organization deploying the configuration. The case studies indicated that imperative general purpose language can be used for idempotent configuration in real life, for defining new configurations in unexpected situations using the base resources, and abstracting those using standard language features; and that such a system seems easy to learn. Potential business benefits were identified and evaluated using individual semistructured expert interviews. Respondents agreed that the models and the Hidden Master architecture could reduce costs and risks, improve developer productivity and allow faster time-to-market. Protection of master secret keys and the reduced need for incident response were seen as key drivers for improved security. Low-cost geographic scaling and leveraging file serving capabilities of commodity servers were seen to improve scaling and resiliency. Respondents identified jurisdictional legal limitations to encryption and requirements for cloud operator auditing as factors potentially limiting the full use of some concepts

    Graduate Catalog of Studies, 2023-2024

    Get PDF

    Effective Transfer of Pretrained Large Visual Model for Fabric Defect Segmentation via Specifc Knowledge Injection

    Full text link
    Fabric defect segmentation is integral to textile quality control. Despite this, the scarcity of high-quality annotated data and the diversity of fabric defects present significant challenges to the application of deep learning in this field. These factors limit the generalization and segmentation performance of existing models, impeding their ability to handle the complexity of diverse fabric types and defects. To overcome these obstacles, this study introduces an innovative method to infuse specialized knowledge of fabric defects into the Segment Anything Model (SAM), a large-scale visual model. By introducing and training a unique set of fabric defect-related parameters, this approach seamlessly integrates domain-specific knowledge into SAM without the need for extensive modifications to the pre-existing model parameters. The revamped SAM model leverages generalized image understanding learned from large-scale natural image datasets while incorporating fabric defect-specific knowledge, ensuring its proficiency in fabric defect segmentation tasks. The experimental results reveal a significant improvement in the model's segmentation performance, attributable to this novel amalgamation of generic and fabric-specific knowledge. When benchmarking against popular existing segmentation models across three datasets, our proposed model demonstrates a substantial leap in performance. Its impressive results in cross-dataset comparisons and few-shot learning experiments further demonstrate its potential for practical applications in textile quality control.Comment: 13 pages,4 figures, 3 table

    Automated Mapping of Adaptive App GUIs from Phones to TVs

    Full text link
    With the increasing interconnection of smart devices, users often desire to adopt the same app on quite different devices for identical tasks, such as watching the same movies on both their smartphones and TV. However, the significant differences in screen size, aspect ratio, and interaction styles make it challenging to adapt Graphical User Interfaces (GUIs) across these devices. Although there are millions of apps available on Google Play, only a few thousand are designed to support smart TV displays. Existing techniques to map a mobile app GUI to a TV either adopt a responsive design, which struggles to bridge the substantial gap between phone and TV or use mirror apps for improved video display, which requires hardware support and extra engineering efforts. Instead of developing another app for supporting TVs, we propose a semi-automated approach to generate corresponding adaptive TV GUIs, given the phone GUIs as the input. Based on our empirical study of GUI pairs for TV and phone in existing apps, we synthesize a list of rules for grouping and classifying phone GUIs, converting them to TV GUIs, and generating dynamic TV layouts and source code for the TV display. Our tool is not only beneficial to developers but also to GUI designers, who can further customize the generated GUIs for their TV app development. An evaluation and user study demonstrate the accuracy of our generated GUIs and the usefulness of our tool.Comment: 30 pages, 15 figure

    Using machine learning to predict pathogenicity of genomic variants throughout the human genome

    Get PDF
    GeschĂ€tzt mehr als 6.000 Erkrankungen werden durch VerĂ€nderungen im Genom verursacht. Ursachen gibt es viele: Eine genomische Variante kann die Translation eines Proteins stoppen, die Genregulation stören oder das Spleißen der mRNA in eine andere Isoform begĂŒnstigen. All diese Prozesse mĂŒssen ĂŒberprĂŒft werden, um die zum beschriebenen PhĂ€notyp passende Variante zu ermitteln. Eine Automatisierung dieses Prozesses sind Varianteneffektmodelle. Mittels maschinellem Lernen und Annotationen aus verschiedenen Quellen bewerten diese Modelle genomische Varianten hinsichtlich ihrer PathogenitĂ€t. Die Entwicklung eines Varianteneffektmodells erfordert eine Reihe von Schritten: Annotation der Trainingsdaten, Auswahl von Features, Training verschiedener Modelle und Selektion eines Modells. Hier prĂ€sentiere ich ein allgemeines Workflow dieses Prozesses. Dieses ermöglicht es den Prozess zu konfigurieren, Modellmerkmale zu bearbeiten, und verschiedene Annotationen zu testen. Der Workflow umfasst außerdem die Optimierung von Hyperparametern, Validierung und letztlich die Anwendung des Modells durch genomweites Berechnen von Varianten-Scores. Der Workflow wird in der Entwicklung von Combined Annotation Dependent Depletion (CADD), einem Varianteneffektmodell zur genomweiten Bewertung von SNVs und InDels, verwendet. Durch Etablierung des ersten Varianteneffektmodells fĂŒr das humane Referenzgenome GRCh38 demonstriere ich die gewonnenen Möglichkeiten Annotationen aufzugreifen und neue Modelle zu trainieren. Außerdem zeige ich, wie Deep-Learning-Scores als Feature in einem CADD-Modell die Vorhersage von RNA-Spleißing verbessern. Außerdem werden Varianteneffektmodelle aufgrund eines neuen, auf AllelhĂ€ufigkeit basierten, Trainingsdatensatz entwickelt. Diese Ergebnisse zeigen, dass der entwickelte Workflow eine skalierbare und flexible Möglichkeit ist, um Varianteneffektmodelle zu entwickeln. Alle entstandenen Scores sind unter cadd.gs.washington.edu und cadd.bihealth.org frei verfĂŒgbar.More than 6,000 diseases are estimated to be caused by genomic variants. This can happen in many possible ways: a variant may stop the translation of a protein, interfere with gene regulation, or alter splicing of the transcribed mRNA into an unwanted isoform. It is necessary to investigate all of these processes in order to evaluate which variant may be causal for the deleterious phenotype. A great help in this regard are variant effect scores. Implemented as machine learning classifiers, they integrate annotations from different resources to rank genomic variants in terms of pathogenicity. Developing a variant effect score requires multiple steps: annotation of the training data, feature selection, model training, benchmarking, and finally deployment for the model's application. Here, I present a generalized workflow of this process. It makes it simple to configure how information is converted into model features, enabling the rapid exploration of different annotations. The workflow further implements hyperparameter optimization, model validation and ultimately deployment of a selected model via genome-wide scoring of genomic variants. The workflow is applied to train Combined Annotation Dependent Depletion (CADD), a variant effect model that is scoring SNVs and InDels genome-wide. I show that the workflow can be quickly adapted to novel annotations by porting CADD to the genome reference GRCh38. Further, I demonstrate the integration of deep-neural network scores as features into a new CADD model, improving the annotation of RNA splicing events. Finally, I apply the workflow to train multiple variant effect models from training data that is based on variants selected by allele frequency. In conclusion, the developed workflow presents a flexible and scalable method to train variant effect scores. All software and developed scores are freely available from cadd.gs.washington.edu and cadd.bihealth.org

    An empirical investigation of the relationship between integration, dynamic capabilities and performance in supply chains

    Get PDF
    This research aimed to develop an empirical understanding of the relationships between integration, dynamic capabilities and performance in the supply chain domain, based on which, two conceptual frameworks were constructed to advance the field. The core motivation for the research was that, at the stage of writing the thesis, the combined relationship between the three concepts had not yet been examined, although their interrelationships have been studied individually. To achieve this aim, deductive and inductive reasoning logics were utilised to guide the qualitative study, which was undertaken via multiple case studies to investigate lines of enquiry that would address the research questions formulated. This is consistent with the author’s philosophical adoption of the ontology of relativism and the epistemology of constructionism, which was considered appropriate to address the research questions. Empirical data and evidence were collected, and various triangulation techniques were employed to ensure their credibility. Some key features of grounded theory coding techniques were drawn upon for data coding and analysis, generating two levels of findings. These revealed that whilst integration and dynamic capabilities were crucial in improving performance, the performance also informed the former. This reflects a cyclical and iterative approach rather than one purely based on linearity. Adopting a holistic approach towards the relationship was key in producing complementary strategies that can deliver sustainable supply chain performance. The research makes theoretical, methodological and practical contributions to the field of supply chain management. The theoretical contribution includes the development of two emerging conceptual frameworks at the micro and macro levels. The former provides greater specificity, as it allows meta-analytic evaluation of the three concepts and their dimensions, providing a detailed insight into their correlations. The latter gives a holistic view of their relationships and how they are connected, reflecting a middle-range theory that bridges theory and practice. The methodological contribution lies in presenting models that address gaps associated with the inconsistent use of terminologies in philosophical assumptions, and lack of rigor in deploying case study research methods. In terms of its practical contribution, this research offers insights that practitioners could adopt to enhance their performance. They can do so without necessarily having to forgo certain desired outcomes using targeted integrative strategies and drawing on their dynamic capabilities

    A Machine Learning based Empirical Evaluation of Cyber Threat Actors High Level Attack Patterns over Low level Attack Patterns in Attributing Attacks

    Full text link
    Cyber threat attribution is the process of identifying the actor of an attack incident in cyberspace. An accurate and timely threat attribution plays an important role in deterring future attacks by applying appropriate and timely defense mechanisms. Manual analysis of attack patterns gathered by honeypot deployments, intrusion detection systems, firewalls, and via trace-back procedures is still the preferred method of security analysts for cyber threat attribution. Such attack patterns are low-level Indicators of Compromise (IOC). They represent Tactics, Techniques, Procedures (TTP), and software tools used by the adversaries in their campaigns. The adversaries rarely re-use them. They can also be manipulated, resulting in false and unfair attribution. To empirically evaluate and compare the effectiveness of both kinds of IOC, there are two problems that need to be addressed. The first problem is that in recent research works, the ineffectiveness of low-level IOC for cyber threat attribution has been discussed intuitively. An empirical evaluation for the measure of the effectiveness of low-level IOC based on a real-world dataset is missing. The second problem is that the available dataset for high-level IOC has a single instance for each predictive class label that cannot be used directly for training machine learning models. To address these problems in this research work, we empirically evaluate the effectiveness of low-level IOC based on a real-world dataset that is specifically built for comparative analysis with high-level IOC. The experimental results show that the high-level IOC trained models effectively attribute cyberattacks with an accuracy of 95% as compared to the low-level IOC trained models where accuracy is 40%.Comment: 20 page

    Seamless Multimodal Biometrics for Continuous Personalised Wellbeing Monitoring

    Full text link
    Artificially intelligent perception is increasingly present in the lives of every one of us. Vehicles are no exception, (...) In the near future, pattern recognition will have an even stronger role in vehicles, as self-driving cars will require automated ways to understand what is happening around (and within) them and act accordingly. (...) This doctoral work focused on advancing in-vehicle sensing through the research of novel computer vision and pattern recognition methodologies for both biometrics and wellbeing monitoring. The main focus has been on electrocardiogram (ECG) biometrics, a trait well-known for its potential for seamless driver monitoring. Major efforts were devoted to achieving improved performance in identification and identity verification in off-the-person scenarios, well-known for increased noise and variability. Here, end-to-end deep learning ECG biometric solutions were proposed and important topics were addressed such as cross-database and long-term performance, waveform relevance through explainability, and interlead conversion. Face biometrics, a natural complement to the ECG in seamless unconstrained scenarios, was also studied in this work. The open challenges of masked face recognition and interpretability in biometrics were tackled in an effort to evolve towards algorithms that are more transparent, trustworthy, and robust to significant occlusions. Within the topic of wellbeing monitoring, improved solutions to multimodal emotion recognition in groups of people and activity/violence recognition in in-vehicle scenarios were proposed. At last, we also proposed a novel way to learn template security within end-to-end models, dismissing additional separate encryption processes, and a self-supervised learning approach tailored to sequential data, in order to ensure data security and optimal performance. (...)Comment: Doctoral thesis presented and approved on the 21st of December 2022 to the University of Port

    Synthetic Aperture Radar (SAR) Meets Deep Learning

    Get PDF
    This reprint focuses on the application of the combination of synthetic aperture radars and depth learning technology. It aims to further promote the development of SAR image intelligent interpretation technology. A synthetic aperture radar (SAR) is an important active microwave imaging sensor, whose all-day and all-weather working capacity give it an important place in the remote sensing community. Since the United States launched the first SAR satellite, SAR has received much attention in the remote sensing community, e.g., in geological exploration, topographic mapping, disaster forecast, and traffic monitoring. It is valuable and meaningful, therefore, to study SAR-based remote sensing applications. In recent years, deep learning represented by convolution neural networks has promoted significant progress in the computer vision community, e.g., in face recognition, the driverless field and Internet of things (IoT). Deep learning can enable computational models with multiple processing layers to learn data representations with multiple-level abstractions. This can greatly improve the performance of various applications. This reprint provides a platform for researchers to handle the above significant challenges and present their innovative and cutting-edge research results when applying deep learning to SAR in various manuscript types, e.g., articles, letters, reviews and technical reports

    Anwendungen maschinellen Lernens fĂŒr datengetriebene PrĂ€vention auf Populationsebene

    Get PDF
    Healthcare costs are systematically rising, and current therapy-focused healthcare systems are not sustainable in the long run. While disease prevention is a viable instrument for reducing costs and suffering, it requires risk modeling to stratify populations, identify high- risk individuals and enable personalized interventions. In current clinical practice, however, systematic risk stratification is limited: on the one hand, for the vast majority of endpoints, no risk models exist. On the other hand, available models focus on predicting a single disease at a time, rendering predictor collection burdensome. At the same time, the den- sity of individual patient data is constantly increasing. Especially complex data modalities, such as -omics measurements or images, may contain systemic information on future health trajectories relevant for multiple endpoints simultaneously. However, to date, this data is inaccessible for risk modeling as no dedicated methods exist to extract clinically relevant information. This study built on recent advances in machine learning to investigate the ap- plicability of four distinct data modalities not yet leveraged for risk modeling in primary prevention. For each data modality, a neural network-based survival model was developed to extract predictive information, scrutinize performance gains over commonly collected covariates, and pinpoint potential clinical utility. Notably, the developed methodology was able to integrate polygenic risk scores for cardiovascular prevention, outperforming existing approaches and identifying benefiting subpopulations. Investigating NMR metabolomics, the developed methodology allowed the prediction of future disease onset for many common diseases at once, indicating potential applicability as a drop-in replacement for commonly collected covariates. Extending the methodology to phenome-wide risk modeling, elec- tronic health records were found to be a general source of predictive information with high systemic relevance for thousands of endpoints. Assessing retinal fundus photographs, the developed methodology identified diseases where retinal information most impacted health trajectories. In summary, the results demonstrate the capability of neural survival models to integrate complex data modalities for multi-disease risk modeling in primary prevention and illustrate the tremendous potential of machine learning models to disrupt medical practice toward data-driven prevention at population scale.Die Kosten im Gesundheitswesen steigen systematisch und derzeitige therapieorientierte Gesundheitssysteme sind nicht nachhaltig. Angesichts vieler verhinderbarer Krankheiten stellt die PrĂ€vention ein veritables Instrument zur Verringerung von Kosten und Leiden dar. Risikostratifizierung ist die grundlegende Voraussetzung fĂŒr ein prĂ€ventionszentri- ertes Gesundheitswesen um Personen mit hohem Risiko zu identifizieren und Maßnah- men einzuleiten. Heute ist eine systematische Risikostratifizierung jedoch nur begrenzt möglich, da fĂŒr die meisten Krankheiten keine Risikomodelle existieren und sich verfĂŒg- bare Modelle auf einzelne Krankheiten beschrĂ€nken. Weil fĂŒr deren Berechnung jeweils spezielle Sets an PrĂ€diktoren zu erheben sind werden in Praxis oft nur wenige Modelle angewandt. Gleichzeitig versprechen komplexe DatenmodalitĂ€ten, wie Bilder oder -omics- Messungen, systemische Informationen ĂŒber zukĂŒnftige GesundheitsverlĂ€ufe, mit poten- tieller Relevanz fĂŒr viele Endpunkte gleichzeitig. Da es an dedizierten Methoden zur Ex- traktion klinisch relevanter Informationen fehlt, sind diese Daten jedoch fĂŒr die Risikomod- ellierung unzugĂ€nglich, und ihr Potenzial blieb bislang unbewertet. Diese Studie nutzt ma- chinelles Lernen, um die Anwendbarkeit von vier DatenmodalitĂ€ten in der PrimĂ€rprĂ€ven- tion zu untersuchen: polygene Risikoscores fĂŒr die kardiovaskulĂ€re PrĂ€vention, NMR Meta- bolomicsdaten, elektronische Gesundheitsakten und Netzhautfundusfotos. Pro Datenmodal- itĂ€t wurde ein neuronales Risikomodell entwickelt, um relevante Informationen zu extra- hieren, additive Information gegenĂŒber ĂŒblicherweise erfassten Kovariaten zu quantifizieren und den potenziellen klinischen Nutzen der DatenmodalitĂ€t zu ermitteln. Die entwickelte Me-thodik konnte polygene Risikoscores fĂŒr die kardiovaskulĂ€re PrĂ€vention integrieren. Im Falle der NMR-Metabolomik erschloss die entwickelte Methodik wertvolle Informa- tionen ĂŒber den zukĂŒnftigen Ausbruch von Krankheiten. Unter Einsatz einer phĂ€nomen- weiten Risikomodellierung erwiesen sich elektronische Gesundheitsakten als Quelle prĂ€dik- tiver Information mit hoher systemischer Relevanz. Bei der Analyse von Fundusfotografien der Netzhaut wurden Krankheiten identifiziert fĂŒr deren Vorhersage Netzhautinformationen genutzt werden könnten. Zusammengefasst zeigten die Ergebnisse das Potential neuronaler Risikomodelle die medizinische Praxis in Richtung einer datengesteuerten, prĂ€ventionsori- entierten Medizin zu verĂ€ndern
    • 

    corecore