Evaluation Methodologies in Software Protection Research
Man-at-the-end (MATE) attackers have full control over the system on which
the attacked software runs, and try to break the confidentiality or integrity
of assets embedded in the software. Both companies and malware authors want to
prevent such attacks. This has driven an arms race between attackers and
defenders, resulting in a plethora of different protection and analysis
methods. However, it remains difficult to measure the strength of protections
because MATE attackers can reach their goals in many different ways and a
universally accepted evaluation methodology does not exist. This survey
systematically reviews the evaluation methodologies of papers on obfuscation, a
major class of protections against MATE attacks. For 572 papers, we collected
113 aspects of their evaluation methodologies, ranging from sample set types
and sizes, over sample treatment, to performed measurements. We provide
detailed insights into how the academic state of the art evaluates both the
protections and analyses thereon. In summary, there is a clear need for better
evaluation methodologies. We identify nine challenges for software protection
evaluations, which represent threats to the validity, reproducibility, and
interpretation of research results in the context of MATE attacks
Using machine learning to predict pathogenicity of genomic variants throughout the human genome
More than 6,000 diseases are estimated to be caused by genomic variants. This can happen in many possible ways: a variant may stop the translation of a protein, interfere with gene regulation, or alter splicing of the transcribed mRNA into an unwanted isoform. It is necessary to investigate all of these processes in order to evaluate which variant may be causal for the deleterious phenotype. A great help in this regard are variant effect scores. Implemented as machine learning classifiers, they integrate annotations from different resources to rank genomic variants in terms of pathogenicity.
Developing a variant effect score requires multiple steps: annotation of the training data, feature selection, model training, benchmarking, and finally deployment for the model's application. Here, I present a generalized workflow of this process. It makes it simple to configure how information is converted into model features, enabling the rapid exploration of different annotations. The workflow further implements hyperparameter optimization, model validation and ultimately deployment of a selected model via genome-wide scoring of genomic variants.
The workflow is applied to train Combined Annotation Dependent Depletion (CADD), a variant effect model that is scoring SNVs and InDels genome-wide. I show that the workflow can be quickly adapted to novel annotations by porting CADD to the genome reference GRCh38. Further, I demonstrate the integration of deep-neural network scores as features into a new CADD model, improving the annotation of RNA splicing events. Finally, I apply the workflow to train multiple variant effect models from training data that is based on variants selected by allele frequency.
In conclusion, the developed workflow presents a flexible and scalable method to train variant effect scores. All software and developed scores are freely available from cadd.gs.washington.edu and cadd.bihealth.org.
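The train-then-deploy workflow described above (annotate, train a classifier, score variants genome-wide) can be caricatured in a few lines. The two features, the labels, and the logistic model below are synthetic stand-ins, not CADD's real annotations or classifier:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for variant annotations (e.g. conservation, allele
# frequency); real pipelines integrate dozens of annotation sources.
n = 200
X = rng.normal(size=(n, 2))
w_true = np.array([2.0, -1.0])
y = (X @ w_true + rng.normal(scale=0.5, size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# "Model training": fit a logistic regression by batch gradient descent.
w = np.zeros(2)
for _ in range(500):
    p = sigmoid(X @ w)
    w -= 0.1 * X.T @ (p - y) / n

# "Deployment": score variants; a higher score means more pathogenic-looking.
scores = sigmoid(X @ w)
accuracy = np.mean((scores > 0.5) == y)
```

In a real workflow the feature matrix comes from the annotation step, and benchmarking on held-out data replaces the in-sample accuracy used here for brevity.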
Identity, Power, and Prestige in Switzerland's Multilingual Education
Switzerland is known for its multilingualism, yet not all languages are represented equally in society. The situation is exacerbated by the influx of heritage languages and English through migration and globalization processes, which challenge the traditional education system. This study is the first to investigate how schools in Grisons, Fribourg, and Zurich negotiate neoliberal forces leading to a growing necessity of English, a romanticized view of national languages, and the social justice perspective of institutionalizing heritage languages. It uncovers power and legitimacy issues and showcases students' and teachers' complex identities to advocate for equitable multilingual education.
Endogenous measures for contextualising large-scale social phenomena: a corpus-based method for mediated public discourse
This work presents an interdisciplinary methodology for developing endogenous measures of group membership through analysis of pervasive linguistic patterns in public discourse. Focusing on political discourse, this work critiques the conventional approach to the study of political participation, which is premised on decontextualised, exogenous measures to characterise groups. Considering the theoretical and empirical weaknesses of decontextualised approaches to large-scale social phenomena, this work suggests that contextualisation using endogenous measures might provide a complementary perspective to mitigate such weaknesses.
This work develops a sociomaterial perspective on political participation in mediated discourse as affiliatory action performed through language. While the affiliatory function of language is often performed consciously (such as statements of identity), this work is concerned with unconscious features (such as patterns in lexis and grammar). This work argues that pervasive patterns in such features that emerge through socialisation are resistant to change and manipulation, and thus might serve as endogenous measures of sociopolitical contexts, and thus of groups.
In terms of method, the work takes a corpus-based approach to the analysis of data from the Twitter messaging service, whereby patterns in users' speech are examined statistically in order to trace potential community membership. The method is applied in the US state of Michigan during the second half of 2018 (6 November being the date of the midterm, i.e. non-Presidential, elections in the United States). The corpus is assembled from the original posts of 5,889 users, who are nominally geolocalised to 417 municipalities. These users are clustered according to pervasive language features. Comparing the linguistic clusters according to the municipalities they represent reveals regular sociodemographic differentials across clusters. This is understood as an indication of social structure, suggesting that endogenous measures derived from pervasive patterns in language may indeed offer a complementary, contextualised perspective on large-scale social phenomena.
Model Diagnostics meets Forecast Evaluation: Goodness-of-Fit, Calibration, and Related Topics
Principled forecast evaluation and model diagnostics are vital in fitting probabilistic models and forecasting outcomes of interest. A common principle is that fitted or predicted distributions ought to be calibrated, ideally in the sense that the outcome is indistinguishable from a random draw from the posited distribution. Much of this thesis is centered on calibration properties of various types of forecasts.
In the first part of the thesis, a simple algorithm for exact multinomial goodness-of-fit tests is proposed. The algorithm computes exact p-values based on various test statistics, such as the log-likelihood ratio and Pearson's chi-square. A thorough analysis shows improvement on extant methods. However, the runtime of the algorithm grows exponentially in the number of categories and hence its use is limited.
In the second part, a framework rooted in probability theory is developed, which gives rise to hierarchies of calibration, and applies to both predictive distributions and stand-alone point forecasts. Based on a general notion of conditional T-calibration, the thesis introduces population versions of T-reliability diagrams and revisits a score decomposition into measures of miscalibration, discrimination, and uncertainty. Stable and efficient estimators of T-reliability diagrams and score components arise via nonparametric isotonic regression and the pool-adjacent-violators algorithm. For in-sample model diagnostics, a universal coefficient of determination is introduced that nests and reinterprets the classical R² in least squares regression.
In the third part, probabilistic top lists are proposed as a novel type of prediction in classification, which bridges the gap between single-class predictions and predictive distributions. The probabilistic top list functional is elicited by strictly consistent evaluation metrics, based on symmetric proper scoring rules, which admit comparison of various types of predictions
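The pool-adjacent-violators algorithm invoked above admits a compact implementation. The following is a generic textbook sketch of weighted least-squares isotonic regression, not the thesis's own estimators:

```python
def pava(y, weights=None):
    """Pool-adjacent-violators: least-squares isotonic regression.

    Returns the non-decreasing sequence closest to y in (weighted) L2 norm.
    """
    w = [1.0] * len(y) if weights is None else list(weights)
    # Maintain pooled blocks as [mean, weight, count]; whenever the last two
    # blocks violate monotonicity, merge them into their weighted mean.
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(w1 * m1 + w2 * m2) / wt, wt, c1 + c2])
    out = []
    for m, _, c in blocks:
        out.extend([m] * c)
    return out

fitted = pava([1.0, 3.0, 2.0, 4.0])  # -> [1.0, 2.5, 2.5, 4.0]
```

In the reliability-diagram setting, y would be the binary or real-valued outcomes sorted by forecast value, so the pooled means form the recalibrated (isotonic) forecast curve.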
Linear to multi-linear algebra and systems using tensors
In the past few decades, tensor algebra, also known as multi-linear algebra, has
been developed and customized as a tool for various engineering
applications. In particular, with the help of a special form of tensor
contracted product, known as the Einstein product, and its properties, many
known concepts from linear algebra can be extended to a multi-linear
setting. This makes it possible to define a multi-linear system theory in which
the input and output signals and the system itself are multi-domain in nature. This paper
provides an overview of tensor algebra tools which can be seen as an extension
of linear algebra, at the same time highlighting the difference and advantages
that the multi-linear setting brings forth. In particular, the notions of tensor
inversion, tensor singular value decomposition, and tensor eigenvalue decomposition using the
Einstein product are explained. In addition, this paper introduces the
notion of contracted convolution for both discrete and continuous multi-linear
system tensors. Tensor-network representations of various tensor operations are
also presented, as is the application of tensor tools to the design of transceiver
schemes for multi-domain communication systems, illustrated with MIMO CDMA
systems. Thus this paper acts as an entry-point tutorial for
graduate students whose research involves multi-domain or multi-modal signals
and systems.
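For even-order tensors, the Einstein product mentioned above contracts the trailing half of one tensor's modes with the leading half of the next's; a minimal numpy sketch (shapes and names here are illustrative, not from the paper):

```python
import numpy as np

def einstein_product(A, B):
    """Einstein product of even-order tensors: contract the last half of A's
    modes with the first half of B's modes. For 4th-order tensors this is
    (A * B)[i,j,k,l] = sum over p,q of A[i,j,p,q] * B[p,q,k,l].
    """
    k = A.ndim // 2
    assert A.shape[k:] == B.shape[:k]
    return np.tensordot(A, B, axes=k)

# For 2nd-order tensors (matrices) the Einstein product reduces to the
# ordinary matrix product.
A = np.arange(4.0).reshape(2, 2)
M = einstein_product(A, np.eye(2))  # equals A

# A 4th-order identity tensor I[i,j,p,q] = delta(i,p) * delta(j,q) acts as the
# unit for the product, which underlies the notion of tensor inversion.
I = np.einsum('ip,jq->ijpq', np.eye(2), np.eye(2))
T = np.arange(16.0).reshape(2, 2, 2, 2)
```

With this unit element, a tensor is Einstein-invertible when some tensor composes with it to give I, mirroring matrix inversion mode-by-mode.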
Introduction to Online Nonstochastic Control
This text presents an introduction to an emerging paradigm in control of
dynamical systems and differentiable reinforcement learning called online
nonstochastic control. The new approach applies techniques from online convex
optimization and convex relaxations to obtain new methods with provable
guarantees for classical settings in optimal and robust control.
The primary distinction between online nonstochastic control and other
frameworks is the objective. In optimal control, robust control, and other
control methodologies that assume stochastic noise, the goal is to perform
comparably to an offline optimal strategy. In online nonstochastic control,
both the cost functions as well as the perturbations from the assumed dynamical
model are chosen by an adversary. Thus the optimal policy is not defined a
priori. Rather, the target is to attain low regret against the best policy in
hindsight from a benchmark class of policies.
This objective suggests the use of the decision making framework of online
convex optimization as an algorithmic methodology. The resulting methods are
based on iterative mathematical optimization algorithms, and are accompanied by
finite-time regret and computational complexity guarantees.
Oxidative aging and fracture behavior of polymers and composites: theory, modeling and experiments
Polymers and their composites (PMCs) have emerged as effective alternative materials in the structural, aerospace, and automotive industries due to their light weight and tunable properties compared to metals. However, these materials tend to degrade when operated in extreme environments. In this work, two extreme conditions are considered: (i) high-temperature oxidative degradation of polymers and polymer-based composites, and (ii) fracture and damage of polymer-based composites under thermo-mechanical loading.
Polymer oxidation starts when oxygen from the ambient environment diffuses into the bulk material and initiates chemical reactions that develop a coarse, brittle oxide layer on the exposed surface. The oxidative degradation process is inherently complex, as it couples diffusion, reaction, and mechanics. As oxygen diffuses into the polymer, a series of chain reactions occurs, producing residual shrinkage strain in the oxidized layer as volatiles escape. Consequently, residual stress develops within the material, causing spontaneous cracking even without external loading. Oxidative aging can thus cause premature cracking, and understanding it requires capturing the interaction between chemistry and mechanics across length scales and timescales. In this work, a fully coupled, thermodynamically consistent chemo-mechanical phase-field fracture model is developed that attempts to bridge the gap between experimental observations and a constitutive theory for thermo-oxidative aging in polymeric materials. To accomplish this, a novel approach is adopted that considers the chemical reactions at the polymer macromolecular level, a reaction-driven transient network evolution theory at the microscale, and a constitutive model at the macroscale. Finally, a phase-field fracture theory is added to the chemo-mechanical model to predict oxidation-induced fracture in the polymer under mechanical loading. The model is further extended to a homogenized continuum theory to capture the anisotropic oxidation characteristics of fiber-reinforced polymer matrix composites.
Specialized forms of the constitutive equations and the governing partial differential equations have also been developed for the polymer and composite systems and numerically implemented in finite elements by writing an ABAQUS user-defined element (UEL) subroutine.
Lastly, a unified phase-field fracture model is developed to provide an experimentally validated, physically motivated, and computationally tractable model for predicting the fracture response of unidirectional fiber-reinforced polymer matrix composites. A homogenized, coupled thermo-mechanical model is developed considering a thermo-viscoelastic polymer matrix and is numerically implemented as an ABAQUS user-element (UEL) subroutine. The model predicts the constitutive response, direction-dependent damage propagation, and final fracture of a commercially acquired unidirectional glass-fiber-reinforced epoxy composite, both at different fiber orientations and at different temperatures, in substantial agreement with the experiments.
Special Topics in Information Technology
This open access book presents thirteen outstanding doctoral dissertations in Information Technology from the Department of Electronics, Information and Bioengineering, Politecnico di Milano, Italy. Information Technology has always been highly interdisciplinary, as many aspects have to be considered in IT systems. The doctoral studies program in IT at Politecnico di Milano emphasizes this interdisciplinary nature, which is becoming more and more important in recent technological advances, in collaborative projects, and in the education of young researchers. Accordingly, the focus of advanced research is on pursuing a rigorous approach to specific research topics starting from a broad background in various areas of Information Technology, especially Computer Science and Engineering, Electronics, Systems and Control, and Telecommunications. Each year, more than 50 PhDs graduate from the program. This book gathers the outcomes of the thirteen best theses defended in 2020-21 and selected for the IT PhD Award. Each of the authors provides a chapter summarizing his/her findings, including an introduction, description of methods, main achievements and future work on the topic. Hence, the book provides a cutting-edge overview of the latest research trends in Information Technology at Politecnico di Milano, presented in an easy-to-read format that will also appeal to non-specialists
Differential Models, Numerical Simulations and Applications
This Special Issue includes 12 high-quality articles containing original research findings in the fields of differential and integro-differential models, numerical methods and efficient algorithms for parameter estimation in inverse problems, with applications to biology, biomedicine, land degradation, traffic flows problems, and manufacturing systems