Comparing descriptors for molecular clusters in unsupervised learning
This thesis explores descriptors for atmospheric molecular clusters. Descriptors are needed to apply machine learning methods to molecular systems. A collection of descriptors is readily available in the DScribe library, developed at Aalto University for custom machine learning applications, but the choice of descriptor is left to the user. This study takes the first steps in integrating machine learning into the existing configurational sampling procedure, which aims to find the optimal structure for any given molecular cluster of interest.
The structure selection step forms a bottleneck in the configurational sampling procedure. The new structure selection method presented in this study uses k-means clustering to find structures that are similar to each other. The clustering results can be used to discard redundant structures more effectively than before, which leaves fewer structures for the more expensive computations and thereby speeds up the configurational sampling procedure. To aid the selection of a suitable descriptor for this application, four descriptors available in DScribe are compared.
A procedure for structure selection was implemented that represents atmospheric clusters with descriptors and labels them into groups with k-means. The performance of the descriptors was compared using a custom score suited to this application, and MBTR was found to outperform the other descriptors. This structure selection method will be used in the existing configurational sampling procedure for atmospheric molecular clusters, but it is not restricted to that application.
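The selection idea described above can be illustrated with a minimal sketch. It assumes each structure has already been turned into a fixed-length descriptor vector (the toy vectors below are made up, not DScribe output), runs a plain Lloyd's k-means, and keeps the one structure nearest to each centroid as the representative. The function names `kmeans` and `select_representatives` and the nearest-to-centroid rule are illustrative assumptions, not the thesis's actual procedure or scoring.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's k-means on lists of floats; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, labels


def select_representatives(points, k, seed=0):
    """Keep the index of the point closest to each centroid (hypothetical rule)."""
    centroids, labels = kmeans(points, k, seed=seed)
    keep = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            keep.append(min(members, key=lambda i: math.dist(points[i], centroids[c])))
    return sorted(keep)


# Toy descriptor vectors: two groups of near-duplicate structures.
descriptors = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
               [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]]
representatives = select_representatives(descriptors, k=2)
```

Only the structures listed in `representatives` would then be passed on to the more expensive calculations; how many clusters to use and how to score the result are left open here.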
IST Austria Thesis
Social insect colonies tend to have numerous members which function together like a single organism, in such harmony that the term "super-organism" is often used. In this analogy the reproductive caste is analogous to the primordial germ cells of a metazoan, while the sterile worker caste corresponds to somatic cells. The worker castes, like tissues, are in charge of all functions of a living being besides reproduction. The establishment of new super-organismal units (i.e. new colonies) is accomplished by the co-dependent castes. The term oftentimes goes beyond a metaphor. We invoke it when we speak about the metabolic rate, thermoregulation, nutrient regulation and gas exchange of a social insect colony. Furthermore, we assert that the super-organism has an immune system and benefits from "social immunity".
Social immunity was first invoked by evolutionary biologists to resolve the apparent discrepancy between the expected high frequency of disease outbreaks among numerous, closely related, tightly interacting hosts living in stable and microbially rich environments and the exceptionally scarce accounts of epidemics in natural populations. Social immunity comprises a multi-layer assembly of behaviours which have evolved to effectively keep the pathogenic enemies of a colony at bay. The field of social immunity has drawn interest as it becomes increasingly urgent to stop the collapse of pollinator species and curb the growth of invasive pests. In the past decade, several mechanisms of social immune responses have been dissected, but many more questions remain open.
I present my work in two experimental chapters. In the first, I use invasive garden ants (*Lasius neglectus*) to study how pathogen load and its distribution among nestmates affect the grooming response of the group. Any given group of ants will carry out the same total grooming work, but will direct its grooming effort towards individuals carrying a relatively higher spore load. Contrary to expectation, the highest risk of transmission does not stem from grooming highly contaminated ants; instead, we suggest that the grooming response likely minimizes spore loss to the environment, reducing contamination from inadvertent pickup from the substrate.
The second is a comparative developmental approach. I follow black garden ant queens (*Lasius niger*) and their colonies from the mating flight through hibernation, for a year. Colonies which grow fast from the start have a lower chance of survival through hibernation, and those which survive grow at a lower pace later. This is true for colonies of both naive and challenged queens. Early pathogen exposure of the queens changes colony dynamics in an unexpected way: colonies from exposed queens are more likely to grow slowly and recover in numbers only after they survive hibernation.
In addition to the two experimental chapters, this thesis includes a co-authored published review on organisational immunity, where we list the experimental evidence and theoretical framework on which this hypothesis is built, identify the caveats and underline how the field is ripe to overcome them. In a final chapter, I describe my part in two collaborative efforts, one to develop an image-based tracker and the second to develop a classifier for ant behaviour.
Algorithms and Methods for Designing and Scheduling Smart Manufacturing Systems
This book, as a Special Issue, is a collection of some of the latest advancements in designing and scheduling smart manufacturing systems. The smart manufacturing concept is undoubtedly considered a paradigm shift in manufacturing technology. This conception is part of the Industry 4.0 strategy, or equivalent national policies, and brings new challenges and opportunities for the companies that are facing tough global competition. Industry 4.0 should not only be perceived as one of many possible strategies for manufacturing companies, but also as an important practice within organizations. The main focus of Industry 4.0 implementation is to combine production, information technology, and the internet. The presented Special Issue consists of ten research papers presenting the latest works in the field. The papers include various topics, which can be divided into three categories: (i) designing and scheduling manufacturing systems (seven articles), (ii) machining process optimization (two articles), (iii) digital insurance platforms (one article). Most of the mentioned research problems are solved in these articles by using genetic algorithms, the harmony search algorithm, the hybrid bat algorithm, the combined whale optimization algorithm, and other optimization and decision-making methods. The above-mentioned groups of articles are briefly described in this order in this book.
BagStack Classification for Data Imbalance Problems with Application to Defect Detection and Labeling in Semiconductor Units
Despite the fact that machine learning supports the development of computer vision applications by shortening the development cycle, finding a general learning algorithm that solves a wide range of applications is still bounded by the "no free lunch theorem". The search for the right algorithm to solve a specific problem is driven by the problem itself, the data availability and many other requirements.
Automated visual inspection (AVI) systems represent a major part of these challenging computer vision applications. They are gaining growing interest in the manufacturing industry to detect defective products and keep these from reaching customers. The process of defect detection and classification in semiconductor units is challenging due to different acceptable variations that the manufacturing process introduces. Other variations are also typically introduced when using optical inspection systems due to changes in lighting conditions and misalignment of the imaged units, which makes the defect detection process more challenging.
In this thesis, a BagStack classification framework is proposed, which makes use of stacking and bagging concepts to handle both variance and bias errors. The classifier is designed to handle the data imbalance and overfitting problems by adaptively transforming the multi-class classification problem into multiple binary classification problems; applying a bagging approach to train a set of base learners for each specific problem; adaptively specifying the number of base learners assigned to each problem; adaptively specifying the number of samples to use from each class; applying a novel data-imbalance-aware cross-validation technique to generate the meta-data while taking into account the data imbalance problem at the meta-data level; and, finally, using a multi-response random forest regression classifier as a meta-classifier. The BagStack classifier makes use of multiple features to solve the defect classification problem. In order to detect defects, a locally adaptive statistical background modeling is proposed. The proposed BagStack classifier outperforms state-of-the-art image classification techniques on our dataset in terms of overall classification accuracy and average per-class classification accuracy. The proposed detection method achieves high performance on the considered dataset in terms of recall and precision.
Dissertation/Thesis. Doctoral Dissertation, Computer Engineering 201
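Two of the steps listed in this abstract, the decomposition into binary problems and the bagging of balanced samples, can be sketched as follows. This is only an illustration under stated assumptions: the one-vs-rest decomposition, the fixed rule of drawing as many samples per class as the minority class has, and the names `one_vs_rest_labels` and `balanced_bags` are all hypothetical stand-ins for the adaptive choices the thesis describes.

```python
import random


def one_vs_rest_labels(labels, positive_class):
    """Turn multi-class labels into a binary problem for one class (assumed scheme)."""
    return [1 if y == positive_class else 0 for y in labels]


def balanced_bags(binary_labels, n_bags, seed=0):
    """Draw bootstrap index sets with equal numbers of positives and negatives.

    The per-class sample size is fixed to the minority-class count here;
    the thesis chooses it adaptively, which this sketch does not reproduce.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(binary_labels) if y == 1]
    neg = [i for i, y in enumerate(binary_labels) if y == 0]
    per_class = min(len(pos), len(neg))
    bags = []
    for _ in range(n_bags):
        # Sampling with replacement, as in ordinary bagging.
        bag = rng.choices(pos, k=per_class) + rng.choices(neg, k=per_class)
        bags.append(bag)
    return bags


# Toy imbalanced label set: eight good units, two with a made-up defect class.
labels = ["ok"] * 8 + ["scratch"] * 2
binary = one_vs_rest_labels(labels, "scratch")
bags = balanced_bags(binary, n_bags=3)
```

Each bag would train one base learner for the "scratch vs. rest" problem; the stacking stage and the meta-classifier are outside the scope of this sketch.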
Using MapReduce Streaming for Distributed Life Simulation on the Cloud
Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome the complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway's life according to a general MR streaming pattern. We chose life because it is simple enough as a testbed for MR's applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms' performance on Amazon's Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.
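To make the map/reduce formulation of the discrete game concrete, here is a minimal single-process sketch. It is not the authors' streaming implementation and does not include strip partitioning; the function names are hypothetical, and the shuffle phase that a MapReduce framework would perform is imitated with an in-memory dictionary.

```python
from collections import defaultdict


def life_map(cell):
    """Map: a live cell emits an 'alive' marker for itself and a count of 1 to each neighbour."""
    x, y = cell
    yield (x, y), "alive"
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx or dy:
                yield (x + dx, y + dy), 1


def life_reduce(key, values):
    """Reduce: apply Conway's rules to everything emitted for one cell."""
    alive = "alive" in values
    neighbours = sum(v for v in values if v != "alive")
    if neighbours == 3 or (alive and neighbours == 2):
        yield key


def life_step(live_cells):
    """One generation: map, group by key (stand-in for the shuffle), reduce."""
    grouped = defaultdict(list)
    for cell in live_cells:
        for key, value in life_map(cell):
            grouped[key].append(value)
    return {cell for key, values in grouped.items()
            for cell in life_reduce(key, values)}


# A horizontal blinker flips to vertical and back.
blinker = {(0, 1), (1, 1), (2, 1)}
next_generation = life_step(blinker)
```

In a streaming setting the mapper and reducer would read and write text records on standard input and output instead of Python tuples, but the key/value logic would stay the same.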
Artificial cognitive architecture with self-learning and self-optimization capabilities. Case studies in micromachining processes
Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Ingeniería Informática. Date of defense: 22-09-201
Computational methods for single cell RNA and genome assembly resolution using genetic variation
Genetic variation and natural selection have driven the evolutionary history on this planet and are responsible for creating us and all other life as we know it. Over the past several decades, the genomic revolution has allowed us to assess population variation across humans and other species and use that to link genotypes with phenotypes and infer evolutionary histories. In this thesis, I explore computational methods for using genetic variation to demultiplex and disambiguate complex data.
In single cell RNAseq, batch effects, doublets, and ambient RNA are each sources of noise that impede our ability to infer the functional states of cells and compare them between experiments. One popular new experimental design that promises to solve each of these while also reducing experimental costs is mixing multiple individuals' cells into a single experiment. In chapter 2, I present a method for clustering cells by genotype, calling doublets, and using the cross-genotype signal in singletons to estimate and remove ambient RNA. I compare this method to existing methods, including one that requires a priori information about the genotypes and two that do not. I find that my method outperforms each of these methods across a wide range of data parameters and sample types.
In genome assembly, the recent higher throughput and lower cost of long read sequencing has revolutionized our ability to create reference quality genomes and has revitalized the assembly community. Now, massive efforts are taking place in the Darwin Tree of Life project and the Earth Biogenome project to create reference genomes for all multicellular eukaryotic life. This will create a scientific resource for the next generation of biological science, will serve as a conservation of data that could otherwise be lost in this time of mass extinction, and will allow for a much broader understanding of evolution and the evolutionary history of life on Earth. While much progress has been made in data quality and assembly algorithms, some problems still exist. Until recently, the DNA input requirements for long read sequencing technologies made it impossible to sequence single individuals of these species with long reads. Also, high heterozygosity makes assembly more difficult due to the inherent ambiguity between heterozygous sequence and paralogous sequence when confronted with inexact homology. One solution to the DNA input requirements would be to pool individuals, but this only increases the heterozygosity of the sample and reduces assembly quality. In chapter 3, we present the first high quality assembly of a single mosquito using new library preparation methods with reduced DNA requirements. This reduces the number of haplotypes to two, improving the assembly quality. In chapter 4, we further address the problems brought on by heterozygosity in assembly. I present a suite of tools that use the phasing consistency of multiple heterozygous sequences as a signal for physical linkage, thus using genetic variation to our advantage rather than as a challenge to overcome. This suite creates phased, linked assemblies and phasing-aware scaffolding. Further, I provide a tool for phasing-aware scaffolding on existing assemblies.
This includes a novel haplotype phasing algorithm with some unique beneficial properties: it is robust to non-heterozygous variants as input, can detect and correct those genotypes, and extends naturally to polyploid genomes. Wellcome Trus
Multiscale dynamics in honeybee societies
In this dissertation, I examine the social organization of a model organism, the honeybee, at multiple scales. I begin in Part I at the microbial scale, by studying the relationship between the social caste of individuals and the microbes they harbour in their gastrointestinal tracts. Using 16S rRNA sequence data, I reconstruct the gut microbiomes of honeybees of different castes. I find that the microbiomes of two previously-uncharacterized social castes -- drones and queens -- contain the same bacteria as those in the guts of worker bees. However, despite this similarity, I show that the compositions of these bacteria in drones and queens are sufficiently different that their microbiomes can be distinguished from those of workers.
In Part II, I study the honeybee society at the level of its individual constituents, in particular, the set of foragers. I characterize the distribution of foraging activity across these individuals in the society, and find that this is highly skewed, with some individuals contributing much more to the activity of the colony than others. I establish these results in the framework used to describe the wealth of individuals in human society, and also characterize the temporal variation and resilience of foraging activity.
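One common way to quantify how skewed such a distribution is, borrowed from the study of wealth inequality, is the Gini coefficient. Whether the dissertation uses this particular measure is an assumption; the sketch below only shows how a list of per-forager activity counts (made-up numbers) would be summarised, with 0 meaning every forager contributes equally and values near 1 meaning a few foragers do almost all the work.

```python
def gini(values):
    """Gini coefficient of non-negative values via the sorted-rank formula."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = sum_i (2i - n - 1) * x_i / (n * total), with ranks i = 1..n.
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


# Hypothetical foraging trip counts for four bees.
equal_activity = gini([5, 5, 5, 5])
skewed_activity = gini([0, 0, 0, 10])
```

For a sample of n individuals in which a single one does all the foraging, this formula gives (n - 1) / n, so small groups cannot reach exactly 1.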
In Part III, I describe a system to track individual honeybees and their interactions inside a two-dimensional observation hive with high spatiotemporal resolution. At the level of individual honeybees, I study the temporal statistics of trophallaxis, an important social interaction that occurs in honeybee societies, and find that the distribution of trophallaxis durations is similar to the distribution of face-to-face interactions among humans. I propose a scaling argument to explain the scaling exponent of these distributions, and test the argument in simple random-walk models of proximity interactions. I then study the honeybee society at the collective scale of the trophallaxis interaction network, and find that although bees exhibit bursty patterns of trophallaxis just as humans do in communication, simulated spreading on the trophallaxis networks is fast relative to randomized reference models, unlike in human temporal networks.
Structuring microscopic dynamics with macroscopic feedback: From social insects to artificial intelligence
Physical processes rely on the transmission of energy and information across scales. In the last century, theoretical tools have been developed in the field of statistical physics to infer macroscopic properties starting from a microscopic description of the system. However, less attention has been devoted to the remodelling of microscopic degrees of freedom by macroscopic feedback. In recent years, ideas from non-equilibrium physics have been applied to characterise biological and artificial intelligence systems. These systems share a structure of discrete scales of organisation that perform specialised functions. To correctly regulate these functions, the accurate transmission of information across scales is crucial. In this thesis we study the role of macroscopic feedback in the remodelling of microscopic degrees of freedom in two paradigmatic examples, one taken from the field of biology, the self-organisation of specialisation and plasticity in a social wasp, and one from artificial intelligence, the remodelling of deep neural networks in a stochastic many-particle system. In the first part of this thesis we study how the primitively social wasp Polistes canadensis simultaneously achieves robust specialisation and rapid plasticity. Combining a unique experimental strategy correlating time-resolved measurements across vastly different scales with a theoretical approach, we characterise the re-establishment of the social steady state after queen removal. We show that Polistes integrates antagonistic processes on multiple scales to distinguish between extrinsic and intrinsic perturbations and thereby achieve both robust specialisation and rapid plasticity. Furthermore, we show that the long-term stability of the social structure relies on the regulation of transcriptional noise by dynamic DNA methylation.
In the second part of this thesis, we ask whether emergent collective interactions can be used to remodel deep neural networks. To this end, we study a paradigmatic stochastic many-particle model where the dynamics are defined by the reaction rates of single particles, given by the output of distinct deep neural networks. The neural networks are in turn dynamically remodelled using deep reinforcement learning depending on the previous history of the system. In particular, we implement this model as a one-dimensional stochastic lattice gas. Our results show the formation of two groups of particles that move in opposite directions, diffusively at early times and ballistically over longer time-scales, with the transition between these regimes corresponding to the time-scale of left/right symmetry breaking at the level of individual particles. Over a hierarchy of characteristic time-scales these particles develop emergent, increasingly complex interactions characterised by short-range repulsion and long-range attraction. As a result, the system asymptotically converges to a regime characterised by the presence of anti-ferromagnetic particle clusters. To conclude, we characterise the impact of memory effects and demographic disorder on the dynamics. Together, our results shed light on how non-equilibrium systems can employ macroscopic feedback to regulate the propagation of fluctuations across scales.
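For readers unfamiliar with lattice gases, the bare dynamics can be sketched as a one-dimensional exclusion process on a ring. This is a heavily simplified stand-in: in the thesis each particle's hop rates come from its own neural network trained by reinforcement learning, whereas here a single fixed probability `p_right` (an assumption, as are the function name and the random-sequential update) replaces that machinery.

```python
import random


def lattice_gas_step(occupied, size, p_right, rng):
    """One random-sequential sweep of a 1D exclusion process on a ring."""
    sites = set(occupied)
    for _ in range(len(sites)):
        x = rng.choice(sorted(sites))          # pick a particle at random
        step = 1 if rng.random() < p_right else -1
        target = (x + step) % size             # periodic boundary
        if target not in sites:                # exclusion: one particle per site
            sites.remove(x)
            sites.add(target)
    return sites


rng = random.Random(1)
size = 10
sites = {0, 3, 7}
history = [set(sites)]
for _ in range(100):
    sites = lattice_gas_step(sites, size, p_right=0.5, rng=rng)
    history.append(set(sites))
```

Replacing the shared `p_right` with a per-particle value that depends on the particle's own history is the point at which learned rates would enter.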