Fractal-like Distributions over the Rational Numbers in High-throughput Biological and Clinical Data
Recent developments in extracting and processing biological and clinical data are allowing quantitative approaches to studying living systems. High-throughput sequencing, expression profiles, proteomics, and electronic health records are some examples of such technologies. Extracting meaningful information from those technologies requires careful analysis of the large volumes of data they produce. In this note, we present a set of distributions that commonly appear in the analysis of such data. These distributions present some interesting features: they are discontinuous at the rational numbers but continuous at the irrational numbers, and they possess a certain self-similar (fractal-like) structure. The first set of examples we present here is drawn from a high-throughput sequencing experiment. Here, the self-similar distributions appear as part of the evaluation of the error rate of the sequencing technology and the identification of tumorigenic genomic alterations. The other examples are obtained from risk factor evaluation and analysis of relative disease prevalence and comorbidity as these appear in electronic clinical data. The distributions are also relevant to the identification of subclonal populations in tumors and the study of the evolution of infectious diseases, more precisely the study of quasi-species and intrahost diversity of viral populations.
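One way to see where such structure can come from is to enumerate ratios of small integer counts, for example variant reads over total reads. The sketch below is illustrative only and is not the authors' method: it assumes every pair (k, n) with 0 <= k <= n and 1 <= n <= max_depth is equally likely, and the names `ratio_multiplicities` and `max_depth` are hypothetical. Under that assumption a reduced fraction p/q is hit once for every multiple of q up to max_depth, so low-denominator fractions form tall spikes and high-denominator fractions form short ones, a pattern reminiscent of Thomae's function.

```python
from collections import Counter
from fractions import Fraction


def ratio_multiplicities(max_depth):
    """Count how many (k, n) pairs reduce to each fraction k/n.

    Assumption for illustration: all pairs with 1 <= n <= max_depth and
    0 <= k <= n are weighted equally (real read depths are not uniform).
    """
    counts = Counter()
    for n in range(1, max_depth + 1):
        for k in range(n + 1):
            # Fraction reduces k/n to lowest terms, so 2/4 and 1/2 share a key.
            counts[Fraction(k, n)] += 1
    return counts


counts = ratio_multiplicities(12)

# Each reduced fraction p/q is reached once per multiple of q up to max_depth.
for value in sorted(counts):
    print(f"{str(value):>6}  {'#' * counts[value]}")
```

Printing the counts in sorted order gives a crude text histogram in which 0, 1, and 1/2 stand out, followed by thirds, quarters, and so on.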
Machine learning-guided co-optimization of fitness and diversity facilitates combinatorial library design in enzyme engineering
The effective design of combinatorial libraries to balance fitness and diversity facilitates the engineering of useful enzyme functions, particularly those that are poorly characterized or unknown in biology. We introduce MODIFY, a machine learning (ML) algorithm that learns from natural protein sequences to infer evolutionarily plausible mutations and predict enzyme fitness. MODIFY co-optimizes predicted fitness and sequence diversity of starting libraries, prioritizing high-fitness variants while ensuring broad sequence coverage. In silico evaluation shows that MODIFY outperforms state-of-the-art unsupervised methods in zero-shot fitness prediction and enables ML-guided directed evolution with enhanced efficiency. Using MODIFY, we engineer generalist biocatalysts derived from a thermostable cytochrome c to achieve enantioselective C-B and C-Si bond formation via a new-to-nature carbene transfer mechanism, leading to biocatalysts six mutations away from previously developed enzymes while exhibiting superior or comparable activities. These results demonstrate MODIFY's potential in solving challenging enzyme engineering problems beyond the reach of classic directed evolution.
A Comprehensive Review of Data-Driven Co-Speech Gesture Generation
Gestures that accompany speech are an essential part of natural and efficient
embodied human communication. The automatic generation of such co-speech
gestures is a long-standing problem in computer animation and is considered an
enabling technology in film, games, virtual social spaces, and for interaction
with social robots. The problem is made challenging by the idiosyncratic and
non-periodic nature of human co-speech gesture motion, and by the great
diversity of communicative functions that gestures encompass. Gesture
generation has seen surging interest recently, owing to the emergence of more
and larger datasets of human gesture motion, combined with strides in
deep-learning-based generative models that benefit from the growing
availability of data. This review article summarizes co-speech gesture
generation research, with a particular focus on deep generative models. First,
we articulate the theory describing human gesticulation and how it complements
speech. Next, we briefly discuss rule-based and classical statistical gesture
synthesis, before delving into deep learning approaches. We employ the choice
of input modalities as an organizing principle, examining systems that generate
gestures from audio, text, and non-linguistic input. We also chronicle the
evolution of the related training data sets in terms of size, diversity, motion
quality, and collection method. Finally, we identify key research challenges in
gesture generation, including data availability and quality; producing
human-like motion; grounding the gesture in the co-occurring speech, in
interaction with other speakers, and in the environment; performing gesture
evaluation; and integration of gesture synthesis into applications. We
highlight recent approaches to tackling the various key challenges, as well as
the limitations of these approaches, and point toward areas of future
development. Comment: Accepted for EUROGRAPHICS 202
Formulating a Historical and Demographic Model of Recent Human Evolution Based on Resequencing Data from Noncoding Regions
BACKGROUND: Estimating the historical and demographic parameters that characterize modern human populations is a fundamental part of reconstructing the recent history of our species. In addition, the development of a model of human evolution that can best explain neutral genetic diversity is required to confidently identify regions of the human genome that have been targeted by natural selection. METHODOLOGY/PRINCIPAL FINDINGS: We have resequenced 20 independent noncoding autosomal regions dispersed throughout the genome in 213 individuals from different continental populations, corresponding to a total of approximately 6 Mb of diploid resequencing data. We used these data to explore and co-estimate an extensive range of historical and demographic parameters with a statistical framework that combines the evaluation of multiple models of human evolution via a best-fit approach, followed by an Approximate Bayesian Computation (ABC) analysis. From a methodological standpoint, evaluating the accuracy of the parameter co-estimation allowed us to identify the most accurate set of statistics to be used for the estimation of each of the different historical and demographic parameters characterizing recent human evolution. CONCLUSIONS/SIGNIFICANCE: Our results support a model in which modern humans left Africa through a single major dispersal event occurring approximately 60,000 years ago, corresponding to a drastic, approximately fivefold reduction in effective population size relative to an ancestral African population of approximately 13,800 individuals. Subsequently, the ancestors of modern Europeans and East Asians diverged much later, approximately 22,500 years ago, from the population of ancestral migrants. This late diversification of Eurasians after the African exodus points to the occurrence of a long maturation phase in which the ancestral Eurasian population was not yet diversified.
Learning the Designer's Preferences to Drive Evolution
This paper presents the Designer Preference Model, a data-driven solution
that seeks to learn from user-generated data in a Quality-Diversity
Mixed-Initiative Co-Creativity (QD MI-CC) tool, with the aim of modelling the
user's design style to better assess the tool's procedurally generated content
with respect to that user's preferences. Through this approach, we aim to
increase the user's agency over the generated content in a way that neither
stalls the user-tool reciprocal stimuli loop nor fatigues the user with
periodical suggestion handpicking. We describe the details of this novel
solution, as well as its implementation in the MI-CC tool the Evolutionary
Dungeon Designer. We present and discuss our findings from the initial tests
carried out, identifying the open challenges for this combined line of research
that integrates MI-CC with Procedural Content Generation through Machine
Learning. Comment: 16 pages, Accepted and to appear in proceedings of the 23rd European
Conference on the Applications of Evolutionary and bio-inspired Computation,
EvoApplications 202
Genetic Programming for Smart Phone Personalisation
Personalisation in smart phones requires adaptability to dynamic context
based on user mobility, application usage and sensor inputs. Current
personalisation approaches, which rely on static logic that is developed a
priori, do not provide sufficient adaptability to dynamic and unexpected
context. This paper proposes genetic programming (GP), which can evolve program
logic in real time, as an online learning method to deal with the highly dynamic
context in smart phone personalisation. We introduce the concept of
collaborative smart phone personalisation through the GP Island Model, in order
to exploit shared context among co-located phone users and reduce convergence
time. We implement these concepts on real smartphones to demonstrate the
capability of personalisation through GP and to explore the benefits of the
Island Model. Our empirical evaluations on two example applications confirm
that the Island Model can reduce convergence time by up to two-thirds over
standalone GP personalisation. Comment: 43 pages, 11 figures
Ensemble Learning for Free with Evolutionary Algorithms?
Evolutionary Learning proceeds by evolving a population of classifiers, from
which it generally returns (with some notable exceptions) the single
best-of-run classifier as final result. Meanwhile, Ensemble Learning,
one of the most efficient approaches in supervised Machine Learning for the
last decade, proceeds by building a population of diverse classifiers. Ensemble
Learning with Evolutionary Computation thus receives increasing attention. The
Evolutionary Ensemble Learning (EEL) approach presented in this paper features
two contributions. First, a new fitness function, inspired by co-evolution and
enforcing the classifier diversity, is presented. Further, a new selection
criterion based on the classification margin is proposed. This criterion is
used to extract the classifier ensemble from the final population only
(Off-line) or incrementally along evolution (On-line). Experiments on a set of
benchmark problems show that Off-line outperforms single-hypothesis
evolutionary learning and state-of-the-art Boosting and generates smaller
classifier ensembles.
Recombinant anticoccidial vaccines - a cup half full?
Eimeria species parasites can cause the disease coccidiosis, most notably in chickens. The occurrence of coccidiosis is currently controlled through a combination of good husbandry, chemoprophylaxis and/or live parasite vaccination; however, scalable, cost-effective subunit or recombinant vaccines are required. Many antigens have been proposed for use in novel anticoccidial vaccines, supported by the capacity to reduce disease severity or parasite replication, increase body weight gain in the face of challenge or improve feed conversion under experimental conditions, but none has reached commercial development. Nonetheless, the protection against challenge induced by some antigens has been within the lower range described for the ionophores against susceptible isolates or current live vaccines prior to oocyst recycling. With such levels of efficacy it may be that combinations of anticoccidial antigens already described are sufficient for development as novel multi-valent vaccines, pending identification of optimal delivery systems. Selection of the best antigens to be included in such vaccines can be informed by knowledge defining the natural occurrence of specific antigenic diversity, with relevance to the risk of immediate vaccine breakthrough, and the rate at which parasite genomes can evolve new diversity. For Eimeria, such data are now becoming available for antigens such as apical membrane antigen 1 (AMA1) and immune mapped protein 1 (IMP1) and more are anticipated as high-capacity, high-throughput sequencing technologies become increasingly accessible.