Fractal-like Distributions over the Rational Numbers in High-throughput Biological and Clinical Data

Author: A Broder
B Cowling
C Fraser
D Jamieson
E Mardis
G Bignell
J Salk
L Ding
L Pasqualucci
P Vlierberghe
R Johnston
S Shea
Publication date: 19/10/2010

Recent developments in extracting and processing biological and clinical data are allowing quantitative approaches to studying living systems. High-throughput sequencing, expression profiles, proteomics, and electronic health records are some examples of such technologies. Extracting meaningful information from those technologies requires careful analysis of the large volumes of data they produce. In this note, we present a set of distributions that commonly appear in the analysis of such data. These distributions present some interesting features: they are discontinuous at the rational numbers but continuous at the irrational numbers, and they possess a certain self-similar (fractal-like) structure. The first set of examples presented here is drawn from a high-throughput sequencing experiment. Here, the self-similar distributions appear as part of the evaluation of the error rate of the sequencing technology and the identification of tumorigenic genomic alterations. The other examples are obtained from risk factor evaluation and analysis of relative disease prevalence and co-morbidity as these appear in electronic clinical data. The distributions are also relevant to the identification of subclonal populations in tumors and the study of the evolution of infectious diseases, and more precisely the study of quasi-species and intrahost diversity of viral populations.

arXiv.org e-Print Archive

Crossref

Columbia University Academic Commons

PubMed Central

Nature Precedings

Machine learning-guided co-optimization of fitness and diversity facilitates combinatorial library design in enzyme engineering

Author: Chin Michael
Ding Kerr
Huang Wei
Liu Peng
Luo Yunan
Mai Binh Khanh
Wang Huanan
Yang Yang
Zhao Yunlong
Publication venue: eScholarship, University of California
Publication date: 01/01/2024

The effective design of combinatorial libraries to balance fitness and diversity facilitates the engineering of useful enzyme functions, particularly those that are poorly characterized or unknown in biology. We introduce MODIFY, a machine learning (ML) algorithm that learns from natural protein sequences to infer evolutionarily plausible mutations and predict enzyme fitness. MODIFY co-optimizes predicted fitness and sequence diversity of starting libraries, prioritizing high-fitness variants while ensuring broad sequence coverage. In silico evaluation shows that MODIFY outperforms state-of-the-art unsupervised methods in zero-shot fitness prediction and enables ML-guided directed evolution with enhanced efficiency. Using MODIFY, we engineer generalist biocatalysts derived from a thermostable cytochrome c to achieve enantioselective C-B and C-Si bond formation via a new-to-nature carbene transfer mechanism, leading to biocatalysts six mutations away from previously developed enzymes while exhibiting superior or comparable activities. These results demonstrate MODIFY's potential in solving challenging enzyme engineering problems beyond the reach of classic directed evolution

eScholarship - University of California

A Comprehensive Review of Data-Driven Co-Speech Gesture Generation

Author: Ahuja Chaitanya
Henter Gustav Eje
Kucherenko Taras
Neff Michael
Nyatsanga Simbarashe
Publication venue: Wiley
Publication date: 10/04/2023

Gestures that accompany speech are an essential part of natural and efficient embodied human communication. The automatic generation of such co-speech gestures is a long-standing problem in computer animation and is considered an enabling technology in film, games, virtual social spaces, and for interaction with social robots. The problem is made challenging by the idiosyncratic and non-periodic nature of human co-speech gesture motion, and by the great diversity of communicative functions that gestures encompass. Gesture generation has seen surging interest recently, owing to the emergence of more and larger datasets of human gesture motion, combined with strides in deep-learning-based generative models that benefit from the growing availability of data. This review article summarizes co-speech gesture generation research, with a particular focus on deep generative models. First, we articulate the theory describing human gesticulation and how it complements speech. Next, we briefly discuss rule-based and classical statistical gesture synthesis, before delving into deep learning approaches. We employ the choice of input modalities as an organizing principle, examining systems that generate gestures from audio, text, and non-linguistic input. We also chronicle the evolution of the related training data sets in terms of size, diversity, motion quality, and collection method. Finally, we identify key research challenges in gesture generation, including data availability and quality; producing human-like motion; grounding the gesture in the co-occurring speech, in interaction with other speakers, and in the environment; performing gesture evaluation; and integration of gesture synthesis into applications. We highlight recent approaches to tackling the various key challenges, as well as the limitations of these approaches, and point toward areas of future development. Comment: Accepted for EUROGRAPHICS 202

arXiv.org e-Print Archive

Formulating a Historical and Demographic Model of Recent Human Evolution Based on Resequencing Data from Noncoding Regions

Author: Barreiro Luis B.
Laval Guillaume
Patin Etienne
Quintana-Murci Lluís
Publication venue: Public Library of Science
Publication date: 22/04/2010

BACKGROUND: Estimating the historical and demographic parameters that characterize modern human populations is a fundamental part of reconstructing the recent history of our species. In addition, the development of a model of human evolution that can best explain neutral genetic diversity is required to confidently identify regions of the human genome that have been targeted by natural selection. METHODOLOGY/PRINCIPAL FINDINGS: We have resequenced 20 independent noncoding autosomal regions dispersed throughout the genome in 213 individuals from different continental populations, corresponding to a total of approximately 6 Mb of diploid resequencing data. We used these data to explore and co-estimate an extensive range of historical and demographic parameters with a statistical framework that combines the evaluation of multiple models of human evolution via a best-fit approach, followed by an Approximate Bayesian Computation (ABC) analysis. From a methodological standpoint, evaluating the accuracy of the parameter co-estimation allowed us to identify the most accurate set of statistics to be used for the estimation of each of the different historical and demographic parameters characterizing recent human evolution. CONCLUSIONS/SIGNIFICANCE: Our results support a model in which modern humans left Africa through a single major dispersal event occurring approximately 60,000 years ago, corresponding to a drastic, approximately fivefold reduction in the effective population size of the ancestral African population of approximately 13,800 individuals. Subsequently, the ancestors of modern Europeans and East Asians diverged much later, approximately 22,500 years ago, from the population of ancestral migrants. This late diversification of Eurasians after the African exodus points to the occurrence of a long maturation phase in which the ancestral Eurasian population was not yet diversified.

Public Library of Science (PLOS)

CiteSeerX

Directory of Open Access Journals

PubMed Central

Learning the Designer's Preferences to Drive Evolution

Author: A Liapis
A Summerville
C Stanton
G Smith
GN Yannakakis
J Lehman
JK Pugh
N Shaker
SO Kimbrough
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2020

This paper presents the Designer Preference Model, a data-driven solution that seeks to learn from user-generated data in a Quality-Diversity Mixed-Initiative Co-Creativity (QD MI-CC) tool, with the aim of modelling the user's design style to better assess the tool's procedurally generated content with respect to that user's preferences. Through this approach, we aim to increase the user's agency over the generated content in a way that neither stalls the user-tool reciprocal stimuli loop nor fatigues the user with periodic handpicking of suggestions. We describe the details of this novel solution, as well as its implementation in the MI-CC tool the Evolutionary Dungeon Designer. We present and discuss our findings from the initial tests, identifying the open challenges for this combined line of research that integrates MI-CC with Procedural Content Generation through Machine Learning. Comment: 16 pages, Accepted and to appear in proceedings of the 23rd European Conference on the Applications of Evolutionary and bio-inspired Computation, EvoApplications 202

arXiv.org e-Print Archive

Crossref

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Malmö University

Genetic Programming for Smart Phone Personalisation

Author: Cotillon Alban
Haak Aiden
Jurdak Raja
Valencia Philip
Publication date: 01/01/2014

Personalisation in smart phones requires adaptability to dynamic context based on user mobility, application usage and sensor inputs. Current personalisation approaches, which rely on static logic that is developed a priori, do not provide sufficient adaptability to dynamic and unexpected context. This paper proposes genetic programming (GP), which can evolve program logic in real time, as an online learning method to deal with the highly dynamic context in smart phone personalisation. We introduce the concept of collaborative smart phone personalisation through the GP Island Model, in order to exploit shared context among co-located phone users and reduce convergence time. We implement these concepts on real smart phones to demonstrate the capability of personalisation through GP and to explore the benefits of the Island Model. Our empirical evaluations on two example applications confirm that the Island Model can reduce convergence time by up to two-thirds over standalone GP personalisation. Comment: 43 pages, 11 figures

arXiv.org e-Print Archive

CiteSeerX

Crossref

Queensland University of Technology ePrints Archive

Ensemble Learning for Free with Evolutionary Algorithms?

Author: Gagné Christian
Schoenauer Marc
Sebag Michèle
Tomassini Marco
Publication date: 01/01/2007

Evolutionary Learning proceeds by evolving a population of classifiers, from which it generally returns (with some notable exceptions) the single best-of-run classifier as the final result. Meanwhile, Ensemble Learning, one of the most efficient approaches in supervised Machine Learning for the last decade, proceeds by building a population of diverse classifiers. Ensemble Learning with Evolutionary Computation thus receives increasing attention. The Evolutionary Ensemble Learning (EEL) approach presented in this paper features two contributions. First, a new fitness function, inspired by co-evolution and enforcing classifier diversity, is presented. Further, a new selection criterion based on the classification margin is proposed. This criterion is used to extract the classifier ensemble from the final population only (Off-line) or incrementally along evolution (On-line). Experiments on a set of benchmark problems show that Off-line outperforms single-hypothesis evolutionary learning and state-of-the-art Boosting and generates smaller classifier ensembles.

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

HAL Descartes

HAL-Polytechnique

HAL-Rennes 1

Recombinant anticoccidial vaccines - a cup half full?

Author: Ahmad
Arnott
Awad
Bafundo
Blake
Brothers
Brown
Chapman
Chen
Clark
Cornelissen
Cosentino
Crane
Dalloul
Damer P. Blake
Ding
Djemai
Du
Eckford
Fiona M. Tomley
Fischbach
Fluit
Gedik
Gowen
Hedhli
Hoan
Huang
Iván Pastor-Fernández
Jahn
Jang
Jenkins
Jiang
Joyner
Karlsson
Keestra
Klotz
Koblansky
Konjufca
Kundu
Kvicerova
Labbe
Lai
Lassen
Lee
Li
Lillehoj
Liu
Long
Marugan-Hernandez
Matthew J. Nolan
McDonald
Min
Naciri
Nolan
Peek
Plattner
Pogonka
Reid
Ruff
Rugarabamu
Salazar Gonzalez
Sathish
Schaap
Shah
Sharman
Shen
Shirley
Smith
Song
Subramanian
Sun
Takala
Tang
Tomley
Vermeulen
Vrba
Wallach
Wang
Williams
Witcombe
Wu
Xu
Yin
Yun
Zhang
Zimmermann
Publication venue: Elsevier BV
Publication date: 01/11/2017

Eimeria species parasites can cause the disease coccidiosis, most notably in chickens. The occurrence of coccidiosis is currently controlled through a combination of good husbandry, chemoprophylaxis and/or live parasite vaccination; however, scalable, cost-effective subunit or recombinant vaccines are required. Many antigens have been proposed for use in novel anticoccidial vaccines, supported by the capacity to reduce disease severity or parasite replication, increase body weight gain in the face of challenge or improve feed conversion under experimental conditions, but none has reached commercial development. Nonetheless, the protection against challenge induced by some antigens has been within the lower range described for the ionophores against susceptible isolates or current live vaccines prior to oocyst recycling. With such levels of efficacy it may be that combinations of anticoccidial antigens already described are sufficient for development as novel multi-valent vaccines, pending identification of optimal delivery systems. Selection of the best antigens to be included in such vaccines can be informed by knowledge defining the natural occurrence of specific antigenic diversity, with relevance to the risk of immediate vaccine breakthrough, and the rate at which parasite genomes can evolve new diversity. For Eimeria, such data are now becoming available for antigens such as apical membrane antigen 1 (AMA1) and immune mapped protein 1 (IMP1) and more are anticipated as high-capacity, high-throughput sequencing technologies become increasingly accessible

Crossref