234 research outputs found

Recommended from our members

How does predicate invention affect human comprehensibility?

Author: AA Freitas
AM Turing
B Letham
JR Quinlan
KD Forbus
L Sterling
M Mozina
SH Muggleton
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

City Research Online

Crossref

Ultra-Strong Machine Learning: comprehensibility of programs learned with ILP

Author: A Srinivasan
AA Freitas
Alireza Tamaddoni-Nezhad
AM Turing
B Chandrasekaran
B Letham
Christina Zeller
E Kitzelmann
EA Lemke
F Bergadano
H Kahney
H Schielzeth
J Huysmans
JR Quinlan
KD Forbus
L Sterling
M Mozina
MD Hauser
MR Wick
SH Muggleton
Stephen H. Muggleton
Tarek Besold
TM Mitchell
U Schmid
Ute Schmid
WJ Clancey
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

During the 1980s Michie defined Machine Learning in terms of two orthogonal axes of performance: predictive accuracy and comprehensibility of generated hypotheses. Since predictive accuracy was readily measurable and comprehensibility not so, later definitions in the 1990s, such as Mitchell’s, tended to use a one-dimensional approach to Machine Learning based solely on predictive accuracy, ultimately favouring statistical over symbolic Machine Learning approaches. In this paper we provide a definition of comprehensibility of hypotheses which can be estimated using human participant trials. We present two sets of experiments testing human comprehensibility of logic programs. In the first experiment we test human comprehensibility with and without predicate invention. Results indicate comprehensibility is affected not only by the complexity of the presented program but also by the existence of anonymous predicate symbols. In the second experiment we directly test whether any state-of-the-art ILP systems are ultra-strong learners in Michie’s sense, and select the Metagol system for use in human trials. Results show participants were not able to learn the relational concept on their own from a set of examples but they were able to apply the relational definition provided by the ILP system correctly. This implies the existence of a class of relational concepts which are hard to acquire for humans, though easy to understand given an abstract explanation. We believe improved understanding of this class could have potential relevance to contexts involving human learning, teaching and verbal interaction.

City Research Online

Crossref

University of Surrey

Spiral - Imperial College Digital Repository

WordCluster: detecting clusters of DNA words and genomic elements

Author: A Sandelin
A Siepel
AR Quinlan
B Giardine
D Durand
D Karolchik
Guillermo Barturen
José L Oliver
KD Pruitt
M Ashburner
M Gardiner-Garden
M Hackenberg
Michael Hackenberg
P Carpena
Pedro Bernaola-Galván
Pedro Carpena
R Aloni
R Lister
TJ Hubbard
VJ Makeev
Ángel M Alganza
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Many k-mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well-established examples are genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds. Results: We introduce here an algorithm to detect clusters of DNA words (k-mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method in a web server connected to a MySQL backend, which also determines the co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting the clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between the inside and outside of the clusters. As another example, we used WordCluster to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome. Conclusions: WordCluster seems to predict biologically meaningful clusters of DNA words (k-mers) and genomic entities. The implementation of the method as a web server is available at http://bioinfo2.ugr.es/wordCluster/wordCluster.php, including additional features such as the detection of co-localization with gene regions and the annotation enrichment tool for functional analysis of overlapped genes.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Repositorio Institucional Universidad de Granada

Information and feedback to improve occupational physicians’ reporting of occupational diseases: a randomised controlled trial

Author: A Dijkstra
A Figueiras
A Vallano
Annet F. Lenderink
B Schüz
BJ Silk
CG Alexopoulos
D Coggon
Dick Spreeuwers
E Vet de
Frank J. H. van Dijk
G Pransky
H Nordman
HD Scott
I Brissette
J Biddle
Jac J. L. van der Klink
JM Castel
JO Prochaska
KB Quinlan
KD Rosenman
L Cornelissen
L Hazell
LS Azaroff
M Bäckström
MMM Vos de
N Poonai
P McGettigan
PB Smits
R Orriols
RC Bracchi
SM Friedman
SM Wallerstedt
T Kauppinen
T Scherzer
WA Gebhardt
Publication venue: Springer-Verlag
Publication date: 01/01/2009

The aim was to assess the effectiveness of supplying occupational physicians (OPs) with targeted and stage-matched information or with feedback on reporting occupational diseases to the national registry in the Netherlands. In a randomized controlled design, 1076 OPs were divided into three groups based on previous reporting behaviour: precontemplators not considering reporting, contemplators considering reporting and actioners reporting occupational diseases. Precontemplators and contemplators were randomly assigned to receive stage-matched, stage-mismatched or general information. Actioners were randomly assigned to receive personalized or standardized feedback upon notification. Outcome measures were the number of OPs reporting and the number of reported occupational diseases in a 180-day period before and after the intervention. Precontemplators were significantly more often male and self-employed compared to contemplators and actioners. There was no significant effect of stage-matched information versus stage-mismatched or general information on the percentage of reporting OPs and on the mean number of notifications in each group. Receiving any information affected reporting more in contemplators than in precontemplators. The mean number of notifications in actioners increased more after personalized feedback than after standardized feedback, but the difference was not significant. This study supports the concept that contemplators are more susceptible to receiving information but could not confirm an effect of stage-matching this information on reporting occupational diseases to the national registry.

Crossref

Proceedings - University of Groningen

University of Groningen

Springer - Publisher Connector

ARTS repository - University of Groningen

PubMed Central

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Dissertations of the University of Groningen

Identifying hazardousness of sewer pipeline gas mixture using classification methods: a comparative study

Author: AS Weigend
Atal Chaudhuri
C Cantalini
C Cortes
C Wongchoosuk
CG Atkeson
D Lowe
D-S Lee
DH Wolpert
DR Cox
DS Simonton
DW Aha
E Llobet
F Esposito
H Baha
J Whorton
JJ Rodriguez
JR Quinlan
KD Mitzner
L Breiman
L Olshen
LI Kuncheva
LK Weaver
N Landwehr
Paramartha Dutta
R Polikar
RJ Lewis
SH Walker
TK Ho
Varun Kumar Ojha
VK Ojha
W So
Y Freund
Y Zhang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

In this work, we formulated a real-world problem related to sewer pipeline gas detection using classification-based approaches. The primary goal was to identify the hazardousness of a sewer pipeline so as to offer safe, non-hazardous access to sewer pipeline workers and avoid the human fatalities that occur due to toxic exposure to sewer gas components. The dataset, acquired through laboratory tests, experiments, and various literature sources, was organized to design a predictive model able to classify hazardous and non-hazardous situations in a sewer pipeline. To design such a prediction model, several classification algorithms were used and their performances were evaluated and compared, both empirically and statistically, over the collected dataset. In addition, the performances of several ensemble methods were analyzed to understand the extent of improvement offered by these methods. The results of this comprehensive study showed that the instance-based learning algorithm performed better than many other algorithms, such as the multilayer perceptron, radial basis function network, support vector machine, and reduced pruning tree. Similarly, it was observed that the multi-scheme ensemble approach enhanced the performance of the base predictors.

arXiv.org e-Print Archive

Central Archive at the University of Reading

Crossref

DSpace at VSB Technical University of Ostrava

Rational Design of Temperature-Sensitive Alleles Using Computational Structure Prediction

Author: B Cunningham
B Lee
C Cortes
Ca Rohl
Christopher S. Poultney
CJ Burges
David Gresham
Dennis E. Shasha
EH Kellogg
G Chakshusmathi
Glenn L. Butterfoss
HM Muller
JM Word
JR Quinlan
K Bajaj
K Drew
KD Pruitt
Kevin Drew
Kristin C. Gunsalus
M Hall
Michelle R. Gutwein
N Eswar
N Siew
R Varadarajan
Richard Bonneau
RJ Dohmen
S Tweedie
SF Altschul
TW Harris
Vladimir N. Uversky
WS Noble
WS Sandberg
Publication venue: Public Library of Science
Publication date: 02/09/2011

Temperature-sensitive (ts) mutations are mutations that exhibit a mutant phenotype at high or low temperatures and a wild-type phenotype at normal temperature. Temperature-sensitive mutants are valuable tools for geneticists, particularly in the study of essential genes. However, finding ts mutations typically relies on generating and screening many thousands of mutations, which is an expensive and labor-intensive process. Here we describe an in silico method that uses Rosetta and machine learning techniques to predict a highly accurate “top 5” list of ts mutations given the structure of a protein of interest. Rosetta is a protein structure prediction and design code, used here to model and score how proteins accommodate point mutations with side-chain and backbone movements. We show that integrating Rosetta relax-derived features with sequence-based features results in accurate temperature-sensitive mutation predictions

Public Library of Science (PLOS)

Crossref

PubMed Central

Chromosomal-level assembly of the Asian Seabass genome using long sequence reads and multi-layered scaffolding

Author: A Bairoch
A Christoffels
A Gurevich
A Kozomara
A McKenna
A Mitchell
A Morgulis
A Pradhan
A Reiner
A Rodriguez-Mari
A Stamatakis
A Yates
AI Makunin
AJ Enright
AL Price
Alan Christoffels
Aleksey Komissarov
Alexey Tupikin
Amy Hin Yan Tong
Andrey A. Yurchenko
AR Quinlan
B Langmead
B Star
C Berthelot
C Camacho
C Holt
C Wang
Chen-Shan Chin
CS Chin
D Brawand
D Ellinghaus
DA Benson
Darrell Green
DC Hardie
Dean R. Jerry
DH Alexander
Doreen Lau
DR Kelley
DRS-K C. Jerry
E Casacuberta
E. TG Staristina
EW Myers
F Abascal
F Chen
F Yang
FC Jones
FJ Krsticevic
Fritz J. Sedlazeck
G Abrusan
G Benson
G Lin
G Marcais
G Parra
G Tamazian
GH Yue
Gopikrishna Gopalapillai
Gregory W. Vurture
GS Slater
GT Valente
H Li
H Saiga
Heiner Kuhl
HH Kazazian Jr.
I Braasch
Inna S. Kuznetsova
IS Kuznetsova
J Castresana
J Eid
J Huerta-Cepas
J Jurka
J Lin
James P. Drake
JG Ruby
JN Volff
Jolly M. Saju
Jonas Korlach
JS Chew
Junhui Jiang
K Howe
K Katoh
K Prufer
Kathiresan Purushothaman
KD Pruitt
KJ Hoff
KP Koepfli
KW Tzung
Lawrence S. Hon
László Orbán
M Blanchette
M Kanehisa
M Kasahara
M Kolmogorov
M Krzywinski
M Martin
M Schartl
M Tarailo-Graovac
M Tine
MA Larkin
Mario Jonas
Marsel Kabilov
Matthew Boitano
MB Stocks
MG Grabherr
Michael C. Schatz
MJ Chaisson
MR Friedlander
N Siegel
Natascha M. Thevasagayam
NM Thevasagayam
O Jaillon
O Otero
P Cingolani
P Ravi
P Schattner
P Shannon
P Xu
Paul M. Richardson
PE Warburton
Peter Van Heusden
R Kajitani
R Lorenz
R Luo
R Moore
R Pethiyagoda
R Poulter
R She
R Sreenivasan
Ramkumar Lachumanan
RD Ward
Richard Hall
RJ Roberts
S Chen
S Guindon
S Hoegg
S Koren
S Vij
S Zhou
Sai Rama Sridatta Prakki
Sarah Mwangi
SF Altschul
Shubha Vij
Si Lok
Si Yan Ngoh
Siddharth Singh
Simon Moxon
SM Kielbasa
Sridhar Sivasubbu
Stanley Kimbung Mbandi
Stephen J. O'Brien
Stephen W. Turner
T Anantharaman
Tamás Dalmay
Tansyn H. Noble
TD Wu
TF DeLuca
TH O'Hare
TLO Davis
TS Anantharaman
Tyler Garvin
U Consortium
U Grimholt
V Douard
V Ravi
Vinaya Kumar Katneni
Vinod Scaria
Vladimir Trifonov
W Xue
WC Liew
Woei Chang Liew
WS Davidson
X Huang
X Zheng
XG Wang
Xueyan Shen
Y Guiguen
Y Han
Y Hashiguchi
Y Moriya
Y Sato
Z Lai
Ø Hammer
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2016

We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping along with shared synteny from closely related fish species to derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50 size over 25 Mb that span ~90% of the genome. The population structure of L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species' native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics
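
The contig and scaffold N50 values quoted in this abstract follow the standard assembly-contiguity definition: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length. The sketch below illustrates that definition only. The function name `n50` and the toy lengths are assumptions made for illustration and are not taken from the paper or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Hypothetical helper for illustration; not the authors' code.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first sequence that brings coverage to >= 50% of the total.
        if 2 * running >= total:
            return length


# Invented toy contig lengths (arbitrary units), not data from the assembly.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Because N50 depends only on the length distribution, it measures contiguity and says nothing about the correctness of the assembled sequence.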

Public Library of Science (PLOS)

ResearchOnline@JCU

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Directory of Open Access Journals

ResearchOnline at James Cook University

PubMed Central

Research Repository

Repository of the Academy's Library

University of East Anglia digital repository

NSU Works

MPG.PuRe

Intermediate DNA methylation is a conserved signature of genome regulation

Author: A Chess
A Doi
A Visel
A Wutz
AC Ferguson-Smith
AE Jaffe
AK Maunakea
AK Shalek
AR Quinlan
B Zhang
BE Bernstein
C Grunau
D Kitsberg
EP Consortium
F Fang
F Mohn
F Tang
GC Hon
H Cedar
J Ernst
J Gertz
JT Bell
K Kerkel
KD Hansen
LC Schalkwyk
M Okano
M Stevens
M Xie
MB Stadler
MJ Ziller
ML Maeder
P Liang
PB Gupta
Q Deng
R Lister
RA Harris
SR Romanov
ST Sherry
T Wang
W Xie
WG Chen
Y Bergman
Y Chen
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

The role of intermediate methylation states in DNA is unclear. Here, to comprehensively identify regions of intermediate methylation and their quantitative relationship with gene activity, we apply integrative and comparative epigenomics to 25 human primary cell and tissue samples. We report 18,452 intermediate methylation regions located near 36% of genes and enriched at enhancers, exons and DNase I hypersensitivity sites. Intermediate methylation regions average 57% methylation, are predominantly allele-independent and are conserved across individuals and between mouse and human, suggesting a conserved function. These regions have an intermediate level of active chromatin marks and their associated genes have intermediate transcriptional activity. Exonic intermediate methylation correlates with exon inclusion at a level between that of fully methylated and unmethylated exons, highlighting gene context-dependent functions. We conclude that intermediate DNA methylation is a conserved signature of gene regulation and exon usage.

University of Essex Research Repository

King's Research Portal

Global gene disruption in human cells to assign genes to phenotypes

Author: AR Quinlan
B Langmead
B Rhead
Bingbing Yuan
Carla P Guimaraes
Chong Sun
D Gommel
D Nesic
DA Scott
E Bergeret
F Bao
George Bell
Hidde L Ploegh
Irene Wuethrich
Jan E Carette
JD Gawronski
JE Carette
KD Pruitt
M Lara-Tejero
M Steegmaier
Malini Varadarajan
Markus K Muellner
MF van Delft
P Hauck
P Mazurkiewicz
R Janz
RL Momparler
S Dröse
Sebastian M Nijman
T Brady
T Higo
T Oltersdorf
T van Opijnen
Thijn R Brummelkamp
Vincent A Blomen
YJ Qin
Z Ge
Publication venue: Springer Science and Business Media LLC
Publication date: 01/11/2010

Insertional mutagenesis in a haploid background can disrupt gene function [1]. We extend our earlier work by using a retroviral gene-trap vector to generate insertions in >98% of the genes expressed in a human cancer cell line that is haploid for all but one of its chromosomes. We apply phenotypic interrogation via tag sequencing (PhITSeq) to examine millions of mutant alleles through selection and parallel sequencing. Analysis of pools of cells, rather than individual clones [1], enables rapid assessment of the spectrum of genes involved in the phenotypes under study. This facilitates comparative screens as illustrated here for the family of cytolethal distending toxins (CDTs). CDTs are virulence factors secreted by a variety of pathogenic Gram-negative bacteria responsible for tissue damage at distinct anatomical sites [2]. We identify 743 mutations distributed over 12 human genes important for intoxication by four different CDTs. Although related CDTs may share host factors, they also exploit unique host factors to yield a profile characteristic for each CDT.

DSpace@MIT

Crossref

PubMed Central

Oxford University Research Archive

Chromatin States Accurately Classify Cell Differentiation Stages

Author: A Barski
A Magklara
AR Quinlan
B Wen
BD Strahl
BE Bernstein
C Chang
C Cortes
D Hebenstreit
D Huangfu
DB Allison
E Birney
F Mohn
G Wei
Guo-Cheng Yuan
H Ji
I Ben-Porath
J Ernst
J Kim
J Massague
Jessica L. Larson
JL Larson
K Pollard
KD Pruitt
LA Boyer
MD Young
ND Heintzman
PA Jones
R Durbin
R Lister
R Tibshirani
RD Hawkins
RE Thurman
RL Jirtle
S Keerthi
Simon Keith Whitehall
T Kouzarides
TS Mikkelsen
W Huang da
Y Benjamini
Publication venue: Public Library of Science
Publication date: 01/01/2012

Gene expression is controlled by the concerted interactions between transcription factors and chromatin regulators. While recent studies have identified global chromatin state changes across cell types, it remains unclear to what extent these changes are co-regulated during cell differentiation. Here we present a comprehensive computational analysis by assembling a large dataset containing genome-wide occupancy information of 5 histone modifications in 27 human cell lines (including 24 normal and 3 cancer cell lines) obtained from the public domain, followed by independent analysis at three different representations. We classified the differentiation stage of a cell type based on its genome-wide pattern of chromatin states, and found that our method was able to identify normal cell lines with nearly 100% accuracy. We then applied our model to classify the cancer cell lines and found that each can be unequivocally classified as differentiated cells. The differences can be in part explained by the differential activities of three regulatory modules associated with embryonic stem cells. We also found that the “hotspot” genes, whose chromatin states change dynamically in accordance with the differentiation stage, are not randomly distributed across the genome but tend to be embedded in multi-gene chromatin domains, and that specialized gene clusters tend to be embedded in stably occupied domains.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Harvard University - DASH

PubMed Central

FigShare