60 research outputs found

Icing: Large-scale inference of immunoglobulin clonotypes

Author: DK Ralph
E Alamyar
EB Fowlkes
EP Rock
J Glanville
JA Vander Heiden
NT Gupta
RW Hamming
SH Kleinstein
TF Smith
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Immunoglobulin (IG) clonotype identification is a fundamental open question in modern immunology. An accurate description of the IG repertoire is crucial to understand the variety within the immune system of an individual, potentially shedding light on the pathogenetic process. Intrinsic IG heterogeneity makes clonotype inference an extremely challenging task, both from a computational and a biological point of view. Here we present icing, a framework that can reconstruct clonal families even when sequences are highly mutated. icing has a modular structure and is designed for large datasets produced by next-generation sequencing (NGS), a technology that allows the characterisation of large-scale IG repertoires. We extensively validated the framework on a simulated case using clustering performance metrics. icing is implemented in Python, and it is publicly available under the FreeBSD licence at https://github.com/slipguru/icing

Crossref

Archivio istituzionale della ricerca - Università di Genova

Ranked Adjusted Rand: integrating distance and partition information in a measure of clustering agreement

Author: A Thalamuthu
B Larsen
C Silva-Costa
C Silva-Costa
D Steinley
DL Wallace
EB Fowlkes
FJ Rohlf
Francisco R Pinto
FX Wu
GW Milligan
H Chipman
H Li
HL Kundel
I Serrano
JA Carrico
JA Carrico
Jonas S Almeida
João A Carriço
L Hubert
M Meila
Mário Ramirez
PH Sneath
S van Dongen
WM Rand
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Biological information is commonly used to cluster or classify entities of interest such as genes, conditions, species or samples. However, different sources of data can be used to classify the same set of entities, and methods that compare the performance of two data sources, or determine how well a given classification agrees with another, are frequently needed, especially in the absence of a universally accepted "gold standard" classification.
RESULTS: Here, we describe a novel measure – the Ranked Adjusted Rand (RAR) index. RAR differs from existing methods by evaluating the extent of agreement between any two groupings, taking into account the intercluster distances. This characteristic is relevant to evaluate cases of pairs of entities grouped in the same cluster by one method and separated by another. The latter method may assign them to close neighbour clusters or, on the contrary, to clusters that are far apart from each other. RAR is applicable even when intercluster distance information is absent for both or one of the groupings. In the first case, RAR is equal to its predecessor, the Adjusted Rand (HA) index. Artificially designed clusterings were used to demonstrate situations in which only RAR was able to detect differences in the grouping patterns. A study with larger simulated clusterings ensured that in realistic conditions, RAR is effectively integrating distance and partition information. The new method was applied to biological examples to compare 1) two microbial typing methods, 2) two gene regulatory network distances and 3) microarray gene expression data with pathway information. In the first application, one of the methods does not provide intercluster distances while the other produced a hierarchical clustering. RAR proved to be more sensitive than HA in the choice of a threshold for defining clusters in the hierarchical method that maximizes agreement between the results of both methods.
CONCLUSION: RAR has its major advantage in combining cluster distance and partition information, while the previously available methods used only the latter. RAR should be used in the research problems where HA was previously used, because in the absence of intercluster distance effects it is an equally effective measure, and in the presence of distance effects it is a more complete one
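
For readers unfamiliar with the baseline that RAR extends, the following is a minimal sketch of the Adjusted Rand (HA) index, using the standard Hubert and Arabie pair-counting formula. It is not the RAR index itself, which additionally uses intercluster distances that the abstract does not specify. The function name `adjusted_rand` and the convention of returning 1.0 for degenerate inputs are assumptions made for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand(labels_a, labels_b):
    """Adjusted Rand (HA) index between two partitions of the same items.

    labels_a[i] and labels_b[i] are the cluster labels of item i in each
    partition. Hypothetical helper name, written for illustration only.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same items")
    n = len(labels_a)
    if n < 2:
        return 1.0  # assumption: no pairs to disagree on

    total_pairs = comb(n, 2)
    # Pairs placed together in both partitions (contingency table cells).
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together within each partition separately.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs  # chance agreement
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # assumption: both partitions are trivial and identical
    return (together_both - expected) / (maximum - expected)
```

Cluster labels only need to be hashable, and relabelling the clusters does not change the result, so `adjusted_rand([0, 0, 1, 1], [1, 1, 0, 0])` scores perfect agreement.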

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Evaluation of Jackknife and Bootstrap for Defining Confidence Intervals for Pairwise Agreement Measures

Author: A Thalamuthu
AC Shore
AN Albatineh
Ana Severiano
B Efron
B Efron
B Mirkin
D. Ashley Robinson
DL Wallace
DS Smyth
EB Fowlkes
EP Smith
Fabio Rapallo
FR Pinto
FR Pinto
Francisco R. Pinto
G Cagney
GA Price
JA Carriço
JF Heltshe
JF Heltshe
JJ Hellmann
João A. Carriço
L Hubert
Mário Ramirez
NA Faria
NH Zaiss
P Jaccard
R Newson
S Zahl
W Smith
WM Rand
Publication venue: Public Library of Science
Publication date: 01/01/2011

Several research fields frequently deal with the analysis of diverse classification results of the same entities. This should imply an objective detection of overlaps and divergences between the formed clusters. The congruence between classifications can be quantified by clustering agreement measures, including pairwise agreement measures. Several measures have been proposed and the importance of obtaining confidence intervals for the point estimate in the comparison of these measures has been highlighted. A broad range of methods can be used for the estimation of confidence intervals. However, evidence is lacking about what are the appropriate methods for the calculation of confidence intervals for most clustering agreement measures. Here we evaluate the resampling techniques of bootstrap and jackknife for the calculation of the confidence intervals for clustering agreement measures. Contrary to what has been shown for some statistics, simulations showed that the jackknife performs better than the bootstrap at accurately estimating confidence intervals for pairwise agreement measures, especially when the agreement between partitions is low. The coverage of the jackknife confidence interval is robust to changes in cluster number and cluster size distribution
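
As an illustration of the resampling idea, the sketch below computes a leave-one-out jackknife confidence interval for a pairwise agreement measure. The abstract does not give the authors' exact procedure, so several choices here are assumptions: the plain Rand index as the example measure, the pseudo-value form of the jackknife, the normal quantile `z = 1.96` for a nominal 95% interval, and the helper names `rand_index` and `jackknife_ci`.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which the two partitions agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agreeing = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreeing / len(pairs)


def jackknife_ci(labels_a, labels_b, statistic=rand_index, z=1.96):
    """Jackknife estimate and confidence interval via pseudo-values.

    Returns (estimate, lower, upper). Needs at least 3 items so that every
    leave-one-out sample still contains a pair.
    """
    a, b = list(labels_a), list(labels_b)
    n = len(a)
    if n < 3 or n != len(b):
        raise ValueError("need two partitions of the same 3 or more items")

    full = statistic(a, b)
    pseudo_values = []
    for i in range(n):
        # Recompute the statistic with item i removed from both partitions.
        left_out = statistic(a[:i] + a[i + 1:], b[:i] + b[i + 1:])
        pseudo_values.append(n * full - (n - 1) * left_out)

    estimate = mean(pseudo_values)
    std_error = stdev(pseudo_values) / sqrt(n)
    return estimate, estimate - z * std_error, estimate + z * std_error
```

The interval is not clipped, so for very high or very low agreement its bounds can fall outside the range of the measure; whether to truncate them is a separate design decision.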

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Extracting Patterns from Educational Traces via Clustering and Associated Quality Metrics

Author: A Patrikainen
AP Dempster
B Mirkin
BFA Hompes
DA Jackson
DL Wallace
EB Fowlkes
I Jugo
KR Koedinger
L Kaufman
M Hall
M Meilă
PHA Sneath
PJ Rousseeuw
S Dasgupta
WM Rand
Publication venue: 'Springer Science and Business Media LLC'
Publication date

Crossref

Lazy Lasso for local regression

Author: A Hoerl
AS Fotheringham
B Efron
C Loader
CL Mallows
Concha Bielza
D Donoho
D Ruppert
DC Wheeler
Diego Vidaurre
DM Allen
EB Fowlkes
F Ferraty
F Ferraty
F Ferraty
GAF Seber
H Wang
H Zou
H Zou
J Barrientos-Marin
J Fan
J Fan
J Lafferty
J Ramsay
JA Khan
JP Jones
K Knight
L Breiman
L Grosenick
N Meinshausen
P Larrañaga
P Zhao
Pedro Larrañaga
R Tibshirani
RE Kass
S Ma
S Weisberg
SD Foster
SJ Devlin
T Hastie
T Hesterberg
WS Cleveland
WS Cleveland
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Locally weighted regression is a technique that predicts the response for new data items from their neighbors in the training data set, where closer data items are assigned higher weights in the prediction. However, the original method may suffer from overfitting and fail to select the relevant variables. In this paper we propose combining a regularization approach with locally weighted regression to achieve sparse models. Specifically, the lasso is a shrinkage and selection method for linear regression. We present an algorithm that embeds lasso in an iterative procedure that alternately computes weights and performs lasso-wise regression. The algorithm is tested on three synthetic scenarios and two real data sets. Results show that the proposed method outperforms linear and local models for several kinds of scenarios
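
The alternation between weighting and lasso fitting can be sketched roughly as follows. This is not the authors' algorithm, whose details the abstract does not give. It is a minimal illustration under stated assumptions: a Gaussian kernel with a fixed `bandwidth`, a weighted lasso solved by plain coordinate descent, and a re-weighting rule that recomputes distances over only the features the previous lasso fit kept. All function and parameter names are hypothetical.

```python
from math import exp


def soft_threshold(value, lam):
    """Shrink value toward zero by lam (the lasso proximal step)."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def kernel_weights(X, x0, bandwidth, active=None):
    """Gaussian kernel weights from distances over the active features."""
    features = range(len(x0)) if active is None else active
    weights = []
    for row in X:
        dist2 = sum((row[j] - x0[j]) ** 2 for j in features)
        weights.append(exp(-dist2 / (2 * bandwidth ** 2)))
    return weights


def weighted_lasso(X, y, w, lam, n_iter=200):
    """Minimise 0.5 * sum_i w_i * (y_i - b0 - x_i . beta)^2 + lam * |beta|_1."""
    n, p = len(X), len(X[0])
    beta, intercept = [0.0] * p, 0.0
    for _ in range(n_iter):
        fitted = [sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
        intercept = sum(w[i] * (y[i] - fitted[i]) for i in range(n)) / sum(w)
        for j in range(p):
            # Correlation of feature j with the residual that excludes it.
            rho = sum(
                w[i] * X[i][j]
                * (y[i] - intercept
                   - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            )
            scale = sum(w[i] * X[i][j] ** 2 for i in range(n))
            beta[j] = soft_threshold(rho, lam) / scale if scale > 0 else 0.0
    return intercept, beta


def local_lasso_predict(X, y, x0, bandwidth=1.0, lam=0.1, rounds=3):
    """Alternate between computing weights and fitting a weighted lasso."""
    active = None  # first round: distances use every feature
    for _ in range(rounds):
        w = kernel_weights(X, x0, bandwidth, active)
        intercept, beta = weighted_lasso(X, y, w, lam)
        selected = [j for j, b in enumerate(beta) if b != 0.0]
        if not selected or selected == active:
            break  # nothing selected, or the selection has stabilised
        active = selected
    prediction = intercept + sum(b * v for b, v in zip(beta, x0))
    return prediction, beta
```

A possible use, with one informative and one irrelevant feature:

```python
X = [[a, b] for a in (-2, -1, 0, 1, 2) for b in (-1, 0, 1)]
y = [2 * row[0] for row in X]
prediction, beta = local_lasso_predict(X, y, [0.5, 0.0])
```

Here the coefficient of the second feature is shrunk to zero, so the second round measures neighbourhood distances along the first feature only.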

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Oxford University Research Archive

Archivo Digital UPM

How to find simple and accurate rules for viral protease cleavage specificities

Author: A Grakoui
A Kontijevskis
A Narayanan
A Urbani
AA Kolykhalov
AD Kwong
B Keil
BA Malcolm
BE Turk
BM Dunn
C Howson
CM Overall
Daniel Garwicz
DJC MacKay
E Berry
EB Fowlkes
H Eizert
H Neurath
HB Shen
I Schechter
Ian Jarman
IH Jarman
J Shi
JK Stoller
K Fujikawa
K Li
L You
L You
Liwen You
MR Attwood
NA Thornberry
O Schilling
Paulo JG Lisboa
R Bartenschlager
R Bartenschlager
R Rönn
R Zhang
RA Poorman
RE Stauber
SC Pettit
SH Yang
SM Best
SS Leinbach
SY Kim
T Rögnvaldsson
TA Etchells
Terence A Etchells
Thorsteinn Rögnvaldsson
X Hou
YH Kou
ZR Yang
ZR Yang
ZR Yang
ZR Yang
Publication venue: BioMed Central
Publication date: 01/01/2009

BACKGROUND: Proteases of human pathogens are becoming increasingly important drug targets, hence it is necessary to understand their substrate specificity and to interpret this knowledge in practically useful ways. New methods are being developed that produce large amounts of cleavage information for individual proteases and some have been applied to extract cleavage rules from data. However, the hitherto proposed methods for extracting rules have been neither easy to understand nor very accurate. To be practically useful, cleavage rules should be accurate, compact, and expressed in an easily understandable way.
RESULTS: A new method is presented for producing cleavage rules for viral proteases with seemingly complex cleavage profiles. The method is based on orthogonal search-based rule extraction (OSRE) combined with spectral clustering. It is demonstrated on substrate data sets for human immunodeficiency virus type 1 (HIV-1) protease and hepatitis C (HCV) NS3/4A protease, showing excellent prediction performance for both HIV-1 cleavage and HCV NS3/4A cleavage, agreeing with observed HCV genotype differences. New cleavage rules (consensus sequences) are suggested for HIV-1 and HCV NS3/4A cleavages. The practical usability of the method is also demonstrated by using it to predict the location of an internal cleavage site in the HCV NS3 protease and to correct the location of a previously reported internal cleavage site in the HCV NS3 protease. The method is fast to converge and yields accurate rules, on par with previous results for HIV-1 protease and better than previous state-of-the-art for HCV NS3/4A protease. Moreover, the rules are fewer and simpler than previously obtained with rule extraction methods.
CONCLUSION: A rule extraction methodology by searching for multivariate low-order predicates yields results that significantly outperform existing rule bases on out-of-sample data and are also more transparent to expert users. The approach yields rules that are easy to use and useful for interpreting experimental data.

Lund University Publications

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Högskolebiblioteket i Halmstad Publikationer

An Overall Index for Comparing Hierarchical Clusterings

Author: AD Gordon
AM Krieger
AN Albatineh
C Reilly
D Steinley
DL Wallace
EB Fowlkes
EB Fowlkes
FB Baker
FJ Lapointe
FJ Rohlf
G Youness
L Denoeud
LJ Hubert
M Meila
MJ Brusco
MJ Warrens
R Fraiman
RR Sokal
RR Sokal
S Zani
WM Rand
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

In this paper we suggest a new index for measuring the distance between two hierarchical clusterings. This index can be decomposed into the contributions pertaining to each stage of the hierarchies. We show the relations of such components with the currently used criteria for comparing two partitions. We obtain a similarity index as the complement to one of the suggested distance and we propose its adjustment for agreement due to chance. We consider the extension of the proposed distance and similarity measures to more than two dendrograms and their use for the consensus of classification and variable selection in cluster analysis

Crossref

Archivio istituzionale della Ricerca - Università degli Studi di Parma

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Clustering Methods

Author: A Ben-Hur
A Ng
EB Fowlkes
J Shi
JC Bezdek
NR Pal
R Sharan
SC Madeira
SV Dongen
UV Luxburg
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Crossref

Healthcare staff's experience of encounters with patients with obesity (Sjukvårdspersonalens upplevelse i mötet med patienter med fetma)

Author: A Hinneburg
C Domeniconi
C Merkwirth
CB Barber
DW Scott
EB Fowlkes
F Godtliebsen
F Kowalewski
FA Hamprecht
HH Bock
M Ester
R Wehrens
Y Cheng
Publication venue: Blekinge Tekniska Högskola, Institutionen för hälsa
Publication date: 01/01/2003

Background: As obesity increases worldwide, healthcare staff need to be prepared to meet these patients in care settings. An encounter involves continuous communication, which in turn has a major bearing on relationships, since it is a reciprocal process. Further research is needed to evaluate how a patient's weight affects the encounter with healthcare staff. It was therefore of interest to investigate this area further. Aim: To describe healthcare staff's experiences of encounters with patients with obesity. Method: The method was a literature study with a qualitative approach, based on seven scientific articles. The analysis followed Graneheim and Lundman's interpretation of qualitative content analysis. Results: The analysis yielded three categories that describe the study's results: feeling compassion, feeling frustration, and feeling worry. Conclusion: Healthcare staff's experience of encounters with patients with obesity is largely based on preconceptions. As obesity becomes a growing problem worldwide, healthcare staff may come to meet these patients more often in their work. Further research is therefore needed to give healthcare staff a secure footing ahead of encounters with patients with obesity. Only then can obesity be combated

Crossref

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Validating Dialect Comparison Methods

Author: A Weijnen
AK Jain
B Kessler
C Hoppenbrouwers
EB Fowlkes
H Goebl
H Goebl
J Daan
J Goossens
J Nerbonne
J Séguy
LJ Hubert
R Schalkoff
WM Rand
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2002

Crossref