172 research outputs found
Numerical Implementation of lepton-nucleus interactions and its effect on neutrino oscillation analysis
We discuss the implementation of the nuclear model based on realistic nuclear
spectral functions in the GENIE neutrino interaction generator. Besides
improving on the Fermi gas description of the nuclear ground state, our scheme
involves a new prescription for selection, meant to efficiently enforce
energy-momentum conservation. The results of our simulations, validated through
comparison to electron scattering data, have been obtained for a variety of
target nuclei, ranging from carbon to argon, and cover the kinematical region
in which quasi-elastic scattering is the dominant reaction mechanism. We also
analyse the influence of the adopted nuclear model on the determination of
neutrino oscillation parameters.

Comment: 19 pages, 35 figures, version accepted by Phys. Rev.
Automated data pre-processing via meta-learning
The final publication is available at link.springer.com

A data mining algorithm may perform differently on datasets with different characteristics; e.g., it might perform better on a dataset with continuous attributes than on one with categorical attributes, or the other way around.
As a matter of fact, a dataset usually needs to be pre-processed. Taking into account all the possible pre-processing operators, there exists a staggeringly large number of alternatives, and inexperienced users become overwhelmed.
We show that this problem can be addressed by an automated approach, leveraging ideas from meta-learning.
Specifically, we consider a wide range of data pre-processing techniques and a set of data mining algorithms. For each data mining algorithm and selected dataset, we are able to predict the transformations that improve the result
of the algorithm on the respective dataset. Our approach will help non-expert users to more effectively identify the transformations appropriate to their applications, and hence to achieve improved results.

Peer Reviewed. Postprint (published version).
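The core idea above — predicting useful pre-processing transformations from dataset characteristics — can be sketched as a nearest-neighbour lookup over dataset meta-features. This is an illustrative sketch only: the meta-features, the toy knowledge base, and all names below are hypothetical, not the authors' implementation.

```python
import math

# Toy meta-feature extractor: summarise a dataset by a few simple
# characteristics (a "meta-feature" vector).
def meta_features(n_rows, n_numeric, n_categorical, missing_rate):
    total = n_numeric + n_categorical
    return (
        math.log10(max(n_rows, 1)),   # dataset size (log scale)
        n_categorical / total,        # fraction of categorical attributes
        missing_rate,                 # fraction of missing values
    )

# Hypothetical knowledge base: meta-features of past datasets, paired with
# the pre-processing that most improved a given mining algorithm there.
KNOWLEDGE_BASE = [
    (meta_features(100_000, 20, 0, 0.00), "standardise"),
    (meta_features(500, 2, 18, 0.00),     "one-hot-encode"),
    (meta_features(2_000, 10, 5, 0.30),   "impute-missing"),
]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recommend(mf):
    """Recommend the transformation used on the most similar past dataset."""
    return min(KNOWLEDGE_BASE, key=lambda kb: euclidean(kb[0], mf))[1]

# A new, small, categorical-heavy dataset resembles the second entry:
print(recommend(meta_features(800, 3, 15, 0.01)))  # → one-hot-encode
```

A real meta-learning system would use many more meta-features and a trained model instead of a single nearest neighbour, but the lookup structure is the same.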
Conditional Neural Relational Inference for Interacting Systems
In this work, we want to learn to model the dynamics of similar yet distinct
groups of interacting objects. These groups follow common physical laws, yet
exhibit specificities that are captured through a vectorial description. We
develop a model that allows us to do conditional generation for any such group
given its vectorial description. Unlike previous work on learning dynamical
systems, which can only do trajectory completion and requires part of the
trajectory dynamics to be provided as input at generation time, we generate
using only the conditioning vector, with no access to trajectories at
generation time. We evaluate our model in the setting of modelling human gait
and, in particular, pathological human gait.
Determining appropriate approaches for using data in feature selection
Feature selection is increasingly important in data analysis and machine learning in the big data era. However, how to use the data in feature selection, i.e. using either ALL or PART of a dataset, has become a serious and tricky issue. Whilst the conventional practice of using all the data in feature selection may lead to selection bias, using part of the data may, on the other hand, lead to underestimating the relevant features under some conditions. This paper investigates these two strategies systematically in terms of reliability and effectiveness, and then determines their suitability for datasets with different characteristics. The reliability is measured by the Average Tanimoto Index and the Inter-method Average Tanimoto Index, and the effectiveness is measured by the mean generalisation accuracy of classification. The computational experiments are carried out on ten real-world benchmark datasets and fourteen synthetic datasets. The synthetic datasets are generated with a pre-set number of relevant features and varied numbers of irrelevant features and instances, and with different levels of added noise. The results indicate that the PART approach is more effective in reducing the bias when the size of a dataset is small, but starts to lose its advantage as the dataset size increases.
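The Average Tanimoto Index used above measures how consistently a feature selector picks the same subset across repeated runs (e.g. resamples). A minimal sketch follows; the paper's exact definition may differ in details such as normalisation:

```python
from itertools import combinations

def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature subsets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def average_tanimoto_index(subsets):
    """Mean pairwise Tanimoto similarity over the feature subsets selected
    on different runs: 1.0 means perfectly stable selection."""
    pairs = list(combinations(subsets, 2))
    return sum(tanimoto(a, b) for a, b in pairs) / len(pairs)

# Feature subsets selected on three resamples (hypothetical):
runs = [{"f1", "f2", "f3"}, {"f1", "f2", "f4"}, {"f1", "f2", "f3"}]
print(average_tanimoto_index(runs))  # ≈ 0.667
```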
Adjusted Measures for Feature Selection Stability for Data Sets with Similar Features
For data sets with similar features, for example highly correlated features,
most existing stability measures behave in an undesired way: They consider
features that are almost identical but have different identifiers as different
features. Existing adjusted stability measures, that is, stability measures
that take into account the similarities between features, have major
theoretical drawbacks. We introduce new adjusted stability measures that
overcome these drawbacks. We compare them to each other and to existing
stability measures based on both artificial and real sets of selected features.
Based on the results, we suggest using one new stability measure that considers
highly similar features as exchangeable.
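One way to make a stability measure treat highly similar features as exchangeable, as the abstract suggests, is to collapse each feature to a representative of its similarity group before comparing selections. The sketch below is an illustrative adjustment of plain Jaccard similarity, not the specific measure introduced in the paper:

```python
def adjusted_similarity(selected_a, selected_b, groups):
    """Jaccard similarity after mapping each feature to its
    similarity-group key, so that picking either of two nearly
    identical features counts as the same choice."""
    rep = {f: g for g, members in groups.items() for f in members}
    a = {rep.get(f, f) for f in selected_a}
    b = {rep.get(f, f) for f in selected_b}
    return len(a & b) / len(a | b)

# Hypothetical groups of highly correlated features:
groups = {"g1": {"height_cm", "height_in"},
          "g2": {"weight_kg", "weight_lb"}}

# Plain set overlap would call these two selections disjoint;
# the adjusted measure sees them as identical.
print(adjusted_similarity({"height_cm", "weight_kg"},
                          {"height_in", "weight_lb"}, groups))  # → 1.0
```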
Addressing the Challenge of Defining Valid Proteomic Biomarkers and Classifiers
Background: The purpose of this manuscript is to provide, based on an extensive analysis of a proteomic data set, suggestions for proper statistical analysis for the discovery of sets of clinically relevant biomarkers. As a tractable example, we define the measurable proteomic differences between apparently healthy adult males and females. We chose urine as the body fluid of interest and CE-MS, a thoroughly validated platform technology allowing for routine analysis of a large number of samples. The second urine of the morning was collected from apparently healthy male and female volunteers (aged 21-40) in the course of the routine medical check-up before recruitment at the Hannover Medical School.

Results: We found that the Wilcoxon test is best suited for the definition of potential biomarkers. Adjustment for multiple testing is necessary. Sample size estimation can be performed based on a small number of observations via resampling from pilot data. Machine learning algorithms appear ideally suited to generate classifiers. Assessment of any results in an independent test set is essential.

Conclusions: Valid proteomic biomarkers for diagnosis and prognosis can only be defined by applying proper statistical data mining procedures. In particular, a justification of the sample size should be part of the study design.
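The point that adjustment for multiple testing is necessary is commonly implemented with the Benjamini-Hochberg procedure for false-discovery-rate control; the abstract does not name the specific correction used, so the sketch below is one standard choice, not necessarily the authors':

```python
def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values: each raw p-value is scaled
    by n/rank (rank in ascending order), then made monotone by taking a
    running minimum from the largest rank downwards."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank_from_end, i in enumerate(reversed(order)):
        rank = n - rank_from_end          # 1-based ascending rank of pvals[i]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# One raw p-value per candidate biomarker (hypothetical values):
print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041]))
# → [0.004, 0.016, 0.041, 0.041]
```

Candidate biomarkers would then be kept only if their adjusted p-value stays below the chosen FDR threshold.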
Algebraic Comparison of Partial Lists in Bioinformatics
The outcome of a functional genomics pipeline is usually a partial list of
genomic features, ranked by their relevance in modelling biological phenotype
in terms of a classification or regression model. Due to resampling protocols
or just within a meta-analysis comparison, instead of one list it is often the
case that sets of alternative feature lists (possibly of different lengths) are
obtained. Here we introduce a method, based on the algebraic theory of
symmetric groups, for studying the variability between lists ("list stability")
in the case of lists of unequal length. We provide algorithms evaluating
stability for lists embedded in the full feature set or just limited to the
features occurring in the partial lists. The method is demonstrated first on
synthetic data in a gene filtering task and then for finding gene profiles on a
recent prostate cancer dataset.
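The practical problem above — comparing ranked feature lists of unequal length — can be sketched by completing each partial list to a full ranking (unlisted features share a tied bottom rank) and then measuring a distance between rank vectors. The completion rule and the Canberra distance below are illustrative assumptions in the spirit of the method; the paper develops the theory rigorously via symmetric groups:

```python
def complete_ranking(partial, universe):
    """Turn a partial ranked list into a full rank vector over the
    (sorted) universe: listed features keep their rank; all unlisted
    features share the average of the remaining ranks k+1..n."""
    k, n = len(partial), len(universe)
    tail = (k + 1 + n) / 2
    rank = {f: i + 1 for i, f in enumerate(partial)}
    return [rank.get(f, tail) for f in sorted(universe)]

def canberra(x, y):
    """Canberra distance between two rank vectors."""
    return sum(abs(a - b) / (a + b) for a, b in zip(x, y))

universe = {"g1", "g2", "g3", "g4", "g5"}
list_a = ["g1", "g2", "g3"]      # two partial lists of unequal length
list_b = ["g2", "g1"]
d = canberra(complete_ranking(list_a, universe),
             complete_ranking(list_b, universe))
print(round(d, 3))  # → 0.927
```

Averaging such pairwise distances over a set of lists gives a single list-stability score, lower meaning more stable.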
Measurement of cosmic-ray reconstruction efficiencies in the MicroBooNE LArTPC using a small external cosmic-ray counter
The MicroBooNE detector is a liquid argon time projection chamber at Fermilab
designed to study short-baseline neutrino oscillations and neutrino-argon
interaction cross sections. Due to its location near the surface, a good
understanding of cosmic muons as a source of backgrounds is of fundamental
importance for the experiment. We present a method of using an external 0.5 m
(L) x 0.5 m (W) muon counter stack, installed above the main detector, to
determine the cosmic-ray reconstruction efficiency in MicroBooNE. Data are
acquired with this external muon counter stack placed in three different
positions, corresponding to cosmic rays intersecting different parts of the
detector. The data reconstruction efficiency of tracks in the detector is found
to be , in good agreement with the Monte Carlo reconstruction
efficiency . This analysis represents
a small-scale demonstration of the method that can be used with future data
coming from a recently installed cosmic-ray tagger system, which will be able
to tag of the cosmic rays passing through the MicroBooNE
detector.

Comment: 19 pages, 12 figures