Proof-Pattern Recognition and Lemma Discovery in ACL2

Author: B. Buchberger
D. Kühlwein
E. Komendantskaya
J. Denzinger
J. Urban
K. Claessen
M. Johansson
O. Montano-Rivas
R.L. McCasland
S. Colton
S. Hetzl
Publication date: 01/01/2013

We present a novel technique for combining statistical machine learning for proof-pattern recognition with symbolic methods for lemma discovery. The resulting tool, ACL2(ml), gathers proof statistics and uses statistical pattern recognition to pre-process data from libraries, and then suggests auxiliary lemmas in new proofs by analogy with previously seen examples. This paper presents the implementation of ACL2(ml) alongside theoretical descriptions of the proof-pattern recognition and lemma discovery methods involved in it.

arXiv.org e-Print Archive

Crossref

Heriot Watt Pure

Chalmers Research

Discovery Research Portal

Swepub

Novel statistical approaches for non-normal censored immunological data: analysis of cytokine and gene expression data

Author: A Arabmazar
Andrew Rowland Dalby
Anna Lluis
B Genser
B Schaub
Bianca Schaub
CA Heid
EL Kaplan
Erika von Mutius
F Wilcoxon
HW Uh
J Tobin
JH Lubin
KE Hobbs
KH Brodersen
KJ Livak
Nikolaus Ballenberger
P Austin
R Peto
RL Iman
RL Prentice
Sabina Illi
TJ Buckley
WH Kruskal
WJ Conover
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2012

Background: For several immune-mediated diseases, immunological analysis will become more complex in the future, with datasets in which cytokine and gene expression data play a major role. These data have characteristics that require sophisticated statistical analysis, such as strategies for non-normal distributions and censoring. Additionally, complex and multiple immunological relationships need to be adjusted for potential confounding and interaction effects. Objective: We aimed to introduce and apply different methods for statistical analysis of non-normal censored cytokine and gene expression data. Furthermore, we assessed the performance and accuracy of a novel regression approach that allows adjustment for covariates and potential confounding. Methods: For non-normally distributed censored data, traditional methods such as the Kaplan-Meier method or the generalized Wilcoxon test are described. To adjust for covariates, a novel approach named Tobit regression on ranks was introduced. Its performance and accuracy for the analysis of non-normal censored cytokine/gene expression data were evaluated by a simulation study and a statistical experiment applying permutation and bootstrapping. Results: If adjustment for covariates is not necessary, traditional statistical methods are adequate for non-normal censored data. Tobit regression on ranks is a valid method that is comparable with these and appropriate if additional adjustment is required. Its power, type-I error rate and accuracy were comparable to those of classical Tobit regression. Conclusion: Non-normally distributed censored immunological data require appropriate statistical methods. Tobit regression on ranks meets these requirements and can be used to adjust for covariates and potential confounding in large and complex immunological datasets.

Public Library of Science (PLOS)

CiteSeerX

Crossref

Directory of Open Access Journals

Open Access LMU (Ludwig-Maximilians-Univ. München)

PubMed Central

The Francis Crick Institute

Statistical inference using SGD

Author: Caramanis Constantine
Kyrillidis Anastasios
Li Tianyang
Liu Liu
Publication date: 19/11/2017

We present a novel method for frequentist statistical inference in M-estimation problems, based on stochastic gradient descent (SGD) with a fixed step size: we demonstrate that the average of such SGD sequences can be used for statistical inference, after proper scaling. An intuitive analysis using the Ornstein-Uhlenbeck process suggests that such averages are asymptotically normal. From a practical perspective, our SGD-based inference procedure is a first-order method and is well suited to large-scale problems. To show its merits, we apply it to both synthetic and real datasets, and demonstrate that its accuracy is comparable to classical statistical methods, while requiring potentially far less computation. Comment: To appear in AAAI 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications
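
The abstract above rests on averaging fixed-step SGD iterates. The following is only a minimal sketch of that averaging step, under assumptions that are not in the abstract: a one-dimensional squared loss (so the M-estimate is a mean), and a hypothetical function name, step size, and burn-in length. It does not reproduce the paper's scaling or its confidence intervals.

```python
import random


def averaged_sgd_mean(samples, step=0.1, burn_in=200, start=0.0):
    """Average fixed-step SGD iterates for the loss 0.5 * (theta - x) ** 2.

    Hypothetical helper for illustration; parameter names and defaults
    are assumptions, not values taken from the paper.
    """
    theta = start
    total = 0.0
    kept = 0
    for t, x in enumerate(samples):
        # Gradient of 0.5 * (theta - x) ** 2 with respect to theta is (theta - x).
        theta -= step * (theta - x)
        if t >= burn_in:
            # Only iterates after the burn-in contribute to the average.
            total += theta
            kept += 1
    if kept == 0:
        raise ValueError("need more samples than burn_in")
    return total / kept


# Synthetic data: draws from a normal distribution with mean 3 and unit variance.
rng = random.Random(0)
data = [rng.gauss(3.0, 1.0) for _ in range(5000)]
estimate = averaged_sgd_mean(data)
```

Individual iterates keep fluctuating around the minimizer because the step size never shrinks; averaging smooths out that fluctuation, which is the property the abstract builds on.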

Statistical Models of Reconstructed Phase Spaces for Signal Classification

Author: Johnson Michael T
Lindgren Andrew C.
Povinelli Richard J.
Roberts Felice M.
Ye Jinjin
Publication venue: e-Publications@Marquette
Publication date: 01/06/2006

This paper introduces a novel approach to the analysis and classification of time series signals using statistical models of reconstructed phase spaces. With sufficient dimension, such reconstructed phase spaces are, with probability one, guaranteed to be topologically equivalent to the state dynamics of the generating system, and, therefore, may contain information that is absent in analysis and classification methods rooted in linear assumptions. Parametric and nonparametric distributions are introduced as statistical representations over the multidimensional reconstructed phase space, with classification accomplished through methods such as Bayes maximum likelihood and artificial neural networks (ANNs). The technique is demonstrated on heart arrhythmia classification and speech recognition. This new approach is shown to be a viable and effective alternative to traditional signal classification approaches, particularly for signals with strong nonlinear characteristics.

epublications@Marquette
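
A reconstructed phase space of the kind described in the abstract above is commonly built by time-delay embedding. The sketch below assumes that construction with a forward-delay convention; the function name and the `dim` and `lag` parameters are hypothetical, and the abstract does not say which embedding settings the authors used.

```python
def delay_embed(signal, dim, lag):
    """Return delay vectors (x[t], x[t + lag], ..., x[t + (dim - 1) * lag]).

    Hypothetical helper: `dim` is the embedding dimension and `lag` the
    time delay in samples; both are illustrative assumptions.
    """
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    span = (dim - 1) * lag  # distance between the first and last coordinate
    return [
        tuple(signal[t + k * lag] for k in range(dim))
        for t in range(len(signal) - span)
    ]


# Each tuple is one point in the reconstructed phase space.
points = delay_embed([0, 1, 2, 3, 4, 5], dim=3, lag=2)
```

A statistical model (for example a density estimate) would then be fitted to these points for each signal class; that modelling step is not shown here.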

Fast and Guaranteed Tensor Decomposition via Sketching

Author: Anandkumar Animashree
Smola Alexander
Tung Hsiao-Yu
Wang Yining
Publication date: 01/01/2015

Tensor CANDECOMP/PARAFAC (CP) decomposition has wide applications in statistical learning of latent variable models and in data mining. In this paper, we propose fast and randomized tensor CP decomposition algorithms based on sketching. We build on the idea of count sketches, but introduce many novel ideas which are unique to tensors. We develop novel methods for randomized computation of tensor contractions via FFTs, without explicitly forming the tensors. Such tensor contractions are encountered in decomposition methods such as tensor power iterations and alternating least squares. We also design novel colliding hashes for symmetric tensors to further save time in computing the sketches. We then combine these sketching ideas with existing whitening and tensor power iterative techniques to obtain the fastest algorithm on both sparse and dense tensors. The quality of approximation under our method does not depend on properties such as sparsity, uniformity of elements, etc. We apply the method to topic modeling and obtain competitive results. Comment: 29 pages. Appeared in Proceedings of Advances in Neural Information Processing Systems (NIPS), held at Montreal, Canada in 201

arXiv.org e-Print Archive

CiteSeerX

eScholarship - University of California

Caltech Authors
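
The abstract above says the method builds on count sketches. As background only, here is a minimal sketch of the classic count sketch for a plain vector, not the paper's tensor or FFT-based algorithm. The function name, the seeded random tables standing in for hash functions, and all parameters are assumptions made for illustration.

```python
import random


def make_count_sketch(n, buckets, seed=0):
    """Build sketch/estimate functions for vectors of length n.

    Hypothetical helper: random lookup tables stand in for the pairwise
    independent hash functions that a real count sketch would use.
    """
    rng = random.Random(seed)
    bucket_of = [rng.randrange(buckets) for _ in range(n)]  # h: index -> bucket
    sign_of = [rng.choice((-1, 1)) for _ in range(n)]       # s: index -> +1 or -1

    def sketch(x):
        # Each coordinate is added, with its random sign, to exactly one bucket.
        out = [0] * buckets
        for i, value in enumerate(x):
            out[bucket_of[i]] += sign_of[i] * value
        return out

    def estimate(sketched, i):
        # Undo the sign; colliding coordinates contribute signed noise.
        return sign_of[i] * sketched[bucket_of[i]]

    return sketch, estimate


sketch, estimate = make_count_sketch(n=8, buckets=4, seed=1)
x = [0, 0, 0, 7, 0, 0, 0, 0]
recovered = estimate(sketch(x), 3)
```

Because the sketch is a linear map, sketches of two vectors can be added to obtain the sketch of their sum, which is what makes the structure convenient inside iterative linear-algebra routines.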

Joint and individual analysis of breast cancer histologic images and genomic covariates

Author: Calhoun Benjamin C.
Carmichael Iain
Couture Heather D.
Geradts Joseph
Hannig Jan
Hoadley Katherine A.
Marron J. S.
Niethammer Marc
Olsson Linnea
Perou Charles M.
Troester Melissa A.
Publication date: 13/04/2020

A key challenge in modern data analysis is understanding connections between complex and differing modalities of data. For example, two of the main approaches to the study of breast cancer are histopathology (analyzing visual characteristics of tumors) and genetics. While histopathology is the gold standard for diagnostics and there have been many recent breakthroughs in genetics, there is little overlap between these two fields. We aim to bridge this gap by developing methods based on Angle-based Joint and Individual Variation Explained (AJIVE) to directly explore similarities and differences between these two modalities. Our approach exploits Convolutional Neural Networks (CNNs) as a powerful, automatic method for image feature extraction to address some of the challenges presented by statistical analysis of histopathology image data. CNNs raise issues of interpretability that we address by developing novel methods to explore visual modes of variation captured by statistical algorithms (e.g. PCA or AJIVE) applied to CNN features. Our results provide many interpretable connections and contrasts between histopathology and genetics.

arXiv.org e-Print Archive

PubMed Central

Carolina Digital Repository

The ScholarShip (East Carolina University)