Nonnegative/binary matrix factorization with a D-Wave quantum annealer

Authors: Alexandrov, Boian S.; Alexandrov, Ludmil B.; O'Malley, Daniel; Vesselinov, Velimir V.
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 05/04/2017

D-Wave quantum annealers represent a novel computational architecture and have attracted significant interest, but have been used for few real-world computations. Machine learning has been identified as an area where quantum annealing may be useful. Here, we show that the D-Wave 2X can be effectively used as part of an unsupervised machine learning method. This method can be used to analyze large datasets. The D-Wave only limits the number of features that can be extracted from the dataset. We apply this method to learn the features from a set of facial images

arXiv.org e-Print Archive

Directory of Open Access Journals

FigShare

MalwareDNA: Simultaneous Classification of Malware, Malware Families, and Novel Malware

Authors: Alexandrov, Boian S.; Bhattarai, Manish; Eren, Maksim E.; Nicholas, Charles; Rasmussen, Kim
Publication date: 04/09/2023

Malware is one of the most dangerous and costly cyber threats to national security and a crucial factor in modern cyber-space. However, the adoption of machine learning (ML) based solutions against malware threats has been relatively slow. Shortcomings in the existing ML approaches are likely contributing to this problem. The majority of current ML approaches ignore real-world challenges such as the detection of novel malware. In addition, proposed ML approaches are often designed either for malware/benign-ware classification or malware family classification. Here we introduce and showcase preliminary capabilities of a new method that can perform precise identification of novel malware families, while also unifying the capability for malware/benign-ware classification and malware family classification into a single framework. Comment: Accepted at IEEE ISI 202

arXiv.org e-Print Archive

Tensor Network Space-Time Spectral Collocation Method for Time Dependent Convection-Diffusion-Reaction Equations

Authors: Adak, Dibyendu; Alexandrov, Boian S.; Manzini, Gianmarco; Rasmussen, Kim Ø.; Truong, Duc P.
Publication date: 28/02/2024

Emerging tensor network techniques for solutions of Partial Differential Equations (PDEs), known for their ability to break the curse of dimensionality, deliver new mathematical methods for ultrafast numerical solutions of high-dimensional problems. Here, we introduce a Tensor Train (TT) Chebyshev spectral collocation method, in both space and time, for the solution of the time-dependent convection-diffusion-reaction (CDR) equation with inhomogeneous boundary conditions in Cartesian geometry. Previous methods for the numerical solution of time-dependent PDEs often use finite differences in time and a spectral scheme for the spatial dimensions, which leads to slow linear convergence. Spectral collocation space-time methods show exponential convergence; however, for realistic problems they need to solve large four-dimensional systems. We overcome this difficulty by using a TT approach, as its complexity grows only linearly with the number of dimensions. We show that our TT space-time Chebyshev spectral collocation method converges exponentially when the solution of the CDR equation is smooth, and demonstrate that it leads to very high compression of linear operators, from terabytes to kilobytes in TT format, and a speedup of tens of thousands of times compared to the full-grid space-time spectral method. These advantages allow us to obtain the solutions at much higher resolutions

arXiv.org e-Print Archive

Interactive Distillation of Large Single-Topic Corpora of Scientific Papers

Authors: Alexandrov, Boian S.; Barron, Ryan; Bhattarai, Manish; Eren, Maksim E.; Rasmussen, Kim O.; Solovyev, Nicholas
Publication date: 19/09/2023

Highly specific datasets of scientific literature are important for both research and education. However, it is difficult to build such datasets at scale. A common approach is to build these datasets reductively by applying topic modeling on an established corpus and selecting specific topics. A more robust but time-consuming approach is to build the dataset constructively, in which a subject matter expert (SME) handpicks documents. This method does not scale and is prone to error as the dataset grows. Here we showcase a new tool, based on machine learning, for constructively generating targeted datasets of scientific literature. Given a small initial "core" corpus of papers, we build a citation network of documents. At each step of the citation network, we generate text embeddings and visualize the embeddings through dimensionality reduction. Papers are kept in the dataset if they are "similar" to the core or are otherwise pruned through human-in-the-loop selection. Additional insight into the papers is gained through sub-topic modeling using SeNMFk. We demonstrate our new tool for literature review by applying it to two different fields in machine learning. Comment: Accepted at 2023 IEEE ICMLA conference

arXiv.org e-Print Archive