2,782 research outputs found

A machine learning route between band mapping and band structure

Authors: Bauer Stefan, Beaulieu Samuel, Carbogno Christian, Dendzik Maciej, Dong Shuo, Ernstorfer Ralph, Rettig Laurenz, Schölkopf Bernhard, Stimper Vincent, Wolf Martin, Xian Rui Patrick, Zacharias Marios
Publication date: 20/05/2020

The electronic band structure (BS) of solid state materials imprints the multidimensional and multi-valued functional relations between energy and momenta of periodically confined electrons. Photoemission spectroscopy is a powerful tool for its comprehensive characterization. A common task in photoemission band mapping is to recover the underlying quasiparticle dispersion, which we call band structure reconstruction. Traditional methods often focus on specific regions of interest yet require extensive human oversight. To cope with the growing size and scale of photoemission data, we develop a generic machine-learning approach leveraging the information within electronic structure calculations for this task. We demonstrate its capability by reconstructing all fourteen valence bands of tungsten diselenide and validate the accuracy on various synthetic data. The reconstruction uncovers previously inaccessible momentum-space structural information on both global and local scales in conjunction with theory, while realizing a path towards integrating band mapping data into materials science databases.

arXiv.org e-Print Archive

HAL-CEA

MPG.PuRe

HAL-Rennes 1

Scalable and Sustainable Deep Learning via Randomized Hashing

Authors: Chen Wenlin, Gionis Aristides, Indyk Piotr, Loosli Gaëlle, Lv Qin, McMahan H. Brendan, Recht Benjamin, Shrivastava Anshumali
Publication date: 04/12/2016

Current deep learning architectures are growing larger in order to learn from complex datasets. These architectures require giant matrix multiplication operations to train millions of parameters. Conversely, there is another growing trend to bring deep learning to low-power, embedded devices. The matrix operations, associated with both training and testing of deep networks, are very expensive from a computational and energy standpoint. We present a novel hashing-based technique to drastically reduce the amount of computation needed to train and test deep networks. Our approach combines recent ideas from adaptive dropouts and randomized hashing for maximum inner product search to select the nodes with the highest activation efficiently. Our new algorithm for deep learning reduces the overall computational cost of forward and back-propagation by operating on significantly fewer (sparse) nodes. As a consequence, our algorithm uses only 5% of the total multiplications, while keeping on average within 1% of the accuracy of the original model. A unique property of the proposed hashing-based back-propagation is that the updates are always sparse. Due to the sparse gradient updates, our algorithm is ideally suited for asynchronous and parallel training, leading to near-linear speedup with an increasing number of cores. We demonstrate the scalability and sustainability (energy efficiency) of our proposed algorithm via rigorous experimental evaluations on several real datasets.
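
The node-selection idea in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes plain signed random projections (SimHash) in place of the maximum-inner-product-search hashing the abstract mentions, and the names `HashedLayer`, `simhash`, `num_tables`, and `num_bits` are hypothetical. Each neuron's weight vector is hashed into buckets once. For a given input, only the neurons that share a bucket with the input are evaluated, so the forward pass touches a sparse subset of nodes.

```python
import random


def simhash(vec, planes):
    """Sign pattern of vec projected onto each random hyperplane."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


class HashedLayer:
    """Illustrative layer that evaluates only hash-selected neurons.

    Assumption: SimHash stands in for the MIPS hashing scheme named in
    the abstract; this is a simplification, not the paper's method.
    """

    def __init__(self, weights, num_tables=4, num_bits=3, seed=0):
        rng = random.Random(seed)
        self.weights = weights
        dim = len(weights[0])
        self.tables = []
        for _ in range(num_tables):
            # One set of random hyperplanes per hash table.
            planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                      for _ in range(num_bits)]
            buckets = {}
            for node_id, w in enumerate(weights):
                buckets.setdefault(simhash(w, planes), []).append(node_id)
            self.tables.append((planes, buckets))

    def candidate_nodes(self, x):
        """Union of neurons sharing a bucket with x in any table."""
        active = set()
        for planes, buckets in self.tables:
            active.update(buckets.get(simhash(x, planes), []))
        return active

    def forward(self, x):
        """Sparse forward pass: ReLU activations for candidates only."""
        out = {}
        for j in self.candidate_nodes(x):
            pre_activation = sum(w * v for w, v in zip(self.weights[j], x))
            out[j] = max(pre_activation, 0.0)
        return out


if __name__ == "__main__":
    rng = random.Random(1)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(50)]
    layer = HashedLayer(weights)
    x = [rng.gauss(0.0, 1.0) for _ in range(8)]
    sparse_out = layer.forward(x)
    print(f"evaluated {len(sparse_out)} of {len(weights)} nodes")
```

In this sketch, more bits per table make buckets smaller (fewer nodes evaluated), while more tables raise the chance that a high-activation node is retrieved. Because only the selected nodes produce activations, a gradient update built on this output would also touch only those nodes, which is the sparsity property the abstract relies on for parallel training.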

arXiv.org e-Print Archive

Crossref

Machine Learning and Integrative Analysis of Biomedical Big Data

Authors: Choi Howard, Chung Neo Christopher, Mirza Bilal, Ping Peipei, Wang Jie, Wang Wei
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.

Multidisciplinary Digital Publishing Institute

Ezid

Directory of Open Access Journals

eScholarship - University of California

Parallel processing and expert systems

Authors: Lau Sonie, Yan Jerry C.

Whether it be monitoring the thermal subsystem of Space Station Freedom, or controlling the navigation of the autonomous rover on Mars, NASA missions in the 1990s cannot enjoy an increased level of autonomy without the efficient implementation of expert systems. Merely increasing the computational speed of uniprocessors may not be able to guarantee that real-time demands are met for larger systems. Speedup via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial laboratories in the U.S. and Japan. The state-of-the-art research in progress related to parallel execution of expert systems is surveyed. The survey discusses multiprocessors for expert systems, parallel languages for symbolic computations, and mapping expert systems to multiprocessors. Results to date indicate that the parallelism achieved for these systems is small. The main reasons are (1) the body of knowledge applicable in any given situation and the amount of computation executed by each rule firing are small, (2) dividing the problem solving process into relatively independent partitions is difficult, and (3) implementation decisions that enable expert systems to be incrementally refined hamper compile-time optimization. In order to obtain greater speedups, data parallelism and application parallelism must be exploited.

NASA Technical Reports Server