66 research outputs found

    A slave mode expansion for obtaining ab-initio interatomic potentials

    Here we propose a new approach for performing a Taylor series expansion of the first-principles computed energy of a crystal as a function of the nuclear displacements. We enlarge the dimensionality of the existing displacement space and form new variables (i.e., slave modes) which transform like irreducible representations of the space group and satisfy the homogeneity of free space. Standard group-theoretical techniques can then be applied to deduce the non-zero expansion coefficients a priori. At a given order, the translation group can be used to contract the products and eliminate terms which are not linearly independent, resulting in a final set of slave mode products. While the expansion coefficients can be computed in a variety of ways, we demonstrate that finite difference is effective up to fourth order. We demonstrate the power of the method in the strongly anharmonic system PbTe, computing all anharmonic terms within an octahedron up to fourth order. A proper unitary transformation demonstrates that the vast majority of the anharmonicity can be attributed to just two terms, indicating that a minimal model of phonon interactions is achievable. The ability to straightforwardly generate polynomial potentials will allow precise simulations at length and time scales which were previously unrealizable.
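    As a hedged illustration of the finite-difference step described above, the sketch below estimates a fourth-order expansion coefficient along a single displacement mode with a five-point central stencil; the `energy` callable is a stand-in for a first-principles calculation, and all names are illustrative.

```python
import numpy as np

def fourth_derivative(energy, u0, mode, h=1e-2):
    """Estimate a fourth-order expansion coefficient along one
    displacement mode via a five-point central stencil.

    energy : callable mapping a displacement vector to a total energy
             (stand-in for a first-principles calculation).
    u0     : reference (equilibrium) displacement vector.
    mode   : unit vector of the mode direction to displace along.
    h      : finite-difference step size.
    """
    f = lambda s: energy(u0 + s * mode)
    # Five-point central stencil for d^4 f / ds^4 at s = 0.
    return (f(2*h) - 4*f(h) + 6*f(0.0) - 4*f(-h) + f(-2*h)) / h**4

# Toy check: a quartic single-mode potential E = a*s^2 + b*s^4,
# whose exact fourth derivative is 24*b.
a, b = 0.5, 0.1
mode = np.array([1.0, 0.0, 0.0])
E = lambda u: a * (u @ mode)**2 + b * (u @ mode)**4
print(fourth_derivative(E, np.zeros(3), mode))  # ~ 2.4
```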

    Improved Operator Learning by Orthogonal Attention

    Neural operators, as an efficient surrogate model for learning the solutions of PDEs, have received extensive attention in the field of scientific machine learning. Among them, attention-based neural operators have become one of the mainstreams in related research. However, existing approaches overfit the limited training data due to the considerable number of parameters in the attention mechanism. To address this, we develop an orthogonal attention mechanism based on the eigendecomposition of the kernel integral operator and the neural approximation of eigenfunctions. The orthogonalization naturally poses a proper regularization effect on the resulting neural operator, which aids in resisting overfitting and boosting generalization. Experiments on six standard neural operator benchmark datasets comprising both regular and irregular geometries show that our method can outperform competing baselines with decent margins. Comment: 14 pages, 5 figures.
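    The following is a minimal sketch of what an orthogonalized attention layer could look like, assuming the learned basis functions are orthonormalized with a QR decomposition; shapes and names are illustrative, and this is not the paper's exact architecture.

```python
import torch

def orthogonal_attention(x, w_basis):
    """Sketch of an orthogonalized attention layer.

    x       : (n, d) input features at n query points.
    w_basis : (d, k) learnable weights producing k candidate basis functions.

    The candidate basis phi = x @ w_basis is orthonormalized via QR, so the
    output is a projection of the inputs onto k orthonormal "eigenfunction"
    directions -- a stand-in for neural eigenfunctions.
    """
    phi = x @ w_basis              # (n, k) candidate basis evaluations
    q, _ = torch.linalg.qr(phi)    # orthonormal columns: q.T @ q = I
    attn = q @ q.T                 # (n, n) projection-style attention map
    return attn @ x                # project the inputs onto the basis

x = torch.randn(32, 16)
w = torch.randn(16, 8, requires_grad=True)
out = orthogonal_attention(x, w)   # (32, 16)
```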

    Visual Odometry Revisited: What Should Be Learnt?

    In this work we present a monocular visual odometry (VO) algorithm which leverages geometry-based methods and deep learning. Most existing VO/SLAM systems with superior performance are based on geometry and have to be carefully designed for different application scenarios. Moreover, most monocular systems suffer from the scale-drift issue. Some recent deep learning works learn VO in an end-to-end manner, but the performance of these deep systems is still not comparable to geometry-based methods. In this work, we revisit the basics of VO and explore the right way to integrate deep learning with epipolar geometry and the Perspective-n-Point (PnP) method. Specifically, we train two convolutional neural networks (CNNs) for estimating single-view depths and two-view optical flows as intermediate outputs. With the deep predictions, we design a simple but robust frame-to-frame VO algorithm (DF-VO) which outperforms pure deep learning-based and geometry-based methods. More importantly, our system does not suffer from the scale-drift issue, being aided by a scale-consistent single-view depth CNN. Extensive experiments on the KITTI dataset show the robustness of our system, and a detailed ablation study shows the effect of different factors in our system. Comment: ICRA 2020. Demo video: https://youtu.be/Nl8mFU4SJKY Code: https://github.com/Huangying-Zhan/DF-VO
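    To make the geometric core of such a hybrid pipeline concrete, here is a hedged sketch using standard OpenCV routines: 2D-2D correspondences (e.g., sampled from a CNN optical-flow field) yield the essential matrix and an up-to-scale relative pose, which a depth CNN could then scale. This is an illustration, not the DF-VO source.

```python
import cv2
import numpy as np

def frame_to_frame_pose(pts1, pts2, K):
    """Recover relative camera pose from 2D-2D correspondences.

    pts1, pts2 : (N, 2) float32 pixel coordinates in frames t and t+1,
                 e.g. sampled from a dense optical-flow prediction.
    K          : (3, 3) camera intrinsics matrix.
    """
    # Robustly fit the essential matrix with RANSAC.
    E, inliers = cv2.findEssentialMat(pts1, pts2, K,
                                      method=cv2.RANSAC,
                                      prob=0.999, threshold=1.0)
    # Decompose into rotation and unit-scale translation.
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    return R, t  # t is up-to-scale; a depth CNN can fix the metric scale
```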

    WLFC: Write Less in Flash-based Cache

    Flash-based disk caches, for example Bcache and Flashcache, have gained tremendous popularity in industry in the last decade because of their low energy consumption, non-volatile nature and high I/O speed. But these cache systems have worse write performance than read performance because of the asymmetric I/O costs and the internal garbage collection (GC) mechanism. In addition to the performance issues, since NAND flash is a type of EEPROM device, the lifespan is also limited by the Program/Erase (P/E) cycles. So how to improve the performance and the lifespan of flash-based caches in write-intensive scenarios has always been a hot issue. Benefiting from Open-Channel SSDs (OCSSDs), we propose a write-friendly flash-based disk cache system called WLFC (Write Less in Flash-based Cache). In WLFC, a strictly sequential writing method is used to minimize write amplification. A new replacement algorithm for the write buffer is designed to minimize the erase count caused by eviction, and a new data layout strategy is designed to minimize the metadata size persisted in SSDs. As a result, the Over-Provisioned (OP) space is completely removed, the erase count of the flash is greatly reduced, and the metadata size is one tenth or less of that in Bcache. Even with a small amount of metadata, data consistency after a crash is still guaranteed. Compared with the existing mechanism, WLFC brings a 7%-80% reduction in write latency, a 1.07x-4.5x increase in write throughput, and a 50%-88.9% reduction in erase count, with a moderate overhead in read performance.
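    As a rough illustration of the strictly sequential writing idea (not WLFC's actual implementation), the toy model below appends pages in log order and reclaims space only by erasing whole blocks at the log tail, which keeps the data-path write amplification at one while counting erases.

```python
class SequentialWriteLog:
    """Toy model of a strictly sequential flash write buffer: pages are
    appended in order, and space is reclaimed only by erasing whole
    blocks at the log tail (no page copying on eviction)."""

    def __init__(self, num_blocks, pages_per_block):
        self.num_blocks = num_blocks
        self.pages_per_block = pages_per_block
        self.head = 0            # total pages ever written (log head)
        self.tail = 0            # number of blocks erased (log tail)
        self.erase_count = 0

    def append(self, page):
        cap = self.num_blocks * self.pages_per_block
        # If the log is full, erase the oldest block before writing.
        if self.head - self.tail * self.pages_per_block >= cap:
            self.evict_block()
        slot = self.head % cap   # physical page slot, strictly sequential
        self.head += 1
        return slot

    def evict_block(self):
        # Whole-block erase at the tail: the only extra writes come from
        # re-inserting still-hot pages, not from GC page migration.
        self.tail += 1
        self.erase_count += 1
```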

    Codebook Features: Sparse and Discrete Interpretability for Neural Networks

    Understanding neural networks is challenging in part because of the dense, continuous nature of their hidden states. We explore whether we can train neural networks to have hidden states that are sparse, discrete, and more interpretable by quantizing their continuous features into what we call codebook features. Codebook features are produced by finetuning neural networks with vector quantization bottlenecks at each layer, producing a network whose hidden features are the sum of a small number of discrete vector codes chosen from a larger codebook. Surprisingly, we find that neural networks can operate under this extreme bottleneck with only modest degradation in performance. This sparse, discrete bottleneck also provides an intuitive way of controlling neural network behavior: first, find codes that activate when the desired behavior is present, then activate those same codes during generation to elicit that behavior. We validate our approach by training codebook Transformers on several different datasets. First, we explore a finite state machine dataset with far more hidden states than neurons. In this setting, our approach overcomes the superposition problem by assigning states to distinct codes, and we find that we can make the neural network behave as if it is in a different state by activating the code for that state. Second, we train Transformer language models with up to 410M parameters on two natural language datasets. We identify codes in these models representing diverse, disentangled concepts (ranging from negative emotions to months of the year) and find that we can guide the model to generate different topics by activating the appropriate codes during inference. Overall, codebook features appear to be a promising unit of analysis and control for neural networks and interpretability. Our codebase and models are open-sourced at https://github.com/taufeeque9/codebook-features
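    A minimal sketch of the quantization step as we understand it from the abstract: each hidden state is replaced by the sum of its k most similar code vectors. The shapes and the similarity choice are illustrative, and the straight-through gradient trick needed during finetuning is omitted.

```python
import torch

def codebook_bottleneck(h, codebook, k=4):
    """Replace each hidden state with the sum of its k nearest codes.

    h        : (batch, d) continuous hidden states.
    codebook : (C, d) code vectors, with C much larger than k.
    """
    # Cosine similarity between each hidden state and every code.
    sims = torch.nn.functional.normalize(h, dim=-1) @ \
           torch.nn.functional.normalize(codebook, dim=-1).T  # (batch, C)
    topk = sims.topk(k, dim=-1).indices                       # (batch, k)
    # Each output is a sum of k discrete codes: sparse and inspectable.
    # (Training would need a straight-through estimator; omitted here.)
    return codebook[topk].sum(dim=1)                          # (batch, d)

h = torch.randn(8, 64)
codes = torch.randn(512, 64)
quantized = codebook_bottleneck(h, codes)
```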

    Joint Entity and Relation Extraction with Span Pruning and Hypergraph Neural Networks

    Entity and Relation Extraction (ERE) is an important task in information extraction. Recent marker-based pipeline models achieve state-of-the-art performance, but still suffer from the error propagation issue. Also, most current ERE models do not take into account higher-order interactions between multiple entities and relations, while higher-order modeling could be beneficial. In this work, we propose a HyperGraph neural network for ERE (HGERE), which is built upon PL-marker (a state-of-the-art marker-based pipeline model). To alleviate error propagation, we use a high-recall pruner mechanism to transfer the burden of entity identification and labeling from the NER module to the joint module of our model. For higher-order modeling, we build a hypergraph, where nodes are entities (provided by the span pruner) and relations thereof, and hyperedges encode interactions between two different relations or between a relation and its associated subject and object entities. We then run a hypergraph neural network for higher-order inference by applying message passing over the built hypergraph. Experiments on three widely used benchmarks (ACE2004, ACE2005 and SciERC) for the ERE task show significant improvements over the previous state-of-the-art PL-marker. Comment: Accepted to Proceedings of EMNLP, 2023
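    Below is a minimal sketch of one round of hypergraph message passing of the kind described above (node-to-hyperedge-to-node aggregation over an incidence matrix); it is illustrative, not the paper's exact architecture.

```python
import torch

def hypergraph_message_passing(x, incidence):
    """One round of node -> hyperedge -> node message passing.

    x         : (num_nodes, d) node features (entity spans / relations).
    incidence : (num_nodes, num_hyperedges) 0/1 float matrix; each
                hyperedge ties together the nodes it connects, e.g. a
                relation with its subject and object entities.
    """
    deg_e = incidence.sum(0).clamp(min=1)               # nodes per hyperedge
    edge_msg = (incidence.T @ x) / deg_e[:, None]       # node -> hyperedge mean
    deg_v = incidence.sum(1).clamp(min=1)               # hyperedges per node
    node_msg = (incidence @ edge_msg) / deg_v[:, None]  # hyperedge -> node mean
    return x + node_msg                                 # residual update

x = torch.randn(5, 32)                 # 5 nodes (entities + relations)
H = torch.zeros(5, 2)                  # 2 hyperedges
H[[0, 1, 3], 0] = 1.0                  # relation node 3 with entities 0, 1
H[[2, 3, 4], 1] = 1.0                  # overlapping relation structure
x = hypergraph_message_passing(x, H)
```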

    Spartan Daily, October 22, 1971

    Volume 59, Issue 21. https://scholarworks.sjsu.edu/spartandaily/5554/thumbnail.jpg

    Comparing End-to-End Machine Learning Methods for Spectra Classification

    In scientific research, spectroscopy and diffraction experimental techniques are widely used and produce huge amounts of spectral data. Learning patterns from spectra is critical during these experiments, as it provides immediate feedback on the actual status of the experiment (e.g., the time-resolved status of the sample), which helps guide the experiment. The two major spectral changes that we aim to capture are either a change in the intensity distribution (e.g., the drop or appearance) of peaks at certain locations, or a shift of peaks along the spectrum. This study aims to develop deep learning (DL) classification frameworks for one-dimensional (1D) spectral time series. In this work, we deal with the spectra classification problem from two different perspectives: one is a general two-dimensional (2D) space segmentation problem, and the other is a common 1D time series classification problem. We focused on two proposed classification models under these two settings, namely the end-to-end binned Fully Connected Neural Network (FCNN) with automatically captured weighting factors and the convolutional SCT attention model. Under the setting of 1D time series classification, several other end-to-end structures based on FCNN, Convolutional Neural Network (CNN), ResNets, Long Short-Term Memory (LSTM), and Transformer were explored. Finally, we evaluated and compared the performance of these classification models on the High Energy Density (HED) spectra dataset from multiple perspectives, and further performed a feature importance analysis to explore their interpretability. The results show that all the applied models can achieve 100% classification confidence, but the models applied under the 1D time series classification setting are superior. Among them, Transformer-based methods consume the least training time (0.449 s). Our proposed convolutional Spatial-Channel-Temporal (SCT) attention model uses 1.269 s, but its self-attention mechanism, applied across the spatial, channel, and temporal dimensions, can suppress indistinguishable features better than the others and selectively focus on obvious features with high separability. Peer Reviewed
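    For concreteness, here is a hedged sketch of one of the simpler end-to-end 1D baselines mentioned above, a small 1D CNN classifier over binned spectra; the layer sizes are illustrative and not the paper's configuration.

```python
import torch
import torch.nn as nn

class Spectra1DCNN(nn.Module):
    """Small end-to-end 1D CNN spectra classifier (illustrative sizes).
    Input: (batch, 1, num_bins) intensities on a fixed energy grid."""

    def __init__(self, num_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # robust to varying num_bins
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        z = self.features(x).squeeze(-1)  # (batch, 32)
        return self.classifier(z)         # class logits

model = Spectra1DCNN(num_classes=2)
logits = model(torch.randn(4, 1, 1024))   # 4 spectra, 1024 bins each
```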

    A Geospatial Service Model and Catalog for Discovery and Orchestration

    The goal of this research is to provide a supporting Web services architecture, consisting of a service model and catalog, to allow discovery and automatic orchestration of geospatial Web services. First, a methodology for supporting geospatial Web services with existing orchestration tools is presented. Geospatial services are automatically translated into SOAP/WSDL services by a portable service wrapper. Their data layers are exposed as atomic functions while WSDL extensions provide syntactic metadata. Compliant services are modeled using the descriptive logic capabilities of the Web Ontology Language (OWL). The resulting geospatial service model has a number of functions. It provides a basic taxonomy of geospatial Web services that is useful for templating service compositions. It also contains the necessary annotations to allow discovery of services. Importantly, the model defines a number of logical relationships between its internal concepts which allow inconsistency detection for the model as a whole and for individual service instances as they are added to the catalog. These logical relationships have the additional benefit of supporting automatic classification of geospatial service individuals when they are added to the service catalog. The geospatial service catalog is backed by the descriptive logic model. It supports queries which are more complex than those available using standard relational data models, such as the capability to query using concept hierarchies. An example orchestration system demonstrates the use of the geospatial service catalog for query evaluation in an automatic orchestration system (both fully and semi-automatic orchestration). Computational complexity analysis and experimental performance analysis identify potential performance problems in the geospatial service catalog. Solutions to these performance issues are presented in the form of partitioning service instance realization, low-cost pre-filtering of service instances, and pre-processing realization. The resulting model and catalog provide an architecture to support automatic orchestration capable of complementing the multiple service composition algorithms that currently exist. Importantly, the geospatial service model and catalog go beyond simply supporting orchestration systems. By providing a general solution to the modeling and discovery of geospatial Web services they are useful in any geospatial Web service enterprise.
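    As a toy illustration of the concept-hierarchy (subsumption) queries mentioned above, the kind a plain relational catalog cannot answer directly, here is a sketch over a hypothetical mini-taxonomy of geospatial service concepts; all names are illustrative, not the thesis's OWL model.

```python
# Hypothetical mini-taxonomy: child concept -> parent concept.
TAXONOMY = {
    "WebMapService": "PortrayalService",
    "WebFeatureService": "DataService",
    "WebCoverageService": "DataService",
    "PortrayalService": "GeospatialService",
    "DataService": "GeospatialService",
}

def subsumed_by(concept, ancestor):
    """True if `concept` is (transitively) a kind of `ancestor`."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = TAXONOMY.get(concept)  # walk up the hierarchy
    return False

# Catalog query: find all registered services that are DataServices,
# even though none is labeled "DataService" directly.
services = {"svc1": "WebMapService", "svc2": "WebFeatureService"}
data_services = [s for s, c in services.items()
                 if subsumed_by(c, "DataService")]   # -> ["svc2"]
```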