66 research outputs found

    A slave mode expansion for obtaining ab-initio interatomic potentials

    Here we propose a new approach for performing a Taylor series expansion of the first-principles computed energy of a crystal as a function of the nuclear displacements. We enlarge the dimensionality of the existing displacement space and form new variables (i.e., slave modes) which transform like irreducible representations of the space group and satisfy the homogeneity of free space. Standard group-theoretical techniques can then be applied to deduce the non-zero expansion coefficients a priori. At a given order, the translation group can be used to contract the products and eliminate terms which are not linearly independent, resulting in a final set of slave mode products. While the expansion coefficients can be computed in a variety of ways, we demonstrate that finite difference is effective up to fourth order. We demonstrate the power of the method in the strongly anharmonic system PbTe, computing all anharmonic terms within an octahedron up to fourth order. A proper unitary transformation demonstrates that the vast majority of the anharmonicity can be attributed to just two terms, indicating that a minimal model of phonon interactions is achievable. The ability to straightforwardly generate polynomial potentials will allow precise simulations at length and time scales which were previously unrealizable.
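    As a hedged illustration of the finite-difference step described above, the sketch below estimates a fourth-order expansion coefficient along a single displacement mode with a five-point central stencil; the `energy` callable is a stand-in for a first-principles calculation, and all names are illustrative.

```python
import numpy as np

def fourth_derivative(energy, u0, mode, h=1e-2):
    """Estimate a fourth-order expansion coefficient along one
    displacement mode via a five-point central stencil.

    energy : callable mapping a displacement vector to a total energy
             (stand-in for a first-principles calculation).
    u0     : reference (equilibrium) displacement vector.
    mode   : unit vector of the mode direction to displace along.
    h      : finite-difference step size.
    """
    f = lambda s: energy(u0 + s * mode)
    # Five-point central stencil for d^4 f / ds^4 at s = 0.
    return (f(2*h) - 4*f(h) + 6*f(0.0) - 4*f(-h) + f(-2*h)) / h**4

# Toy check: a quartic single-mode potential E = a*s^2 + b*s^4,
# whose exact fourth derivative is 24*b.
a, b = 0.5, 0.1
mode = np.array([1.0, 0.0, 0.0])
E = lambda u: a * (u @ mode)**2 + b * (u @ mode)**4
print(fourth_derivative(E, np.zeros(3), mode))  # ~ 2.4
```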

    Improved Operator Learning by Orthogonal Attention

    Neural operators, as an efficient surrogate model for learning the solutions of PDEs, have received extensive attention in the field of scientific machine learning. Among them, attention-based neural operators have become one of the mainstreams in related research. However, existing approaches overfit the limited training data due to the considerable number of parameters in the attention mechanism. To address this, we develop an orthogonal attention mechanism based on the eigendecomposition of the kernel integral operator and the neural approximation of eigenfunctions. The orthogonalization naturally poses a proper regularization effect on the resulting neural operator, which aids in resisting overfitting and boosting generalization. Experiments on six standard neural operator benchmark datasets comprising both regular and irregular geometries show that our method can outperform competing baselines with decent margins. Comment: 14 pages, 5 figures.
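    The following is a minimal sketch of what an orthogonalized attention layer could look like, assuming the learned basis functions are orthonormalized with a QR decomposition; shapes and names are illustrative, and this is not the paper's exact architecture.

```python
import torch

def orthogonal_attention(x, w_basis):
    """Sketch of an orthogonalized attention layer.

    x       : (n, d) input features at n query points.
    w_basis : (d, k) learnable weights producing k candidate basis functions.

    The candidate basis phi = x @ w_basis is orthonormalized via QR, so the
    output is a projection of the inputs onto k orthonormal "eigenfunction"
    directions -- a stand-in for neural eigenfunctions.
    """
    phi = x @ w_basis              # (n, k) candidate basis evaluations
    q, _ = torch.linalg.qr(phi)    # orthonormal columns: q.T @ q = I
    attn = q @ q.T                 # (n, n) projection-style attention map
    return attn @ x                # project the inputs onto the basis

x = torch.randn(32, 16)
w = torch.randn(16, 8, requires_grad=True)
out = orthogonal_attention(x, w)   # (32, 16)
```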

    Visual Odometry Revisited: What Should Be Learnt?

    In this work we present a monocular visual odometry (VO) algorithm which leverages geometry-based methods and deep learning. Most existing VO/SLAM systems with superior performance are based on geometry and have to be carefully designed for different application scenarios. Moreover, most monocular systems suffer from the scale-drift issue. Some recent deep learning works learn VO in an end-to-end manner, but the performance of these deep systems is still not comparable to geometry-based methods. In this work, we revisit the basics of VO and explore the right way to integrate deep learning with epipolar geometry and the Perspective-n-Point (PnP) method. Specifically, we train two convolutional neural networks (CNNs) for estimating single-view depths and two-view optical flows as intermediate outputs. With the deep predictions, we design a simple but robust frame-to-frame VO algorithm (DF-VO) which outperforms pure deep learning-based and geometry-based methods. More importantly, our system does not suffer from the scale-drift issue, being aided by a scale-consistent single-view depth CNN. Extensive experiments on the KITTI dataset show the robustness of our system, and a detailed ablation study shows the effect of different factors in our system. Comment: ICRA 2020. Demo video: https://youtu.be/Nl8mFU4SJKY Code: https://github.com/Huangying-Zhan/DF-VO
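    To make the geometric core of such a hybrid pipeline concrete, here is a hedged sketch using standard OpenCV routines: 2D-2D correspondences (e.g., sampled from a CNN optical-flow field) yield the essential matrix and an up-to-scale relative pose, which a depth CNN could then scale. This is an illustration, not the DF-VO source.

```python
import cv2
import numpy as np

def frame_to_frame_pose(pts1, pts2, K):
    """Recover relative camera pose from 2D-2D correspondences.

    pts1, pts2 : (N, 2) float32 pixel coordinates in frames t and t+1,
                 e.g. sampled from a dense optical-flow prediction.
    K          : (3, 3) camera intrinsics matrix.
    """
    # Robustly fit the essential matrix with RANSAC.
    E, inliers = cv2.findEssentialMat(pts1, pts2, K,
                                      method=cv2.RANSAC,
                                      prob=0.999, threshold=1.0)
    # Decompose into rotation and unit-scale translation.
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    return R, t  # t is up-to-scale; a depth CNN can fix the metric scale
```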

    WLFC: Write Less in Flash-based Cache

    Flash-based disk caches, for example Bcache and Flashcache, have gained tremendous popularity in industry in the last decade because of their low energy consumption, non-volatile nature and high I/O speed. But these cache systems have worse write performance than read performance because of the asymmetric I/O costs and the internal garbage collection (GC) mechanism. In addition to the performance issues, since NAND flash is a type of EEPROM device, the lifespan is also limited by the Program/Erase (P/E) cycles. So how to improve the performance and the lifespan of flash-based caches in write-intensive scenarios has always been a hot issue. Benefiting from Open-Channel SSDs (OCSSDs), we propose a write-friendly flash-based disk cache system called WLFC (Write Less in Flash-based Cache). In WLFC, a strictly sequential writing method is used to minimize write amplification. A new replacement algorithm for the write buffer is designed to minimize the erase count caused by eviction, and a new data layout strategy is designed to minimize the metadata size persisted in SSDs. As a result, the Over-Provisioned (OP) space is completely removed, the erase count of the flash is greatly reduced, and the metadata size is one tenth or less of that in Bcache. Even with a small amount of metadata, data consistency after a crash is still guaranteed. Compared with the existing mechanism, WLFC brings a 7%-80% reduction in write latency, a 1.07x-4.5x increase in write throughput, and a 50%-88.9% reduction in erase count, with a moderate overhead in read performance.
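    As a rough illustration of the strictly sequential writing idea (not WLFC's actual implementation), the toy model below appends pages in log order and reclaims space only by erasing whole blocks at the log tail, which keeps the data-path write amplification at one while counting erases.

```python
class SequentialWriteLog:
    """Toy model of a strictly sequential flash write buffer: pages are
    appended in order, and space is reclaimed only by erasing whole
    blocks at the log tail (no page copying on eviction)."""

    def __init__(self, num_blocks, pages_per_block):
        self.num_blocks = num_blocks
        self.pages_per_block = pages_per_block
        self.head = 0            # total pages ever written (log head)
        self.tail = 0            # number of blocks erased (log tail)
        self.erase_count = 0

    def append(self, page):
        cap = self.num_blocks * self.pages_per_block
        # If the log is full, erase the oldest block before writing.
        if self.head - self.tail * self.pages_per_block >= cap:
            self.evict_block()
        slot = self.head % cap   # physical page slot, strictly sequential
        self.head += 1
        return slot

    def evict_block(self):
        # Whole-block erase at the tail: the only extra writes come from
        # re-inserting still-hot pages, not from GC page migration.
        self.tail += 1
        self.erase_count += 1
```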

    Codebook Features: Sparse and Discrete Interpretability for Neural Networks

    Understanding neural networks is challenging in part because of the dense, continuous nature of their hidden states. We explore whether we can train neural networks to have hidden states that are sparse, discrete, and more interpretable by quantizing their continuous features into what we call codebook features. Codebook features are produced by finetuning neural networks with vector quantization bottlenecks at each layer, producing a network whose hidden features are the sum of a small number of discrete vector codes chosen from a larger codebook. Surprisingly, we find that neural networks can operate under this extreme bottleneck with only modest degradation in performance. This sparse, discrete bottleneck also provides an intuitive way of controlling neural network behavior: first, find codes that activate when the desired behavior is present, then activate those same codes during generation to elicit that behavior. We validate our approach by training codebook Transformers on several different datasets. First, we explore a finite state machine dataset with far more hidden states than neurons. In this setting, our approach overcomes the superposition problem by assigning states to distinct codes, and we find that we can make the neural network behave as if it is in a different state by activating the code for that state. Second, we train Transformer language models with up to 410M parameters on two natural language datasets. We identify codes in these models representing diverse, disentangled concepts (ranging from negative emotions to months of the year) and find that we can guide the model to generate different topics by activating the appropriate codes during inference. Overall, codebook features appear to be a promising unit of analysis and control for neural networks and interpretability. Our codebase and models are open-sourced at https://github.com/taufeeque9/codebook-features
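    A minimal sketch of the quantization step as we understand it from the abstract: each hidden state is replaced by the sum of its k most similar code vectors. The shapes and the similarity choice are illustrative, and the straight-through gradient trick needed during finetuning is omitted.

```python
import torch

def codebook_bottleneck(h, codebook, k=4):
    """Replace each hidden state with the sum of its k nearest codes.

    h        : (batch, d) continuous hidden states.
    codebook : (C, d) code vectors, with C much larger than k.
    """
    # Cosine similarity between each hidden state and every code.
    sims = torch.nn.functional.normalize(h, dim=-1) @ \
           torch.nn.functional.normalize(codebook, dim=-1).T  # (batch, C)
    topk = sims.topk(k, dim=-1).indices                       # (batch, k)
    # Each output is a sum of k discrete codes: sparse and inspectable.
    # (Training would need a straight-through estimator; omitted here.)
    return codebook[topk].sum(dim=1)                          # (batch, d)

h = torch.randn(8, 64)
codes = torch.randn(512, 64)
quantized = codebook_bottleneck(h, codes)
```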

    Joint Entity and Relation Extraction with Span Pruning and Hypergraph Neural Networks

    Entity and Relation Extraction (ERE) is an important task in information extraction. Recent marker-based pipeline models achieve state-of-the-art performance, but still suffer from the error propagation issue. Also, most current ERE models do not take into account higher-order interactions between multiple entities and relations, while higher-order modeling could be beneficial. In this work, we propose a HyperGraph neural network for ERE (HGERE), which is built upon PL-marker (a state-of-the-art marker-based pipeline model). To alleviate error propagation, we use a high-recall pruner mechanism to transfer the burden of entity identification and labeling from the NER module to the joint module of our model. For higher-order modeling, we build a hypergraph, where nodes are entities (provided by the span pruner) and relations thereof, and hyperedges encode interactions between two different relations or between a relation and its associated subject and object entities. We then run a hypergraph neural network for higher-order inference by applying message passing over the built hypergraph. Experiments on three widely used benchmarks (ACE2004, ACE2005 and SciERC) for the ERE task show significant improvements over the previous state-of-the-art PL-marker. Comment: Accepted to Proceedings of EMNLP, 2023
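    Below is a minimal sketch of one round of hypergraph message passing of the kind described above (node-to-hyperedge-to-node aggregation over an incidence matrix); it is illustrative, not the paper's exact architecture.

```python
import torch

def hypergraph_message_passing(x, incidence):
    """One round of node -> hyperedge -> node message passing.

    x         : (num_nodes, d) node features (entity spans / relations).
    incidence : (num_nodes, num_hyperedges) 0/1 float matrix; each
                hyperedge ties together the nodes it connects, e.g. a
                relation with its subject and object entities.
    """
    deg_e = incidence.sum(0).clamp(min=1)               # nodes per hyperedge
    edge_msg = (incidence.T @ x) / deg_e[:, None]       # node -> hyperedge mean
    deg_v = incidence.sum(1).clamp(min=1)               # hyperedges per node
    node_msg = (incidence @ edge_msg) / deg_v[:, None]  # hyperedge -> node mean
    return x + node_msg                                 # residual update

x = torch.randn(5, 32)                 # 5 nodes (entities + relations)
H = torch.zeros(5, 2)                  # 2 hyperedges
H[[0, 1, 3], 0] = 1.0                  # relation node 3 with entities 0, 1
H[[2, 3, 4], 1] = 1.0                  # overlapping relation structure
x = hypergraph_message_passing(x, H)
```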

    Spartan Daily, October 22, 1971

    Volume 59, Issue 21. https://scholarworks.sjsu.edu/spartandaily/5554/thumbnail.jpg

    Comparing End-to-End Machine Learning Methods for Spectra Classification

    In scientific research, spectroscopy and diffraction experimental techniques are widely used and produce huge amounts of spectral data. Learning patterns from spectra is critical during these experiments, as it provides immediate feedback on the actual status of the experiment (e.g., the time-resolved status of the sample), which helps guide the experiment. The two major spectral changes that we aim to capture are either a change in the intensity distribution (e.g., the drop or appearance) of peaks at certain locations, or a shift of peaks along the spectrum. This study aims to develop deep learning (DL) classification frameworks for one-dimensional (1D) spectral time series. In this work, we deal with the spectra classification problem from two different perspectives: one is a general two-dimensional (2D) space segmentation problem, and the other is a common 1D time series classification problem. We focused on two proposed classification models under these two settings, namely the end-to-end binned Fully Connected Neural Network (FCNN) with automatically captured weighting factors and the convolutional SCT attention model. Under the setting of 1D time series classification, several other end-to-end structures based on FCNN, Convolutional Neural Network (CNN), ResNets, Long Short-Term Memory (LSTM), and Transformer were explored. Finally, we evaluated and compared the performance of these classification models on the High Energy Density (HED) spectra dataset from multiple perspectives, and further performed a feature importance analysis to explore their interpretability. The results show that all the applied models can achieve 100% classification confidence, but the models applied under the 1D time series classification setting are superior. Among them, Transformer-based methods consume the least training time (0.449 s). Our proposed convolutional Spatial-Channel-Temporal (SCT) attention model uses 1.269 s, but its self-attention mechanism, applied across the spatial, channel, and temporal dimensions, can suppress indistinguishable features better than the others and selectively focus on obvious features with high separability. Peer Reviewed
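    For concreteness, here is a hedged sketch of one of the simpler end-to-end 1D baselines mentioned above, a small 1D CNN classifier over binned spectra; the layer sizes are illustrative and not the paper's configuration.

```python
import torch
import torch.nn as nn

class Spectra1DCNN(nn.Module):
    """Small end-to-end 1D CNN spectra classifier (illustrative sizes).
    Input: (batch, 1, num_bins) intensities on a fixed energy grid."""

    def __init__(self, num_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # robust to varying num_bins
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        z = self.features(x).squeeze(-1)  # (batch, 32)
        return self.classifier(z)         # class logits

model = Spectra1DCNN(num_classes=2)
logits = model(torch.randn(4, 1, 1024))   # 4 spectra, 1024 bins each
```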

    A Geospatial Service Model and Catalog for Discovery and Orchestration

    The goal of this research is to provide a supporting Web services architecture, consisting of a service model and catalog, to allow discovery and automatic orchestration of geospatial Web services. First, a methodology for supporting geospatial Web services with existing orchestration tools is presented. Geospatial services are automatically translated into SOAP/WSDL services by a portable service wrapper. Their data layers are exposed as atomic functions while WSDL extensions provide syntactic metadata. Compliant services are modeled using the descriptive logic capabilities of the Web Ontology Language (OWL). The resulting geospatial service model has a number of functions. It provides a basic taxonomy of geospatial Web services that is useful for templating service compositions. It also contains the necessary annotations to allow discovery of services. Importantly, the model defines a number of logical relationships between its internal concepts which allow inconsistency detection for the model as a whole and for individual service instances as they are added to the catalog. These logical relationships have the additional benefit of supporting automatic classification of geospatial service individuals when they are added to the service catalog. The geospatial service catalog is backed by the descriptive logic model. It supports queries which are more complex than those available using standard relational data models, such as the capability to query using concept hierarchies. An example orchestration system demonstrates the use of the geospatial service catalog for query evaluation in an automatic orchestration system (both fully and semi-automatic orchestration). Computational complexity analysis and experimental performance analysis identify potential performance problems in the geospatial service catalog. Solutions to these performance issues are presented in the form of partitioning service instance realization, low-cost pre-filtering of service instances, and pre-processing realization. The resulting model and catalog provide an architecture to support automatic orchestration capable of complementing the multiple service composition algorithms that currently exist. Importantly, the geospatial service model and catalog go beyond simply supporting orchestration systems. By providing a general solution to the modeling and discovery of geospatial Web services they are useful in any geospatial Web service enterprise.
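    As a toy illustration of the concept-hierarchy (subsumption) queries mentioned above, the kind a plain relational catalog cannot answer directly, here is a sketch over a hypothetical mini-taxonomy of geospatial service concepts; all names are illustrative, not the thesis's OWL model.

```python
# Hypothetical mini-taxonomy: child concept -> parent concept.
TAXONOMY = {
    "WebMapService": "PortrayalService",
    "WebFeatureService": "DataService",
    "WebCoverageService": "DataService",
    "PortrayalService": "GeospatialService",
    "DataService": "GeospatialService",
}

def subsumed_by(concept, ancestor):
    """True if `concept` is (transitively) a kind of `ancestor`."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = TAXONOMY.get(concept)  # walk up the hierarchy
    return False

# Catalog query: find all registered services that are DataServices,
# even though none is labeled "DataService" directly.
services = {"svc1": "WebMapService", "svc2": "WebFeatureService"}
data_services = [s for s, c in services.items()
                 if subsumed_by(c, "DataService")]   # -> ["svc2"]
```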