2,098 research outputs found
Strain-engineered A-type antiferromagnetic order in YTiO3: a first-principles calculation
The epitaxial strain effects on the magnetic ground state of YTiO3 films
grown on LaAlO3 substrates have been studied using first-principles
density-functional theory. Under the in-plane compressive strain induced by
the LaAlO3 (001) substrate, A-type antiferromagnetic order emerges in place of
the original ferromagnetic order. This phase transition from ferromagnet to
A-type antiferromagnet in the YTiO3 film is robust against the Hubbard
interaction and modest lattice changes, with an energy gain of about 7.64
meV per formula unit, even though the A-type antiferromagnetic order does not
exist in any RTiO3 bulks.

Comment: 3 pages, 2 figures. Proceeding of the 12th Joint MMM/Intermag
Conference. Accepted by JA
A Multi-Objective DIRECT Algorithm Towards Structural Damage Identification with Limited Dynamic Response Information
A major challenge in Structural Health Monitoring (SHM) is to accurately
identify both the location and severity of damage using the dynamic response
information acquired. While, in theory, vibration-based and impedance-based
methods may facilitate damage identification with the assistance of a credible
baseline finite element model, since these methods rely on changes of
stationary wave responses, in practice the response information is generally
limited and the measurements may be heterogeneous, making an inverse analysis
using the sensitivity matrix difficult. Aiming at fundamental advancement, in this
research we cast the damage identification problem into an optimization problem
where possible changes of finite element properties due to damage occurrence
are treated as unknowns. We employ the multiple damage location assurance
criterion (MDLAC), which characterizes the relation between measurements and
predictions (under sampled elemental property changes), as the vector-form
objective function. We then develop an enhanced, multi-objective version of the
DIRECT approach to solve the optimization problem. The underlying idea of the
multi-objective DIRECT approach is to branch and bound the unknown parametric
space to converge to a set of optimal solutions. A new sampling scheme is
established, which significantly increases the efficiency in minimizing the
error between measurements and predictions. The enhanced DIRECT algorithm is
particularly suitable to solving for unknowns that are sparse, as in practical
situations structural damage affects only a small number of finite elements. A
number of test cases using vibration response information are executed to
demonstrate the effectiveness of the new approach.
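In its common form, the MDLAC mentioned above is a normalised correlation between the measured and the predicted frequency-change vectors; the paper's vector-form objective evaluates one such value per measurement set, but the core quantity can be sketched as:

```python
import numpy as np

def mdlac(delta_f_measured, delta_f_predicted):
    """Multiple Damage Location Assurance Criterion: squared cosine
    similarity between the measured and the predicted frequency-change
    vectors. Returns a value in [0, 1]; 1 means the predicted damage
    pattern reproduces the measured changes exactly (up to scale)."""
    a = np.asarray(delta_f_measured, dtype=float)
    b = np.asarray(delta_f_predicted, dtype=float)
    return (a @ b) ** 2 / ((a @ a) * (b @ b))

# A predicted pattern proportional to the measurement scores 1.0;
# an orthogonal pattern scores 0.0.
measured = np.array([0.02, 0.05, 0.01])
assert abs(mdlac(measured, 3.0 * measured) - 1.0) < 1e-12
assert mdlac([1.0, 0.0], [0.0, 1.0]) == 0.0
```

Because the criterion is scale-invariant, it locates damage patterns without requiring the severity to be matched first, which is why it pairs naturally with a branch-and-bound search over the parametric space.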
Improving Sentence Representations with Consensus Maximisation
Consensus maximisation learning can provide self-supervision when different
views are available of the same data. The distributional hypothesis provides
another form of useful self-supervision from adjacent sentences which are
plentiful in large unlabelled corpora. Motivated by the observation that
different learning architectures tend to emphasise different aspects of
sentence meaning, we present a new self-supervised learning framework for
learning sentence representations which minimises the disagreement between two
views of the same sentence where one view encodes the sentence with a recurrent
neural network (RNN), and the other view encodes the same sentence with a
simple linear model. After learning, the individual views (networks) result in
higher quality sentence representations than their single-view learnt
counterparts (learnt using only the distributional hypothesis) as judged by
performance on standard downstream tasks. An ensemble of both views provides
even better generalisation on both supervised and unsupervised downstream
tasks. Importantly, the ensemble of views trained with consensus
maximisation between the two different architectures also performs better on
downstream tasks than an analogous ensemble made from the single-view trained
counterparts.

Comment: arXiv admin note: substantial text overlap with arXiv:1805.0744
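As a rough illustration of the consensus objective, one common choice of disagreement between two views of the same sentence is one minus the cosine similarity of their representation vectors; the paper's exact loss may differ, so treat this as an illustrative sketch:

```python
import numpy as np

def disagreement(u, v, eps=1e-8):
    """Disagreement between two views of the same sentence, taken here
    as one minus the cosine similarity of their vectors. Minimising
    this over a corpus pushes the RNN view and the linear view to agree."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    cos = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps)
    return 1.0 - cos

# Identical views disagree by ~0; opposite views disagree by ~2.
assert disagreement([1.0, 2.0], [1.0, 2.0]) < 1e-6
assert abs(disagreement([1.0, 0.0], [-1.0, 0.0]) - 2.0) < 1e-6
```

In training, `u` would come from the recurrent encoder and `v` from the linear encoder applied to the same sentence, with the loss averaged over a batch.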
Multi-view Sentence Representation Learning
Multi-view learning can provide self-supervision when different views are
available of the same data. The distributional hypothesis provides another form
of useful self-supervision from adjacent sentences which are plentiful in large
unlabelled corpora. Motivated by the asymmetry in the two hemispheres of the
human brain as well as the observation that different learning architectures
tend to emphasise different aspects of sentence meaning, we create a unified
multi-view sentence representation learning framework, in which, one view
encodes the input sentence with a Recurrent Neural Network (RNN), and the other
view encodes it with a simple linear model; the training objective is to
maximise the agreement between the two views, as specified by the adjacent
context information. We show that, after training, the vectors produced from our
multi-view training provide improved representations over the single-view
training, and the combination of different views gives further representational
improvement and demonstrates solid transferability on standard downstream
tasks.
Leveraging Gaussian Process and Voting-Empowered Many-Objective Evaluation for Fault Identification
Using piezoelectric impedance/admittance sensing for structural health
monitoring is promising, owing to the simplicity in circuitry design as well as
the high-frequency interrogation capability. The actual identification of fault
location and severity using impedance/admittance measurements, nevertheless,
remains an extremely challenging task. A first-principles-based structural
model using finite element discretization requires high dimensionality to
characterize the high-frequency response. As such, direct inversion using the
sensitivity matrix usually yields an under-determined problem. Alternatively,
the identification problem may be cast into an optimization framework in which
fault parameters are identified through repeated forward finite element
analysis, which is often computationally prohibitive. This paper presents an
efficient data-assisted optimization approach for fault identification that
avoids running the finite element model iteratively. We formulate a
many-objective optimization problem to identify fault parameters, where
response surfaces of impedance measurements are constructed through Gaussian
process-based calibration. To balance between solution diversity and
convergence, an ε-dominance-enabled many-objective simulated annealing algorithm
is established. As multiple solutions are expected, a voting score calculation
procedure is developed to further identify those solutions that yield better
implications regarding structural health condition. The effectiveness of the
proposed approach is demonstrated by systematic numerical and experimental case
studies.
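A minimal sketch of the ε-dominance test that such an algorithm uses to compare candidate solutions (additive form, for minimisation; exact definitions vary in the literature, so this is one common variant):

```python
import numpy as np

def eps_dominates(a, b, eps):
    """True if objective vector `a` epsilon-dominates `b` (minimisation):
    `a` is no worse than `b` by more than `eps` in every objective and
    strictly better in at least one. Relaxing plain dominance by `eps`
    keeps a more diverse archive of near-optimal solutions."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return bool(np.all(a - eps <= b) and np.any(a < b))

assert eps_dominates([1.0, 1.0], [1.05, 1.2], eps=0.1)
assert not eps_dominates([1.0, 2.0], [1.05, 1.2], eps=0.1)
```

The archive of ε-non-dominated solutions is then what the voting score procedure would rank to single out the most plausible fault parameters.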
Exploiting Invertible Decoders for Unsupervised Sentence Representation Learning
The encoder-decoder models for unsupervised sentence representation learning
tend to discard the decoder after being trained on a large unlabelled corpus,
since only the encoder is needed to map the input sentence into a vector
representation. However, parameters learnt in the decoder also contain useful
information about language. In order to utilise the decoder after learning, we
present two types of decoding functions whose inverses can be easily derived
without expensive inverse calculations. The inverse of the decoding
function serves as another encoder that produces sentence representations. We
show that, with careful design of the decoding functions, the model learns good
sentence representations, and the ensemble of the representations produced from
the encoder and the inverse of the decoder demonstrates even better
generalisation ability and solid transferability.
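One family of decoding functions whose inverse is trivially available is a linear map with an orthogonal weight matrix, whose inverse is its transpose. This toy sketch illustrates the idea, though it is not necessarily the paper's exact parameterisation:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# An orthogonal decoding matrix: its inverse is just its transpose,
# so no expensive matrix inversion is needed to use the decoder as
# a second encoder after training.
q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))

def decode(z):
    return z @ q          # linear decoding function

def inverse_encode(x):
    return x @ q.T        # exact inverse of the decoding function

z = rng.standard_normal(dim)
assert np.allclose(inverse_encode(decode(z)), z)
```

Here `inverse_encode` plays the role of the "decoder-derived" encoder: it maps the decoder's output space back to representation space exactly, at the cost of a single matrix multiply.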
What Happened to My Dog in That Network: Unraveling Top-down Generators in Convolutional Neural Networks
Top-down information plays a central role in human perception, but plays
relatively little role in many current state-of-the-art deep networks, such as
Convolutional Neural Networks (CNNs). This work seeks to explore a path by
which top-down information can have a direct impact within current deep
networks. We explore this path by learning and using "generators" corresponding
to the network internal effects of three types of transformation (each a
restriction of a general affine transformation): rotation, scaling, and
translation. We demonstrate how these learned generators can be used to
transfer top-down information to novel settings, as mediated by the "feature
flows" that the transformations (and the associated generators) correspond to
inside the network. Specifically, we explore three aspects: 1) using generators
as part of a method for synthesizing transformed images --- given a previously
unseen image, produce versions of that image corresponding to one or more
specified transformations, 2) "zero-shot learning" --- when provided with a
feature flow corresponding to the effect of a transformation of unknown amount,
leverage learned generators as part of a method by which to perform an accurate
categorization of the amount of transformation, even for amounts never observed
during training, and 3) (inside-CNN) "data augmentation" --- improve the
classification performance of an existing network by using the learned
generators to directly provide additional training "inside the CNN".
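The notion of a learned generator can be illustrated on a toy linear feature map: fit a matrix that transports features of inputs to features of their rotated versions, then apply it to an unseen input. This is only a linear caricature of the in-CNN generators described above, with every name and dimension chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 200, 6, 10

W = rng.standard_normal((k, d))          # stand-in "network" feature map
X = rng.standard_normal((n, d))          # a batch of inputs
theta = np.deg2rad(15.0)
R = np.eye(d)
R[:2, :2] = [[np.cos(theta), -np.sin(theta)],
             [np.sin(theta),  np.cos(theta)]]  # rotation in first 2 dims

F  = X @ W.T                             # features of the inputs
FT = (X @ R.T) @ W.T                     # features of the rotated inputs

# Fit a linear "generator" M so that F @ M approximates FT (least squares):
# M transports feature flows induced by the rotation inside feature space.
M, *_ = np.linalg.lstsq(F, FT, rcond=None)

# The generator transfers to a previously unseen input.
x_new = rng.standard_normal(d)
pred = (x_new @ W.T) @ M
true = (x_new @ R.T) @ W.T
assert np.allclose(pred, true, atol=1e-6)
```

With a real CNN the feature map is nonlinear, so the learned generator is only an approximation of the transformation's internal effect rather than the exact transport shown here.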
A Simple Recurrent Unit with Reduced Tensor Product Representations
Widely used recurrent units, including the Long Short-Term Memory (LSTM) and the
Gated Recurrent Unit (GRU), perform well on natural language tasks, but their
ability to learn structured representations is still questionable. Exploiting
reduced Tensor Product Representations (TPRs) --- distributed representations
of symbolic structure in which vector-embedded symbols are bound to
vector-embedded structural positions --- we propose the TPRU, a simple
recurrent unit that, at each time step, explicitly executes structural-role
binding and unbinding operations to incorporate structural information into
learning. A gradient analysis of our proposed TPRU is conducted to support our
model design, and its performance on multiple datasets shows the effectiveness
of our design choices. Furthermore, observations on a linguistically grounded
study demonstrate the interpretability of our TPRU.
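The binding and unbinding operations that the TPRU builds on can be sketched with outer products and orthonormal role vectors, following the standard Tensor Product Representation construction (dimensions here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
d_sym, n_roles = 5, 3

# Vector-embedded symbols, and orthonormal vector-embedded roles
# (rows of an orthogonal matrix).
symbols = rng.standard_normal((n_roles, d_sym))
roles = np.linalg.qr(rng.standard_normal((n_roles, n_roles)))[0]

# Binding: superpose the outer products of each symbol with its role.
T = sum(np.outer(symbols[i], roles[i]) for i in range(n_roles))

# Unbinding: with orthonormal roles, the unbinding vector is the role
# itself, and it recovers the bound symbol exactly.
recovered = T @ roles[1]
assert np.allclose(recovered, symbols[1])
```

The TPRU executes such binding and unbinding at every time step; with non-orthonormal roles the recovery is approximate, which is part of what makes the unit's representations "reduced".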
Hierarchical Deep Recurrent Architecture for Video Understanding
This paper introduces the system we developed for the Youtube-8M Video
Understanding Challenge, in which a large-scale benchmark dataset was used for
multi-label video classification. The proposed framework contains a hierarchical
deep architecture, including the frame-level sequence modeling part and the
video-level classification part. In the frame-level sequence modelling part, we
explore a set of methods including Pooling-LSTM (PLSTM), Hierarchical-LSTM
(HLSTM), and Random-LSTM (RLSTM) in order to address the problem of the large number of
frames in a video. We also introduce two attention pooling methods, single
attention pooling (ATT) and multiply attention pooling (Multi-ATT) so that we
can pay more attention to the informative frames in a video and ignore the
useless frames. In the video-level classification part, two methods are
proposed to increase the classification performance, i.e.
Hierarchical-Mixture-of-Experts (HMoE) and Classifier Chains (CC). Our final
submission is an ensemble consisting of 18 sub-models. In terms of the official
evaluation metric Global Average Precision (GAP) at 20, our best submission
achieves 0.84346 on the public 50% of the test data and 0.84333 on the private
50% of the test data.

Comment: Accepted as a Classification Challenge Track paper in the CVPR 2017
Workshop on YouTube-8M Large-Scale Video Understanding
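Single attention pooling over frame features, as described above, can be sketched as a softmax-weighted average; the scoring vector here is illustrative rather than a trained one:

```python
import numpy as np

def attention_pool(frames, w):
    """Single attention pooling (ATT) over frame features: score each
    frame with a vector `w`, softmax the scores, and take the weighted
    average, so informative frames dominate and useless ones are
    down-weighted."""
    scores = frames @ w                        # one score per frame
    scores -= scores.max()                     # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ frames                    # pooled feature vector

frames = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
pooled = attention_pool(frames, w=np.array([0.5, -0.5]))
assert pooled.shape == (2,)
```

Multiply attention pooling (Multi-ATT) would apply several such scoring vectors and combine the resulting pooled vectors, letting different heads attend to different frames.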
An Empirical Study on Post-processing Methods for Word Embeddings
Word embeddings learnt from large corpora have been adopted in various
applications in natural language processing and served as the general input
representations to learning systems. Recently, a series of post-processing
methods have been proposed to boost the performance of word embeddings on
similarity comparison and analogy retrieval tasks, and some have been adapted
to compose sentence representations. The general hypothesis behind these
methods is that by enforcing the embedding space to be more isotropic, the
similarity between words can be better expressed. We view these methods as an
approach to shrink the covariance/gram matrix, which is estimated by learning
word vectors, towards a scaled identity matrix. By optimising an objective in
the semi-Riemannian manifold with Centralised Kernel Alignment (CKA), we are
able to search for the optimal shrinkage parameter, and provide a
post-processing method to smooth the spectrum of learnt word vectors which
yields improved performance on downstream tasks.
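The shrinkage view described above, pulling the covariance of the word vectors towards a scaled identity and smoothing the spectrum accordingly, can be sketched as follows; the shrinkage parameter is fixed by hand here rather than found via the CKA search:

```python
import numpy as np

def shrink_embeddings(vectors, lam):
    """Shrink the covariance of word vectors towards a scaled identity
    and re-express the vectors with the smoothed spectrum. `lam` plays
    the role of the shrinkage parameter; larger values make the
    embedding space more isotropic."""
    X = vectors - vectors.mean(axis=0)
    cov = X.T @ X / len(X)
    target = np.trace(cov) / cov.shape[0] * np.eye(cov.shape[0])
    shrunk = (1 - lam) * cov + lam * target
    # The shrunk matrix shares eigenvectors with cov; rescale each
    # principal direction from the old eigenvalue to the new one.
    w, V = np.linalg.eigh(cov)
    w_s, _ = np.linalg.eigh(shrunk)
    return X @ V @ np.diag(np.sqrt(w_s / np.maximum(w, 1e-12))) @ V.T

rng = np.random.default_rng(3)
emb = rng.standard_normal((1000, 4)) * np.array([3.0, 1.0, 0.5, 0.1])
smoothed = shrink_embeddings(emb, lam=0.5)
```

After smoothing, the ratio between the largest and smallest eigenvalues of the covariance shrinks, which is the "more isotropic" property the post-processing methods aim for.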