New Embedded Representations and Evaluation Protocols for Inferring Transitive Relations
Beyond word embeddings, continuous representations of knowledge graph (KG)
components, such as entities, types and relations, are widely used for entity
mention disambiguation, relation inference and deep question answering. Great
strides have been made in modeling general, asymmetric or antisymmetric KG
relations using Gaussian, holographic, and complex embeddings. None of these
directly enforces the transitivity inherent in the is-instance-of and is-subtype-of
relations. A recent proposal, called order embedding (OE), demands that the
vector representing a subtype elementwise dominates the vector representing a
supertype. However, the manner in which such constraints are asserted and
evaluated has some limitations. In this short research note, we make three
contributions specific to representing and inferring transitive relations.
First, we propose and justify a significant improvement to the OE loss
objective. Second, we propose a new representation of types as
hyper-rectangular regions, which generalizes and improves on OE. Third, we show
that some current protocols to evaluate transitive relation inference can be
misleading, and offer a sound alternative. Rather than use black-box deep
learning modules off-the-shelf, we develop our training networks using
elementary geometric considerations.
Comment: Accepted at SIGIR 2018
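The abstract states the OE constraint precisely enough to sketch it: "sub is-a sup" is penalized by how much the supertype vector exceeds the subtype vector elementwise, and the hyper-rectangle view replaces vectors with boxes. Below is a minimal PyTorch sketch of that baseline constraint and a box-containment analogue; the function names, the margin value, and the box penalty are illustrative assumptions, not the note's improved objective.

```python
import torch

def order_violation(sub, sup):
    # Zero iff the subtype vector elementwise dominates the supertype vector,
    # which is exactly the OE constraint described in the abstract.
    return torch.clamp(sup - sub, min=0).pow(2).sum(dim=-1)

def oe_loss(sub_pos, sup_pos, sub_neg, sup_neg, margin=1.0):
    # Standard max-margin training signal: drive violations of true pairs to
    # zero and violations of corrupted pairs above the margin.
    pos = order_violation(sub_pos, sup_pos)
    neg = torch.clamp(margin - order_violation(sub_neg, sup_neg), min=0)
    return (pos + neg).mean()

def box_violation(sub_lo, sub_hi, sup_lo, sup_hi):
    # Types as hyper-rectangles (lo, hi): "sub is-a sup" holds when sub's box
    # sits inside sup's box; the penalty is how far it sticks out on any side.
    return (torch.clamp(sup_lo - sub_lo, min=0)
            + torch.clamp(sub_hi - sup_hi, min=0)).pow(2).sum(dim=-1)
```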
On representation learning for generative models of text
This thesis takes baby steps in building and understanding neural representation learning systems and generative models for natural language processing. It is presented as a thesis by article that contains four pieces of work.
In the first article, we show that multi-task learning can be used to combine the inductive biases of several self-supervised and supervised learning tasks to learn general-purpose fixed-length distributed sentence representations that achieve strong results on downstream transfer learning tasks without any model fine-tuning.
The second article builds on the first and presents a two-step generative model for text that models the distribution of sentence representations to produce novel sentence embeddings, which serve as a high-level ``neural outline'' that is reconstructed into words by a conditional autoregressive RNN decoder.
The third article studies the necessity of disentangled representations for controllable text generation. A large fraction of controllable text generation systems rely on the idea that control over a particular attribute (or style) requires building disentangled representations that separate content and style. We demonstrate that representations produced in previous work that uses domain adversarial training are not disentangled in practice. We then present an approach that does not aim to learn disentangled representations and show that it achieves significantly better results than prior work.
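Domain adversarial training, as referenced here, is typically implemented with a gradient reversal layer between the encoder and a style classifier; a minimal PyTorch sketch of that standard component follows (the `lambd` weight is illustrative).

```python
import torch

class GradReverse(torch.autograd.Function):
    # Identity in the forward pass; flips the gradient sign in the backward
    # pass, so the encoder is trained to *confuse* the attached classifier.
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

# A style classifier attached through grad_reverse pushes the encoder to hide
# style information; the article argues this does not, in practice, yield
# representations that are actually disentangled.
```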
In the fourth article, we design transformer language models that learn representations at multiple time scales and show that these can help reduce the large memory footprint these models typically have. The article presents three different multi-scale architectures that exhibit favorable trade-offs between perplexity and memory footprint.
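The three architectures themselves are not described in this summary; purely to make the trade-off concrete, here is a toy block in which a coarser, average-pooled copy of the sequence is attended over and fused back, cutting attention cost at that scale from O(T^2) to O((T/k)^2). All names and the fusion scheme are assumptions, not the thesis designs.

```python
import torch
import torch.nn as nn

class CoarseScaleBlock(nn.Module):
    # Attend over a k-times shorter, average-pooled sequence, then upsample
    # and fuse the coarse context back into the fine scale.
    # (Causal masking is omitted for brevity.)
    def __init__(self, dim, k=4, heads=4):
        super().__init__()
        self.k = k
        self.pool = nn.AvgPool1d(kernel_size=k, stride=k)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):  # x: (batch, T, dim), with T divisible by k
        coarse = self.pool(x.transpose(1, 2)).transpose(1, 2)  # (batch, T//k, dim)
        coarse, _ = self.attn(coarse, coarse, coarse)          # cheap attention
        up = coarse.repeat_interleave(self.k, dim=1)           # back to length T
        return x + up
```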
Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning
Much of the recent success in natural language processing (NLP) has been
driven by distributed vector representations of words trained on large amounts
of text in an unsupervised manner. These representations are typically used as
general purpose features for words across a range of NLP problems. However,
extending this success to learning representations of sequences of words, such
as sentences, remains an open problem. Recent work has explored unsupervised as
well as supervised learning techniques with different training objectives to
learn general purpose fixed-length sentence representations. In this work, we
present a simple, effective multi-task learning framework for sentence
representations that combines the inductive biases of diverse training
objectives in a single model. We train this model on several data sources with
multiple training objectives on over 100 million sentences. Extensive
experiments demonstrate that sharing a single recurrent sentence encoder across
weakly related tasks leads to consistent improvements over previous methods. We
present substantial improvements in the context of transfer learning and
low-resource settings using our learned general-purpose representations.
Comment: Accepted at ICLR 2018
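A minimal sketch of the framework's core idea, one shared recurrent encoder feeding lightweight per-task heads, is given below; the bi-GRU with max-pooling mirrors common practice for fixed-length sentence vectors, while the head names and dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiTaskSentenceEncoder(nn.Module):
    # One shared recurrent sentence encoder; each training task only adds a
    # small output head, so its inductive bias flows into the shared weights.
    def __init__(self, vocab_size, task_out_dims, emb_dim=300, hid_dim=1024):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True,
                              bidirectional=True)
        self.heads = nn.ModuleDict(
            {task: nn.Linear(2 * hid_dim, d) for task, d in task_out_dims.items()})

    def forward(self, tokens, task):
        hidden, _ = self.encoder(self.embed(tokens))
        sentence = hidden.max(dim=1).values  # max-pool over time: fixed length
        return self.heads[task](sentence)    # only the head is task-specific

# Training alternates minibatches across tasks, e.g.:
# model = MultiTaskSentenceEncoder(50000, {"nli": 3, "sentiment": 2})
# logits = model(batch_tokens, task="nli")
```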
Adversarial Generation of Natural Language
Generative Adversarial Networks (GANs) have gathered a lot of attention from
the computer vision community, yielding impressive results for image
generation. However, advances in the adversarial generation of natural language
from noise are not commensurate with the progress made in generating images,
and still lag far behind likelihood-based methods. In this paper, we take a
step towards generating natural language with a GAN objective alone. We
introduce a simple baseline that addresses the discrete output space problem
without relying on gradient estimators and show that it is able to achieve
state-of-the-art results on a Chinese poem generation dataset. We present
quantitative results on generating sentences from context-free and
probabilistic context-free grammars, and qualitative language modeling results.
A conditional version is also described that can generate sequences conditioned
on sentence characteristics.
Comment: 11 pages, 3 figures, 5 tables
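The abstract does not spell the baseline out; one way to avoid the discrete output space without gradient estimators, consistent with the description above, is to have the generator emit per-position softmax distributions that the critic consumes directly. A hypothetical sketch, with architecture and sizes as assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftTextGenerator(nn.Module):
    # Emits a probability distribution over the vocabulary at each position.
    # Feeding these soft outputs to the discriminator keeps the whole pipeline
    # differentiable, sidestepping gradient estimators for discrete samples.
    def __init__(self, noise_dim, seq_len, vocab_size, hid=512):
        super().__init__()
        self.seq_len, self.vocab_size = seq_len, vocab_size
        self.net = nn.Sequential(
            nn.Linear(noise_dim, hid), nn.ReLU(),
            nn.Linear(hid, seq_len * vocab_size),
        )

    def forward(self, z):
        logits = self.net(z).view(-1, self.seq_len, self.vocab_size)
        return F.softmax(logits, dim=-1)  # soft one-hot sequences

# Real sentences enter the critic as one-hot tensors and fakes as softmaxes,
# so a single critic network scores both without any sampling step.
```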
Sliding mode control method having terminal convergence in finite time
An object of this invention is to provide robust nonlinear controllers for robotic operations in unstructured environments, based on a new class of closed-loop sliding control methods, sometimes denoted terminal sliders, which enforce closed-loop convergence to equilibrium in finite time. Improved performance results from eliminating the high-frequency control switching previously employed for robustness to parametric uncertainties, and from the fact that terminal-slider stability depends on the rate of change of uncertainties over the sliding surface rather than on the magnitude of the uncertainty itself. Terminal sliding mode control also yields improved convergence: convergence time is finite and can be controlled. A further object is to apply terminal sliders to robot manipulator control, benchmark their performance against the traditional computed-torque control method, and provide for the design of the control parameters.
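To make "terminal convergence in finite time" concrete, here is the textbook form of a terminal sliding surface; the notation is illustrative and not taken from the patent text.

```latex
% Conventional surface s = \dot e + \lambda e decays only asymptotically;
% a terminal surface with a fractional power reaches e = 0 in finite time.
\[
  s = \dot e + \beta\, e^{q/p}, \qquad \beta > 0, \quad 0 < q/p < 1.
\]
% On s = 0, integrating \dot e = -\beta\, e^{q/p} from e(0) gives the finite
% reaching time
\[
  t_{\mathrm{reach}} = \frac{|e(0)|^{\,1 - q/p}}{\beta\,(1 - q/p)} < \infty.
\]
```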
Viewing medium affects arm motor performance in 3D virtual environments
Background: 2D and 3D virtual reality platforms are used for designing individualized training environments for post-stroke rehabilitation. Virtual environments (VEs) are viewed through media such as head-mounted displays (HMDs) and large-screen projection systems (SPS), which can influence the quality of perception of the environment. We assessed whether arm pointing kinematics differed when subjects with and without stroke viewed a 3D VE through two different media: HMD and SPS.
Methods: Two groups of subjects participated (healthy control, n = 10, aged 53.6 ± 17.2 yrs; stroke, n = 20, 66.2 ± 11.3 yrs). Arm motor impairment and spasticity were assessed in the stroke group, which was divided into mild (n = 10) and moderate-to-severe (n = 10) sub-groups based on Fugl-Meyer scores. Subjects pointed (8 times each) to 6 randomly presented targets located at two heights in the ipsilateral, middle and contralateral arm workspaces. Movements were repeated in the same VE viewed using the HMD (Kaiser XL50) and the SPS. Movement kinematics were recorded using an Optotrak system (Certus, 6 markers, 100 Hz). Upper-limb motor performance (precision, velocity, trajectory straightness) and movement pattern (elbow and shoulder ranges, trunk displacement) outcomes were analyzed using repeated-measures ANOVAs.
Results: For all groups, there were no differences between the two media in endpoint trajectory straightness, shoulder flexion and shoulder horizontal adduction ranges, or sagittal trunk displacement. All subjects, however, made larger errors in the vertical direction using the HMD compared to the SPS. Healthy subjects also made larger errors in the sagittal direction, moved more slowly overall, and used less range of elbow extension for the lower central target with the HMD. Both the mild and moderate-to-severe sub-groups made larger RMS errors with the HMD. The only advantage of the HMD was that movements of the moderate-to-severe stroke sub-group were 22% faster than with the SPS.
Conclusions: Despite the similarity in the majority of movement kinematics, movements made using the HMD were slower and showed larger errors. The SPS may be a more comfortable and effective option for viewing VEs in upper-limb rehabilitation post-stroke. This has implications for the use of VR applications to enhance upper-limb recovery.
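As a sketch of the kind of analysis described (a repeated-measures ANOVA of a kinematic outcome across the two viewing media), the following uses statsmodels; the column names and data values are entirely hypothetical and are not the study's data.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long-format table: one row per subject x viewing medium,
# with an endpoint-error outcome; names and numbers are illustrative only.
df = pd.DataFrame({
    "subject": [1, 1, 2, 2, 3, 3],
    "medium":  ["HMD", "SPS"] * 3,
    "vertical_error_mm": [14.2, 9.8, 17.5, 11.1, 12.9, 10.4],
})

# Within-subject (repeated-measures) ANOVA of error across viewing media.
res = AnovaRM(df, depvar="vertical_error_mm",
              subject="subject", within=["medium"]).fit()
print(res)
```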