265 research outputs found

Generating High-Quality Surface Realizations Using Data Augmentation and Factored Sequence Models

Authors: Elder Henry; Hokamp Chris
Publication date: 01/01/2018

This work presents a new state of the art in reconstruction of surface realizations from obfuscated text. We identify the lack of sufficient training data as the major obstacle to training high-performing models, and solve this issue by generating large amounts of synthetic training data. We also propose preprocessing techniques which make the structure contained in the input features more accessible to sequence models. Our models were ranked first on all evaluation metrics in the English portion of the 2018 Surface Realization shared task.

arXiv.org e-Print Archive

Crossref

Multi-source data assimilation for physically based hydrological modeling of an experimental hillslope

Authors: Belluco Enrica; Botto Anna; Camporese Matteo
Publication venue: Copernicus GmbH
Publication date: 01/01/2018

Data assimilation has recently been the focus of much attention for integrated surface–subsurface hydrological models, whereby joint assimilation of water table, soil moisture, and river discharge measurements with the ensemble Kalman filter (EnKF) has been extensively applied. Although the EnKF has been specifically developed to deal with nonlinear models, integrated hydrological models based on the Richards equation still represent a challenge, due to strong nonlinearities that may significantly affect the filter performance. Thus, more studies are needed to investigate the capabilities of the EnKF to correct the system state and identify parameters in cases where the unsaturated zone dynamics are dominant, as well as to quantify possible tradeoffs associated with assimilation of multi-source data. Here, the CATHY (CATchment HYdrology) model is applied to reproduce the hydrological dynamics observed in an experimental two-layered hillslope, equipped with tensiometers, water content reflectometer probes, and tipping bucket flow gages to monitor the hillslope response to a series of artificial rainfall events. Pressure head, soil moisture, and subsurface outflow are assimilated with the EnKF in a number of scenarios and the challenges and issues arising from the assimilation of multi-source data in this real-world test case are discussed. Our results demonstrate that the EnKF is able to effectively correct states and parameters even in a real application characterized by strong nonlinearities. However, multi-source data assimilation may lead to significant tradeoffs: the assimilation of additional variables can lead to degradation of model predictions for other variables that are otherwise well reproduced. Furthermore, we show that integrated observations such as outflow discharge cannot compensate for the lack of well-distributed data in heterogeneous hillslopes.
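
For readers unfamiliar with the EnKF analysis step mentioned in this abstract, the following is a minimal sketch of the textbook stochastic (perturbed-observation) update for a single scalar state that is observed directly. It is not the CATHY setup or the authors' implementation: the function name `enkf_update`, the ensemble size, the observation value, and the error variance are all hypothetical choices made for illustration.

```python
import random
import statistics


def enkf_update(forecast, observation, obs_var, rng):
    """One stochastic EnKF analysis step for a scalar state observed directly (H = 1)."""
    # The sample variance of the forecast ensemble stands in for the forecast error variance P.
    forecast_var = statistics.variance(forecast)
    # Kalman gain for a directly observed scalar: K = P / (P + R).
    gain = forecast_var / (forecast_var + obs_var)
    obs_std = obs_var ** 0.5
    analysis = []
    for member in forecast:
        # Each member assimilates its own randomly perturbed copy of the observation.
        perturbed_obs = observation + rng.gauss(0.0, obs_std)
        analysis.append(member + gain * (perturbed_obs - member))
    return analysis


rng = random.Random(0)
# Hypothetical prior ensemble: 500 draws from a standard normal distribution.
forecast = [rng.gauss(0.0, 1.0) for _ in range(500)]
analysis = enkf_update(forecast, observation=2.0, obs_var=0.25, rng=rng)

print("forecast mean/variance:", statistics.mean(forecast), statistics.variance(forecast))
print("analysis mean/variance:", statistics.mean(analysis), statistics.variance(analysis))
```

In this toy case the analysis ensemble moves toward the observation and its spread shrinks. The multi-source setting described in the abstract would replace the scalar gain with a matrix gain over several observed variables.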

Crossref

Directory of Open Access Journals

Archivio istituzionale della ricerca - Università di Padova

Surface Realisation Using Full Delexicalisation

Authors: Gardent Claire; Shimorina Anastasia
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 03/11/2019

Surface realisation (SR) maps a meaning representation to a sentence and can be viewed as consisting of three subtasks: word ordering, morphological inflection and contraction generation (e.g., clitic attachment in Portuguese or elision in French). We propose a modular approach to surface realisation which models each of these components separately, and evaluate our approach on the 10 languages covered by the SR'18 Surface Realisation Shared Task shallow track. We provide a detailed evaluation of how word order, morphological realisation and contractions are handled by the model and an analysis of the differences in word ordering performance across languages.

INRIA a CCSD electronic archive server

Econometric Methods for Endogenously Sampled Time Series: The Case of Commodity Price Speculation in the Steel Market

Authors: George Hall; John Rust

This paper studies the econometric problems associated with estimation of a stochastic process that is endogenously sampled. Our interest is to infer the law of motion of a discrete-time stochastic process {p_t} that is observed only at a subset of times {t_1,...,t_n} that depend on the outcome of a probabilistic sampling rule that depends on the history of the process as well as other observed covariates x_t. We focus on a particular example where p_t denotes the daily wholesale price of a standardized steel product. However, there are no formal exchanges or centralized markets where steel is traded and p_t can be observed. Instead nearly all steel transaction prices are a result of private bilateral negotiations between buyers and sellers, typically intermediated by middlemen known as steel service centers. Even though there is no central record of daily transaction prices in the steel market, we do observe transaction prices for a particular firm -- a steel service center that purchases large quantities of steel in the wholesale market for subsequent resale in the retail market. The endogenous sampling problem arises from the fact that the firm only records p_t on the days that it purchases steel. We present a parametric analysis of this problem under the assumption that the timing of steel purchases is part of an optimal trading strategy that maximizes the firm's expected discounted trading profits. We derive a parametric partial information maximum likelihood (PIML) estimator that solves the endogenous sampling problem and efficiently estimates the unknown parameters of a Markov transition probability that determines the law of motion for the underlying {p_t} process. The PIML estimator also yields estimates of the structural parameters that determine the optimal trading rule.
We also introduce an alternative consistent, less efficient, but computationally simpler simulated minimum distance (SMD) estimator that avoids high dimensional numerical integrations required by the PIML estimator. Using the SMD estimator, we provide estimates of a truncated lognormal AR(1) model of the wholesale price processes for particular types of steel plate. We use this to infer the share of the middleman's discounted profits that are due to markups paid by its retail customers, and the share due to price speculation. The latter measures the firm's success in forecasting steel prices and in timing its purchases in order to "buy low and sell high". The more successful the firm is in speculation (i.e., in strategically timing its purchases), the more serious are the potential biases that would result from failing to account for the endogeneity of the sampling process.

Research Papers in Economics

Econometric Methods for Endogenously Sampled Time Series: The Case of Commodity Price Speculation in the Steel Market

Authors: George Hall; John Rust

This paper studies the econometric problems associated with estimation of a stochastic process that is endogenously sampled. Our interest is to infer the law of motion of a discrete-time stochastic process {p_t} that is observed only at a subset of times {t_1,...,t_n} that depend on the outcome of a probabilistic sampling rule that depends on the history of the process as well as other observed covariates x_t. We focus on a particular example where p_t denotes the daily wholesale price of a standardized steel product. However, there are no formal exchanges or centralized markets where steel is traded and p_t can be observed. Instead nearly all steel transaction prices are a result of private bilateral negotiations between buyers and sellers, typically intermediated by middlemen known as steel service centers. Even though there is no central record of daily transaction prices in the steel market, we do observe transaction prices for a particular firm -- a steel service center that purchases large quantities of steel in the wholesale market for subsequent resale in the retail market. The endogenous sampling problem arises from the fact that the firm only records p_t on the days that it purchases steel. We present a parametric analysis of this problem under the assumption that the timing of steel purchases is part of an optimal trading strategy that maximizes the firm's expected discounted trading profits. We derive a parametric partial information maximum likelihood (PIML) estimator that solves the endogenous sampling problem and efficiently estimates the unknown parameters of a Markov transition probability that determines the law of motion for the underlying {p_t} process. The PIML estimator also yields estimates of the structural parameters that determine the optimal trading rule.
We also introduce an alternative consistent, less efficient, but computationally simpler simulated minimum distance (SMD) estimator that avoids high dimensional numerical integrations required by the PIML estimator. Using the SMD estimator, we provide estimates of a truncated lognormal AR(1) model of the wholesale price processes for particular types of steel plate. We use this to infer the share of the middleman's discounted profits that are due to markups paid by its retail customers, and the share due to price speculation. The latter measures the firm's success in forecasting steel prices and in timing its purchases in order to "buy low and sell high." The more successful the firm is in speculation (i.e., in strategically timing its purchases), the more serious are the potential biases that would result from failing to account for the endogeneity of the sampling process.
Keywords: Endogenous sampling, Markov processes, Maximum likelihood, Simulation estimation

Research Papers in Economics

Revisiting the Binary Linearization Technique for Surface Realization

Authors: Dagan Ido; Gardent Claire; Gurevych Iryna; Puzikov Yevgenyi
Publication venue: HAL CCSD
Publication date: 29/10/2019

End-to-end neural approaches have achieved state-of-the-art performance in many natural language processing (NLP) tasks. Yet, they often lack transparency of the underlying decision-making process, hindering error analysis and certain model improvements. In this work, we revisit the binary linearization approach to surface realization, which exhibits more interpretable behavior but has fallen short in terms of prediction accuracy. We show how enriching the training data to better capture word order constraints almost doubles the performance of the system. We further demonstrate that encoding both local and global prediction contexts yields another considerable performance boost. With the proposed modifications, the system, which ranked low in the latest shared task on multilingual surface realization, now achieves the best results in five out of ten languages, while being on par with the state-of-the-art approaches in the others.

INRIA a CCSD electronic archive server

A prototype English-Turkish statistical machine translation system

Author: Durgar El-Kahlout İlknur
Publication date: 01/01/2009

Translating one natural language (text or speech) to another natural language automatically is known as machine translation. Machine translation is one of the major, oldest, and most active areas in natural language processing. The last decade and a half have seen the rise of the use of statistical approaches to the problem of machine translation. Statistical approaches learn translation parameters automatically from alignment text instead of relying on writing rules, which is labor intensive. Although there has been quite extensive work in this area for some language pairs, there has not been research for the Turkish-English language pair. In this thesis, we present the results of our investigation and development of a state-of-the-art statistical machine translation prototype from English to Turkish. Developing an English to Turkish statistical machine translation prototype is an interesting problem from a number of perspectives. The most important challenge is that English and Turkish are typologically rather distant languages. While English has very limited morphology and rather fixed Subject-Verb-Object constituent order, Turkish is an agglutinative language with very flexible (but Subject-Object-Verb dominant) constituent order and a very rich and productive derivational and inflectional morphology with word structures that can correspond to complete phrases of several words in English when translated. Our research is focused on making scientific contributions to the state of the art by taking into account certain morphological properties of Turkish (and possibly similar languages) that have not been addressed sufficiently in previous research for other languages. In this thesis, we investigate how different morpheme-level representations of morphology on both the English and the Turkish sides impact statistical translation results.
We experiment with local word ordering on the English side to bring the word order of specific English prepositional phrases and auxiliary verb complexes in line with the corresponding case-marked noun forms and complex verb forms on the Turkish side, to help with word alignment. We augment the training data with sentences containing only content words (noun, verb, adjective, adverb) obtained from the original training data and with highly reliable phrase pairs obtained iteratively from an earlier phrase alignment, to alleviate the dearth of available parallel data. We use a word-based language model in the reranking of the n-best lists in addition to the morpheme-based language model used for decoding, so that we can incorporate both the local morphotactic constraints and local word ordering constraints. Lastly, we present a procedure for repairing the decoder output by correcting words that have incorrect morphological structure or are out-of-vocabulary with respect to the training data and language model, to further improve the translations. We also include fine-grained evaluation results and some oracle scores with the BLEU+ tool, which is an extension of the evaluation metric BLEU. After all research and development, we improve from 19.77 BLEU points for our word-based baseline model to 27.60 BLEU points, an improvement of 7.83 points or about 40% relative.
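
The improvement figures quoted at the end of this abstract can be checked with a few lines of arithmetic. The two BLEU scores are taken from the abstract; the variable names are only illustrative.

```python
baseline_bleu = 19.77  # word-based baseline, as reported in the abstract
final_bleu = 27.60     # final system, as reported in the abstract

# Absolute gain in BLEU points and gain relative to the baseline score.
absolute_gain = final_bleu - baseline_bleu
relative_gain = 100.0 * absolute_gain / baseline_bleu

print(f"absolute gain: {absolute_gain:.2f} BLEU points")
print(f"relative gain: {relative_gain:.1f}%")
```

The relative gain comes out at roughly 39.6%, which is consistent with the "about 40%" stated in the abstract.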

Sabanci University Research Database

Scalable approximate inference methods for Bayesian deep learning

Author: Ritter Julian Hippolyt
Publication venue: UCL (University College London)
Publication date: 28/05/2023

This thesis proposes multiple methods for approximate inference in deep Bayesian neural networks split across three parts. The first part develops a scalable Laplace approximation based on a block-diagonal Kronecker-factored approximation of the Hessian. This approximation accounts for parameter correlations, overcoming the overly restrictive independence assumption of diagonal methods, while avoiding the quadratic scaling in the number of parameters of the full Laplace approximation. The chapter further extends the method to online learning where datasets are observed one at a time. As the experiments demonstrate, modelling correlations between the parameters leads to improved performance over the diagonal approximation in uncertainty estimation and continual learning; in the latter setting in particular, the improvements can be substantial. The second part explores two parameter-efficient approaches for variational inference in neural networks, one based on factorised binary distributions over the weights, one extending ideas from sparse Gaussian processes to neural network weight matrices. The former encounters similar underfitting issues as mean-field Gaussian approaches, which can be alleviated by a MAP-style method in a hierarchical model. The latter, based on an extension of Matheron's rule to matrix normal distributions, achieves comparable uncertainty estimation performance to ensembles with the accuracy of a deterministic network while using only 25% of the number of parameters of a single ResNet-50. The third part introduces TyXe, a probabilistic programming library built on top of Pyro to facilitate turning PyTorch neural networks into Bayesian ones. In contrast to existing frameworks, TyXe avoids introducing a layer abstraction, allowing it to support arbitrary architectures.
This is demonstrated in a range of applications, from image classification with torchvision ResNets over node labelling with DGL graph neural networks to incorporating uncertainty into neural radiance fields with PyTorch3D.

UCL Discovery

A Comprehensive Review of Data-Driven Co-Speech Gesture Generation

Authors: Ahuja Chaitanya; Henter Gustav Eje; Kucherenko Taras; Neff Michael; Nyatsanga Simbarashe
Publication venue: Wiley
Publication date: 10/04/2023

Gestures that accompany speech are an essential part of natural and efficient embodied human communication. The automatic generation of such co-speech gestures is a long-standing problem in computer animation and is considered an enabling technology in film, games, virtual social spaces, and for interaction with social robots. The problem is made challenging by the idiosyncratic and non-periodic nature of human co-speech gesture motion, and by the great diversity of communicative functions that gestures encompass. Gesture generation has seen surging interest recently, owing to the emergence of more and larger datasets of human gesture motion, combined with strides in deep-learning-based generative models that benefit from the growing availability of data. This review article summarizes co-speech gesture generation research, with a particular focus on deep generative models. First, we articulate the theory describing human gesticulation and how it complements speech. Next, we briefly discuss rule-based and classical statistical gesture synthesis, before delving into deep learning approaches. We employ the choice of input modalities as an organizing principle, examining systems that generate gestures from audio, text, and non-linguistic input. We also chronicle the evolution of the related training data sets in terms of size, diversity, motion quality, and collection method. Finally, we identify key research challenges in gesture generation, including data availability and quality; producing human-like motion; grounding the gesture in the co-occurring speech, in interaction with other speakers, and in the environment; performing gesture evaluation; and integration of gesture synthesis into applications. We highlight recent approaches to tackling the various key challenges, as well as the limitations of these approaches, and point toward areas of future development. Comment: Accepted for EUROGRAPHICS 202

arXiv.org e-Print Archive

Recurrent Neural Network Language Generation for Dialogue Systems

Author: Wen Tsung-Hsien
Publication venue: University of Cambridge
Publication date: 08/05/2018

Language is the principal medium for ideas, while dialogue is the most natural and effective way for humans to interact with and access information from machines. Natural language generation (NLG) is a critical component of spoken dialogue and it has a significant impact on usability and perceived quality. Many commonly used NLG systems employ rules and heuristics, which tend to generate inflexible and stylised responses without the natural variation of human language. However, the frequent repetition of identical output forms can quickly make dialogue tedious for most real-world users. Additionally, these rules and heuristics are not scalable and hence not trivially extensible to other domains or languages. A statistical approach to language generation can learn language decisions directly from data without relying on hand-coded rules or heuristics, which brings scalability and flexibility to NLG. Statistical models also provide an opportunity to learn in-domain human colloquialisms and cross-domain model adaptations. A robust and quasi-supervised NLG model is proposed in this thesis. The model leverages a Recurrent Neural Network (RNN)-based surface realiser and a gating mechanism applied to input semantics. The model is motivated by the Long Short-Term Memory (LSTM) network. The RNN-based surface realiser and gating mechanism use a neural network to learn end-to-end language generation decisions from input dialogue act and sentence pairs; it also integrates sentence planning and surface realisation into a single optimisation problem. The single optimisation not only bypasses the costly intermediate linguistic annotations but also generates more natural and human-like responses. Furthermore, a domain adaptation study shows that the proposed model can be readily adapted and extended to new dialogue domains via a proposed recipe.
Continuing the success of end-to-end learning, the second part of the thesis speculates on building an end-to-end dialogue system by framing it as a conditional generation problem. The proposed model encapsulates a belief tracker with a minimal state representation and a generator that takes the dialogue context to produce responses. These features suggest comprehension and fast learning. The proposed model is capable of understanding requests and accomplishing tasks after training on only a few hundred human-human dialogues. A complementary Wizard-of-Oz data collection method is also introduced to facilitate the collection of human-human conversations from online workers. The results demonstrate that the proposed model can talk to human judges naturally, without any difficulty, for a sample application domain. In addition, the results suggest that the introduction of a stochastic latent variable can help the system model intrinsic variation in communicative intention much better. Tsung-Hsien Wen's Ph.D. is supported by Toshiba Research Europe Ltd, Cambridge Research Laborator

Apollo (Cambridge)