    An Evidence Based Approach To Determining Residential Occupancy and its Role in Demand Response Management

    This article introduces a methodological approach for analysing time series data from multiple sensors in order to estimate home occupancy. The approach combines the Dempster-Shafer theory, which allows the fusion of ‘evidence’ from multiple sensors, with the Hidden Markov Model. The procedure addresses some of the practicalities of occupancy estimation, including the blind estimation of sensor distributions during unoccupied and occupied states, and issues of occupancy inference when some sensors have missing data. The approach is applied to preliminary data from a residential family home on the North Coast of Scotland. Features derived from sensors that monitored electrical power, dew point temperature and indoor CO2 concentration were fused and the Hidden Markov Model applied to predict the occupancy profile. According to cross-validation with available ground truth information, the approach is able to predict daytime occupancy while effectively handling periods of missing sensor data. Knowledge of occupancy is then fused with consumption behaviour, and a simple metric is developed to assess how likely it is that a household can participate in demand response at different periods during the day. The benefits of demand response initiatives are qualitatively discussed. The approach could be used to assist in the transition towards more active energy citizens, as envisaged by the smart grid.
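As a rough illustration of the evidence-fusion step mentioned in this abstract, the sketch below applies the standard Dempster rule of combination to two sensors over the frame {occupied, unoccupied}. The mass values and the names `power`, `co2` and `missing` are invented for illustration and are not taken from the article; the article's own feature extraction and Hidden Markov Model stage are not reproduced here.

```python
from itertools import product

OCC = frozenset({"occupied"})
UNOCC = frozenset({"unoccupied"})
EITHER = OCC | UNOCC  # mass assigned to "don't know"


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to a mass; masses sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # The two pieces of evidence contradict each other.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalise by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical masses for one time step (illustrative values only).
power = {OCC: 0.6, UNOCC: 0.1, EITHER: 0.3}
co2 = {OCC: 0.5, UNOCC: 0.2, EITHER: 0.3}
missing = {EITHER: 1.0}  # a sensor with no reading carries no evidence

fused = combine(power, co2)
unchanged = combine(power, missing)
```

One plausible way to represent a sensor with missing data is the vacuous mass function `missing`: combining with it leaves the other sensor's masses unchanged, so fusion can proceed with whichever sensors are available. Whether the article handles missing data this way is an assumption.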

    Explicit length modelling for statistical machine translation

    Explicit length modelling has been previously explored in statistical pattern recognition with successful results. In this paper, two length models along with two parameter estimation methods and two alternative parametrisations for statistical machine translation (SMT) are presented. More precisely, we incorporate explicit bilingual length modelling in a state-of-the-art log-linear SMT system as an additional feature function in order to prove the contribution of length information. Finally, a systematic evaluation on reference SMT tasks considering different language pairs proves the benefits of explicit length modelling.
    Work supported by the EC (FEDER/FSE) under the transLectures project (FP7-ICT-2011-7-287755) and the Spanish MEC/MICINN under the MIPRCV "Consolider Ingenio 2010" program (CSD2007-00018) and iTrans2 (TIN2009-14511) projects and FPU grant (AP2010-4349). Also supported by the Spanish MITyC under the erudito.com (TSI-020110-2009-439) project and by the Generalitat Valenciana under grants Prometeo/2009/014 and GV/2010/067, and by the UPV under the AdInTAO (20091027) project. The authors wish to thank the anonymous reviewers for their criticisms and suggestions.
    Silvestre Cerdà, JA.; Andrés Ferrer, J.; Civera Saiz, J. (2012). Explicit length modelling for statistical machine translation. Pattern Recognition. 45(9):3183-3192. https://doi.org/10.1016/j.patcog.2012.01.006
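For orientation, the usual log-linear SMT decision rule, with a length model added as one more feature function, can be written as below. The first line is the standard formulation; the specific form of $h_{\text{len}}$ is only one simple possibility, assumed here for illustration, and is not necessarily either of the two length models proposed in the paper.

```latex
\hat{e} \;=\; \operatorname*{arg\,max}_{e} \; \sum_{m=1}^{M} \lambda_m \, h_m(f, e),
\qquad
h_{\text{len}}(f, e) \;=\; \log p\bigl(|e| \,\big|\, |f|\bigr)
```

Here $f$ is the source sentence, $e$ a candidate translation, $h_m$ the feature functions with weights $\lambda_m$, and $|\cdot|$ denotes sentence length in words.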

    TransSearch: from a bilingual concordancer to a translation finder

    As basic as bilingual concordancers may appear, they are some of the most widely used computer-assisted translation tools among professional translators. Nevertheless, they still do not benefit from recent breakthroughs in machine translation. This paper describes the improvement of the commercial bilingual concordancer TransSearch in order to embed a word alignment feature. The use of statistical word alignment methods allows the system to spot user query translations, and thus the tool is transformed into a translation search engine. We describe several translation identification and postprocessing algorithms that enhance the application. The excellent results obtained using a large translation memory consisting of 8.3 million sentence pairs are confirmed via human evaluation.

    HMM word and phrase alignment for statistical machine translation

    HMM-based models are developed for the alignment of words and phrases in bitext. The models are formulated so that alignment and parameter estimation can be performed efficiently. We find that Chinese-English word alignment performance is comparable to that of IBM Model-4 even over large training bitexts. Phrase pairs extracted from word alignments generated under the model can also be used for phrase-based translation, and in Chinese to English and Arabic to English translation, performance is comparable to systems based on Model-4 alignments. Direct phrase pair induction under the model is described and shown to improve translation performance.
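To make the idea of HMM-based word alignment concrete, here is a minimal sketch of Viterbi decoding for a word-level alignment HMM: hidden states are positions in one sentence, observations are the words of the other, and transitions depend only on the jump between consecutive positions. The probability tables, the toy German-English sentence pair and the function name `viterbi_align` are all hypothetical; this is the generic textbook formulation, not the word-to-phrase model developed in the paper.

```python
import math


def viterbi_align(states, observed, t_prob, jump_prob):
    """Return, for each observed word, the index of the state word it aligns to.

    t_prob[(o, s)]: toy lexical probability of observing word o from state word s.
    jump_prob(d):   toy transition score for moving d positions between states.
    """
    n_states = len(states)
    floor = 1e-12  # back-off for word pairs missing from the table

    def emit(j, i):
        return math.log(t_prob.get((observed[j], states[i]), floor))

    # delta[j][i]: best log-score of any path ending with observed[j] -> states[i]
    delta = [[emit(0, i) - math.log(n_states) for i in range(n_states)]]
    back = []
    for j in range(1, len(observed)):
        row, ptr = [], []
        for i in range(n_states):
            prev = max(
                range(n_states),
                key=lambda k: delta[-1][k] + math.log(jump_prob(i - k)),
            )
            row.append(delta[-1][prev] + math.log(jump_prob(i - prev)) + emit(j, i))
            ptr.append(prev)
        delta.append(row)
        back.append(ptr)

    # Trace the best path backwards from the best final state.
    i = max(range(n_states), key=lambda k: delta[-1][k])
    path = [i]
    for ptr in reversed(back):
        i = ptr[i]
        path.append(i)
    return path[::-1]


# Toy example with made-up, unnormalised scores.
states = ["house", "the"]
observed = ["das", "Haus"]
t_prob = {
    ("das", "the"): 0.9, ("Haus", "house"): 0.9,
    ("das", "house"): 0.05, ("Haus", "the"): 0.05,
}
jump_prob = lambda d: 0.5 if d == 1 else 0.25

alignment = viterbi_align(states, observed, t_prob, jump_prob)
print(alignment)  # [1, 0]: "das" -> "the", "Haus" -> "house"
```

In a trained system the lexical and jump tables would be estimated from bitext (typically with EM) rather than written by hand.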

    Constrained word alignment models for statistical machine translation

    Word alignment is a fundamental and crucial component in Statistical Machine Translation (SMT) systems. Despite the enormous progress made in the past two decades, this task remains an active research topic simply because the quality of word alignment is still far from optimal. Most state-of-the-art word alignment models are grounded on statistical learning theory treating word alignment as a general sequence alignment problem, where many linguistically motivated insights are not incorporated. In this thesis, we propose new word alignment models with linguistically motivated constraints in a bid to improve the quality of word alignment for Phrase-Based SMT systems (PB-SMT). We start the exploration with an investigation into segmentation constraints for word alignment by proposing a novel algorithm, namely word packing, which is motivated by the fact that one concept expressed by one word in one language can frequently surface as a compound or collocation in another language. Our algorithm takes advantage of the interaction between segmentation and alignment, starting with some segmentation for both the source and target language and updating the segmentation with respect to the word alignment results using state-of-the-art word alignment models; thereafter a refined word alignment can be obtained based on the updated segmentation. In this process, the updated segmentation acts as a hard constraint on the word alignment models and reduces the complexity of the alignment models by generating more 1-to-1 correspondences through word packing. Experimental results show that this algorithm can lead to statistically significant improvements over the state-of-the-art word alignment models. Given that word packing imposes "hard" segmentation constraints on the word aligner, which is prone to introducing noise, we propose two new word alignment models using syntactic dependencies as soft constraints. 
The first model is a syntactically enhanced discriminative word alignment model, where we use a set of feature functions to express the syntactic dependency information encoded in both source and target languages. On the one hand, this model enjoys great flexibility in its capacity to incorporate multiple features; on the other hand, this model is designed to facilitate model tuning for different objective functions. Experimental results show that using syntactic constraints can improve the performance of the discriminative word alignment model, which also leads to better PB-SMT performance compared to using state-of-the-art word alignment models. The second model is a syntactically constrained generative word alignment model, where we add in a syntactic coherence model over the target phrases in the context of HMM word-to-phrase alignment. The advantages of our model are that (i) the addition of the syntactic coherence model preserves the efficient parameter estimation procedures; and (ii) the flexibility of the model can be increased so that it can be tuned according to different objective functions. Experimental results show that tuning this model properly leads to a significant gain in MT performance over the state-of-the-art.