A syntactic language model based on incremental CCG parsing
Syntactically-enriched language models (parsers) constitute a promising component in applications such as machine translation and speech recognition. To maintain a useful level of accuracy, existing parsers are non-incremental and must span a combinatorially growing space of possible structures as every input word is processed. This prohibits their incorporation into standard linear-time decoders. In this paper, we present an incremental, linear-time dependency parser based on Combinatory Categorial Grammar (CCG) and classification techniques. We devise a deterministic transform of CCGbank canonical derivations into incremental ones, and train our parser on this data. We discover that a cascaded, incremental version provides an appealing balance between efficiency and accuracy.
A syntactified direct translation model with linear-time decoding
Recent syntactic extensions of statistical translation models work with a synchronous context-free or tree-substitution grammar extracted from an automatically parsed parallel corpus. The decoders accompanying these extensions typically exceed quadratic time complexity. This paper extends the Direct Translation Model 2 (DTM2) with syntax while maintaining linear-time decoding. We employ a linear-time parsing algorithm based on an eager, incremental interpretation of Combinatory Categorial Grammar (CCG). As every input word is processed, the local parsing decisions resolve ambiguity eagerly, by selecting a single supertag–operator pair for extending the dependency parse incrementally. Alongside translation features extracted from the derived parse tree, we explore syntactic features extracted from the incremental derivation process. Our empirical experiments show that our model significantly outperforms the state-of-the-art DTM2 system.
Supertagged phrase-based statistical machine translation
Until quite recently, extending Phrase-based Statistical Machine Translation (PBSMT) with syntactic structure caused system performance to deteriorate. In this work we show that incorporating lexical syntactic descriptions in the form of supertags can yield significantly better PBSMT systems. We describe a novel PBSMT model that integrates supertags into the target language model and the target side of the translation model. Two kinds of supertags are employed: those from Lexicalized Tree-Adjoining Grammar and Combinatory Categorial Grammar. Despite the differences between these two approaches, the supertaggers give similar improvements. In addition to supertagging, we also explore the utility of a surface global grammaticality measure based on combinatory operators. We perform various experiments on the Arabic to English NIST 2005 test set addressing issues such as sparseness, scalability and the utility of system subcomponents. Our best result (0.4688 BLEU) improves by 6.1% relative to a state-of-the-art PBSMT model, which compares very favourably with the leading systems on the NIST 2005 task.
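The two figures quoted in this abstract can be related with a little arithmetic. The sketch below assumes that "6.1% relative" means the best score equals 1.061 times the baseline score; the implied baseline is an inference from the quoted numbers, not a value reported in the abstract, and the function name `implied_baseline` is purely illustrative.

```python
def implied_baseline(best_score: float, relative_gain: float) -> float:
    """Return the baseline implied by a best score and its relative gain.

    Assumes relative_gain = (best_score - baseline) / baseline.
    """
    return best_score / (1.0 + relative_gain)


best_bleu = 0.4688      # best result quoted in the abstract
relative_gain = 0.061   # 6.1% relative improvement quoted in the abstract

# Both values below are inferred, not reported in the abstract.
baseline = implied_baseline(best_bleu, relative_gain)
absolute_gain = best_bleu - baseline

print(f"implied baseline BLEU: {baseline:.4f}")
print(f"implied absolute gain: {absolute_gain:.4f}")
```

Under this reading, the absolute gain is the best score minus the implied baseline; a different definition of "relative" would give a different baseline.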
MaTrEx: the DCU machine translation system for IWSLT 2007
In this paper, we give a description of the machine translation system developed at DCU that was used for our second participation in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT 2007). In this participation, we focus on some new methods to improve system quality. Specifically, we try our word packing technique for different language pairs, we smooth our translation tables with out-of-domain word translations for the Arabic–English and Chinese–English tasks in order to address the high number of out-of-vocabulary items, and finally we deploy a translation-based model for case and punctuation restoration.
An efficient numerical method for the modified regularized long wave equation using Fourier spectral method
The modified regularized long wave (MRLW) equation is solved numerically using the Fourier spectral collection method. The MRLW equation is discretized in space by the Fourier spectral method and in time by the leap-frog method. To validate the efficiency, accuracy and simplicity of the method, four case studies are solved: the motion of a single solitary wave, the interaction of two solitary waves, the interaction of three solitary waves, and a Maxwellian initial condition pulse. The L2 and L∞ error norms are computed for the motion of a single solitary wave. To assess the conservation properties of the MRLW equation, three invariants of motion are evaluated for all test problems.
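For orientation, the quantities named in this abstract can be written out as follows. This is a sketch under stated assumptions: the abstract does not give the equation or the norm definitions, so the form of the MRLW equation (with a positive constant \mu) and the grid-based norms on a uniform mesh x_j with spacing h are the forms commonly used in the literature, not necessarily the paper's exact notation.

```latex
% Assumed form of the MRLW equation (not stated in the abstract); \mu > 0 is a constant.
\begin{align}
  u_t + u_x + 6u^2 u_x - \mu\, u_{xxt} &= 0 \\
% Assumed discrete error norms comparing the exact solution u^{\mathrm{ex}}
% with the numerical solution u^{\mathrm{num}} at the grid points x_j.
  L_2 &= \sqrt{\, h \sum_{j} \left| u^{\mathrm{ex}}_j - u^{\mathrm{num}}_j \right|^2 } \\
  L_\infty &= \max_{j} \left| u^{\mathrm{ex}}_j - u^{\mathrm{num}}_j \right|
\end{align}
```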
Syntactic phrase-based statistical machine translation
Phrase-based statistical machine translation (PBSMT) systems represent the dominant approach in MT today. However, unlike systems in other paradigms, it has proven difficult to date to incorporate syntactic knowledge in order to improve translation quality. This paper improves on recent research which uses 'syntactified' target language phrases, by incorporating supertags as constraints to better resolve parse tree fragments. In addition, we do not impose any sentence-length limit, and using a log-linear decoder, we outperform a state-of-the-art PBSMT system by over 1.3 BLEU points (or 3.51% relative) on the NIST 2003 Arabic-English test corpus.
Improving Traffic Safety And Drivers' Behavior In Reduced Visibility Conditions
This study is concerned with the safety risk of reduced visibility on roadways. Inclement weather events such as fog/smoke (FS), heavy rain (HR), and high winds affect every road by impacting pavement conditions, vehicle performance, visibility distance, and drivers' behavior. Moreover, they affect travel demand, traffic safety, and traffic flow characteristics. Visibility in particular is critical to the task of driving, and a reduction in visibility due to FS or other weather events such as HR is a major factor that affects safety and proper traffic operation. Real-time measurement of visibility and an understanding of drivers' responses when visibility falls below a certain acceptable level may help reduce the chances of visibility-related crashes. In this regard, one way to improve safety under reduced visibility conditions (i.e., reduce the risk of visibility-related crashes) is to improve drivers' behavior under such adverse weather conditions. Therefore, one of the objectives of this research was to investigate the factors affecting drivers' stated behavior in adverse visibility conditions, and to examine whether drivers rely on and follow advisory or warning messages displayed on portable changeable message signs (CMS) and/or variable speed limit (VSL) signs in different visibility and traffic conditions and on two types of roadways: freeways and two-lane roads. The data used for the analyses were obtained from a self-reported questionnaire survey carried out among 566 drivers in Central Florida, USA. Several categorical data analysis techniques, such as conditional distributions, odds ratios, and Chi-Square tests, were applied. In addition, two modeling approaches, bivariate and multivariate probit models, were estimated. The results revealed that gender, age, road type, visibility condition, and familiarity with VSL signs were the significant factors affecting the likelihood of reducing speed following CMS/VSL instructions in reduced visibility conditions.
Other objectives of this survey study were to determine the content of messages that would achieve the best perceived safety and drivers' compliance, and to examine the best way to improve safety during these adverse visibility conditions. The results indicated that "Caution-fog ahead-reduce speed" was the best message and that using CMS and VSL signs together was the best way to improve safety during such inclement weather situations. In addition, this research aimed to thoroughly examine drivers' responses under low visibility conditions and to quantify the impacts and values of various factors found to be related to drivers' compliance and drivers' satisfaction with VSL and CMS instructions in different visibility and traffic conditions. To achieve these goals, Exploratory Factor Analysis (EFA) and Structural Equation Modeling (SEM) approaches were adopted. The results revealed that drivers' satisfaction with VSL/CMS was the most significant factor positively affecting drivers' compliance with advice or warning messages displayed on VSL/CMS signs under different fog conditions, followed by driver factors. Moreover, it was found that roadway type affected drivers' compliance with VSL instructions under medium and heavy fog conditions. Furthermore, drivers' familiarity with VSL signs and driver factors were the significant factors affecting drivers' satisfaction with VSL/CMS advice under reduced visibility conditions. Based on the findings of the survey-based study, several recommendations are suggested as guidelines to improve drivers' behavior in such reduced visibility conditions by enhancing drivers' compliance with VSL/CMS instructions. Underground loop detectors (LDs) are the most common freeway traffic surveillance technologies used for various intelligent transportation system (ITS) applications such as travel time estimation and crash detection.
Recently, the emphasis in freeway management has been shifting towards using LD data to develop real-time crash-risk assessment models. Numerous studies have established statistical links between freeway crash risk and traffic flow characteristics. However, the relationship between traffic flow variables (i.e., speed, volume and occupancy) and crashes that occur under reduced visibility (VR crashes) is not well understood. Thus, another objective of this research was to explore the occurrence of VR crashes on freeways using real-time traffic surveillance data collected from loop detectors (LDs) and radar sensors. In addition, it examines the difference between VR crashes and those occurring in clear visibility conditions (CV crashes). To achieve these objectives, Random Forests (RF) and matched case-control logistic regression models were estimated. The results indicated that the traffic flow variables leading to VR crashes are slightly different from those leading to CV crashes. It was found that higher occupancy observed over about half a mile between the nearest upstream and downstream stations increases the risk of both VR and CV crashes. Moreover, an increase in the average speed observed over the same half mile increases the probability of a VR crash. On the other hand, high speed variation coupled with lower average speed observed over the same half mile increases the likelihood of CV crashes. Moreover, two issues that have not been explicitly addressed in prior studies are: (1) the possibility of predicting VR crashes using traffic data collected from the Automatic Vehicle Identification (AVI) sensors installed on Expressways and (2) which traffic data is advantageous for predicting VR crashes, LDs or AVIs.
Thus, this research attempts to examine the relationships between VR crash risk and real-time traffic data collected from LDs installed on two Freeways in Central Florida (I-4 and I-95) and from AVI sensors installed on two Expressways (SR 408 and SR 417). It also investigates which data is better for predicting VR crashes. The approach adopted here involves developing Bayesian matched case-control logistic regression models using the historical VR crashes, LD data and AVI data. For the models estimated from LD data, the average speed observed at the nearest downstream station, along with the coefficient of variation in speed observed at the nearest upstream station, both at 5-10 minutes prior to the crash time, were found to have a significant effect on VR crash risk. However, for the model developed from AVI data, the coefficient of variation in speed observed at the crash segment, at 5-10 minutes prior to the crash time, affected the likelihood of VR crash occurrence. The question of which traffic data (LDs or AVI) is better for predicting VR crashes is also discussed.