
    LATE Ain'T Earley: A Faster Parallel Earley Parser

    We present the LATE algorithm, an asynchronous variant of the Earley algorithm for parsing context-free grammars. The Earley algorithm is naturally task-based, but is difficult to parallelize because of dependencies between the tasks. The LATE algorithm uses additional data structures to maintain information about the state of the parse so that work items may be processed in any order. This property allows the LATE algorithm to be sped up using task parallelism. We show that the LATE algorithm can achieve a 120x speedup over the Earley algorithm on a natural language task.
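    The enabling property here is order-independence: once predictions and completions are indexed, chart items can be processed in any order and thus distributed across threads. Below is a minimal single-threaded Python sketch of that idea, using a work queue plus `waiting` and `completed` indexes; it illustrates the order-independence that makes task parallelism possible and is not the authors' implementation (the item representation and index names are assumptions, and empty productions are not handled).

```python
from collections import defaultdict, deque

# Worklist Earley recognizer. The "waiting" and "completed" indexes let
# items be processed in any order: whichever of a completion and its
# waiting customer arrives first, the other side replays it later.
def parse(grammar, start, tokens):
    # An item is (lhs, rhs, dot, origin): rule lhs -> rhs with the dot
    # at position `dot`, started at input position `origin`.
    n = len(tokens)
    seen = set()                  # items already enqueued, with position
    work = deque()                # work queue; processing order is free
    waiting = defaultdict(set)    # (nonterminal, pos) -> items waiting there
    completed = defaultdict(set)  # (nonterminal, pos) -> end positions found

    def add(item, pos):
        if (item, pos) not in seen:
            seen.add((item, pos))
            work.append((item, pos))

    for rhs in grammar[start]:
        add((start, rhs, 0, 0), 0)

    while work:
        (lhs, rhs, dot, origin), pos = work.popleft()
        if dot == len(rhs):                       # complete
            completed[(lhs, origin)].add(pos)
            for l, r, d, o in list(waiting[(lhs, origin)]):
                add((l, r, d + 1, o), pos)
            continue
        sym = rhs[dot]
        if sym in grammar:                        # predict
            waiting[(sym, pos)].add((lhs, rhs, dot, origin))
            # replay completions that may already have happened
            for end in list(completed[(sym, pos)]):
                add((lhs, rhs, dot + 1, origin), end)
            for prod in grammar[sym]:
                add((sym, prod, 0, pos), pos)
        elif pos < n and tokens[pos] == sym:      # scan
            add((lhs, rhs, dot + 1, origin), pos + 1)

    return n in completed[(start, 0)]

grammar = {"S": [("S", "+", "S"), ("a",)]}
print(parse(grammar, "S", list("a+a+a")))  # True
```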

    Gappy Pattern Matching on GPUs for On-Demand Extraction of Hierarchical Translation Grammars

    Grammars for machine translation can be materialized on demand by finding source phrases in an indexed parallel corpus and extracting their translations. This approach is limited in practical applications by the computational expense of online lookup and extraction. For phrase-based models, recent work has shown that on-demand grammar extraction can be greatly accelerated by parallelization on general purpose graphics processing units (GPUs), but these algorithms do not work for hierarchical models, which require matching patterns that contain gaps. We address this limitation by presenting a novel GPU algorithm for on-demand hierarchical grammar extraction that is at least an order of magnitude faster than a comparable CPU algorithm when processing large batches of sentences. In terms of end-to-end translation, with decoding on the CPU, we increase throughput by roughly two thirds on a standard MT evaluation dataset. The GPU necessary to achieve these improvements increases the cost of a server by about a third. We believe that GPU-based extraction of hierarchical grammars is an attractive proposition, particularly for MT applications that demand high throughput.
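    To make the gappy-matching problem concrete, here is a hypothetical sequential Python sketch that finds occurrences of a pattern whose contiguous sub-phrases are separated by bounded, non-empty gaps; the limits `MAX_GAP` and `MAX_SPAN` are illustrative assumptions, and the paper's contribution is a GPU algorithm that parallelizes this kind of lookup over large batches rather than scanning linearly.

```python
# Find spans where the sub-phrases in `parts` occur in order, separated
# by non-empty gaps (the "X" in a hierarchical rule like "ne X pas").
MAX_GAP = 5    # assumed maximum tokens a gap may cover
MAX_SPAN = 10  # assumed maximum total span of a match

def find_gappy(corpus, parts):
    def occurrences(phrase):
        k = len(phrase)
        return [i for i in range(len(corpus) - k + 1)
                if corpus[i:i + k] == phrase]

    def extend(spans, phrase):
        occs = occurrences(phrase)
        out = []
        for start, end in spans:
            for i in occs:
                # the gap must be non-empty and bounded
                if end < i <= end + MAX_GAP and i + len(phrase) - start <= MAX_SPAN:
                    out.append((start, i + len(phrase)))
        return out

    first = parts[0]
    spans = [(i, i + len(first)) for i in occurrences(first)]
    for phrase in parts[1:]:
        spans = extend(spans, phrase)
    return spans

corpus = "ne veux pas le faire ne le fais pas".split()
print(find_gappy(corpus, [["ne"], ["pas"]]))  # [(0, 3), (5, 9)]
```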

    User modelling for robotic companions using stochastic context-free grammars

    Creating models of others is a sophisticated human ability that robotic companions need to develop in order to have successful interactions. This thesis proposes user modelling frameworks to personalise the interaction between a robot and its user, and devises novel scenarios where robotic companions may apply these user modelling techniques. We tackle the creation of user models in a hierarchical manner, using a streamlined version of the Hierarchical Attentive Multiple-Models for Execution and Recognition (HAMMER) architecture to detect low-level user actions, and taking advantage of Stochastic Context-Free Grammars (SCFGs) to instantiate higher-level models which recognise uncertain and recursive sequences of low-level actions. We discuss two distinct scenarios for robotic companions: a humanoid sidekick for power-wheelchair users and a companion for hospital patients. Next, we address the limitations of these scenarios by applying our user modelling techniques and designing two further scenarios that fully take advantage of the user model: a wheelchair driving tutor which models the user's abilities, and a musical collaborator which learns the preferences of its users. The methodology produced interesting results in all scenarios: users preferred the actual robot over a simulator as a wheelchair sidekick; hospital patients rated their interactions with the companion positively regardless of their age; most users agreed that the musical collaborator had become a better accompanist with our framework; and users' driving performance improved when the robotic tutor instructed them to repeat a task. As our workforce ages and the care requirements of our society grow, robots will need to play a role in helping us lead better lives. This thesis shows that, through the use of SCFGs, adaptive user models may be generated which can then be used by robots to assist their users.
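    As an illustration of how an SCFG can score an uncertain sequence of recognized low-level actions, the following Python sketch runs a Viterbi-style probabilistic CYK over a small hand-written grammar in Chomsky normal form; the "make tea" grammar and action names are invented for the example, whereas the thesis's grammars model real interaction data.

```python
from collections import defaultdict

# Binary rules (lhs, (B, C)) and preterminal rules (lhs, action) -> prob.
binary = {("TASK", ("BOIL", "POUR")): 1.0,
          ("BOIL", ("fill", "heat")): 1.0,
          ("POUR", ("pour", "stir")): 0.7,
          ("POUR", ("pour", "pour")): 0.3}
unary = {("fill", "fill_kettle"): 1.0,
         ("heat", "switch_on"): 1.0,
         ("pour", "pour_water"): 1.0,
         ("stir", "stir_cup"): 1.0}

def score(actions, start="TASK"):
    """Viterbi (best-parse) probability of `actions` under the SCFG."""
    n = len(actions)
    chart = defaultdict(float)   # (i, j, nonterminal) -> best probability
    for i, a in enumerate(actions):
        for (lhs, term), p in unary.items():
            if term == a:
                chart[(i, i + 1, lhs)] = max(chart[(i, i + 1, lhs)], p)
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (lhs, (b, c)), p in binary.items():
                    q = p * chart[(i, k, b)] * chart[(k, j, c)]
                    chart[(i, j, lhs)] = max(chart[(i, j, lhs)], q)
    return chart[(0, n, start)]

obs = ["fill_kettle", "switch_on", "pour_water", "stir_cup"]
print(score(obs))  # 0.7: probability of the best parse of the observations
```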

    Fast machine translation on parallel and massively parallel hardware

    Parallel systems have been widely adopted in the field of machine translation because the raw computational power they offer is well suited to this computationally intensive task. However, programming for parallel hardware is not trivial, as it requires redesigning existing algorithms. In my thesis I design efficient algorithms for machine translation on parallel hardware. I identify memory accesses as the biggest bottleneck to processing speed and propose novel algorithms that minimize them. I present three distinct case studies in which minimizing memory accesses substantially improves speed. Starting with statistical machine translation, I design a phrase table that makes decoding ten times faster on a multi-threaded CPU. Next, I design a GPU-based n-gram language model that is twice as fast per £ as a highly optimized CPU implementation. Turning to neural machine translation, I design new stochastic gradient descent techniques that make end-to-end training twice as fast. The work in this thesis has been incorporated into two popular machine translation toolkits: Moses and Marian.
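    The following Python sketch illustrates the thesis's central theme of minimizing memory accesses, contrasting a pointer-heavy dict-of-lists with a flat, offset-indexed (CSR-style) phrase table whose translations sit in contiguous arrays; the layout is an assumption for illustration, not the actual Moses data structure.

```python
import bisect

class FlatPhraseTable:
    """Phrase table stored as flat parallel arrays plus an offset index,
    so a lookup touches a few contiguous regions instead of chasing
    per-entry pointers."""

    def __init__(self, entries):
        # entries: {source phrase: [(target phrase, probability), ...]}
        self.keys = sorted(entries)          # sorted for binary search
        self.offsets = [0]
        self.targets, self.scores = [], []   # flat, contiguous columns
        for k in self.keys:
            for tgt, p in entries[k]:
                self.targets.append(tgt)
                self.scores.append(p)
            self.offsets.append(len(self.targets))

    def lookup(self, source):
        i = bisect.bisect_left(self.keys, source)
        if i == len(self.keys) or self.keys[i] != source:
            return []
        lo, hi = self.offsets[i], self.offsets[i + 1]
        return list(zip(self.targets[lo:hi], self.scores[lo:hi]))

table = FlatPhraseTable({
    "das Haus": [("the house", 0.8), ("the home", 0.2)],
    "Haus": [("house", 0.9)],
})
print(table.lookup("das Haus"))  # [('the house', 0.8), ('the home', 0.2)]
```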

    Translating Visualization Interaction into Natural Language

    Richly interactive visualization tools are increasingly popular for data exploration and analysis in a wide variety of domains. Recent advancements in data collection and storage call for more complex analytical tasks to make sense of readily available datasets, and more sophisticated tools are needed to complete those tasks. However, as these visualization tools grow more complicated, it becomes increasingly difficult to learn interaction sequences, recall past queries asked of a visualization, and correctly interpret visual states in order to forage the data. Moreover, the high interactivity of such tools increases the challenge of connecting low-level acquired information to higher-level analytical questions and hypotheses, in order to support reasoning about and, eventually, presentation of insights. This makes studying the usability of complex interactive visualizations, both in foraging and in making sense of data, an essential part of visual analytics research. This research can be approached in at least two major ways: one can study new techniques and guidelines for designing interactive complex visualizations that are easy to use and understand, or one can keep the capabilities of existing complex visualizations while providing supporting capabilities that increase their usability. The latter is an emerging area of research in visual analytics and is the focus of this dissertation.

    This dissertation describes six contributions to the field of visual analytics. The first contribution is an architecture for a query-to-question supporting system that automatically records user interactions and presents them contextually in natural written language; the architecture takes into account the domain knowledge of experts and designers and uses natural language generation (NLG) techniques to translate and transcribe a progression of interactive visualization states into a log of text that can be visualized. The second contribution is query-to-question (Q2Q), an implemented system that translates low-level user interactions into high-level analytical questions and presents them as a log of styled text that complements and effectively extends the functionality of visualization tools. The third contribution is a demonstration of the beneficial effects of accompanying a visualization with a textual translation of user interaction: the presence of the translation interface produces considerable improvements in the learnability, efficiency, and memorability of a visualization, in terms of speed and the length of interaction sequences that users perform, along with a modest decrease in error ratio. The fourth contribution is a set of design guidelines for translating user interactions into natural language, taking into account variation in user knowledge and roles, the types of data being visualized, and the types of interaction supported. The fifth contribution is a history organizer interface that enables users to organize their analytical process: the structured textual translations output by Q2Q are input into a history organizer tool (HOT) that supports reordering, sequencing, and grouping of the translated interactions, providing a reasoning framework for users to organize and present hypotheses and insights acquired from a visualization. The sixth contribution is a demonstration of the efficiency of a suite of arrangement options for organizing the questions asked of a visualization: integrating query translation and history organization improves users' speed, error ratio, and the number of reordering actions performed while organizing translated interactions. Overall, this dissertation contributes to the analysis and discovery of users' storytelling patterns and behaviours, thereby paving the way to more intelligent, effective, and user-oriented visual analysis presentation tools.
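    To make the query-to-question idea concrete, here is a hypothetical Python sketch of template-based translation from logged interaction events to analytical questions; the event schema and templates are invented for illustration, and Q2Q itself additionally encodes designer domain knowledge rather than relying on fixed templates alone.

```python
# Map each interaction type to a question template; fields in braces are
# filled from the logged event. Schema and templates are assumptions.
TEMPLATES = {
    "filter": "Which {items} have {attribute} {op} {value}?",
    "sort":   "How do the {items} rank by {attribute}?",
    "brush":  "What characterizes the {items} between {low} and {high} {attribute}?",
}

def translate(event):
    template = TEMPLATES.get(event["action"])
    if template is None:
        return f"(no translation for action '{event['action']}')"
    return template.format(**event)

log = [
    {"action": "filter", "items": "countries", "attribute": "GDP",
     "op": "above", "value": "$1T"},
    {"action": "sort", "items": "countries", "attribute": "life expectancy"},
]
for event in log:
    print(translate(event))
# Which countries have GDP above $1T?
# How do the countries rank by life expectancy?
```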