168 research outputs found

    Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned

    Multi-head self-attention is a key component of the Transformer, a state-of-the-art architecture for neural machine translation. In this work we evaluate the contribution made by individual attention heads in the encoder to the overall performance of the model and analyze the roles played by them. We find that the most important and confident heads play consistent and often linguistically-interpretable roles. When pruning heads using a method based on stochastic gates and a differentiable relaxation of the L0 penalty, we observe that specialized heads are last to be pruned. Our novel pruning method removes the vast majority of heads without seriously affecting performance. For example, on the English-Russian WMT dataset, pruning 38 out of 48 encoder heads results in a drop of only 0.15 BLEU. Comment: ACL 2019 (camera-ready)

    Modeling affirmative and negated action processing in the brain with lexical and compositional semantic models

    Recent work shows that distributional semantic models can be used to decode patterns of brain activity associated with individual words and sentence meanings. However, it is not yet clear to what extent such models can be used to study and decode fMRI patterns associated with specific aspects of semantic composition such as the negation function. In this paper, we apply lexical and compositional semantic models to decode fMRI patterns associated with negated and affirmative sentences containing hand-action verbs. Our results show reduced decoding (correlation) of sentences where the verb is in the negated context, as compared to the affirmative one, within brain regions implicated in action-semantic processing. This supports behavioral and brain imaging studies, suggesting that negation involves reduced access to aspects of the affirmative mental representation. The results pave the way for testing alternate semantic models of negation against human semantic processing in the brain.

    A Conversational Academic Assistant for the Interaction in Virtual Worlds

    Proceedings of: Fourth International Workshop on User-Centric Technologies and applications (CONTEXTS 2010). Valencia, 07-10 September, 2010. The current interest in and expansion of social networking are rapidly introducing a large number of applications that give rise to new forms of communication and interaction among their users. Social networks and virtual worlds thus represent a perfect environment for interacting with applications that use multimodal information and are able to adapt to the specific characteristics and preferences of each user. In this paper we present an example of the integration of conversational agents in social networks, describing the development of a conversational avatar that provides academic information in the virtual world of Second Life. For its implementation, techniques from Speech Technologies and Natural Language Processing have been used to allow a more natural interaction with the system using voice. Funded by projects CICYT TIN2008-06742-C02-02/TSI, CICYT TEC2008-06732-C02-02/TEC, SINPROB, CAM MADRINET S-0505/TIC/0255, and DPS2008-07029-C02-02.

    Conversation acts in task-oriented spoken dialogue

    A linguistic form's compositional, timeless meaning can be surrounded or even contradicted by various social, aesthetic, or analogistic companion meanings. This paper addresses a series of problems in the structure of spoken language discourse, including turn-taking and grounding. It views these processes as composed of fine-grained actions, which resemble speech acts both in resulting from a computational mechanism of planning and in having a rich relationship to the specific linguistic features which serve to indicate their presence. The resulting notion of Conversation Acts is more general than speech act theory, encompassing not only the traditional speech acts but turn-taking, grounding, and higher-level argumentation acts as well. Furthermore, the traditional speech acts in this scheme become fully joint actions, whose successful performance requires full listener participation. This paper presents a detailed analysis of spoken language dialogue. It shows the role of each class of conversation acts in discourse structure, and discusses how members of each class can be recognized in conversation. Conversation acts, it will be seen, better account for the success of conversation than speech act theory alone.

    Photoelectron diffraction: from phenomenological demonstration to practical tool

    The potential of photoelectron diffraction (exploiting the coherent interference of directly-emitted and elastically scattered components of the photoelectron wavefield emitted from a core level of a surface atom to obtain structural information) was first appreciated in the 1970s. The first demonstrations of the effect were published towards the end of that decade, but the method has now entered the mainstream armoury of surface structure determination. This short review has two objectives: first, to outline the way that the idea emerged and the way this evolved in my own collaboration with Neville Smith and his colleagues at Bell Labs in the early years; second, to provide some insight into the current state-of-the-art in application of (scanned-energy mode) photoelectron diffraction to address two key issues in quantitative surface structure determination, namely, complexity and precision. In this regard a particularly powerful aspect of photoelectron diffraction is its elemental and chemical-state specificity.

    From process models to chatbots

    The effect of digital transformation in organizations needs to go beyond automation, so that human capabilities are also augmented. One possibility in this direction is to make formal representations of processes more accessible for the actors involved. Along this line, this paper presents a methodology to transform a formal process description into a conversational agent, which can guide a process actor through the required steps in a user-friendly conversation. The presented system relies on dialog systems and natural language processing and generation techniques to automatically build a chatbot from a process model. A prototype tool, accessible online, has been developed to transform a process model in BPMN into a chatbot, defined in Artificial Intelligence Markup Language (AIML), which has been evaluated with academic and industrial professionals, showing potential for narrowing the gap between process understanding and execution. Peer Reviewed. Postprint (author's final draft)

    Motion Rail: A Virtual Reality Level Crossing Training Application

    This paper presents the development and usability testing of a Virtual Reality (VR) based system named 'Motion Rail' for training children on railway crossing safety. The children use a VR head-mounted device and a controller to navigate the VR environment and perform a level crossing task, and they receive instant pass or fail feedback on a display in the VR environment. Five participants, two males and three females, took part in the usability test. The outcomes of the test were promising, as the children were very engaged and would like to adopt this training approach in future safety training.

    Whole Exome Sequencing of Patients with Steroid-Resistant Nephrotic Syndrome

    BACKGROUND AND OBJECTIVES: Steroid-resistant nephrotic syndrome overwhelmingly progresses to ESRD. More than 30 monogenic genes have been identified to cause steroid-resistant nephrotic syndrome. We previously detected causative mutations using targeted panel sequencing in 30% of patients with steroid-resistant nephrotic syndrome. Panel sequencing has a number of limitations when compared with whole exome sequencing. We employed whole exome sequencing to detect monogenic causes of steroid-resistant nephrotic syndrome in an international cohort of 300 families. DESIGN, SETTING, PARTICIPANTS AND MEASUREMENTS: Three hundred thirty-five individuals with steroid-resistant nephrotic syndrome from 300 families were recruited from April of 1998 to June of 2016. Age of onset was restricted to <25 years of age. Exome data were evaluated for 33 known monogenic steroid-resistant nephrotic syndrome genes. RESULTS: In 74 of 300 families (25%), we identified a causative mutation in one of 20 genes known to cause steroid-resistant nephrotic syndrome. In 11 families (3.7%), we detected a mutation in a gene that causes a phenocopy of steroid-resistant nephrotic syndrome. This is consistent with our previously published identification of mutations using a panel approach. We detected a causative mutation in a known steroid-resistant nephrotic syndrome gene in 38% of consanguineous families and in 13% of nonconsanguineous families, and 48% of children with congenital nephrotic syndrome. A total of 68 different mutations were detected in 20 of 33 steroid-resistant nephrotic syndrome genes. Fifteen of these mutations were novel. NPHS1, PLCE1, NPHS2, and SMARCAL1 were the most common genes in which we detected a mutation. In another 28% of families, we detected mutations in one or more candidate genes for steroid-resistant nephrotic syndrome. CONCLUSIONS: Whole exome sequencing is a sensitive approach toward diagnosis of monogenic causes of steroid-resistant nephrotic syndrome. A molecular genetic diagnosis of steroid-resistant nephrotic syndrome may have important consequences for the management of treatment and kidney transplantation in steroid-resistant nephrotic syndrome.