251 research outputs found

    A Rule-based Part-of-speech Tagger for Classical Tibetan

    This paper reports on the development of a rule-based part-of-speech tagger for Classical Tibetan. Far from being an obscure tool of minor utility to scholars, the rule-based tagger is a key component of a larger initiative aimed at radically transforming the practice of Tibetan linguistics through the application of corpus and computational methods.

    Source Selection Languages: A Usability Evaluation

    The contribution of corpus linguistics to lexicography and the future of Tibetan dictionaries

    The first alphabetized dictionary of Tibetan appeared in 1829 (cf. Bray 2008) and the intervening 184 years have witnessed the publication of scores of other Tibetan dictionaries (cf. Simon 1964). Hundreds of Tibetan dictionaries are now available; these include bilingual dictionaries, both to and from such languages as English, French, German, Latin, Japanese, etc., and specialized dictionaries focusing on medicine, plants, dialects, archaic terms, neologisms, etc. (cf. Walter 2006, McGrath 2008). However, if one classifies Tibetan dictionaries by the methods of their compilation, the accomplishments of Tibetan lexicography are less impressive. Methodologies of dictionary compilation divide heuristically into three types. First, some dictionaries lack explicit methodology; these works assemble words in an ad hoc manner and illustrate them with invented examples. Second, there are dictionaries that are compiled over very long periods of time on the basis of collections of slips recording attestations of words as used in context. Third, more recent dictionaries are compiled on the basis of electronic text corpora, which are processed computationally to aid in the precision, consistency and speed of dictionary compilation. These methods may be called respectively the 'informal method', the 'traditional method', and the 'modern method'. The overwhelming majority of Tibetan dictionaries were compiled with the informal method. Only five Tibetan dictionaries use the traditional methodology. No Tibetan dictionary yet compiled makes use of the modern method.

    Compartmentalized PDE4A5 signaling impairs hippocampal synaptic plasticity and long-term memory

    Alterations in cAMP signaling are thought to contribute to neurocognitive and neuropsychiatric disorders. Members of the cAMP-specific phosphodiesterase 4 (PDE4) family, which contains >25 different isoforms, play a key role in determining spatial cAMP degradation so as to orchestrate compartmentalized cAMP signaling in cells. Each isoform binds to a different set of protein complexes through its unique N-terminal domain, thereby leading to targeted degradation of cAMP in specific intracellular compartments. However, the functional role of specific compartmentalized PDE4 isoforms has not been examined in vivo. Here, we show that increasing protein levels of the PDE4A5 isoform in mouse hippocampal excitatory neurons impairs a long-lasting form of hippocampal synaptic plasticity and attenuates hippocampus-dependent long-term memories without affecting anxiety. In contrast, viral expression of a truncated version of PDE4A5, which lacks the unique N-terminal targeting domain, does not affect long-term memory. Further, overexpression of the PDE4A1 isoform, which targets a different subset of signalosomes, leaves memory undisturbed. Fluorescence resonance energy transfer sensor-based cAMP measurements reveal that the full-length PDE4A5, in contrast to the truncated form, hampers forskolin-mediated increases in neuronal cAMP levels. Our study indicates that the unique N-terminal localization domain of PDE4A5 is essential for the targeting of specific cAMP-dependent signaling underlying synaptic plasticity and memory. The development of compounds to disrupt the compartmentalization of individual PDE4 isoforms by targeting their unique N-terminal domains may provide a fruitful approach to prevent cognitive deficits in neuropsychiatric and neurocognitive disorders that are associated with alterations in cAMP signaling.

    Data context informed data wrangling

    The process of preparing potentially large and complex data sets for further analysis or manual examination is often called data wrangling. In classical warehousing environments, the steps in such a process have been carried out using Extract-Transform-Load platforms, with significant manual involvement in specifying, configuring or tuning many of them. Cost-effective data wrangling processes need to ensure that data wrangling steps benefit from automation wherever possible. In this paper, we define a methodology to fully automate an end-to-end data wrangling process incorporating data context, which associates portions of a target schema with potentially spurious extensional data of types that are commonly available. Instance-based evidence together with data profiling paves the way to inform automation in several steps within the wrangling process, specifically matching, mapping validation, value format transformation, and data repair. The approach is evaluated with real estate data, showing substantial improvements in the results of automated wrangling.

    The VADA Architecture for Cost-Effective Data Wrangling

    Data wrangling, the multi-faceted process by which the data required by an application is identified, extracted, cleaned and integrated, is often cumbersome and labor intensive. In this paper, we present an architecture that supports a complete data wrangling lifecycle, orchestrates components dynamically, builds on automation wherever possible, is informed by whatever data is available, refines automatically produced results in the light of feedback, takes into account the user’s priorities, and supports data scientists with diverse skill sets. The architecture is demonstrated in practice for wrangling property sales and open government data.