783 research outputs found

    Adaptive Sentence Boundary Disambiguation

    Labeling of sentence boundaries is a necessary prerequisite for many natural language processing tasks, including part-of-speech tagging and sentence alignment. End-of-sentence punctuation marks are ambiguous; to disambiguate them most systems use brittle, special-purpose regular expression grammars and exception rules. As an alternative, we have developed an efficient, trainable algorithm that uses a lexicon with part-of-speech probabilities and a feed-forward neural network. After training for less than one minute, the method correctly labels over 98.5% of sentence boundaries in a corpus of over 27,000 sentence-boundary marks. We show the method to be efficient and easily adaptable to different text genres, including single-case texts. Comment: This is a LaTeX version of the previously submitted ps file (formatted as a uuencoded gz-compressed .tar file created by csh script). The software from the work described in this paper is available by contacting [email protected]

    Multi Visualization and Dynamic Query for Effective Exploration of Semantic Data

    Semantic formalisms represent content in a uniform way according to ontologies. This enables manipulation and reasoning via automated means (e.g. Semantic Web services), but limits the user’s ability to explore the semantic data from a point of view that originates from knowledge representation motivations. We show how, for user consumption, a visualization of semantic data according to some easily graspable dimensions (e.g. space and time) provides effective sense-making of data. In this paper, we look holistically at the interaction between users and semantic data, and propose multiple visualization strategies and dynamic filters to support the exploration of semantic-rich data. We discuss a user evaluation and how interaction challenges could be overcome to create an effective user-centred framework for the visualization and manipulation of semantic data. The approach has been implemented and evaluated on a real company archive.

    Structure of psoralen-crosslinked ribosomal RNA from Drosophila melanogaster.

    Multilevel predictors of adolescent physical activity: a longitudinal analysis

    Background: To examine how factors from a social ecologic model predict physical activity (PA) among adolescents using a longitudinal analysis. Methods: Participants in this longitudinal study were adolescents (ages 10-16 at baseline) and one parent enrolled in the Transdisciplinary Research on Energetics and Cancer-Identifying Determinants of Eating and Activity (TREC-IDEA) and the Etiology of Childhood Obesity (ECHO). Both studies were designed to assess a socio-ecologic model of adolescent obesity risk. PA was collected using ActiGraph activity monitors at two time points 24 months apart. Other measures included objective height and weight, adolescent and parent questionnaires on multilevel psychological, behavioral and social determinants of PA, and a home PA equipment inventory. Analysis was conducted using SAS, including descriptive characteristics, bivariate and stepped multivariate mixed models, using baseline adjustment. Models were stratified by gender. Results: There were 578 adolescents with complete data. Results suggest few statistically significant longitudinal associations with physical activity measured as minutes of MVPA or total counts from accelerometers. For boys, greater self-efficacy (B = 0.75, p = 0.01) and baseline MVPA (B = 0.55, p < 0.01) remained significantly associated with MVPA at follow-up. A similar pattern was observed for total counts. For girls, baseline MVPA (B = 0.58, p = 0.01) and barriers (B = -0.32, p = 0.05) significantly predicted MVPA at follow-up in the full model. The full multilevel model explained 30% of the variance in PA among boys and 24% among girls. Conclusions: PA change in adolescents is a complex issue that is not easily understood. Our findings suggest early PA habits are the most important predictor of PA levels in adolescence. Intervention may be necessary prior to middle school to maintain PA through adolescence.

    Exploring Large Digital Library Collections Using a Map-Based Visualisation

    In this paper we describe a novel approach for exploring large document collections using a map-based visualisation. We use hierarchically structured semantic concepts that are attached to the documents to create a visualisation of the semantic space that resembles a Google Map. The approach is novel in that we exploit the hierarchical structure to enable the approach to scale to large document collections and to create a map where the higher levels of spatial abstraction have semantic meaning. An informal evaluation is carried out to gather subjective feedback from users. Overall results are positive, with users finding the visualisation enticing and easy to use.

    Do rats learn conditional independence?

    If acquired associations are to accurately represent real relevance relations, there is motivation for the hypothesis that learning will, in some circumstances, be more appropriately modelled, not as direct dependence, but as conditional independence. In a serial compound conditioning experiment, two groups of rats were presented with a conditioned stimulus (CS1) that imperfectly (50%) predicted food, and was itself imperfectly predicted by a CS2. Groups differed in the proportion of CS2 presentations that were ultimately followed by food (25% versus 75%). Thus, the information presented regarding the relevance of CS2 to food was ambiguous between direct dependence and conditional independence (given CS1). If rats learnt that food was conditionally independent of CS2, given CS1, subjects of both groups should thereafter respond similarly to CS2 alone. Contrary to the conditionality hypothesis, subjects attended to the direct food predictability of CS2, suggesting that rats treat even distal stimuli in a CS sequence as immediately relevant to food, not conditional on an intermediate stimulus. These results urge caution in representing indirect associations as conditional associations, accentuate the theoretical weight of the Markov condition in graphical models, and challenge theories to articulate the conditions under which animals are expected to learn conditional associations, if ever. All funding for the project was internal, from Indiana University.

    Scalable Cross-lingual Document Similarity through Language-specific Concept Hierarchies

    With the ongoing growth in the number of digital articles in a wider set of languages and the expanding use of different languages, we need annotation methods that enable browsing multi-lingual corpora. Multilingual probabilistic topic models have recently emerged as a group of semi-supervised machine learning models that can be used to perform thematic explorations on collections of texts in multiple languages. However, these approaches require theme-aligned training data to create a language-independent space. This constraint limits the number of scenarios to which the technique can be applied and makes it difficult to scale up to situations where a huge collection of multi-lingual documents is required during the training phase. This paper presents an unsupervised document similarity algorithm that does not require parallel or comparable corpora, or any other type of translation resource. The algorithm annotates topics automatically created from documents in a single language with cross-lingual labels and describes documents by hierarchies of multi-lingual concepts from independently-trained models. Experiments performed on the English, Spanish and French editions of the JRC-Acquis corpora reveal promising results on classifying and sorting documents by similar content. Comment: Accepted at the 10th International Conference on Knowledge Capture (K-CAP 2019).

    A Novel Combined Term Suggestion Service for Domain-Specific Digital Libraries

    Interactive query expansion can assist users during their query formulation process. We conducted a user study with over 4,000 unique visitors and four different design approaches for a search term suggestion service. As a basis for our evaluation we have implemented services which use three different vocabularies: (1) user search terms, (2) terms from a terminology service and (3) thesaurus terms. Additionally, we have created a new combined service which utilizes thesaurus terms and terms from a domain-specific search term recommender. Our results show that the thesaurus-based method is clearly used more often than the other single-method implementations. We interpret this as a strong indicator that term suggestion mechanisms should be domain-specific to be close to the user terminology. Our novel combined approach, which interconnects a thesaurus service with additional statistical relations, outperformed all other implementations. All our observations show that domain-specific vocabulary can support the user in finding alternative concepts and formulating queries. Comment: To be published in Proceedings of Theories and Practice in Digital Libraries (TPDL), 201

    Multiple Sexual Partners and Condom use among 10 - 19 Year-olds in four Districts in Tanzania: What do we Learn?

    Although some studies in Tanzania have addressed the question of sexuality and STIs among adolescents, mostly those aged 15 - 19 years, evidence on how multiple sexual partners influence condom use among 10 - 19 year-olds is limited. This study attempts to bridge this gap by testing the hypothesis that sexual relationships with multiple partners in the age group 10 - 19 years spur condom use during sex in four districts in Tanzania. Secondary analysis was performed using data from the Adolescents Module of the cross-sectional household survey on Maternal, Newborn and Child Health (MNCH) that was done in Kigoma, Kilombero, Rufiji and Ulanga districts, Tanzania in 2008. A total of 612 adolescents resulting from a random sample of 1200 households participated in this study. Pearson Chi-Square was used as a test of association between multiple sexual partners and condom use. A multivariate logistic regression model was fitted to the data to assess the effect of multiple sexual partners on condom use, having adjusted for potential confounding variables. STATA (10) statistical software was used to carry out this process at the 5% two-sided significance level. Of the 612 adolescents interviewed, 23.4% reported being sexually active and 42.0% of these reported having had multiple (> 1) sexual partners in the last 12 months. The overall prevalence of condom use among them was 39.2%. The proportion using a condom at the last sexual intercourse was higher among those who knew that they could get a condom if they wanted one than among those who did not. No evidence of association was found between multiple sexual partners and condom use (OR = 0.77, 95% CI = 0.35 - 1.67, P = 0.504). With younger adolescents (10 - 14 years) being a reference, condom use was associated with age group (15 - 19: OR = 3.69, 95% CI = 1.21 - 11.25, P = 0.022) and district of residence (Kigoma: OR = 7.45, 95% CI = 1.79 - 31.06, P = 0.006; Kilombero: OR = 8.89, 95% CI = 2.91 - 27.21, P < 0.001; Ulanga: OR = 5.88, 95% CI = 2.00 - 17.31, P = 0.001), Rufiji being a reference category. No evidence of association was found between multiple sexual partners and condom use among adolescents in the study area. The large proportion of adolescents who engage in sexual activity without using condoms, even those with multiple partners, perpetuates the risk of transmission of HIV infections in the community. Strategies such as sex education, easier access to condoms and a friendly environment for obtaining them are important to address risky sexual behaviour among adolescents.