783 research outputs found

Adaptive Sentence Boundary Disambiguation

Author: Hearst Marti A.
Palmer David D.
Publication date: 01/01/1994

Labeling of sentence boundaries is a necessary prerequisite for many natural language processing tasks, including part-of-speech tagging and sentence alignment. End-of-sentence punctuation marks are ambiguous; to disambiguate them most systems use brittle, special-purpose regular expression grammars and exception rules. As an alternative, we have developed an efficient, trainable algorithm that uses a lexicon with part-of-speech probabilities and a feed-forward neural network. After training for less than one minute, the method correctly labels over 98.5% of sentence boundaries in a corpus of over 27,000 sentence-boundary marks. We show the method to be efficient and easily adaptable to different text genres, including single-case texts. Comment: This is a LaTeX version of the previously submitted ps file (formatted as a uuencoded gz-compressed .tar file created by csh script). The software from the work described in this paper is available by contacting [email protected]

arXiv.org e-Print Archive

CiteSeerX

Multi Visualization and Dynamic Query for Effective Exploration of Semantic Data

Author: B. Shneiderman
B.B. Bederson
D. Petrelli
E. Oren
G. Stasko
H. Stuckenschmidt
K.W. Tu
M. Hearst
M.C. Schraefel
P. Mutton
R. Bhagdev
R. White
S. Card
T. Aditya
V.T. Thai
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

Semantic formalisms represent content in a uniform way according to ontologies. This enables manipulation and reasoning via automated means (e.g. Semantic Web services), but limits the user’s ability to explore the semantic data from a point of view that originates from knowledge representation motivations. We show how, for user consumption, a visualization of semantic data according to some easily graspable dimensions (e.g. space and time) provides effective sense-making of data. In this paper, we look holistically at the interaction between users and semantic data, and propose multiple visualization strategies and dynamic filters to support the exploration of semantic-rich data. We discuss a user evaluation and how interaction challenges could be overcome to create an effective user-centred framework for the visualization and manipulation of semantic data. The approach has been implemented and evaluated on a real company archive.

Crossref

Sheffield Hallam University Research Archive

White Rose Research Online

Structure of psoralen-crosslinked ribosomal RNA from Drosophila melanogaster.

Author: D. C. Youvan
J. E. Hearst
P. L. Wollenzien
Publication venue: 'Proceedings of the National Academy of Sciences'

Crossref

Multilevel predictors of adolescent physical activity: a longitudinal analysis

Author: Farbakhsh Kian
Hearst Mary O
Lytle Leslie A
Patnode Carrie D
Sirard John R
Publication venue: BioMed Central
Publication date: 01/01/2012

Background: To examine how factors from a social ecologic model predict physical activity (PA) among adolescents using a longitudinal analysis. Methods: Participants in this longitudinal study were adolescents (ages 10-16 at baseline) and one parent enrolled in the Transdisciplinary Research on Energetics and Cancer-Identifying Determinants of Eating and Activity (TREC-IDEA) and the Etiology of Childhood Obesity (ECHO). Both studies were designed to assess a socio-ecologic model of adolescent obesity risk. PA was collected using ActiGraph activity monitors at two time points 24 months apart. Other measures included objective height and weight, adolescent and parent questionnaires on multilevel psychological, behavioral and social determinants of PA, and a home PA equipment inventory. Analysis was conducted using SAS, including descriptive characteristics, bivariate and stepped multivariate mixed models, using baseline adjustment. Models were stratified by gender. Results: There were 578 adolescents with complete data. Results suggest few statistically significant longitudinal associations with physical activity measured as minutes of MVPA or total counts from accelerometers. For boys, greater self-efficacy (B = 0.75, p = 0.01) and baseline MVPA (B = 0.55, p < 0.01) remained significantly associated with MVPA at follow-up. A similar pattern was observed for total counts. For girls, baseline MVPA (B = 0.58, p = 0.01) and barriers (B = -0.32, p = 0.05) significantly predicted MVPA at follow-up in the full model. The full multilevel model explained 30% of the variance in PA among boys and 24% among girls. Conclusions: PA change in adolescents is a complex issue that is not easily understood. Our findings suggest early PA habits are the most important predictor of PA levels in adolescence. Intervention may be necessary prior to middle school to maintain PA through adolescence.

Crossref

ScholarWorks@UMass Amherst

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The Relationship of Area-Level Sociodemographic Characteristics, Household Composition and Individual-Level Socioeconomic Status on Walking Behavior Among Adults

Author: Forsyth Ann
Green Christine G.
Hearst Mary O.
Klein Elizabeth G.
Lytle Leslie A.
Parker Emily D.
Sirard John R.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2013

Understanding the contextual factors associated with why adults walk is important for those interested in increasing walking as a mode of transportation and leisure. This paper investigates the relationships between neighborhood-level sociodemographic context, individual-level sociodemographic characteristics and walking for leisure and transport. Data from two community-based studies of adults (n = 550) were used to determine the association between the Area Sociodemographic Environment (ASDE), calculated from U.S. Census variables, and individual-level SES as potential correlates of walking behavior. Descriptive statistics, mean comparisons and Pearson’s correlation coefficients were used to assess bivariate relationships. Generalized estimating equations were used to model the relationship between ASDE, as quartiles, and walking behavior. Adjusted models suggest adults engage in more minutes of walking for transportation and less walking for leisure in the most disadvantaged compared to the least disadvantaged neighborhoods, but adding individual-level demographics and SES eliminated the significant results. However, when models were stratified for free or reduced cost lunch, of those with children who qualified for free or reduced lunch, those who lived in the wealthiest neighborhoods engaged in 10.7 min less of total walking per day compared to those living in the most challenged neighborhoods (p < 0.001). Strategies to increase walking for transportation or leisure need to take account of individual-level socioeconomic factors in addition to area-level measures.

Harvard University - DASH

ScholarWorks@UMass Amherst

PubMed Central

Carolina Digital Repository

Exploring Large Digital Library Collections Using a Map-Based Visualisation

Author: A. Kuhn
A. Çöltekin
A.A. Shiri
B. Delaunay
C. Chen
C. Forsell
C. Plaisant
C.L. Liew
D. Mashima
E. Pampalk
G. Marchionini
J.B. Kruskal
K. Hornbæk
K. Lagus
K.A. Olsen
M.A. Butavicius
M.A. Hearst
M.J. Egenhofer
P. Pirolli
R. Rao
R.W. White
S. Greene
S.I. Fabrikant
S.J. Westerman
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

In this paper we describe a novel approach for exploring large document collections using a map-based visualisation. We use hierarchically structured semantic concepts that are attached to the documents to create a visualisation of the semantic space that resembles a Google Map. The approach is novel in that we exploit the hierarchical structure to enable the approach to scale to large document collections and to create a map where the higher levels of spatial abstraction have semantic meaning. An informal evaluation is carried out to gather subjective feedback from users. Overall results are positive, with users finding the visualisation enticing and easy to use.

Crossref

Edge Hill University Research Information Repository

Open Research Online

Do rats learn conditional independence?

Author: Dickinson A
Domjan MP
Hearst E
Hume D
Jenkins HM
Kamin LJ
Miller RR
Pavlov IP
Pearl J
Rescorla RA
Timberlake W
Tinbergen N
Publication venue: 'The Royal Society'
Publication date: 01/02/2017

If acquired associations are to accurately represent real relevance relations, there is motivation for the hypothesis that learning will, in some circumstances, be more appropriately modelled, not as direct dependence, but as conditional independence. In a serial compound conditioning experiment, two groups of rats were presented with a conditioned stimulus (CS1) that imperfectly (50%) predicted food, and was itself imperfectly predicted by a CS2. Groups differed in the proportion of CS2 presentations that were ultimately followed by food (25% versus 75%). Thus, the information presented regarding the relevance of CS2 to food was ambiguous between direct dependence and conditional independence (given CS1). If rats learnt that food was conditionally independent of CS2, given CS1, subjects of both groups should thereafter respond similarly to CS2 alone. Contrary to the conditionality hypothesis, subjects attended to the direct food predictability of CS2, suggesting that rats treat even distal stimuli in a CS sequence as immediately relevant to food, not conditional on an intermediate stimulus. These results urge caution in representing indirect associations as conditional associations, accentuate the theoretical weight of the Markov condition in graphical models, and challenge theories to articulate the conditions under which animals are expected to learn conditional associations, if ever. All funding for the project was internal, from Indiana University.

Universidade do Minho: RepositoriUM

Crossref

PubMed Central

Scalable Cross-lingual Document Similarity through Language-specific Concept Hierarchies

Author: Badenes-Olmedo Carlos
Blei David M
Boyd-Graber Jordan
Hakkani-Tur D
Hearst Marti
Kenter Tom
Luo Wenhan
Pritchard Jonathan K.
Rao C Radhakrishna
Towne W Ben
Wang Chong
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 15/12/2020

With the ongoing growth in the number of digital articles in a wider set of languages and the expanding use of different languages, we need annotation methods that enable browsing multi-lingual corpora. Multilingual probabilistic topic models have recently emerged as a group of semi-supervised machine learning models that can be used to perform thematic explorations on collections of texts in multiple languages. However, these approaches require theme-aligned training data to create a language-independent space. This constraint limits the number of scenarios to which the technique can be applied and makes it difficult to scale up to situations where a huge collection of multi-lingual documents is required during the training phase. This paper presents an unsupervised document similarity algorithm that does not require parallel or comparable corpora, or any other type of translation resource. The algorithm annotates topics automatically created from documents in a single language with cross-lingual labels and describes documents by hierarchies of multi-lingual concepts from independently-trained models. Experiments performed on the English, Spanish and French editions of the JRC-Acquis corpora reveal promising results on classifying and sorting documents by similar content. Comment: Accepted at the 10th International Conference on Knowledge Capture (K-CAP 2019).

arXiv.org e-Print Archive

Crossref

A Novel Combined Term Suggestion Service for Domain-Specific Digital Libraries

Author: A. Aula
A. Shiri
B.J. Jansen
B.R. Schatz
C. Plaunt
D. Nettle
E. Hargittai
E.N. Efthimiadis
G.W. Furnas
J. Bhogal
M. Hearst
N.J. Belkin
O. Vechtomova
R.W. White
V. Petras
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Interactive query expansion can assist users during their query formulation process. We conducted a user study with over 4,000 unique visitors and four different design approaches for a search term suggestion service. As a basis for our evaluation we have implemented services which use three different vocabularies: (1) user search terms, (2) terms from a terminology service and (3) thesaurus terms. Additionally, we have created a new combined service which utilizes thesaurus terms and terms from a domain-specific search term recommender. Our results show that the thesaurus-based method is clearly used more often than the other single-method implementations. We interpret this as a strong indicator that term suggestion mechanisms should be domain-specific to be close to the user terminology. Our novel combined approach, which interconnects a thesaurus service with additional statistical relations, outperformed all other implementations. All our observations show that domain-specific vocabulary can support the user in finding alternative concepts and formulating queries. Comment: To be published in Proceedings of Theories and Practice in Digital Libraries (TPDL), 201

arXiv.org e-Print Archive

CiteSeerX

Crossref

Multiple Sexual Partners and Condom use among 10 - 19 Year-olds in four Districts in Tanzania: What do we Learn?

Author: A Buvé
A Kaler
A Kigombola
A McKay
A Talle
Amon Exavery
Angelina M Lutambi
BT Johnson
CI Pacheco-Sánchez
CW Kabiru
D Meekers
E Eggleston
E Matasha
E Vittinghoff
ES Maswanya
Godfrey M Mubyazi
Godfrey Mbaruku
Honorati Masanja
I Tavory
International Women's Health Coalition (IWHC)
J Baele
J Holland
J Renju
Khadija Kweka
L Wambua
M Rotermann
MR Kazaura
N Hearst
N Prata
NL Galambos
R Ingham
S Babalola
S Kalichman
T Adair
T Rehle
Tanzania Commission for AIDS (TACAIDS)
The World Bank
UNAIDS
V Bond
WHO
Publication venue: BioMed Central
Publication date: 01/01/2011

Although some studies in Tanzania have addressed the question of sexuality and STIs among adolescents, mostly those aged 15 - 19 years, evidence on how multiple sexual partners influence condom use among 10 - 19 year-olds is limited. This study attempts to bridge this gap by testing a hypothesis that sexual relationships with multiple partners in the age group 10 - 19 years spur condom use during sex in four districts in Tanzania. Secondary analysis was performed using data from the Adolescents Module of the cross-sectional household survey on Maternal, Newborn and Child Health (MNCH) that was done in Kigoma, Kilombero, Rufiji and Ulanga districts, Tanzania in 2008. A total of 612 adolescents resulting from a random sample of 1200 households participated in this study. Pearson Chi-Square was used as a test of association between multiple sexual partners and condom use. A multivariate logistic regression model was fitted to the data to assess the effect of multiple sexual partners on condom use, having adjusted for potential confounding variables. STATA (10) statistical software was used to carry out this process at 5% two-sided significance level. Of the 612 adolescents interviewed, 23.4% reported being sexually active and 42.0% of these reported having had multiple (> 1) sexual partners in the last 12 months. The overall prevalence of condom use among them was 39.2%. The proportion using a condom at the last sexual intercourse was higher among those who knew that they can get a condom if they want than those who did not. No evidence of association was found between multiple sexual partners and condom use (OR = 0.77, 95% CI = 0.35 - 1.67, P = 0.504).
With younger adolescents (10 - 14 years) being a reference, condom use was associated with age group (15 - 19: OR = 3.69, 95% CI = 1.21 - 11.25, P = 0.022) and district of residence (Kigoma: OR = 7.45, 95% CI = 1.79 - 31.06, P = 0.006; Kilombero: OR = 8.89, 95% CI = 2.91 - 27.21, P < 0.001; Ulanga: OR = 5.88, 95% CI = 2.00 - 17.31, P = 0.001), Rufiji being a reference category. No evidence of association was found between multiple sexual partners and condom use among adolescents in the study area. The large proportion of adolescents who engage in sexual activity without using condoms, even those with multiple partners, perpetuates the risk of transmission of HIV infections in the community. Strategies such as sex education and easing access to and making a friendly environment for condom availability are important to address the risky sexual behaviour among adolescents.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals