
    Supporting collocation learning with a digital library

    Extensive knowledge of collocations is a key factor that distinguishes learners from fluent native speakers. Such knowledge is difficult to acquire simply because there is so much of it. This paper describes a system that exploits the facilities offered by digital libraries to provide a rich collocation-learning environment. The design is based on three processes that have been identified as leading to lexical acquisition: noticing, retrieval and generation. Collocations are automatically identified in input documents using natural language processing techniques and used to enhance the presentation of the documents and also as the basis of exercises, produced under teacher control, that amplify students' collocation knowledge. The system uses a corpus of 1.3 B short phrases drawn from the web, from which 29 M collocations have been automatically identified. It also connects to examples garnered from the live web and the British National Corpus.
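The abstract does not say which technique identifies collocations; a common approach is to score adjacent word pairs by pointwise mutual information (PMI) over corpus counts. A minimal sketch (thresholds and data are invented for illustration):

```python
import math
from collections import Counter

def find_collocations(tokens, min_count=2, min_pmi=1.0):
    """Rank adjacent word pairs by pointwise mutual information (PMI)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n = len(tokens)
    scored = []
    for (w1, w2), c in bigrams.items():
        if c < min_count:
            continue
        # PMI = log( P(w1, w2) / (P(w1) * P(w2)) )
        pmi = math.log((c / n) / ((unigrams[w1] / n) * (unigrams[w2] / n)))
        if pmi >= min_pmi:
            scored.append(((w1, w2), pmi))
    return sorted(scored, key=lambda x: -x[1])

tokens = ("strong tea tastes better than weak tea , "
          "strong tea is popular").split()
print(find_collocations(tokens))  # ("strong", "tea") scores highest
```

At web scale one would work from precomputed n-gram counts rather than raw token streams, but the scoring idea is the same.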

    Refining the use of the web (and web search) as a language teaching and learning resource

    The web is a potentially useful corpus for language study because it provides examples of language that are contextualized and authentic, and is large and easily searchable. However, web contents are heterogeneous in the extreme, uncontrolled and hence 'dirty,' and exhibit features different from the written and spoken texts in other linguistic corpora. This article explores the use of the web and web search as a resource for language teaching and learning. We describe how a particular derived corpus containing a trillion word tokens in the form of n-grams has been filtered by word lists and syntactic constraints and used to create three digital library collections, linked with other corpora and the live web, that exploit the affordances of web text and mitigate some of its constraints.
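The abstract mentions filtering an n-gram corpus by word lists; the word-list step can be illustrated as follows (the data, function name, and word list are invented for illustration, and the syntactic constraints the article also applies are not modelled here):

```python
def filter_ngrams(ngrams, allowed_words):
    """Keep only n-grams whose every token is on an allowed word list.

    ngrams: list of (phrase, count) pairs, as in a web-derived n-gram corpus.
    """
    allowed = {w.lower() for w in allowed_words}
    return [(gram, count) for gram, count in ngrams
            if all(tok.lower() in allowed for tok in gram.split())]

# Misspellings and out-of-list tokens are discarded along with their counts.
ngrams = [("strong tea", 120), ("strong coffe", 40), ("weak tea", 55)]
wordlist = ["strong", "weak", "tea", "coffee"]
print(filter_ngrams(ngrams, wordlist))  # drops the misspelled "strong coffe"
```

Filtering against a vocabulary list like this is one way to remove the 'dirty' portion of web text before building curated collections.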

    Pregnancy in the Classroom

    This study examines the lived experiences of four high school teachers who have taught while they were pregnant. The teachers’ experiences are contextualized within a feminist psychoanalytic theoretical framework. Current maternity leave policy in the United States and popular culture texts provide additional contextualization for the women’s experiences.

    Model Cards for Model Reporting

    Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: one trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similarly detailed evaluation numbers and other relevant documentation.
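The abstract lists the sections a model card should carry (intended use, evaluation procedure, disaggregated metrics). One could represent such a card programmatically; the field names and example values below are illustrative, not the paper's own schema:

```python
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    """Minimal model-card record: intended use plus disaggregated metrics."""
    model_name: str
    intended_use: str
    evaluation_procedure: str
    # Benchmark results broken out per demographic or intersectional
    # group; the group keys here are illustrative.
    disaggregated_metrics: dict = field(default_factory=dict)

    def report(self):
        lines = [f"Model: {self.model_name}",
                 f"Intended use: {self.intended_use}",
                 f"Evaluation: {self.evaluation_procedure}"]
        for group, score in sorted(self.disaggregated_metrics.items()):
            lines.append(f"  {group}: {score:.3f}")
        return "\n".join(lines)

card = ModelCard(
    model_name="smile-detector-v1",
    intended_use="Detect smiling faces in consumer photos",
    evaluation_procedure="Accuracy on a held-out test set, per group",
    disaggregated_metrics={"age<30": 0.91, "age>=30": 0.88},
)
print(card.report())
```

Reporting metrics per group rather than as a single aggregate is the core idea: an overall accuracy can hide large gaps between the groups a model will actually serve.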

    Rev-dependent lentiviral expression vector

    BACKGROUND: HIV-responsive expression vectors are all based on the HIV promoter, the long terminal repeat (LTR). While responsive to an early HIV protein, Tat, the LTR is also responsive to cellular activation states and to the local chromatin activity where the integration has occurred. This can result in high HIV-independent activity, and has restricted the use of LTR-based reporter vectors to cloned cells, where aberrantly high-expressing (HIV-negative) cells can be eliminated. Enhancements in specificity would increase opportunities for expression vector use in detection of HIV as well as in experimental gene expression in HIV-infected cells. RESULTS: We have constructed an expression vector that possesses, in addition to the Tat-responsive LTR, numerous HIV DNA sequences that include the Rev-response element and HIV splicing sites that are efficiently used in human cells. It also contains a reading frame that is removed by cellular splicing activity in the absence of HIV Rev. The vector was incorporated into a lentiviral reporter virus, permitting detection of replicating HIV in living cell populations. The activity of the vector was measured by expression of a green fluorescent protein (GFP) reporter and by PCR of the reporter transcript following HIV infection. The vector displayed full HIV dependency. CONCLUSION: As with the earlier developed Tat-dependent expression vectors, the Rev system described here is an exploitation of an evolved HIV process. The inclusion of Rev-dependency renders the LTR-based expression vector highly dependent on the presence of replicating HIV. The application of this vector as reported here, an HIV-dependent reporter virus, offers a novel alternative to existing methods, in situ PCR or HIV antigen staining, for identifying HIV-positive cells. The vector permits examination of living cells, can express any gene for basic or clinical experimentation, and as a pseudotyped lentivirus has access to most cell types and tissues.

    A Latent Health Factor Model for Estimating Estuarine Ecosystem Health

    Assessment of the “health” of an ecosystem is of great interest to those engaged in ecosystem monitoring and conservation. Traditionally, scientists have quantified the health of an ecosystem using multimetric indices that are semi-qualitative. Recently, a statistics-based index called the Latent Health Factor Index (LHFI) was devised to address many inadequacies of the conventional indices. Unlike the conventional indices, the LHFI relies on standard modelling procedures, which accords it several advantages: it is less arbitrary, and it allows for straightforward model inference and for formal statistical prediction of health for a new site (using only supplementary environmental covariates). In contrast, conventional indices offer no formal statistical prediction, meaning that proper estimation of health for a new site requires benthic data, which are expensive and time-consuming to gather. As the LHFI modelling methodology is a relatively new concept, it has so far been demonstrated (and validated) only on freshwater ecosystems. The goal of this thesis is to apply the LHFI modelling methodology to estuarine ecosystems, particularly to the previously unassessed system in Richibucto, New Brunswick. Specifically, the aims of this thesis are threefold: first, to investigate whether the LHFI is even applicable to estuarine systems, since estuarine and freshwater metrics, or indicators of health, are quite different; second, to determine the appropriate form of the LHFI model if the technique is applicable; and third, to assess the health of the Richibucto system. Note that the second objective includes determining which covariates may have a significant impact on estuarine health. As scientists have previously used the AZTI Marine Biotic Index (AMBI) and the Infaunal Trophic Index (ITI) as measurements of estuarine ecosystem health, this thesis investigates LHFI models using metrics from these two indices simultaneously. Two sets of models were considered in a Bayesian framework and implemented using Markov chain Monte Carlo techniques: the first using only metrics from AMBI, and the second using metrics from both AMBI and ITI. Both sets of LHFI models were successful in that they were able to distinguish between health levels at different sites.
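The thesis fits its models by MCMC in a Bayesian framework; the generative structure of a one-factor latent health model can be sketched by simulation (all parameter names and values below are illustrative, not the thesis's fitted model):

```python
import random

random.seed(0)

def simulate_sites(n_sites=5, n_metrics=3, noise_sd=0.1):
    """Simulate benthic metrics driven by a single latent 'health' factor.

    Each site i has a latent health score h_i; each metric j loads on it
    linearly: y_ij = a_j + b_j * h_i + noise. Fitting inverts this:
    given observed metrics y, infer the unobserved h for each site.
    """
    intercepts = [random.gauss(0, 1) for _ in range(n_metrics)]
    loadings = [abs(random.gauss(1, 0.3)) for _ in range(n_metrics)]
    sites = []
    for _ in range(n_sites):
        h = random.gauss(0, 1)  # latent health of this site
        metrics = [a + b * h + random.gauss(0, noise_sd)
                   for a, b in zip(intercepts, loadings)]
        sites.append((h, metrics))
    return sites

for health, metrics in simulate_sites():
    print(f"health={health:+.2f}  metrics={[round(m, 2) for m in metrics]}")
```

Because the loadings are shared across sites, sites with higher latent health shift all their metrics together, which is what lets the fitted model rank sites by health from the metrics alone.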

    Rolling Back Transparency in China's Courts

    Despite a burgeoning conversation about the centrality of information management to governments, scholars are only just beginning to address the role of legal information in sustaining authoritarian rule. This Essay presents a case study showing how legal information can be manipulated: through the deletion of previously published cases from China’s online public database of court decisions. Using our own dataset of all 42 million cases made public in China between January 1, 2014, and September 2, 2018, we examine the recent deletion of criminal cases from the China Judgements Online website. We find that the deletion of cases likely results from a range of overlapping, often ad hoc, concerns: the international and domestic images of Chinese courts, institutional relationships within the Chinese Party-State, worries about revealing negative social phenomena, and concerns about copycat crimes. Taken together, the decision(s) to remove hundreds of thousands of unconnected cases shape a narrative about the Chinese courts, Chinese society, and the Chinese Party-State. Our findings also provide insight into the interrelated mechanisms of censorship and transparency in an era in which data governance is increasingly central. We highlight how courts seek to curate a narrative that protects the courts from criticism and boosts their standing with the public and within the Party-State. Examining how Chinese courts manage the removal of cases suggests that how courts curate and manage information disclosure may also be central to their legitimacy and influence.