
    Supporting collocation learning with a digital library

    Extensive knowledge of collocations is a key factor that distinguishes learners from fluent native speakers. Such knowledge is difficult to acquire simply because there is so much of it. This paper describes a system that exploits the facilities offered by digital libraries to provide a rich collocation-learning environment. The design is based on three processes that have been identified as leading to lexical acquisition: noticing, retrieval and generation. Collocations are automatically identified in input documents using natural language processing techniques and used to enhance the presentation of the documents and also as the basis of exercises, produced under teacher control, that amplify students' collocation knowledge. The system uses a corpus of 1.3 B short phrases drawn from the web, from which 29 M collocations have been automatically identified. It also connects to examples garnered from the live web and the British National Corpus.
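The abstract does not say which technique identifies collocations; a common approach is to score adjacent word pairs by pointwise mutual information (PMI) over corpus counts. A minimal sketch (thresholds and data are invented for illustration):

```python
import math
from collections import Counter

def find_collocations(tokens, min_count=2, min_pmi=1.0):
    """Rank adjacent word pairs by pointwise mutual information (PMI)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n = len(tokens)
    scored = []
    for (w1, w2), c in bigrams.items():
        if c < min_count:
            continue
        # PMI = log( P(w1, w2) / (P(w1) * P(w2)) )
        pmi = math.log((c / n) / ((unigrams[w1] / n) * (unigrams[w2] / n)))
        if pmi >= min_pmi:
            scored.append(((w1, w2), pmi))
    return sorted(scored, key=lambda x: -x[1])

tokens = ("strong tea tastes better than weak tea , "
          "strong tea is popular").split()
print(find_collocations(tokens))  # ("strong", "tea") scores highest
```

At web scale one would work from precomputed n-gram counts rather than raw token streams, but the scoring idea is the same.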

    Refining the use of the web (and web search) as a language teaching and learning resource

    The web is a potentially useful corpus for language study because it provides examples of language that are contextualized and authentic, and is large and easily searchable. However, web contents are heterogeneous in the extreme, uncontrolled and hence 'dirty,' and exhibit features different from the written and spoken texts in other linguistic corpora. This article explores the use of the web and web search as a resource for language teaching and learning. We describe how a particular derived corpus containing a trillion word tokens in the form of n-grams has been filtered by word lists and syntactic constraints and used to create three digital library collections, linked with other corpora and the live web, that exploit the affordances of web text and mitigate some of its constraints.
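The abstract mentions filtering an n-gram corpus by word lists; the word-list step can be illustrated as follows (the data, function name, and word list are invented for illustration, and the syntactic constraints the article also applies are not modelled here):

```python
def filter_ngrams(ngrams, allowed_words):
    """Keep only n-grams whose every token is on an allowed word list.

    ngrams: list of (phrase, count) pairs, as in a web-derived n-gram corpus.
    """
    allowed = {w.lower() for w in allowed_words}
    return [(gram, count) for gram, count in ngrams
            if all(tok.lower() in allowed for tok in gram.split())]

# Misspellings and out-of-list tokens are discarded along with their counts.
ngrams = [("strong tea", 120), ("strong coffe", 40), ("weak tea", 55)]
wordlist = ["strong", "weak", "tea", "coffee"]
print(filter_ngrams(ngrams, wordlist))  # drops the misspelled "strong coffe"
```

Filtering against a vocabulary list like this is one way to remove the 'dirty' portion of web text before building curated collections.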

    Pregnancy in the Classroom

    This study examines the lived experiences of four high school teachers who have taught while they were pregnant. The teachers’ experiences are contextualized within a feminist psychoanalytic theoretical framework. Current maternity leave policy in the United States and popular culture texts provide additional contextualization for the women’s experiences.

    Model Cards for Model Reporting

    Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: one trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similarly detailed evaluation numbers and other relevant documentation.
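The abstract lists the sections a model card should carry (intended use, evaluation procedure, disaggregated metrics). One could represent such a card programmatically; the field names and example values below are illustrative, not the paper's own schema:

```python
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    """Minimal model-card record: intended use plus disaggregated metrics."""
    model_name: str
    intended_use: str
    evaluation_procedure: str
    # Benchmark results broken out per demographic or intersectional
    # group; the group keys here are illustrative.
    disaggregated_metrics: dict = field(default_factory=dict)

    def report(self):
        lines = [f"Model: {self.model_name}",
                 f"Intended use: {self.intended_use}",
                 f"Evaluation: {self.evaluation_procedure}"]
        for group, score in sorted(self.disaggregated_metrics.items()):
            lines.append(f"  {group}: {score:.3f}")
        return "\n".join(lines)

card = ModelCard(
    model_name="smile-detector-v1",
    intended_use="Detect smiling faces in consumer photos",
    evaluation_procedure="Accuracy on a held-out test set, per group",
    disaggregated_metrics={"age<30": 0.91, "age>=30": 0.88},
)
print(card.report())
```

Reporting metrics per group rather than as a single aggregate is the core idea: an overall accuracy can hide large gaps between the groups a model will actually serve.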

    Rev-dependent lentiviral expression vector

    BACKGROUND: HIV-responsive expression vectors are all based on the HIV promoter, the long terminal repeat (LTR). While responsive to an early HIV protein, Tat, the LTR is also responsive to cellular activation states and to the local chromatin activity where the integration has occurred. This can result in high HIV-independent activity, and has restricted the use of LTR-based reporter vectors to cloned cells, where aberrantly high-expressing (HIV-negative) cells can be eliminated. Enhancements in specificity would increase opportunities for expression vector use in detection of HIV as well as in experimental gene expression in HIV-infected cells. RESULTS: We have constructed an expression vector that possesses, in addition to the Tat-responsive LTR, numerous HIV DNA sequences that include the Rev-response element and HIV splicing sites that are efficiently used in human cells. It also contains a reading frame that is removed by cellular splicing activity in the absence of HIV Rev. The vector was incorporated into a lentiviral reporter virus, permitting detection of replicating HIV in living cell populations. The activity of the vector was measured by expression of a green fluorescent protein (GFP) reporter and by PCR of the reporter transcript following HIV infection. The vector displayed full HIV dependency. CONCLUSION: As with the earlier developed Tat-dependent expression vectors, the Rev system described here is an exploitation of an evolved HIV process. The inclusion of Rev-dependency renders the LTR-based expression vector highly dependent on the presence of replicating HIV. The application of this vector as reported here, an HIV-dependent reporter virus, offers a novel alternative to existing methods, in situ PCR or HIV antigen staining, for identifying HIV-positive cells. The vector permits examination of living cells, can express any gene for basic or clinical experimentation, and as a pseudotyped lentivirus has access to most cell types and tissues.

    A Latent Health Factor Model for Estimating Estuarine Ecosystem Health

    Assessment of the “health” of an ecosystem is of great interest to those engaged in ecosystem monitoring and conservation. Traditionally, scientists have quantified the health of an ecosystem using multimetric indices that are semi-qualitative. Recently, a statistics-based index called the Latent Health Factor Index (LHFI) was devised to address many inadequacies of the conventional indices. Unlike the conventional indices, the LHFI relies on standard modelling procedures, which accords it several advantages: it is less arbitrary, and it allows for straightforward model inference and for formal statistical prediction of health for a new site (using only supplementary environmental covariates). In contrast, conventional indices offer no formal statistical prediction, meaning that proper estimation of health for a new site requires benthic data, which are expensive and time-consuming to gather. As the LHFI modelling methodology is a relatively new concept, it has so far been demonstrated (and validated) only on freshwater ecosystems. The goal of this thesis is to apply the LHFI modelling methodology to estuarine ecosystems, particularly to the previously unassessed system in Richibucto, New Brunswick. Specifically, the aims of this thesis are threefold: first, to investigate whether the LHFI is even applicable to estuarine systems, since estuarine and freshwater metrics, or indicators of health, are quite different; second, to determine the appropriate form of the LHFI model if the technique is applicable; and third, to assess the health of the Richibucto system. Note that the second objective includes determining which covariates may have a significant impact on estuarine health. As scientists have previously used the AZTI Marine Biotic Index (AMBI) and the Infaunal Trophic Index (ITI) as measurements of estuarine ecosystem health, this thesis investigates LHFI models using metrics from these two indices simultaneously. Two sets of models were considered in a Bayesian framework and implemented using Markov chain Monte Carlo techniques: the first using only metrics from AMBI, and the second using metrics from both AMBI and ITI. Both sets of LHFI models were successful in that they were able to distinguish between health levels at different sites.
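The thesis fits its models by MCMC in a Bayesian framework; the generative structure of a one-factor latent health model can be sketched by simulation (all parameter names and values below are illustrative, not the thesis's fitted model):

```python
import random

random.seed(0)

def simulate_sites(n_sites=5, n_metrics=3, noise_sd=0.1):
    """Simulate benthic metrics driven by a single latent 'health' factor.

    Each site i has a latent health score h_i; each metric j loads on it
    linearly: y_ij = a_j + b_j * h_i + noise. Fitting inverts this:
    given observed metrics y, infer the unobserved h for each site.
    """
    intercepts = [random.gauss(0, 1) for _ in range(n_metrics)]
    loadings = [abs(random.gauss(1, 0.3)) for _ in range(n_metrics)]
    sites = []
    for _ in range(n_sites):
        h = random.gauss(0, 1)  # latent health of this site
        metrics = [a + b * h + random.gauss(0, noise_sd)
                   for a, b in zip(intercepts, loadings)]
        sites.append((h, metrics))
    return sites

for health, metrics in simulate_sites():
    print(f"health={health:+.2f}  metrics={[round(m, 2) for m in metrics]}")
```

Because the loadings are shared across sites, sites with higher latent health shift all their metrics together, which is what lets the fitted model rank sites by health from the metrics alone.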

    Rolling Back Transparency in China's Courts

    Despite a burgeoning conversation about the centrality of information management to governments, scholars are only just beginning to address the role of legal information in sustaining authoritarian rule. This Essay presents a case study showing how legal information can be manipulated: through the deletion of previously published cases from China’s online public database of court decisions. Using our own dataset of all 42 million cases made public in China between January 1, 2014, and September 2, 2018, we examine the recent deletion of criminal cases from the China Judgements Online website. We find that the deletion of cases likely results from a range of overlapping, often ad hoc, concerns: the international and domestic images of Chinese courts, institutional relationships within the Chinese Party-State, worries about revealing negative social phenomena, and concerns about copycat crimes. Taken together, the decision(s) to remove hundreds of thousands of unconnected cases shape a narrative about the Chinese courts, Chinese society, and the Chinese Party-State. Our findings also provide insight into the interrelated mechanisms of censorship and transparency in an era in which data governance is increasingly central. We highlight how courts seek to curate a narrative that protects the courts from criticism and boosts their standing with the public and within the Party-State. Examining how Chinese courts manage the removal of cases suggests that how courts curate and manage information disclosure may also be central to their legitimacy and influence.