Authorship Attribution Using a Neural Network Language Model
In practice, training language models for individual authors is often
expensive because of limited data resources. In such cases, Neural Network
Language Models (NNLMs) generally outperform the traditional non-parametric
N-gram models. Here we investigate the performance of a feed-forward NNLM on an
authorship attribution problem, with moderate author set size and relatively
limited data. We also consider how the text topics impact performance. Compared
with a well-constructed N-gram baseline method with Kneser-Ney smoothing, the
proposed method achieves nearly a 2.5% reduction in perplexity and increases
author classification accuracy by 3.43% on average, given as few as 5 test
sentences. The performance is very competitive with the state of the art in
terms of accuracy and demand on test data. The source code, preprocessed
datasets, a detailed description of the methodology and results are available
at https://github.com/zge/authorship-attribution.
Comment: Proceedings of the 30th AAAI Conference on Artificial Intelligence
(AAAI'16)
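The perplexity-based attribution idea in this abstract can be illustrated with a minimal sketch. This is not the paper's method: it swaps in a bigram model with add-one smoothing in place of both the feed-forward NNLM and the Kneser-Ney baseline, and the toy training sentences, author labels, and function names are all hypothetical. One model is trained per author, and a test text is assigned to the author whose model gives it the lowest perplexity.

```python
import math
from collections import Counter

def tokenize(sentence):
    """Lowercase, split on whitespace, and add sentence boundary markers."""
    return ["<s>"] + sentence.lower().split() + ["</s>"]

def train_bigram(sentences):
    """Count bigrams and their left-hand contexts for one author."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    return contexts, bigrams

def perplexity(model, sentences, vocab_size):
    """Perplexity of the sentences under an add-one smoothed bigram model."""
    contexts, bigrams = model
    log_prob, count = 0.0, 0
    for sentence in sentences:
        tokens = tokenize(sentence)
        for prev, cur in zip(tokens[:-1], tokens[1:]):
            p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
            log_prob += math.log(p)
            count += 1
    return math.exp(-log_prob / count)

def attribute(models, sentences, vocab_size):
    """Return the author whose model assigns the lowest perplexity."""
    return min(models, key=lambda a: perplexity(models[a], sentences, vocab_size))

# Hypothetical toy corpus: two "authors" with two training sentences each.
training = {
    "author_a": ["the cat sat on the mat", "the cat ate the fish"],
    "author_b": ["gradient descent minimizes the loss",
                 "the loss decreases with each step"],
}
models = {author: train_bigram(texts) for author, texts in training.items()}

# Shared vocabulary across authors, plus one slot for unseen words.
vocab = {tok for texts in training.values() for s in texts for tok in tokenize(s)}
vocab_size = len(vocab) + 1

predicted = attribute(models, ["the cat sat"], vocab_size)
```

With so little data the scores are only illustrative; the abstract's point is that neural models tend to cope with limited per-author data better than count-based models like this one.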
Genomic characterisation of an endometrial pathogenic <i>Escherichia coli</i> strain reveals the acquisition of genetic elements associated with extra-intestinal pathogenicity
<b>Background</b>
Strains of <i>Escherichia coli</i> cause a wide variety of intestinal and extra-intestinal diseases in both humans and animals, and are also often found in healthy individuals or the environment. Broadly, a strong phylogenetic relationship exists that distinguishes most <i>E. coli</i> causing intestinal disease from those that cause extra-intestinal disease. However, isolates within a recently described subclass of Extra-Intestinal Pathogenic <i>E. coli</i> (ExPEC), termed endometrial pathogenic <i>E. coli</i>, tend to be phylogenetically distant from the vast majority of characterised ExPECs, and more closely related to human intestinal pathogens. In this work, we investigate the genetic basis for ExPEC infection in the prototypic endometrial pathogenic <i>E. coli</i> strain MS499.
<b>Results</b>
By investigating the genome of MS499 in comparison with a range of other <i>E. coli</i> sequences, we have discovered that this bacterium has acquired substantial lengths of DNA which encode factors more usually associated with ExPECs and less frequently found in the phylogroup relatives of MS499. Many of these acquired factors, including several iron acquisition systems and a virulence plasmid similar to that found in several ExPECs such as APEC O1 and the neonatal meningitis <i>E. coli</i> S88, play characterised roles in a variety of typical ExPEC infections and appear to have been acquired recently by the evolutionary lineage leading to MS499.
<b>Conclusions</b>
Taking advantage of the phylogenetic relationship we describe between MS499 and several other closely related <i>E. coli</i> isolates from across the globe, we propose a step-wise evolution of a novel clade of sequence type 453 ExPECs within phylogroup B1, involving the recruitment of ExPEC virulence factors into the genome of an ancestrally non-extraintestinal <i>E. coli</i>, which has repurposed this lineage with the capacity to cause extraintestinal disease. These data reveal the genetic components which may be involved in this phenotype switching, and argue that horizontal gene exchange may be a key factor in the emergence of novel lineages of ExPECs.
Cancer-Related Mutations Are Not Enriched in Naive Human Pluripotent Stem Cells.
Previous analysis of RNA sequencing (RNA-seq) data from human naive pluripotent stem cells reported multiple point "mutations" in cancer-related genes and implicated selective culture conditions. We observed, however, that those mutations were only present in co-cultures with mouse feeder cells. Inspection of reads containing the polymorphisms revealed complete identity to the mouse reference genome. After we filtered reads to remove sequences of mouse origin, the actual incidence of oncogenic polymorphisms arising in naive pluripotent stem cells is close to zero.
We are grateful to James Clarke for cell culture support and to Vicki Murry and Maike Paramor for generating sequencing libraries. This research was funded by the Medical Research Council (MRC) of the United Kingdom. The Wellcome-MRC Cambridge Stem Cell Institute receives core support from Wellcome and MRC. AS is a Medical Research Council Professor.
Undergraduate medical textbooks do not provide adequate information on intravenous fluid therapy: a systematic survey and suggestions for improvement
<b>Background</b>
Inappropriate prescribing of intravenous (IV) fluid, particularly 0.9% sodium chloride, causes post-operative complications. Fluid prescription is often left to junior medical staff and is frequently poorly managed. One reason for poor intravenous fluid prescribing practices could be inadequate coverage of this topic in the textbooks that are used.
<b>Methods</b>
We formulated a comprehensive set of topics related to important common clinical situations involving IV fluid therapy (routine fluid replacement, fluid loss, fluid overload) to assess the adequacy of textbooks in common use. We assessed 29 medical textbooks widely available to students in the UK, scoring the presence of information provided by each book on each of the topics. The scores indicated how fully the topics were considered: not at all, partly, or adequately. No attempt was made to judge the quality of the information, because there is no consensus on these topics.
<b>Results</b>
The maximum score that a book could achieve was 52. Three of the topics we chose were not considered by any of the books. Discounting these topics as “too esoteric”, the maximum possible score became 46. One textbook gained a score of 45, but the general score was poor (median 11, quartiles 4, 21). In particular, coverage of routine postoperative management was inadequate.
<b>Conclusions</b>
Textbooks for undergraduates cover the topic of intravenous therapy badly, which may partly explain the poor knowledge and performance of junior doctors in this important field. Systematic revision of current textbooks might improve knowledge and practice by junior doctors. Careful definition of the remit and content of textbooks should be applied more widely to ensure quality and “fitness for purpose”, and avoid omission of vital knowledge.
The Cell-Surface Marker Sushi Containing Domain 2 Facilitates Establishment of Human Naive Pluripotent Stem Cells.
Recently, naive human pluripotent stem cells (hPSCs) have been described that relate to an earlier stage of development than conventional hPSCs. Naive hPSCs remain challenging to generate and authenticate, however. Here we report that Sushi Containing Domain 2 (SUSD2) is a robust cell-surface marker of naive hPSCs in the embryo and in vitro. SUSD2 transcripts are enriched in the pre-implantation epiblast of human blastocysts and immunostaining shows localization of SUSD2 to KLF17-positive epiblast cells. SUSD2 mRNA is strongly expressed in naive hPSCs but is negligible in other hPSCs. SUSD2 immunostaining of live or fixed cells provides unambiguous discrimination of naive versus conventional hPSCs. SUSD2 staining or flow cytometry enables monitoring of naive hPSCs in maintenance culture, and their isolation and quantification during resetting of conventional hPSCs or somatic cell reprogramming. Thus SUSD2 is a powerful non-invasive tool for reliable identification and purification of the naive hPSC phenotype.
This research was funded by the Medical Research Council of the United Kingdom (G1001028 and MR/P00072X/1) and European Commission Framework 7 (HEALTH-F4-2013-602423, PluriMes). The Cambridge Stem Cell Institute receives core support from the Wellcome Trust and the Medical Research Council. AS is a Medical Research Council Professor.
Mapping the cold molecular gas in a cooling flow cluster: Abell 1795
Cold molecular gas is found in several clusters of galaxies (Edge, 2001;
Salomé & Combes, 2003): single dish telescope observations in CO(1-0) and
CO(2-1) emission lines have revealed the existence of large amounts of cold gas
(up to ~10^11 Msol) in the central region of cooling flow clusters. We present
here interferometric observations performed with the IRAM Plateau de Bure
interferometer in Abell 1795. Comparison with IRAM 30m data shows that the cold
gas detected is extended, suggesting a cooling flow origin. The CO features
identified are very similar to the structures observed in Hα and to the
star-forming regions observed through UV continuum excess. A large fraction of
the cold gas is not centered on the central cD, but located near the brightest
X-ray emitting regions along the north-west oriented radio lobe. The cold gas
kinematics is consistent with the optical nebulosity behaviour in the very
central region. It is not in rotation around the central cD: a velocity
gradient shows the cold gas might be cooled gas from the intra-cluster medium
being accreted by the central galaxy. The optical filaments, aligned with the
cD orbit, are intimately related to the radio jets and lobes. The material
fueling the star formation certainly comes from the deposited gas, cooling more
efficiently along the edge of the radio lobes. Even if some heating mechanisms
are present, these millimetric observations show that an effective cooling to
very low temperatures indeed occurs and is probably accelerated by the presence
of the radio source.
Comment: 4 pages, 4 figures, accepted for publication in A&A (Letter)
Naive stem cell blastocyst model captures human embryo lineage segregation.
Human naive pluripotent cells can differentiate into extraembryonic trophectoderm and hypoblast. Here we describe a human embryo model (blastoid) generated by self-organization. Brief induction of trophectoderm leads to formation of blastocyst-like structures within 3 days. Blastoids are composed of three tissue layers displaying exclusive lineage markers, mimicking the natural blastocyst. Single-cell transcriptome analyses confirm segregation of trophectoderm, hypoblast, and epiblast with high fidelity to the human embryo. This versatile and scalable system provides a robust experimental model for human embryo research.