    Wide-coverage deep statistical parsing using automatic dependency structure annotation

    A number of researchers (Lin 1995; Carroll, Briscoe, and Sanfilippo 1998; Carroll et al. 2002; Clark and Hockenmaier 2002; King et al. 2003; Preiss 2003; Kaplan et al. 2004; Miyao and Tsujii 2004) have convincingly argued for the use of dependency (rather than CFG-tree) representations for parser evaluation. Preiss (2003) and Kaplan et al. (2004) conducted a number of experiments comparing “deep” hand-crafted wide-coverage with “shallow” treebank- and machine-learning based parsers at the level of dependencies, using simple and automatic methods to convert tree output generated by the shallow parsers into dependencies. In this article, we revisit the experiments in Preiss (2003) and Kaplan et al. (2004), this time using the sophisticated automatic LFG f-structure annotation methodologies of Cahill et al. (2002b, 2004) and Burke (2006), with surprising results. We compare various PCFG and history-based parsers (based on Collins, 1999; Charniak, 2000; Bikel, 2002) to find a baseline parsing system that fits best into our automatic dependency structure annotation technique. This combined system of syntactic parser and dependency structure annotation is compared to two hand-crafted, deep constraint-based parsers (Carroll and Briscoe 2002; Riezler et al. 2002). We evaluate using dependency-based gold standards (DCU 105, PARC 700, CBS 500 and dependencies for WSJ Section 22) and use the Approximate Randomization Test (Noreen 1989) to test the statistical significance of the results. Our experiments show that machine-learning-based shallow grammars augmented with sophisticated automatic dependency annotation technology outperform hand-crafted, deep, wide-coverage constraint grammars. Currently our best system achieves an f-score of 82.73% against the PARC 700 Dependency Bank (King et al. 2003), a statistically significant improvement of 2.18% over the most recent results of 80.55% for the hand-crafted LFG grammar and XLE parsing system of Riezler et al. (2002), and an f-score of 80.23% against the CBS 500 Dependency Bank (Carroll, Briscoe, and Sanfilippo 1998), a statistically significant 3.66% improvement over the 76.57% achieved by the hand-crafted RASP grammar and parsing system of Carroll and Briscoe (2002)
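
    The abstract names the Approximate Randomization Test as its significance test for f-score differences. A minimal sketch of how such a test could be run is given below. It assumes each parser's output is summarised per sentence as a hypothetical (matched, predicted, gold) triple of dependency counts; the function names, the trial count and the data layout are illustrative assumptions, not the authors' implementation.

```python
import random


def f_score(counts):
    """Micro-averaged f-score from per-sentence (matched, predicted, gold) counts."""
    matched = sum(c[0] for c in counts)
    predicted = sum(c[1] for c in counts)
    gold = sum(c[2] for c in counts)
    if predicted + gold == 0:
        return 0.0
    # F = 2PR / (P + R) simplifies to 2 * matched / (predicted + gold).
    return 2.0 * matched / (predicted + gold)


def approximate_randomization(sys_a, sys_b, trials=10000, seed=0):
    """Estimate a two-sided p-value for the f-score difference of two systems.

    sys_a and sys_b are equal-length lists of per-sentence count triples
    (hypothetical layout). Each trial swaps the two systems' outputs on
    every sentence with probability 0.5 and recomputes the difference.
    """
    rng = random.Random(seed)
    observed = abs(f_score(sys_a) - f_score(sys_b))
    at_least_as_extreme = 0
    for _ in range(trials):
        shuffled_a, shuffled_b = [], []
        for a, b in zip(sys_a, sys_b):
            if rng.random() < 0.5:
                a, b = b, a
            shuffled_a.append(a)
            shuffled_b.append(b)
        if abs(f_score(shuffled_a) - f_score(shuffled_b)) >= observed:
            at_least_as_extreme += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (trials + 1)
```

    A small p-value suggests that the observed f-score gap is unlikely to arise from randomly reassigning sentence outputs between the two systems.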

    Similarity of Semantic Relations

    There are at least two kinds of similarity. Relational similarity is correspondence between relations, in contrast with attributional similarity, which is correspondence between attributes. When two words have a high degree of attributional similarity, we call them synonyms. When two pairs of words have a high degree of relational similarity, we say that their relations are analogous. For example, the word pair mason:stone is analogous to the pair carpenter:wood. This paper introduces Latent Relational Analysis (LRA), a method for measuring relational similarity. LRA has potential applications in many areas, including information extraction, word sense disambiguation, and information retrieval. Recently the Vector Space Model (VSM) of information retrieval has been adapted to measuring relational similarity, achieving a score of 47% on a collection of 374 college-level multiple-choice word analogy questions. In the VSM approach, the relation between a pair of words is characterized by a vector of frequencies of predefined patterns in a large corpus. LRA extends the VSM approach in three ways: (1) the patterns are derived automatically from the corpus, (2) the Singular Value Decomposition (SVD) is used to smooth the frequency data, and (3) automatically generated synonyms are used to explore variations of the word pairs. LRA achieves 56% on the 374 analogy questions, statistically equivalent to the average human score of 57%. On the related problem of classifying semantic relations, LRA achieves similar gains over the VSM
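
    The VSM approach described above represents each word pair as a vector of pattern frequencies. A minimal sketch of answering an analogy question from such vectors follows; it assumes cosine is the similarity measure, uses made-up pattern counts, and omits the SVD smoothing and synonym steps that LRA adds.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def answer_analogy(stem_vector, choice_vectors):
    """Return the choice whose pattern vector is most similar to the stem's."""
    return max(choice_vectors, key=lambda name: cosine(stem_vector, choice_vectors[name]))


# Hypothetical counts of four joining patterns (e.g. "X cuts Y", "X works with Y").
stem = [12, 3, 0, 7]  # mason:stone
choices = {
    "carpenter:wood": [10, 4, 1, 6],
    "teacher:chalk": [0, 1, 9, 2],
}
best = answer_analogy(stem, choices)
```

    With these illustrative counts the selected pair is carpenter:wood, since its pattern profile points in nearly the same direction as that of mason:stone.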

    Topological data analysis of Escherichia coli O157:H7 and non-O157 survival in soils.

    Shiga toxin-producing E. coli O157:H7 and non-O157 have been implicated in many foodborne illnesses caused by the consumption of contaminated fresh produce. However, data on their persistence in soils are limited due to the complexity in datasets generated from different environmental variables and bacterial taxa. There is a continuing need to distinguish the various environmental variables and different bacterial groups to understand the relationships among these factors and the pathogen survival. Using an approach called Topological Data Analysis (TDA), we reconstructed the relationship structure of E. coli O157 and non-O157 survival in 32 soils (16 organic and 16 conventionally managed soils) from California (CA) and Arizona (AZ) with a multi-resolution output. In our study, we took a community approach based on the total soil microbiome to study community-level survival and examined the network of the community as a whole and the relationship between its topology and biological processes. TDA produces a geometric representation of complex data sets. Network analysis showed that the Shiga toxin-negative strain E. coli O157:H7 4554 survived significantly longer than E. coli O157:H7 EDL 933, while the survival time of E. coli O157:NM was comparable to that of E. coli O157:H7 EDL 933 in all of the tested soils. Two non-O157 strains, E. coli O26:H11 and E. coli O103:H2, survived much longer than E. coli O91:H21 and the three strains of E. coli O157. We show that there are complex interactions between E. coli strain survival, microbial community structures, and soil parameters

    Reliability measurement without limits

    In computational linguistics, a reliability measurement of 0.8 on some statistic such as κ is widely thought to guarantee that hand-coded data is fit for purpose, with lower values suspect. We demonstrate that the main use of such data, machine learning, can tolerate data with a low reliability as long as any disagreement among human coders looks like random noise. When it does not, however, data can have a reliability of more than 0.8 and still be unsuitable for use: the disagreement may indicate erroneous patterns that machine-learning can learn, and evaluation against test data that contain these same erroneous patterns may lead us to draw wrong conclusions about our machine-learning algorithms. Furthermore, lower reliability values still held as acceptable by many researchers, between 0.67 and 0.8, may even yield inflated performance figures in some circumstances. Although this is a common sense result, it has implications for how we work that are likely to reach beyond the machine-learning applications we discuss. At the very least, computational linguists should look for any patterns in the disagreement among coders and assess what impact they will have
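
    For readers unfamiliar with the reliability statistic under discussion, the sketch below computes Cohen's kappa for two coders. Choosing Cohen's two-coder variant is an assumption, since the abstract does not say which form of the statistic it has in mind, and the labels are made up.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders labelling the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty item list")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected agreement if each coder labelled at random with their own label rates.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum((freq_a[label] / n) * (freq_b[label] / n) for label in freq_a)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical annotations from two coders.
kappa = cohen_kappa(["y", "y", "n", "n"], ["y", "n", "n", "n"])
```

    The statistic summarises the amount of disagreement only. Whether that disagreement is random noise or a systematic pattern, which is the distinction the abstract stresses, has to be examined separately.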

    Characterization of extended-spectrum β-lactamases produced by Escherichia coli isolated from hospitalized and nonhospitalized patients : emergence of CTX-M-15-producing strains causing urinary tract infections

    Extended-spectrum β-lactamase-producing Escherichia coli isolates were obtained from hospitalised and non-hospitalised patients in Belgium between August 2006 and November 2007. The antimicrobial susceptibility of these isolates was determined and their ESBL genes were characterized. Clonal relationships between the CTX-M-producing E. coli isolates causing urinary tract infections were also studied. A total of 90 hospital- and 45 community-acquired cephalosporin-resistant E. coli isolates were obtained. Tetracycline, enrofloxacin, gentamicin and trimethoprim-sulfamethoxazole resistance rates were significantly different between the community-onset and hospital-acquired isolates. A high diversity of different ESBLs was observed among the hospital-acquired E. coli isolates whereas CTX-M-15 was dominating among the community-acquired E. coli isolates (n=28). Thirteen different PFGE profiles were observed in the community-acquired CTX-M-15-producing E. coli indicating that multiple clones have acquired the blaCTX-M-15 gene. All community-acquired CTX-M-15-producing E. coli isolates of phylogroups B2 and D were assigned to the sequence type ST131. The hospital-acquired CTX-M-15-producing E. coli isolates of phylogroups B2, B1, A and D corresponded to ST131, ST617, ST48 and ST405, respectively. In conclusion, CTX-M-type ESBLs have emerged as the predominant class of ESBLs produced by E. coli isolates in the hospital and community in Belgium. Of particular concern is the predominant presence of the CTX-M-15 enzyme in ST131 community-acquired E. coli

    Variation of inflammatory dynamics and mediators in primiparous cows after intramammary challenge with Escherichia coli

    The objective of the current study was to investigate (i) the outcome of experimentally induced Escherichia coli mastitis in primiparous cows during early lactation in relation to production of eicosanoids and inflammatory indicators, and (ii) the validity of thermography to evaluate temperature changes on udder skin surface after experimentally induced E. coli mastitis. Nine primiparous Holstein Friesian cows were inoculated 24 ± 6 days (d) after parturition in both left quarters with E. coli P4 serotype O32:H37. Blood and milk samples were collected before and after challenge with E. coli. The infrared images were taken from the caudal view of the udder following challenge with E. coli. No relationship was detected between severity of mastitis and changes of thromboxane B2 (TXB2), leukotriene B4 (LTB4) and lipoxin A4 (LXA4). However, prostaglandin E2 (PGE2) was related to systemic disease severity during E. coli mastitis. Moreover, reduced somatic cell count (SCC), fewer circulating basophils, increased concentration of tumor necrosis factor-alpha (TNF-alpha) and higher milk sodium and lower milk potassium concentrations were related to systemic disease severity. The thermal camera was capable of detecting 2-3 °C temperature changes on udder skin surface of cows inoculated with E. coli. Peak of udder skin temperature occurred after peak of rectal temperature and appearance of local signs of induced E. coli mastitis. Although infrared thermography was a successful method for detecting the changes in udder skin surface temperature following intramammary challenge with E. coli, it did not prove to be a promising tool for early detection of mastitis

    Escherichia coli K1 RS218 Interacts with Human Brain Microvascular Endothelial Cells via Type 1 Fimbria Bacteria in the Fimbriated State

    Escherichia coli K1 is a major gram-negative organism causing neonatal meningitis. E. coli K1 binding to and invasion of human brain microvascular endothelial cells (HBMEC) are a prerequisite for E. coli penetration into the central nervous system in vivo. In the present study, we showed using DNA microarray analysis that E. coli K1 associated with HBMEC expressed significantly higher levels of the fim genes compared to nonassociated bacteria. We also showed that E. coli K1 binding to and invasion of HBMEC were significantly decreased with its fimH deletion mutant and type 1 fimbria locked-off mutant, while they were significantly increased with its type 1 fimbria locked-on mutant. E. coli K1 strains associated with HBMEC were predominantly type 1 fimbria phase-on (i.e., fimbriated) bacteria. Taken together, we showed for the first time that type 1 fimbriae play an important role in E. coli K1 binding to and invasion of HBMEC and that type 1 fimbria phase-on E. coli is the major population interacting with HBMEC

    Characterization of GDP-mannose Pyrophosphorylase from Escherichia Coli O157:H7 EDL933 and Its Broad Substrate Specificity

    The GDP-mannose pyrophosphorylase gene (ManC) of Escherichia coli (E. coli) O157 was cloned and expressed as a highly soluble protein in E. coli BL21 (DE3). The enzyme was subsequently purified using hydrophobic and ion exchange chromatography. ManC showed very broad substrate specificity for four nucleotides and various hexose-1-phosphates, yielding ADP-mannose, CDP-mannose, UDP-mannose, GDP-mannose, GDP-glucose and GDP-2-deoxy-glucose

    Assessing the occurrence and transfer dynamics of ESBL/pAmpC-producing Escherichia coli across the broiler production pyramid

    Extended-spectrum β-lactamase (ESBL)- and plasmid mediated AmpC-type cephalosporinase (pAmpC)-producing Escherichia coli (ESBL/pAmpC E. coli) in food-producing animals is a major public health concern. This study aimed at quantifying ESBL/pAmpC-E. coli occurrence and transfer in Italy's broiler production pyramid. Three production chains of an integrated broiler company were investigated. Cloacal swabs were taken from parent stock chickens and offspring broiler flocks in four fattening farms per chain. Carcasses from sampled broiler flocks were collected at slaughterhouse. Samples were processed on selective media, and E. coli colonies were screened for ESBL/pAmpC production. ESBL/pAmpC genes and E. coli phylogroups were determined by PCR and sequencing. Average pairwise overlap of ESBL/pAmpC E. coli gene and phylogroup occurrences between subsequent production stages was estimated using the proportional similarity index, modelling uncertainty in a Monte Carlo simulation setting. In total, 820 samples were processed, from which 513 ESBL/pAmpC E. coli isolates were obtained. We found a high prevalence (92.5%, 95%CI 72.1-98.3%) in day-old parent stock chicks, in which blaCMY-2 predominated; prevalence then dropped to 20% (12.9-29.6%) at laying phase. In fattening broilers, prevalence was 69.2% (53.6-81.3%) at the start of production, 54.2% (38.9-68.6%) at slaughter time, and 61.3% (48.1-72.9%) in carcasses. Significantly decreasing and increasing trends for respectively blaCMY-2 and blaCTX-M-1 gene occurrences were found across subsequent production stages. ESBL/pAmpC E. coli genetic background appeared complex and bla-gene/phylogroup associations indicated clonal and horizontal transmission. Modelling revealed that the average transfer of ESBL/pAmpC E. coli genes between subsequent production stages was 47.7% (42.3-53.4%). We concluded that ESBL/pAmpC E. coli in the broiler production pyramid is prevalent, with substantial transfer between subsequent production levels
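
    The overlap estimate above rests on the proportional similarity index. A minimal sketch of the point estimate is shown below, assuming the usual Czekanowski form (one minus half the summed absolute differences in relative frequency); the gene counts are invented for illustration, and the Monte Carlo uncertainty step mentioned in the abstract is not reproduced.

```python
def proportional_similarity(counts_a, counts_b):
    """Proportional similarity index between two frequency distributions.

    counts_a and counts_b map a category (e.g. a bla gene) to the number of
    isolates carrying it at one production stage. Returns a value in [0, 1].
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("each stage needs at least one isolate")
    categories = set(counts_a) | set(counts_b)
    difference = sum(
        abs(counts_a.get(c, 0) / total_a - counts_b.get(c, 0) / total_b)
        for c in categories
    )
    return 1.0 - 0.5 * difference


# Hypothetical gene counts at two subsequent production stages.
parent_stock = {"blaCMY-2": 3, "blaCTX-M-1": 1}
broilers = {"blaCMY-2": 1, "blaCTX-M-1": 3}
psi = proportional_similarity(parent_stock, broilers)
```

    An index of 1 means identical relative frequencies at both stages and 0 means no shared categories; the invented counts here give 0.5.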

    Remarkable stability of an instability-prone lentiviral vector plasmid in Escherichia coli Stbl3

    Large-scale production of plasmid DNA to prepare therapeutic gene vectors or DNA-based vaccines requires a suitable bacterial host, which can stably maintain the plasmid DNA during industrial cultivation. Plasmid loss during bacterial cell divisions and structural changes in the plasmid DNA can dramatically reduce the yield of the desired recombinant plasmid DNA. While generating an HIV-based gene vector containing a bicistronic expression cassette 5′-Olig2cDNA-IRES-dsRed2-3′, we encountered plasmid DNA instability, which occurred in homologous recombination deficient recA1 Escherichia coli strain Stbl2 specifically during large-scale bacterial cultivation. Unexpectedly, the new recombinant plasmid was structurally changed or completely lost in 0.5 L liquid cultures but not in the preceding 5 mL cultures. Neither the employment of an array of alternative recA1 E. coli plasmid hosts, nor the lowering of the culture incubation temperature prevented the instability. However, after the introduction of this instability-prone plasmid into the recA13 E. coli strain Stbl3, the transformed bacteria grew without being overrun by plasmid-free cells, reduction in the plasmid DNA yield or structural changes in plasmid DNA. Thus, E. coli strain Stbl3 conferred structural and maintenance stability to the otherwise instability-prone lentivirus-based recombinant plasmid, suggesting that this strain can be used for the faithful maintenance of similar stability-compromised plasmids in large-scale bacterial cultivations. In contrast to Stbl2, which is derived wholly from the wild type isolate E. coli K12, E. coli Stbl3 is a hybrid strain of mixed E. coli K12 and E. coli B parentage. Therefore, we speculate that genetic determinants for the benevolent properties of E. coli Stbl3 for safe plasmid propagation originate from its E. coli B ancestor