63 research outputs found

    Self-Adaptive Hierarchical Sentence Model

    The ability to accurately model a sentence at varying stages (e.g., word-phrase-sentence) plays a central role in natural language processing. As an effort towards this goal we propose a self-adaptive hierarchical sentence model (AdaSent). AdaSent effectively forms a hierarchy of representations from words to phrases and then to sentences through recursive gated local composition of adjacent segments. We design a competitive mechanism (through gating networks) to allow the representations of the same sentence to be engaged in a particular learning task (e.g., classification), therefore effectively mitigating the gradient vanishing problem persistent in other recursive models. Both qualitative and quantitative analyses show that AdaSent can automatically form and select the representations suitable for the task at hand during training, yielding superior classification performance over competitor models on 5 benchmark data sets. Comment: 8 pages, 7 figures, accepted as a full paper at IJCAI 201

    Neural Ranking Models with Weak Supervision

    Despite the impressive improvements achieved by unsupervised deep neural networks in computer vision and NLP tasks, such improvements have not yet been observed in ranking for information retrieval. The reason may be the complexity of the ranking problem, as it is not obvious how to learn from queries and documents when no supervised signal is available. Hence, in this paper, we propose to train a neural ranking model using weak supervision, where labels are obtained automatically without human annotators or any external resources (e.g., click data). To this aim, we use the output of an unsupervised ranking model, such as BM25, as a weak supervision signal. We further train a set of simple yet effective ranking models based on feed-forward neural networks. We study their effectiveness under various learning scenarios (point-wise and pair-wise models) and using different input representations (i.e., from encoding query-document pairs into dense/sparse vectors to using word embedding representation). We train our networks using tens of millions of training instances and evaluate them on two standard collections: a homogeneous news collection (Robust) and a heterogeneous large-scale web collection (ClueWeb). Our experiments indicate that employing proper objective functions and letting the networks learn the input representation based on weakly supervised data leads to impressive performance, with over 13% and 35% MAP improvements over the BM25 model on the Robust and the ClueWeb collections. Our findings also suggest that supervised neural ranking models can greatly benefit from pre-training on large amounts of weakly labeled data that can be easily obtained from unsupervised IR models. Comment: In proceedings of The 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR2017)
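
    The weak-supervision idea in this abstract can be illustrated with a small sketch: score documents with BM25 and turn the score ordering into pair-wise training labels. This is not the authors' pipeline; the function names, the toy corpus, and the parameter values k1=1.2 and b=0.75 are assumptions chosen for illustration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25.

    k1 and b are common defaults, assumed here; the abstract does not give them.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def weak_pairwise_labels(scores):
    """Build (i, j, label) triples: label is 1 if doc i outscores doc j, else 0.

    Tied pairs are skipped because they carry no preference signal.
    """
    pairs = []
    for i in range(len(scores)):
        for j in range(i + 1, len(scores)):
            if scores[i] != scores[j]:
                pairs.append((i, j, 1 if scores[i] > scores[j] else 0))
    return pairs

# Hypothetical toy corpus, already tokenized.
docs = [
    ["neural", "ranking", "model"],
    ["weather", "report"],
    ["ranking", "with", "weak", "supervision"],
]
query = ["neural", "ranking"]
scores = bm25_scores(query, docs)
pairs = weak_pairwise_labels(scores)
```

    A pair-wise ranker would then be trained to reproduce these labels, so that no human relevance judgments are needed at this stage.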

    Defining the Genetic Features of O-Antigen Biosynthesis Gene Cluster and Performance of an O-Antigen Serotyping Scheme for Escherichia albertii

    Escherichia albertii is a newly described and emerging diarrheagenic pathogen responsible for outbreaks of gastroenteritis. Serotyping plays an important role in diagnosis and epidemiological studies for pathogens of public health importance. The diversity of O-antigen biosynthesis gene clusters (O-AGCs) provides the primary basis for serotyping. However, little is known about the distribution and diversity of O-AGCs of E. albertii strains. Here, we presented a complete sequence set for the O-AGCs from 52 E. albertii strains and identified seven distinct O-AGCs. Six of these were also found in 15 genomes of E. albertii strains deposited in the public database. Possession of wzy/wzx genes in each O-AGC strongly suggests that O-antigens of E. albertii were synthesized by the Wzx/Wzy-dependent pathway. Furthermore, we performed an O-antigen serotyping scheme for E. albertii based on specific antisera against seven O-antigens and a high throughput xTAG Luminex assay to simultaneously detect seven O-AGCs. Both methods accurately identified serotypes of 64 tested E. albertii strains. Our data revealed the high-level diversity of O-AGCs in E. albertii. We also provide valuable methods to reliably identify and serotype this bacterium.

    Several Cancer Susceptibility Variants Also Affect Melanoma Risk

    Background: Several regions of the genome show pleiotropic associations with multiple cancers. We sought to evaluate whether 181 single-nucleotide polymorphisms previously associated with various cancers in genome-wide association studies were also associated with melanoma risk. Methods: We evaluated 2,131 melanoma cases and 20,353 controls from three studies in the Population Architecture using Genomics and Epidemiology (PAGE) study (EAGLE-BioVU, MEC, WHI) and two collaborating studies (HPFS, NHS). Overall and sex-stratified analyses were performed across studies. Results: We observed statistically significant associations with melanoma for two lung cancer SNPs in the TERT-CLPTM1L locus (Bonferroni-corrected p < 2.8 × 10^-4), replicating known pleiotropic effects at this locus. In sex-stratified analyses, we also observed a potential male-specific association between prostate cancer risk variant rs12418451 and melanoma risk (OR = 1.22, p = 8.0 × 10^-4). No other variants in our study were associated with melanoma after multiple comparisons adjustment (p > 2.8 × 10^-4). Conclusions: We provide confirmatory evidence of pleiotropic associations with melanoma for two SNPs previously associated with lung cancer, and provide suggestive evidence for a male-specific association with melanoma for prostate cancer variant rs12418451. This SNP is located near TPCN2, an ion transport gene containing SNPs which have been previously associated with hair pigmentation but not melanoma risk. Previous evidence provides biological plausibility for this association, and suggests a complex interplay between ion transport, pigmentation, and melanoma risk that may vary by sex. If confirmed, these pleiotropic relationships may help elucidate shared molecular pathways between cancers and related phenotypes.
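
    The significance threshold quoted in this abstract is consistent with a Bonferroni correction over the 181 tested SNPs. The short check below assumes a family-wise alpha of 0.05, which the abstract does not state; the variable names are illustrative.

```python
# Assumed family-wise error rate; the abstract does not state it explicitly.
alpha = 0.05
n_snps = 181  # number of SNPs tested, from the abstract

# Bonferroni correction: divide alpha by the number of tests.
threshold = alpha / n_snps

# p-value reported for rs12418451 in the sex-stratified analysis.
p_rs12418451 = 8.0e-4

# True only if the variant clears the corrected threshold.
passes_bonferroni = p_rs12418451 < threshold

print(f"Bonferroni threshold: {threshold:.2e}")
print(f"rs12418451 passes: {passes_bonferroni}")
```

    Under that assumption the threshold rounds to the 2.8 × 10^-4 given in the abstract, and the rs12418451 p-value does not clear it, which matches the abstract's description of that association as only suggestive.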

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

    A Review of Tungsten Resources and Potential Extraction from Mine Waste

    Tungsten is recognized as a critical metal due to its unique properties, economic importance, and limited sources of supply. It has wide applications where hardness, high density, high wear, and high-temperature resistance are required, such as in mining, construction, energy generation, electronics, aerospace, and defense sectors. The two primary tungsten minerals, and the only minerals of economic importance, are wolframite and scheelite. Secondary tungsten minerals are rare and generated by hydrothermal or supergene alteration rather than by atmospheric weathering. There are no reported concerns for tungsten toxicity. However, tungsten tailings and other residues may represent severe risks to human health and the environment. Tungsten metal scrap is the only secondary source for this metal but reprocessing of tungsten tailings may also become important in the future. Enhanced gravity separation, wet high-intensity magnetic separation, and flotation have been reported to be successful in reprocessing tungsten tailings, while bioleaching can assist with removing some toxic elements. In 2020, the world’s tungsten mine production was estimated at 84 kt of tungsten (106 kt WO3), with known tungsten reserves of 3400 kt. In addition, old tungsten tailings deposits may have great potential for exploration. The incomplete statistics indicate about 96 kt of tungsten content in those deposits, with an average grade of 0.1% WO3 (versus typical grades of 0.3–1% in primary deposits). This paper aims to provide an overview of tungsten minerals, tungsten primary and secondary resources, and tungsten mine waste, including its environmental risks and potential for reprocessing
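
    The two production figures in this abstract (84 kt of tungsten and 106 kt WO3) can be cross-checked by converting oxide mass to metal mass. The sketch below uses standard atomic masses (W ≈ 183.84, O ≈ 15.999), which are not given in the abstract; the variable names are illustrative.

```python
# Standard atomic masses in g/mol (not stated in the abstract).
M_W = 183.84
M_O = 15.999

# Mass fraction of tungsten in tungsten trioxide, WO3.
w_fraction_in_wo3 = M_W / (M_W + 3 * M_O)

# 2020 mine production expressed as WO3, from the abstract (kt).
production_wo3_kt = 106

# Equivalent contained tungsten metal (kt).
production_w_kt = production_wo3_kt * w_fraction_in_wo3

print(f"W mass fraction in WO3: {w_fraction_in_wo3:.3f}")
print(f"Contained tungsten: {production_w_kt:.1f} kt")
```

    The converted value rounds to the 84 kt of tungsten quoted alongside the WO3 figure.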

    Cell Penetrating Peptide-Based Self-Assembly for PD-L1 Targeted Tumor Regression

    Cell penetrating peptides (CPPs) are peptides that can directly adapt to cell membranes and then permeate into cells. CPPs are usually covalently linked to the surface of nanocarriers to endow their permeability to the whole system. However, hybrids with lipids or polymers make the metabolism much more sophisticated and even more difficult to determine. In this study, we present a continuous sequence of 18 amino acids (FFAARTMIWY(d-P)GAWYKRI). It forms nanospheres around 170 nm, which increase slightly after loading with siRNA and DOX. Notably, it can be internalized by cancer cells mainly through electronic interactions and PD-L1-mediated endocytosis. Compared with poly-l-lysine and polyethyleneimine, it has a much higher efficiency (about four times) of gene transduction while lowering toxicity. In the treatment of cancer, it causes apoptosis (21%) and inhibits the expression of SURVIVIN protein in vitro. In vivo, it shows good biocompatibility as there are no changes in mice's body weight. When administering peptide-siRNA-DOX, tumor growth is inhibited the most (about three times). These results indicate that the sequence is a good candidate for gene therapy and drug delivery.