356 research outputs found

    Extending a Fuzzy Polarity Propagation Method for Multi-Domain Sentiment Analysis with Word Embedding and POS Tagging

    Within multi-domain sentiment analysis, we study how different domain-dependent polarities can be learned for the same concepts. To this aim, we extend an existing approach based on the propagation of fuzzy polarities over a semantic graph capturing background linguistic knowledge to learn concept polarities with respect to various domains and their uncertainty from labeled datasets. In particular, we use POS tagging to refine the association between terms and concepts and word embedding to enhance the construction of the semantic graph. The proposed approach is then evaluated on a standard benchmark, showing that the combined use of POS tagging and word embedding improves its performance. One particularly strong point of the proposed approach is its recall, which is always very close to 100%. In addition, we observe that it exhibits good cross-domain generalization capabilities.

    Prediction of protein interactions on HIV-1-human PPI data using a novel closure-based integrated approach

    Discovering Protein-Protein Interactions (PPI) is a new and interesting challenge in computational biology. Identifying interactions among proteins has been shown to be useful for finding new drugs and preventing several kinds of diseases. The identification of interactions between HIV-1 proteins and human proteins is a particular PPI problem whose study might lead to the discovery of drugs and important interactions responsible for AIDS. We present the FIST algorithm for extracting hierarchical bi-clusters and minimal covers of association rules in one process. This algorithm is based on the frequent closed itemsets framework to efficiently generate a hierarchy of conceptual clusters and non-redundant sets of association rules with supporting object lists. Experiments conducted on an HIV-1 and human protein interaction dataset show that the approach efficiently identifies interactions previously predicted in the literature and can be used to predict new interactions based on previous biological knowledge.


    Origin and Evolution of TRIM Proteins: New Insights from the Complete TRIM Repertoire of Zebrafish and Pufferfish

    Tripartite motif proteins (TRIM) constitute a large family of proteins containing a RING-Bbox-Coiled Coil motif followed by different C-terminal domains. Involved in ubiquitination, TRIM proteins participate in many cellular processes including antiviral immunity. The TRIM family is ancient and has been greatly diversified in vertebrates and especially in fish. We analyzed the complete sets of trim genes of the large zebrafish genome and of the compact pufferfish genome. Both contain three large multigene subsets - adding the hsl5/trim35-like genes (hltr) to the ftr and the btr that we previously described - all containing a B30.2 domain that evolved under positive selection. These subsets are conserved among teleosts. By contrast, most human trim genes of the other classes have only one or two orthologues in fish. Loss or gain of C-terminal exons generated proteins with different domain organizations, either by the deletion of the ancestral domain or, remarkably, by the acquisition of a new C-terminal domain. Our survey of trim genes in fish identifies subsets with different evolutionary dynamics. The trim genes encoding RBCC-B30.2 proteins show the same evolutionary trends in fish and tetrapods: they evolve fast, often under positive selection, and they duplicate to create multigenic families. We could identify new combinations of domains, which epitomize how new trim classes appear by domain insertion or exon shuffling. Notably, we found that a cyclophilin-A domain replaces the B30.2 domain of a zebrafish fintrim gene, as reported in the macaque and owl monkey antiretroviral TRIM5α. Finally, trim genes encoding RBCC-B30.2 proteins are preferentially located in the vicinity of MHC or MHC gene paralogues, which suggests that such trim genes may have been part of the ancestral MHC.

    The Baryon Oscillation Spectroscopic Survey of SDSS-III

    The Baryon Oscillation Spectroscopic Survey (BOSS) is designed to measure the scale of baryon acoustic oscillations (BAO) in the clustering of matter over a larger volume than the combined efforts of all previous spectroscopic surveys of large scale structure. BOSS uses 1.5 million luminous galaxies as faint as i=19.9 over 10,000 square degrees to measure BAO to redshifts z<0.7. Observations of neutral hydrogen in the Lyman alpha forest in more than 150,000 quasar spectra (g<22) will constrain BAO over the redshift range 2.15<z<3.5. Early results from BOSS include the first detection of the large-scale three-dimensional clustering of the Lyman alpha forest and a strong detection from the Data Release 9 data set of the BAO in the clustering of massive galaxies at an effective redshift z = 0.57. We project that BOSS will yield measurements of the angular diameter distance D_A to an accuracy of 1.0% at redshifts z=0.3 and z=0.57 and measurements of H(z) to 1.8% and 1.7% at the same redshifts. Forecasts for Lyman alpha forest constraints predict a measurement of an overall dilation factor that scales the highly degenerate D_A(z) and H^{-1}(z) parameters to an accuracy of 1.9% at z~2.5 when the survey is complete. Here, we provide an overview of the selection of spectroscopic targets, planning of observations, and analysis of data and data quality of BOSS. Comment: 49 pages, 16 figures, accepted by A

    The Ninth Data Release of the Sloan Digital Sky Survey: First Spectroscopic Data from the SDSS-III Baryon Oscillation Spectroscopic Survey

    The Sloan Digital Sky Survey III (SDSS-III) presents the first spectroscopic data from the Baryon Oscillation Spectroscopic Survey (BOSS). This ninth data release (DR9) of the SDSS project includes 535,995 new galaxy spectra (median z=0.52), 102,100 new quasar spectra (median z=2.32), and 90,897 new stellar spectra, along with the data presented in previous data releases. These spectra were obtained with the new BOSS spectrograph and were taken between 2009 December and 2011 July. In addition, the stellar parameters pipeline, which determines radial velocities, surface temperatures, surface gravities, and metallicities of stars, has been updated and refined with improvements in temperature estimates for stars with T_eff<5000 K and in metallicity estimates for stars with [Fe/H]>-0.5. DR9 includes new stellar parameters for all stars presented in DR8, including stars from SDSS-I and II, as well as those observed as part of the SDSS-III Sloan Extension for Galactic Understanding and Exploration-2 (SEGUE-2). The astrometry error introduced in the DR8 imaging catalogs has been corrected in the DR9 data products. The next data release for SDSS-III will be in Summer 2013, which will present the first data from the Apache Point Observatory Galactic Evolution Experiment (APOGEE) along with another year of data from BOSS, followed by the final SDSS-III data release in December 2014. Comment: 9 figures; 2 tables. Submitted to ApJS. DR9 is available at http://www.sdss3.org/dr

    The Fourteenth Data Release of the Sloan Digital Sky Survey: First Spectroscopic Data from the extended Baryon Oscillation Spectroscopic Survey and from the second phase of the Apache Point Observatory Galactic Evolution Experiment

    The fourth generation of the Sloan Digital Sky Survey (SDSS-IV) has been in operation since July 2014. This paper describes the second data release from this phase, and the fourteenth from SDSS overall (making this Data Release Fourteen, or DR14). This release makes public data taken by SDSS-IV in its first two years of operation (July 2014-2016). Like all previous SDSS releases, DR14 is cumulative, including the most recent reductions and calibrations of all data taken by SDSS since the first phase began operations in 2000. New in DR14 is the first public release of data from the extended Baryon Oscillation Spectroscopic Survey (eBOSS); the first data from the second phase of the Apache Point Observatory (APO) Galactic Evolution Experiment (APOGEE-2), including stellar parameter estimates from an innovative data-driven machine learning algorithm known as "The Cannon"; and almost twice as many data cubes from the Mapping Nearby Galaxies at APO (MaNGA) survey as were in the previous release (N = 2812 in total). This paper describes the location and format of the publicly available data from SDSS-IV surveys. We provide references to the important technical papers describing how these data have been taken (both targeting and observation details) and processed for scientific use. The SDSS website (www.sdss.org) has been updated for this release, and provides links to data downloads, as well as tutorials and examples of data use. SDSS-IV is planning to continue to collect astronomical data until 2020, and will be followed by SDSS-V. Comment: SDSS-IV collaboration alphabetical author data release paper. DR14 happened on 31st July 2017. 19 pages, 5 figures. Accepted by ApJS on 28th Nov 2017 (this is the "post-print" and "post-proofs" version; minor corrections only from v1, and most of errors found in proofs corrected

    HIV infection of the male genital tract – consequences for sexual transmission and reproduction

    Despite semen being the main vector of human immunodeficiency virus (HIV) dissemination worldwide, the origin of the virus in this bodily fluid remains unclear. It was recently shown that several organs of the male genital tract (MGT) are infected by HIV/simian immunodeficiency virus (SIV) and likely to contribute to semen viral load during the primary and chronic stages of the infection. These findings are important in helping answer the following questions: (i) does the MGT constitute a viral reservoir responsible for the persistence of virus release into the semen of a subset of HIV-infected men under antiretroviral therapy, who otherwise show an undetectable blood viral load? (ii) What is the aetiology of the semen abnormalities observed in asymptomatic HIV-infected men? (iii) What is the exact nature of the interactions between the spermatozoa, their testicular progenitors and HIV, an important issue in the context of assisted reproductive techniques proposed for HIV-seropositive (HIV+) men? Answers to these questions are crucial for the design of new therapeutic strategies aimed at eradicating the virus from the genital tract of HIV+ men – thus reducing its sexual transmission – and for improving the care of serodiscordant couples wishing to have children. This review summarizes the most recent literature on HIV infection of the male genital tract, discusses the above issues in light of the latest findings and highlights future directions of research.