369 research outputs found
Extending a Fuzzy Polarity Propagation Method for Multi-Domain Sentiment Analysis with Word Embedding and POS Tagging
Within multi-domain sentiment analysis, we study how different domain-dependent polarities can be learned for the same concepts. To this aim, we extend an existing approach based on the propagation of fuzzy polarities over a semantic graph capturing background linguistic knowledge to learn concept polarities with respect to various domains and their uncertainty from labeled datasets. In particular, we use POS tagging to refine the association between terms and concepts and word embedding to enhance the construction of the semantic graph. The proposed approach is then evaluated on a standard benchmark, showing that the combined use of POS tagging and word embedding improves its performance. One particularly strong point of the proposed approach is its recall, which is always very close to 100%. In addition, we observe that it exhibits good cross-domain generalization capabilities.
Prediction of protein interactions on HIV-1-human PPI data using a novel closure-based integrated approach
Discovering Protein-Protein Interactions (PPI) is a new and interesting challenge in computational biology. Identifying interactions among proteins was shown to be useful for finding new drugs and preventing several kinds of diseases. The identification of interactions between HIV-1 proteins and Human proteins is a particular PPI problem whose study might lead to the discovery of drugs and important interactions responsible for AIDS. We present the FIST algorithm for extracting hierarchical bi-clusters and minimal covers of association rules in one process. This algorithm is based on the frequent closed itemsets framework to efficiently generate a hierarchy of conceptual clusters and non-redundant sets of association rules with supporting object lists. Experiments conducted on an HIV-1 and Human proteins interaction dataset show that the approach efficiently identifies interactions previously predicted in the literature and can be used to predict new interactions based on previous biological knowledge.
Origin and Evolution of TRIM Proteins: New Insights from the Complete TRIM Repertoire of Zebrafish and Pufferfish
Tripartite motif proteins (TRIM) constitute a large family of proteins containing a RING-Bbox-Coiled Coil motif followed by different C-terminal domains. Involved in ubiquitination, TRIM proteins participate in many cellular processes including antiviral immunity. The TRIM family is ancient and has been greatly diversified in vertebrates and especially in fish. We analyzed the complete sets of trim genes of the large zebrafish genome and of the compact pufferfish genome. Both contain three large multigene subsets - adding the hsl5/trim35-like genes (hltr) to the ftr and the btr that we previously described - all containing a B30.2 domain that evolved under positive selection. These subsets are conserved among teleosts. By contrast, most human trim genes of the other classes have only one or two orthologues in fish. Loss or gain of C-terminal exons generated proteins with different domain organizations, either by the deletion of the ancestral domain or, remarkably, by the acquisition of a new C-terminal domain. Our survey of trim genes in fish identifies subsets with different evolutionary dynamics. trims encoding RBCC-B30.2 proteins show the same evolutionary trends in fish and tetrapods: they evolve fast, often under positive selection, and they duplicate to create multigenic families. We could identify new combinations of domains, which epitomize how new trim classes appear by domain insertion or exon shuffling. Notably, we found that a cyclophilin-A domain replaces the B30.2 domain of a zebrafish fintrim gene, as reported in the macaque and owl monkey antiretroviral TRIM5α. Finally, trim genes encoding RBCC-B30.2 proteins are preferentially located in the vicinity of MHC or MHC gene paralogues, which suggests that such trim genes may have been part of the ancestral MHC.
The Baryon Oscillation Spectroscopic Survey of SDSS-III
The Baryon Oscillation Spectroscopic Survey (BOSS) is designed to measure the
scale of baryon acoustic oscillations (BAO) in the clustering of matter over a
larger volume than the combined efforts of all previous spectroscopic surveys
of large scale structure. BOSS uses 1.5 million luminous galaxies as faint as
i=19.9 over 10,000 square degrees to measure BAO to redshifts z<0.7.
Observations of neutral hydrogen in the Lyman alpha forest in more than 150,000
quasar spectra (g<22) will constrain BAO over the redshift range 2.15<z<3.5.
Early results from BOSS include the first detection of the large-scale
three-dimensional clustering of the Lyman alpha forest and a strong detection
from the Data Release 9 data set of the BAO in the clustering of massive
galaxies at an effective redshift z = 0.57. We project that BOSS will yield
measurements of the angular diameter distance D_A to an accuracy of 1.0% at
redshifts z=0.3 and z=0.57 and measurements of H(z) to 1.8% and 1.7% at the
same redshifts. Forecasts for Lyman alpha forest constraints predict a
measurement of an overall dilation factor that scales the highly degenerate
D_A(z) and H^{-1}(z) parameters to an accuracy of 1.9% at z~2.5 when the survey
is complete. Here, we provide an overview of the selection of spectroscopic
targets, planning of observations, and analysis of data and data quality of
BOSS.
Comment: 49 pages, 16 figures, accepted by A
The Ninth Data Release of the Sloan Digital Sky Survey: First Spectroscopic Data from the SDSS-III Baryon Oscillation Spectroscopic Survey
The Sloan Digital Sky Survey III (SDSS-III) presents the first spectroscopic
data from the Baryon Oscillation Spectroscopic Survey (BOSS). This ninth data
release (DR9) of the SDSS project includes 535,995 new galaxy spectra (median
z=0.52), 102,100 new quasar spectra (median z=2.32), and 90,897 new stellar
spectra, along with the data presented in previous data releases. These spectra
were obtained with the new BOSS spectrograph and were taken between 2009
December and 2011 July. In addition, the stellar parameters pipeline, which
determines radial velocities, surface temperatures, surface gravities, and
metallicities of stars, has been updated and refined with improvements in
temperature estimates for stars with T_eff<5000 K and in metallicity estimates
for stars with [Fe/H]>-0.5. DR9 includes new stellar parameters for all stars
presented in DR8, including stars from SDSS-I and II, as well as those observed
as part of the SDSS-III Sloan Extension for Galactic Understanding and
Exploration-2 (SEGUE-2).
The astrometry error introduced in the DR8 imaging catalogs has been
corrected in the DR9 data products. The next data release for SDSS-III will be
in Summer 2013, which will present the first data from the Apache Point
Observatory Galactic Evolution Experiment (APOGEE) along with another year of
data from BOSS, followed by the final SDSS-III data release in December 2014.
Comment: 9 figures; 2 tables. Submitted to ApJS. DR9 is available at
http://www.sdss3.org/dr
The Fourteenth Data Release of the Sloan Digital Sky Survey: First Spectroscopic Data from the extended Baryon Oscillation Spectroscopic Survey and from the second phase of the Apache Point Observatory Galactic Evolution Experiment
The fourth generation of the Sloan Digital Sky Survey (SDSS-IV) has been in
operation since July 2014. This paper describes the second data release from
this phase, and the fourteenth from SDSS overall (making this Data Release
Fourteen, or DR14). This release makes public data taken by SDSS-IV in its first
two years of operation (July 2014-2016). Like all previous SDSS releases, DR14
is cumulative, including the most recent reductions and calibrations of all
data taken by SDSS since the first phase began operations in 2000. New in DR14
is the first public release of data from the extended Baryon Oscillation
Spectroscopic Survey (eBOSS); the first data from the second phase of the
Apache Point Observatory (APO) Galactic Evolution Experiment (APOGEE-2),
including stellar parameter estimates from an innovative data-driven machine
learning algorithm known as "The Cannon"; and almost twice as many data cubes
from the Mapping Nearby Galaxies at APO (MaNGA) survey as were in the previous
release (N = 2812 in total). This paper describes the location and format of
the publicly available data from SDSS-IV surveys. We provide references to the
important technical papers describing how these data have been taken (both
targeting and observation details) and processed for scientific use. The SDSS
website (www.sdss.org) has been updated for this release, and provides links to
data downloads, as well as tutorials and examples of data use. SDSS-IV is
planning to continue to collect astronomical data until 2020, and will be
followed by SDSS-V.
Comment: SDSS-IV collaboration alphabetical author data release paper. DR14
happened on 31st July 2017. 19 pages, 5 figures. Accepted by ApJS on 28th Nov
2017 (this is the "post-print" and "post-proofs" version; minor corrections
only from v1, and most of errors found in proofs corrected
HIV infection of the male genital tract – consequences for sexual transmission and reproduction
Despite semen being the main vector of human immunodeficiency virus (HIV) dissemination worldwide, the origin of the virus in this bodily fluid remains unclear. It was recently shown that several organs of the male genital tract (MGT) are infected by HIV/simian immunodeficiency virus (SIV) and likely to contribute to semen viral load during the primary and chronic stages of the infection. These findings are important in helping answer the following questions: (i) does the MGT constitute a viral reservoir responsible for the persistence of virus release into the semen of a subset of HIV-infected men under antiretroviral therapy, who otherwise show an undetectable blood viral load? (ii) What is the aetiology of the semen abnormalities observed in asymptomatic HIV-infected men? (iii) What is the exact nature of the interactions between the spermatozoa, their testicular progenitors and HIV, an important issue in the context of assisted reproductive techniques proposed for HIV-seropositive (HIV+) men? Answers to these questions are crucial for the design of new therapeutic strategies aimed at eradicating the virus from the genital tract of HIV+ men – thus reducing its sexual transmission – and for improving the care of serodiscordant couples wishing to have children. This review summarizes the most recent literature on HIV infection of the male genital tract, discusses the above issues in light of the latest findings and highlights future directions of research.