    Going beyond traditional QA systems: challenges and keys in opinion question answering

    The treatment of factual data has been widely studied in different areas of Natural Language Processing (NLP). However, processing subjective information still poses important challenges. This paper presents research aimed at assessing techniques that have been suggested as appropriate for subjective Opinion Question Answering (OQA). We evaluate the performance of an OQA system with these new components and propose methods to tackle the issues encountered. We assess the impact of including additional resources and processes with the purpose of improving system performance on two distinct blog datasets. The improvements obtained for the different combinations of tools are statistically significant. We thus conclude that the proposed approach is adequate for the OQA task, offering a good strategy to deal with opinionated questions. This paper has been partially supported by Ministerio de Ciencia e Innovación - Spanish Government (grant no. TIN2009-13391-C04-01), and Conselleria d'Educació - Generalitat Valenciana (grant no. PROMETEO/2009/119 and ACOMP/2010/286).

    KHDC1B Is a Novel CPEB Binding Partner Specifically Expressed in Mouse Oocytes and Early Embryos

    The expression and activities of two members of a novel KH domain protein family were tested. KHDC1A and KHDC1B are highly expressed in oocytes. Based on ectopic expression, KHDC1A regulates apoptosis whereas KHDC1B interacts with the cytoplasmic polyadenylation machinery. KHDC1B may serve as a translational repressor during oocyte maturation.

    QMOS: Query-based multi-documents opinion-oriented summarization

    Sentiment analysis concerns the study of opinions expressed in a text. This paper presents the QMOS method, which combines sentiment analysis and summarization approaches. It is a lexicon-based method for query-based multi-document summarization of opinions expressed in reviews. QMOS combines multiple sentiment dictionaries to overcome the limited word coverage of any individual lexicon. A major problem for a dictionary-based approach is the semantic gap between the prior polarity of a word given by a lexicon and the word's polarity in a specific context, because the polarity of a word depends on the context in which it is used. Furthermore, the type of a sentence can also affect the performance of a sentiment analysis approach. To tackle these challenges, QMOS integrates multiple strategies to adjust a word's prior sentiment orientation while also considering the type of sentence. QMOS also employs the Semantic Sentiment Approach to determine the sentiment score of a word that is not included in a sentiment lexicon. In addition, most existing methods fail to distinguish the meaning of a review sentence from that of the user's query when both share a similar bag of words; hence there is often a conflict between the extracted opinionated sentences and the user's needs. The summarization phase of QMOS, however, avoids extracting a review sentence whose similarity with the user's query is high but whose meaning is different. The method also employs a greedy algorithm to reduce redundancy and a query expansion approach to bridge the lexical gaps between similar contexts that are expressed using different wording. Our experiment shows that QMOS significantly improves performance and is comparable to other existing methods.
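
The lexicon-combination and context-adjustment ideas in this abstract can be illustrated with a minimal sketch. It is not the QMOS implementation: the two toy lexicons, the averaging rule, and the single negation rule are all assumptions made for illustration, since the abstract does not specify which dictionaries or adjustment strategies are used.

```python
from typing import Dict, List, Optional

# Hypothetical toy lexicons (assumption: the real dictionaries are not named in the abstract).
LEXICON_A: Dict[str, float] = {"good": 1.0, "bad": -1.0}
LEXICON_B: Dict[str, float] = {"good": 0.5, "awful": -0.8}

# Assumed negation cues; one simple way to adjust a word's prior polarity in context.
NEGATORS = {"not", "never", "no"}


def prior_polarity(word: str, lexicons: List[Dict[str, float]]) -> Optional[float]:
    """Average the scores of every lexicon that covers the word.

    Returns None when no lexicon contains the word, which is the case
    where a fallback scorer would be needed.
    """
    scores = [lex[word] for lex in lexicons if word in lex]
    if not scores:
        return None
    return sum(scores) / len(scores)


def sentence_score(tokens: List[str], lexicons: List[Dict[str, float]]) -> float:
    """Sum word polarities, flipping the sign of the next scored word after a negator."""
    total = 0.0
    negate = False
    for tok in tokens:
        word = tok.lower()
        if word in NEGATORS:
            negate = True
            continue
        score = prior_polarity(word, lexicons)
        if score is None:
            continue  # uncovered word: skipped here, negation stays pending
        total += -score if negate else score
        negate = False
    return total
```

Combining lexicons this way increases coverage ("awful" is only in the second toy lexicon), while the negation flag shows how a prior polarity can be adjusted by context.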

    IBEREVAL OM: Minería de opiniones en los nuevos géneros textuales

    The increasing amount of subjective data on the Web is creating the need to develop effective Question Answering systems able to discriminate such information from factual data, and subsequently process it with specific methods. The participants in the IBEREVAL OM tasks will be given a set of opinion questions (in Spanish and English). Optionally, they will also be able to receive the same set of opinion questions with the source, target and expected polarity, as well as the time span the question refers to. They will also be provided with a collection of blog posts, extracted using the Technorati blog search engine (in Spanish and English), in which the answers to the opinion questions should be found. The gold standard for this blog post collection will be annotated beforehand using the EmotiBlog scheme by three annotators. The EmotiBlog corpus and the set of questions presented in (Balahur et al., 2009), in their present state, will be provided for system training. The participants will be able to participate in two subtasks: 1) in the first one, they will be asked to provide the list of answers to each of the questions (in the same language as the questions, or in the other language); 2) in the second one, they will be asked to provide a summary of the question answers, that is, the top x% of the most important answers, in a non-redundant manner. The Gold Standard for the summaries will be automatically extracted from the manual annotations, taking into account the “intensity” parameter of the opinions expressed. This evaluation task proposal has been partially supported by Ministerio de Ciencia e Innovación - Spanish Government (grant no. TIN2009-13391-C04-01), and Conselleria d'Educació - Generalitat Valenciana (grant no. PROMETEO/2009/119 and ACOMP/2010/288).
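
The second subtask (keep the top x% of the most important answers without redundancy) can be sketched as follows. This is only an illustration under stated assumptions: the task description does not say how redundancy is measured or how intensity is used for ranking, so the word-overlap (Jaccard) test, the `max_overlap` threshold, and the numeric intensity values are all hypothetical.

```python
import math
from typing import List, Tuple


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two answers (assumed redundancy measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def top_percent_summary(
    answers: List[Tuple[str, int]], percent: float, max_overlap: float = 0.5
) -> List[str]:
    """Greedily keep the highest-intensity answers, skipping near-duplicates.

    `answers` holds (text, intensity) pairs; at most ceil(percent% of them)
    are returned. `max_overlap` is a hypothetical redundancy threshold.
    """
    budget = math.ceil(len(answers) * percent / 100)
    ranked = sorted(answers, key=lambda pair: pair[1], reverse=True)
    summary: List[str] = []
    for text, _ in ranked:
        if len(summary) >= budget:
            break
        if all(jaccard(text, kept) <= max_overlap for kept in summary):
            summary.append(text)
    return summary
```

For example, with four answers and `percent=50`, the budget is two answers; a near-duplicate of the top-ranked answer is skipped in favour of the next distinct one.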

    Hepatitis C Virus-Related Lymphomagenesis in a Mouse Model

    B cell non-Hodgkin lymphoma is a typical extrahepatic manifestation frequently associated with hepatitis C virus (HCV) infection. The mechanism by which HCV infection leads to lymphoproliferative disorder remains unclear. Our group established HCV transgenic mice that expressed the full HCV genome in B cells (RzCD19Cre mice). We observed a 25.0% incidence of diffuse large B cell non-Hodgkin lymphomas (22.2% in male and 29.6% in female mice) within 600 days of birth. Interestingly, RzCD19Cre mice with substantially elevated serum-soluble interleukin-2 receptor α-subunit (sIL-2Rα) levels (>1000 pg/mL) developed B cell lymphomas. Another mouse model of lymphoproliferative disorder was established by persistent expression of HCV structural proteins through disruption of interferon regulatory factor-1 (irf-1−/−/CN2 mice). Irf-1−/−/CN2 mice showed extremely high incidences of lymphomas and lymphoproliferative disorders. Moreover, these mice showed increased levels of interleukin (IL)-2, IL-10, and Bcl-2 as well as increased Bcl-2 expression, which promoted oncogenic transformation of lymphocytes.

    The helicase-like domain from "Thermotoga maritima" reverse gyrase : catalytic cycle and contribution to DNA supercoiling

    Reverse gyrases are the only topoisomerases capable of introducing positive supercoils into circular DNA. Their exclusive presence in thermophilic and hyperthermophilic organisms indicates a DNA thermoprotective role in vivo. In spite of the efforts to improve our knowledge of reverse gyrase, modest progress has been made since its discovery. Currently, only one crystal structure of the enzyme is available, and the most widely accepted reaction mechanism is a hypothetical one, mostly derived from the functions of enzymes related to reverse gyrase domains. In the present work we address mechanistic aspects of the reaction by exploiting the capabilities of a wide range of techniques to elucidate the role of one module of T. maritima reverse gyrase. Reverse gyrase consists of an N-terminal helicase-like domain fused to a C-terminal topoisomerase domain. We selected the helicase-like domain as a model of study due to its capacity to couple ATP binding and hydrolysis to DNA processing. The exploitation of these features by reverse gyrase makes this region a key player at virtually every step of DNA supercoiling. Steady-state ATPase assays and equilibrium binding titrations with the helicase-like domain and the full-length enzyme enabled us to prove for the first time a harnessing effect of the topoisomerase domain over the helicase-like domain. We showed that properties intrinsic to the helicase-like domain, such as DNA-stimulated ATP hydrolysis, a nucleotide-dependent affinity switch for DNA, and thermodynamic coupling between DNA binding and ATP binding and hydrolysis, are strongly reduced in the context of reverse gyrase. Apparent contradictions then arose from reports stating that the isolated helicase-like domain is less active than within the context of the full-length enzyme. We reconciled these differences by demonstrating that the presence of the putative N-terminal Zn-finger in the helicase-like domain construct is the cause of the decreased activity.
Furthermore, we have elucidated the thermodynamic and conformational cycle of the helicase-like domain, and predicted the stages fulfilling the requirements for interdomain communication and local duplex DNA unwinding, as well as the stages where DNA is in a suitable state to support the supercoiling reaction. Finally, besides the use of smFRET as a tool to investigate conformational changes in solution, we have also provided high-resolution snapshots of the helicase-like domain via X-ray crystallography. We have provided the most detailed structures of this region to date, in the apo and ADP-bound forms. They also revealed high flexibility of the linker joining the RecA domains, with relative orientations far from random, and local differences in secondary structure motifs that rule out the assumption that all reverse gyrases have a “monolithic” build-up. We also created a deletion mutant of the latch, a region with a unique location that is well suited for interdomain communication. Previous reports stated that its deletion from reverse gyrase abolishes positive supercoiling. We demonstrated its strong involvement in DNA binding, DNA-stimulated ATP hydrolysis, and thermodynamic coupling between these processes in the isolated helicase-like domain. We also revealed its role in presenting the ssDNA to the topoisomerase domain and in guiding the strand passage and resealing, ensuring the directionality leading to the introduction of positive supercoils. We also elucidated the nucleotide cycle and conformational transitions for this helicase-like domain mutant, which gave the first indications of why the full-length reverse gyrase lacking the latch cannot perform positive supercoiling and only DNA relaxation is allowed. Finally, our pre-steady-state kinetic studies allowed us to fully describe the unstimulated ATPase activity of the isolated helicase-like domain.
We also demonstrated for the first time its DNA unwinding activity, shedding light on the rarely documented local B-DNA duplex destabilization by helicase-like modules appended to larger enzymes. Additionally, the sequence of ssDNA strand release and the secondary structure motifs involved in ssDNA binding at different stages were determined. Together with the finding of new conformational states via smFRET and “targeted” supercoiling assays with the full-length enzyme, these results led us to propose a detailed catalytic mechanism, similar to the one derived from the reverse gyrase structure, only this time based on and supported by a combination of kinetic, thermodynamic, and structural data.

    Résumé automatique de textes d'opinion

    In this paper, we present a summarization system that is specifically designed to process blog posts, where factual information is mixed with opinions on the discussed facts. Our approach combines redundancy analysis with new information tracking and is enriched by a module that computes the polarity of textual fragments in order to summarize blog posts more efficiently. The system is evaluated on English data, in particular through participation in TAC (Text Analysis Conference), an international evaluation framework for automatic summarization, in which our system obtained satisfactory results.