    OTMM based Proposal Classification and Clustering

    In the current environment, an important task for any agency (government or private) is selecting suitable research proposals. When a large number of proposals is received, they must be grouped into their respective disciplines. Research proposals therefore need to be classified into the proper categories automatically, which speeds up the classification work. The technique used for proposal classification is OTMM. Based on the resulting discipline, each proposal is assigned to the appropriate experts for verification and review.
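
The grouping step described above can be illustrated with a minimal sketch. It is not the OTMM technique itself, whose details the abstract does not give; it assumes each discipline is described by a small keyword set (the hypothetical `DISCIPLINE_KEYWORDS` below) and assigns a proposal to the discipline with the largest keyword overlap.

```python
import re
from collections import defaultdict

# Hypothetical discipline vocabularies, invented for illustration only.
# A real system would derive these from a domain ontology.
DISCIPLINE_KEYWORDS = {
    "computer science": {"algorithm", "software", "network", "database"},
    "biology": {"gene", "protein", "cell", "species"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def classify(proposal, vocab=DISCIPLINE_KEYWORDS):
    """Return the discipline sharing the most keywords with the proposal,
    or None when no keyword matches at all."""
    tokens = tokenize(proposal)
    scores = {name: len(tokens & keywords) for name, keywords in vocab.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def group_by_discipline(proposals):
    """Group proposal texts under their predicted discipline."""
    groups = defaultdict(list)
    for proposal in proposals:
        groups[classify(proposal)].append(proposal)
    return dict(groups)
```

Proposals that match no keyword are grouped under `None`, so they can be routed to manual triage instead of being forced into a discipline.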

    Learning to identify single-snippet answers to definition questions

    We present a learning-based method to identify single-snippet answers to definition questions in question answering systems for document collections. Our method combines and extends two previous techniques that were based mostly on manually crafted lexical patterns and WordNet hypernyms. We train a Support Vector Machine (SVM) on vectors comprising the verdicts or attributes of the previous techniques, and additional phrasal attributes that we acquire automatically. The SVM is then used to identify and rank single 250-character snippets that contain answers to definition questions. Experimental results indicate that our method clearly outperforms the techniques it builds upon.
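
As a rough illustration of the candidate-generation step only, the sketch below cuts a window of at most 250 characters around each occurrence of the term to be defined. The function name `snippets_around` and the centering rule are assumptions, not taken from the paper. The SVM that classifies and ranks the snippets is not reproduced here.

```python
def snippets_around(text, term, width=250):
    """Return one window of at most `width` characters per occurrence of `term`.

    Matching is case-insensitive. The window is centered on the term where
    possible and shifted inward when it would cross a text boundary.
    """
    if not term:
        return []
    snippets = []
    lower_text, lower_term = text.lower(), term.lower()
    start = lower_text.find(lower_term)
    while start != -1:
        center = start + len(term) // 2
        left = max(0, center - width // 2)
        right = min(len(text), left + width)
        left = max(0, right - width)  # re-anchor if clipped at the end of the text
        snippets.append(text[left:right])
        start = lower_text.find(lower_term, start + 1)
    return snippets
```

Each returned snippet would then be converted into a feature vector and scored by the trained classifier. That part depends on the pattern and hypernym attributes described in the abstract.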

    Apports de la linguistique dans les systèmes de recherche d'informations précises

    Searching for precise answers to questions, also called "question-answering", is an evolution of information retrieval systems: can it, like its predecessors, rely mostly on numeric methods that use very little linguistic knowledge? After a presentation of the question-answering task and the issues it raises, we examine to what extent it can be performed with very little linguistic knowledge. We then review the different kinds of linguistic knowledge that researchers have been using in their systems: syntactic and semantic knowledge for sentence analysis, the role of "named entity" recognition, and the textual dimension of documents. A discussion of the respective contributions of linguistic and non-linguistic methods concludes the paper.

    Enhanced web-based summary generation for search.

    After a user types in a search query on a major search engine, they are presented with a number of search results. Each search result is made up of a title, a brief text summary and a URL. It is then the user's job to select documents for further review. Our research aims to improve the accuracy of users selecting relevant documents by improving the way these web pages are summarized. Improvements in accuracy will lead to time improvements and user experience improvements. We propose ReClose, a system for generating web document summaries. ReClose generates summary content by combining summarization techniques from query-biased and query-independent summary generation. Query-biased summaries generally provide query terms in context. Query-independent summaries focus on summarizing documents as a whole. Combining these summary techniques led to a 10% improvement in user decision making over Google-generated summaries. Color-coded ReClose summaries provide keyword usage depth at a glance and also alert users to topic departures. Color-coding further enhanced ReClose results and led to a 20% improvement in user decision making over Google-generated summaries. Many online documents include structure and multimedia of various forms such as tables, lists, forms and images. We propose to include this structure in web page summaries. We found that the expert user was insignificantly slowed in decision making, while the majority of average users made decisions more quickly using summaries including structure, without any decrease in decision accuracy. We additionally extended ReClose for use in summarizing large numbers of tweets in tracking flu outbreaks in social media. The resulting summaries have variable length and are effective at summarizing flu-related trends. Users of the system obtained an accuracy of 0.86 when labeling multi-tweet summaries. This showed that the basis of ReClose is effective outside of web documents and that variable-length summaries can be more effective than fixed-length ones. Overall, the ReClose system provides unique summaries that contain more informative content than current search engines produce, highlight the results in a more meaningful way, and add structure when meaningful. The applications of ReClose extend far beyond search and have been demonstrated in summarizing pools of tweets.
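
The idea of combining query-biased and query-independent summarization can be sketched as a weighted sentence score. This is an illustrative assumption, not the ReClose algorithm: the scoring formulas, the `query_weight` parameter, and the function names are all hypothetical.

```python
import re
from collections import Counter


def words(text):
    """Lowercase alphanumeric tokens of the text."""
    return re.findall(r"[a-z0-9]+", text.lower())


def summarize(document, query, max_sentences=2, query_weight=0.5):
    """Pick the top-scoring sentences and return them in document order.

    Query-biased score: fraction of a sentence's tokens that are query terms.
    Query-independent score: mean document frequency of the sentence's tokens,
    normalized by the most frequent token in the document.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    doc_freq = Counter(words(document))
    query_terms = set(words(query))
    scored = []
    for index, sentence in enumerate(sentences):
        tokens = words(sentence)
        if not tokens:
            continue
        biased = sum(1 for t in tokens if t in query_terms) / len(tokens)
        independent = sum(doc_freq[t] for t in tokens) / (len(tokens) * max(doc_freq.values()))
        score = query_weight * biased + (1 - query_weight) * independent
        scored.append((score, index, sentence))
    top = sorted(scored, key=lambda item: -item[0])[:max_sentences]
    return [sentence for _, _, sentence in sorted(top, key=lambda item: item[1])]
```

Setting `query_weight` to 1.0 yields a purely query-biased summary and 0.0 a purely query-independent one. Intermediate values blend the two signals.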