Key Phrase Extraction of Lightly Filtered Broadcast News
This paper explores the impact of light filtering on automatic key phrase
extraction (AKE) applied to Broadcast News (BN). Key phrases are words and
expressions that best characterize the content of a document. Key phrases are
often used to index the document or as features in further processing. This
makes improvements in AKE accuracy particularly important. We hypothesized that
filtering out marginally relevant sentences from a document would improve AKE
accuracy. Our experiments confirmed this hypothesis. Eliminating as little
as 10% of the document sentences led to a 2% improvement in AKE precision and
recall. Our AKE method is built on the MAUI toolkit, which follows a supervised learning
approach. We trained and tested our AKE method on a gold standard made of 8 BN
programs containing 110 manually annotated news stories. The experiments were
conducted within a Multimedia Monitoring Solution (MMS) system for TV and radio
news/programs, running daily, and monitoring 12 TV and 4 radio channels.
Comment: In 15th International Conference on Text, Speech and Dialogue (TSD 2012)
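The light-filtering idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how sentence relevance is scored, so the scoring function below (mean in-document word count) is purely an assumption, as are all function names; the MAUI-based extractor itself is not reproduced here. The sketch only shows dropping the lowest-scoring 10% of sentences before extraction and measuring exact-match precision and recall against a gold standard.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break on whitespace that follows '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentence_score(sentence, doc_counts):
    """Hypothetical relevance score: mean in-document count of the sentence's words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    if not words:
        return 0.0
    return sum(doc_counts[w] for w in words) / len(words)


def light_filter(text, drop_ratio=0.10):
    """Drop the lowest-scoring share of sentences, keeping the original order."""
    sentences = split_sentences(text)
    doc_counts = Counter(re.findall(r"[a-z']+", text.lower()))
    n_drop = int(len(sentences) * drop_ratio)
    # Sentence indices in ascending score order; the first n_drop are removed.
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: sentence_score(sentences[i], doc_counts),
    )
    dropped = set(ranked[:n_drop])
    return [s for i, s in enumerate(sentences) if i not in dropped]


def precision_recall(extracted, gold):
    """Exact-match precision and recall between extracted and gold key phrases."""
    extracted, gold = set(extracted), set(gold)
    hits = len(extracted & gold)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall
```

In such a setup, the filtered sentences would be joined back into a document and passed to the key phrase extractor, and `precision_recall` would be computed with and without filtering to compare the two conditions.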
A Factoid Question Answering System for Vietnamese
In this paper, we describe the development of an end-to-end factoid question
answering system for the Vietnamese language. This system combines both
statistical models and ontology-based methods in a chain of processing modules
to provide high-quality mappings from natural language text to entities. We
present the challenges in the development of such an intelligent user interface
for an isolating language like Vietnamese and show that techniques developed
for inflectional languages cannot be applied "as is". Our question answering
system can answer a wide range of general knowledge questions with promising
accuracy on a test set.
Comment: In the proceedings of the HQA'18 workshop, The Web Conference Companion, Lyon, France
A Review on Extraction and Recommendation of Educational Resources from WWW
Keyphrases provide a simple way of describing a document, giving the reader some clues about its contents. Wrapper adaptation aims at automatically adapting a previously learned wrapper from the source Web site to a new unseen site for information extraction. It is based on a generative model for the generation of text fragments related to attribute items and formatting data in a Web page. To solve the wrapper adaptation problem, we consider two kinds of information from the source Web site. The first kind is the extraction knowledge contained in the previously learned wrapper from the source Web site. The second kind is the previously extracted or collected items. We use a Bayesian learning approach to automatically select a set of training examples for adapting a wrapper to the new unseen site. To solve the new attribute discovery problem, we develop a model that analyzes the surrounding text fragments of the attributes in the new unseen site. A Bayesian learning method is developed to discover the new attributes and their headers. We conduct extensive experiments on a number of real-world Web sites to demonstrate the effectiveness of our framework. Keyphrases can be useful in a variety of applications, such as retrieval engines, browsing interfaces, thesaurus construction, text mining, and so on. There are also other tasks for which keyphrases are useful
- …