
    Learning to harvest information for the semantic web

    This work was carried out within the AKT project (www.aktors.org), sponsored by the UK Engineering and Physical Sciences Research Council (grant GR/N15764/01), and the Dot.Kom project (www.dot-kom.org), sponsored by the EU IST as part of Framework V (grant IST-2001-34038). In this paper we describe a methodology for harvesting information from large distributed repositories (e.g. large Web sites) with minimum user intervention. The methodology is based on a combination of information extraction, information integration and machine learning techniques. Learning is seeded by extracting information from structured sources (e.g. databases and digital libraries) or a user-defined lexicon. The retrieved information is then used to partially annotate documents. The annotated documents are used to bootstrap learning for simple Information Extraction (IE) methodologies, which in turn produce more annotations; these are used to annotate more documents, which are then used to train more complex IE engines, and so on. We describe the methodology and its implementation in the Armadillo system, compare it with the current state of the art, and describe the details of an implemented application. Finally, we draw some conclusions and highlight some challenges and future work.
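
    The bootstrapping loop described in this abstract can be illustrated with a minimal sketch. It is not the Armadillo implementation: the function names (`annotate`, `learn_patterns`, `extract`, `bootstrap`), the use of a single preceding word as the "simple IE" pattern, and the two-capitalized-words shape of a candidate are all assumptions made for illustration.

```python
import re
from collections import Counter

def annotate(docs, lexicon):
    """Partially annotate documents: mark every occurrence of a known term."""
    spans = []
    for doc in docs:
        for term in lexicon:
            for m in re.finditer(re.escape(term), doc):
                spans.append((doc, m.start(), m.end()))
    return spans

def learn_patterns(spans, min_support=2):
    """Learn cue words: words that directly precede annotated terms
    at least min_support times (a stand-in for a simple IE learner)."""
    counts = Counter()
    for doc, start, _ in spans:
        left = doc[:start].split()
        if left:
            counts[left[-1]] += 1
    return {word for word, n in counts.items() if n >= min_support}

def extract(docs, cues):
    """Apply learned cues: take two capitalized words following a cue word."""
    found = set()
    for doc in docs:
        for cue in cues:
            pattern = re.escape(cue) + r"\s+([A-Z][a-z]+ [A-Z][a-z]+)"
            for m in re.finditer(pattern, doc):
                found.add(m.group(1))
    return found

def bootstrap(docs, seed, rounds=3):
    """Seed -> annotate -> learn -> extract -> grow the lexicon, repeated."""
    lexicon = set(seed)
    for _ in range(rounds):
        cues = learn_patterns(annotate(docs, lexicon))
        new_terms = extract(docs, cues) - lexicon
        if not new_terms:  # nothing new was harvested, so stop early
            break
        lexicon |= new_terms
    return lexicon
```

    Starting from a seed lexicon of a few known names, each round annotates the documents, learns which context words signal a name, and uses those cues to harvest names that were not in the seed. A real system would need to validate the extracted candidates before feeding them back, since errors otherwise compound across rounds.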

    Machine aided indexing from natural language text

    The NASA Lexical Dictionary (NLD) Machine Aided Indexing (MAI) system was designed to (1) reuse the indexing of the Defense Technical Information Center (DTIC); (2) reuse the indexing of the Department of Energy (DOE); and (3) reduce the time required for original indexing. This was done by automatically generating appropriate NASA thesaurus terms either from the other agencies' index terms or, for original indexing, from document titles and abstracts. The NASA STI Program staff devised two different ways to generate thesaurus terms from text. The first group of programs identified noun phrases by a parsing method that allowed for conjunctions and certain prepositions, on the assumption that indexable concepts are found in such phrases. Results were not always satisfactory, and it was noted that indexable concepts often occurred outside of noun phrases. The first method also proved to be too slow for the ultimate goal of interactive (online) MAI. The second group of programs used the knowledge base (KB), word proximity, and frequency of word and phrase occurrence to identify indexable concepts. Both methods are described and illustrated. Online MAI has been achieved, as well as several spinoff benefits, which are also described.

    Annual Report 1999 / Department for Computer Science

    Presentation of the Department for Computer Science of the BTU Cottbus and reports of the chairs at the department for the year 1999.

    Using NLP to Define the Scope for Stakeholder Assessment of Simulated Service Qualities

    The paper defines the scope of research activities aimed at involving business stakeholders in the software process by having them assess the perceived quality of a service-oriented system in its usage context, when the initial specification of the system is available in natural language form. We propose to use NLP techniques to extract the scope from this specification and to represent it in the format of specific predesign models compatible with the rest of the simulation solution.

    Composite Semantic Relation Classification

    Semantic interpretation tasks such as text entailment and question answering require the classification of semantic relations between terms or entities within text. However, in most cases it is not possible to assign a direct semantic relation between entities/terms. This paper proposes an approach for composite semantic relation classification, extending the traditional semantic relation classification task. Unlike existing approaches, which use machine learning models built over lexical and distributional word vector features, the proposed model combines a large commonsense knowledge base of binary relations, a distributional navigational algorithm and sequence classification to provide a solution for the composite semantic relation classification problem.

    Conversion rules between the class diagram and Sowa's conceptual graphs

    Converting a model at a lower level of abstraction into one at a higher level of abstraction facilitates communication among those involved in a software development process. Conceptual graphs are diagrams that present the modeled information in a semiformal way and can be understood by both humans and computers. The class diagram, in contrast, presents the main classes, attributes, operations and relationships of a system in a language specific to experts in software modeling. This article proposes a set of conversion rules for translating the class diagram (more detailed and, consequently, at a low level of abstraction) into a form that is more comprehensible to the stakeholder (and at a higher level of abstraction), namely Sowa's conceptual graphs.