    Ontologies and Information Extraction

    This report argues that, even in the simplest cases, IE is an ontology-driven process: it is not a mere text-filtering method based on simple pattern matching and keywords, because the extracted pieces of text are interpreted with respect to a predefined partial domain model. The report shows that, depending on the nature and depth of the interpretation required to extract the information, more or less knowledge must be involved. It is illustrated mainly with examples from biology, a domain with critical needs for content-based exploration of the scientific literature and one that is becoming a major application domain for IE.
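
    The report's distinction can be made concrete in a few lines. The following Python sketch, in which every pattern, class name, and relation is invented for illustration, contrasts plain keyword filtering with ontology-driven extraction, where each match is interpreted against a partial domain model that fixes its class and the relations it may enter.

```python
import re

# Toy partial domain model: classes, surface patterns, and the relations
# each class may enter (all invented for illustration).
DOMAIN_MODEL = {
    "Protein": {"patterns": [r"\bIL-\d+\b", r"\bp53\b"],
                "relations": ["interacts_with", "activates"]},
    "Gene":    {"patterns": [r"\b[A-Z]+\d*\s+gene\b"],
                "relations": ["codes_for", "is_expressed_in"]},
}

def keyword_filter(text, keywords):
    # Mere filtering: returns strings, with no commitment to what they mean.
    return [w for w in keywords if w in text]

def ontology_driven_extract(text):
    # Interpretation: each match is assigned a class from the partial
    # domain model, which constrains the relations it can participate in.
    entities = []
    for cls, spec in DOMAIN_MODEL.items():
        for pat in spec["patterns"]:
            for m in re.finditer(pat, text):
                entities.append({"text": m.group(0), "class": cls,
                                 "allowed_relations": spec["relations"]})
    return entities

sentence = "IL-2 activates transcription of the CD25 gene."
print(keyword_filter(sentence, ["IL-2", "gene"]))
print(ontology_driven_extract(sentence))
```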

    A Survey on Data Integration in Data Warehouse

    Data warehousing embraces the technology of integrating data from multiple distributed data sources and using that data, in an annotated and aggregated form, to support business decision-making and enterprise management. Although many techniques have been revisited or newly developed in the context of data warehouses, such as view maintenance and OLAP, little attention has been paid to data mining techniques for supporting the most important and costly tasks of data integration in data warehouse design.
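
    As a rough illustration of the integration task the survey highlights, the following Python sketch reconciles two operational sources into an annotated, aggregated form suitable for OLAP-style rollups; the source schemas, field names, and exchange rate are all hypothetical.

```python
from collections import defaultdict

# Two hypothetical operational sources with conflicting schemas and units.
crm_orders = [{"cust": "ACME", "amount": 120.0, "ccy": "USD"}]
erp_orders = [{"customer_name": "ACME", "total_eur": 95.0}]

EUR_USD = 1.08  # assumed fixed rate, purely for illustration

def to_warehouse(row, source):
    # Resolve naming and unit conflicts, annotating each fact with its
    # provenance (the "annotated form" the abstract refers to).
    if source == "crm":
        return {"customer": row["cust"], "usd": row["amount"], "src": "crm"}
    return {"customer": row["customer_name"],
            "usd": row["total_eur"] * EUR_USD, "src": "erp"}

facts = ([to_warehouse(r, "crm") for r in crm_orders]
         + [to_warehouse(r, "erp") for r in erp_orders])

# Aggregate per customer, as an OLAP rollup over the integrated facts.
totals = defaultdict(float)
for f in facts:
    totals[f["customer"]] += f["usd"]
print(dict(totals))  # {'ACME': 222.6} (up to float rounding)
```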

    From logical forms to SPARQL query with GETARUNS

    We present a system for Question Answering which computes a prospective answer from Logical Forms produced by a full-fledged NLP system for text understanding, and then maps the result onto schemata in SPARQL for accessing the Semantic Web. As an intermediate step, whenever there are complex concepts to be mapped, the system looks for a corresponding amalgam among YAGO classes. It is precisely the internal structure of the Logical Form that enables us to produce a suitable and meaningful context for concept disambiguation. Logical Forms are the final output of a complex text-understanding system, GETARUNS, which can deal with different levels of syntactic and semantic ambiguity in generating the final structure by accessing computational lexica equipped with subcategorization frames and appropriate selectional restrictions applied to the attachment of complements and adjuncts. The system also resolves pronominal binding and, where needed, instantiates implicit arguments in order to complete the required Predicate-Argument structure licensed by the semantic component.
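
    A minimal sketch may clarify the Logical-Form-to-SPARQL step; the predicate dictionary and the DBpedia-style identifiers below are illustrative stand-ins, not GETARUNS' actual schemata or its YAGO amalgams.

```python
# Hypothetical predicate-argument structure, loosely in the spirit of a
# Logical Form: a predicate with role-labelled arguments and a wh-focus.
lf = {"predicate": "direct",
      "args": {"agent": "?x", "theme": "Vertigo"},
      "focus": "?x"}

# Assumed mapping from LF predicates to ontology properties (stand-ins
# for the amalgam lookup the paper performs against YAGO classes).
PRED2PROP = {"direct": "dbo:director"}

def lf_to_sparql(lf):
    # "Vertigo was directed by ?x" becomes one triple pattern.
    prop = PRED2PROP[lf["predicate"]]
    subj = "dbr:" + lf["args"]["theme"].replace(" ", "_")
    return (f'SELECT {lf["focus"]} WHERE {{ '
            f'{subj} {prop} {lf["focus"]} . }}')  # prefix decls omitted

print(lf_to_sparql(lf))
# SELECT ?x WHERE { dbr:Vertigo dbo:director ?x . }
```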

    TEXT MINING AND TEMPORAL TREND DETECTION ON THE INTERNET FOR TECHNOLOGY ASSESSMENT: MODEL AND TOOL

    In today's world, organizations conduct technology assessment (TAS) before deciding on investments in existing, emerging, and hot technologies, to avoid costly mistakes and to survive in a hyper-competitive business environment. Relying on web search engines to find information relevant to TAS processes, decision makers face an abundance of unstructured information that limits their ability to assess technologies within a reasonable time frame. The question thus arises: how can valuable TAS knowledge be extracted from a diverse corpus of textual data on the web? To address this question, this paper presents a web-based model and tool for knowledge mapping. The proposed knowledge maps are constructed with a novel method of co-word analysis, based on webometric web counts, combined with a temporal trend detection algorithm that employs the vector space model (VSM). The approach is demonstrated and validated for a spectrum of information technologies. Results show that the model's assessments are highly correlated with subjective expert (n=136) assessments (r > 0.91), with a predictive validity value above 85%, which suggests that the approach can plausibly be generalized to other domains. The model's contribution is underscored by the current growing attention to the big-data phenomenon.
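
    To make the two ingredients concrete, here is a small Python sketch with invented counts: a classic co-word association strength computed from webometric hit counts, and the cosine similarity the VSM uses to compare temporal term profiles.

```python
import math

# Invented webometric hit counts for two technology terms.
hits = {"cloud computing": 9_000_000, "virtualization": 4_000_000}
cohits = {("cloud computing", "virtualization"): 1_500_000}

def coword_strength(a, b):
    # Classic co-word equivalence coefficient:
    # co-occurrences squared over the product of the single-term counts.
    c = cohits[(a, b)]
    return c * c / (hits[a] * hits[b])

def cosine(u, v):
    # Cosine similarity, the core VSM measure, here over yearly
    # hit-count profiles to compare temporal trends.
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm

print(coword_strength("cloud computing", "virtualization"))  # 0.0625
print(cosine([120, 340, 900], [100, 310, 880]))              # close to 1.0
```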

    Information extraction as a basis for knowledge discovery in unstructured data

    Knowledge Discovery in Text (KDT) methods have been applied to a wide variety of domains, from conference papers to medical prescriptions. KDT is the process of finding interesting or useful implicit patterns and information in a body of unstructured textual information [LOH 97]. The process combines many techniques from Information Extraction, Information Retrieval, Natural Language Processing, and Document Summarization with Data Mining (DM) methods. Structured data, as stored in most Database Management Systems, is easier to handle computationally, because formal languages such as SQL and QBE allow it to be manipulated and queried concisely and precisely [LOH 97]. Unstructured data, on the other hand, requires computational mechanisms different from the traditional ones before it can be collected, stored, manipulated, and queried. To apply traditional DM methods to text, some structure must be imposed on the data [DIX 97]: someone has to define the structure of the data, collect it, and store it in a conventional database. Such a process needs automated support, however, because it is difficult, tedious, and error-prone when done by hand. In this sense, Knowledge Discovery in Text is closely related to Information Extraction, as well as to Information Retrieval, and KDT systems can indeed be regarded as built from components that perform these tasks [FEL 99].
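
    A minimal Python sketch of the structuring step described above, using hypothetical record formats: information extraction imposes a schema on free text so that a conventional database, and with it the standard query and DM methods, can take over.

```python
import re
import sqlite3

# Hypothetical free-text records that traditional DM cannot query directly.
notes = [
    "Patient 101: prescribed amoxicillin 500mg for 7 days.",
    "Patient 102: prescribed ibuprofen 200mg for 3 days.",
]

# The structure someone must define before SQL-style querying applies.
PATTERN = re.compile(r"Patient (\d+): prescribed (\w+) (\d+)mg for (\d+) days")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rx (patient INT, drug TEXT, mg INT, days INT)")
for note in notes:
    m = PATTERN.search(note)
    if m:  # the IE step: free text becomes a structured tuple
        conn.execute("INSERT INTO rx VALUES (?, ?, ?, ?)", m.groups())

# Once structured, the formal query languages the abstract mentions apply.
print(conn.execute("SELECT drug, days FROM rx ORDER BY days").fetchall())
# [('ibuprofen', 3), ('amoxicillin', 7)]
```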

    Semantic Knowledge Extraction from Research Documents


    Content Analysis: History of Development and International Experience

    The monograph is devoted to the development of one of the most widespread methods for analyzing mass communication: content analysis. It traces the stages in the development of content analysis, characterizes its application at each stage, and describes the particulars of the methodology and directions for its improvement. Special attention is paid to computer-assisted content analysis, which is gradually turning content analysis from a scientific method into a modern technology that is finding widespread mass application. One technology built on content analysis is Text Mining, whose capabilities and applications are also discussed in the work. The study may be useful to lecturers, researchers, politicians, graduate and undergraduate students, and anyone interested in the problems and methods of text analysis.