5 research outputs found

    Google Knows Who is Famous Today -- Building an Ontology from Search Engine Knowledge and DBpedia

    Designing novel abstraction networks for ontology summarization and quality assurance

    Biomedical ontologies are complex knowledge representation systems. Biomedical ontologies support interdisciplinary research, interoperability of medical systems, and Electronic Healthcare Record (EHR) encoding. Ontologies represent knowledge using concepts (entities) linked by relationships. Ontologies may contain hundreds of thousands of concepts and millions of relationships. For users, the size and complexity of ontologies make it difficult to comprehend “the big picture” of an ontology's content. For ontology editors, size and complexity make it difficult to uncover errors and inconsistencies. Errors in an ontology will ultimately affect applications that utilize the ontology. In prior studies, abstraction networks (AbNs) were developed to provide a compact summary of an ontology's content and structure. AbNs have been shown to successfully support ontology summarization and quality assurance (QA), e.g., for SNOMED CT and NCIt. Despite the success of these previous studies, several major, unaddressed issues affect the applicability and usability of AbNs. This thesis is broken into five major parts, each addressing one issue. The first part of this dissertation addresses the scalability of AbN-based QA techniques to large SNOMED CT hierarchies. Previous studies focused on relatively small hierarchies. The QA techniques developed for these small hierarchies do not scale to large hierarchies, e.g., Procedure and Clinical finding. A new type of AbN, called a subtaxonomy, is introduced to address this problem. Subtaxonomies summarize a subset of an ontology's content. Several types of subtaxonomies and subtaxonomy-based QA studies are discussed. The second part of this dissertation addresses the need for summarization and QA methods for the twelve SNOMED CT hierarchies with no lateral relationships. Previously developed SNOMED CT AbN derivation methodologies, which require lateral relationships, cannot be applied to these hierarchies. The Tribal Abstraction Network (TAN) is a new type of AbN derived using only hierarchical relationships. A TAN-based QA methodology is introduced and the results of a QA review of the Observable entity hierarchy are reported. The third part focuses on the development of generic AbN derivation methods that are applicable to groups of structurally similar ontologies, e.g., those developed in the Web Ontology Language (OWL) format. Previously, AbN derivation techniques were applicable to only a single ontology at a time. AbNs that are applicable to many OWL ontologies are introduced, a preliminary study on OWL AbN granularity is reported on, and the results of several QA studies are presented. The fourth part describes Diff Abstraction Networks, which summarize and visualize the structural differences between two ontology releases. Diff Area Taxonomy and Diff Partial-area Taxonomy derivation methodologies are introduced and Diff Partial-area taxonomies are derived for three OWL ontologies. The Diff Abstraction Network approach is compared to the traditional ontology diff approach. Lastly, tools for deriving and visualizing AbNs are described. The Biomedical Layout Utility Framework is introduced to support the automatic creation, visualization, and exploration of abstraction networks for SNOMED CT and OWL ontologies.

    Using an ontology to improve the web search experience

    The search terms that a user passes to a search engine are often ambiguous, referring to homonyms. The results in these cases are a mixture of links to documents that contain different meanings of the search terms. Current search engines provide suggested query completions in a dropdown list. However, such lists are not well organized, mixing completions for different meanings. In addition, the suggested search phrases are not discriminating enough. Moreover, current search engines often return an unexpected number of results. Zero hits are naturally undesirable, while too many hits are likely to be overwhelming and of low precision. This dissertation work aims at providing a better Web search experience for users by addressing the problems described above. To improve the search for homonyms, suggested completions are well organized and visually separated. In addition, this approach supports the use of negative terms to disambiguate the suggested completions in the list. The dissertation presents an algorithm to generate the suggested search completion terms using an ontology and new ways of displaying homonymous search results. These algorithms have been implemented in the Ontology-Supported Web Search (OSWS) System for famous people. This dissertation presents a method for dynamically building the necessary ontology of famous people based on mining the suggested completions of a search engine. This is combined with data from DBpedia. To enhance the OSWS ontology, Facebook is used as a secondary data source. Information from people's public pages is mined and Facebook attributes are cleaned up and mapped to the OSWS ontology. To control the size of the result sets returned by the search engines, this dissertation demonstrates a query rewriting method for generating alternative query strings and implements a model for predicting the number of search engine hits for each alternative query string, based on the English language frequencies of the words in the search terms. Evaluation experiments of the hit count prediction model are presented for three major search engines. The dissertation also discusses and quantifies how far the Google, Yahoo! and Bing search engines diverge from monotonic behavior, considering negative and positive search terms separately.
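
The abstract above does not spell out the hit count prediction model, so the following is only a minimal sketch of one way such an estimate could work, not the dissertation's method. It assumes that terms occur independently across documents, that `doc_freq` (a hypothetical lookup table) gives the fraction of documents containing each word, and that "monotonic behavior" means that adding a required or an excluded term never increases the hit count. The function names and the sample numbers are illustrative only.

```python
from typing import Dict, Iterable


def estimate_hits(index_size: int,
                  doc_freq: Dict[str, float],
                  positive: Iterable[str],
                  negative: Iterable[str] = ()) -> float:
    """Estimate a hit count under an independence assumption (illustrative)."""
    estimate = float(index_size)
    for term in positive:
        # Each required term keeps only the fraction of documents containing it.
        estimate *= doc_freq[term]
    for term in negative:
        # Each excluded term keeps only the fraction of documents lacking it.
        estimate *= 1.0 - doc_freq[term]
    return estimate


def is_monotonic(hits_before: float, hits_after: float) -> bool:
    """Under the assumed reading, narrowing a query must not raise the hit count."""
    return hits_after <= hits_before


if __name__ == "__main__":
    # Hypothetical frequencies and index size, chosen only for the example.
    freq = {"jaguar": 0.001, "car": 0.02}
    base = estimate_hits(1_000_000, freq, ["jaguar"])
    narrowed = estimate_hits(1_000_000, freq, ["jaguar"], ["car"])
    print(base, narrowed, is_monotonic(base, narrowed))
```

Comparing `is_monotonic` on hit counts reported by a real engine before and after adding a term is one way the divergence mentioned in the abstract could be counted; the positive and negative cases can be tallied separately by choosing which list the added term goes into.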

    Social network analysis and academic success: a study of student ‘groups’ on Facebook

    Adolescents are increasingly frequent users of the social networking sites available on Web 2.0. We therefore developed a research project on the impact that online collaboration and communication through the social network Facebook can have on students' academic performance in the subject Introduction to Information and Communication Technologies (ITIC). The project was carried out in the 2011/2012 school year with students from two 9th-grade classes at two public schools in the municipalities of Oeiras and Cascais. These two schools were chosen because each adopted a distinct pedagogical approach: one more expository, the other socio-constructivist. The data collected show that online interaction is positively and significantly associated with students' academic performance.

    Trends in research methodologies in the field of technologies in education: an analysis of postgraduate research between 2005 and 2013

    The question of which methodological approaches are adopted in academic studies in the field of Information and Communication Technologies (ICT) in Education is a concern that we, as researchers, should always keep in mind. It is assumed that an assessment of the choices made regarding research methodologies has profound and decisive implications for the quality of the results produced. In addition to setting out a work agenda for analysing the research carried out in Portugal in university postgraduate courses, this article presents the results of an analysis of the postgraduate academic works approved between 2005 and 2013 in public master's and doctoral examinations in the general area of ICT in Education that are publicly available in the repositories of Portuguese higher education institutions, with the aim of presenting a comprehensive picture of the methodological landscape of the academic work currently being carried out in the area.