10 research outputs found

    Document Clustering based on Topic Maps

    The importance of document clustering is now widely acknowledged by researchers for better management, smart navigation, efficient filtering, and concise summarization of large document collections such as the World Wide Web (WWW). The next challenge lies in performing clustering based on the semantic content of the documents. The problem of document clustering has two main components: (1) representing the document in a form that inherently captures the semantics of the text, which may also help to reduce the dimensionality of the document, and (2) defining a similarity measure based on the semantic representation such that it assigns higher numerical values to document pairs with a stronger semantic relationship. The feature space of the documents can be very challenging for document clustering: a document may contain multiple topics, a large set of class-independent general words, and only a handful of class-specific core words. With these features in mind, traditional agglomerative clustering algorithms, which are based on either the Document Vector model (DVM) or the Suffix Tree model (STC), are less efficient at producing results with high cluster quality. This paper introduces a new approach to document clustering based on the Topic Map representation of the documents. Each document is transformed into a compact form. A similarity measure is proposed based on the information inferred from topic map data and structures. The suggested method is implemented using agglomerative hierarchical clustering and tested on standard Information Retrieval (IR) datasets. The comparative experiment reveals that the proposed approach is effective in improving cluster quality.
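
The abstract does not state the similarity measure itself, so the following is only a minimal sketch of the general idea: each document is reduced to a compact set of topics, and agglomerative clustering merges the most similar clusters first. The Jaccard coefficient, the average-link rule, the threshold and the sample data are all assumptions chosen for illustration, not the paper's method.

```python
from itertools import combinations

# Hypothetical compact representation: each document is a set of topic labels.
EXAMPLE_DOCS = {
    "d1": {"clustering", "similarity", "topic map"},
    "d2": {"clustering", "similarity", "suffix tree"},
    "d3": {"genome", "database", "portal"},
    "d4": {"genome", "database", "abstract"},
}


def jaccard(a, b):
    """Stand-in similarity: shared topics divided by all topics."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def average_link(c1, c2, docs):
    """Mean pairwise similarity between the members of two clusters."""
    total = sum(jaccard(docs[i], docs[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(docs, threshold):
    """Merge the most similar pair of clusters until none reaches threshold."""
    clusters = [[name] for name in docs]
    while len(clusters) > 1:
        # Score every pair of clusters and keep the best-scoring one.
        (i, j), best = max(
            (
                ((i, j), average_link(clusters[i], clusters[j], docs))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


print(agglomerate(EXAMPLE_DOCS, 0.3))
# [['d1', 'd2'], ['d3', 'd4']]
```

Stopping at a similarity threshold instead of building the full dendrogram keeps the sketch short; a real evaluation on IR datasets would more likely cut the hierarchy at a fixed number of clusters.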

    Descrizioni archivistiche e web semantico: un connubio possibile? [Archival descriptions and the semantic web: a possible union?]

    The article reflects on the possibility of inserting and managing archival descriptions within the semantic web. The language chosen for this attempt is Topic Maps (ISO 13250), a semantic web technology parallel to RDF. Using TMCL (Topic Maps Constraint Language), the main constructs are presented, highlighting the scalability and flexibility they offer. The Topic Maps data model defines a set of elements (constructs, in its vocabulary) that, when combined, can express and encode an unlimited range of descriptions and relations; they also make it possible to manage multiple indexes and to disambiguate different terms. This contribution closely examines the correspondences and possibilities of a mapping between Topic Maps and the main archival descriptive standards, presenting an example using ISAAR(CPF). The data in such a description should integrate with the chosen language, offering practical advantages in additional support tools, in exchanging data with other archival information systems, and in building flexible interfaces.

    Konzeption und Implementierung einer semantischen Suchmaschine für Topic Maps [Design and implementation of a semantic search engine for Topic Maps]

    In recent years, Topic Maps technology has gained increasing importance among data integration technologies. The Topic Maps query language TMQL is a powerful tool for querying information directly from a topic map. To use it, however, the user must both know the query language and know the schema of the topic map. A search engine is therefore needed that also allows inexperienced users to search the Topic Maps data. After an introduction to the relevant Topic Maps fundamentals, various indexing algorithms specialized for Topic Maps data are examined. The indexing of virtually merged topic maps is a special case, for which several possible solutions are examined. Based on the search engine library Lucene, a semantic search engine is developed that uses the elements inherent in Topic Maps, with both explicit and implicit meaning, in indexing as well as in weighting the search results. In addition, a general model for describing Topic Maps-based facets is presented. Building on it, options for creating generic facets are examined. Furthermore, a method for defining domain-specific facets with the help of TMQL is designed and explained. The prototypical implementation of an interface through which the resulting search engine can be used in Topic Maps-based web applications demonstrates how easily the search engine can be integrated into existing web applications. This is made possible by the creation of a new package for the middleware RTM.
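
To make the idea of using Topic Maps elements for both indexing and result weighting more concrete, here is a minimal sketch in plain Python. It does not use Lucene, and the field names, the weights in `FIELD_WEIGHTS` and the sample topics are assumptions made for illustration; the thesis's actual indexing algorithms are not described in the abstract.

```python
from collections import defaultdict

# Assumed weights: a match in a topic name counts more than a match in an
# occurrence, which counts more than a match in an association label.
FIELD_WEIGHTS = {"name": 3.0, "occurrence": 1.0, "association": 0.5}

# Hypothetical topics, each with text grouped by Topic Maps element type.
EXAMPLE_TOPICS = {
    "puccini": {
        "name": ["Giacomo Puccini"],
        "occurrence": ["Italian opera composer"],
        "association": ["composed Tosca"],
    },
    "tosca": {
        "name": ["Tosca"],
        "occurrence": ["Opera by Puccini"],
        "association": ["composed by Puccini"],
    },
}


def build_index(topics):
    """Build an inverted index: token -> {topic id -> accumulated weight}."""
    index = defaultdict(lambda: defaultdict(float))
    for topic_id, fields in topics.items():
        for field, texts in fields.items():
            for text in texts:
                for token in text.lower().split():
                    index[token][topic_id] += FIELD_WEIGHTS[field]
    return index


def search(index, query):
    """Return topic ids ranked by summed weight, ties broken by id."""
    scores = defaultdict(float)
    for token in query.lower().split():
        for topic_id, weight in index.get(token, {}).items():
            scores[topic_id] += weight
    return sorted(scores, key=lambda t: (-scores[t], t))


INDEX = build_index(EXAMPLE_TOPICS)
print(search(INDEX, "puccini"))  # ['puccini', 'tosca']
print(search(INDEX, "tosca"))    # ['tosca', 'puccini']
```

The point of the sketch is that a user can type a plain keyword without knowing TMQL or the schema, while the ranking still reflects where in the topic map the keyword was found.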

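    Aplicación del modelo Topic Maps a la documentación educativa en los Centros de Recursos para el Aprendizaje y la Investigación (CRAI) [Application of the Topic Maps model to educational documentation in Learning and Research Resources Centers (CRAI)]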

    Educational libraries in general, and university libraries in particular, have as their main function supporting the raison d'être of the institutions they belong to: teaching. This educational function is taking on a leading role in an Information Society that requires its citizens to be competent in the digital medium, the new space that carries information, and to take the lead in their own lifelong learning. It is further reinforced by the implementation of the European Higher Education Area, which adopts a new educational model based on the acquisition of competences and on the concept of "learning to learn", and in which the student is the active and central element of learning. The new digital space has distinctive characteristics of its own and imposes new ways of reading, which raises the need to review the associative tools used in documentation so that they can be adapted to organizing electronic educational resources in this medium. In this context, we propose ISO/IEC 13250:2000 Topic Maps as a suitable model for achieving this goal. Using a qualitative, essentially descriptive research methodology as a basis for comparing, interpreting and relating the results obtained, we analyze associative models from different areas (thesauri, concept maps and ontologies) and assess how they can be adapted to the Topic Maps model.
We study the Topic Maps model in depth: its historical development and current status, its components and related developments, and its relationship with classic library science concepts. We analyze the projects carried out so far, with a focus on those in library science and education. Finally, we compile the available tools and analyze those that may be useful to end users for organizing educational resources in Learning and Research Resources Centers (LRRC; CRAI in Spanish), an application not previously studied. We conclude that the Topic Maps model, which has a conceptual structure similar to that of thesauri and, being a graph, can be used as a concept map (a tool of proven educational effectiveness), brings further advantages such as interoperability and independence from the resources it organizes. It thus provides a little-exploited added value, through the functionality it combines, compared with other possible models for organizing educational resources: the identification of subjects by humans via PSI; the ability to merge (and "unmerge", through the use of fragments) maps, which is unique among comparable tools; the concept of "scope", which allows faceted use; and its independence from the resources it organizes, which allows the map to be managed and shared separately. The work closes with a proposal for a free, interoperable and scalable practical application model for the CRAI environment that allows the integrated use of its resources, mobilizing them around the subject they deal with and showing their conceptual connections. The proposal remains partly theoretical because of the lack of tools that can be easily integrated into the web, but it is already feasible with the necessary technical knowledge, and in the short to medium term by developing the small pieces still needed.
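
The merge functionality mentioned above can be illustrated with a minimal sketch. In Topic Maps, topics that share a subject identifier (such as a PSI) are treated as the same subject and merged. The dictionary layout, the `merge_topic_maps` helper and the `psi.example.org` identifiers below are hypothetical and only show the principle, not any particular tool.

```python
# Hypothetical topic maps: each topic has a set of PSIs, names and occurrences.
LIBRARY_MAP = [
    {
        "psis": {"http://psi.example.org/topic-maps"},
        "names": {"Topic Maps"},
        "occurrences": {"catalogue-record-1"},
    },
]
COURSE_MAP = [
    {
        "psis": {"http://psi.example.org/topic-maps"},
        "names": {"Mapas de temas"},
        "occurrences": {"lecture-notes-3"},
    },
    {
        "psis": {"http://psi.example.org/thesauri"},
        "names": {"Thesauri"},
        "occurrences": set(),
    },
]

KEYS = ("psis", "names", "occurrences")


def merge_topic_maps(*maps):
    """Merge topics from several maps when they share at least one PSI."""
    merged = []
    for topic_map in maps:
        for topic in topic_map:
            incoming = {key: set(topic[key]) for key in KEYS}
            # Already merged topics that denote the same subject.
            matches = [t for t in merged if t["psis"] & incoming["psis"]]
            merged = [t for t in merged if not (t["psis"] & incoming["psis"])]
            for match in matches:
                for key in KEYS:
                    incoming[key] |= match[key]
            merged.append(incoming)
    return merged


result = merge_topic_maps(LIBRARY_MAP, COURSE_MAP)
print(len(result))                 # 2
print(sorted(result[0]["names"]))  # ['Mapas de temas', 'Topic Maps']
```

Because the maps only point at resources through occurrences, the library map and the course map can be maintained and shared separately and still combined when needed.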

    Topic Maps and library and information science : an exploratory study of Topic Maps principles from a Knowledge and Information Organization perspective

    Purpose: This master's thesis attempts to present the state of the art on the place of Topic Maps (ISO 13250) in Library and Information Science (LIS), through an extensive literature review and a synthesis based on their principles. It is framed from a Knowledge and Information Organization perspective, represented by Elaine Svenonius's The Intellectual Foundation of Information Organization and some concepts of Knowledge Organization. The thesis also intends to present a conceptual and theoretical framework for future research. Design/methodology/approach: The study takes a qualitative approach based on Grounded Theory principles to analyse the literature and build the conceptual framework for its analysis. The literature reviewed comprised more than sixty documents, including journal articles, conference presentations and papers, student reports and theses, and a book chapter. This was complemented with information obtained from mailing lists, blog postings and websites, and some unstructured interviews. Findings: Topic Maps appears to be a development aligned with the tradition of Knowledge and Information Organization but fully adapted to the context of the Web and digital environments. From an LIS perspective, it is a bibliographic meta-language able to represent, extend and, above all, integrate all existing Knowledge Organization Systems in a standards-based generic model applicable to digital content and online presentation. Conceptually, Topic Maps sits on the border between the LIS discipline and Knowledge Representation and Computer Science, where LIS conceptual models act as intermediaries by providing the ontologies of the ‘bibliographic universe’. Topic Maps questions traditional LIS views and principles. Although some of them remain the same, such as the meaning-based identification of entities, the notions of ‘document’ and ‘subject’ require further study.
Some important applications demonstrate the capabilities and potential for further development and research on Topic Maps in LIS. The main field of application is the Digital Humanities and the presentation of TEI-encoded texts. Joint Master Degree in Digital Library Learning (DILL)

    Scaling Topic Maps


    Large scale knowledge representation of distributed biomedical information scaling topic maps.

    In recent years the Web has dramatically influenced biomedical research. Although it allows almost instantaneous access to a huge amount of distributed information, the problem of how to retrieve useful information still persists. With semantic technologies (especially Topic Maps), a solution comes within reach. In this paper we discuss concepts and a technical realization for knowledge representation within the biomedical domain. This includes not only semantic access to distributed and heterogeneous resources based on state-of-the-art enterprise integration technologies (J2EE, Web Services) but also an approach to Topic Map based views on unstructured information from scientific publications. We furthermore present the implementation of an information portal based on the seamless semantic integration of ~500 genome databases and ~16,000,000 abstracts.