152 research outputs found

    Creation of a Style Independent Intelligent Autonomous Citation Indexer to Support Academic Research

    This paper describes the current state of RUgle, a system for classifying and indexing papers made available on the World Wide Web in a domain-independent and universal manner. By building RUgle with the most relaxed restrictions possible on the formatting of the documents it can process, we hope to create a system that combines the best features of currently available closed library searches designed to facilitate academic research with the inclusive nature of general-purpose search engines that continually crawl the web and add documents to their indexed databases.

    Visualization of individual's knowledge by analyzing the citation networks

    Visual analysis of knowledge domains is an emerging field of study, as science is highly dynamic and constantly evolving. Behind the scenes, a knowledge domain is formed and contributed to by an enormous number of researchers' publications that describe the common subject of the domain. A large number of significant activities have been carried out to visualize and identify the knowledge domains of research projects, groups and communities. However, research on visualizing the knowledge structure at the individual level is relatively inactive. It is difficult to track down an individual's contribution to a subject and the degree of knowledge they possess. In this paper, we attempt to visualize an individual's knowledge structure by analyzing citation and co-authorship relational structures. We analyze and map an author's documents to knowledge domains. By mapping the documents to knowledge domains, we obtain the skeleton of the knowledge structure of an individual. Then, we apply visualization techniques to present the result. © 2007 IEEE
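
    A minimal sketch of the mapping and visualization step described above, assuming a toy set of documents, invented domain labels, and the networkx/matplotlib libraries (an illustration of the general idea, not the authors' implementation):

        # Sketch: build a small citation graph for one author and colour nodes by an
        # (assumed) mapping of documents to knowledge domains. Data is illustrative only.
        import networkx as nx
        import matplotlib.pyplot as plt

        # Hypothetical documents by one author, each assigned to a knowledge domain.
        docs = {
            "paper_A": "information visualization",
            "paper_B": "information visualization",
            "paper_C": "bibliometrics",
        }
        citations = [("paper_A", "paper_B"), ("paper_C", "paper_A")]  # A cites B, C cites A

        G = nx.DiGraph()
        for doc, domain in docs.items():
            G.add_node(doc, domain=domain)
        G.add_edges_from(citations)

        # Colour nodes by domain to expose the skeleton of the individual's knowledge structure.
        palette = {"information visualization": "tab:blue", "bibliometrics": "tab:orange"}
        colors = [palette[G.nodes[n]["domain"]] for n in G.nodes]

        nx.draw_networkx(G, node_color=colors, with_labels=True)
        plt.axis("off")
        plt.show()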

    Analysis and visualization of co-authorship networks for understanding academic collaboration and knowledge domain of individual researchers

    This paper proposes a new approach for collecting, analyzing and visualizing co-authoring data of individuals. This approach can be used to understand the academic collaboration and knowledge domain of individual researchers over a past period through repeated co-published works. In particular, we extracted the co-authoring data from DBLP, one of the largest on-line Computer Science bibliographic databases available on the Internet. To help users understand the academic collaboration and knowledge domain of individuals, we developed an InterRing visualizer that shows not only the weight of co-authorship of an individual with other researchers in a particular academic year, but also the knowledge domain of the individual covered by his/her publications in a past period. © 2006 IEEE
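
    A minimal sketch of the co-authorship weighting idea, assuming the weight is simply the number of jointly published papers per year and using invented publication records rather than real DBLP data (the InterRing visualization itself is not reproduced here):

        # Sketch: per-year co-authorship weights for one focal researcher.
        # "Weight" is assumed to be the count of joint publications in a year;
        # the records below are illustrative, not real DBLP entries.
        from collections import defaultdict

        publications = [
            {"year": 2004, "authors": ["A. Focal", "B. Colleague", "C. Partner"]},
            {"year": 2004, "authors": ["A. Focal", "B. Colleague"]},
            {"year": 2005, "authors": ["A. Focal", "C. Partner"]},
        ]

        focal = "A. Focal"
        weights = defaultdict(lambda: defaultdict(int))  # year -> co-author -> joint papers

        for pub in publications:
            if focal in pub["authors"]:
                for coauthor in pub["authors"]:
                    if coauthor != focal:
                        weights[pub["year"]][coauthor] += 1

        for year in sorted(weights):
            for coauthor, w in sorted(weights[year].items()):
                print(year, coauthor, w)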

    On Scripturology

    In this contribution we present the principles and parameters of a discipline which remains—in our intended meaning—largely yet to be established: scripturology. This discipline concerns the study of different facets of writing, perceived in its generality, as the semiotic apparatus articulating language facts and spatial facts. We refer at the outset to the definition proposed in this volume: “script is a pluricode apparatus having a general usage within a situated human community; its plane..

    A Domain Specific Language for Digital Forensics and Incident Response Analysis

    One of the longstanding conceptual problems in digital forensics is the dichotomy between the need for verifiable and reproducible forensic investigations and the lack of practical mechanisms to accomplish them. After nearly four decades of professional digital forensic practice, investigator notes are still the primary source of reproducibility information, and much of it is tied to the functions of specific, often proprietary, tools. The lack of a formal means of specification for digital forensic operations results in three major problems. Specifically, there is a critical lack of: a) standardized and automated means to scientifically verify the accuracy of digital forensic tools; b) methods to reliably reproduce forensic computations (their results); and c) a framework for interoperability among forensic tools. Additionally, there is no standardized means for communicating software requirements between users, researchers and developers, resulting in a mismatch in expectations. Combined with the exponential growth in data volume and the complexity of the applications and systems to be investigated, all of these concerns result in major case backlogs and inherently reduce the reliability of digital forensic analyses. This work proposes a new approach to the specification of forensic computations, such that the above concerns can be addressed on a scientific basis with a new domain specific language (DSL) called nugget. DSLs are specialized languages that aim to address the concerns of particular domains by providing practical abstractions. Successful DSLs, such as SQL, can transform an application domain by providing a standardized way for users to communicate what they need without specifying how the computation should be performed. This is the first effort to build a DSL for (digital) forensic computations, with the following research goals: 1) provide an intuitive formal specification language that covers core types of forensic computations and common data types; 2) provide a mechanism to extend the language so that it can incorporate arbitrary computations; 3) provide a prototype execution environment that allows fully automatic execution of the computation; 4) provide a complete, formal, and auditable log of computations that can be used to reproduce an investigation; 5) demonstrate cloud-ready processing that can match the growth in data volumes and complexity.
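
    The separation of specification from execution that nugget aims for can be illustrated with a brief sketch in Python rather than in the DSL itself (whose syntax is not shown in this abstract); the evidence items, operation names, and log fields below are assumptions made for the example:

        # Illustrative sketch (not the nugget DSL): a forensic computation is declared
        # as data, executed by a generic runner, and every step is appended to an
        # auditable log so the result can be reproduced and verified later.
        import hashlib
        import json
        from datetime import datetime, timezone

        evidence = {                      # hypothetical evidence items (normally files)
            "disk_image.dd": b"\x00\x01\x02\x03",
            "memory.dump": b"\xde\xad\xbe\xef",
        }

        # "What" to compute, kept separate from "how" it is executed.
        specification = [{"op": "sha256", "target": name} for name in evidence]

        def run(spec):
            log = []
            for step in spec:
                if step["op"] == "sha256":
                    digest = hashlib.sha256(evidence[step["target"]]).hexdigest()
                    log.append({
                        "op": step["op"],
                        "target": step["target"],
                        "result": digest,
                        "timestamp": datetime.now(timezone.utc).isoformat(),
                    })
            return log

        print(json.dumps(run(specification), indent=2))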

    Automatic indexing : an approach using an index term corpus and combining linguistic and statistical methods

    This thesis discusses the problems and the methods of finding relevant information in large collections of documents. The contribution of this thesis to this problem is to develop better content analysis methods which can be used to describe document content with index terms. Index terms can be used as meta-information that describes documents and that is used for seeking information. The main point of this thesis is to illustrate the process of developing an automatic indexer which analyses the content of documents by combining evidence from word frequencies with evidence from the linguistic analysis provided by a syntactic parser. On the basis of this content analysis, the indexer weights the expressions of a text according to their estimated importance for describing the content of a given document. The typical linguistic features of index terms were explored using a linguistically analysed text collection in which the index terms are manually marked up. This text collection is referred to as an index term corpus. Specific features of the index terms provided the basis for a linguistic term-weighting scheme, which was then combined with a frequency-based term-weighting scheme. The use of an index term corpus like this as training material is a new method of developing an automatic indexer. The results of the experiments were promising.
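
    A toy sketch of combining statistical and linguistic evidence for term weighting, assuming a TF-IDF-style frequency score, a simple noun-candidate bonus standing in for the parser-derived features, and arbitrary combination weights (this is not the thesis's actual weighting scheme):

        # Toy sketch: combine a frequency-based score with a "linguistic" bonus for
        # noun candidates. Documents, candidates, and weights are invented examples.
        import math
        from collections import Counter

        documents = {
            "doc1": "automatic indexing of document collections",
            "doc2": "statistical methods for text analysis",
        }
        # Hypothetical output of a syntactic parser / POS tagger for doc1.
        noun_candidates = {"indexing", "document", "collections"}

        def tfidf(term, doc_tokens, all_docs):
            tf = Counter(doc_tokens)[term] / len(doc_tokens)
            df = sum(1 for d in all_docs.values() if term in d.split())
            idf = math.log(len(all_docs) / df) if df else 0.0
            return tf * idf

        def combined_weight(term, doc_tokens, all_docs, alpha=0.7):
            statistical = tfidf(term, doc_tokens, all_docs)
            linguistic = 1.0 if term in noun_candidates else 0.0
            return alpha * statistical + (1 - alpha) * linguistic

        tokens = documents["doc1"].split()
        for term in set(tokens):
            print(term, round(combined_weight(term, tokens, documents), 3))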

    Integrative Levels of Knowing

    This dissertation is concerned with a systematic organization of the epistemological dimension of human knowledge in terms of viewpoints and methods. In particular, it explores to what extent the well-known organizing principle of integrative levels, which presents a developmental hierarchy of increasing complexity and integration, can be applied to a basic classification of viewpoints or epistemic outlooks. The central thesis pursued in this investigation is that an adequate analysis of such epistemic contexts requires tools that allow divergent or even conflicting frames of reference to be compared and evaluated according to context-transcending standards and criteria. This task demands a theoretical and methodological foundation that avoids the limitations of radical contextualism and its inherent threat of a fragmentation of knowledge due to the alleged incommensurability of the underlying frames of reference. Based on Jürgen Habermas’s Theory of Communicative Action and his methodology of hermeneutic reconstructionism, it is argued that epistemic pluralism does not necessarily imply epistemic relativism and that a systematic organization of the multiplicity of perspectives can benefit from already existing models of cognitive development as reconstructed in research fields such as psychology, the social sciences, and the humanities. The proposed cognitive-developmental approach aims to contribute to a multi-perspective knowledge organization by offering both analytical tools for cross-cultural comparisons of knowledge organization systems (e.g., the Seven Epitomes and the Dewey Decimal Classification) and organizing principles for context representation that help to improve the expressiveness of existing documentary languages (e.g., the Integrative Levels Classification). Additionally, the appendix includes an extensive compilation of conceptions and models of integrative levels of knowing from a broad multidisciplinary field.

    Support for taxonomic data in systematics

    The Systematics community works to increase our understanding of biological diversity by identifying and classifying organisms and by using phylogenies to understand the relationships between those organisms. It has made great progress in the building of phylogenies and in the development of algorithms. However, it has insufficient provision for preserving research outcomes and making them widely accessible and queryable, and this is where database technologies can help. This thesis makes a contribution in the area of database usability by addressing the query needs present in the community, as supported by the analysis of query logs. It clearly formulates the user requirements in the area of phylogeny and classification queries. It then reports on the use of warehousing techniques to integrate data from many sources in order to satisfy those requirements. It shows how to perform query expansion with synonyms and vernacular names, and how to implement hierarchical query expansion effectively. A detailed analysis of the improvements offered by those query expansion techniques is presented. This is supported by an exposition of the database techniques underlying this development and of the user and programming interfaces (web services) which make this novel development available to both end-users and programs.
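
    A brief sketch of the query expansion idea, assuming tiny in-memory lookup tables for the taxonomy, synonyms and vernacular names in place of the warehouse described in the thesis:

        # Sketch: expand a taxonomic query name with synonyms, vernacular names,
        # and (hierarchically) all descendant taxa. Lookup tables are invented.
        children = {
            "Felidae": ["Panthera", "Felis"],
            "Panthera": ["Panthera leo", "Panthera tigris"],
            "Felis": ["Felis catus"],
        }
        synonyms = {"Felis catus": ["Felis silvestris catus"]}
        vernacular = {"Felis catus": ["domestic cat"], "Panthera leo": ["lion"]}

        def expand(name):
            """Return the query name plus synonyms, vernacular names, and all descendants."""
            terms = {name}
            terms.update(synonyms.get(name, []))
            terms.update(vernacular.get(name, []))
            for child in children.get(name, []):  # hierarchical expansion
                terms.update(expand(child))
            return terms

        print(sorted(expand("Felidae")))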

    From social tagging to polyrepresentation: a study of expert annotating behavior of moving images

    This thesis investigates “nichesourcing” (De Boer, Hildebrand, et al., 2012), an emergent initiative of cultural heritage crowdsourcing in which niches of experts are involved in the annotating tasks. This initiative is studied in relation to moving image annotation and in the context of audiovisual heritage, more specifically within the sector of film archives. The work presents a case study of film and media scholars to investigate the types of annotations and attribute descriptions that they could eventually contribute, as well as the information needs and the seeking and searching behaviors of this group, in order to determine what the role of the different types of annotations in supporting their expert tasks would be. The study is composed of three independent but interconnected studies using a mixed methodology and an interpretive approach. It uses concepts from the information behavior discipline and the "Integrated Information Seeking and Retrieval Framework" (IS&R) (Ingwersen and Järvelin, 2005) as guidance for the investigation. The findings show that there are several types of annotations that moving image experts could contribute to a nichesourcing initiative, of which time-based tags are only one of the possibilities. The findings also indicate that, for the different foci in film and media research, in-depth indexing at the content level is only needed for supporting a specific research focus, for supporting research in other domains, or for engaging broader audiences. The main implications at the level of information infrastructure are the requirement for more varied annotating support, more interoperability among existing metadata standards and frameworks, and the need for guidelines on crowdsourcing and nichesourcing implementation in the audiovisual heritage sector. This research presents contributions to the studies of social tagging applied to moving images, to the discipline of information behavior, by proposing new concepts related to the area of use behavior, and to the concept of “polyrepresentation” (Ingwersen, 1992, 1996) applied to the humanities domain.
    Doctoral thesis with International Mention, Official Doctoral Program in Documentation: Archives and Libraries in the Digital Environment. Committee: Peter Emil Rerup Ingwersen (chair), Antonio Hernández Pérez (secretary), Nils Phar (member).