405 research outputs found

    Analyzing Large Collections of Electronic Text Using OLAP

    Computer-assisted reading and analysis of text has various applications in the humanities and social sciences. The increasing size of many electronic text archives has the advantage of enabling a more complete analysis, but the disadvantage that results take longer to obtain. On-Line Analytical Processing (OLAP) is a method for storing and quickly analyzing multidimensional data. By storing text analysis information in an OLAP system, a user can obtain answers to inquiries in a matter of seconds rather than minutes, hours, or even days. The analysis is user-driven, allowing different users the freedom to pursue their own directions of research.
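
    The core idea here, precomputing text-analysis measures along dimensions so that interactive queries reduce to cheap aggregations, can be illustrated with a minimal sketch. The dimension names and the pandas-based "cube" below are illustrative assumptions, not the OLAP system described in the paper.

        import pandas as pd

        # Fact table: one row per (year, author, term) observation, i.e. the
        # dimensions along which an analyst may want to slice and dice,
        # plus a numeric measure (term count).
        facts = pd.DataFrame({
            "year":   [1851, 1851, 1852, 1852, 1853],
            "author": ["Melville", "Melville", "Stowe", "Stowe", "Dickens"],
            "term":   ["whale", "sea", "cabin", "river", "fog"],
            "count":  [1200, 800, 450, 300, 90],
        })

        # Pre-aggregate into a cube-like structure (here: a pivot table).
        cube = facts.pivot_table(index="author", columns="year",
                                 values="count", aggfunc="sum", fill_value=0)

        # Interactive queries become cheap roll-ups over the cube, e.g.
        # total term occurrences per author across all years:
        print(cube.sum(axis=1))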

    The Symbiotic Relationship Between Information Retrieval and Informetrics

    Informetrics and information retrieval (IR) represent fundamental areas of study within information science. Historically, researchers have not fully capitalized on the potential research synergies that exist between these two areas. Data sources used in traditional informetrics studies have their analogues in IR, with similar types of empirical regularities found in IR system content and use. Methods for data collection and analysis used in informetrics can help to inform IR system development and evaluation. Areas of application have included automatic indexing, index term weighting, and understanding user query and session patterns through the quantitative analysis of user transaction logs. Similarly, developments in database technology have made the study of informetric phenomena less cumbersome, and recent innovations used in IR research, such as language models and ranking algorithms, provide new tools that may be applied to research problems of interest to informetricians. Building on the author’s previous work (Wolfram 2003), this paper reviews a sample of relevant literature published primarily since 2000 to highlight how each area of study may help to inform and benefit the other.
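
    Index term weighting, one of the application areas mentioned above, is commonly handled with TF-IDF style schemes. The sketch below is a generic illustration of that idea, not a method from the reviewed literature; the toy documents are assumptions.

        import math
        from collections import Counter

        docs = [
            "information retrieval and informetrics",
            "citation analysis in informetrics",
            "ranking algorithms for information retrieval",
        ]
        tokenized = [d.split() for d in docs]
        n_docs = len(tokenized)

        # Document frequency: in how many documents does each term occur?
        df = Counter(term for doc in tokenized for term in set(doc))

        def tfidf(doc):
            # Weight = term frequency x log inverse document frequency.
            tf = Counter(doc)
            return {t: tf[t] * math.log(n_docs / df[t]) for t in tf}

        for doc in tokenized:
            print(tfidf(doc))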

    A Review of Theory and Practice in Scientometrics

    Scientometrics is the study of the quantitative aspects of the process of science as a communication system. It is centrally, but not only, concerned with the analysis of citations in the academic literature. In recent years it has come to play a major role in the measurement and evaluation of research performance. In this review we consider: the historical development of scientometrics, sources of citation data, citation metrics and the “laws” of scientometrics, normalisation, journal impact factors and other journal metrics, visualising and mapping science, evaluation and policy, and future developments.
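
    As a concrete instance of the journal metrics the review covers: the classic two-year journal impact factor for year Y divides the citations received in Y by items the journal published in Y-1 and Y-2 by the number of citable items published in those two years. The sketch below computes it from made-up counts, purely for illustration.

        def impact_factor(citations_received, citable_items, year):
            # citations_received[y]: citations received in `year` to items published in y
            # citable_items[y]: number of citable items the journal published in y
            cites = citations_received[year - 1] + citations_received[year - 2]
            items = citable_items[year - 1] + citable_items[year - 2]
            return cites / items

        # Illustrative (made-up) counts for one journal:
        citations = {2021: 180, 2022: 240}
        published = {2021: 60, 2022: 70}
        print(impact_factor(citations, published, 2023))  # (180 + 240) / (60 + 70) ≈ 3.23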

    The Structure and Dynamics of Co-Citation Clusters: A Multiple-Perspective Co-Citation Analysis

    A multiple-perspective co-citation analysis method is introduced for characterizing and interpreting the structure and dynamics of co-citation clusters. The method facilitates analytic and sense-making tasks by integrating network visualization, spectral clustering, automatic cluster labeling, and text summarization. Co-citation networks are decomposed into co-citation clusters. The interpretation of these clusters is augmented by automatic cluster labeling and summarization. The method focuses on the interrelations between a co-citation cluster's members and their citers. The generic method is applied to a three-part analysis of the field of Information Science as defined by 12 journals published between 1996 and 2008: 1) a comparative author co-citation analysis (ACA), 2) a progressive ACA of a time series of co-citation networks, and 3) a progressive document co-citation analysis (DCA). Results show that the multiple-perspective method increases the interpretability and accountability of both ACA and DCA networks. Comment: 33 pages, 11 figures, 10 tables. To appear in the Journal of the American Society for Information Science and Technology.
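
    One step of such a pipeline, decomposing a co-citation network into clusters, can be sketched with off-the-shelf spectral clustering; the tiny co-citation matrix and the use of scikit-learn are illustrative assumptions, not the authors' implementation.

        import numpy as np
        from sklearn.cluster import SpectralClustering

        # Toy symmetric co-citation matrix: entry (i, j) counts how often
        # documents i and j are cited together.
        cocitation = np.array([
            [0, 9, 8, 1, 0],
            [9, 0, 7, 0, 1],
            [8, 7, 0, 1, 0],
            [1, 0, 1, 0, 6],
            [0, 1, 0, 6, 0],
        ])

        # Treat co-citation counts as a precomputed affinity matrix.
        model = SpectralClustering(n_clusters=2, affinity="precomputed",
                                   random_state=0)
        labels = model.fit_predict(cocitation)
        print(labels)  # e.g. documents 0-2 in one cluster, 3-4 in the other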

    A Super-Dimension Approach in ROLAP Environments

    Often the designer of ROLAP applications follows up with the question “can I create a little joiner table with just the two dimension keys and then connect that table to the fact table?” In a classic dimensional model there are two options: (a) both dimensions are modeled independently, or (b) the two dimensions are combined into a super-dimension with a single key. The second approach is not widely used in ROLAP environments, but it is an important sparsity-handling method in MOLAP systems. In ROLAP this design technique can also bring storage and performance benefits, although the model becomes more complicated. The dependency between the dimensions is a key factor that designers have to consider when choosing between the two options. In this paper we present the results of our storage and performance experiments over real-life data cubes with respect to these design approaches. Some conclusions are drawn.
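
    The two design options can be sketched with toy tables; the column names and data below are illustrative assumptions, not the real-life cubes used in the paper's experiments.

        import pandas as pd

        # Option (a): two independent dimensions, each joined directly to the fact table.
        dim_product = pd.DataFrame({"product_key": [1, 2], "product": ["A", "B"]})
        dim_store   = pd.DataFrame({"store_key": [10, 20], "store": ["East", "West"]})
        fact_a = pd.DataFrame({"product_key": [1, 2, 1],
                               "store_key":   [10, 10, 20],
                               "sales":       [100, 50, 75]})
        report_a = (fact_a.merge(dim_product, on="product_key")
                          .merge(dim_store, on="store_key"))

        # Option (b): a super-dimension holding only the key combinations that
        # actually occur, joined to the fact table through a single surrogate key.
        super_dim = pd.DataFrame({"combo_key": [100, 101, 102],
                                  "product":   ["A", "B", "A"],
                                  "store":     ["East", "East", "West"]})
        fact_b = pd.DataFrame({"combo_key": [100, 101, 102],
                               "sales":     [100, 50, 75]})
        report_b = fact_b.merge(super_dim, on="combo_key")

        print(report_a[["product", "store", "sales"]])
        print(report_b[["product", "store", "sales"]])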