Analyzing Large Collections of Electronic Text Using OLAP
Computer-assisted reading and analysis of text has various applications in
the humanities and social sciences. The increasing size of many electronic text
archives has the advantage of a more complete analysis but the disadvantage of
taking longer to obtain results. On-Line Analytical Processing is a method used
to store and quickly analyze multidimensional data. By storing text analysis
information in an OLAP system, a user can obtain solutions to inquiries in a
matter of seconds as opposed to minutes, hours, or even days. This analysis is
user-driven, allowing various users the freedom to pursue their own direction of
research.
The Symbiotic Relationship Between Information Retrieval and Informetrics
Informetrics and information retrieval (IR) represent fundamental areas of study within information science. Historically, researchers have not fully capitalized on the potential research synergies that exist between these two areas. Data sources used in traditional informetrics studies have their analogues in IR, with similar types of empirical regularities found in IR system content and use. Methods for data collection and analysis used in informetrics can help to inform IR system development and evaluation. Areas of application have included automatic indexing, index term weighting, and understanding user query and session patterns through the quantitative analysis of user transaction logs. Similarly, developments in database technology have made the study of informetric phenomena less cumbersome, and recent innovations used in IR research, such as language models and ranking algorithms, provide new tools that may be applied to research problems of interest to informetricians. Building on the author's previous work (Wolfram 2003), this paper reviews a sample of relevant literature published primarily since 2000 to highlight how each area of study may help to inform and benefit the other.
A Review of Theory and Practice in Scientometrics
Scientometrics is the study of the quantitative aspects of the process of science as a communication system. It is centrally, but not only, concerned with the analysis of citations in the academic literature. In recent years it has come to play a major role in the measurement and evaluation of research performance. In this review we consider: the historical development of scientometrics, sources of citation data, citation metrics and the "laws" of scientometrics, normalisation, journal impact factors and other journal metrics, visualising and mapping science, evaluation and policy, and future developments.
The Structure and Dynamics of Co-Citation Clusters: A Multiple-Perspective Co-Citation Analysis
A multiple-perspective co-citation analysis method is introduced for
characterizing and interpreting the structure and dynamics of co-citation
clusters. The method facilitates analytic and sense making tasks by integrating
network visualization, spectral clustering, automatic cluster labeling, and
text summarization. Co-citation networks are decomposed into co-citation
clusters. The interpretation of these clusters is augmented by automatic
cluster labeling and summarization. The method focuses on the interrelations
between a co-citation cluster's members and their citers. The generic method is
applied to a three-part analysis of the field of Information Science as defined
by 12 journals published between 1996 and 2008: 1) a comparative author
co-citation analysis (ACA), 2) a progressive ACA of a time series of
co-citation networks, and 3) a progressive document co-citation analysis (DCA).
Results show that the multiple-perspective method increases the
interpretability and accountability of both ACA and DCA networks.
Comment: 33 pages, 11 figures, 10 tables. To appear in the Journal of the
American Society for Information Science and Technology.
A Super-Dimension Approach in ROLAP Environments
Often the designer of ROLAP applications follows up with the question "can I create a little joiner table with just the two dimension keys and then connect that table to the fact table?" In a classic dimensional model there are two options: (a) both dimensions are modeled independently, or (b) the two dimensions are combined into a super-dimension with a single key. The second approach is not widely used in ROLAP environments, but it is an important sparsity-handling method in MOLAP systems. In ROLAP this design technique can also bring storage and performance benefits, although the model becomes more complicated. The dependency between dimensions is a key factor that designers have to consider when choosing between the two options. In this paper we present the results of our storage and performance experiments over real-life data cubes in reference to these design approaches. Some conclusions are drawn.