
    Data Cube Approximation and Mining using Probabilistic Modeling

    On-line Analytical Processing (OLAP) techniques commonly used in data warehouses allow the exploration of data cubes along different analysis axes (dimensions) and at different abstraction levels in a dimension hierarchy. However, such techniques are not aimed at mining multidimensional data. Since data cubes are nothing but multi-way tables, we propose to analyze the potential of two probabilistic modeling techniques, namely non-negative multi-way array factorization and log-linear modeling, with the ultimate objective of compressing and mining aggregate and multidimensional values. With the first technique, we compute the set of components that best fit the initial data set and whose superposition coincides with the original data; with the second technique, we identify a parsimonious model (i.e., one with a reduced set of parameters), highlight strong associations among dimensions, and discover possible outliers in data cells. A real-life example is used to (i) discuss the potential benefits of the modeling output for cube exploration and mining, (ii) show how OLAP queries can be answered approximately, and (iii) illustrate the strengths and limitations of these modeling approaches.
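As a rough illustration of the log-linear idea, the sketch below fits the simplest such model, the independence model, to a two-way table and reports Pearson residuals, which flag cells that deviate from the fit. It is not the method of the paper above: the function name `independence_fit` and the sample counts are hypothetical, and a real data cube would have more dimensions and possibly interaction terms.

```python
from math import sqrt

def independence_fit(table):
    """Fit the independence log-linear model to a 2-way table of counts.

    Returns (expected, residuals), where expected[i][j] is
    row_total[i] * col_total[j] / grand_total and residuals holds the
    Pearson residuals (observed - expected) / sqrt(expected).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    residuals = [
        [(obs - exp) / sqrt(exp) if exp > 0 else 0.0
         for obs, exp in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(table, expected)
    ]
    return expected, residuals

# Hypothetical 2x2 slice of a cube (e.g. region x product counts).
expected, residuals = independence_fit([[10, 20], [30, 40]])
```

The expected table needs only the row and column totals, which is the sense in which a parsimonious model compresses the cube; cells with large absolute residuals are candidates for outliers or strong associations.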

    Techniques of Translating Thesis Abstracts of Economics Department Students in Medan State University

    The study deals with translation techniques in thesis abstracts of the Economics Department. The objectives of the study were to identify the types of translation techniques, to find the most dominant type used, and to describe the reasons for the techniques used in translating thesis abstracts. The study used a descriptive qualitative design. Nazir (1998: 34) states that descriptive qualitative research is a method that makes the description of situations, events, or occurrences clearer. It is understood as a method that provides a description of situations, events, or occurrences, so it is intended to accumulate basic data. Qualitative research involves analysis of data such as words and phrases written in abstracts. The data were taken from twenty translated thesis abstracts of the Economics Department. The findings show that eight of eighteen techniques were used in the thesis abstracts. The most dominant technique was established equivalent, owing to the translators' intention to avoid misunderstanding by using dictionaries and particular equivalents known in the target language. It is recommended that, in any translation, the most essential thing is to keep the meaning or message of the source language the same when it is translated into the target language.

    Speeding-up $q$-gram mining on grammar-based compressed texts

    We present an efficient algorithm for calculating $q$-gram frequencies on strings represented in compressed form, namely, as a straight line program (SLP). Given an SLP $\mathcal{T}$ of size $n$ that represents string $T$, the algorithm computes the occurrence frequencies of all $q$-grams in $T$ by reducing the problem to the weighted $q$-gram frequencies problem on a trie-like structure of size $m = |T| - \mathit{dup}(q,\mathcal{T})$, where $\mathit{dup}(q,\mathcal{T})$ is a quantity that represents the amount of redundancy that the SLP captures with respect to $q$-grams. The reduced problem can be solved in linear time. Since $m = O(qn)$, the running time of our algorithm is $O(\min\{|T| - \mathit{dup}(q,\mathcal{T}),\, qn\})$, improving our previous $O(qn)$ algorithm when $q = \Omega(|T|/n)$.
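To make the setting concrete, here is a minimal sketch of counting $q$-grams directly on an SLP without decompressing it. It follows the general boundary-crossing idea (each variable contributes the $q$-grams that straddle its two children, weighted by how often the variable occurs in the derivation tree) and is not the trie-based algorithm described in the abstract. The rule encoding, a list whose entries are either a single character or a pair of earlier indices, and the names `slp_qgram_counts` and `expand` are assumptions made for this example.

```python
from collections import Counter

def slp_qgram_counts(rules, q):
    """Count q-grams of the string derived by the last rule of an SLP.

    rules[i] is a 1-character str (terminal) or a tuple (l, r) with
    l, r < i, meaning X_i -> X_l X_r. Hypothetical encoding.
    """
    k, n = q - 1, len(rules)
    pre, suf = [""] * n, [""] * n  # first / last min(k, |X_i|) characters
    for i, rule in enumerate(rules):
        if isinstance(rule, str):
            pre[i] = suf[i] = rule if k > 0 else ""
        else:
            l, r = rule
            pre[i] = (pre[l] + pre[r])[:k]
            suf[i] = (suf[l] + suf[r])[-k:] if k > 0 else ""
    # vocc[i]: number of nodes labelled X_i in the derivation tree.
    vocc = [0] * n
    vocc[-1] = 1
    for i in range(n - 1, -1, -1):
        if not isinstance(rules[i], str):
            l, r = rules[i]
            vocc[l] += vocc[i]
            vocc[r] += vocc[i]
    counts = Counter()
    for i, rule in enumerate(rules):
        if vocc[i] == 0:
            continue  # rule unreachable from the start variable
        if isinstance(rule, str):
            if q == 1:
                counts[rule] += vocc[i]
        elif q > 1:
            l, r = rule
            w = suf[l] + pre[r]  # every q-gram of w crosses the boundary
            for j in range(len(w) - q + 1):
                counts[w[j:j + q]] += vocc[i]
    return counts

def expand(rules, i=-1):
    """Decompress variable i (default: the last rule); for checking only."""
    rule = rules[i]
    return rule if isinstance(rule, str) else expand(rules, rule[0]) + expand(rules, rule[1])

# Hypothetical SLP deriving "abaababa".
rules = ["a", "b", (0, 1), (2, 0), (3, 2), (4, 3)]
bigrams = slp_qgram_counts(rules, 2)
```

Each variable is inspected once and handles a window of at most $2(q-1)$ characters, which matches the $O(qn)$ flavor of bound mentioned above; `expand` is included only to compare against a naive count on the decompressed text.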