426,635 research outputs found

    Online analytical processing (OLAP)

    Big Data mining is the ability to extract valuable information from large datasets or data streams. Big Data is characterized by the 5Vs: Volume, Variety, Velocity, Variability, and quality. The HACE theorem states that Big Data starts with heterogeneous, large-volume data from autonomous sources with distributed and decentralized control, and seeks to explore complex and evolving relationships among the data. A Big Data processing framework comprises three tiers: data accessing and computing (Tier I), data privacy and domain knowledge (Tier II), and Big Data mining algorithms (Tier III). Many tools exist for Big Data, including Apache Hadoop, Apache Pig, Cascading, Scribe, Apache HBase, Apache S4, Storm, Apache Mahout, MOA, R, Vowpal Wabbit, and GraphLab. This work aims to improve the scalability and response times of RDF query engines. The problem is relevant to the emergence of the Semantic Web, which remains a vision. We target SPARQL, an RDF query language that has been benchmarked by SP2Bench for performance and scalability. Our hypothesis is that a MapReduce model of parallelization yields a fast and scalable distributed SPARQL query engine that beats the benchmarks provided by SP2Bench. We briefly reviewed the existing literature to learn about the approaches used by researchers and industry. We extended ARQ, the SPARQL engine provided by the Jena framework, to use a distributed query processing approach based on the Hadoop framework, which provides an easy implementation of MapReduce. We discuss in detail the current Jena ARQ design and the design changes needed to make it distributed, and we explain the algorithm for Basic Graph Pattern matching using a MapReduce model. We present novel techniques for optimizing the RDF query engine, based on indexing and precomputation of joins. We evaluated our implementation and optimization techniques experimentally and analyzed the results by comparing them with the SP2Bench benchmarks.
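The abstract above mentions Basic Graph Pattern matching with a MapReduce model. The following is a minimal single-process sketch of that general idea, not the Jena ARQ or Hadoop implementation the authors describe. The toy triples, the two-pattern query, and every function name (`match`, `map_phase`, `shuffle`, `reduce_phase`) are assumptions made for illustration.

```python
from collections import defaultdict

# Toy RDF graph as (subject, predicate, object) triples; the data is made up.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("bob", "worksAt", "acme"),
    ("carol", "worksAt", "initech"),
]

# Basic graph pattern: ?x knows ?y . ?y worksAt ?z  (variables start with "?")
PATTERNS = [("?x", "knows", "?y"), ("?y", "worksAt", "?z")]
JOIN_VAR = "?y"  # the variable shared by both patterns


def match(pattern, triple):
    """Return variable bindings if the triple matches the pattern, else None."""
    bindings = {}
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if bindings.get(p, t) != t:
                return None
            bindings[p] = t
        elif p != t:
            return None
    return bindings


def map_phase(triples):
    """Emit (join key, (pattern index, bindings)) for every matching triple."""
    for triple in triples:
        for i, pattern in enumerate(PATTERNS):
            b = match(pattern, triple)
            if b is not None:
                yield b[JOIN_VAR], (i, b)


def shuffle(pairs):
    """Group mapper output by key, as a MapReduce framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Per key, join bindings of pattern 0 with bindings of pattern 1."""
    results = []
    for key in sorted(groups):
        left = [b for i, b in groups[key] if i == 0]
        right = [b for i, b in groups[key] if i == 1]
        for l in left:
            for r in right:
                results.append({**l, **r})
    return results


solutions = reduce_phase(shuffle(map_phase(TRIPLES)))
```

In this sketch the shared variable acts as the shuffle key, so each reducer only sees bindings that can actually join. A pattern with more than two triple patterns would likely need several such rounds.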

    ARRANGEMENT AND MODULATION OF ETL PROCESS IN THE STORAGE

    A data warehouse (DW) is the basis of systems for operational data analysis (OLAP, Online Analytical Processing). Data extracted from different sources is transformed and loaded into the DW. Proper organization of this process, called ETL (Extract, Transform, Load), is important for building the DW and for analytical data processing. This paper considers forms of organization, methods of implementation, and modeling of ETL processes.
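As a rough illustration of the extract, transform, and load stages named above, here is a small sketch using only the Python standard library. The CSV content, the `fact_orders` table, and the cleaning rules are hypothetical and are not taken from the paper.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice it would come from operational systems.
SOURCE_CSV = """order_id,customer,amount,currency
1, Alice ,10.50,usd
2,Bob,,usd
3,carol,7.25,USD
"""


def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without an amount, normalise names and currency."""
    clean = []
    for row in rows:
        if not row["amount"].strip():
            continue  # assumed rule: rows with a missing amount are rejected
        clean.append((
            int(row["order_id"]),
            row["customer"].strip().title(),
            float(row["amount"]),
            row["currency"].strip().upper(),
        ))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
loaded = conn.execute(
    "SELECT customer, amount FROM fact_orders ORDER BY order_id"
).fetchall()
```

Keeping the three stages as separate functions makes each one testable on its own, which is one common way to organize such a process.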

    Pembangunan Aplikasi Pelaporan Perijinan Dengan Online Analytical Processing (OLAP)

    Business licensing data collected in a district database becomes useful once it is analyzed, yielding a lot of important information. The government needs a system that helps it analyze the data easily. Online Analytical Processing (OLAP) is an implementation of data warehousing that supports reporting and analysis. OLAP maps data onto cube dimensions that can be easily compared, so decision makers can identify problems easily and quickly. The problems addressed in this research are: progress of company registration over several years, the number of companies by type of business, development of investment, the number of SIUP by district, the number of SIUP by business class, the number of licenses by type, Construction Permits (IMB) over several years, and the number of Disturbance Permits (HO) by several classifications. A licensing data reporting application was built with OLAP technology. Chi-Square tests on business licensing data containing 100 to 1161 records gave Chi-Square values of 45.89 to 80, greater than the Chi-Square table value (13.28), indicating a significant difference in time consumption between SQL and OLAP. When the data were reduced to 90 records, the Chi-Square value was 2.01, less than the table value, indicating no significant difference in time consumption between SQL and OLAP.
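The idea of mapping data onto cube dimensions can be sketched as follows. This is an illustrative toy only: the records, the dimension names, and the `build_cube` helper are assumptions, and the application in the paper may be organized differently.

```python
from collections import Counter
from itertools import combinations

# Made-up licence records; field names are illustrative only.
RECORDS = [
    {"year": 2010, "district": "North", "license": "SIUP"},
    {"year": 2010, "district": "South", "license": "IMB"},
    {"year": 2011, "district": "North", "license": "SIUP"},
    {"year": 2011, "district": "North", "license": "HO"},
]
DIMENSIONS = ("year", "district", "license")


def build_cube(records, dimensions):
    """Count records for every combination of dimensions (every cuboid)."""
    cube = {}
    for n in range(len(dimensions) + 1):
        for dims in combinations(dimensions, n):
            cube[dims] = Counter(tuple(r[d] for d in dims) for r in records)
    return cube


cube = build_cube(RECORDS, DIMENSIONS)
by_year = cube[("year",)]                       # roll-up to a single dimension
by_district_license = cube[("district", "license")]
```

Precomputing every cuboid trades memory for fast lookups, which is one plausible reason an OLAP report can answer faster than repeated ad hoc SQL aggregation on larger data.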

    Sequence Online Analytical Processing System

    A sequence online analytical processing (S-OLAP) system (50) for analysing an event database (41) storing events (12), the system (50) comprising: an S-OLAP engine (53) to compute an S-cuboid (49) for a query on the event database (41); a sequence query engine (54) to form part of the S-cuboid (49) by performing the steps of selection, clustering, sequence formation and sequence grouping; a cuboid repository (52) to store computed S-cuboids (49) and to be searched by the S-OLAP engine (53) for an S-cuboid query to determine whether an S-cuboid has previously been computed; and a sequence cache (56) to cache constructed sequence groups.
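One possible reading of the four steps performed by the sequence query engine (selection, clustering, sequence formation, sequence grouping) is sketched below. The event layout, the choice of clustering key, and the `s_cuboid` function are assumptions for illustration, not the claimed system.

```python
from collections import Counter, defaultdict

# Made-up event database: (card_id, timestamp, station, kind)
EVENTS = [
    ("c1", 1, "A", "in"), ("c1", 2, "B", "out"),
    ("c2", 1, "A", "in"), ("c2", 3, "B", "out"),
    ("c3", 2, "B", "in"), ("c3", 5, "A", "out"),
    ("c3", 9, "A", "refund"),
]


def s_cuboid(events):
    """Count how many cards share each ordered sequence of stations."""
    # 1. Selection: keep only the events relevant to the query.
    selected = [e for e in events if e[3] in ("in", "out")]
    # 2. Clustering: put events of the same card together.
    clusters = defaultdict(list)
    for card, ts, station, _kind in selected:
        clusters[card].append((ts, station))
    # 3. Sequence formation: order each cluster by timestamp.
    sequences = {
        card: tuple(station for _ts, station in sorted(evs))
        for card, evs in clusters.items()
    }
    # 4. Sequence grouping: group identical sequences and aggregate (count).
    return Counter(sequences.values())


result = s_cuboid(EVENTS)
```

A cuboid repository or sequence cache, as described in the abstract, could then be approximated by memoizing `s_cuboid` results per query.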

    The Pole Behaviour of the Phase Derivative of the Short-Time Fourier Transform

    The short-time Fourier transform (STFT) is a time-frequency representation widely used in applications, for example in audio signal processing. Recently it has been shown that not only the amplitude, but also the phase of this representation can be successfully exploited for improved analysis and processing. In this paper we describe a rather peculiar pole phenomenon in the phase derivative, a recurring pattern that appears in a characteristic way in the neighborhood around any of the zeros of the STFT, a negative peak followed by a positive one. We describe this phenomenon numerically and provide a complete analytical explanation. Comment: 15 pages, 4 figures; Applied and Computational Harmonic Analysis (in press), available online 22 October 201

    Online Updating of Statistical Inference in the Big Data Setting

    We present statistical methods for big data arising from online analytical processing, where large amounts of data arrive in streams and require fast analysis without storage/access to the historical data. In particular, we develop iterative estimating algorithms and statistical inferences for linear models and estimating equations that update as new data arrive. These algorithms are computationally efficient, minimally storage-intensive, and allow for possible rank deficiencies in the subset design matrices due to rare-event covariates. Within the linear model setting, the proposed online-updating framework leads to predictive residual tests that can be used to assess the goodness-of-fit of the hypothesized model. We also propose a new online-updating estimator under the estimating equation setting. Theoretical properties of the goodness-of-fit tests and proposed estimators are examined in detail. In simulation studies and real data applications, our estimator compares favorably with competing approaches under the estimating equation setting. Comment: Submitted to Technometrics
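To make the online-updating idea concrete, the sketch below fits a one-predictor linear model from blocks of streaming data while storing only running sums. It is a textbook sufficient-statistics update under simplifying assumptions, not the authors' estimator: it does not handle rank-deficient blocks or estimating equations, and the class name `OnlineSimpleRegression` is hypothetical.

```python
class OnlineSimpleRegression:
    """Fit y = a + b*x from a stream, keeping only running sums."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def update(self, xs, ys):
        """Absorb one block of data; the block can be discarded afterwards."""
        for x, y in zip(xs, ys):
            self.n += 1
            self.sx += x
            self.sy += y
            self.sxx += x * x
            self.sxy += x * y

    def coefficients(self):
        """Return (intercept, slope) from the accumulated sums."""
        denom = self.n * self.sxx - self.sx ** 2
        if denom == 0:
            raise ValueError("need at least two distinct x values")
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return intercept, slope


model = OnlineSimpleRegression()
model.update([0.0, 1.0], [1.0, 3.0])   # first block arrives
model.update([2.0, 3.0], [5.0, 7.0])   # second block arrives
intercept, slope = model.coefficients()
```

Storage stays constant regardless of how many blocks arrive, which is the property the abstract refers to as minimally storage-intensive.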

    An online analytical processing multi-dimensional data warehouse for malaria data

    Malaria is a vector-borne disease that contributes substantially to the global burden of morbidity and mortality. The management of malaria-related data from heterogeneous, autonomous, and distributed data sources poses unique challenges and requirements. Although online data storage systems exist that address specific malaria-related issues, a globally integrated online resource to address different aspects of the disease does not exist. In this article, we describe the design, implementation, and applications of a multidimensional, online analytical processing data warehouse, named the VecNet Data Warehouse (VecNet-DW). It is the first online, globally-integrated platform that provides efficient search, retrieval and visualization of historical, predictive, and static malaria-related data, organized in data marts. Historical and static data are modelled using star schemas, while predictive data are modelled using a snowflake schema. The major goals, characteristics, and components of the DW are described along with its data taxonomy and ontology, the external data storage systems and the logical modelling and physical design phases. Results are presented as screenshots of a Dimensional Data browser, a Lookup Tables browser, and a Results Viewer interface. The power of the DW emerges from integrated querying of the different data marts and structuring those queries to the desired dimensions, enabling users to search, view, analyse, and store large volumes of aggregated data, and to respond better to the increasing demands of users.
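For readers unfamiliar with star schemas, the following sketch shows the general shape: one fact table referencing small dimension tables, queried by joining and aggregating. The table names, columns, and all numbers are invented for illustration and do not reflect the actual VecNet-DW schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the axes of analysis (hypothetical names).
CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER);
-- The fact table holds measures keyed by the dimensions.
CREATE TABLE fact_cases (
    location_id INTEGER REFERENCES dim_location(location_id),
    time_id INTEGER REFERENCES dim_time(time_id),
    cases INTEGER
);
-- Made-up rows, for demonstration only.
INSERT INTO dim_location VALUES (1, 'Kenya'), (2, 'Ghana');
INSERT INTO dim_time VALUES (1, 2010), (2, 2011);
INSERT INTO fact_cases VALUES (1, 1, 100), (1, 2, 80), (2, 1, 50), (2, 2, 70);
""")

# Aggregate the measure along one dimension by joining fact and dimension.
rows = conn.execute("""
    SELECT l.country, SUM(f.cases)
    FROM fact_cases AS f
    JOIN dim_location AS l ON l.location_id = f.location_id
    GROUP BY l.country
    ORDER BY l.country
""").fetchall()
```

A snowflake schema would differ by further normalizing the dimension tables, for example splitting `dim_location` into separate country and region tables.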