165 research outputs found

Tupleware: Redefining Modern Analytics

Author: Cetintemel Ugur, Crotty Andrew, Dursun Kayhan, Galakatos Alex, Kraska Tim, Zdonik Stan
Publication date: 30/07/2014

There is a fundamental discrepancy between the targeted and actual users of current analytics frameworks. Most systems are designed for the data and infrastructure of the Googles and Facebooks of the world: petabytes of data distributed across large cloud deployments consisting of thousands of cheap commodity machines. Yet, the vast majority of users operate clusters ranging from a few to a few dozen nodes, analyze relatively small datasets of up to a few terabytes, and perform primarily compute-intensive operations. Targeting these users fundamentally changes the way we should build analytics systems. This paper describes the design of Tupleware, a new system specifically aimed at the challenges faced by the typical user. Tupleware's architecture brings together ideas from the database, compiler, and programming languages communities to create a powerful end-to-end solution for data analysis. We propose novel techniques that consider the data, computations, and hardware together to achieve maximum performance on a case-by-case basis. Our experimental evaluation quantifies the impact of our novel techniques and shows orders of magnitude performance improvement over alternative systems.

arXiv.org e-Print Archive

CiteSeerX

Bayesian Classifiers Programmed In SQL Using PCA

Author: Dr. K.Venkat Nagarjuna
Publication venue: Global Journals Inc. (US)
Publication date: 15/03/2012

The Bayesian classifier is a fundamental classification technique. We also consider different concepts regarding dimensionality reduction techniques for retrieving lossless data. In this paper we propose a new architecture for pre-processing the data. Here we improve our Bayesian classifier to produce more accurate models with skewed distributions, data sets with missing information, and subsets of points having significant overlap with each other, which are known issues for clustering algorithms. We are therefore interested in combining a dimensionality reduction technique such as PCA with Bayesian classifiers to accelerate computations and evaluate complex mathematical equations. The proposed architecture in this project contains the following stages: pre-processing of input data, Naïve Bayesian classifier, Bayesian classifier, principal component analysis, and database. Principal Component Analysis (PCA) is the process of reducing components by calculating eigenvalues and eigenvectors. We consider two algorithms in this paper: Bayesian Classifier based on KMeans (BKM) and Naïve Bayesian Classifier Algorithm N

Global Journal of Computer Science and Technology (GJCST)
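
The abstract above describes PCA as reducing components by calculating eigenvalues and eigenvectors. A minimal sketch of that idea follows, written in plain Python rather than the SQL the paper targets. It uses power iteration to find only the leading eigenpair of the covariance matrix, which is a substitute chosen here for brevity and is not claimed to be the paper's method. The function names (`covariance`, `leading_eigenpair`, `project`) are hypothetical.

```python
import math


def covariance(rows):
    """Return the sample covariance matrix and the mean-centered rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    return cov, centered


def mat_vec(matrix, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in matrix]


def leading_eigenpair(matrix, iterations=200):
    """Approximate the largest eigenvalue and its unit eigenvector.

    Power iteration: repeatedly multiply and renormalize. Assumes the
    start vector is not orthogonal to the leading eigenvector.
    """
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # matrix maps v to zero; nothing to refine
            break
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v


def project(centered, v):
    """Reduce each centered row to one score along direction v."""
    return [sum(a * b for a, b in zip(row, v)) for row in centered]


if __name__ == "__main__":
    data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
    cov, centered = covariance(data)
    value, vector = leading_eigenpair(cov)
    print(value, vector, project(centered, vector))
```

In this sketch the two-dimensional points are reduced to one score each along the direction of greatest variance. A classifier could then be trained on those scores instead of the original columns.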

Type Ahead Search in Database using SQL

Author: Salunke Shrikant Dadasaheb, Prof. Bere Sachin Sukhadeo
Publication venue: Auricle Technologies, Pvt., Ltd.
Publication date: 26/02/2015

A type-ahead search system computes answers on the fly as a user types in a keyword query character by character. We study how to support type-ahead search on data in a relational DBMS, focusing on how to support this type of search using SQL. A prominent challenge is how to leverage existing database functionalities to meet the high performance requirement of an interactive speed. We extend the approach to the case of fuzzy queries and suggest various techniques to improve query performance. We suggest an incremental computation method to answer multi-keyword queries, and examine how to support first-N queries and incremental updates. Our experimental results on large, real data sets show that the proposed techniques can enable DBMS systems to support search-as-you-type on large tables. DOI: 10.17762/ijritcc2321-8169.15024

International Journal on Recent and Innovation Trends in Computing and Communication
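
The abstract above describes answering a keyword query character by character using SQL. A minimal sketch of the exact-prefix case follows, using Python's built-in `sqlite3` module and a `LIKE` predicate. The `docs` table, its columns, and the helper names are hypothetical. The fuzzy matching, incremental computation, and multi-keyword techniques mentioned in the abstract are not reproduced here.

```python
import sqlite3


def build_index(titles):
    """Load titles into an in-memory table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany("INSERT INTO docs (title) VALUES (?)", [(t,) for t in titles])
    return conn


def type_ahead(conn, partial, limit=5):
    """Return up to `limit` titles containing a word that starts with `partial`."""
    # Escape LIKE wildcards so the typed text is matched literally.
    escaped = (
        partial.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    cur = conn.execute(
        "SELECT title FROM docs "
        "WHERE title LIKE ? ESCAPE '\\' "      # title begins with the prefix
        "   OR title LIKE ? ESCAPE '\\' "      # or a later word begins with it
        "ORDER BY title LIMIT ?",
        (escaped + "%", "% " + escaped + "%", limit),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    conn = build_index(
        ["Type Ahead Search", "Bayesian Classifiers", "Tupleware", "Search in SQL"]
    )
    # Simulate a user typing "sea" one character at a time.
    for typed in ("s", "se", "sea"):
        print(typed, type_ahead(conn, typed))
```

Each keystroke issues a fresh query in this sketch. An interactive system would more likely reuse the previous keystroke's results, which is the role the abstract assigns to incremental computation.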