181 research outputs found
Robust Query Optimization Methods With Respect to Estimation Errors: A Survey
International audienceThe quality of a query execution plan chosen by a Cost-Based Optimizer (CBO) depends greatly on the estimation accuracy of input parameter values. Many research results have been produced on improving the estimation accuracy, but they do not work for every situation. Therefore, "robust query optimization" was introduced, in an effort to minimize the sub-optimality risk by accepting the fact that estimates could be inaccurate. In this survey, we aim to provide an overview of robust query optimization methods by classifying them into different categories, explaining the essential ideas, listing their advantages and limitations, and comparing them with multiple criteria
Data Management Challenges in Cloud Environments
Recently the cloud computing paradigm has been receiving special excitement and attention in the new researches. Cloud computing has the potential to change a large part of the IT activity, making software even more interesting as a service and shaping the way IT hardware is proposed and purchased. Developers with novel ideas for new Internet services no longer require the large capital outlays in hardware to present their service or the human expense to do it. These cloud applications apply large data centers and powerful servers that host Web applications and Web services. This report presents an overview of what cloud computing means, its history along with the advantages and disadvantages. In this paper we describe the problems and opportunities of deploying data management issues on these emerging cloud computing platforms. We study that large scale data analysis jobs, decision support systems, and application specific data marts are more likely to take benefit of cloud computing platforms than operational, transactional database systems.
 
Methods to Improve Applicability and Efficiency of Distributed Data-Centric Compute Frameworks
The success of modern applications depends on the insights they collect from their data repositories. Data repositories for such applications currently exceed exabytes and are rapidly increasing in size, as they collect data from varied sources - web applications, mobile phones, sensors and other connected devices. Distributed storage and data-centric compute frameworks have been invented to store and analyze these large datasets. This dissertation focuses on extending the applicability and improving the efficiency of distributed data-centric compute frameworks
Scalable and responsive real time event processing using cloud computing
PhD ThesisCloud computing provides the potential for scalability and adaptability in a cost e ective
manner. However, when it comes to achieving scalability for real time applications
response time cannot be high. Many applications require good performance and low
response time, which need to be matched with the dynamic resource allocation. The
real time processing requirements can also be characterized by unpredictable rates
of incoming data streams and dynamic outbursts of data. This raises the issue of
processing the data streams across multiple cloud computing nodes. This research
analyzes possible methodologies to process the real time data in which applications
can be structured as multiple event processing networks and be partitioned over the
set of available cloud nodes. The approach is based on queuing theory principles
to encompass the cloud computing. The transformation of the raw data into useful
outputs occurs in various stages of processing networks which are distributed across
the multiple computing nodes in a cloud. A set of valid options is created to understand
the response time requirements for each application. Under a given valid set of
conditions to meet the response time criteria, multiple instances of event processing
networks are distributed in the cloud nodes. A generic methodology to scale-up and
scale-down the event processing networks in accordance to the response time criteria
is de ned. The real time applications that support sophisticated decision support
mechanisms need to comply with response time criteria consisting of interdependent
data
ow paradigms making it harder to improve the performance. Consideration is
given for ways to reduce the latency,improve response time and throughput of the real
time applications by distributing the event processing networks in multiple computing
nodes
- …