17,943 research outputs found
Interactive visual exploration of a large spatio-temporal dataset: Reflections on a geovisualization mashup
Exploratory visual analysis is useful for the preliminary investigation of large structured, multifaceted spatio-temporal datasets. This process requires the selection and aggregation of records by time, space and attribute, the ability to transform data and the flexibility to apply appropriate visual encodings and interactions. We propose an approach inspired by geographical 'mashups' in which freely-available functionality and data are loosely but flexibly combined using de facto exchange standards. Our case study combines MySQL, PHP and the LandSerf GIS to allow Google Earth to be used for visual synthesis and interaction with encodings described in KML. This approach is applied to the exploration of a log of 1.42 million requests made of a mobile directory service. Novel combinations of interaction and visual encoding are developed including spatial 'tag clouds', 'tag maps', 'data dials' and multi-scale density surfaces. Four aspects of the approach are informally evaluated: the visual encodings employed, their success in the visual exploration of the clataset, the specific tools used and the 'rnashup' approach. Preliminary findings will be beneficial to others considering using mashups for visualization. The specific techniques developed may be more widely applied to offer insights into the structure of multifarious spatio-temporal data of the type explored here
Many-Task Computing and Blue Waters
This report discusses many-task computing (MTC) generically and in the
context of the proposed Blue Waters systems, which is planned to be the largest
NSF-funded supercomputer when it begins production use in 2012. The aim of this
report is to inform the BW project about MTC, including understanding aspects
of MTC applications that can be used to characterize the domain and
understanding the implications of these aspects to middleware and policies.
Many MTC applications do not neatly fit the stereotypes of high-performance
computing (HPC) or high-throughput computing (HTC) applications. Like HTC
applications, by definition MTC applications are structured as graphs of
discrete tasks, with explicit input and output dependencies forming the graph
edges. However, MTC applications have significant features that distinguish
them from typical HTC applications. In particular, different engineering
constraints for hardware and software must be met in order to support these
applications. HTC applications have traditionally run on platforms such as
grids and clusters, through either workflow systems or parallel programming
systems. MTC applications, in contrast, will often demand a short time to
solution, may be communication intensive or data intensive, and may comprise
very short tasks. Therefore, hardware and software for MTC must be engineered
to support the additional communication and I/O and must minimize task dispatch
overheads. The hardware of large-scale HPC systems, with its high degree of
parallelism and support for intensive communication, is well suited for MTC
applications. However, HPC systems often lack a dynamic resource-provisioning
feature, are not ideal for task communication via the file system, and have an
I/O system that is not optimized for MTC-style applications. Hence, additional
software support is likely to be required to gain full benefit from the HPC
hardware
Measuring and Managing Answer Quality for Online Data-Intensive Services
Online data-intensive services parallelize query execution across distributed
software components. Interactive response time is a priority, so online query
executions return answers without waiting for slow running components to
finish. However, data from these slow components could lead to better answers.
We propose Ubora, an approach to measure the effect of slow running components
on the quality of answers. Ubora randomly samples online queries and executes
them twice. The first execution elides data from slow components and provides
fast online answers; the second execution waits for all components to complete.
Ubora uses memoization to speed up mature executions by replaying network
messages exchanged between components. Our systems-level implementation works
for a wide range of platforms, including Hadoop/Yarn, Apache Lucene, the
EasyRec Recommendation Engine, and the OpenEphyra question answering system.
Ubora computes answer quality much faster than competing approaches that do not
use memoization. With Ubora, we show that answer quality can and should be used
to guide online admission control. Our adaptive controller processed 37% more
queries than a competing controller guided by the rate of timeouts.Comment: Technical Repor
Implementation of an efficient Fuzzy Logic based Information Retrieval System
This paper exemplifies the implementation of an efficient Information
Retrieval (IR) System to compute the similarity between a dataset and a query
using Fuzzy Logic. TREC dataset has been used for the same purpose. The dataset
is parsed to generate keywords index which is used for the similarity
comparison with the user query. Each query is assigned a score value based on
its fuzzy similarity with the index keywords. The relevant documents are
retrieved based on the score value. The performance and accuracy of the
proposed fuzzy similarity model is compared with Cosine similarity model using
Precision-Recall curves. The results prove the dominance of Fuzzy Similarity
based IR system.Comment: arXiv admin note: substantial text overlap with
http://ntz-develop.blogspot.in/ ,
http://www.micsymposium.org/mics2012/submissions/mics2012_submission_8.pdf ,
http://www.slideshare.net/JeffreyStricklandPhD/predictive-modeling-and-analytics-selectchapters-41304405
by other author
Collaborative e-science architecture for Reaction Kinetics research community
This paper presents a novel collaborative e-science architecture (CeSA) to address two challenging issues in e-science that arise from the management of heterogeneous distributed environments: (i) how to provide individual scientists an integrated environment to collaborate with each other in distributed, loosely coupled research communities where each member might be using a disparate range of tools; and (ii) how to provide easy access to a range of computationally intensive resources from a desktop. The Reaction Kinetics research community was used to capture the requirements and in the evaluation of the proposed architecture. The result demonstrated the feasibility of the approach and the potential benefits of the CeSA
A Case for Cooperative and Incentive-Based Coupling of Distributed Clusters
Research interest in Grid computing has grown significantly over the past
five years. Management of distributed resources is one of the key issues in
Grid computing. Central to management of resources is the effectiveness of
resource allocation as it determines the overall utility of the system. The
current approaches to superscheduling in a grid environment are non-coordinated
since application level schedulers or brokers make scheduling decisions
independently of the others in the system. Clearly, this can exacerbate the
load sharing and utilization problems of distributed resources due to
suboptimal schedules that are likely to occur. To overcome these limitations,
we propose a mechanism for coordinated sharing of distributed clusters based on
computational economy. The resulting environment, called
\emph{Grid-Federation}, allows the transparent use of resources from the
federation when local resources are insufficient to meet its users'
requirements. The use of computational economy methodology in coordinating
resource allocation not only facilitates the QoS based scheduling, but also
enhances utility delivered by resources.Comment: 22 pages, extended version of the conference paper published at IEEE
Cluster'05, Boston, M
Ontological Matchmaking in Recommender Systems
The electronic marketplace offers great potential for the recommendation of
supplies. In the so called recommender systems, it is crucial to apply
matchmaking strategies that faithfully satisfy the predicates specified in the
demand, and take into account as much as possible the user preferences. We
focus on real-life ontology-driven matchmaking scenarios and identify a number
of challenges, being inspired by such scenarios. A key challenge is that of
presenting the results to the users in an understandable and clear-cut fashion
in order to facilitate the analysis of the results. Indeed, such scenarios
evoke the opportunity to rank and group the results according to specific
criteria. A further challenge consists of presenting the results to the user in
an asynchronous fashion, i.e. the 'push' mode, along with the 'pull' mode, in
which the user explicitly issues a query, and displays the results. Moreover,
an important issue to consider in real-life cases is the possibility of
submitting a query to multiple providers, and collecting the various results.
We have designed and implemented an ontology-based matchmaking system that
suitably addresses the above challenges. We have conducted a comprehensive
experimental study, in order to investigate the usability of the system, the
performance and the effectiveness of the matchmaking strategies with real
ontological datasets.Comment: 28 pages, 8 figure
Conceptual information processing: A robust approach to KBS-DBMS integration
Integrating the respective functionality and architectural features of knowledge base and data base management systems is a topic of considerable interest. Several aspects of this topic and associated issues are addressed. The significance of integration and the problems associated with accomplishing that integration are discussed. The shortcomings of current approaches to integration and the need to fuse the capabilities of both knowledge base and data base management systems motivates the investigation of information processing paradigms. One such paradigm is concept based processing, i.e., processing based on concepts and conceptual relations. An approach to robust knowledge and data base system integration is discussed by addressing progress made in the development of an experimental model for conceptual information processing
- …