Search CORE

447 research outputs found

Regrouping metric-space search index for search engine size adaptation

Author: AN Papadopoulos
D Novak
E Chávez
M Marin
UV Catalyurek
V Gil-Costa
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/10/2015

This work contributes to the development of search engines that self-adapt their size in response to fluctuations in workload. Deploying a search engine in an Infrastructure as a Service (IaaS) cloud facilitates allocating or deallocating computational resources to or from the engine. In this paper, we focus on the problem of regrouping the metric-space search index when the number of virtual machines used to run the search engine is modified to reflect changes in workload. We propose an algorithm for incrementally adjusting the index to fit the varying number of virtual machines. We tested its performance using a custom-built prototype search engine deployed in the Amazon EC2 cloud, while calibrating the results to compensate for the performance fluctuations of the platform. Our experiments show that, when compared with computing the index from scratch, the incremental algorithm speeds up the index computation 2–10 times while maintaining a similar search performance.

Crossref

Aston Publications Explorer

Self-adapting parallel metric-space search engine for variable query loads

Author: Al Ruqeishi Khalil

This research focuses on automatically adapting the size of a search engine in response to fluctuations in query workload. Deploying a search engine in an Infrastructure as a Service (IaaS) cloud facilitates allocating or deallocating computer resources to or from the engine. Our solution is an adaptive search engine that repeatedly re-evaluates its load and, when appropriate, switches over to a different number of active processors. We break the work into three sub-problems: Continually determining the Number of Processors (CNP), the New Grouping Problem (NGP) and the Regrouping Order Problem (ROP). CNP is the problem of determining, in light of changes in the query workload, the ideal number of processors p to keep active in the search engine at any given time. NGP arises once a change in the number of processors has been decided: it must then be determined which groups of search data will be distributed across the processors. ROP is the problem of how to redistribute this data onto the processors while keeping the engine responsive and minimising the switchover time and the incurred network load. We propose solutions for these sub-problems. For NGP we propose an algorithm for incrementally adjusting the index to fit the varying number of virtual machines. For ROP we present an efficient method for redistributing data among processors while keeping the search engine responsive. For CNP we propose an algorithm that determines the new size of the search engine by re-evaluating its load. We tested the performance of the solution using a custom-built prototype search engine deployed in the Amazon EC2 cloud. Our experiments show that, compared with computing the index from scratch, the incremental NGP algorithm speeds up the index computation 2–10 times while maintaining a similar search performance.
The chosen redistribution method is 25% to 50% faster than other methods and reduces the network load by around 30%. For CNP we present a deterministic algorithm that shows a good ability to determine a new size for the search engine. When combined, these algorithms give an adaptive algorithm that is able to adjust the search engine size to a variable workload.

Aston Publications Explorer

Indexing Metric Spaces for Exact Similarity Search

Author: Chen Lu
Gao Yunjun
Jensen Christian S.
Li Zheng
Miao Xiaoye
Song Xuan
Zhu Yifan
Publication date: 07/05/2020

With the continued digitalization of societal processes, we are seeing an explosion in available data. This is referred to as big data. In a research setting, three aspects of the data are often viewed as the main sources of challenges when attempting to enable value creation from big data: volume, velocity and variety. Many studies address volume or velocity, while far fewer studies concern variety. Metric space is ideal for addressing variety because it can accommodate any type of data as long as its associated distance notion satisfies the triangle inequality. To accelerate search in metric space, a collection of indexing techniques for metric data has been proposed. However, existing surveys each offer only narrow coverage, and no comprehensive empirical study of those techniques exists. We offer a survey of all the existing metric indexes that can support exact similarity search, by i) summarizing all the existing partitioning, pruning and validation techniques used for metric indexes, ii) providing a time and storage complexity analysis of index construction, and iii) reporting on a comprehensive empirical comparison of their similarity query processing performance. Here, empirical comparisons are used to evaluate index performance during search, because the complexity analysis reveals little difference in similarity query processing and because query performance depends on the pruning and validation abilities, which are related to the data distribution. This article aims at revealing the strengths and weaknesses of different indexing techniques in order to offer guidance on selecting an appropriate indexing technique for a given setting, and at directing future research on metric indexes.
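The pruning idea mentioned in this abstract rests on the triangle inequality. The sketch below is only an illustration of that general principle, not any specific index from the survey: it assumes a single pivot and a linear scan, and the names `build_pivot_table` and `range_search` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")
Distance = Callable[[T, T], float]


def build_pivot_table(data: Sequence[T], pivot: T, dist: Distance) -> List[float]:
    """Precompute the distance from every object to one pivot (assumed setup)."""
    return [dist(obj, pivot) for obj in data]


def range_search(
    data: Sequence[T],
    pivot_dists: Sequence[float],
    pivot: T,
    dist: Distance,
    query: T,
    radius: float,
) -> Tuple[List[T], int]:
    """Return objects within `radius` of `query` and the number of distance
    computations performed at query time."""
    d_qp = dist(query, pivot)
    computed = 1  # the query-to-pivot distance
    results: List[T] = []
    for obj, d_op in zip(data, pivot_dists):
        # Triangle inequality gives the lower bound |d(q,p) - d(o,p)| <= d(q,o).
        # If the bound already exceeds the radius, the object cannot qualify.
        if abs(d_qp - d_op) > radius:
            continue
        computed += 1
        if dist(query, obj) <= radius:
            results.append(obj)
    return results, computed


if __name__ == "__main__":
    points = [0.0, 1.0, 5.0, 9.0, 10.0]
    metric = lambda a, b: abs(a - b)
    table = build_pivot_table(points, 0.0, metric)
    print(range_search(points, table, 0.0, metric, 5.5, 1.0))
```

In this toy run only one of the five stored objects needs an exact distance computation; real metric indexes typically combine several pivots or partitioning structures to tighten the bound.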

arXiv.org e-Print Archive

VBN

NASA Tech Briefs, September 2008


Topics covered include: Nanotip Carpets as Antireflection Surfaces; Nano-Engineered Catalysts for Direct Methanol Fuel Cells; Capillography of Mats of Nanofibers; Directed Growth of Carbon Nanotubes Across Gaps; High-Voltage, Asymmetric-Waveform Generator; Magic-T Junction Using Microstrip/Slotline Transitions; On-Wafer Measurement of a Silicon-Based CMOS VCO at 324 GHz; Group-III Nitride Field Emitters; HEMT Amplifiers and Equipment for their On-Wafer Testing; Thermal Spray Formation of Polymer Coatings; Improved Gas Filling and Sealing of an HC-PCF; Making More-Complex Molecules Using Superthermal Atom/Molecule Collisions; Nematic Cells for Digital Light Deflection; Improved Silica Aerogel Composite Materials; Microgravity, Mesh-Crawling Legged Robots; Advanced Active-Magnetic-Bearing Thrust-Measurement System; Thermally Actuated Hydraulic Pumps; A New, Highly Improved Two-Cycle Engine; Flexible Structural-Health-Monitoring Sheets; Alignment Pins for Assembling and Disassembling Structures; Purifying Nucleic Acids from Samples of Extremely Low Biomass; Adjustable-Viewing-Angle Endoscopic Tool for Skull Base and Brain Surgery; UV-Resistant Non-Spore-Forming Bacteria From Spacecraft-Assembly Facilities; Hard-X-Ray/Soft-Gamma-Ray Imaging Sensor Assembly for Astronomy; Simplified Modeling of Oxidation of Hydrocarbons; Near-Field Spectroscopy with Nanoparticles Deposited by AFM; Light Collimator and Monitor for a Spectroradiometer; Hyperspectral Fluorescence and Reflectance Imaging Instrument; Improving the Optical Quality Factor of the WGM Resonator; Ultra-Stable Beacon Source for Laboratory Testing of Optical Tracking; Transmissive Diffractive Optical Element Solar Concentrators; Delaying Trains of Short Light Pulses in WGM Resonators; Toward Better Modeling of Supercritical Turbulent Mixing; JPEG 2000 Encoding with Perceptual Distortion Control; Intelligent Integrated Health Management for a System of Systems; Delay Banking for Managing Air Traffic; and Spline-Based Smoothing of Airfoil Curvatures.

NASA Technical Reports Server

Anisotropic Adaptation on Unstructured Grids

Author: Xia Guoping
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/05/2003

The efficient representation of the highly directional features in a flow field with adapted anisotropic grids forms the focus of the analysis. Anisotropic adaptation is more effective than isotropic adaptation and requires more degrees of freedom from the mesh, which also demands the use of unstructured grids in the adaptation. The size and orientation of an anisotropic element require a matrix-like local feature indicator. The Hessian, a matrix composed of the second derivatives of an appropriate flow variable, is defined and used as a feature indicator in the adaptation. The Hessian provides a metric that defines the length of an edge, and the lengths of all edges are equal in the optimized mesh. The techniques to minimize the differences among edge lengths are discussed, and those chosen include node enrichment, node removal, edge swapping and point smoothing. The results indicate that the mesh in which the edge lengths are equalized is not correct for three major flow features one frequently encounters. The inflections existing near the wall in a boundary layer result in coarse grids there. A “wall” Hessian is defined to replace the second derivatives and give a more appropriate spacing for high Reynolds number flow modeling. Difficulties in the adaptation of discontinuities are addressed. Remedies proposed are to limit the minimum physical edge length and to smooth the Hessian such that the discontinuity refinement encompasses more layers of elements. A methodology to refine the discontinuity equally is also proposed. The invalidity of the Hessian in a free stream is corrected to give a reasonable grid size in that region. The concepts involved in the extension of the length-based approach to three dimensions are addressed. The differences and difficulties in three-dimensional adaptation are discussed.
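The notion of an edge length defined by a Hessian-derived metric can be illustrated with a small two-dimensional sketch. It assumes the common construction in which the metric is the Hessian with its eigenvalues replaced by their absolute values; the abstract does not state the exact construction used, and the function names and the `floor` parameter are hypothetical.

```python
import math
from typing import Tuple

Metric = Tuple[float, float, float]  # (m11, m12, m22) of a symmetric 2x2 matrix


def metric_from_hessian(h11: float, h12: float, h22: float,
                        floor: float = 1e-12) -> Metric:
    """Build a positive-definite metric from a symmetric 2x2 Hessian by taking
    absolute eigenvalues (an assumed, commonly used construction)."""
    mean = 0.5 * (h11 + h22)
    diff = 0.5 * (h11 - h22)
    r = math.hypot(diff, h12)
    lam1, lam2 = mean + r, mean - r           # eigenvalues of the Hessian
    theta = 0.5 * math.atan2(h12, diff)       # direction of the first eigenvector
    c, s = math.cos(theta), math.sin(theta)
    a1 = max(abs(lam1), floor)                # floor avoids a degenerate metric
    a2 = max(abs(lam2), floor)
    m11 = a1 * c * c + a2 * s * s
    m12 = (a1 - a2) * c * s
    m22 = a1 * s * s + a2 * c * c
    return m11, m12, m22


def metric_edge_length(edge: Tuple[float, float], metric: Metric) -> float:
    """Length of an edge vector e measured in the metric: sqrt(e^T M e)."""
    ex, ey = edge
    m11, m12, m22 = metric
    return math.sqrt(m11 * ex * ex + 2.0 * m12 * ex * ey + m22 * ey * ey)


if __name__ == "__main__":
    m = metric_from_hessian(4.0, 0.0, -1.0)
    print(metric_edge_length((1.0, 0.0), m), metric_edge_length((0.0, 1.0), m))
```

Under this assumption, an edge aligned with the direction of strong curvature measures longer in the metric than an equally long physical edge in the weak direction, so equalizing metric lengths shortens physical edges where the solution varies rapidly.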

University of Tennessee, Knoxville: Trace

Intelligent Web Services Architecture Evolution Via An Automated Learning-Based Refactoring Framework

Author: Wang Hanzhang
Publication date: 08/12/2017

Architecture degradation can have a fundamental impact on software quality and productivity, resulting in an inability to support new features, increasing technical debt and leading to significant losses. While code-level refactoring is widely studied and well supported by tools, architecture-level refactorings, such as repackaging to group related features into one component, or retrofitting files into patterns, remain expensive and risky. Several domains, such as Web services, depend heavily on complex architectures to design and implement interface-level operations, provided by companies such as FedEx, eBay, Google, Yahoo and PayPal, to end-users. The objectives of this work are: (1) to advance our ability to support complex architecture refactoring by explicitly defining Web service anti-patterns at various levels of abstraction, (2) to enable complex refactorings by learning from user feedback and creating reusable/personalized refactoring strategies to augment intelligent designers’ interaction that will guide low-level refactoring automation with high-level abstractions, and (3) to enable intelligent architecture evolution by detecting, quantifying, prioritizing, fixing and predicting design technical debts. We proposed various approaches and tools based on intelligent computational search techniques for (a) predicting and detecting multi-level Web services antipatterns, (b) creating an interactive refactoring framework that integrates refactoring path recommendation, design-level human abstraction, and code-level refactoring automation with user feedback using interactive multi-objective search, and (c) automatically learning reusable and personalized refactoring strategies for Web services by abstracting recurring refactoring patterns from Web service releases.
Based on empirical validations performed on both large open source and industrial services from multiple providers (eBay, Amazon, FedEx and Yahoo), we found that the proposed approaches advance our understanding of the correlation and mutual impact between service antipatterns at different levels, revealing when, where and how architecture-level anti-patterns affect the quality of services. Based on several controlled experiments, the interactive refactoring framework enables human-based, domain-specific abstraction and high-level design to guide automated code-level atomic refactoring steps for service decompositions. The reusable refactoring strategy packages recurring refactoring activities into automatable units, improving refactoring path recommendation and further reducing time-consuming and error-prone human intervention.
Ph.D., College of Engineering & Computer Science, University of Michigan-Dearborn. https://deepblue.lib.umich.edu/bitstream/2027.42/142810/1/Wang Final Dissertation.pdf

Deep Blue Documents at the University of Michigan

Database Optimization Aspects for Information Retrieval

Author: Blok H.E.
Publication venue: Twente University Press
Publication date: 01/01/2002

There is a growing need for systems that can process queries combining both structured data and text. One way to provide such functionality is to integrate information retrieval (IR) techniques in a database management system (DBMS). However, IR and database research have been separate research fields for decades, resulting in different, even conflicting, approaches to data management. Each DBMS has a component called a "query optimizer", which plays a crucial role in the efficiency and flexibility of the system. So, for successful integration, the IR techniques and data structures, as well as the DBMS query optimizer, should be adapted to enable mutual cooperation. The author concentrates on top-N queries, a common class of IR queries. An IR top-N query asks for the N best documents given a set of keywords. The author proposes processing the data in batches as a compromise between IR and DBMS query processing. Experiments with this technique show that porting IR optimization techniques is (still) not a promising option due to the additional administrative overhead. Two new mathematical models are introduced to eliminate this overhead: a model that predicts selectivity, which is a crucial factor in the execution costs, and a model that predicts the quality of the top-N
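The top-N query described above (the N best documents for a set of keywords) can be sketched as follows. This is a minimal illustration, not the author's batch-processing technique: the scoring function is a plain keyword term-frequency sum chosen for the example, and the name `top_n` is hypothetical.

```python
import heapq
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def top_n(documents: Dict[str, str], keywords: Iterable[str],
          n: int) -> List[Tuple[str, int]]:
    """Return up to `n` (doc_id, score) pairs with the highest scores.

    The score is the summed term frequency of the keywords in the document,
    an assumed stand-in for a real IR ranking function.
    """
    wanted = [k.lower() for k in keywords]
    scored = []
    for doc_id, text in documents.items():
        counts = Counter(text.lower().split())
        score = sum(counts[k] for k in wanted)
        if score > 0:  # documents matching no keyword are never candidates
            scored.append((score, doc_id))
    # nlargest keeps only the n best candidates instead of sorting everything.
    best = heapq.nlargest(n, scored)
    return [(doc_id, score) for score, doc_id in best]


if __name__ == "__main__":
    docs = {
        "a": "query optimizer query text",
        "b": "text retrieval",
        "c": "query text",
    }
    print(top_n(docs, ["query", "text"], 2))
```

Because only the N best results are needed, a bounded selection such as `heapq.nlargest` avoids fully ordering every matching document.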

University of Twente Research Information

An adaptive multi-population differential artificial bee colony algorithm for many-objective service composition in cloud manufacturing

Author: Chan Felix T.S.
Li Yun
Lin Yingzi
Yao Xifan
Zhou Jiajun
Publication venue: 'Elsevier BV'
Publication date: 31/08/2018

Several conflicting criteria must be optimized simultaneously during service composition and optimal selection (SCOS) in cloud manufacturing, among which tradeoff optimization regarding the quality of the composite services is a key issue in the successful implementation of manufacturing tasks. This study improves the artificial bee colony (ABC) algorithm by introducing a synergetic mechanism for food source perturbation, a new diversity maintenance strategy, and a novel computing resource allocation scheme to handle complicated many-objective SCOS problems. Specifically, differential evolution (DE) operators with distinct search behaviors are integrated into the ABC updating equation to enhance the level of information exchange between the foraging bees, and the control parameters for the reproduction operators are adapted independently. Meanwhile, a scalarization-based approach with active diversity promotion is used to enhance the selection pressure. In this proposal, multiple size-adjustable subpopulations evolve with distinct reproduction operators according to the utility of the generated offspring, so that more computational resources are allocated to the better-performing reproduction operators. Experiments on benchmark test instances and SCOS problems indicate that the proposed algorithm has competitive performance and scalability compared with competing algorithms.

University of Strathclyde Institutional Repository