Selection and sorting with limited storage
When selecting from, or sorting, a file stored on a read-only tape with rather limited internal storage, several passes of the input tape may be required. We study the relation between the amount of internal storage available and the number of passes required to select the K-th highest of N inputs. We show, for example, that finding the median in two passes requires at least Ω(N^(1/2)) and at most O(N^(1/2) log N) internal storage. For probabilistic methods, Θ(N^(1/2)) internal storage is necessary and sufficient for a single-pass method that finds the median with arbitrarily high probability.
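The probabilistic single-pass idea can be illustrated with a plain reservoir sample of roughly √N elements, whose median serves as the estimate. This is a minimal sketch of the sampling approach, not the paper's exact method; the function name and sample size are illustrative:

```python
import random

def approx_median_one_pass(stream, sample_size):
    """Estimate the median in one pass using a reservoir sample.

    With sample_size ~ sqrt(N), the internal storage is O(sqrt(N)),
    matching the order of the bound discussed above.
    """
    sample = []
    for n, x in enumerate(stream, 1):
        if len(sample) < sample_size:
            sample.append(x)
        else:
            # Standard reservoir sampling: keep x with probability k/n.
            j = random.randrange(n)
            if j < sample_size:
                sample[j] = x
    sample.sort()
    return sample[len(sample) // 2]

random.seed(0)
data = list(range(10001))  # true median is 5000
random.shuffle(data)
est = approx_median_one_pass(data, sample_size=101)  # ~sqrt(N) storage
```

With high probability the sample median lands close to the true median; a second pass (as in the deterministic setting above) could then confirm or refine it.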
Systematic reduction of Hyperspectral Images for high-throughput Plastic Characterization
Hyperspectral Imaging (HSI) combines microscopy and spectroscopy to assess
the spatial distribution of spectroscopically active compounds in objects, and
has diverse applications in food quality control, pharmaceutical processes, and
waste sorting. However, due to the large size of HSI datasets, it can be
challenging to analyze and store them within a reasonable digital
infrastructure, especially in waste sorting where speed and data storage
resources are limited. Additionally, as with most spectroscopic data, there is
significant redundancy, making pixel and variable selection crucial for
retaining chemical information. Recent high-tech developments in chemometrics
enable automated and evidence-based data reduction, which can substantially
enhance the speed and performance of Non-Negative Matrix Factorization (NMF), a
widely used algorithm for chemical resolution of HSI data. By recovering the
pure contribution maps and spectral profiles of distributed compounds, NMF can
provide evidence-based sorting decisions for efficient waste management. To
improve the quality and efficiency of HSI data analysis, we apply a
convex-hull method to select essential pixels and wavelengths and to remove
uninformative, redundant information. This process minimizes computational
strain and effectively eliminates highly mixed pixels. By reducing data
redundancy, data investigation and analysis become more straightforward, as
demonstrated on both simulated and real HSI data for plastic sorting.
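A minimal sketch of NMF via Lee-Seung multiplicative updates on simulated rank-3 "HSI" data; the data shapes and the plain update loop are illustrative assumptions, not the chemometric pipeline described above:

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated HSI-like matrix: 200 "pixels" x 50 "wavelengths",
# mixed from 3 non-negative pure spectra (illustrative only).
C_true = rng.random((200, 3))
S_true = rng.random((3, 50))
X = C_true @ S_true

def nmf(X, r, iters=300, eps=1e-9):
    """Lee-Seung multiplicative updates minimizing ||X - W H||_F.

    W recovers contribution maps, H the spectral profiles (up to
    scaling and permutation), mirroring the chemical-resolution use of NMF.
    """
    m, n = X.shape
    W = rng.random((m, r)) + eps
    H = rng.random((r, n)) + eps
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

W, H = nmf(X, 3)
err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

Pixel/wavelength selection as described above would shrink `X` before this loop runs, which is where the computational savings come from.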
Histogram-Aware Sorting for Enhanced Word-Aligned Compression in Bitmap Indexes
Bitmap indexes must be compressed to reduce input/output costs and minimize
CPU usage. To accelerate logical operations (AND, OR, XOR) over bitmaps, we use
techniques based on run-length encoding (RLE), such as Word-Aligned Hybrid
(WAH) compression. These techniques are sensitive to the order of the rows: a
simple lexicographical sort can divide the index size by 9 and make indexes
several times faster. We investigate reordering heuristics based on computed
attribute-value histograms. Simply permuting the columns of the table based on
these histograms can increase the sorting efficiency by 40%.
Comment: To appear in proceedings of DOLAP 200
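The effect of row ordering on run-length encoding can be illustrated on a toy bitmap index, using the number of runs as a crude proxy for WAH compressed size; the table and run counter here are invented for illustration, not the paper's heuristics:

```python
from itertools import groupby

def rle_runs(bits):
    # Number of runs in a bit sequence: a proxy for RLE/WAH size.
    return sum(1 for _ in groupby(bits))

# Toy table: 30 rows with two low-cardinality attributes.
rows = [(r % 3, (r * 7) % 5) for r in range(30)]

def index_runs(rows):
    # A bitmap index holds one bitmap per (attribute, value) pair;
    # total runs across all bitmaps approximates total compressed size.
    total = 0
    for a in range(2):
        for v in {row[a] for row in rows}:
            total += rle_runs([row[a] == v for row in rows])
    return total

unsorted_runs = index_runs(rows)
sorted_runs = index_runs(sorted(rows))  # lexicographic row order
```

Sorting the rows lexicographically groups equal attribute values into long runs, so `sorted_runs` comes out well below `unsorted_runs`, which is the effect the reordering heuristics above exploit.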
Selection from read-only memory with limited workspace
Given an unordered array of n elements drawn from a totally ordered set and
an integer k in the range from 1 to n, in the classic selection problem
the task is to find the k-th smallest element in the array. We study the
complexity of this problem in the space-restricted random-access model: The
input array is stored on read-only memory, and the algorithm has access to a
limited amount of workspace. We prove that the linear-time prune-and-search
algorithm---presented in most textbooks on algorithms---can be modified to use
bits instead of words of extra space. Prior to our
work, the best known algorithm by Frederickson could perform the task with
bits of extra space in time. Our result separates
the space-restricted random-access model and the multi-pass streaming model,
since we can surpass the lower bound known for the latter
model. We also generalize our algorithm for the case when the size of the
workspace is bits, where . The running time
of our generalized algorithm is ,
slightly improving over the
bound of Frederickson's algorithm. To obtain the improvements mentioned above,
we developed a new data structure, called the wavelet stack, that we use for
repeated pruning. We expect the wavelet stack to be a useful tool in other
applications as well.
Comment: 16 pages, 1 figure; preliminary version appeared in COCOON-201
Write-limited sorts and joins for persistent memory
To mitigate the impact of the widening gap between the memory needs of CPUs and what standard memory technology can deliver, system architects have introduced a new class of memory technology termed persistent memory. Persistent memory is byte-addressable, but exhibits asymmetric I/O: writes are typically one order of magnitude more expensive than reads. Byte addressability combined with I/O asymmetry renders the performance profile of persistent memory unique. Thus, it becomes imperative to find new ways to seamlessly incorporate it into database systems. We do so in the context of query processing, focusing on the fundamental operations of sort and join processing. We introduce the notion of write-limited algorithms that effectively minimize the I/O cost. We give a high-level API that enables the system to dynamically optimize the workflow of the algorithms; or, alternatively, allows the developer to tune the write profile of the algorithms. We present four different techniques to incorporate persistent memory into the database processing stack in light of this API. We have implemented and extensively evaluated all our proposals. Our results show that the algorithms deliver on their promise of I/O-minimality and tunable performance. We showcase the merits and deficiencies of each implementation technique, thus taking a solid first step towards incorporating persistent memory into query processing.
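The read/write asymmetry that motivates write-limited algorithms can be illustrated by counting element writes in two classic sorts; these are not the paper's algorithms, just a sketch of why trading extra reads for fewer writes pays off on such memory:

```python
def selection_sort(a):
    # Selection sort: O(n^2) comparisons (cheap reads) but at most
    # 2(n-1) element writes, since each position is settled by one swap.
    a = list(a)
    writes = 0
    n = len(a)
    for i in range(n):
        m = min(range(i, n), key=a.__getitem__)
        if m != i:
            a[i], a[m] = a[m], a[i]
            writes += 2
    return a, writes

def insertion_sort(a):
    # Insertion sort: O(n^2) element writes in the worst case,
    # because each insertion shifts a whole prefix.
    a = list(a)
    writes = 0
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        while j >= 0 and a[j] > x:
            a[j + 1] = a[j]
            writes += 1
            j -= 1
        a[j + 1] = x
        writes += 1
    return a, writes

data = list(range(100, 0, -1))  # reversed input: worst case for insertion
sel_sorted, sel_writes = selection_sort(data)
ins_sorted, ins_writes = insertion_sort(data)
```

On persistent memory, where a write costs an order of magnitude more than a read, the selection-sort trade-off (many reads, few writes) is the profile write-limited algorithms aim for.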
Optical Micromanipulation Techniques Combined with Microspectroscopic Methods
The subject of the presented Ph.D. thesis is the combination of optical micromanipulation with microspectroscopic methods. We used laser tweezers to transport and sort various living microorganisms, such as microalgal or yeast cells. We employed Raman microspectroscopy to analyze the chemical composition of individual cells, and we used this information to automatically select cells with the properties of interest. We combined pulsed amplitude-modulated fluorescence microspectroscopy, optical micromanipulation, and other techniques to map the stress response of optically trapped cells at various trapping-laser wavelengths, intensities, and exposure durations. We fabricated microfluidic chips of various designs and constructed a Raman-tweezers sorter of micro-objects, primarily living cells, on a microfluidic platform.
A Database for Fast Access to Particle-Gated Event Data
In nuclear physics experiments involving in-flight fragmentation of ions,
usually a large number of different nuclei are produced, and various detection
systems are employed to identify the species event by event, e.g. by measuring
their specific energy loss and time-of-flight. For such cases -- not
necessarily limited to nuclear physics -- where subsets of a large dataset can
be identified using a small number of measured signals, software for fast
access to varying subsets of such a dataset has been developed. The software
has been used successfully in the analysis of a one-neutron knockout
experiment at GANIL.
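One way to picture such particle-gated access is a precomputed index from species to event numbers; the sketch below uses simulated data, and the species codes and array names are invented for illustration, not the database's actual layout:

```python
import numpy as np

rng = np.random.default_rng(1)
# Simulated dataset: 10,000 events, each already identified as one of
# 5 species from its energy-loss / time-of-flight signals.
species = rng.integers(0, 5, size=10000)
payload = rng.random(10000)  # stand-in for the bulk per-event data

# Build the gate index once: species code -> array of event numbers.
index = {s: np.flatnonzero(species == s) for s in range(5)}

def gated_events(s):
    # Fast access to just the events passing the gate for species s,
    # without rescanning the full dataset.
    return payload[index[s]]
```

Building the index costs one scan; every subsequent particle-gated query then touches only its own subset of events, which is the access pattern the abstract describes.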
An In-Place Sorting with O(n log n) Comparisons and O(n) Moves
We present the first in-place algorithm for sorting an array of size n that
performs, in the worst case, at most O(n log n) element comparisons and O(n)
element transports.
This solves a long-standing open problem, stated explicitly, e.g., in [J.I.
Munro and V. Raman, Sorting with minimum data movement, J. Algorithms, 13,
374-93, 1992], of whether there exists a sorting algorithm that matches the
asymptotic lower bounds on all computational resources simultaneously.