In Pursuit of a Good Glass and Good Company
While glass appears rather homogeneous compared to ceramics and pipes, these small bits of amorphous solid silica can still reveal hidden information when aspects of their chemical composition are tested using means as simple as short-wave UV light or as complex as X-ray fluorescence. Using short-wave UV light and a comparative approach, this thesis reevaluates archaeological table glass collections from Southern Maryland and the Northern Neck of Virginia dating from the mid-17th century to the early 18th century to find evidence for the presence and absence of English lead glass (flint glass). These data reveal patterns in the access, acquisition, and use of glass tableware in this Chesapeake region, showing a steep difference in the occurrence of lead glass in assemblages before and after the turn of the 18th century. Before 1700, lead glass at these sites tends to comprise less than half of the tableware assemblages, yet on sites with occupations extending into the 18th century, more than three quarters of the glassware contains lead. Some inhabitants of this region may have begun consuming English lead glass by the 1680s, primarily in the form of drinking glasses and other beverage-related tableware. By the 1690s, lead glass was taking over table space, and by 1700, it was the dominant type of glass tableware.
Tailoring r-index for Document Listing Towards Metagenomics Applications
A basic problem in metagenomics is to assign a sequenced read to the correct species in the reference collection. In typical applications in genomic epidemiology and viral metagenomics, the reference collection consists of a set of species, each represented by its highly similar strains. It has recently been shown that accurate read assignment can be achieved with k-mer hashing-based pseudoalignment: a read is assigned to species A if each of its k-mer hits to the reference collection is located only on strains of A. We study the underlying primitives required in pseudoalignment and related tasks. We propose three space-efficient solutions building upon the document listing with frequencies problem. All the solutions use an r-index (Gagie et al., SODA 2018) as the underlying index structure for the text obtained as the concatenation of the set of species, as well as for each species. Given t species whose concatenation length is n, and whose Burrows-Wheeler transform contains r runs, our first solution, based on a grammar-compressed document array with precomputed queries at nonterminal symbols, reports the frequencies for the distinct documents in which a pattern of length m occurs. Our second solution is also based on a grammar-compressed document array, enhanced with bitvectors, and reports the frequencies on a machine with word size w. Our third solution, based on the interleaved LCP array, answers the same query. We implemented our solutions and tested them on real-world and synthetic datasets. The results show that all the solutions are fast on highly repetitive data, and the size overhead introduced by the indexes is comparable with the size of the r-index. Peer reviewed.
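The pseudoalignment rule described above (assign a read to species A only if every k-mer hit lands exclusively on strains of A) can be sketched with a plain hash-map index; the k-mer length, data, and function names below are illustrative and stand in for the paper's r-index-based machinery:

```python
# Toy sketch of k-mer pseudoalignment. A read is assigned to species A only
# if every k-mer of the read that occurs in the reference collection occurs
# exclusively on strains of A. Data and names are illustrative.
from collections import defaultdict

K = 4  # k-mer length; real tools use much larger k (e.g., around 31)

def kmers(seq, k=K):
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def build_index(references):
    """Map each k-mer to the set of species whose strains contain it."""
    index = defaultdict(set)
    for species, strains in references.items():
        for strain in strains:
            for km in kmers(strain):
                index[km].add(species)
    return index

def pseudoalign(read, index):
    """Return the species the read is assigned to, or None if ambiguous."""
    hit_species = None
    for km in kmers(read):
        species = index.get(km)
        if species is None:
            continue  # k-mer absent from the reference collection
        if len(species) > 1:
            return None  # k-mer shared across species: ambiguous
        (s,) = species
        if hit_species is None:
            hit_species = s
        elif hit_species != s:
            return None  # hits land on two different species
    return hit_species

refs = {
    "A": ["ACGTACGTGG", "ACGTACGTCC"],  # two similar strains of species A
    "B": ["TTGGCCAATT"],
}
idx = build_index(refs)
print(pseudoalign("ACGTACGT", idx))  # every k-mer hit is unique to A
```

The point of the compressed indexes in the paper is to answer exactly this kind of "on which documents, and how often" query without materializing a hash table over all k-mers.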
UFSS 2020 Program Book
The 2020 Urban Food Systems Symposium (UFSS), Nourishing Cities in a Changing Climate, was held to bring together a national and international audience of academic and research-oriented professionals to share and gain knowledge on urban food systems and the role they play in a changing climate. The symposium covered urban agricultural production, local food systems distribution, urban farmer education, urban agriculture policy, planning and development, food access and justice, and food sovereignty. The program book provides the full program of plenary talks and concurrent oral and poster sessions for the 2020 UFSS.
Text Indexing for Long Patterns: Anchors are All you Need
PVLDB Artifact Availability:
The source code, data, and/or other artifacts have been made available at https://github.com/lorrainea/BDA-index. Copyright © 2023 the owner/author(s).
In many real-world database systems, a large fraction of the data is represented by strings: sequences of letters over some alphabet. This is because strings can easily encode data arising from different sources. It is often crucial to represent such string datasets in a compact form but also to simultaneously enable fast pattern matching queries. This is the classic text indexing problem. The four absolute measures anyone should pay attention to when designing or implementing a text index are: (i) index space; (ii) query time; (iii) construction space; and (iv) construction time. Unfortunately, however, most (if not all) widely used indexes (e.g., the suffix tree, the suffix array, or their compressed counterparts) are not optimized for all four measures simultaneously, as it is difficult to have the best of all four worlds. Here, we take an important step in this direction by showing that text indexing with locally consistent anchors (lc-anchors) offers remarkably good performance in all four measures when we have at hand a lower bound l on the length of the queried patterns, which is arguably a quite reasonable assumption in practical applications. Specifically, we improve on the construction of the index proposed by Loukides and Pissis, which is based on bidirectional string anchors (bd-anchors), a new type of lc-anchors, by: (i) designing an average-case linear-time algorithm to compute bd-anchors; and (ii) developing a semi-external-memory implementation to construct the index in small space using near-optimal work. We then present an extensive experimental evaluation, based on the four measures, using real benchmark datasets. The results show that, for long patterns, the index constructed using our improved algorithms compares favorably to all classic indexes: the (compressed) suffix tree, the (compressed) suffix array, and the FM-index.
Funded by the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreements No 872539 and 956229, respectively, and by UKRI through REPHRAIN (EP/V011189/1).
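For intuition, the bd-anchor of a length-ℓ fragment is the starting position of its lexicographically smallest rotation, and sampling one anchor per sliding window yields the sparse set of text positions the index is built on. The sketch below is a deliberately naive quadratic illustration of that definition (ties broken leftmost), not the paper's average-case linear-time algorithm; all names are illustrative:

```python
# Naive sketch of bidirectional string anchors (bd-anchors): the bd-anchor
# of a length-ell fragment is the start of its lexicographically smallest
# rotation. Quadratic per window; the paper computes these far faster.

def bd_anchor(fragment):
    """0-based start of the lex-smallest rotation of `fragment`."""
    n = len(fragment)
    doubled = fragment + fragment  # rotation i is doubled[i:i+n]
    best = 0
    for i in range(1, n):
        if doubled[i:i + n] < doubled[best:best + n]:
            best = i
    return best

def anchors_of_text(text, ell):
    """bd-anchors of every length-ell window, as absolute text positions."""
    return sorted({i + bd_anchor(text[i:i + ell])
                   for i in range(len(text) - ell + 1)})

# Consecutive windows often agree on the same anchor, which is why the
# sampled set is much smaller than the number of windows.
print(anchors_of_text("banana", 3))
```

The index then stores only these anchored positions, which is what keeps index space small while still supporting pattern matching for queries of length at least ℓ.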
On Complexity of 1-Center in Various Metrics
We consider the classic 1-center problem: given a set P of n points in a metric space, find the point in P that minimizes the maximum distance to the other points of P. We study the complexity of this problem in d-dimensional ℓ_p-metrics and in edit and Ulam metrics over strings of length d. Our results for the 1-center problem may be classified based on d as follows.
Small d: We provide the first linear-time algorithm for the 1-center problem in fixed-dimensional ℓ_p-metrics. On the other hand, assuming the hitting set conjecture (HSC), we show that when , no subquadratic algorithm can solve the 1-center problem in any of the ℓ_p-metrics, or in edit or Ulam metrics.
Large d: When , we extend our conditional lower bound to rule out subquartic algorithms for the 1-center problem in edit metric (assuming Quantified SETH). On the other hand, we give a -approximation for 1-center in Ulam metric with running time .
We also strengthen some of the above lower bounds by allowing approximations or by reducing the dimension d, but only against a weaker class of algorithms which list all requisite solutions. Moreover, we extend one of our hardness results to rule out subquartic algorithms for the well-studied 1-median problem in the edit metric, where, given a set of n strings each of length n, the goal is to find a string in the set that minimizes the sum of the edit distances to the rest of the strings in the set.
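As a baseline for the quartic barrier discussed above: the trivial exact algorithm computes all pairwise edit distances, which for n strings of length Θ(n) already costs Θ(n⁴) time. A minimal sketch of that baseline, with illustrative data:

```python
# Brute-force 1-center under edit distance. For n strings of length ~d this
# takes O(n^2 * d^2) time, which is why ruling out subquartic algorithms
# (when d = Theta(n)) pins down the complexity. Illustrative sketch only.

def edit_distance(a, b):
    """Classic O(|a|*|b|) dynamic program for Levenshtein distance."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cur[j] = min(prev[j] + 1,                        # deletion
                         cur[j - 1] + 1,                     # insertion
                         prev[j - 1] + (a[i - 1] != b[j - 1]))  # substitution
        prev = cur
    return prev[n]

def one_center(strings):
    """String in the set minimizing the maximum edit distance to the rest."""
    return min(strings,
               key=lambda s: max(edit_distance(s, t) for t in strings))

pts = ["karolin", "kathrin", "kerstin", "carolin"]
print(one_center(pts))
```

For 1-median, the same loop structure applies with `sum` in place of `max` in the key function.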
How Fast Can We Play Tetris Greedily With Rectangular Pieces?
Consider a variant of Tetris played on a board of width and infinite height, where the pieces are axis-aligned rectangles of arbitrary integer dimensions, the pieces can only be moved before letting them drop, and a row does not disappear once it is full. Suppose we want to follow a greedy strategy: let each rectangle fall where it will end up the lowest given the current state of the board. To do so, we want a data structure which can always suggest a greedy move. In other words, we want a data structure which maintains a set of rectangles, supports queries which return where to drop the rectangle, and updates which insert a rectangle dropped at a certain position and return the height of the highest point in the updated set of rectangles. We show via a reduction to the Multiphase problem [Pătrașcu, 2010] that on a board of width , if the OMv conjecture [Henzinger et al., 2015] is true, then both operations cannot be supported in time simultaneously. The reduction also implies polynomial bounds from the 3-SUM conjecture and the APSP conjecture. On the other hand, we show that there is a data structure supporting both operations in time on boards of width , matching the lower bound up to a factor.
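For concreteness, both operations can be supported naively in time linear in the board width by storing the skyline of column heights; the sketch below is this trivial baseline, not the faster data structure from the paper (class and method names are illustrative):

```python
# Naive greedy-Tetris data structure: keep the skyline (per-column heights)
# of a board of fixed width. query(w) scans all placements of a width-w
# rectangle and returns the leftmost lowest landing spot; drop(x, w, h)
# inserts a rectangle and returns the new maximum height. O(width) per op.

class GreedyBoard:
    def __init__(self, width):
        self.heights = [0] * width  # current skyline

    def query(self, w):
        """Leftmost x where a width-w rectangle lands lowest, and that height."""
        best_x, best_h = 0, max(self.heights[:w])
        for x in range(1, len(self.heights) - w + 1):
            h = max(self.heights[x:x + w])
            if h < best_h:
                best_x, best_h = x, h
        return best_x, best_h

    def drop(self, x, w, h):
        """Drop a w-by-h rectangle at position x; return the new max height."""
        top = max(self.heights[x:x + w]) + h
        for i in range(x, x + w):
            self.heights[i] = top
        return max(self.heights)

board = GreedyBoard(8)
board.drop(0, 3, 2)    # a 3-wide, 2-tall rectangle at the left edge
x, h = board.query(2)  # greedy landing spot for a 2-wide piece
print(x, h)
```

The interesting question in the abstract is precisely how much faster than this linear scan the two operations can be made, and the OMv-based lower bound says: not arbitrarily fast.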