382,506 research outputs found
String Searching with Ranking Constraints and Uncertainty
Strings play an important role in many areas of computer science. Searching for a pattern in a string or a collection of strings is one of the most classic problems. Variations of this problem, such as document retrieval, ranked document retrieval, and dictionary matching, have been well studied. The enormous growth of the internet, large genomic projects, sensor networks, and digital libraries necessitates not just efficient algorithms and data structures for general string indexing, but also indexes for texts with fuzzy information and support for queries with different constraints. This dissertation addresses some of these problems and proposes indexing solutions. One such variation is the document retrieval query with included and excluded (forbidden) patterns, where the objective is to retrieve all relevant documents that contain the included patterns and do not contain the excluded patterns. We continue the previous work on this problem and propose a more efficient solution. We conjecture that any significant improvement over these results is highly unlikely. We also consider the scenario in which the query consists of more than two patterns. The forbidden pattern problem suffers from the drawback that linear-space (in words) solutions are unlikely to achieve a per-document reporting time better than O(sqrt(n/occ)), where n is the total length of the documents and occ is the number of output documents. Continuing along this path, we introduce a new variation, namely document retrieval with a forbidden extension query, where the forbidden pattern is an extension of the included pattern. We also address the more general top-k version of the problem, which retrieves the top k documents, where the ranking is based on the PageRank relevance metric. This problem is motivated by search applications. It also holds theoretical interest, as we show that the hardness of the forbidden pattern problem is alleviated in this problem. We achieve linear space and optimal query time for this variation.
We also propose succinct indexes for both of these problems. Position-restricted pattern matching considers the scenario where only part of the text is searched. We propose a succinct index for this problem with efficient query time. An important application of this problem stems from searching in genomic sequences, where only part of the gene sequence is searched for interesting patterns. The problem of computing discriminating (resp. generic) words is to report all minimal (resp. maximal) extensions of a query pattern that are contained in at most (resp. at least) a given number of documents. These problems are motivated by applications in computational biology, text mining, and automated text classification. We propose succinct indexes for these problems. Strings with uncertainty and fuzzy information play an important role in an increasing number of applications. We propose a general framework for indexing uncertain strings such that a deterministic query string can be searched efficiently. String matching becomes a probabilistic event when a string contains uncertainty, i.e., each position of the string can have different probable characters, with an associated probability of occurrence for each character. Such uncertain strings are prevalent in applications such as biological sequence data, event monitoring, and automatic ECG annotation. We consider two basic problems of string searching, namely substring searching and string listing. We formulate these well-known problems for the uncertain string paradigm and propose exact and approximate solutions for them. We also discuss a constrained variation of orthogonal range searching. Given a set of points, the task of orthogonal range searching is to build a data structure such that all the points inside an orthogonal query region can be reported. We introduce a new variation, namely shared constraint range searching, which arises naturally in constrained pattern matching applications.
Shared constraint range searching is a special four-sided range reporting problem in which two of the constraints are shared, effectively reducing the number of independent constraints. For this problem, we propose a linear-space index that matches the best known bound for the three-dimensional dominance reporting problem. We extend our data structure to the external memory model.
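The notion of matching as a probabilistic event in an uncertain string can be illustrated with a brute-force sketch. This is not the indexing framework proposed in the dissertation. It is only a linear-scan baseline, and it assumes that character probabilities at different positions are independent, that an uncertain string is represented as a list of character-to-probability dictionaries, and that an occurrence is reported when its probability is at least a threshold `tau`. The function names and the example data are hypothetical.

```python
from math import prod


def match_probability(uncertain, pattern, start):
    """Probability that `pattern` occurs at `start`.

    Assumption: positions are independent, so the match probability is the
    product of the per-position probabilities of the pattern's characters.
    """
    return prod(uncertain[start + i].get(ch, 0.0) for i, ch in enumerate(pattern))


def probable_occurrences(uncertain, pattern, tau):
    """Brute-force scan: all (start, probability) pairs with probability >= tau."""
    hits = []
    for start in range(len(uncertain) - len(pattern) + 1):
        p = match_probability(uncertain, pattern, start)
        if p >= tau:
            hits.append((start, p))
    return hits


# Hypothetical uncertain string of length 4: each position maps the
# characters that may occur there to their probabilities.
text = [
    {"A": 0.5, "C": 0.5},
    {"G": 1.0},
    {"A": 0.25, "T": 0.75},
    {"G": 1.0},
]

print(probable_occurrences(text, "AG", 0.3))  # [(0, 0.5)]
print(probable_occurrences(text, "AG", 0.2))  # [(0, 0.5), (2, 0.25)]
```

The scan costs time proportional to the text length times the pattern length per query, which is the cost an index is meant to avoid.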
Measuring Uncertainty in Games: Design and Preliminary Validation
Uncertainty is an important element of game play, which is widely believed to act as a precondition for player experience (PX). To investigate the concept and examine its relation to other PX concepts, we need to be able to measure it. We present the design and preliminary validation results of the Player Uncertainty in Games (PUG) questionnaire. Based on various sources from games user research and on work done with regard to searching digital archives, we designed a questionnaire that measures the experience of uncertainty in games. The scale was refined to 66 items through interviews with players and expert reviews, and was then validated and further refined using data gathered from gamers in an online survey. The Principal Component Analysis showed a high level of internal consistency for the scale and for each of its four factors: Disorientation, Exploration, Prospect, and Randomness. This work presents initial findings towards a validated tool for measuring the uncertainty of players in digital games.
Scalable Multiagent Coordination with Distributed Online Open Loop Planning
We propose distributed online open loop planning (DOOLP), a general framework
for online multiagent coordination and decision making under uncertainty. DOOLP
is based on online heuristic search in the space defined by a generative model
of the domain dynamics, which is exploited by agents to simulate and evaluate
the consequences of their potential choices.
We also propose distributed online Thompson sampling (DOTS) as an effective
instantiation of the DOOLP framework. DOTS models sequences of agent choices by
concatenating a number of multiarmed bandits for each agent and uses Thompson
sampling for dealing with action value uncertainty. The Bayesian approach
underlying Thompson sampling allows agents to effectively model and estimate
uncertainty about (a) their own action values and (b) other agents' behavior. This
approach yields a principled and statistically sound solution to the
exploration-exploitation dilemma when exploring large search spaces with
limited resources.
We implemented DOTS in a smart factory case study with positive empirical
results. We observed effective, robust and scalable planning and coordination
capabilities even when searching only a fraction of the potential search space.
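The Thompson sampling step mentioned above can be sketched for the simplest case, a single agent facing one Bernoulli multiarmed bandit. This is not DOTS itself: it omits the concatenation of bandits, the generative domain model, and the coordination between agents. The uniform Beta(1, 1) prior, the function name, and the reward rates are assumptions made for illustration.

```python
import random


def thompson_bernoulli(true_rates, rounds, seed=0):
    """Thompson sampling on a Bernoulli bandit with Beta(1, 1) priors.

    Each round: draw one sample per arm from its Beta posterior, pull the
    arm with the largest sample, observe a 0/1 reward, update that posterior.
    """
    rng = random.Random(seed)
    n = len(true_rates)
    wins = [0] * n    # observed successes per arm
    losses = [0] * n  # observed failures per arm
    pulls = [0] * n   # number of times each arm was chosen

    for _ in range(rounds):
        # Posterior of arm a is Beta(wins[a] + 1, losses[a] + 1).
        samples = [rng.betavariate(wins[a] + 1, losses[a] + 1) for a in range(n)]
        arm = max(range(n), key=samples.__getitem__)

        # Simulated environment: reward is 1 with probability true_rates[arm].
        reward = rng.random() < true_rates[arm]
        pulls[arm] += 1
        if reward:
            wins[arm] += 1
        else:
            losses[arm] += 1

    return pulls, wins, losses


# Hypothetical reward rates; the last arm is the best one.
pulls, wins, losses = thompson_bernoulli([0.2, 0.5, 0.8], rounds=2000)
print(pulls)
```

Sampling from the posterior, instead of taking its mean, is what balances exploration and exploitation: arms with few observations have wide posteriors and are still selected occasionally.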
The possible test of the calculations of nuclear matrix elements of the (ββ)0ν-decay
The existing calculations of the nuclear matrix elements of the neutrinoless
double β-decay differ by about a factor of three. This uncertainty prevents
quantitative interpretation of the results of experiments searching for this
process. We suggest here that the observation of the neutrinoless double
β-decay of several nuclei could allow a test of the calculations of the
nuclear matrix elements through the comparison of the ratios of the calculated
lifetimes with experimental data. It is shown that the ratio of the lifetimes
is very sensitive to different models.
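The reason a ratio of lifetimes tests the matrix elements can be sketched with the standard factorized expression for the decay rate. The abstract does not state this expression, so it is an assumption here, and it holds for decay mediated by light Majorana neutrino exchange. The symbols are the half-life $T^{0\nu}_{1/2}$, the phase-space factor $G^{0\nu}$, the nuclear matrix element $M^{0\nu}$, and the effective Majorana mass $m_{\beta\beta}$.

```latex
% Assumed standard form of the decay rate for a nucleus (A, Z):
\frac{1}{T^{0\nu}_{1/2}(A,Z)}
  = G^{0\nu}(A,Z)\,\bigl|M^{0\nu}(A,Z)\bigr|^{2}\,\bigl|m_{\beta\beta}\bigr|^{2}

% Ratio of the lifetimes of two nuclei: the unknown m_{\beta\beta} cancels.
\frac{T^{0\nu}_{1/2}(A_1,Z_1)}{T^{0\nu}_{1/2}(A_2,Z_2)}
  = \frac{G^{0\nu}(A_2,Z_2)\,\bigl|M^{0\nu}(A_2,Z_2)\bigr|^{2}}
         {G^{0\nu}(A_1,Z_1)\,\bigl|M^{0\nu}(A_1,Z_1)\bigr|^{2}}
```

Under this assumption the ratio depends only on calculable phase-space factors and on the matrix elements, so a measured ratio can be compared directly with each model's prediction.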