200,918 research outputs found
Automated unique input output sequence generation for conformance testing of FSMs
This paper describes a method for automatically generating unique input output (UIO) sequences for FSM conformance testing. UIOs are used in conformance testing to verify the end state of a transition sequence. UIO sequence generation is represented as a search problem and genetic algorithms are used to search this space. Empirical evidence indicates that the proposed method yields considerably better (up to 62% better) results compared with random UIO sequence generation
All Maximal Independent Sets and Dynamic Dominance for Sparse Graphs
We describe algorithms, based on Avis and Fukuda's reverse search paradigm,
for listing all maximal independent sets in a sparse graph in polynomial time
and delay per output. For bounded degree graphs, our algorithms take constant
time per set generated; for minor-closed graph families, the time is O(n) per
set, and for more general sparse graph families we achieve subquadratic time
per set. We also describe new data structures for maintaining a dynamic vertex
set S in a sparse or minor-closed graph family, and querying the number of
vertices not dominated by S; for minor-closed graph families the time per
update is constant, while it is sublinear for any sparse graph family. We can
also maintain a dynamic vertex set in an arbitrary m-edge graph and test the
independence of the maintained set in time O(sqrt m) per update. We use the
domination data structures as part of our enumeration algorithms.Comment: 10 page
MSPKmerCounter: A Fast and Memory Efficient Approach for K-mer Counting
A major challenge in next-generation genome sequencing (NGS) is to assemble
massive overlapping short reads that are randomly sampled from DNA fragments.
To complete assembling, one needs to finish a fundamental task in many leading
assembly algorithms: counting the number of occurrences of k-mers (length-k
substrings in sequences). The counting results are critical for many components
in assembly (e.g. variants detection and read error correction). For large
genomes, the k-mer counting task can easily consume a huge amount of memory,
making it impossible for large-scale parallel assembly on commodity servers.
In this paper, we develop MSPKmerCounter, a disk-based approach, to
efficiently perform k-mer counting for large genomes using a small amount of
memory. Our approach is based on a novel technique called Minimum Substring
Partitioning (MSP). MSP breaks short reads into multiple disjoint partitions
such that each partition can be loaded into memory and processed individually.
By leveraging the overlaps among the k-mers derived from the same short read,
MSP can achieve astonishing compression ratio so that the I/O cost can be
significantly reduced. For the task of k-mer counting, MSPKmerCounter offers a
very fast and memory-efficient solution. Experiment results on large real-life
short reads data sets demonstrate that MSPKmerCounter can achieve better
overall performance than state-of-the-art k-mer counting approaches.
MSPKmerCounter is available at http://www.cs.ucsb.edu/~yangli/MSPKmerCounte
- ā¦