
    Approximate String Matching With Dynamic Programming and Suffix Trees

    The importance of string matching algorithms, and their contribution to modern society, cannot be overstated. From basic search applications such as spell checking and data querying to advanced applications such as DNA sequencing, trend analysis and signal processing, string matching algorithms form the foundation of many areas of computing that have been pivotal in technological advancement. In general, string matching algorithms can be divided into exact string matching and approximate string matching. We study each area and examine some of the well-known algorithms. We probe into one of the most intriguing data structures in string algorithms, the suffix tree. The lowest common ancestor extension of the suffix tree is the key to many advanced string matching algorithms. With these tools, we are able to solve string problems that were, until recently, thought intractable by many. Another interesting and relatively new data structure in string algorithms is the suffix array, whose linear-time construction has seen significant breakthroughs in recent years. Primarily, this thesis focuses on approximate string matching using dynamic programming and hybrid dynamic programming with suffix trees. We study both approaches in detail and see how the merger of exact and approximate string matching algorithms can yield synergistic results in our experiments.
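
As a rough illustration of the dynamic programming approach to approximate string matching mentioned in this abstract, the sketch below uses the classic column-by-column recurrence in which a match may start at any text position. It is an assumption about the general technique, not the thesis's own implementation, and it does not include the suffix tree hybrid; the function name `approximate_matches` and the threshold parameter `k` are hypothetical.

```python
def approximate_matches(pattern: str, text: str, k: int) -> list[tuple[int, int]]:
    """Return (end_index, distance) pairs for every text position where some
    substring ending at that position is within k edits of the pattern."""
    m = len(pattern)
    # prev[i] = smallest edit distance between pattern[:i] and any substring
    # of the text ending just before the current character.
    prev = list(range(m + 1))
    hits = []
    for j, ch in enumerate(text):
        # curr[0] stays 0 because a match is allowed to start anywhere.
        curr = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            curr[i] = min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # extra character in the text
                curr[i - 1] + 1,     # pattern character left unmatched
            )
        if curr[m] <= k:
            hits.append((j, curr[m]))
        prev = curr
    return hits


if __name__ == "__main__":
    print(approximate_matches("abc", "xxabcxx", 0))
    print(approximate_matches("abc", "abd", 1))
```

The running time is proportional to the product of the pattern and text lengths, and only two columns are kept in memory at a time.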

    Extraction of Blood Vessels Geometric Shape Features with Catheter Localization and Geodesic Distance Transform for Right Coronary Artery Detection.

    X-ray angiography is considered the standard imaging system for diagnosing coronary artery diseases. For automated, accurate diagnosis of such diseases, detecting the coronary vessels in the captured low-quality, noisy angiography images is challenging. It is essential to detect the main branch of the coronary artery and to overcome these limitations, along with the problems caused by sudden changes in lumen diameter and abrupt changes in local artery direction. Accordingly, this paper addressed these limitations by proposing a computer-aided detection system for right coronary artery (RCA) extraction that uses geometric shape features with catheter localization and the geodesic distance transform on the angiography images, in two parts. In part 1, the captured image was first preprocessed for contrast enhancement using singular value decomposition-based contrast adjustment; a vesselness map was then generated using the Jerman filter, and K-means was applied for further segmentation. In part 2, the geometric shape features of the RCA, the skeleton gradient transform, and the start/end points were determined to extract the main blood vessel of the RCA. The skeletonized image was analyzed using the geodesic distance transform to examine all branches, starting from the predetermined start point and following the branching up to the predefined end points. Finally, a ranking matrix and the inverse of skeletonization were applied to obtain the actual main branch. The performance of the proposed system was then evaluated using different evaluation metrics on the angiography images...
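
To make the geodesic distance transform step more concrete, here is a minimal sketch that measures distance along a binary skeleton from a start pixel using breadth-first search. It assumes 8-connectivity and a unit cost per step, neither of which is stated in the abstract, and the function name `geodesic_distance` is hypothetical rather than taken from the paper.

```python
from collections import deque


def geodesic_distance(mask, start):
    """Breadth-first geodesic distances over the foreground of a binary mask.

    mask  : list of rows of 0/1 values (1 marks a skeleton pixel)
    start : (row, col) of the start pixel, which must be foreground
    Returns a grid of step counts, with None for unreachable or background pixels.
    """
    rows, cols = len(mask), len(mask[0])
    r0, c0 = start
    if not mask[r0][c0]:
        raise ValueError("start pixel must lie on the skeleton")
    dist = [[None] * cols for _ in range(rows)]
    dist[r0][c0] = 0
    queue = deque([(r0, c0)])
    while queue:
        r, c = queue.popleft()
        # Visit the 8 neighbours; paths may only pass through foreground pixels.
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols \
                        and mask[nr][nc] and dist[nr][nc] is None:
                    dist[nr][nc] = dist[r][c] + 1
                    queue.append((nr, nc))
    return dist


if __name__ == "__main__":
    skeleton = [
        [1, 1, 1],
        [0, 0, 1],
        [1, 1, 1],
    ]
    for row in geodesic_distance(skeleton, (0, 0)):
        print(row)
```

In the small example, the bottom-left pixel is two rows away from the start in a straight line but four steps away along the skeleton, which is the kind of path-following distance that lets branches be traced from a start point to candidate end points.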

    Path planning and navigation for a mobile robot

    Human activity prediction by mapping grouplets to recurrent self-organizing map

    Human activity prediction is defined as inferring the high-level activity category from the observation of only a few action units. It is very meaningful for time-critical applications such as emergency surveillance. For efficient prediction, we represent the ongoing human activity using body part movements, taking full advantage of their inherent sequentiality, and then find the best-matching activity template with a suitable alignment measure. In streaming videos, dense spatio-temporal interest points (STIPs) are first extracted as low-level descriptors because of their high detection efficiency. Then, sparse grouplets, i.e., clustered point groups, are located to represent body part movements, for which we propose a scale-adaptive mean shift method that determines the grouplet number and scale for each frame adaptively. To learn the sequentiality, the located grouplets are successively mapped to a Recurrent Self-Organizing Map (RSOM), which has been pre-trained to preserve the temporal topology of grouplet sequences. During this mapping, a growing RSOM trajectory, which represents the ongoing activity, is obtained. To suit the special structure of the RSOM trajectory, a combination of dynamic time warping (DTW) distance and edit distance, called DTW-E distance, is designed for similarity measurement. Four activity datasets with different characteristics, such as complex scenes and inter-class ambiguities, are used for performance evaluation. Experimental results confirm that our method is very efficient for predicting human activity and yields better performance than state-of-the-art works. © 2015 Elsevier B.V. All rights reserved.
    Funding: National Natural Science Foundation of China (NSFC) [61340046]; National High Technology Research and Development Program of China (863 Program) [2006AA04Z247]; Scientific and Technical Innovation Commission of Shenzhen Municipality [JCYJ20120614152234873, JCYJ20130331144716089]; Specialized Research Fund for the Doctoral Program of Higher Education [20130001110011].
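
The abstract does not specify how the DTW-E distance combines its two parts, so the sketch below shows only the standard dynamic time warping component, as an illustration of aligning two sequences that progress at different speeds. The function name `dtw_distance` and the default absolute-difference cost are assumptions made for this example.

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Classic dynamic time warping distance between sequences a and b.

    dist is the local cost between two elements (absolute difference by default).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # D[i][j] = cost of the best alignment of a[:i] with b[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(a[i - 1], b[j - 1])
            D[i][j] = cost + min(
                D[i - 1][j],      # a[i-1] is matched again (b element repeats)
                D[i][j - 1],      # b[j-1] is matched again (a element repeats)
                D[i - 1][j - 1],  # both sequences advance
            )
    return D[n][m]


if __name__ == "__main__":
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
    print(dtw_distance([1, 2, 3], [2, 3, 4]))
```

Because an element may be matched repeatedly, a sequence that lingers on one value can still align at zero cost with a shorter one, which is what makes this kind of distance suitable for trajectories observed at different speeds.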

    Data Structures for Efficient String Algorithms

    This thesis deals with data structures that are mostly useful in the area of string matching and string mining. Our main result is an O(n)-time preprocessing scheme for an array of n numbers such that subsequent queries asking for the position of a minimum element in a specified interval can be answered in constant time (so-called RMQs, for Range Minimum Queries). The space for this data structure is 2n+o(n) bits, which is shown to be asymptotically optimal in a general setting. This improves all previous results on this problem. The main techniques for deriving this result rely on combinatorial properties of arrays and so-called Cartesian Trees. For compressible input arrays we show that further space can be saved without affecting the time bounds. For the two-dimensional variant of the RMQ-problem we give a preprocessing scheme with quasi-optimal time bounds, but with an asymptotic increase in space consumption by a factor of log(n). It is well known that algorithms for answering RMQs in constant time are useful for many different algorithmic tasks (e.g., the computation of lowest common ancestors in trees); in the second part of this thesis we give several new applications of the RMQ-problem. We show that our preprocessing scheme for RMQ (and a variant thereof) leads to improvements in the space and time consumption of the Enhanced Suffix Array, a collection of arrays that can be used for many tasks in pattern matching. In particular, we will see that, in conjunction with the suffix and LCP arrays, 2n+o(n) bits of additional space (coming from our RMQ-scheme) are sufficient to find all occ occurrences of a (usually short) pattern of length m in a (usually long) text of length n in O(m*s+occ) time, where s denotes the size of the alphabet. This is certainly optimal if the size of the alphabet is constant; for non-constant alphabets we can improve this to O(m*log(s)+occ) locating time, replacing our original scheme with a data structure of size approximately 2.54n bits. Again by using RMQs, we then show how to solve frequency-related string mining tasks in optimal time. In a final chapter we propose a space- and time-optimal algorithm for computing suffix arrays on texts that are logically divided into words, if one is just interested in finding all word-aligned occurrences of a pattern. Apart from the theoretical improvements made in this thesis, most of our algorithms are also of practical value; we underline this fact by empirical tests and comparisons on real-world problem instances. In most cases our algorithms outperform previous approaches in every respect.
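
For readers unfamiliar with range minimum queries, the sketch below shows a simpler, well-known alternative: a sparse table that answers queries in constant time after O(n log n) preprocessing. It is not the O(n)-time, 2n+o(n)-bit scheme this thesis describes, and the class name `SparseTableRMQ` is hypothetical.

```python
class SparseTableRMQ:
    """Constant-time range minimum queries via a sparse table.

    Stores, for every power-of-two block length, the position of a minimum
    in each block. Uses O(n log n) words, unlike the succinct scheme in the
    thesis abstract above.
    """

    def __init__(self, values):
        self.values = list(values)
        n = len(self.values)
        # table[k][i] = position of the leftmost minimum in values[i : i + 2**k]
        self.table = [list(range(n))]
        k = 1
        while (1 << k) <= n:
            prev = self.table[k - 1]
            half = 1 << (k - 1)
            row = []
            for i in range(n - (1 << k) + 1):
                left, right = prev[i], prev[i + half]
                row.append(left if self.values[left] <= self.values[right] else right)
            self.table.append(row)
            k += 1

    def query(self, lo, hi):
        """Position of a minimum in values[lo..hi] (inclusive), leftmost on ties."""
        if not 0 <= lo <= hi < len(self.values):
            raise IndexError("invalid query range")
        # Cover the range with two overlapping blocks of length 2**k.
        k = (hi - lo + 1).bit_length() - 1
        left = self.table[k][lo]
        right = self.table[k][hi - (1 << k) + 1]
        return left if self.values[left] <= self.values[right] else right


if __name__ == "__main__":
    rmq = SparseTableRMQ([3, 1, 4, 1, 5, 9, 2, 6])
    print(rmq.query(0, 7), rmq.query(2, 4), rmq.query(4, 7))
```

Applied to an LCP array, a query of this kind returns the position of the smallest value in an interval, which is the basic operation the abstract refers to when it pairs RMQs with the suffix and LCP arrays.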

    Exploring Eye Tracking Data on Source Code via Dual Space Analysis

    Eye tracking is a frequently used technique to collect data capturing users' strategies and behaviors in processing information. Understanding how programmers navigate through a large number of classes and methods to find bugs is important to educators and practitioners in software engineering. However, the eye tracking data collected on realistic codebases is massive compared to traditional eye tracking data on one static page. The same content may appear in different areas on the screen with users scrolling in an Integrated Development Environment (IDE). Hierarchically structured content and fluid method position compose the two major challenges for visualization. We present a dual-space analysis approach to explore eye tracking data by leveraging existing software visualizations and a new graph embedding visualization. We use the graph embedding technique to quantify the distance between two arbitrary methods, which offers a more accurate visualization of distance with respect to the inherent relations, compared with the direct software structure and the call graph. The visualization offers both naturalness and readability showing time-varying eye movement data in both the content space and the embedded space, and provides new discoveries in developers' eye tracking behaviors. Adviser: Hongfeng Y

    Attention and visual memory in visualization and computer graphics

    A fundamental goal of visualization is to produce images of data that support visual analysis, exploration, and discovery of novel insights. An important consideration during visualization design is the role of human visual perception. How we “see” details in an image can directly impact a viewer’s efficiency and effectiveness. This paper surveys research on attention and visual perception, with a specific focus on results that have direct relevance to visualization and visual analytics. We discuss theories of low-level visual perception, then show how these findings form a foundation for more recent work on visual memory and visual attention. We conclude with a brief overview of how knowledge of visual attention and visual memory is being applied in visualization and graphics. We also discuss how challenges in visualization are motivating research in psychophysics.