    An Object Contour Tracing Algorithm Using Backward Tracing

    This article presents an analytical review of current contour tracing algorithms and identifies their advantages and disadvantages. An improved algorithm for tracing object contours using backward movement, “Backward contour tracing”, is proposed. The developed algorithm was tested on cytological images.

    Optimized Block-based Connected Components Labeling with Decision Trees

    In this paper we define a new paradigm for 8-connection labeling, which employs a general approach to improve neighborhood exploration and minimizes the number of memory accesses. First, we exploit and extend the decision table formalism by introducing OR-decision tables, in which multiple alternative actions are managed. An automatic procedure synthesizes the optimal decision tree from the decision table, providing the most effective order in which to evaluate conditions. Second, we propose a new scanning technique that moves on a 2x2 pixel grid over the image, which is optimized by the automatically generated decision tree. An extensive comparison with state-of-the-art approaches is presented on both synthetic and real datasets. The synthetic dataset is composed of random images of different sizes and densities, while the real datasets are an artistic image analysis dataset, a document analysis dataset for text detection and recognition, and a standard resolution dataset for picture segmentation tasks. The algorithm provides an impressive speedup over state-of-the-art algorithms.
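
For orientation, the labeling task itself can be sketched with the classical two-pass, pixel-by-pixel scheme and a union-find table. This is only a baseline illustration of 8-connected labeling, not the block-based, decision-tree-driven scan described in the abstract; the function name `label_components` and the list-of-lists image format are assumptions made for this sketch.

```python
def label_components(image):
    """Label 8-connected foreground regions of a binary image.

    `image` is a list of equal-length rows containing 0 (background) or 1.
    Returns (labels, count), where labels has the same shape as image.
    Baseline two-pass sketch, NOT the optimized block-based algorithm.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # parent[i]: union-find parent of provisional label i

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # First pass: assign provisional labels and record equivalences.
    # Only the four already-visited neighbours need to be inspected.
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            seen = []
            for dr, dc in ((-1, -1), (-1, 0), (-1, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc]:
                    seen.append(labels[nr][nc])
            if not seen:
                parent.append(len(parent))  # new label is its own root
                labels[r][c] = len(parent) - 1
            else:
                labels[r][c] = min(seen)
                for other in seen:
                    union(labels[r][c], other)

    # Second pass: replace each label by a consecutive final label.
    remap = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = remap.setdefault(root, len(remap) + 1)
    return labels, len(remap)
```

Every foreground pixel in this baseline inspects up to four neighbours individually; the abstract's contribution is to reduce such accesses by scanning 2x2 blocks in an order chosen by a generated decision tree.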

    One DAG to Rule Them All

    In this paper, we present novel strategies for optimizing the performance of many binary image processing algorithms. These strategies are collected in an open-source framework, GRAPHGEN, that automatically generates optimized C++ source code implementing the desired optimizations. Starting simply from a set of rules, the algorithms introduced with the GRAPHGEN framework can generate decision trees with minimum average path length, possibly taking image pattern frequencies into account, and apply state prediction and code compression through Directed Rooted Acyclic Graphs (DRAGs). Moreover, the proposed algorithmic solutions allow different optimization techniques to be combined, significantly improving performance. Our proposal is showcased on three classical and widely employed algorithms (namely Connected Components Labeling, Thinning, and Contour Tracing). When compared to existing approaches, in both 2D and 3D, implementations using the generated optimal DRAGs perform significantly better than previous state-of-the-art algorithms, both on CPU and GPU.

    Doctor of Philosophy

    High-order finite element methods, using either the continuous or discontinuous Galerkin formulation, are becoming more popular in fields such as fluid mechanics, solid mechanics and computational electromagnetics. While the use of these methods is becoming increasingly common, there has not been a corresponding increase in the availability and use of visualization methods and software that are capable of displaying visualizations of these volumes both accurately and interactively. A fundamental problem with the majority of existing visualization techniques is that they neither understand nor respect the structure of a high-order field, leading to visualization error. Visualizations of high-order fields are generally created by first approximating the field with low-order primitives and then generating the visualization using traditional methods based on linear interpolation. The approximation step introduces error into the visualization pipeline, which requires the user to balance the competing goals of image quality, interactivity and resource consumption. In practice, visualizations performed this way are often either undersampled, leading to visualization error, or oversampled, leading to unnecessary computational effort and resource consumption. Without an understanding of the sources of error, the simulation scientist is unable to determine whether artifacts in the image are due to visualization error, insufficient mesh resolution, or a failure in the underlying simulation. This uncertainty makes it difficult for scientists to make judgments based on the visualization: assuming that artifacts result from visualization error when they actually reflect a more fundamental problem can lead to poor decision-making. 
This dissertation presents new visualization algorithms that use the high-order data in its native state, using the knowledge of the structure and mathematical properties of these fields to create accurate images interactively, while avoiding the error introduced by representing the fields with low-order approximations. First, a new algorithm for cut-surfaces is presented, specifically the accurate depiction of colormaps and contour lines on arbitrarily complex cut-surfaces. Second, a mathematical analysis of the evaluation of the volume rendering integral through a high-order field is presented, as well as an algorithm that uses this analysis to create accurate volume renderings. Finally, a new software system, the Element Visualizer (ElVis), is presented, which combines the ideas and algorithms created in this dissertation in a single software package that can be used by simulation scientists to create accurate visualizations. This system was developed and tested with the assistance of the ProjectX simulation team. The utility of our algorithms and visualization system is then demonstrated with examples from several high-order fluid flow simulations.

    A Warp Speed Chain-Code Algorithm Based on Binary Decision Trees

    Contour extraction, also known as chain-code extraction, is one of the most common algorithms in binary image processing. Although raster order is the most cache-friendly and, consequently, the fastest way to scan an image, the most commonly used chain-code algorithms perform contour tracing and therefore tend to be fairly inefficient. In this paper, we took a rarely used algorithm that extracts contours in raster scan and optimized its execution time through template functions, look-up tables and decision trees, in order to reduce code branches and the average number of load/store operations required. The result is a very fast solution that outperforms the state-of-the-art contour extraction algorithm implemented in OpenCV on a collection of real-case datasets. Contribution: This paper significantly improves the performance of existing chain-code algorithms by introducing decision trees to reduce code branches and the average number of load/store operations required.
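
To make the term concrete, a chain code encodes a contour as a sequence of direction symbols between consecutive boundary pixels. The sketch below converts an already ordered list of 8-connected contour points into Freeman-style codes. It illustrates the output representation only, not the raster-scan extraction or the decision-tree optimization from the abstract; the direction numbering (0 = east, increasing counter-clockwise on screen, with rows growing downward) and the names `DIRECTION_CODE` and `chain_code` are assumptions of this example.

```python
# Assumed convention: points are (row, col), rows grow downward,
# 0 = east, 1 = north-east, 2 = north, ..., 7 = south-east.
DIRECTION_CODE = {
    (0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
    (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7,
}


def chain_code(points):
    """Return the direction codes linking consecutive contour points.

    Raises ValueError if two consecutive points are not 8-neighbours.
    """
    codes = []
    for (r0, c0), (r1, c1) in zip(points, points[1:]):
        step = (r1 - r0, c1 - c0)
        if step not in DIRECTION_CODE:
            raise ValueError(f"points {(r0, c0)} and {(r1, c1)} are not adjacent")
        codes.append(DIRECTION_CODE[step])
    return codes
```

For example, walking a closed 2x2 square with `chain_code([(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)])` yields one code per step: east, south, west, north under the convention assumed above.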

    Enabling Seamless Access to Digital Graphical Contents for Visually Impaired Individuals via Semantic-Aware Processing

    Vision is one of the main sources through which people obtain information from the world, but unfortunately, visually impaired people are partially or completely deprived of this type of information. With the help of computer technologies, people with visual impairment can independently access digital textual information by using text-to-speech and text-to-Braille software. However, there still exists a major barrier for people who are blind to access graphical information independently and in real time without the help of sighted people. In this paper, we propose a novel multi-level and multi-modal approach to this challenging and practical problem, with the key idea being semantic-aware visual-to-tactile conversion through semantic image categorization and segmentation, and semantic-driven image simplification. An end-to-end prototype system was built based on the approach. We present the details of the approach and the system, report sample experimental results with realistic data, and compare our approach with current typical practice.

    Underwater Localization in Complex Environments

    The ability of an autonomous underwater vehicle (AUV) to locate itself in a complex environment, as well as to detect relevant environmental features, is of crucial importance for successful navigation. However, this is particularly challenging in underwater environments due to the rapid attenuation of signals from global positioning systems and other radio-frequency signals, as well as dispersion and reflection, which makes a filtering process necessary. A complex environment is defined here as a scenario with objects detached from the walls; for example, an object can have a certain orientation variability, so its position is not always known. Example scenarios are a harbour, a tank or even a dam reservoir, where there are walls and, within those walls, an AUV may need to localize itself with respect to the other vehicles in the area and position itself relative to one of them to observe, analyse or scan it. Autonomous vehicles employ many different types of sensors for localization and for perceiving their environments, and they depend on on-board computers to perform autonomous driving tasks. This dissertation addresses a concrete problem: locating a cable suspended in the water column in a known region of the sea and navigating relative to it. Although the position of the cable in the world is well known, the cable dynamics do not allow its exact location to be known. Therefore, for the vehicle to localize itself relative to the cable so that the cable can be inspected, localization has to be based on optical and acoustic sensors. 
This study explores the processing and analysis of optical and acoustic images, acquired by a camera and by a mechanical scanning imaging sonar (MSIS), respectively, in order to extract relevant environmental features that allow the location of the vehicle to be estimated. The points of interest extracted from each sensor are fed to a position estimator implementing an Extended Kalman Filter (EKF), in order to estimate the position of the cable and, through feedback from the filter, to improve the point-of-interest extraction processes used.
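
As a rough illustration of the kind of estimator mentioned above, the following sketch performs a single Extended Kalman Filter measurement update for a scalar state. Everything specific in it is assumed for the example and is not taken from the dissertation: the state is a one-dimensional cable position `x`, the sensor sits at a known, nonzero perpendicular distance `sensor_offset` from the line on which `x` is measured, and the measurement is a range `z = sqrt(x^2 + sensor_offset^2)` with noise variance `r`.

```python
import math


def ekf_update(x, p, z, sensor_offset, r):
    """One scalar EKF measurement update (hypothetical range model).

    x, p          : prior state estimate and its variance
    z             : measured range to the target
    sensor_offset : assumed nonzero perpendicular sensor distance
    r             : measurement noise variance
    Returns the posterior (x_new, p_new).
    """
    predicted = math.hypot(x, sensor_offset)  # nonlinear model h(x)
    h_jac = x / predicted                     # dh/dx evaluated at the prior
    s = h_jac * p * h_jac + r                 # innovation variance
    k = p * h_jac / s                         # Kalman gain
    x_new = x + k * (z - predicted)           # correct with the innovation
    p_new = (1.0 - k * h_jac) * p             # posterior variance
    return x_new, p_new
```

With a prior of `x = 2.5`, `p = 1.0`, a measured range of `5.0`, `sensor_offset = 4.0` and `r = 0.01`, the update moves the estimate toward the position consistent with the measurement (3.0) and shrinks the variance. A full filter would also include a prediction step and a multi-dimensional state.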

    A versatile pitch tracking algorithm: from human speech to killer whale vocalizations

    Author Posting. © Acoustical Society of America, 2009. This article is posted here by permission of Acoustical Society of America for personal use, not for redistribution. The definitive version was published in Journal of the Acoustical Society of America 126 (2009): 451-459, doi:10.1121/1.3132525. In this article, a pitch tracking algorithm [named discrete logarithmic Fourier transformation-pitch detection algorithm (DLFT-PDA)], originally designed for human telephone speech, was modified for killer whale vocalizations. The multiple frequency components of some of these vocalizations demand a spectral (rather than temporal) approach to pitch tracking. The DLFT-PDA algorithm derives reliable estimations of pitch and the temporal change of pitch from the harmonic structure of the vocal signal. Scores from both estimations are combined in a dynamic programming search to find a smooth pitch track. The algorithm is capable of tracking killer whale calls that contain simultaneous low and high frequency components and compares favorably across most signal-to-noise ratio ranges to the peak-picking and sidewinder algorithms that have been used for tracking killer whale vocalizations previously. C.W. was supported by DARPA under Contract No. N66001-96-C-8526, monitored through Naval Command, Control, and Ocean Surveillance Center and by the National Science Foundation under Grant No. IRI-9618731. A.D.S. was supported by a National Defense Science and Engineering Graduate Fellowship.

    Techniques for document image processing in compressed domain

    The main objective of image compression is usually considered to be the minimization of storage space. However, as the need to access images frequently increases, it is becoming more important to process the compressed representation directly. In this work, techniques that can be applied directly and efficiently to digital information encoded by a given compression algorithm are investigated. Lossless compression schemes and information processing algorithms for binary document images and text data are two closely related areas bridged by the fast processing of coded data. The compressed domains addressed in this work, i.e., the ITU fax standards and the JBIG standard, are two major schemes used for document compression. Based on ITU Group IV, a modified coding scheme, MG4, which exploits the 2-dimensional correlation between scan lines, is developed. From the viewpoints of compression efficiency and processing flexibility of image operations, the MG4 coding principle and its feature-preserving behavior in the compressed domain are investigated and examined. Two popular coding schemes in the area of bi-level image compression, run-length and Group IV, are studied and compared with MG4 in three respects: compression complexity, compression ratio, and feasibility of compressed-domain algorithms. In particular, for the operations of connected component extraction, skew detection, and rotation, MG4 shows a significant speed advantage over conventional algorithms. Some useful techniques for processing JBIG encoded images directly in the compressed domain, or concurrently while they are being decoded, are proposed and generalized. In the second part of this work, the possibility of facilitating image processing in the wavelet transform domain is investigated. Textured images can be distinguished from each other by examining their wavelet transforms. 
The basic idea is that highly textured regions can be segmented using feature vectors extracted from high-frequency bands, based on the observation that textured images have large energies in both high and middle frequencies, while images in which the grey level varies smoothly are heavily dominated by the low-frequency channels in the wavelet transform domain. As a result, a new method is developed and implemented to detect textures and abnormalities in document images by using polynomial wavelets. Segmentation experiments indicate that this approach is superior to other traditional methods in terms of memory space and processing time.
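
The run-length representation that the first part of this abstract uses as a point of comparison can be sketched in a few lines. The example encodes one scan line of a bi-level image as alternating run lengths, starting with a white run (possibly of length zero), which is the usual convention in fax-style coding. It shows only the plain run-length idea, not Group IV or the MG4 scheme; the function name `run_lengths` and the 0 = white, 1 = black pixel convention are assumptions of this sketch.

```python
def run_lengths(row):
    """Encode a row of 0/1 pixels as alternating white/black run lengths.

    The first run is always white (0); it has length 0 when the row
    starts with a black pixel. The run lengths sum to len(row).
    """
    runs = []
    current = 0  # colour of the run being counted; starts with white
    length = 0
    for pixel in row:
        if pixel == current:
            length += 1
        else:
            runs.append(length)  # close the finished run
            current = pixel
            length = 1
    runs.append(length)  # close the final run
    return runs
```

Operations such as connected component extraction can then work on a handful of runs per line instead of on every pixel, which is the kind of saving compressed-domain processing aims for.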

    Research and applications: Artificial intelligence

    A program for developing artificial intelligence techniques and applying them to the control of mobile automatons that carry out tasks autonomously is reported. Visual scene analysis, short-term problem solving, and long-term problem solving are discussed, along with the PDP-15 simulator, the LISP-FORTRAN-MACRO interface, resolution strategies, and cost effectiveness.