128 research outputs found

    The Second Hungarian Workshop on Image Analysis : Budapest, June 7-9, 1988.


    High-resolution 3D direct-write prototyping for healthcare applications

    The healthcare sector stands to benefit greatly from the vast array of innovations emerging from the manufacturing world. 3D printing (additive manufacturing) is amongst the most promising of these, with much research concentrated on its various approaches and on applying them effectively in the health sector. Amongst these methods, direct-write assembly is a promising candidate for rapid prototyping and manufacturing of miniaturised medical devices and sensors, in particular miniaturised flexible capacitive pressure sensors. Microstructuring the dielectric medium of a capacitive pressure sensor enhances its sensitivity; this structuring has predominantly been achieved with photolithography and similar subtractive approaches. In this project, high-resolution 3D direct-write printing was used to fabricate structured dielectric media for capacitive pressure sensors. This involved the development and rheological characterisation of printability-tuned, water-soluble polyvinyl pyrrolidone (PVP) based inks (10%-30% polymer content) for stable high-resolution 3D printing. These inks were used to print water-soluble micromoulds that were filled and cured with otherwise difficult-to-structure, low-G' materials such as PDMS. Our approach essentially decouples ink synthesis from printability at the micrometre scale. The developed micromoulding approach was employed to print pyramidal micromoulds, which served as templates for fabricating pyramid-structured dielectric media for capacitive pressure sensing. The flexibility of the approach was exploited to vary the microstructures and obtain enhanced pressure-sensing characteristics for effective miniaturised capacitive pressure sensors. A pressure-sensing ring, which could be worn by doctors and surgeons, was prototyped with our approach and employed successfully to monitor in real time the radial pulse signal of a 29-year-old male volunteer. The print resolution of the inks was further enhanced by formulating and rheologically characterising a PVP/PVDF polymer blend ink that wets the printing nozzle less, owing to the hydrophobicity of the PVDF.
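
    As background for why a microstructured dielectric responds more strongly to pressure, the standard parallel-plate relations are shown below; these are generic textbook expressions given for orientation only, not formulas taken from the thesis, and the symbols are generic.

```latex
% Parallel-plate capacitance and pressure sensitivity (generic textbook relations):
%   C   capacitance                  \varepsilon_0  vacuum permittivity
%   \varepsilon_r  relative permittivity of the dielectric
%   A   electrode area               d   dielectric thickness
%   S   sensitivity: relative capacitance change per unit applied pressure
C = \frac{\varepsilon_0 \varepsilon_r A}{d},
\qquad
S = \frac{\Delta C / C_0}{\Delta P}
% A microstructured (e.g. pyramidal) dielectric deforms more than a solid film
% under the same pressure, so d (and the air fraction, hence the effective
% \varepsilon_r) changes more per unit pressure, which increases S.
```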

    Texture Structure Analysis

    Texture analysis plays an important role in applications such as automated pattern inspection, image and video compression, content-based image retrieval, remote sensing, medical imaging and document processing, to name a few. Texture structure analysis is the process of studying the structure present in textures. This structure can be expressed in terms of perceived regularity. The human visual system (HVS) uses perceived regularity as one of the important pre-attentive cues in low-level image understanding. Similar to the HVS, image processing and computer vision systems can make fast and efficient decisions if they can quantify this regularity automatically. In this work, the problem of quantifying the degree of perceived regularity when looking at an arbitrary texture is introduced and addressed. One key contribution of this work is an objective no-reference perceptual texture regularity metric based on visual saliency. Other key contributions include an adaptive texture synthesis method based on texture regularity, and a low-complexity reduced-reference visual quality metric for assessing the quality of synthesized textures. In order to use the best-performing visual attention model on textures, the ability of the most popular visual attention models to predict visual saliency on textures is evaluated. Since there is no publicly available database with ground-truth saliency maps for images with exclusively texture content, a new eye-tracking database is systematically built. Using the Visual Saliency Map (VSM) generated by the best visual attention model, the proposed texture regularity metric is computed. The proposed metric is based on the observation that VSM characteristics differ between textures of differing regularity. The proposed texture regularity metric is based on two texture regularity scores, namely a textural similarity score and a spatial distribution score. In order to evaluate the performance of the proposed regularity metric, a texture regularity database called RegTEX is built as part of this work. It is shown through subjective testing that the proposed metric has a strong correlation with the Mean Opinion Score (MOS) for the perceived regularity of textures. The proposed method is also shown to be robust to geometric and photometric transformations, and it outperforms some of the popular texture regularity metrics in predicting perceived regularity. The potential of the proposed metric to improve the performance of many image-processing applications is also presented. The influence of perceived texture regularity on the perceptual quality of synthesized textures is demonstrated through a synthesized-textures database named SynTEX. It is shown through subjective testing that textures with different degrees of perceived regularity exhibit different degrees of vulnerability to artifacts resulting from different texture synthesis approaches. This work also proposes an algorithm for adaptively selecting the appropriate texture synthesis method based on the perceived regularity of the original texture. A reduced-reference texture quality metric for texture synthesis is also proposed as part of this work. The metric is based on the change in perceived regularity and the change in perceived granularity between the original and the synthesized textures. The perceived granularity is quantified through a new granularity metric that is proposed in this work. It is shown through subjective testing that the proposed quality metric, using just two parameters, has a strong correlation with the MOS for the fidelity of synthesized textures and outperforms state-of-the-art full-reference quality metrics on three different texture databases. Finally, the ability of the proposed regularity metric to predict the perceived degradation of textures due to compression and blur artifacts is also established. (Ph.D. dissertation, Electrical Engineering.)
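
    The abstract names the two ingredients of the regularity metric but not their formulation; the sketch below is a hypothetical Python illustration of how a textural-similarity score and a spatial-distribution score might be computed from a visual saliency map and combined. Every function name and the combination rule are assumptions for illustration, not the thesis's actual metric.

```python
import numpy as np

def regularity_score(saliency_map: np.ndarray, patch: int = 16) -> float:
    """Hypothetical illustration: combine a textural-similarity score and a
    spatial-distribution score derived from a visual saliency map (VSM).
    Not the thesis's actual metric."""
    h, w = saliency_map.shape
    # Tile the VSM into non-overlapping patches.
    tiles = [saliency_map[i:i + patch, j:j + patch]
             for i in range(0, h - patch + 1, patch)
             for j in range(0, w - patch + 1, patch)]
    flat = np.stack([t.ravel() for t in tiles])

    # Textural similarity: mean pairwise correlation between patch saliency profiles
    # (regular textures tend to produce similar saliency patterns in every patch).
    corr = np.corrcoef(flat)
    similarity = float(np.nanmean(corr[np.triu_indices_from(corr, k=1)]))

    # Spatial distribution: how evenly saliency mass is spread over the patches,
    # measured by normalised entropy (1 = perfectly even, 0 = one dominant patch).
    mass = flat.sum(axis=1)
    mass = mass / (mass.sum() + 1e-12)
    entropy = float(-(mass * np.log(mass + 1e-12)).sum())
    distribution = entropy / max(np.log(len(mass)), 1e-12)

    # Placeholder combination rule: simple average of the two scores.
    return 0.5 * (similarity + distribution)

# Toy usage on a synthetic "saliency map".
vsm = np.abs(np.sin(np.outer(np.arange(64), np.arange(64)) / 7.0))
print(round(regularity_score(vsm), 3))
```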

    A topological solution to object segmentation and tracking

    The world is composed of objects, the ground, and the sky. Visual perception of objects requires solving two fundamental challenges: segmenting visual input into discrete units, and tracking the identities of these units despite appearance changes due to object deformation, changing perspective, and dynamic occlusion. Current computer vision methods for segmentation and tracking that approach human performance all require learning, raising the question: can objects be segmented and tracked without learning? Here, we show that the mathematical structure of light rays reflected from environment surfaces yields a natural representation of persistent surfaces, and that this surface representation provides a solution to both the segmentation and tracking problems. We describe how to generate this surface representation from continuous visual input, and demonstrate that our approach can segment and invariantly track objects in cluttered synthetic video despite severe appearance changes, without requiring learning. Comment: 21 pages, 6 main figures, 3 supplemental figures, and supplementary material containing mathematical proof.

    Analysis and synthesis of iris images

    Of all the physiological traits of the human body that help in personal identification, the iris is probably the most robust and accurate. Although numerous iris recognition algorithms have been proposed, the underlying processes that define the texture of irises have not been extensively studied. In this thesis, multiple pair-wise pixel interactions are used to describe the textural content of the iris image, resulting in a Markov Random Field (MRF) model of the iris image. This information is expected to be useful for the development of user-specific models for iris images, i.e., the matcher could be tuned to accommodate the characteristics of each user's iris image in order to improve matching performance. We also use MRF modeling to construct synthetic irises based on iris primitives extracted from real iris images. The synthesis procedure is deterministic and avoids sampling a probability distribution, making it computationally simple. We demonstrate that iris textures in general are significantly different from other irregular textural patterns. Clustering experiments indicate that the synthetic irises generated using the proposed technique are similar in textural content to real iris images.
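
    For orientation, a pairwise-interaction MRF over pixel intensities has the standard Gibbs form shown below; this is the generic textbook formulation, and the abstract does not specify the neighbourhood system or clique potentials actually used in the thesis.

```latex
% Generic pairwise Markov Random Field over pixel intensities x = (x_1, ..., x_N):
%   \mathcal{N}  neighbourhood system (the pairs of interacting pixels)
%   V_{ij}       pairwise clique potential for the pair (i, j)
%   Z            partition function (normalising constant)
p(x) \;=\; \frac{1}{Z}\,
\exp\!\Big( -\sum_{(i,j)\in\mathcal{N}} V_{ij}(x_i, x_j) \Big)
```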

    A Stochastic Grammar of Images

    This exploratory paper quests for a stochastic and context-sensitive grammar of images. The grammar should achieve the following four objectives and thus serve as a unified framework of representation, learning, and recognition for a large number of object categories. (i) The grammar represents both the hierarchical decompositions from scenes to objects, parts, primitives and pixels by terminal and non-terminal nodes, and the contexts for spatial and functional relations by horizontal links between the nodes. It formulates each object category as the set of all possible valid configurations produced by the grammar. (ii) The grammar is embodied in a simple And-Or graph representation where each Or-node points to alternative sub-configurations and an And-node is decomposed into a number of components. This representation supports recursive top-down/bottom-up procedures for image parsing under the Bayesian framework and makes it convenient to scale up in complexity. Given an input image, the image parsing task constructs a most probable parse graph on the fly as the output interpretation; this parse graph is a subgraph of the And-Or graph obtained by making a choice at each Or-node. (iii) A probabilistic model is defined on this And-Or graph representation to account for the natural occurrence frequency of objects and parts as well as their relations. This model is learned from a relatively small training set per category and then sampled to synthesize a large number of configurations that cover novel object instances in the test set. This generalization capability is mostly missing in discriminative machine learning methods and can greatly improve recognition performance in experiments. (iv) To fill the well-known semantic gap between symbols and raw signals, the grammar includes a series of visual dictionaries and organizes them through graph composition. At the bottom level the dictionary is a set of image primitives, each having a number of anchor points with open bonds to link with other primitives. These primitives can be combined to form larger and larger graph structures for parts and objects. Ambiguities in inferring local primitives are resolved through top-down computation using larger structures. Finally, these primitives form a primal sketch representation that generates the input image with every pixel explained. The proposed grammar integrates three prominent representations in the literature: stochastic grammars for composition, Markov (or graphical) models for contexts, and sparse coding with primitives (wavelets). It also combines the structure-based and appearance-based methods in the vision literature. Finally, the paper presents three case studies to illustrate the proposed grammar.
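
    A minimal sketch of the And-Or structure described in (ii), written as hypothetical Python classes rather than the paper's implementation: an Or-node selects exactly one of its alternatives, an And-node is decomposed into all of its components, and enumerating the choices at the Or-nodes yields the set of valid configurations.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List, Union

@dataclass
class Terminal:          # leaf: an image primitive / template
    name: str

@dataclass
class AndNode:           # decomposed into all of its components
    name: str
    parts: List["Node"] = field(default_factory=list)

@dataclass
class OrNode:            # points to alternative sub-configurations
    name: str
    alternatives: List["Node"] = field(default_factory=list)

Node = Union[Terminal, AndNode, OrNode]

def configurations(node: Node):
    """Enumerate the valid configurations (sets of terminals) producible below
    `node` -- one configuration per combination of Or-node choices."""
    if isinstance(node, Terminal):
        yield [node.name]
    elif isinstance(node, OrNode):
        for alt in node.alternatives:                  # choose exactly one branch
            yield from configurations(alt)
    else:                                              # AndNode: combine all parts
        for combo in product(*(list(configurations(p)) for p in node.parts)):
            yield [t for part in combo for t in part]

# Tiny example: a "clock" is a face AND hands; the hands are one OR two.
clock = AndNode("clock", [
    Terminal("face"),
    OrNode("hands", [Terminal("one-hand"), Terminal("two-hands")]),
])
print(list(configurations(clock)))
# [['face', 'one-hand'], ['face', 'two-hands']]
```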

    Toward Flare-Free Images: A Survey

    Lens flare is a common image artifact, caused by a strong light source shining toward the camera, that can significantly degrade image quality and affect the performance of computer vision systems. This survey provides a comprehensive overview of the multifaceted domain of lens flare, encompassing its underlying physics, influencing factors, types, and characteristics. It delves into the complex optics of flare formation, arising from factors such as internal reflection, scattering, diffraction, and dispersion within the camera lens system. The diverse categories of flare are explored, including scattering, reflective, glare, orb, and starburst types, and key properties such as shape, color, and localization are analyzed. The numerous factors affecting flare appearance are discussed, spanning light-source attributes, lens features, camera settings, and scene content. The survey extensively covers the wide range of methods proposed for flare removal, including hardware optimization strategies, classical image processing techniques, and learning-based methods using deep learning. It describes not only the pioneering flare datasets created for training and evaluation but also how they were created. Commonly employed performance metrics such as PSNR, SSIM, and LPIPS are reviewed. Challenges posed by flare's complex and data-dependent characteristics are highlighted. The survey provides insights into best practices, limitations, and promising future directions for flare-removal research. Reviewing the state of the art enables an in-depth understanding of the inherent complexities of the flare phenomenon and the capabilities of existing solutions, which can inform and inspire new innovations for handling lens flare artifacts and improving visual quality across various applications.
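
    Of the metrics the survey lists, PSNR has the simplest closed form; the sketch below is a standard implementation given for reference only, not code from the survey. SSIM and LPIPS require dedicated implementations, such as those in the scikit-image and lpips packages.

```python
import numpy as np

def psnr(reference: np.ndarray, restored: np.ndarray, max_val: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio between a reference image and a flare-removed
    result, in dB. Standard definition: 10 * log10(MAX^2 / MSE)."""
    mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# Example with synthetic 8-bit images (real use: ground truth vs. deflared output).
rng = np.random.default_rng(0)
a = rng.integers(0, 256, (64, 64), dtype=np.uint8)
b = np.clip(a + rng.normal(0, 5, a.shape), 0, 255).astype(np.uint8)
print(f"PSNR: {psnr(a, b):.2f} dB")
```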

    Modeling Robotic Systems with Activity Flow Graphs

    Autonomous robotic systems are becoming increasingly common in our society, with research efforts towards automated goods transportation, service robots and autonomous cars. These complex systems have to solve many different problems in order to function robustly. Two especially important areas of interest are perception and high-level control. Intelligent systems have to perceive their surroundings in order to act autonomously. With an understanding of the environment, they can then make their own decisions based on high-level control policies defined by their developers. Robotic systems differ drastically in their sensory capabilities, their computational power, and their designated tasks. When developing algorithms, however, we need a common modeling framework that enables us to generalize and re-use existing solutions. A modular approach that is coherent across different platforms also allows faster prototyping of new systems. In this dissertation we develop a modeling framework based on data flow that achieves this goal. We first extend the existing Synchronous Data Flow (SDF) model and combine it with reactive programming ideas and finite-state machines. Together, these existing frameworks enable us to model many aspects of complex robotic systems. We apply this model to a robot in a warehouse scenario to demonstrate the viability of the approach. Using three disjoint formalisms to model a robotic system has many downsides, so in a first unification step we merge SDF and reactive programming into Hybrid Flow Graphs (HFGs), in which synchronous and asynchronous data flow are modeled explicitly. We then apply the HFG model to the perception system of an autonomous transportation robot. In a last step, we eliminate the need for separate finite-state machines by introducing the concept of activity into the data flow, unifying the different aspects into a single, coherent framework which we call Activity Flow Graphs (AFGs). The flow of activity enables us to model high-level state directly in the data flow graph. The result is a single computation graph that can express both the perception and the high-level control aspects of any robotic system, which we demonstrate with multiple high-level robotic system models. Finally, we use the uniform AFG model to provide a single graphical user interface that allows a developer to rapidly prototype complete robotic systems. Since all aspects of a robot can be implemented using the same theoretical framework, there is no need to switch between different paradigms. The user interface is designed to give immediate feedback, which speeds up prototyping, testing and evaluation, as well as debugging when working with real robots.
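
    To make the data-flow idea concrete, the sketch below shows a tiny synchronous data-flow graph in which each node carries an activity flag; this is a hypothetical illustration only, far simpler than the dissertation's HFG/AFG formalisms.

```python
import random
from typing import Callable, List, Optional

class Node:
    """A data-flow node: when active, it fires its function on its inputs' latest values."""
    def __init__(self, name: str, fn: Callable[..., object],
                 inputs: Optional[List["Node"]] = None):
        self.name, self.fn, self.inputs = name, fn, inputs or []
        self.active = True        # activity flag: inactive nodes are simply skipped
        self.value = None

    def tick(self) -> None:
        if self.active:
            self.value = self.fn(*(src.value for src in self.inputs))

def run_graph(nodes: List[Node], ticks: int) -> None:
    """Synchronous schedule: on every tick, fire all nodes in (topological) list order."""
    for _ in range(ticks):
        for node in nodes:
            node.tick()

# Toy perception/control pipeline: noisy sensor -> filter -> high-level controller.
sensor = Node("sensor", lambda: random.gauss(1.0, 0.1))
filt   = Node("filter", lambda x: round(x, 2), [sensor])
ctrl   = Node("control", lambda x: "forward" if x is not None and x > 0.9 else "stop", [filt])
run_graph([sensor, filt, ctrl], ticks=3)
print(sensor.value, filt.value, ctrl.value)
```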

    Représentations de niveau intermédiaire pour la modélisation d'objets

    In this thesis we propose the use of mid-level representations, in particular i) medial axes, ii) object parts, and iii) convolutional features, for modelling objects. The first part of the thesis deals with detecting medial axes in natural RGB images. We adopt a learning approach, utilizing colour, texture and spectral clustering features to build a classifier that produces a dense probability map for symmetry. Multiple Instance Learning (MIL) allows us to treat scale and orientation as latent variables during training, while a variation based on random forests offers significant gains in running time. In the second part of the thesis we focus on object part modelling using both hand-crafted and learned feature representations. We develop a coarse-to-fine, hierarchical approach that uses probabilistic bounds on part scores to decrease the computational cost of mixture models with a large number of HOG-based templates. These efficiently computed probabilistic bounds allow us to quickly discard large parts of the image and to evaluate the exact convolution scores only at promising locations. Our approach achieves a 4x-5x speedup over the naive approach with minimal loss in performance. We also employ convolutional features to improve object detection. We use a popular CNN architecture to extract responses from an intermediate convolutional layer, integrate these responses into the classic DPM pipeline in place of hand-crafted HOG features, and observe a significant boost in detection performance (~14.5% increase in mAP). In the last part of the thesis we experiment with fully convolutional neural networks for the segmentation of object parts. We re-purpose a state-of-the-art CNN to perform fine-grained semantic segmentation of object parts and use a fully connected CRF as a post-processing step to obtain sharp boundaries. We also inject prior shape information into our model through a Restricted Boltzmann Machine trained on ground-truth segmentations. Finally, we train a new fully convolutional architecture from a random initialization to segment different parts of the human brain in magnetic resonance image data. Our methods achieve state-of-the-art results on both types of data.
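
    The bound-and-prune idea behind the coarse-to-fine part scoring can be sketched as below; this is a hypothetical illustration in which the bound is simply a coarse-grid maximum of the score, not the probabilistic bounds developed in the thesis.

```python
import numpy as np

def prune_then_score(score_fn, bound_map: np.ndarray, cell: int, threshold: float):
    """Evaluate the expensive exact score only where a cheap upper bound is promising.

    bound_map : per-cell upper bound on the part/template score, on a coarse grid
                (each cell covers `cell` x `cell` fine-grid locations).
    score_fn  : callable (y, x) -> exact score at a fine-grid location.
    Returns a dict {(y, x): exact_score} for the surviving locations.
    """
    results = {}
    coarse_ys, coarse_xs = np.where(bound_map >= threshold)   # keep promising cells
    for cy, cx in zip(coarse_ys, coarse_xs):
        for y in range(cy * cell, (cy + 1) * cell):           # refine inside the cell
            for x in range(cx * cell, (cx + 1) * cell):
                results[(y, x)] = score_fn(y, x)
    return results

# Toy usage: random "exact" scores, bounds taken as the coarse max of those scores.
rng = np.random.default_rng(1)
exact = rng.random((32, 32))
cell = 8
bounds = exact.reshape(4, cell, 4, cell).max(axis=(1, 3))     # valid upper bound per cell
kept = prune_then_score(lambda y, x: exact[y, x], bounds, cell, threshold=0.99)
print(f"evaluated {len(kept)} of {exact.size} locations")
```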

    Digital analysis of paintings
