7 research outputs found

    Generic Document Image Dewarping by Probabilistic Discretization of Vanishing Points

    Document image dewarping is still a challenge, especially when documents are captured with one camera in an uncontrolled environment. In this paper we propose a generic approach based on vanishing points (VPs) to reconstruct the 3D shape of document pages. Unlike previous methods, we do not need to segment the text included in the documents. Therefore, our approach is less sensitive to pre-processing and segmentation errors. The computation of the VPs is robust and relies on the a-contrario framework, which has only one parameter whose setting is based on probabilistic reasoning instead of experimental tuning. Thus, our method can be applied to any kind of document, including text and non-text blocks, and extended to other kinds of images. Experimental results show that the proposed method is robust to a variety of distortions.

    Geometric correction of historical Arabic documents

    Geometric deformations in historical documents significantly influence the success of both Optical Character Recognition (OCR) techniques and human readability. They may have been introduced at any time during the life cycle of a document, from when it was first printed to the time it was digitised by an imaging device. This thesis focuses on the challenging domain of geometric correction of Arabic historical documents, where background research has highlighted that existing approaches for geometric correction of Latin-script historical documents are not sensitive to the characteristics of text in Arabic documents and therefore cannot be applied successfully. Text line segmentation and baseline detection algorithms have been investigated in order to propose a new algorithm better suited to warped Arabic historical document images. Advanced ideas for performing dewarping and geometric restoration on historical Arabic documents, as dictated by the specific characteristics of the problem, have been implemented. In addition to developing an algorithm to detect accurate baselines of historical printed Arabic documents, the research also contributes a new dataset consisting of historical Arabic documents with different degrees of warping severity. Overall, a new dewarping system, the first for historical Arabic documents, has been developed, taking into account both global and local features of the text image and the patterns of the smooth distortion between text lines. By using the results of the proposed line segmentation and baseline detection methods, it can cope with a variety of distortions, such as page curl, arbitrary warping and folds.

    Development of a text reading system on video images

    Since the early days of computer science, researchers have sought to devise a machine which could automatically read text to help people with visual impairments. The problem of extracting and recognising text on document images has been largely resolved, but reading text from images of natural scenes remains a challenge. Scene text can present uneven lighting, complex backgrounds or perspective and lens distortion; it usually appears as short sentences or isolated words and shows a very diverse set of typefaces. However, video sequences of natural scenes provide a temporal redundancy that can be exploited to compensate for some of these deficiencies. Here we present a complete end-to-end, real-time scene text reading system on video images based on perspective-aware text tracking. The main contribution of this work is a system that automatically detects, recognises and tracks text in videos of natural scenes in real time. The focus of our method is on large text found in outdoor environments, such as shop signs, street names and billboards. We introduce novel efficient techniques for text detection, text aggregation and text perspective estimation. Furthermore, we propose using a set of Unscented Kalman Filters (UKF) to maintain each text region's identity and to continuously track the homography transformation of the text into a fronto-parallel view, thereby being resilient to erratic camera motion and wide baseline changes in orientation. The orientation of each text line is estimated using a method that relies on the geometry of the characters themselves to estimate a rectifying homography. This is done irrespective of the view of the text over a large range of orientations. We also demonstrate a wearable head-mounted device for text reading that encases a camera for image acquisition and a pair of headphones for synthesized speech output. Our system is designed for continuous and unsupervised operation over long periods of time. It is completely automatic and features quick failure recovery and interactive text reading. It is also highly parallelised in order to maximize the usage of available processing power and to achieve real-time operation. We show comparative results that improve on the current state of the art when correcting perspective deformation of scene text. The end-to-end system performance is demonstrated on sequences recorded in outdoor scenarios. Finally, we also release a dataset of text tracking videos along with the annotated ground truth of text regions.

    Development of Particle Image Velocimetry for In-Vitro Studies of Arterial Haemodynamics

    Atherosclerosis and related cardiovascular diseases (CVDs) are amongst the largest causes of morbidity and mortality in the developed world, causing considerable monetary pressure on public health systems worldwide. Atherosclerosis is characterised by the build-up of vascular plaque in medium and large arteries and is a direct precursor to acute vascular syndromes such as myocardial infarction, stroke or peripheral arterial diseases. The causative factors leading to CVD still remain relatively poorly understood, but are becoming increasingly identifiable as a dysfunction of the endothelial cells that line the arterial wall. It is well known that the endothelium responds to the prevailing fluid mechanic (i.e. haemodynamic) environment, which plays a crucial role in the localised occurrence of atherosclerosis near vessel bends and bifurcations. In these areas, disturbed haemodynamics lead to flow separation and very low wall shear stress (WSS), which directly affects the functionality of the endothelium and impedes the transport of important blood-borne agonists and antagonists. Detailed full-field measurements assessing complex haemodynamics are sparse, and consequently this thesis aims to address some of the important questions related to arterial haemodynamics and CVD by performing in-vitro flow measurements in physiologically relevant conditions. In particular, this research develops and uses state-of-the-art Particle Image Velocimetry (PIV) techniques to measure three-dimensional velocity and WSS fields in scaled models of the human carotid artery. For this purpose, the necessary theoretical and experimental concepts are developed and in-depth analyses of the underlying factors affecting the local haemodynamics and their relation to CVD are carried out. In the first part, a methodology for the construction of transparent hydraulic flow phantoms from medical imaging data is developed. The arterial geometries are reproduced in optically clear silicone and the flowing blood is modelled with a refractive-index-matched blood analogue. Subsequently, planar and Stereo-PIV techniques are developed and verified. A novel interfacial PIV (iPIV) technique is introduced to directly measure WSS by inferring the velocity gradient from the recorded particle images. The new technique offers a maximal achievable resolution of 1 pixel and therefore removes the resolution limit near the wall usually associated with PIV. Furthermore, the iPIV performance is assessed on a number of numerical and experimental test cases, and iPIV offers a significantly improved measurement accuracy compared to more traditional techniques. Subsequently, the developed methodologies are applied in three studies to characterise the velocity and WSS fields in the human carotid artery under a number of physiological and experimental conditions. The first study focuses on idealised vessel geometries with and without disease and establishes a general understanding of the haemodynamic environment. Secondly, a physiologically accurate vessel geometry under pulsatile flow conditions is investigated to provide a more realistic representation of the true in-vivo flow conditions. The prevailing flow structure in both cases is characterised by flow separation, strong secondary flows and large spatial and temporal variations in WSS. Large spatial and temporal differences exist between the different geometries and flow conditions; spatial variations appear to be more significant than transient events. Thirdly, the three-dimensional flow structure in the physiological carotid artery model is investigated by means of stereoscopic and tomographic PIV, permitting for the first time the measurement of the full 3D-3C velocity field and shear stress tensor in such geometries. The flow field within the model is complex and three-dimensional and inherently determined by the vessel geometry and the build-up of an adverse pressure gradient. The main features include strong helicoidal flow motions and large spatial variations in WSS. Lastly, the physiological implications of the current results are discussed in detail and reference to previous work is given. In summary, the present research develops a novel and versatile PIV methodology for haemodynamic in-vitro studies, and its functionality and accuracy are demonstrated through a number of physiologically relevant flow measurements.

    Gaze-Based Human-Robot Interaction by the Brunswick Model

    We present a new paradigm for human-robot interaction based on social signal processing, and in particular on the Brunswick model. Originally, the Brunswick model addresses face-to-face dyadic interaction, assuming that the interactants communicate through a continuous exchange of nonverbal social signals in addition to the spoken messages. Social signals have to be interpreted through a proper recognition phase that considers visual and audio information. The Brunswick model allows the quality of the interaction to be evaluated quantitatively using statistical tools which measure how effective the recognition phase is. In this paper we recast this theory for the case in which one of the interactants is a robot; in this case, the recognition phases performed by the robot and the human have to be revised with respect to the original model. The model is applied to Berrick, a recent open-source low-cost robotic head platform, where gaze is the social signal considered.