27 research outputs found

    Image Deblurring for Navigation Systems of Vision Impaired People Using Sensor Fusion Data

    Image deblurring is a key component in vision-based indoor/outdoor navigation systems, as blurring is one of the main causes of poor image quality. When poor-quality images are used for analysis, navigation errors are likely. In navigation systems, blurring is caused mainly by camera movement, as the camera moves continuously with the user's body. This paper proposes a deblurring methodology that takes advantage of the fact that most smartphones are equipped with 3-axis accelerometers and gyroscopes. It uses accelerometer and gyroscope data to derive a motion vector from the motion of the smartphone during the image-capturing period. A heuristic method, namely particle swarm optimization, is developed to determine the optimal motion vector, in order to deblur the captured image by reversing the effect of motion. Experimental results indicated that deblurring can be successfully performed using the optimal motion vector and that the deblurred images can readily be used for object and path identification in vision-based navigation systems, especially for blind and vision-impaired indoor/outdoor navigation. The performance of the proposed method is also compared with commonly used deblurring methods, and better results in terms of image quality are achieved. This experiment aims to identify issues in image quality, including low light conditions, low-quality images due to movement of the capture device, and static and moving obstacles in front of the user in both indoor and outdoor environments. From this information, image-processing techniques will be identified to assist in the object and path edge detection necessary to create a guidance system for those with low vision.
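
    The idea of deriving a motion vector from gyroscope readings taken during the exposure can be illustrated with a minimal sketch. This is not the paper's method: it assumes pure camera rotation, a known focal length in pixels, and a fixed sampling interval, and it ignores the accelerometer data and the particle swarm refinement described in the abstract. The function name, the axis convention, and the sample values are all hypothetical.

```python
import math

def blur_vector_px(gyro_samples, dt, focal_length_px):
    """Estimate the image-plane blur vector (dx, dy) in pixels.

    gyro_samples    : list of (wx, wy) angular velocities in rad/s,
                      sampled during the exposure (hypothetical input)
    dt              : sampling interval in seconds
    focal_length_px : focal length in pixels (assumed known)

    Assumption: pure rotation, so a rotation by angle theta shifts a
    point near the image centre by roughly focal_length_px * tan(theta).
    """
    # Rectangle-rule integration of angular velocity over the exposure.
    angle_x = sum(wx for wx, _ in gyro_samples) * dt
    angle_y = sum(wy for _, wy in gyro_samples) * dt
    # Assumed convention: rotation about y moves the image horizontally,
    # rotation about x moves it vertically.
    dx = focal_length_px * math.tan(angle_y)
    dy = focal_length_px * math.tan(angle_x)
    return dx, dy

# Hypothetical exposure: 10 samples at 1 kHz, rotating about y at 0.1 rad/s.
samples = [(0.0, 0.1)] * 10
dx, dy = blur_vector_px(samples, dt=0.001, focal_length_px=1000.0)
print(f"blur vector: ({dx:.3f}, {dy:.3f}) px")
```

    A vector of this kind would only be a starting estimate; the abstract describes an optimization step that searches for the motion vector giving the best deblurred image.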

    Alignment parameter calibration for IMU using the Taguchi method for image deblurring

    Inertial measurement units (IMUs) in smartphones can be used to detect camera motion during exposure, in order to improve the quality of images blurred by long hand-held exposures. Based on the captured camera motion, blur can be removed when an appropriate deblurring filter is used. However, two research issues have not been addressed: (a) The calibration of the alignment parameters for the IMU. When inappropriate alignment parameters are used, the camera motion is not captured accurately and deblurring effectiveness is degraded. (b) The selection of an appropriate deblurring filter correlated with the image quality. Without an appropriate deblurring filter, the image quality cannot be optimal. This paper proposes the use of the Taguchi method, a robust and systematic approach for designing reliable and high-precision devices, to perform the alignment parameter calibration for the IMU and the filter selection. The Taguchi method conducts a small number of systematic experiments based on orthogonal arrays. It studies the impact of the alignment parameters and the deblurring filter in order to achieve effective deblurring. Several widely adopted image quality metrics are used to evaluate the deblurred images generated by the proposed Taguchi method. Experimental results show that the quality of deblurred images achieved by the proposed Taguchi method is better than that obtained by deblurring methods that do not involve alignment parameter calibration and filter selection. Also, much less computational effort is required by the Taguchi method compared with commonly used optimization methods for determining the alignment parameters and the deblurring filter.
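
    To make the "small number of systematic experiments based on orthogonal arrays" concrete, the sketch below uses the standard L4 orthogonal array for three two-level factors. It checks the balance property that defines orthogonality and compares the run count with a full factorial design. The paper's actual factors, levels, and array size are not given in the abstract, so this is only an illustration of the general technique.

```python
from collections import Counter
from itertools import combinations, product

# Standard L4 orthogonal array: 4 runs, 3 factors, 2 levels each.
L4 = [
    (1, 1, 1),
    (1, 2, 2),
    (2, 1, 2),
    (2, 2, 1),
]

def is_orthogonal(array, levels=(1, 2)):
    """Return True if every pair of columns contains each combination
    of levels the same number of times."""
    n_cols = len(array[0])
    expected = len(array) // (len(levels) ** 2)
    for i, j in combinations(range(n_cols), 2):
        counts = Counter((row[i], row[j]) for row in array)
        if any(counts[pair] != expected for pair in product(levels, repeat=2)):
            return False
    return True

# Full factorial design for the same three two-level factors.
full_factorial = list(product((1, 2), repeat=3))

print("L4 is orthogonal:", is_orthogonal(L4))
print("runs needed:", len(L4), "vs full factorial:", len(full_factorial))
```

    The saving grows quickly with the number of factors, which is consistent with the abstract's claim of reduced computational effort.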

    Deep learning-based diagnostic system for malignant liver detection

    Cancer is the second most common cause of death in humans, and liver cancer is the fifth most common cause of mortality. Preventing deadly diseases requires timely, independent, accurate, and robust detection by a computer-aided diagnostic (CAD) system. Executing such an intelligent CAD system requires some preliminary steps, including preprocessing, attribute analysis, and identification. In recent studies, conventional techniques have been used to develop computer-aided diagnosis algorithms. However, such traditional methods can severely affect the structural properties of processed images, and their performance is inconsistent because the shape and size of the region of interest vary. Moreover, the unavailability of sufficient datasets makes the performance of the proposed methods doubtful for commercial use. To address these limitations, I propose novel methodologies in this dissertation. First, I modified a generative adversarial network to perform deblurring and contrast adjustment on computed tomography (CT) scans. Second, I designed a deep neural network with a novel loss function for fully automatic, precise segmentation of liver and lesions from CT scans. Third, I developed a multi-modal deep neural network to integrate pathological data with imaging data to perform computer-aided diagnosis for malignant liver detection. The dissertation starts with background information that discusses the proposed study objectives and the workflow. Afterward, Chapter 2 reviews a general schematic for developing a computer-aided algorithm, including image acquisition techniques, preprocessing steps, feature extraction approaches, and machine learning-based prediction methods. The first study, proposed in Chapter 3, discusses blurred images and their possible effects on classification. A novel multi-scale GAN with residual image learning is proposed to deblur images. The second method, in Chapter 4, addresses the issue of low-contrast CT scan images. A multi-level GAN is utilized to produce enhanced images with well-contrasted regions, and the enhanced images improve cancer diagnosis performance. Chapter 5 proposes a deep neural network for the segmentation of liver and lesions from abdominal CT scan images. A modified U-Net with a novel loss function can precisely segment minute lesions. Chapter 6 introduces a multi-modal approach for diagnosing liver cancer variants, in which the pathological data are integrated with CT scan images. In summary, this dissertation presents novel algorithms for preprocessing and disease detection. Furthermore, the comparative analysis validates the effectiveness of the proposed methods in computer-aided diagnosis.

    Sparse Gradient Optimization and its Applications in Image Processing

    Millions of digital images are captured by imaging devices on a daily basis. The way imaging devices operate follows an integral process from which the information of the original scene needs to be estimated. The estimation is done by inverting the integral process of the imaging device with the use of optimization techniques. This linear inverse problem, the inversion of the integral acquisition process, is at the heart of several image processing applications such as denoising, deblurring, inpainting, and super-resolution. We describe in detail the use of linear inverse problems in these applications. We review and compare several state-of-the-art optimization algorithms that invert this integral process. Linear inverse problems are usually very difficult to solve. Therefore, additional prior assumptions need to be introduced to successfully estimate the output signal. Several priors have been suggested in the research literature, with the Total Variation (TV) being one of the most prominent. In this thesis, we review another prior, the l0 pseudo-norm over the gradient domain. This prior allows full control over how many non-zero gradients are retained to approximate prominent structures of the image. We show the superiority of the l0 gradient prior over the TV prior in recovering genuinely piece-wise constant signals. The l0 gradient prior has been shown to produce state-of-the-art results in edge-preserving image smoothing. Moreover, this general prior can be applied to several other applications, such as edge extraction, clip-art JPEG artifact removal, non-photorealistic image rendering, detail magnification, and tone mapping. We review and evaluate several state-of-the-art algorithms that solve the optimization problem based on the l0 gradient prior. Subsequently, we apply the l0 gradient prior to two applications where we show superior results compared to the current state-of-the-art. The first application is single-image reflection removal. Existing solutions to this problem have shown limited success because of the highly ill-posed nature of the problem. We show that the standard l0 gradient prior with a modified data-fidelity term based on the Laplacian operator is able to sufficiently remove unwanted reflections from images in many realistic scenarios. We conduct extensive experiments and show that our method outperforms the state-of-the-art. In the second application, haze removal from visible-NIR image pairs, we propose a novel optimization framework in which the prior term penalizes the number of non-zero gradients of the difference between the output and the NIR image. Due to the longer wavelengths of NIR, an image taken in the NIR spectrum suffers significantly less from haze artifacts. Using this prior term, we are able to transfer details from the haze-free NIR image to the final result. We show that our formulation provides state-of-the-art results compared to haze removal methods that use a single image and also to those that are based on visible-NIR image pairs.
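
    The difference between the TV prior and the l0 gradient prior mentioned in this abstract can be seen on a tiny 1-D example. The sketch below only evaluates the two penalty terms on hand-picked signals; it does not implement any of the optimization algorithms reviewed in the thesis, and the signals are illustrative values rather than data from it.

```python
def gradients(signal):
    """Forward differences of a 1-D signal."""
    return [b - a for a, b in zip(signal, signal[1:])]

def total_variation(signal):
    """TV penalty: sum of absolute gradient magnitudes."""
    return sum(abs(g) for g in gradients(signal))

def l0_gradient(signal, eps=1e-9):
    """l0 gradient penalty: number of non-zero gradients."""
    return sum(1 for g in gradients(signal) if abs(g) > eps)

# Two signals that rise from 0 to 5: one sharp edge vs. a smooth ramp.
step = [0, 0, 0, 5, 5, 5]
ramp = [0, 1, 2, 3, 4, 5]

print("TV:", total_variation(step), total_variation(ramp))
print("l0:", l0_gradient(step), l0_gradient(ramp))
```

    TV assigns both signals the same cost, because it only measures the total rise. The l0 penalty counts one non-zero gradient for the step and five for the ramp, so it favours the piece-wise constant signal.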

    Computational Imaging and Artificial Intelligence: The Next Revolution of Mobile Vision

    Signal capture stands at the forefront of perceiving and understanding the environment, and thus imaging plays a pivotal role in mobile vision. Recent explosive progress in Artificial Intelligence (AI) has shown great potential to develop advanced mobile platforms with new imaging devices. Traditional imaging systems based on the "capturing images first and processing afterwards" mechanism cannot meet this unprecedented demand. In contrast, Computational Imaging (CI) systems are designed to capture high-dimensional data in an encoded manner to provide more information for mobile vision systems. Thanks to AI, CI can now be used in real systems by integrating deep learning algorithms into the mobile vision platform to achieve a closed loop of intelligent acquisition, processing and decision making, thus leading to the next revolution of mobile vision. Starting from the history of mobile vision using digital cameras, this work first introduces the advances of CI in diverse applications and then conducts a comprehensive review of current research topics combining CI and AI. Motivated by the fact that most existing studies only loosely connect CI and AI (usually using AI to improve the performance of CI, with only limited works connecting them deeply), we propose a framework to deeply integrate CI and AI, using the example of self-driving vehicles with high-speed communication, edge computing and traffic planning. Finally, we look ahead to the future of CI plus AI by investigating new materials, brain science and new computing techniques to shed light on new directions for mobile vision systems.

    Text Similarity Between Concepts Extracted from Source Code and Documentation

    Context: Constant evolution in software systems often results in their documentation losing sync with the content of the source code. The traceability research field has often helped in the past, with the aim of recovering links between code and documentation when the two fall out of sync. Objective: The aim of this paper is to compare the concepts contained within the source code of a system with those extracted from its documentation, in order to detect how similar these two sets are. If vastly different, the difference between the two sets might indicate a considerable ageing of the documentation, and a need to update it. Methods: In this paper we reduce the source code of each of 50 software systems to a set of key terms containing the concepts of that system. At the same time, we reduce the documentation of each system to another set of key terms. We then use four different approaches for set comparison to detect how similar the sets are. Results: Using the well-known Jaccard index as the benchmark for the comparisons, we have discovered that the cosine distance has excellent comparative power, depending on the pre-training of the machine learning model. In particular, the SpaCy and the FastText embeddings offer up to 80% and 90% similarity scores. Conclusion: For most of the sampled systems, the source code and the documentation tend to contain very similar concepts. Given the accuracy of one pre-trained model (e.g., FastText), it also becomes evident that a few systems show a measurable drift between the concepts contained in the documentation and in the source code.
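
    The two comparison measures named in this abstract can be sketched on toy term sets. The abstract reports cosine scores computed over SpaCy and FastText embeddings; the sketch below substitutes a plain bag-of-terms cosine so that it stays self-contained, which is an assumption and not the paper's setup. The term lists are made up for illustration.

```python
import math
from collections import Counter

def jaccard_index(a, b):
    """Size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty sets are identical
    return len(a & b) / len(a | b)

def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical key terms extracted from code and from documentation.
code_terms = ["parser", "token", "cache", "socket"]
doc_terms = ["parser", "token", "tutorial", "install"]

print("Jaccard:", jaccard_index(code_terms, doc_terms))
print("Cosine: ", cosine_similarity(code_terms, doc_terms))
```

    Both measures here only reward exact term matches. Embedding-based cosine similarity, as used in the paper, can also credit related terms that are spelled differently, which may explain why it compares favourably with the Jaccard benchmark.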

    Computer Vision for Multimedia Geolocation in Human Trafficking Investigation: A Systematic Literature Review

    The task of multimedia geolocation is becoming an increasingly essential component of the digital forensics toolkit to effectively combat human trafficking, child sexual exploitation, and other illegal acts. Typically, metadata-based geolocation information is stripped when multimedia content is shared via instant messaging and social media. The intricacy of geolocating, geotagging, or finding geographical clues in this content is often overly burdensome for investigators. Recent research has shown that contemporary advancements in artificial intelligence, specifically computer vision and deep learning, show significant promise towards expediting the multimedia geolocation task. This systematic literature review thoroughly examines state-of-the-art computer vision techniques for multimedia geolocation and assesses their potential to expedite human trafficking investigation. It provides a comprehensive overview of the application of computer vision-based approaches to multimedia geolocation, identifies their applicability in combating human trafficking, and highlights the potential implications of enhanced multimedia geolocation for prosecuting human trafficking. 123 articles inform this systematic literature review. The findings suggest numerous potential paths for future impactful research on the subject.

    Neural Radiance Fields: Past, Present, and Future

    Aspects such as modeling and interpreting 3D environments and surroundings have drawn researchers toward 3D Computer Vision, Computer Graphics, and Machine Learning. The paper by Mildenhall et al. on NeRFs (Neural Radiance Fields) led to a boom in Computer Graphics, Robotics, and Computer Vision, and the prospect of high-resolution, low-storage Augmented Reality and Virtual Reality-based 3D models has gained traction from researchers, with more than 1000 preprints related to NeRFs published. This paper serves as a bridge for people starting to study these fields by building from the basics of Mathematics, Geometry, Computer Vision, and Computer Graphics to the difficulties encountered in Implicit Representations at the intersection of all these disciplines. This survey provides the history of rendering, Implicit Learning, and NeRFs, the progression of research on NeRFs, and the potential applications and implications of NeRFs in today's world. In doing so, this survey categorizes all the NeRF-related research in terms of the datasets used, objective functions, applications solved, and evaluation criteria for these applications. Comment: 413 pages, 9 figures, 277 citations

    Personality Identification from Social Media Using Deep Learning: A Review

    Social media helps people scattered around the world share ideas and information, and thus helps create communities, groups, and virtual networks. Identification of personality is significant in many types of applications, such as detecting the mental state or character of a person, predicting job satisfaction and professional and personal relationship success, and recommendation systems. Personality is also an important factor in determining individual variation in thoughts, feelings, and conduct. According to a 2018 global social media research survey, there are approximately 3.196 billion social media users worldwide. The numbers are estimated to grow rapidly with the use of mobile smart devices and advancement in technology. Support vector machine (SVM), Naive Bayes (NB), multilayer perceptron neural network, and convolutional neural network (CNN) are some of the machine learning techniques used for personality identification in the literature. This paper presents various studies that identify the personality of social media users with the help of machine learning approaches, and reviews recent studies that aim to predict the personality of online social media (OSM) users.