
    Portable Camera Based Assistive Pattern Recognition for Visually Challenged Persons

    Choosing clothes, recognizing food, and analyzing traffic signals are major challenges for visually impaired persons. Automatic clothing pattern recognition is also a challenging research problem due to rotation, scaling, illumination changes, and especially large intra-class pattern variations. In this project, a camera-based assistive framework is proposed to help blind persons identify food patterns, clothing patterns, and colors in their daily lives. Existing sensor-based traffic signal methods are difficult to deploy and require many components; a camera-based traffic signal analysis method is easier to handle, provides clearer analysis, and reduces time delay. The system contains the following major components: 1) a camera for capturing clothing, food, and traffic signal images, and a microphone for speech command input; 2) a wearable computer that performs data capture and analysis for command control, clothing pattern recognition, food pattern recognition, and traffic signal identification; and 3) a speaker that provides audio output of clothing patterns and colors, food patterns, traffic signal states, and system status. To handle the large intra-class variations, a novel descriptor, the Radon Signature, is proposed to capture the global directionality of clothing patterns and is also applied to food pattern and traffic signal analysis. To evaluate the effectiveness of the proposed approach, the CCNY Clothing Pattern dataset is used. Our approach achieves 92.55% recognition accuracy, improving quality of life and reducing dependence on others. DOI: 10.17762/ijritcc2321-8169.15032
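The Radon Signature is defined on the Radon transform of the pattern image. As a rough illustration of the idea (not the paper's implementation), the sketch below projects a set of edge points along a sweep of angles and records the variance of each projection profile: strongly directional patterns such as stripes concentrate their projections into few bins at one angle, producing a peaked signature. All function and parameter names are illustrative.

```python
import math

def radon_signature(edge_points, n_angles=36, n_bins=32):
    """Approximate Radon Signature: for each projection angle, accumulate
    edge points into bins along the projection axis and record the variance
    of the resulting profile. Dominant pattern directions yield
    high-variance projections, capturing global directionality."""
    sig = []
    for k in range(n_angles):
        theta = math.pi * k / n_angles
        c, s = math.cos(theta), math.sin(theta)
        # Project every edge point onto the axis at angle theta.
        rhos = [x * c + y * s for x, y in edge_points]
        lo, hi = min(rhos), max(rhos)
        span = (hi - lo) or 1.0
        bins = [0] * n_bins
        for r in rhos:
            idx = min(int((r - lo) / span * n_bins), n_bins - 1)
            bins[idx] += 1
        mean = sum(bins) / n_bins
        sig.append(sum((b - mean) ** 2 for b in bins) / n_bins)
    # Normalize so the signature does not depend on the number of edge points.
    total = sum(sig) or 1.0
    return [v / total for v in sig]

# A horizontal stripe pattern: edge points along the lines y = 0 and y = 5.
pts = [(x, y) for x in range(50) for y in (0, 5)]
sig = radon_signature(pts)
# The projection perpendicular to the stripes (index 18, i.e. 90 degrees)
# collapses all points into two bins and dominates the signature.
print(sig.index(max(sig)))
```

A stripe pattern rotated by some angle would shift the peak to the corresponding index, which is why the descriptor captures directionality rather than absolute position.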

    Vision Based Assistive System for Label Detection with Voice Output

    Abstract--A camera-based assistive text reading framework is proposed to help blind persons read text labels and product packaging on hand-held objects in their daily lives. To isolate the object from cluttered backgrounds or other surrounding objects in the camera view, we propose an efficient and effective motion-based method to define a region of interest (ROI) in the video by asking the user to shake the object. In the extracted ROI, text localization and recognition are conducted to acquire text information. To automatically localize the text regions within the object ROI, we propose a novel text localization algorithm that learns gradient features of stroke orientations and distributions of edge pixels in an AdaBoost model. Text characters in the localized text regions are then binarized and recognized by off-the-shelf optical character recognition software. The recognized text is output to blind users as speech.
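The shake-to-select step can be illustrated with simple frame differencing: pixels belonging to the shaken object change strongly between frames, so accumulating absolute differences and taking the bounding box of the changed pixels yields the ROI. This is a hedged approximation (the paper's motion analysis is more sophisticated), and all names are hypothetical.

```python
def motion_roi(frames, thresh=30):
    """Accumulate absolute frame differences over a short clip, threshold
    the result, and return the bounding box of moving pixels as
    (top, left, bottom, right). `frames` is a list of equally sized 2-D
    grayscale images (lists of lists of ints)."""
    h, w = len(frames[0]), len(frames[0][0])
    motion = [[0] * w for _ in range(h)]
    for prev, cur in zip(frames, frames[1:]):
        for y in range(h):
            for x in range(w):
                motion[y][x] += abs(cur[y][x] - prev[y][x])
    moving = [(y, x) for y in range(h) for x in range(w)
              if motion[y][x] > thresh]
    if not moving:
        return None  # no motion detected
    ys = [y for y, _ in moving]
    xs = [x for _, x in moving]
    return (min(ys), min(xs), max(ys) + 1, max(xs) + 1)

# Toy example: a 6x6 scene where only a 2x2 block (the shaken object) moves.
f0 = [[0] * 6 for _ in range(6)]
f1 = [row[:] for row in f0]
for y in (2, 3):
    for x in (2, 3):
        f1[y][x] = 100   # object bright in the middle frame
f2 = [row[:] for row in f0]  # object back to its original appearance
print(motion_roi([f0, f1, f2]))  # → (2, 2, 4, 4)
```

Text localization and OCR would then run only inside this box, which is what lets the system ignore clutter elsewhere in the view.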

    Deep Learning on Smart Meter Data: Non-Intrusive Load Monitoring and Stealthy Black-Box Attacks

    Climate change and environmental concerns are instigating widespread changes in modern electricity sectors due to energy policy initiatives and advances in sustainable technologies. To raise awareness of sustainable energy usage and capitalize on advanced metering infrastructure (AMI), a novel deep learning non-intrusive load monitoring (NILM) model is proposed to disaggregate smart meter readings and identify the operation of individual appliances. This model can be used by electric power utility (EPU) companies and third-party entities to perform active or passive consumer power demand management. Although machine learning (ML) algorithms are powerful, they remain vulnerable to adversarial attacks. In this thesis, a novel stealthy black-box attack targeting NILM models is proposed. This work sheds light on both the effectiveness and the vulnerabilities of ML models in the smart grid context and provides valuable insights for maintaining security, especially with the increasing proliferation of artificial intelligence in the power system.
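The thesis's NILM model is a deep network; to make the disaggregation task itself concrete, here is a deliberately simplified classic event-based sketch that matches step changes in the aggregate meter reading against assumed appliance power signatures. All appliance names and wattages are illustrative, not from the thesis.

```python
def disaggregate(aggregate, signatures, tol=10):
    """Event-based NILM illustration: `aggregate` is a list of total-power
    readings (watts per time step); `signatures` maps appliance name ->
    steady-state power draw. Each step change in the aggregate is matched
    to the appliance with the closest signature, within `tol` watts."""
    events = []
    for t in range(1, len(aggregate)):
        delta = aggregate[t] - aggregate[t - 1]
        if abs(delta) < tol:
            continue  # small fluctuation, not a switching event
        name = min(signatures, key=lambda n: abs(abs(delta) - signatures[n]))
        if abs(abs(delta) - signatures[name]) <= tol:
            events.append((t, name, "on" if delta > 0 else "off"))
    return events

# Hypothetical appliance signatures (watts) and a synthetic meter trace.
sigs = {"kettle": 2000, "fridge": 150, "tv": 80}
trace = [100, 2100, 2100, 2250, 250, 250, 100]
print(disaggregate(trace, sigs))
# → [(1, 'kettle', 'on'), (3, 'fridge', 'on'),
#    (4, 'kettle', 'off'), (6, 'fridge', 'off')]
```

A learned model replaces the hand-made signature table with representations inferred from data, which is also what opens the door to the adversarial perturbations studied in the thesis.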

    Soft Biometric Analysis: Multi-Person and Real-Time Pedestrian Attribute Recognition in Crowded Urban Environments

    Traditionally, recognition systems were based only on hard human biometrics. However, ubiquitous CCTV cameras have raised the desire to analyze human biometrics from far distances, without subjects' cooperation in the acquisition process. High-resolution face close-shots are rarely available at far distances, so face-based systems cannot provide reliable results in surveillance applications. Human soft biometrics, such as body and clothing attributes, are believed to be more effective for analyzing human data collected by security cameras. This thesis contributes to human soft biometric analysis in uncontrolled environments and focuses mainly on two tasks: Pedestrian Attribute Recognition (PAR) and person re-identification (re-id). We first review the literature of both tasks and highlight the history of advancements, recent developments, and the existing benchmarks. PAR and person re-id difficulties stem from significant distances between intra-class samples, which originate from variations in several factors such as body pose, illumination, background, occlusion, and data resolution. Recent state-of-the-art approaches present end-to-end models that can extract discriminative and comprehensive feature representations from people. The correlation between different regions of the body and dealing with limited learning data are also objectives of many recent works. Moreover, class imbalance and correlation between human attributes are specific challenges associated with the PAR problem. We collect a large surveillance dataset to train a novel gender recognition model suitable for uncontrolled environments. We propose a deep residual network that extracts several pose-wise patches from samples and obtains a comprehensive feature representation. In the next step, we develop a model for recognizing multiple attributes at once.
Considering the correlation between human semantic attributes and class imbalance, we use a multi-task model and a weighted loss function, respectively. We also propose a multiplication layer on top of the backbone feature-extraction layers to exclude background features from the final representation of samples and draw the model's attention to the foreground area. We address the person re-id problem by implicitly shaping the receptive fields of deep learning classification frameworks. The receptive fields of deep learning models determine the most significant regions of the input data for making correct decisions. Therefore, we synthesize a set of learning data in which the destructive regions (e.g., background) in each pair of instances are interchanged. A segmentation module determines destructive and useful regions in each sample, and the label of each synthesized instance is inherited from the sample that contributed the useful regions to the synthesized image. The synthesized learning data are then used in the learning phase and help the model rapidly learn that identity and background regions are not correlated. Meanwhile, the proposed solution can be seen as a data augmentation approach that fully preserves the label information and is compatible with other data augmentation techniques. When re-id methods are learned in scenarios where the target person appears with identical garments in the gallery, the visual appearance of clothes is given the most importance in the final feature representation. Cloth-based representations are not reliable in long-term re-id settings, as people may change their clothes. Therefore, solutions that ignore clothing cues and focus on identity-relevant features are in demand. We transform the original data such that the identity-relevant information of people (e.g., face and body shape) is removed, while the identity-unrelated cues (i.e., color and texture of clothes) remain unchanged.
A model learned on the synthesized dataset predicts the identity-unrelated (short-term) cues. We then train a second model, coupled with the first, that learns embeddings of the original data such that the similarity between the embeddings of the original and synthesized data is minimized. This way, the second model predicts based on the identity-related (long-term) representation of people. To evaluate the performance of the proposed models, we use PAR and person re-id datasets, namely BIODI, PETA, RAP, Market-1501, MSMTV2, PRCC, LTCC, and MIT, and compare our experimental results with state-of-the-art methods in the field. In conclusion, the data collected from surveillance cameras have low resolution, such that extracting hard biometric features is not possible and face-based approaches produce poor results. In contrast, soft biometrics are robust to variations in data quality. We therefore propose approaches for both PAR and person re-id that learn discriminative features from each instance, and we evaluate our proposed solutions on several publicly available benchmarks. This thesis was prepared at the University of Beira Interior, IT Instituto de Telecomunicações, Soft Computing and Image Analysis Laboratory (SOCIA Lab), Covilhã Delegation, and was submitted to the University of Beira Interior for defense in a public examination session.
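The background-interchange synthesis described in this abstract can be sketched as follows. Images here are toy 2-D arrays and the foreground masks are given by hand, whereas the thesis obtains them from a segmentation module; all names are illustrative.

```python
def swap_backgrounds(img_a, mask_a, label_a, img_b, mask_b, label_b):
    """Synthesize two new training samples: each keeps one sample's
    foreground (the person, where mask == 1) over the other sample's
    background, and inherits the label of the sample that contributed
    the foreground. Images and masks are equally sized 2-D lists; a real
    implementation would operate on RGB tensors."""
    h, w = len(img_a), len(img_a[0])
    syn_a = [[img_a[y][x] if mask_a[y][x] else img_b[y][x]
              for x in range(w)] for y in range(h)]
    syn_b = [[img_b[y][x] if mask_b[y][x] else img_a[y][x]
              for x in range(w)] for y in range(h)]
    return (syn_a, label_a), (syn_b, label_b)

# Toy 2x2 samples: the top-left pixel is the "person" in both images.
img_a, mask_a = [[9, 1], [1, 1]], [[1, 0], [0, 0]]   # identity "A"
img_b, mask_b = [[7, 2], [2, 2]], [[1, 0], [0, 0]]   # identity "B"
(sa, la), (sb, lb) = swap_backgrounds(img_a, mask_a, "A",
                                      img_b, mask_b, "B")
print(sa, la)  # → [[9, 2], [2, 2]] A : A's person over B's background
print(sb, lb)  # → [[7, 1], [1, 1]] B : B's person over A's background
```

Because the label always follows the foreground, a classifier trained on such pairs is penalized whenever it relies on background cues, which is exactly the decorrelation the thesis aims for.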

    The Science of Disguise

    Technological advances have made digital cameras ubiquitous, to the point where it is difficult to purchase even a mobile phone without one. Coupled with similar advances in face recognition technology, we are seeing a marked increase in the use of biometrics, such as face recognition, to identify individuals. However, remaining unrecognized in an era of ubiquitous camera surveillance remains desirable to some citizens, notably those concerned with privacy. Since biometrics are an intrinsic part of a person's identity, it may be that the only means of evading detection is through disguise. We have created a comprehensive database of high-quality imagery that will allow us to explore the effectiveness of disguise as an approach to avoiding unwanted recognition. Using this database, we have evaluated the performance of a variety of automated machine-based face recognition algorithms on disguised faces. Our data-driven analysis finds that for the sample population contained in our database: (1) disguise is effective; (2) there are significant performance differences between individuals and demographic groups; and (3) elements including coverage, contrast, and disguise combination are determinative factors in the success or failure of face recognition algorithms on an image. In this dissertation, we examine the present-day uses of face recognition and their interplay with privacy concerns. We sketch the capabilities of a new database of facial imagery, unique both in the diversity of the imaged population, and in the diversity and consistency of disguises applied to each subject. We provide an analysis of disguise performance based on both a highly-rated commercial face recognition system and an open-source algorithm available to the FR community. Finally, we put forth hypothetical models for these results, and provide insights into the types of disguises that are the most effective at defeating facial recognition for various demographic populations.
As cameras become more sophisticated and algorithms become more advanced, disguise may become less effective. For security professionals, this is a laudable outcome; privacy advocates will certainly feel differently.

    Object Detection and Recognition for Visually Impaired People

    Object detection plays a very important role in many applications such as image retrieval, surveillance, robot navigation, and wayfinding. In this thesis, we propose different approaches to detect indoor signage, stairs, and pedestrians. In the first chapter we review related work in this field. In the second chapter, we introduce a new method to detect indoor signage and help blind people find their destinations in unfamiliar environments. Our method first extracts attended areas by using a saliency map. Then the signage is detected in the attended areas by using bipartite graph matching. The proposed method can handle detection of multiple signs, and the saliency maps eliminate interference and improve the accuracy of the detection results. Experimental results on our collected indoor signage dataset demonstrate the effectiveness and efficiency of our proposed method. In the third chapter, we present a novel camera-based approach to automatically detect and recognize restroom signage in surrounding environments. Our method first extracts attended areas which may contain signage based on shape detection. Then, the Scale-Invariant Feature Transform (SIFT) is applied to extract local features in the detected attended areas. Finally, signage is detected and recognized as the regions whose SIFT matching scores exceed a threshold. Experimental results on our collected restroom signage dataset demonstrate the effectiveness and efficiency of our proposed method. In the fourth chapter, we develop a new framework to detect and recognize stairs and pedestrian crosswalks using an RGB-D camera. Since both stairs and pedestrian crosswalks are characterized by a group of parallel lines, we first apply the Hough transform to extract concurrent parallel lines from the RGB channels.
Then, the depth channel is employed to further classify pedestrian crosswalks, upstairs, and downstairs using support vector machine (SVM) classifiers. Furthermore, we estimate the distance between the camera and the stairs for blind users. The detection and recognition results on our collected dataset demonstrate the effectiveness and efficiency of our proposed framework. Keywords: blind people, navigation and wayfinding, camera, signage detection and recognition, independent travel
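The parallel-line cue behind the stairs/crosswalk detector can be illustrated with a minimal Hough transform: every edge point votes for the (theta, rho) parameterizations of lines through it, and a group of parallel lines (stair edges, crosswalk stripes) produces several strong accumulator bins sharing one angle. Parameters and thresholds below are illustrative, not those of the thesis's framework.

```python
import math
from collections import Counter

def hough_dominant_angle(edge_points, n_angles=18, min_votes=8):
    """Coarse Hough transform over (angle, rho) bins. Returns the angle
    (degrees) whose bins collect the most strong peaks; parallel lines
    contribute several strong bins at the same angle."""
    votes = Counter()
    for x, y in edge_points:
        for k in range(n_angles):
            theta = math.pi * k / n_angles
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(k, round(rho))] += 1
    # Keep confident bins only, then count strong bins per angle.
    strong = [k for (k, _), v in votes.items() if v >= min_votes]
    best_k = Counter(strong).most_common(1)[0][0]
    return round(180 * best_k / n_angles)

# Three horizontal "step edges": points along the lines y = 0, 4, 8.
pts = [(x, y) for x in range(10) for y in (0, 4, 8)]
print(hough_dominant_angle(pts))  # → 90 (lines of constant y)
```

In the full framework, detecting such a dominant angle with multiple peaks triggers the depth-based SVM stage that decides between crosswalk, upstairs, and downstairs.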

    A Methodology for Extracting Human Bodies from Still Images

    Monitoring and surveillance of humans is one of the most prominent applications today, and it is expected to be part of many aspects of our future lives, for safety reasons, assisted living, and many others. Many efforts have been made towards automatic and robust solutions, but the general problem is very challenging and remains open. In this PhD dissertation we examine the problem from many perspectives. First, we study the performance of a hardware architecture designed for large-scale surveillance systems. Then, we focus on the general problem of human activity recognition, present an extensive survey of methodologies that deal with this subject, and propose a maturity metric to evaluate them. Image segmentation is one of the most popular image-processing algorithms in the field, and we propose a blind metric to evaluate segmentation results with respect to the activity in local regions. Finally, we propose a fully automatic system for segmenting and extracting human bodies from challenging single images, which is the main contribution of the dissertation. Our methodology is a novel bottom-up approach relying mostly on anthropometric constraints and is facilitated by our research in the fields of face, skin, and hand detection. Experimental results and comparison with state-of-the-art methodologies demonstrate the success of our approach.
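As a small, self-contained example of the skin-detection building block mentioned above (the dissertation's actual detectors are more elaborate), a classic rule-based RGB skin test can be written as:

```python
def is_skin(r, g, b):
    """Classic rule-based skin-color test for RGB values in 0-255 under
    roughly uniform daylight. It is a rough heuristic often used as a
    first filtering step, not a robust skin model."""
    return (r > 95 and g > 40 and b > 20 and
            max(r, g, b) - min(r, g, b) > 15 and
            abs(r - g) > 15 and r > g and r > b)

def skin_mask(image):
    """Apply the per-pixel test; `image` is a 2-D list of (r, g, b)."""
    return [[int(is_skin(*px)) for px in row] for row in image]

img = [[(220, 170, 140), (30, 80, 200)],   # skin-toned, blue
       [(90, 60, 40), (240, 180, 150)]]    # too dark, skin-toned
print(skin_mask(img))  # → [[1, 0], [0, 1]]
```

Connected regions of the resulting mask, constrained by anthropometric priors on where faces and hands can plausibly sit relative to each other, are the kind of evidence a bottom-up body-extraction pipeline can build on.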