
    Towards an efficient, unsupervised and automatic face detection system for unconstrained environments

    Nowadays, there is growing interest in face detection applications for unconstrained environments. The increasing need for public and national security motivated our research on automatic face detection systems. For public security surveillance applications, a face detection system must cope with unconstrained environments, which include cluttered backgrounds and complicated illumination. Supervised approaches give very good results in constrained environments, but in unconstrained environments, even obtaining all the training samples needed is sometimes impractical. This limitation of supervised approaches impels us to turn to unsupervised ones. In this thesis, we present an efficient and unsupervised face detection system that is feature- and configuration-based. It combines geometric feature detection and local appearance feature extraction to increase the stability and performance of the detection process. It also contains a novel adaptive lighting compensation approach to normalize the complicated illumination of real-life environments. We aim to develop a system that makes as few assumptions as possible from the very beginning, is robust, and exploits accuracy/complexity trade-offs as much as possible. Although the attempt is ambitious for such an ill-posed problem, we manage to tackle it in the end with very few assumptions.
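
    The adaptive lighting compensation approach is the thesis's own contribution; as a hedged illustration of the kind of preprocessing involved, the sketch below normalizes luminance with CLAHE (contrast-limited adaptive histogram equalization) in OpenCV. It is a common stand-in for illumination normalization, not the method from the thesis.

        import cv2
        import numpy as np

        def normalize_illumination(bgr: np.ndarray) -> np.ndarray:
            """Equalize luminance locally while leaving chrominance untouched."""
            lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
            l, a, b = cv2.split(lab)
            clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
            return cv2.cvtColor(cv2.merge((clahe.apply(l), a, b)), cv2.COLOR_LAB2BGR)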

    Face recognition in the wild.

    Research in face recognition deals with problems related to Age, Pose, Illumination and Expression (A-PIE), and seeks approaches that are invariant to these factors. Video images add a temporal aspect to the image acquisition process. Another degree of complexity, above and beyond A-PIE recognition, occurs when multiple pieces of information are known about people, which may be distorted, partially occluded, or disguised, and when the imaging conditions are totally unorthodox. A-PIE recognition in these circumstances becomes really “wild”, and therefore Face Recognition in the Wild has emerged as a field of research in the past few years. Its main purpose is to challenge constrained approaches to automatic face recognition by emulating some of the virtues of the Human Visual System (HVS), which is very tolerant to age, occlusion and distortions in the imaging process. The HVS also integrates information about individuals and adds context to recognize people within an activity or behavior. Machine vision has a very long road ahead before it can emulate the HVS, and face recognition in the wild is one step along that road. In this thesis, Face Recognition in the Wild is defined as unconstrained face recognition under A-PIE+; the (+) connotes any alterations to the design scenario of the face recognition system. This thesis evaluates the Biometric Optical Surveillance System (BOSS) developed at the CVIP Lab, using low-resolution imaging sensors. Specifically, the thesis tests BOSS using cell phone cameras and examines the potential of facial biometrics on smart portable devices such as iPhones, iPads, and tablets. For quantitative evaluation, the thesis focused on a specific testing scenario of the BOSS software using iPhone 4 cell phones and a laptop. Testing was carried out indoors, at the CVIP Lab, using 21 subjects at distances of 5, 10 and 15 feet, with three poses, two expressions and two illumination levels. The three steps (detection, representation and matching) of the BOSS system were tested in this imaging scenario. False positives in face detection increased with distance and with pose angles above ±15°. The overall identification rate (face detection at confidence levels above 80%) also degraded with distance, pose, and expression. The indoor lighting added further challenges by inducing shadows that affected the image quality and the overall performance of the system. While this limited number of subjects and somewhat constrained imaging environment do not fully support a “wild” imaging scenario, they provided deep insight into the issues with automatic face recognition. The recognition rate curves demonstrate the limits of low-resolution cameras for face recognition at a distance (FRAD), yet they also make a plausible case for A-PIE face recognition on portable devices.
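
    As a hedged illustration of the evaluation rule described above (a face counts toward the identification rate only when detection confidence exceeds 80% and the match is correct), a minimal sketch follows; the data layout and names are illustrative, not taken from the BOSS codebase.

        from dataclasses import dataclass

        @dataclass
        class Detection:
            subject_id: str      # ground-truth identity
            predicted_id: str    # identity returned by the matcher
            confidence: float    # detection confidence in [0, 1]

        def identification_rate(dets: list[Detection], threshold: float = 0.80) -> float:
            """Fraction of detections that are confident and correctly matched."""
            if not dets:
                return 0.0
            hits = sum(d.confidence > threshold and d.predicted_id == d.subject_id
                       for d in dets)
            return hits / len(dets)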

    Extracting structured information from 2D images

    Convolutional neural networks can handle an impressive array of supervised learning tasks while relying on a single backbone architecture, suggesting that one solution fits all vision problems. But for many tasks, we can directly exploit the problem structure within neural networks to deliver more accurate predictions. In this thesis, we propose novel deep learning components that exploit the structured output space of an increasingly complex set of problems. We start from Optical Character Recognition (OCR) in natural scenes and leverage the constraints imposed by the spatial outline of letters and the requirements of language. Conventional OCR systems do not work well in natural scenes due to distortions, blur, or letter variability. We introduce a new attention-based model, equipped with extra information about neuron positions to guide its focus across characters sequentially. It beats the previous state of the art by a significant margin. We then turn to dense labeling tasks employing encoder-decoder architectures. We start with an experimental study that documents the drastic impact that decoder design can have on task performance. Rather than optimizing one decoder per task separately, we propose new robust layers for the upsampling of high-dimensional encodings, and show that these better suit the structured per-pixel output across the board on all tasks. Finally, we turn to the problem of urban scene understanding. There is elaborate structure in both the input space (multi-view recordings, aerial and street-view scenes) and the output space (multiple fine-grained attributes for holistic building understanding). We design new models that exploit the relatively simple, cuboid-like geometry of buildings to create a single unified representation from multiple views. To benchmark our model, we build a new large-scale multi-view dataset of building images and fine-grained attributes, and show systematic improvements over a broad range of strong CNN-based baselines.
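
    The robust upsampling layers are the thesis's own design; as a hedged sketch of what a decoder upsampling block for dense, per-pixel prediction typically looks like, the PyTorch module below doubles the spatial resolution bilinearly and then refines the result with a convolution. It illustrates the kind of component discussed, not the layers proposed in the thesis.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class UpsampleBlock(nn.Module):
            """Double the spatial resolution, then refine with a 3x3 convolution."""
            def __init__(self, in_ch: int, out_ch: int):
                super().__init__()
                self.refine = nn.Sequential(
                    nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                    nn.BatchNorm2d(out_ch),
                    nn.ReLU(inplace=True),
                )

            def forward(self, x: torch.Tensor) -> torch.Tensor:
                x = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
                return self.refine(x)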

    Towards an automated photogrammetry-based approach for monitoring and controlling construction site activities

    The construction industry has a poor productivity record, predominantly ascribed to inadequate monitoring of how a project is progressing at any given time. Most available approaches do not offer key stakeholders a shared, real-time understanding of project performance, and as a result fail to identify slippage against the original schedule. This study reports on the development of a novel automated system for monitoring, updating and controlling construction site activities in real time. The proposed system harnesses advances in close-range photogrammetry, BIM and computer vision to deliver an original approach capable of continuously monitoring construction activities, with the progress status determinable at any given time throughout the construction stage.

    The research adopted a sequential mixed-methods strategy, following standard design science processes, in three stages. The first stage involved interviews within a focus group of seven carefully selected construction professionals; their answers were analysed and provided the informed basis for the development of the automated system for detecting and notifying delays in construction projects. The second stage involved a proof of concept in a pilot project case study with nine potential users of the proposed system. Face-to-face interviews were conducted to evaluate and verify the effectiveness of the developed prototype, which was continuously refined and improved according to the users' comments and feedback. Within this stage, a prototype was developed for testing and evaluation by representative construction professionals. A subsequent sub-stage of the system's development tested and validated the final version of the system in the context of a real-life construction project in Dubai, where an online survey was administered to 40 users, a representative sample of potential system users. The third stage addressed the conclusions, limitations and recommendations for further research on the proposed system.

    The findings revealed that once the system is installed and programmed, it requires no expertise or manual intervention: all of its processes are fully automated, and data collection, interpretation, analysis and notification proceed without human involvement. Consequently, human error and subjectivity are eliminated, and the system achieved a significantly high level of accuracy, automation and reliability: 99.97% accuracy for horizontal construction elements and over 99.70% for vertical elements. The findings also highlighted that the developed system is inexpensive and easy to operate, and that its accuracy exceeds that of current systems that seek to automate the monitoring and updating of progress status for construction projects. The distinctive features of the proposed system helped the site team complete the project 61 days ahead of its contractual completion date, a 9% time saving and a 3% cost saving. The proposed system has the potential to identify any deviation from the as-planned construction schedule and to prompt action in response to the automatic notifications, which inform decision-makers via email and SMS.
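
    As a hedged sketch of the notification logic described above (compare photogrammetry-derived as-built progress against the BIM as-planned schedule and alert decision-makers on slippage), the snippet below uses illustrative names and thresholds; the actual system's data model is not specified here.

        from datetime import date

        def check_progress(activity: str, planned_pct: float, measured_pct: float,
                           tolerance: float = 0.02) -> str | None:
            """Return a delay alert if measured progress lags the plan beyond tolerance."""
            slippage = planned_pct - measured_pct
            if slippage > tolerance:
                return (f"{date.today()}: '{activity}' is {slippage:.0%} behind plan "
                        f"(planned {planned_pct:.0%}, measured {measured_pct:.0%})")
            return None

        alert = check_progress("Level 3 slab pour", planned_pct=0.75, measured_pct=0.60)
        if alert:
            print(alert)  # the real system would dispatch this via email and SMS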

    A privacy-aware and secure system for human memory augmentation

    The ubiquity of digital sensors embedded in today's mobile and wearable devices (e.g., smartphones, wearable cameras, wristbands) has made technology more intertwined with our lives. Among many other things, this allows us to seamlessly log our daily experiences in increasing volume and quality, a practice known as “lifelogging”. It produces a great number of pictures and videos that can potentially improve human memory. Consider how a single photograph can bring back distant childhood memories, or how a song can help us reminisce about our last vacation. Such a vision of a “memory augmentation system” offers considerable benefits, but it also raises new security and privacy challenges. Perhaps obviously, a system that captures everywhere we go and everything we say, see, and do greatly increases the danger to our privacy. Any data breach of such a memory repository, whether accidental or malicious, could negatively impact both our professional and private reputation. In addition, the threat of memory manipulation may be the most worrisome aspect of a memory augmentation system: if an attacker is able to remove, add, or change our captured information, the resulting data may implant memories of events that never took place or, in turn, accelerate the loss of other memories. Starting from these key challenges, this thesis investigates how to design secure memory augmentation systems. In the course of this research, we develop tools and prototypes that researchers and system engineers can apply to build pervasive applications that help users capture, and later recall, episodic memories in a secure fashion. We build trusted sensors and protocols to securely capture and store experience data, and software for the secure and privacy-aware exchange of experience data with others. We explore the suitability of various access control models for putting users in control of the plethora of data that the system captures on their behalf. We also explore the possibility of using in situ physical gestures to control different aspects of the capturing and sharing of experience data. Ultimately, this thesis contributes to the design and development of secure systems for memory augmentation.
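
    The thesis builds trusted sensors and protocols for tamper-evident capture; as a hedged illustration of one standard building block, the sketch below chains an HMAC tag over each lifelog record so that removals, additions, edits and reordering become detectable. It is an illustrative stand-in, not the thesis's protocol, and the key handling is deliberately simplified.

        import hashlib
        import hmac
        import json

        def sign_record(key: bytes, prev_tag: bytes, record: dict) -> bytes:
            """Chain each record's tag to its predecessor so reordering is also detected."""
            payload = prev_tag + json.dumps(record, sort_keys=True).encode()
            return hmac.new(key, payload, hashlib.sha256).digest()

        key = b"device-provisioned-secret"  # illustrative; real devices keep this in trusted hardware
        tag = b"\x00" * 32                  # genesis tag for an empty log
        for record in [{"ts": "2016-05-01T09:30:00Z", "type": "photo", "sha256": "ab12..."}]:
            tag = sign_record(key, tag, record)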

    Towards privacy-compliant mobile computing

    Sophisticated mobile computing, sensing and recording devices such as smartphones, smartwatches, and wearable cameras are carried by their users virtually around the clock, blurring the distinction between the online and offline worlds. While these devices enable transformative new applications and services, they also introduce entirely new threats to users' privacy, because they can capture a complete record of the user's location, online and offline activities, and social encounters, including an audiovisual record. Such a record of users' personal information is highly sensitive and subject to numerous privacy risks. In this thesis, we have investigated and built systems to mitigate two such risks: 1) privacy risks due to ubiquitous digital capture, where bystanders may inadvertently be captured in photos and videos recorded by other nearby users, and 2) privacy risks to users' personal information introduced by a popular class of apps called 'mobile social apps'. We present two systems, called I-Pic and EnCore, built to mitigate these two risks. Both systems aim to put users back in control of what personal information is collected and shared, while still enabling innovative new applications. We built working prototypes of both systems and evaluated them through actual user deployments. Overall, we demonstrate that it is possible to achieve privacy-compliant digital capture and to build privacy-compliant mobile social apps while preserving their intended functionality and ease of use. Furthermore, we explore how the two solutions can be merged into a powerful combination, one that could enable novel workflows, not currently possible, for specifying privacy preferences in image capture.
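
    As a hedged sketch of the capture-time enforcement idea behind a system like I-Pic (bystanders broadcast a capture preference, and faces of those who opted out are obscured before the photo is stored), the snippet below blurs opted-out face regions with OpenCV. The names and the way preferences are obtained are illustrative; I-Pic's actual protocol is more involved.

        import cv2
        import numpy as np

        def enforce_preferences(image: np.ndarray,
                                faces: list[tuple[int, int, int, int]],
                                opted_out: list[bool]) -> np.ndarray:
            """Blur each detected face (x, y, w, h) whose owner opted out of capture."""
            out = image.copy()
            for (x, y, w, h), hide in zip(faces, opted_out):
                if hide:
                    out[y:y + h, x:x + w] = cv2.GaussianBlur(out[y:y + h, x:x + w], (51, 51), 0)
            return out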

    Place Recognition by Per-Location Classifiers

    Place recognition is formulated as the task of finding the location where a query image was captured. This is an important task with many practical applications in robotics, autonomous driving, augmented reality, 3D reconstruction, and systems that organize imagery in a geographically structured manner. Place recognition is typically done by finding a reference image in a large, structured, geo-referenced database. In this work, we first address the problem of building a geo-referenced dataset for place recognition. We describe a framework for building the dataset from the street-side imagery of Google Street View, which provides panoramic views from positions along many streets in cities and rural areas worldwide. Besides downloading the panoramic views and transforming them into sets of perspective images, the framework can also retrieve the underlying scene depth information. Second, we aim to localize a query photograph by finding other images depicting the same place in a large geotagged image database. This is a challenging task due to changes in viewpoint and imaging conditions and the large size of the image database. The contribution of this work is two-fold: (i) we cast the place recognition problem as a classification task and use the available geotags to train a classifier for each location in the database, in a manner similar to per-exemplar SVMs in object recognition, and (ii) as only a few positive training examples are available for each location, we propose two methods to calibrate all the per-location SVM classifiers without the need for additional positive training data. The first method relies on p-values from statistical hypothesis testing and uses only the available negative training data. The second method performs an affine calibration by appropriately normalizing the learned classifier hyperplane and does not need any additional labeled training data. We test the proposed place recognition method with the bag-of-visual-words and Fisher vector image representations suitable for large-scale indexing. Experiments are performed on three datasets: 25,000 and 55,000 geotagged street-view images of Pittsburgh, and the 24/7 Tokyo benchmark containing 76,000 images with varying illumination conditions. The results show improved place recognition accuracy of the learned image representation over direct matching of raw image descriptors.
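
    As a hedged sketch of the second calibration idea above, the snippet below affinely rescales each per-location linear SVM, here by dividing by the hyperplane norm, so that scores from different classifiers become comparable when ranking locations for a query descriptor. This is one plausible reading of the normalization; the thesis's exact affine calibration may differ.

        import numpy as np

        def calibrate(w: np.ndarray, b: float) -> tuple[np.ndarray, float]:
            """Affinely rescale a linear classifier to (w/||w||, b/||w||)."""
            norm = np.linalg.norm(w)
            return w / norm, b / norm

        def rank_locations(query: np.ndarray,
                           classifiers: list[tuple[np.ndarray, float]]) -> list[tuple[int, float]]:
            """Score a query descriptor against every calibrated per-location classifier."""
            scores = [(i, float(w @ query + b)) for i, (w, b) in enumerate(classifiers)]
            return sorted(scores, key=lambda s: -s[1])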
