
    Automatic indoor/outdoor scene classification

    The advent and wide acceptance of digital imaging technology has motivated an upsurge in research focused on managing the ever-growing number of digital images. Current research in image manipulation represents a general shift in the field of computer vision from traditional image analysis based on low-level features (e.g. color and texture) to semantic scene understanding based on high-level features (e.g. grass and sky). One particular area of investigation is scene categorization, where the organization of a large number of images is treated as a classification problem. Generally, the classification involves mapping a set of traditional low-level features to semantically meaningful categories, such as indoor and outdoor scenes, using a classifier engine. Successful indoor/outdoor scene categorization benefits a number of image manipulation applications, as indoor and outdoor scenes are among the most general scene types. In content-based image retrieval, for example, a query for a scene containing a sunset can be restricted to images in the database pre-categorized as outdoor scenes. Also, in image enhancement, categorization of a scene as indoor vs. outdoor can lead to improved color balancing and tone reproduction. Prior research in scene classification has shown that high-level information can, in fact, be inferred from low-level image features. Classification rates of roughly 90% have been reported using low-level features to predict indoor vs. outdoor scenes. However, these high classification rates are often achieved with computationally expensive, high-dimensional feature sets, limiting the practical implementation of such systems. To address this problem, a low-complexity, low-dimensional feature set was extracted in a variety of configurations in the work presented here.
Due to their excellent generalization performance, Support Vector Machines (SVMs) were used to manage the tradeoff between reduced dimensionality and increased classification accuracy. It was determined that features extracted from image subblocks, as opposed to the full image, can yield better classification rates when combined in a second stage. In particular, applying SVMs in two stages led to an indoor/outdoor classification accuracy of 90.2% on a large database of consumer photographs provided by Kodak. Finally, it was also shown that low-level and semantic features can be integrated efficiently using Bayesian networks for increased accuracy. Specifically, integrating the grass and sky semantic features with the color and texture low-level features increased the indoor/outdoor classification rate to 92.8% on the same database of images.
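The two-stage scheme this abstract describes can be illustrated with a structural sketch: a first stage scores each image subblock independently, and a second stage combines the subblock scores into one indoor/outdoor decision. This is not the thesis implementation; the mean-intensity feature, the linear stand-ins for the SVMs, and all weights are invented placeholder values.

```python
# Structural sketch of two-stage subblock classification. Stage one
# scores each subblock; stage two combines the scores. The feature
# (mean intensity) and the linear "classifiers" are toy stand-ins
# for the trained SVMs described in the abstract.

def subblock_features(image, grid=4):
    """Split an image (2-D list of pixel values) into grid x grid
    subblocks and return one mean-intensity feature per block."""
    h, w = len(image), len(image[0])
    bh, bw = h // grid, w // grid
    feats = []
    for by in range(grid):
        for bx in range(grid):
            block = [image[y][x]
                     for y in range(by * bh, (by + 1) * bh)
                     for x in range(bx * bw, (bx + 1) * bw)]
            feats.append(sum(block) / len(block))
    return feats

def stage_one(feats, weight=1.0, bias=-0.5):
    """Per-subblock linear scores standing in for the subblock SVMs."""
    return [weight * f + bias for f in feats]

def stage_two(scores):
    """Second-stage combiner standing in for the stage-two SVM:
    here simply an average vote over the subblock scores."""
    return sum(scores) / len(scores)

def classify(image):
    return "outdoor" if stage_two(stage_one(subblock_features(image))) > 0 else "indoor"
```

In a real system the per-subblock scores would come from trained SVMs and the combiner would itself be learned, but the data flow is the same: local evidence first, a global decision second.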

    Parts-based object detection using multiple views

    One of the most important problems in image understanding is robust object detection. Small changes in object appearance due to illumination, viewpoint, and occlusion can drastically change the performance of many object detection methods. Non-rigid objects can be even more difficult to detect reliably. The unique contribution of this thesis was to extend the approach of parts-based object detection to include support for multiple viewing angles. Bayesian networks were used to integrate the parts detection of each view in a flexible manner, so that the experimental performance of each part detector could be incorporated into the decision. The detectors were implemented using neural networks trained with the bootstrapping method of repeated backpropagation, where false positives are introduced to the training set as negative examples. The Bayesian networks were trained with a separate dataset to gauge the performance of each part detector. The final decision of the object detection system was made with a logical OR operation. The domain of human face detection was used to demonstrate the power of this approach. The FERET human face database was selected to provide both training and testing images; a frontal and a side view were chosen from the available poses. Part detectors were trained on four features from each view: the right and left eyes, the nose, and the mouth. The individual part detection rates ranged from 85% to 95% against testing images. Cross-validation was used to test the system as a whole, giving average view detection rates of 96.7% and 97.2% for the frontal and side views respectively, and an overall face detection rate of 96.9% amongst true-positive images. A 5.7% false-positive rate was demonstrated against background clutter images. These results compare favorably with existing methods, while providing the additional benefit of face detection at different view angles.
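The decision structure described above can be sketched as follows: part-detector confidences are combined into a per-view decision, and the final result is the logical OR across views. The averaging combiner here is a made-up stand-in for the per-view Bayesian networks, and the part scores and 0.5 threshold are illustrative values, not numbers from the thesis.

```python
# Sketch of multi-view parts-based detection: per-part confidences
# are fused into a per-view decision (a simple average here, standing
# in for the Bayesian network), and the views are combined with a
# logical OR, as in the approach described above.

def view_decision(part_scores, threshold=0.5):
    """Stand-in for the per-view Bayesian network: average the part
    detector confidences and threshold the result."""
    return sum(part_scores.values()) / len(part_scores) >= threshold

def detect_face(frontal_parts, side_parts):
    """Final decision: logical OR of the two per-view decisions."""
    return view_decision(frontal_parts) or view_decision(side_parts)

# Hypothetical detector outputs for one image (confidences in [0, 1]).
frontal = {"left_eye": 0.9, "right_eye": 0.85, "nose": 0.7, "mouth": 0.8}
side = {"left_eye": 0.2, "right_eye": 0.1, "nose": 0.3, "mouth": 0.2}
```

The OR combination means a face need only be confirmed in one view, which is what allows detection across viewing angles even when the other view's parts are occluded or weakly detected.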

    The state-of-the-art in personalized recommender systems for social networking

    With the explosion of Web 2.0 applications such as blogs, social and professional networks, and various other types of social media, the rich online information and various new sources of knowledge flood users and hence pose a great challenge in terms of information overload. It is critical to use intelligent agent software systems to assist users in finding the right information from an abundance of Web data. Recommender systems can help users deal with the information overload problem efficiently by suggesting items (e.g., information and products) that match users’ personal interests. Recommender technology has been successfully employed in many applications, such as recommending films, music, and books. The purpose of this report is to give an overview of existing technologies for building personalized recommender systems in a social networking environment, and to propose a research direction for addressing the user profiling and cold-start problems by exploiting user-generated content newly available in Web 2.0.

    Analyzing tourist data on Twitter: a case study in the province of Granada at Spain

    This work has been funded by the Spanish Ministerio de Economía y Competitividad under project TIN2016-77902-C3-2-P, and the European Regional Development Fund (ERDF-FEDER)

    Emerging technologies for learning report (volume 3)


    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and socio-economic perspective. The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we take inventory of the impact and legal consequences of these technical advances and point out future directions of research.

    A solution to the hyper complex, cross domain reality of artificial intelligence: The hierarchy of AI

    Artificial Intelligence (AI) is an umbrella term used to describe machine-based forms of learning. This can encapsulate anything from Siri, Apple's smartphone-based assistant, to Tesla's autonomous vehicles (self-driving cars). At present, there are no set criteria for classifying AI. The implications include public uncertainty, corporate scepticism, diminished confidence, insufficient funding and limited progress. Substantial challenges currently exist with AI, such as combinatorially large search spaces, prediction errors against ground-truth values, and the use of quantum error-correction strategies. These are discussed in addition to fundamental data issues across collection, sample error and quality. The concept of cross realms and domains used to inform AI is considered. Furthermore, there is the issue of the confusing range of current AI labels. This paper aims to provide a more consistent form of classification, to be used by institutions and organisations alike as they endeavour to make AI part of their practice. In turn, this seeks to promote transparency and increase trust. This has been done through primary research, including a panel of data scientists and experts in the field, and through a literature review of existing research. The authors propose a model solution in the form of the Hierarchy of AI.

    Representations and representation learning for image aesthetics prediction and image enhancement

    With the continual improvement of cell phone cameras and the connectivity of mobile devices, we have seen an exponential increase in the images that are captured, stored and shared on social media. For example, as of July 1st 2017 Instagram had over 715 million registered users who had posted just shy of 35 billion images. This represented approximately seven- and nine-fold increases in the numbers of users and photos on Instagram since 2012. Whether the images are stored on personal computers or reside on social networks (e.g. Instagram, Flickr), the sheer number of images calls for methods to determine various image properties, such as object presence or appeal, for the purpose of automatic image management and curation. One of the central problems in consumer photography centers on determining the aesthetic appeal of an image, and motivates us to explore questions related to understanding aesthetic preferences, image enhancement and the possibility of using such models on devices with constrained resources. In this dissertation, we present our work on exploring representations and representation learning approaches for aesthetic inference, composition ranking and its application to image enhancement. Firstly, we discuss early representations that mainly consisted of expert features, and their potential to enhance Convolutional Neural Networks (CNNs). Secondly, we discuss the ability of resource-constrained CNNs, and the different architecture choices (input size and layer depth), in solving various aesthetic inference tasks: binary classification, regression, and image cropping. We show that, if trained for fine-grained aesthetics inference, such models can rival the cropping performance of other aesthetics-based croppers; however, they fall short in comparison to models trained for composition ranking.
Lastly, we discuss our work on exploring and identifying the design choices in training composition ranking functions, with the goal of using them for image composition enhancement.
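The cropping application described above can be sketched as a simple loop: enumerate candidate crops, score each with a ranking function, and keep the best. The centre-bias scorer below is a made-up placeholder for the learned composition rankers from the dissertation, and the grid step is an arbitrary choice.

```python
# Toy sketch of ranking-driven cropping: score candidate crops and
# keep the highest-ranked one. toy_score is a placeholder (centre
# bias) standing in for a learned composition ranking function.

def candidate_crops(width, height, size):
    """Enumerate square crops of the given size on a coarse grid."""
    step = size // 2
    return [(x, y, size)
            for x in range(0, width - size + 1, step)
            for y in range(0, height - size + 1, step)]

def toy_score(crop, width, height):
    """Placeholder ranker: prefer crops centred near the image centre."""
    x, y, size = crop
    cx, cy = x + size / 2, y + size / 2
    return -((cx - width / 2) ** 2 + (cy - height / 2) ** 2)

def best_crop(width, height, size):
    """Return the (x, y, size) crop with the highest ranking score."""
    crops = candidate_crops(width, height, size)
    return max(crops, key=lambda c: toy_score(c, width, height))
```

Only the scoring function changes between an aesthetics-based cropper and a composition ranker; the enumerate-score-select loop is common to both, which is why the two can be compared head to head as in the dissertation.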