
    Artificial Intelligence for Multimedia Signal Processing

    Artificial intelligence technologies are also actively applied to broadcasting and multimedia processing. Research spans a wide variety of fields, such as content creation, transmission, and security, and over the past two to three years these efforts have aimed at improving compression efficiency for image, video, speech, and other data in areas related to MPEG media processing technology. Additionally, technologies for media creation, processing, editing, and scenario generation are very important areas of research in multimedia processing and engineering. This book collects topics spanning advanced computational intelligence algorithms and technologies for emerging multimedia signal processing, including computer vision, speech/sound/text processing, and content analysis/information mining.

    Content-Based Access Control

    In conventional databases, the most popular access control model specifies policies explicitly, and manually, for each role of every user against each data object. Nowadays, in large-scale content-centric data sharing, such approaches become impractical due to the explosive growth of data and the sensitivity of data objects. Moreover, conventional database access control policies cannot function when the semantic content of data is expected to play a role in access decisions. Users are often over-privileged, and ex post facto auditing is enforced to detect misuse of privileges; unfortunately, it is usually difficult to reverse the damage, as (a large amount of) data has already been disclosed. In this dissertation, we first introduce Content-Based Access Control (CBAC), an innovative access control model for content-centric information sharing. As a complement to conventional access control models, CBAC makes access control decisions automatically, based on the content similarity between user credentials and data content. In CBAC, a meta-rule allows each user to access "a subset" of the designated data objects of a content-centric database, while the boundary of the subset is dynamically determined by the textual content of the data objects. We then present an enforcement mechanism for CBAC that exploits Oracle's Virtual Private Database (VPD) to implement row-wise access control and to prevent data objects from being abused through unnecessary access admission. To further improve performance, we introduce a content-based blocking mechanism that makes CBAC enforcement more efficient and reveals a more relevant part of the data objects than using the user credentials and data content alone. We also employ several tagging mechanisms for more accurate textual content matching on short text snippets (e.g. short VarChar attributes), extracting topics beyond pure word occurrences to represent the content of the data. With tagging, content similarity is computed not purely from word occurrences but from the semantic topics underlying the text. Experimental results show that CBAC makes accurate access control decisions with a small overhead.
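The core CBAC decision — grant a user access to a row when the row's textual content is sufficiently similar to the user's credential text — can be sketched as follows. The bag-of-words tokenization, cosine similarity, and 0.3 threshold here are illustrative assumptions, not the dissertation's actual implementation (which enforces the decision through Oracle VPD and topic-level tagging):

```python
from collections import Counter
from math import sqrt

def bow(text):
    """Tokenize text into a lowercase bag-of-words vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cbac_filter(credential, rows, threshold=0.3):
    """Row-wise access decision: return only the rows whose textual
    content is similar enough to the user's credential text."""
    cred_vec = bow(credential)
    return [r for r in rows if cosine(cred_vec, bow(r)) >= threshold]

rows = [
    "cardiology patient report heart surgery",
    "oncology chemotherapy treatment notes",
    "cardiology ecg heart rhythm analysis",
]
credential = "cardiologist heart cardiology department"
granted = cbac_filter(credential, rows)
print(granted)
```

The boundary of the accessible subset is not fixed per role: changing a row's content or the user's credential text changes the decision, which is the dynamic behavior the model describes.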

    Data Mining

    The availability of big data due to computerization and automation has generated an urgent need for new techniques to analyze and convert big data into useful information and knowledge. Data mining is a promising, leading-edge technology for mining large volumes of data, looking for hidden information, and aiding knowledge discovery. It can be used for characterization, classification, discrimination, anomaly detection, association, clustering, trend or evolution prediction, and much more, in fields such as science, medicine, economics, engineering, computing, and business analytics. This book presents basic concepts, ideas, and research in data mining.

    Bag-of-Visual Words and Error-Correcting Output Codes for Multilabel Classification of Remote Sensing Images

    This paper presents a novel framework for multilabel classification of remote sensing images using Error-Correcting Output Codes (ECOC). Starting from a set of primary class labels, the proposed framework transforms the multiclass problem into binary learning subproblems. The distributed output representations of these binary learners are then mapped back to primary class labels. To obtain robustness with respect to scale, rotation, and image content, a Bag-of-Visual-Words (BOVW) model based on Scale-Invariant Feature Transform (SIFT) descriptors is used for feature extraction. BOVW assumes an a priori unsupervised learning of a dictionary of visual words over the training set. Experiments performed on GeoEye-1 images show the effectiveness of the proposed approach for multilabel classification compared to other methods.
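The two stages described above — quantizing local descriptors into a visual-word histogram, then decoding the binary learners' outputs back to a class through the ECOC code matrix — can be sketched as follows. The toy code matrix, descriptor dimensions, and minimum-Hamming-distance decoder are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Bag-of-Visual-Words: quantize local descriptors against a codebook ---
def bovw_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest visual word and return
    the L1-normalized word-count histogram."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

# --- ECOC: each row is one class's codeword over 5 binary dichotomies ---
code_matrix = np.array([[1, 1, 1, 1, 1],   # class 0
                        [0, 0, 0, 1, 1],   # class 1
                        [1, 0, 1, 0, 0]])  # class 2

def ecoc_decode(binary_outputs, code_matrix):
    """Pick the class whose codeword is closest, in Hamming distance,
    to the vector of binary-learner predictions."""
    dists = (code_matrix != binary_outputs).sum(axis=1)
    return int(dists.argmin())

# Toy SIFT-like descriptors (8-dim) quantized against a 4-word codebook.
descriptors = rng.normal(size=(30, 8))
codebook = rng.normal(size=(4, 8))
hist = bovw_histogram(descriptors, codebook)

# One binary learner flipped a bit of class 1's codeword [0,0,0,1,1];
# minimum-distance decoding still recovers class 1.
print(ecoc_decode(np.array([1, 0, 0, 1, 1]), code_matrix))  # prints 1
```

Because the codewords are spread apart in Hamming distance, a single erroneous binary learner does not change the decoded class, which is the error-correcting property the framework relies on.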

    LWA 2013: Lernen, Wissen & Adaptivität; Workshop Proceedings, Bamberg, 7–9 October 2013

    LWA stands for "Lernen, Wissen, Adaption" (Learning, Knowledge, Adaptation). It is the joint forum of four special interest groups (SIGs) of the German Informatics Society (GI). Following the tradition of previous years, LWA provides a joint forum for experienced and young researchers alike, to present recent trends, technologies, and applications, and to promote interaction among the SIGs.

    Face Recognition from Weakly Labeled Data

    Recognizing the identity of a face or a person in the media usually requires a large amount of training data to design robust classifiers, which demands great human annotation effort. Alternatively, weakly labeled data is publicly available, but its labels can be ambiguous or noisy. For instance, names in the caption of a news photo provide possible candidates for the faces appearing in the image, and names in screenplays are only weakly associated with faces in the videos. Since weakly labeled data is not explicitly labeled by humans, robust learning methods that use it must suppress the impact of noisy instances or automatically resolve the ambiguities in noisy labels. We propose a method for character identification in a TV series. The proposed method uses labels extracted automatically by associating faces with names in the transcripts. Such weakly labeled data often contains erroneous labels resulting from face detection and synchronization errors. Our approach achieves robustness to noisy labeling by exploiting several features: we construct track nodes from face and person tracks and utilize information from facial and clothing appearance. We discover the video structure for effective inference by constructing a minimum-distance spanning tree (MST) from the track nodes, so that track nodes of similar appearance become adjacent to each other and are likely to share the same identity. A non-local cost aggregation step then serves as noise suppression, reliably recognizing the identities of the characters in the video. Another type of weakly labeled data results from labeling ambiguities: a training sample can have more than one label, of which typically one is the true label. For instance, a news photo is usually accompanied by captions, and the names in the captions can serve as candidate labels for the faces appearing in the photo. 
    Learning an effective subject classifier from such ambiguously labeled data is called ambiguously labeled learning. We propose a matrix completion framework for predicting the actual labels from the ambiguously labeled instances; a standard supervised classifier subsequently learns from the disambiguated labels to classify new data. We generalize this matrix completion framework to handle labeling imbalance, avoiding domination by frequent labels. In addition, an iterative candidate elimination step is integrated into the proposed approach to improve ambiguity resolution. Recently, video-based face recognition techniques have received significant attention, since faces in a video provide diverse exemplars for constructing a robust representation of the target (i.e., the subject of interest). Nevertheless, the target face in a video is usually annotated with minimal human effort (i.e., a single bounding box in one video frame). Although face tracking can associate faces within a single video shot, it is ineffective across multiple shots. To fully utilize the faces of a target in multi-shot videos, we propose a target face association (TFA) method that retrieves a set of images of the target face; these associated images are then used to construct a robust representation of the target and improve the performance of video-based face recognition. One of the most important applications of video-based face recognition is outdoor video surveillance using a camera network. Face recognition in outdoor environments is challenging due to illumination changes, pose variations, and occlusions. We present a taxonomy of camera networks and discuss several techniques for continuous tracking of faces acquired by an outdoor camera network, together with a face matching algorithm. 
    Finally, we demonstrate a real-time video surveillance system that uses pan-tilt-zoom (PTZ) cameras to perform pedestrian tracking, localization, face detection, and face recognition.
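The minimum-distance spanning tree construction over track nodes can be sketched as follows, assuming Euclidean distance between appearance feature vectors as the edge weight (Prim's algorithm; the 2-D toy features stand in for the facial and clothing descriptors, which the dissertation does not reduce to this form):

```python
import numpy as np

def prim_mst(features):
    """Minimum-distance spanning tree over track-node features
    (Prim's algorithm, Euclidean distance as edge weight);
    returns the tree as a list of (parent, child) edges."""
    n = len(features)
    dist = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge crossing from the tree to a node outside it.
        u, v = min(((u, v) for u in in_tree for v in range(n)
                    if v not in in_tree), key=lambda e: dist[e])
        edges.append((u, v))
        in_tree.add(v)
    return edges

# Toy appearance features: two "characters", two track nodes each.
features = np.array([[0.0, 0.0], [0.1, 0.0],       # character A
                     [10.0, 10.0], [10.0, 10.2]])  # character B
edges = prim_mst(features)
print(edges)
```

Track nodes with similar appearance end up adjacent in the tree (here the two A-nodes and the two B-nodes each share an edge), which is what makes propagating identities along the tree a reasonable inference structure.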

    Creating 3D city models from satellite imagery for integrated assessment and forecasting of solar energy

    Buildings are the most prominent component of the urban environment. The geometric identification of urban buildings plays an important role in a range of urban applications, including 3D representation of buildings, energy consumption analysis, sustainable development, urban planning, risk assessment, and change detection. In particular, 3D building models can provide a comprehensive assessment of the surfaces exposed to solar radiation. Identifying the available surfaces on urban structures, and the locations that receive enough sunlight to justify additional installed power capacity (e.g. photovoltaic systems), is crucial for solar energy supply efficiency. Although considerable research has been devoted to detecting building rooftops, less attention has been paid to creating complete 3D models of urban buildings. There is therefore a need to better understand the solar energy potential of building envelope surfaces so that future adaptive energy policies can improve the sustainability of cities. The goal of this thesis was to develop a new approach to automatically model existing buildings for the exploitation of solar energy potential within an urban environment. 3D city models were generated by investigating building footprints and heights based on shadow information derived from satellite images. Footprints were detected using a two-level segmentation process: (1) an iterative graph cuts approach for determining building regions, and (2) an active contour method and an adjusted-geometry parameters method for refining the edges and shapes of the extracted footprints. Building heights were estimated by simulating artificial shadow regions, at pre-defined height increments, from the identified footprints and the solar information in the image metadata. 
    The difference between the actual and simulated shadow regions at every height increment was computed using the Jaccard similarity coefficient. 3D models at the first level of detail were then obtained by extruding the building footprints to their estimated heights, creating image voxels and applying the marching cubes approach. In conclusion, 3D building models can be generated solely from 2D data on the buildings' attributes in any selected urban area. The approach outperforms previous attempts, reducing mean error by at least 21%. Qualitative evaluations illustrate that 3D building models can be obtained from satellite images with a mean error of less than 5 m. This comprehensive study allows 3D city models to be generated in the absence of elevation attributes and additional data. Experiments revealed that this novel, automated method can be useful in a number of spatial analyses and urban sustainability applications.
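The height estimation step — sweep candidate heights, simulate the shadow each would cast, and keep the height whose simulated shadow best matches the observed one under the Jaccard coefficient — can be sketched as follows. The shadow simulator here is a deliberately simplified stand-in (shadows cast along the image x-axis with length height / tan(sun elevation)), not the thesis's actual geometry:

```python
import numpy as np

def jaccard(a, b):
    """Jaccard similarity of two binary shadow masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0

def simulate_shadow(footprint, height, sun_elev_deg):
    """Simplified simulator: cast the footprint along the image x-axis
    with shadow length height / tan(sun elevation)."""
    rows, cols = footprint.shape
    length = int(round(height / np.tan(np.radians(sun_elev_deg))))
    shadow = np.zeros_like(footprint, dtype=bool)
    for y, x in zip(*np.nonzero(footprint)):
        shadow[y, x + 1:min(x + 1 + length, cols)] = True
    return shadow & ~footprint   # the roof itself is not shadow

def estimate_height(footprint, actual_shadow, sun_elev_deg, increments):
    """Pick the height increment whose simulated shadow best matches
    the actual shadow region under the Jaccard coefficient."""
    scores = [jaccard(simulate_shadow(footprint, h, sun_elev_deg),
                      actual_shadow) for h in increments]
    return increments[int(np.argmax(scores))]

# Toy scene: a 3x2 building footprint, sun elevation 45 degrees, and an
# "actual" shadow produced by a true height of 4 units.
footprint = np.zeros((5, 12), dtype=bool)
footprint[1:4, 1:3] = True
actual = simulate_shadow(footprint, 4, 45.0)
print(estimate_height(footprint, actual, 45.0, list(range(1, 9))))  # prints 4
```

The Jaccard score peaks at the true height because under- and over-estimated heights produce shadows that only partially overlap the observed region, which is why the increment sweep converges on the correct value.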