9 research outputs found

    Overview of the 2005 cross-language image retrieval track (ImageCLEF)

    The purpose of this paper is to outline efforts from the 2005 CLEF cross-language image retrieval campaign (ImageCLEF). The aim of this CLEF track is to explore the use of both text and content-based retrieval methods for cross-language image retrieval. Four tasks were offered in the ImageCLEF track: ad-hoc retrieval from an historic photographic collection, ad-hoc retrieval from a medical collection, an automatic image annotation task, and a user-centered (interactive) evaluation task that is explained in the iCLEF summary. 24 research groups from a variety of backgrounds and nationalities (14 countries) participated in ImageCLEF. In this paper we describe the ImageCLEF tasks and the submissions from participating groups, and summarise the main findings.

    Cross-Language and Cross-Media Image Retrieval: An Empirical Study at ImageCLEF2007

    This paper summarizes our empirical study of cross-language and cross-media image retrieval at the CLEF image retrieval track (ImageCLEF2007). This year, we participated in the ImageCLEF photo retrieval task, whose goal is to search natural photos using queries that carry both textual and visual information. We evaluate our solutions for the image retrieval tasks in three respects. First, we study the application of language models and smoothing strategies to text-based image retrieval, particularly addressing the short text query issue. Second, we study the cross-media image retrieval problem using a simple combination strategy. Third, we study the cross-language image retrieval problem between English and Chinese. Finally, we summarize our empirical experiences and indicate some future directions.
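    As an illustration of the "language models and smoothing strategies" mentioned in the abstract above, the following is a minimal sketch of query-likelihood scoring over image captions. The abstract does not say which smoothing strategies were used, so Dirichlet prior smoothing is an assumption here, and the function name `dirichlet_score`, the toy captions, and the value of `mu` are all hypothetical.

```python
import math
from collections import Counter


def dirichlet_score(query_terms, doc_terms, collection_counts, collection_len, mu=2000.0):
    """Log query likelihood under a Dirichlet-smoothed unigram language model.

    Assumption: Dirichlet smoothing is used only as an example of a smoothing
    strategy; the paper's own choice is not stated in the abstract.
    """
    doc_counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        # Background probability of the term in the whole collection.
        p_coll = collection_counts.get(term, 0) / collection_len
        if p_coll == 0.0:
            continue  # term never occurs in the collection: ignore it
        # Smoothed document probability: document counts backed off to the collection.
        p_doc = (doc_counts[term] + mu * p_coll) / (doc_len + mu)
        score += math.log(p_doc)
    return score


# Hypothetical toy captions standing in for image annotations.
docs = {
    "img1": "a church tower in the old town".split(),
    "img2": "boats in the harbour at sunset".split(),
}
collection_counts = Counter(term for terms in docs.values() for term in terms)
collection_len = sum(collection_counts.values())

query = ["church", "tower"]
scores = {
    name: dirichlet_score(query, terms, collection_counts, collection_len, mu=10.0)
    for name, terms in docs.items()
}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

    Smoothing matters most for short queries: without it, a caption missing a single query term would receive zero probability, whereas here it is only penalised relative to captions that contain the term.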

    Computer Vision Techniques for Ambient Intelligence Applications

    Ambient Intelligence (AmI) is a multidisciplinary area which refers to environments that are sensitive and responsive to the presence of people and objects. The rapid progress of technology and the simultaneous reduction of hardware costs in recent years have enlarged the number of possible AmI applications, raising new research challenges at the same time. In particular, one important requirement in AmI is providing proactive support to people in their everyday working and free-time activities. To this end, Computer Vision represents a core research track, since only through suitable vision devices and techniques is it possible to detect elements of interest and understand the occurring events. The goal of this thesis is to present and demonstrate the efficacy of novel machine vision research contributions for different AmI scenarios: object keypoint analysis for Augmented Reality purposes, segmentation of natural images for plant species recognition, and heterogeneous people identification in unconstrained environments.

    Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective

    This paper takes a problem-oriented perspective and presents a comprehensive review of transfer learning methods, both shallow and deep, for cross-dataset visual recognition. Specifically, it categorises cross-dataset recognition into seventeen problems based on a set of carefully chosen data and label attributes. Such a problem-oriented taxonomy has allowed us to examine how different transfer learning approaches tackle each problem and how well each problem has been researched to date. The comprehensive problem-oriented review of the advances in transfer learning has revealed not only the challenges in transfer learning for visual recognition, but also the problems (e.g. eight of the seventeen problems) that have been scarcely studied. This survey presents not only an up-to-date technical review for researchers, but also a systematic approach and a reference for machine learning practitioners to categorise a real problem and look up a possible solution accordingly.

    A Cognitive Study of Aesthetic Visual Perception (심미적 시지각에 대한 인지적고찰)

    Thesis (Ph.D.) -- Seoul National University Graduate School: Interdisciplinary Program in Cognitive Science, August 2015. Byoung-Tak Zhang. Answering the question "What is beauty?" has emerged as an issue in psychology, neuroscience and computer science during the last decade, after a long history of exploration in philosophy and aesthetics. In computer science especially, computational aesthetics pursues an automated aesthetic judgement system based on low-level features and machine learning techniques, for tangible applications such as content recommendation. In this paper, as an effort to build a computational model that estimates the aesthetic value of photos for content recommendation, a hypothesis that surface curvatures of objects and a place in a scene contribute to the estimation is proposed and implemented as a new visual descriptor named Local Slant Cue (LoSC), which captures 2.5D information that traditional local descriptors struggle to capture. Experimental results show comparable performance with only 30 percent of the computational workload of prior art. However, a comparative study reveals a kind of glass ceiling regardless of feature selection, caused by an unusual property of the mediocre samples, which form an absolute majority of any given sample group, in the machine learning framework. Observing the score distributions of the mediocre group leads to the discovery of significantly high variance in the consensus level among human raters for these stimuli. To validate the observation quantitatively, a skewness-kurtosis map is adopted as a tool for consensus analysis and applied to a massive photo aesthetics dataset of 225,000 samples; the result confirms the universality of the observation as one of four patterns, which are incompatible with the Gaussianity that has been assumed so far.
    Several computational models of visual aesthetic perception are proposed and tested for how well they explain the observed patterns, and the dynamic systems model shows a comparative advantage. To elaborate the idea of dynamic systems for aesthetic perception, a new computational model named DDM4AP (Drift-Diffusion Model for Aesthetic Perception) is proposed, which regards visual aesthetic perception as the result of dynamic interaction between like factors and dislike factors. While it concentrates on explaining the wide variance in consensus level, the proposed model also predicts a significantly longer latency when appreciating photos in the mediocre group than photos in the good or the bad group, regardless of consensus level. Human subject experiments validate the prediction, supporting the model as reflecting important attributes of visual aesthetic perception in the human mind. In conclusion, this study argues that computational aesthetics requires new approaches to machine learning and computer vision that consider the dynamic interaction between two contrasting factors and select training data and features in accordance with such mixed data.
    Contents:
    CHAPTER 1. Introduction: 1.1. Background (1.1.1 In Philosophy; 1.1.2 In Psychology; 1.1.3 In Neuroscience); 1.2. Related Works: Computational Aesthetics
    CHAPTER 2. Finding Features: 2.1. Background; 2.2. Local Slant Cue (LoSC) (2.2.1 Representation; 2.2.2 Region Description); 2.3. Experiments; 2.4. Discussion
    CHAPTER 3. Data Revisited: 3.1. What Makes Glass Ceiling; 3.2. Consensus Analysis (3.2.1 Data Set; 3.2.2 Method); 3.3. Analysis Results: 4 Patterns (3.3.1 Pattern 1: A Wide Kurtosis Range; 3.3.2 Pattern 2: Consensus Asymmetry; 3.3.3 Pattern 3: The 4/3 Power Law Regime; 3.3.4 Pattern 4: Tag Effect); 3.4. Discussion
    CHAPTER 4. Modeling: 4.1. Background; 4.2. Static Models; 4.3. Dynamic Models (DDM4AP); 4.4. Discussion
    CHAPTER 5. Validation: 5.1. Background: Prediction from DDM4AP; 5.2. Method; 5.3. Experimental Results; 5.4. Discussion
    Conclusion; References; Appendix 1. Free vs. Non-Free Study; Appendix 2. Summary of Skewness and Kurtosis; Abstract in Korean
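    To make the consensus analysis in the abstract above more concrete, here is a minimal sketch of how one point on a skewness-kurtosis map could be computed from the ratings that human raters gave a single photo. The thesis's exact estimators are not given in the abstract, so population moments and excess kurtosis are assumptions, and the function name `skewness_kurtosis` and the two toy rating lists are hypothetical.

```python
def skewness_kurtosis(ratings):
    """Return (skewness, excess kurtosis) of a list of ratings.

    Assumption: population (biased) moments are used, and kurtosis is
    reported as excess kurtosis, so a Gaussian would map to (0, 0).
    """
    n = len(ratings)
    mean = sum(ratings) / n
    m2 = sum((r - mean) ** 2 for r in ratings) / n  # variance
    m3 = sum((r - mean) ** 3 for r in ratings) / n  # third central moment
    m4 = sum((r - mean) ** 4 for r in ratings) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("all ratings are identical; moments are undefined")
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Hypothetical rating sets on a 1-10 scale, both with mean 5.
peaked = [5, 5, 5, 5, 4, 6, 5, 5, 5, 5]      # raters largely agree
polarised = [1, 1, 1, 1, 1, 9, 9, 9, 9, 9]   # raters split into two camps

peaked_sk = skewness_kurtosis(peaked)
polarised_sk = skewness_kurtosis(polarised)
print(peaked_sk, polarised_sk)
```

    Both toy photos have the same mean score, yet they land far apart on the kurtosis axis: strong agreement gives a peaked distribution with high kurtosis, while a split audience gives a flat or bimodal one with low kurtosis. This is the sense in which a mean score alone hides the consensus level.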

    Multi-view Data Analysis

    Multi-view data analysis is a key technology for making effective decisions by leveraging information from multiple data sources. The process of data acquisition across various sensory modalities gives rise to the heterogeneous property of data. In my thesis, multi-view data representations are studied towards exploiting the enriched information encoded in different domains or feature types, and novel algorithms are formulated to enhance feature discriminability. Extracting informative data representations is a critical step in visual recognition and data mining tasks. Multi-view embeddings provide a new way of representation learning to bridge the semantic gap between low-level observations and high-level human-comprehensible knowledge, benefitting from the enriched information in multiple modalities. Recent advances in multi-view learning have introduced a new paradigm for jointly modeling cross-modal data. The subspace learning method, which extracts compact features by exploiting a common latent space and fuses multi-view information, has emerged as prominent among the different categories of multi-view learning techniques. This thesis provides novel solutions for learning compact and discriminative multi-view data representations by exploiting the data structures in a low-dimensional subspace. We also demonstrate the performance of the learned representation scheme on a number of challenging tasks in recognition, retrieval and ranking problems. The major contribution of the thesis is a unified solution for subspace learning methods, which is extensible to multiple views, supervised learning, and non-linear transformations. Traditional statistical learning techniques including Canonical Correlation Analysis, Partial Least Squares regression and Linear Discriminant Analysis are studied by constructing graphs of specific forms under the same framework.
Methods using non-linear transforms based on kernels and (deep) neural networks are derived, which lead to superior performance compared to the linear ones. First, a novel multi-view discriminant embedding method is proposed by taking the view difference into consideration. Secondly, a multi-view nonparametric discriminant analysis method is introduced by exploiting the class boundary structure and discrepancy information of the available views. This allows for multiple projection directions by relaxing the Gaussian distribution assumption of related methods. Thirdly, we propose a composite ranking method that keeps a close correlation with the individual rankings for optimal rank fusion. We propose a multi-objective solution to ranking problems by capturing inter-view and intra-view information using autoencoder-like networks. Finally, a novel end-to-end solution is introduced to enhance joint ranking with minimum view-specific ranking loss, so that we can achieve the maximum global view agreement within a single optimization process. In summary, this thesis aims to address the challenges in representing multi-view data across different tasks. The proposed solutions have shown superior performance in numerous tasks, including object recognition, cross-modal image retrieval, face recognition and object ranking.

    Intelligent Circuits and Systems

    ICICS-2020 is the third conference initiated by the School of Electronics and Electrical Engineering at Lovely Professional University, exploring recent innovations by researchers working on the development of smart and green technologies in the fields of Energy, Electronics, Communications, Computers, and Control. ICICS enables innovators to identify new opportunities for the social and economic benefit of society. The conference bridges the gap between academics and R&D institutions, social visionaries, and experts from all strata of society, allowing them to present their ongoing research activities and foster research relations. It provides opportunities for exchanging new ideas, applications, and experiences in the field of smart technologies and for finding global partners for future collaboration. ICICS-2020 was conducted in two broad categories: Intelligent Circuits & Intelligent Systems, and Emerging Technologies in Electrical Engineering.