5,079 research outputs found

    Cumulative Attribute Space for Age and Crowd Density Estimation

    Get PDF
    A number of computer vision problems such as human age estimation, crowd density estimation and body/face pose (view angle) estimation can be formulated as a regression problem by learning a mapping function between a high dimensional vector-formed feature input and a scalarvalued output. Such a learning problem is made difficult due to sparse and imbalanced training data and large feature variations caused by both uncertain viewing conditions and intrinsic ambiguities between observable visual features and the scalar values to be estimated. Encouraged by the recent success in using attributes for solving classification problems with sparse training data, this paper introduces a novel cumulative attribute concept for learning a regression model when only sparse and imbalanced data are available. More precisely, low-level visual features extracted from sparse and imbalanced image samples are mapped onto a cumulative attribute space where each dimension has clearly defined semantic interpretation (a label) that captures how the scalar output value (e.g. age, people count) changes continuously and cumulatively. Extensive experiments show that our cumulative attribute framework gains notable advantage on accuracy for both age estimation and crowd counting when compared against conventional regression models, especially when the labelled training data is sparse with imbalanced sampling. 1

    Latent Dependency Mining for Solving Regression Problems in Computer Vision

    Get PDF
    PhDRegression-based frameworks, learning the direct mapping between low-level imagery features and vector/scalar-formed continuous labels, have been widely exploited in computer vision, e.g. in crowd counting, age estimation and human pose estimation. In the last decade, many efforts have been dedicated by researchers in computer vision for better regression fitting. Nevertheless, solving these computer vision problems with regression frameworks remained a formidable challenge due to 1) feature variation and 2) imbalance and sparse data. On one hand, large feature variation can be caused by the changes of extrinsic conditions (i.e. images are taken under different lighting condition and viewing angles) and also intrinsic conditions (e.g. different aging process of different persons in age estimation and inter-object occlusion in crowd density estimation). On the other hand, imbalanced and sparse data distributions can also have an important effect on regression performance. Apparently, these two challenges existing in regression learning are related in the sense that the feature inconsistency problem is compounded by sparse and imbalanced training data and vice versa, and they need be tackled jointly in modelling and explicitly in representation. This thesis firstly mines an intermediary feature representation consisting of concatenating spatially localised feature for sharing the information from neighbouring localised cells in the frames. This thesis secondly introduces the cumulative attribute concept constructed for learning a regression model by exploiting the latent cumulative dependent nature of label space in regression, in the application of facial age and crowd density estimation. The thesis thirdly demonstrates the effectiveness of a discriminative structured-output regression framework to learn the inherent latent correlation between each element of output variables in the application of 2D human upper body pose estimation. The effectiveness of the proposed regression frameworks for crowd counting, age estimation, and human pose estimation is validated with public benchmarks
    • …
    corecore