Towards information profiling: data lake content metadata management
There is currently a burst of Big Data (BD) processed and stored in huge raw data repositories, commonly called Data Lakes (DL). These BD require new techniques of data integration and schema alignment in order to make the data usable by its consumers and to discover the relationships linking their content. This can be provided by metadata services which discover and describe their content. However, there is currently no systematic approach for this kind of metadata discovery and management. Thus, we propose a framework for profiling the informational content stored in the DL, which we call information profiling. The profiles are stored as metadata to support data analysis. We formally define a metadata management process which identifies the key activities required to handle this effectively. We demonstrate the alternative techniques and performance of our process using a prototype implementation handling a real-life case study from the OpenML DL, which showcases the value and feasibility of our approach. Peer Reviewed. Postprint (author's final draft)
Improving Landmark Localization with Semi-Supervised Learning
We present two techniques to improve landmark localization in images from
partially annotated datasets. Our primary goal is to leverage the common
situation where precise landmark locations are only provided for a small data
subset, but where class labels for classification or regression tasks related
to the landmarks are more abundantly available. First, we propose the framework
of sequential multitasking and explore it here through an architecture for
landmark localization where training with class labels acts as an auxiliary
signal to guide the landmark localization on unlabeled data. A key aspect of
our approach is that errors can be backpropagated through a complete landmark
localization model. Second, we propose and explore an unsupervised learning
technique for landmark localization based on having a model predict equivariant
landmarks with respect to transformations applied to the image. We show that
these techniques improve landmark prediction considerably and can learn
effective detectors even when only a small fraction of the dataset has landmark
labels. We present results on two toy datasets and four real datasets, with
hands and faces, and report new state-of-the-art on two datasets in the wild,
e.g. with only 5% of labeled images we outperform previous state-of-the-art
trained on the AFLW dataset. Comment: Published as a conference paper in CVPR 201
Face Attribute Prediction Using Off-the-Shelf CNN Features
Predicting attributes from face images in the wild is a challenging computer
vision problem. To automatically describe face attributes from face-containing
images, traditionally one needs to cascade three technical blocks --- face
localization, facial descriptor construction, and attribute classification ---
in a pipeline. As a typical classification problem, face attribute prediction
has been addressed using deep learning. Current state-of-the-art performance
was achieved by using two cascaded Convolutional Neural Networks (CNNs), which
were specifically trained to learn face localization and attribute description.
In this paper, we experiment with an alternative way of employing the power of
deep representations from CNNs. In combination with conventional face localization
techniques, we use off-the-shelf architectures trained for face recognition to
build facial descriptors. Recognizing that the describable face attributes are
diverse, our face descriptors are constructed from different levels of the CNNs
for different attributes to best facilitate face attribute prediction.
Experiments on two large datasets, LFWA and CelebA, show that our approach is
entirely comparable to the state-of-the-art. Our findings not only demonstrate
an efficient face attribute prediction approach, but also raise an important
question: how to leverage the power of off-the-shelf CNN representations for
novel tasks. Comment: In proceedings of the 2016 International Conference on Biometrics (ICB)
Zero-Shot Recognition using Dual Visual-Semantic Mapping Paths
Zero-shot recognition aims to accurately recognize objects of unseen classes
by using a shared visual-semantic mapping between the image feature space and
the semantic embedding space. This mapping is learned on training data of seen
classes and is expected to have transfer ability to unseen classes. In this
paper, we tackle this problem by exploiting the intrinsic relationship between
the semantic space manifold and the transfer ability of visual-semantic
mapping. We formalize their connection and cast zero-shot recognition as a
joint optimization problem. Motivated by this, we propose a novel framework for
zero-shot recognition, which contains dual visual-semantic mapping paths. Our
analysis shows this framework can not only apply prior semantic knowledge to
infer the underlying semantic manifold in the image feature space, but also
generate an optimized semantic embedding space, which can enhance the transfer
ability of the visual-semantic mapping to unseen classes. The proposed method
is evaluated for zero-shot recognition on four benchmark datasets, achieving
outstanding results. Comment: Accepted as a full paper in IEEE Computer Vision and Pattern
Recognition (CVPR) 201