17 research outputs found

    A Proximity-Aware Hierarchical Clustering of Faces

    In this paper, we propose an unsupervised face clustering algorithm called "Proximity-Aware Hierarchical Clustering" (PAHC) that exploits the local structure of deep representations. In the proposed method, a similarity measure between deep features is computed by evaluating linear SVM margins. The SVMs are trained using the nearest neighbors of the sample data and thus do not require any external training data. Clusters are then formed by thresholding the similarity scores. We evaluate the clustering performance on three challenging unconstrained face datasets: Celebrities in Frontal-Profile (CFP), IARPA JANUS Benchmark A (IJB-A), and JANUS Challenge Set 3 (JANUS CS3). Experimental results demonstrate that the proposed approach can achieve significant improvements over state-of-the-art methods. Moreover, we show that the proposed clustering algorithm can be applied to curate a large-scale, noisy training dataset while maintaining a sufficient number of images and their variations due to nuisance factors. The face verification performance on JANUS CS3 improves significantly when a DCNN model is fine-tuned on the curated MS-Celeb-1M dataset, which contains over three million face images.
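
    The last step described in this abstract, forming clusters by thresholding pairwise similarity scores, can be sketched as follows. This is a minimal illustration and not the PAHC implementation: cosine similarity stands in for the SVM-margin similarity the abstract describes, the function names and the 0.9 threshold are assumptions made for the example, and reading "thresholding" as linking every pair above the threshold and taking connected components is only one possible interpretation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors.

    Assumption: this replaces the SVM-margin similarity used in the paper.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def threshold_clusters(features, threshold):
    """Link every pair with similarity >= threshold, then return the
    connected components as sorted lists of sample indices."""
    n = len(features)
    parent = list(range(n))  # union-find forest, one tree per cluster

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine_similarity(features[i], features[j]) >= threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical 2-D "features": two samples near each axis.
example = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]]
clusters = threshold_clusters(example, threshold=0.9)
```

    With the toy vectors above, samples 0 and 1 end up in one cluster and samples 2 and 3 in another. The pairwise loop is quadratic in the number of samples, so a sketch like this is only suitable for small inputs.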

    Hybrid Cluster based Collaborative Filtering using Firefly and Agglomerative Hierarchical Clustering

    Recommendation systems find user preferences from an individual's purchase history using data mining and machine learning techniques. To reduce computation time, recommendation systems generally use a pre-processing technique, which in turn helps to increase performance and overcome over-fitting of the data. In this paper, we propose a hybrid collaborative filtering algorithm using firefly and agglomerative hierarchical clustering techniques with a priority queue and Principal Component Analysis (PCA). We applied our hybrid algorithm to the MovieLens dataset and used Pearson correlation to obtain top-N recommendations. Experimental results show that our algorithm delivers accurate and reliable recommendations, showing high performance when compared with existing algorithms.
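
    The use of Pearson correlation to obtain top-N recommendations can be illustrated with a small user-based sketch. It is not the hybrid algorithm from the paper: the firefly optimisation, hierarchical clustering, priority queue, and PCA steps are all omitted, and the function names, the data layout (a dict of user to item-rating dicts), and the weighting scheme are assumptions made for this example.

```python
import math


def pearson(u, v):
    """Pearson correlation over the items both users have rated.

    u and v are dicts mapping item -> rating.
    """
    common = sorted(set(u) & set(v))
    if len(common) < 2:
        return 0.0
    mean_u = sum(u[i] for i in common) / len(common)
    mean_v = sum(v[i] for i in common) / len(common)
    num = sum((u[i] - mean_u) * (v[i] - mean_v) for i in common)
    den = math.sqrt(sum((u[i] - mean_u) ** 2 for i in common)) * math.sqrt(
        sum((v[i] - mean_v) ** 2 for i in common)
    )
    return num / den if den else 0.0


def top_n(ratings, user, n):
    """Rank items the user has not rated by the correlation-weighted
    average rating given by positively correlated users."""
    scores, weights = {}, {}
    for other, their_ratings in ratings.items():
        if other == user:
            continue
        w = pearson(ratings[user], their_ratings)
        if w <= 0:
            continue  # ignore uncorrelated and negatively correlated users
        for item, rating in their_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + w * rating
                weights[item] = weights.get(item, 0.0) + w
    # Highest predicted rating first; ties broken by item name.
    ranked = sorted(scores, key=lambda i: (-scores[i] / weights[i], i))
    return ranked[:n]


# Hypothetical ratings: "b" agrees with "a", "c" disagrees.
ratings = {
    "a": {"m1": 5, "m2": 1, "m3": 4},
    "b": {"m1": 5, "m2": 1, "m3": 4, "m4": 5, "m5": 2},
    "c": {"m1": 1, "m2": 5, "m3": 2, "m4": 1, "m5": 5},
}
recommended = top_n(ratings, "a", 2)
```

    In this toy data, user "a" correlates positively only with user "b", so the items "b" rated highly are ranked first. In a cluster-based variant, the loop over `ratings` would presumably be restricted to users in the same cluster as the target user, which is where the computational saving would come from.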

    The current approaches in pattern recognition


    Large-scale clustering of CAGE tag expression data

    Background: Recent analyses have suggested that many genes possess multiple transcription start sites (TSSs) that are differentially utilized in different tissues and cell lines. We have identified a huge number of TSSs mapped onto the mouse genome using the cap analysis of gene expression (CAGE) method. The standard hierarchical clustering algorithm, which produces easily understandable graphical tree images, has difficulty processing such huge amounts of TSS data, so a better method to calculate and display the results is needed. Results: To profit from the best of both methods, we use a combination of hierarchical and non-hierarchical clustering to cluster the expression profiles of TSSs based on a large amount of CAGE data. We processed the genome-wide expression data, including 159,075 TSSs derived from 127 RNA samples of various mouse organs, and succeeded in categorizing them into 70-100 clusters. The clusters exhibited intriguing biological features: a cluster supergroup with a ubiquitous expression profile, tissue-specific patterns, and a distinct distribution of non-coding RNA and functional TSS groups. Conclusion: Our approach succeeded in greatly reducing the calculation cost and is an appropriate solution for analyzing large-scale TSS usage data.

    Estimation of Object Parameters from Images

    The rapid expansion of communication technologies over the last decade has increased the volume of information that people and organisations generate and share. Identifying relevant content is becoming ever harder, because tools and techniques for intelligent information management at mass scale do not yet exist. Given the multimedia character of today's media, image information is increasingly frequent and important. This work describes software for the automatic estimation of predefined object parameters from images. A C++ implementation of the algorithm is also described.

    Using deep learning for social analysis in egocentric images

    In this work, we explore in detail and propose a system to cluster faces from unconstrained images. The system can be divided into two main steps: i) aligning the faces and passing them through a deep convolutional neural network, and ii) clustering the face images by their feature representation.