    Object Detection in 20 Years: A Survey

    Object detection, as one of the most fundamental and challenging problems in computer vision, has received great attention in recent years. Its development in the past two decades can be regarded as an epitome of computer vision history. If we think of today's object detection as a technical aesthetics under the power of deep learning, then turning back the clock 20 years we would witness the wisdom of the cold weapon era. This paper extensively reviews 400+ papers on object detection in the light of its technical evolution, spanning over a quarter-century's time (from the 1990s to 2019). A number of topics are covered in this paper, including the milestone detectors in history, detection datasets, metrics, fundamental building blocks of the detection system, speed-up techniques, and the recent state-of-the-art detection methods. This paper also reviews some important detection applications, such as pedestrian detection, face detection, text detection, etc., and makes an in-depth analysis of their challenges as well as technical improvements in recent years. Comment: This work has been submitted to the IEEE TPAMI for possible publication

    Open-Source Face Recognition Frameworks: A Review of the Landscape

    Detection of Bird Species Using Acoustic Signals

    The task at hand is to use supervised learning to determine which bird species are audible in a given recording. Extracting valuable ecological data from field recordings may first require efficient algorithms for classifying bird species. In this study, we use the support vector machine (SVM) technique to categorize bird chirping sounds into their respective species. Memory management, the availability of high-quality bird species data for machine recognition, and a difference in signal-to-noise ratio between the training and testing sets all posed problems for this research. We used the SVM technique to address these problems and found that it provided satisfactory results. In this case, SVM is the most effective technique for recognition problems, followed by CNN tweaking, testing, and categorization. Birds are categorized based on their attributes (size, color, species, etc.), and the outcome is compared with pre-trained data to produce the output. In addition, we use the KNN, LR, and random forest algorithms. In this project, we provide information on many species, including their native territory, diet, name, maximum age, plumage color, body length, and whether or not they are migratory. The idea is similar to recognizing who is on the other end of a phone line only by hearing their voice. People with low vision may benefit from this endeavor. The concept also involves using bird calls to provide early warnings of impending rain.

    People detection and re-identification from a stationary camera located indoors

    The goal of this thesis is the creation of a system that is able to detect and track persons using information from a stationary camera. The system is also able to extract biometric information such as age and gender from the detections. This can be useful, for example, in a commercial setting, where a retail store can use this information to predict customer behavior and/or plan marketing strategies.

    Using Mobile Phone to Assist DHH Individuals

    Past research on sign language recognition has mostly been based on physical information obtained via wearable devices or depth cameras. However, both types of devices are costly and inconvenient to carry, making it difficult to gain widespread acceptance by potential users. This research aims to use sophisticated and recently developed deep learning technology to build a recognition model for a Taiwanese version of sign language, with a limited focus on RGB images for training and recognition. It is hoped that this research, which makes use of lightweight devices such as mobile phones and webcams, will make a significant contribution to the communication needs of deaf and hard-of-hearing (DHH) individuals

    Low-Resolution Face Recognition Using the Lightweight VarGFaceNet Architecture with Adaptive Margin Loss

    Face recognition is a modern security solution that is fast and easy to integrate into most of today's devices, so it is widely deployed across several domains as a form of security authorization. Developing face recognition models with mainstream architectures (AlexNet, VGGNet, GoogleNet, ResNet, and SENet) can make the models difficult to deploy on mobile devices and embedded systems. In addition, low-resolution input, such as footage from CCTV surveillance cameras or drones, makes it hard for a model to recognize the input face because the image does not contain enough cues. This study therefore analyzes the performance of a face recognition model built with the lightweight VarGFaceNet architecture and the AdaFace adaptive margin loss function on low-resolution image datasets. On the LFW dataset, the evaluation yields an accuracy of 99.08% on high-resolution data (112x112 pixels), while on synthetic low-resolution data at the lowest resolution (14x14 pixels) the accuracy is 79.87% with the help of the super-resolution models Real-ESRGAN and GFP-GAN. On the TinyFace dataset, without fine-tuning, the Rank-1 accuracy is 46.08% without a super-resolution model and 45.03% with one.