19 research outputs found
AI-Based Analytics for Hawkers Identification in Video Surveillance for Smart Community
Street hawking is a widespread phenomenon in urban areas globally, presenting challenges for local authorities such as traffic congestion, waste management, and negative impacts on the city's image. This research addresses key issues faced by authorities in managing hawkers, including resistance to formalization, maintaining urban aesthetics, waste disposal, and understanding user preferences. The study investigates the performance of the You Only Look Once (YOLO) algorithm, which uses Convolutional Neural Networks (CNN) for real-time object detection. To achieve this objective, the YOLOv5 algorithm is trained with a custom image dataset collected from the same camera along a city street to detect five classes of objects: umbrella, table, stool, car, and people. Real images captured via camera and video surveillance were compiled into datasets, which were then used to train and test the algorithm. The study aims to provide insights into the data collection process for street hawkers in the area and into the development of real-time hawker detection for smart city applications.
Research on product detection and recognition methods for intelligent vending machines
With the continuous development of China's economy and the improvement of residents' living standards, labor and rent costs have also risen. In addition, the impact of the pandemic on brick-and-mortar businesses has brought opportunities for the development of new retail models. Driven by the rapid development of artificial intelligence, big data, and mobile payment, new retail businesses that use artificial intelligence technology have performed strongly in the market. Intelligent vending machines are one product of this new retail model. To provide users with a good shopping experience, the product detection speed and accuracy of intelligent vending machines must be high enough. We adopt Faster R-CNN, a mature deep-learning object detection algorithm, to handle the commodity settlement scenario of intelligent vending machines.
Analyzing computer vision models for detecting customers: a practical experience in a mexican retail
Computer vision has become an important technology for obtaining meaningful data from visual content and providing valuable information for enhancing security controls, marketing, and logistic strategies in diverse industrial and business sectors. The retail sector constitutes an important part of the worldwide economy. Analyzing customer data and shopping behaviors has become essential to deliver the right products to customers, maximize profits, and increase competitiveness. In-person shopping is still a predominant form of retail despite the appearance of online retail outlets. As such, in-person retail is adopting computer vision models to monitor store products and customers. This research paper presents the development of a computer vision solution by Lytica Company to detect customers in Steren's physical retail stores in Mexico. Current computer vision models such as SSD Mobilenet V2, YOLO-FastestV2, YOLOv5, and YOLOXn were analyzed to find the most accurate system according to the conditions and characteristics of the available devices. Some of the challenges addressed during the analysis of videos were obstruction and proximity of the customers, lighting conditions, position and distance of the camera relative to the customer when entering the store, image quality, and scalability of the process. Models were evaluated with the F1-score metric: 0.64 with YOLO-FastestV2, 0.74 with SSD Mobilenet V2, 0.86 with YOLOv5n, 0.86 with YOLOv5xs, and 0.74 with YOLOXn. Although YOLOv5 achieved the best performance, YOLOXn presented the best balance between performance and FPS (frames per second) rate, considering the limited hardware and computing power conditions.
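For reference, the F1-score used to rank the models above is the harmonic mean of precision and recall. The following is a minimal Python sketch, assuming detections have already been matched against ground truth and counted as true positives, false positives, and false negatives. The function name and the counts are illustrative assumptions, not values from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from raw detection counts."""
    if tp == 0:
        # No correct detections: treat F1 as 0 instead of dividing by zero.
        return 0.0
    precision = tp / (tp + fp)  # fraction of detections that are correct
    recall = tp / (tp + fn)     # fraction of real customers that were detected
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
print(round(f1_score(tp=86, fp=14, fn=14), 2))
```

Because F1 penalises both missed customers and false alarms, it is a reasonable single number for comparing detectors, but it says nothing about frame rate, which the abstract treats as a separate criterion.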
Centralised and decentralised sensor fusion-based emergency brake assist
Copyright: © 2021 by the authors. Many advanced driver assistance systems (ADAS) are currently trying to utilise multi-sensor architectures, where the driver assistance algorithm receives data from a multitude of sensors. As mono-sensor systems cannot provide reliable and consistent readings under all circumstances because of errors and other limitations, fusing data from multiple sensors ensures that the environmental parameters are perceived correctly and reliably for most scenarios, thereby substantially improving the reliability of multi-sensor-based automotive systems. This paper first highlights the significance of efficiently fusing data from multiple sensors in ADAS features. An emergency brake assist (EBA) system is showcased using multiple sensors, namely, a light detection and ranging (LiDAR) sensor and a camera. The architectures of the proposed "centralised" and "decentralised" sensor fusion approaches for EBA are discussed along with their constituents, i.e., the detection algorithms, the fusion algorithm, and the tracking algorithm. The centralised and decentralised architectures are built and analytically compared, and the performance of these two fusion architectures for EBA is evaluated in terms of speed of execution, accuracy, and computational cost. While both fusion methods are seen to drive the EBA application at an acceptable frame rate (~20 fps or higher) on an Intel i5-based Ubuntu system, it was concluded through the experiments and analytical comparisons that the decentralised fusion-driven EBA leads to higher accuracy; however, it has the downside of a higher computational cost. The centralised fusion-driven EBA yields comparatively less accurate results, but with the benefits of a higher frame rate and lower computational cost.
Dominant Feature Pooling for Multi Camera Object Detection and Optimization of Retinex Algorithm
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2021.8. Hyuk-Jae Lee.
This paper proposes a novel dominant feature pooling method utilized in the detection phase for multi-camera object detection CNNs. Multi-camera systems can capture images of objects from various perspectives and utilize more of the important features of objects for detection. Thus, the detection accuracy can be improved by pooling the features of the multiple cameras. The proposed method constructs a new feature patch by selecting and pooling the dominant features that provide more information among the feature vectors obtained from various viewpoints of objects. The proposed method is based on the YOLOv3 network for a single camera, and does not require additional learning processes for multi-camera systems. To show the effectiveness of dominant feature pooling, a novel method of visualizing feature vectors is also proposed in this work.
Furthermore, a method of utilizing Retinex algorithms that can improve the response of object detection CNNs to low-light environments is proposed. Although improvements can be made by training on low-light images as they are, experimental results show that Retinex improvements are essential because the degree of illumination cannot be predicted accurately enough to create new datasets in practical environments. This work also proposes a method to optimize Retinex algorithms through HW design. An efficient implementation of the exponentiation operation and the Gaussian filtering, which are essential for Retinex algorithm operations, is proposed to implement HW that can operate in real time at high resolution.
Chapter 1 Introduction
1.1 Research Background
1.2 Research Content
1.3 Thesis Organization
Chapter 2 Background Theory and Related Work
2.1 Object Detection CNN
2.2 Multi View CNN
2.3 Retinex Algorithm
2.3.1 Retinex Algorithm using Gaussian Filter
2.3.2 Multiscale Retinex Algorithm
2.3.3 Efficient Naturalness Restoration
Chapter 3 Unmanned Vending Stand System
3.1 Overview of the Unmanned Vending Stand System
3.2 Product Recognition Using Object Detection CNN
3.3 Product Purchase Determination Using Multi-Object Tracking
3.4 Optimization Methods for Real-Time Operation of the Unmanned Vending Stand
3.4.1 Camera Selection Algorithm
3.4.2 Multithreading
3.4.3 Pruning
3.5 Performance Evaluation of the Unmanned Vending Stand System
3.5.1 Object Detection Performance Evaluation
3.5.2 Overall Results of the Unmanned Vending Stand System
Chapter 4 Multi-Camera Dominant Feature Pooling
4.1 Object Detection CNN and Multi-Camera Object Clustering
4.1.1 Object Detection CNN
4.1.2 Multi-Camera Object Clustering
4.2 Dominant Feature Pooling Method
4.2.1 Dominant Feature Scoring
4.2.2 Dominant Feature Pooling
4.2.3 Reuse of the YOLOv3 Detection Layer
4.3 Analysis of the Proposed Method Through Feature Visualization
4.3.1 Proposed Feature Visualization Method
4.3.2 Feature Visualization of the Conventional Single-Camera YOLOv3
4.3.3 Multi-Camera Feature Visualization of the Proposed Method
4.4 Dominant Feature Pooling Results and Analysis
4.4.1 Results on the COCO Dataset
4.4.2 Results on the Custom Dataset
4.4.3 Results by Scoring Method
4.4.4 Execution Time Results of Dominant Feature Pooling
Chapter 5 Retinex Applied Object Detection and Hardware Acceleration System
5.1 Prior Work on Applying Retinex
5.2 Retinex Applied Object Detection
5.2.1 Retinex Applied Object Detection Training
5.2.2 Retinex Applied Object Detection Results
5.3 Retinex Optimization for Object Detection
5.3.1 Analysis of the Retinex Effect by Gaussian Filter Size
5.3.2 Object Detection Results by Gaussian Filter Size
5.4 Necessity of a Retinex Hardware System and Prior Work
5.5 Implementation Overview of the Proposed Hardware System
5.6 Implementation Features and Advantages of the Proposed Hardware System
5.6.1 Implementation of the Gaussian Filter
5.6.2 Implementation of Exponentiation
5.6.3 HDMI/DVI Support and Video Latency Minimization
5.7 Implementation Results and Analysis of the Proposed Hardware System
5.7.1 Analysis of Real-Time Operation and Low Latency
5.7.2 Analysis of the Image Processing Performance Results of the Proposed System
5.7.3 FPGA Resource Utilization of the Proposed System
5.7.4 Comparison of Resource Utilization with Other Systems
5.7.5 Analysis of the Image Processing Performance Results of the Proposed System
Chapter 6 Conclusion
References
Abstract
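The Gaussian-filter Retinex and multiscale Retinex named in this thesis entry can be illustrated with the textbook formulation, in which the output is the log of the image minus the log of its Gaussian-blurred version. The sketch below is a software illustration under that assumption, written with NumPy only. It is not the thesis's hardware design, and the function names, the 3-sigma kernel truncation, and the +1 offset are choices made here for the example.

```python
import numpy as np


def gaussian_kernel(sigma: float) -> np.ndarray:
    """1-D Gaussian kernel truncated at about 3 sigma, normalised to sum to 1."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(img: np.ndarray, sigma: float) -> np.ndarray:
    """Separable Gaussian blur of a 2-D image with reflected borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(img.astype(np.float64), radius, mode="reflect")
    # Filter along rows first, then along columns (the Gaussian is separable).
    rows = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)


def single_scale_retinex(img: np.ndarray, sigma: float) -> np.ndarray:
    """log(image) - log(blurred image); the +1 offset avoids log(0)."""
    shifted = img.astype(np.float64) + 1.0
    return np.log(shifted) - np.log(gaussian_blur(shifted, sigma))


def multi_scale_retinex(img: np.ndarray, sigmas) -> np.ndarray:
    """Equal-weight average of single-scale Retinex outputs."""
    return np.mean([single_scale_retinex(img, s) for s in sigmas], axis=0)


# Illustrative input: a dark frame with one bright pixel.
frame = np.zeros((32, 32))
frame[16, 16] = 255.0
enhanced = multi_scale_retinex(frame, sigmas=(1.0, 2.0))
print(enhanced.shape)
```

The filter size (sigma) controls how much local contrast is recovered, which is the trade-off the outline's sections on Gaussian filter size refer to. The result is in the log domain and would still need rescaling to a display range, for example by exponentiation or min-max normalisation.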
Deep Learning Detected Nutrient Deficiency in Chili Plant
Chili is a staple commodity that affects the Indonesian economy because of high market demand. In June 2019, for instance, chili contributed 0.20% of Indonesia's 0.55% inflation. One contributing factor is crop failure caused by nutrient deficiency. This study explores deep learning technology in agriculture to help farmers diagnose their plants so that the plants do not become nutrient deficient. The system uses the RCNN algorithm as its architecture and a dataset of 270 samples in 4 categories. The dataset is primary data collected from curly chili samples in Boyolali Regency, Indonesia. The result is a system that can recognize nutrient deficiencies in chili plants from input images, with a highest testing accuracy of 82.61% and a best mAP value of 15.57%.
Radial Basis Function Neural Network in Identifying The Types of Mangoes
Mango (Mangifera indica L.) is a fruit plant species whose types differ in color and texture. Mango types are usually identified manually, through direct visual observation of the fruit to be classified. Because human judgment is subjective, this leads to inconsistent determinations. Information technology therefore makes it possible to classify mangoes by their texture using a computerized system. In this work, images are acquired with a camera and then processed. Texture features are extracted with Gabor filters from several samples of various mango types, and the extracted feature values are passed to an artificial neural network (ANN). The Radial Basis Function method produces weight values that are then used to classify the types of mangoes. The test accuracy obtained with these extraction and learning methods is 100%.
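To make the classification step above concrete, the following is a minimal sketch of a radial basis function network in Python: Gaussian basis functions around fixed centers, followed by output weights fitted with least squares. Everything specific here is an assumption for illustration. The 2-D vectors stand in for Gabor texture features, the centers are simply the class means, and `gamma` is an arbitrary width; the paper's actual features, centers, and training procedure are not given in the abstract.

```python
import numpy as np


def rbf_features(X: np.ndarray, centers: np.ndarray, gamma: float) -> np.ndarray:
    """Gaussian activation of every sample with respect to every center."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)


def train_rbf(X, y, centers, gamma, n_classes):
    """Fit the output weights by least squares against one-hot targets."""
    phi = rbf_features(X, centers, gamma)
    targets = np.eye(n_classes)[y]
    weights, *_ = np.linalg.lstsq(phi, targets, rcond=None)
    return weights


def predict_rbf(X, centers, gamma, weights):
    """Pick the class whose output unit responds most strongly."""
    return (rbf_features(X, centers, gamma) @ weights).argmax(axis=1)


# Placeholder 2-D "texture features" for two hypothetical mango types.
X_train = np.array([[0.0, 0.0], [0.2, 0.1], [-0.1, 0.2],
                    [3.0, 3.0], [3.1, 2.9], [2.8, 3.2]])
y_train = np.array([0, 0, 0, 1, 1, 1])

# Assumption: one center per class, placed at the class mean.
centers = np.array([X_train[y_train == c].mean(axis=0) for c in (0, 1)])
weights = train_rbf(X_train, y_train, centers, gamma=1.0, n_classes=2)

X_new = np.array([[0.1, 0.1], [2.9, 3.0]])
print(predict_rbf(X_new, centers, gamma=1.0, weights=weights))
```

On toy data this well separated, perfect accuracy is expected and says little about real images, so a reported 100% test accuracy is best read alongside the size and variety of the test set.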