19 research outputs found

    AI-Based Analytics for Hawkers Identification in Video Surveillance for Smart Community

    Street hawking is a widespread phenomenon in urban areas globally, presenting challenges for local authorities such as traffic congestion, waste management, and negative impacts on the city's image. This research addresses key issues faced by authorities in managing hawkers, including resistance to formalization, maintaining urban aesthetics, waste disposal, and understanding user preferences. The study investigates the performance of the You Only Look Once (YOLO) algorithm, which uses Convolutional Neural Networks (CNNs) for real-time object detection. To achieve this objective, the YOLOv5 algorithm is trained on a custom image dataset, collected from the same camera along a street in the city area, to detect five classes of objects: umbrella, table, stool, car, and people. Real images captured via camera and video surveillance were compiled into datasets, which were then used to train and test the algorithm. The study aims to provide insights into the data collection process for street hawkers in the area and into the development of real-time hawker detection for smart city applications.
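Training YOLOv5 on a custom dataset like this one requires each annotation in YOLO's normalized label format: a class index followed by box centre and size relative to the image dimensions. A minimal sketch of that conversion, assuming the class order listed in the abstract and a hypothetical `to_yolo_label` helper (neither is confirmed by the paper):

```python
# Class order as listed in the abstract; the actual index order used by the
# authors is an assumption.
CLASSES = ["umbrella", "table", "stool", "car", "people"]

def to_yolo_label(cls, box, img_w, img_h):
    """Convert a pixel-space box (x1, y1, x2, y2) into a YOLO label line:
    '<class_id> <cx> <cy> <w> <h>' with all coordinates normalized to [0, 1]."""
    x1, y1, x2, y2 = box
    cx = (x1 + x2) / 2 / img_w   # normalized box centre x
    cy = (y1 + y2) / 2 / img_h   # normalized box centre y
    w = (x2 - x1) / img_w        # normalized width
    h = (y2 - y1) / img_h        # normalized height
    return f"{CLASSES.index(cls)} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"
```

For example, a 200x100 px car box at (100, 50) in a 640x480 frame maps to class index 3 with a centre near (0.31, 0.21).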

    Research on product detection and recognition methods for intelligent vending machines

    With the continuous development of China's economy and the improvement of residents' living standards, labor and rent costs are also rising. In addition, the impact of the pandemic on brick-and-mortar retail has created opportunities for new retail models. Building on the rapid development of artificial intelligence, big data, and mobile payment, the new retail industry powered by artificial intelligence has performed strongly in the market, and intelligent vending machines have emerged as part of this new retail model. To provide users with a good shopping experience, the product detection speed and accuracy of intelligent vending machines must be sufficiently high. We adopt Faster R-CNN, a mature deep learning object detection algorithm, to solve the commodity settlement scenario of intelligent vending machines.
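The settlement scenario described above ultimately reduces to turning the detector's (label, confidence) output into a bill. A minimal sketch of that post-detection step, with a hypothetical price table and confidence threshold that are illustrative assumptions, not values from the paper:

```python
# Hypothetical price table; the paper does not specify products or prices.
PRICES = {"cola": 3.0, "water": 2.0, "chips": 5.5}

def settle(detections, threshold=0.5):
    """Turn (label, confidence) detections from the product detector into a
    (counts, total) settlement, discarding low-confidence detections."""
    counts = {}
    for label, conf in detections:
        if conf >= threshold and label in PRICES:
            counts[label] = counts.get(label, 0) + 1
    total = sum(PRICES[label] * n for label, n in counts.items())
    return counts, total
```

A detector run yielding two confident colas, one water, and one low-confidence chips detection would thus bill only the first three items.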

    Analyzing computer vision models for detecting customers: a practical experience in a Mexican retail

    Computer vision has become an important technology for obtaining meaningful data from visual content and providing valuable information for enhancing security controls, marketing, and logistic strategies in diverse industrial and business sectors. The retail sector constitutes an important part of the worldwide economy. Analyzing customer data and shopping behaviors has become essential to deliver the right products to customers, maximize profits, and increase competitiveness. In-person shopping is still a predominant form of retail despite the appearance of online retail outlets. As such, in-person retail is adopting computer vision models to monitor store products and customers. This research paper presents the development of a computer vision solution by Lytica Company to detect customers in Steren's physical retail stores in Mexico. Current computer vision models such as SSD Mobilenet V2, YOLO-FastestV2, YOLOv5, and YOLOXn were analyzed to find the most accurate system according to the conditions and characteristics of the available devices. Some of the challenges addressed during the analysis of videos were obstruction and proximity of the customers, lighting conditions, position and distance of the camera relative to the customer when entering the store, image quality, and scalability of the process. Models were evaluated with the F1-score metric: 0.64 with YOLO-FastestV2, 0.74 with SSD Mobilenet V2, 0.86 with YOLOv5n, 0.86 with YOLOv5xs, and 0.74 with YOLOXn. Although YOLOv5 achieved the best performance, YOLOXn presented the best balance between performance and FPS (frames per second) rate, considering the limited hardware and computing power conditions.
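The comparison above ranks models by F1-score, the harmonic mean of precision and recall. A small sketch of the metric itself; the counts in the usage example are illustrative, not taken from the paper:

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision (tp / (tp + fp)) and
    recall (tp / (tp + fn)); balances missed and spurious detections."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For instance, 86 true positives with 14 false positives and 14 false negatives gives precision = recall = 0.86, hence F1 = 0.86, matching the scale of the scores reported above.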

    Centralised and decentralised sensor fusion-based emergency brake assist

    Copyright: © 2021 by the authors. Many advanced driver assistance systems (ADAS) currently try to utilise multi-sensor architectures, where the driver assistance algorithm receives data from a multitude of sensors. As mono-sensor systems cannot provide reliable and consistent readings under all circumstances because of errors and other limitations, fusing data from multiple sensors ensures that the environmental parameters are perceived correctly and reliably for most scenarios, thereby substantially improving the reliability of multi-sensor-based automotive systems. This paper first highlights the significance of efficiently fusing data from multiple sensors in ADAS features. An emergency brake assist (EBA) system is showcased using multiple sensors, namely a light detection and ranging (LiDAR) sensor and a camera. The architectures of the proposed 'centralised' and 'decentralised' sensor fusion approaches for EBA are discussed along with their constituents, i.e., the detection algorithms, the fusion algorithm, and the tracking algorithm. The centralised and decentralised architectures are built and analytically compared, and the performance of these two fusion architectures for EBA is evaluated in terms of speed of execution, accuracy, and computational cost. While both fusion methods drive the EBA application at an acceptable frame rate (~20 fps or higher) on an Intel i5-based Ubuntu system, the experiments and analytical comparisons show that the decentralised fusion-driven EBA leads to higher accuracy, with the downside of a higher computational cost. The centralised fusion-driven EBA yields comparatively less accurate results, but with the benefits of a higher frame rate and a lower computational cost.
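At the core of either fusion architecture is combining redundant estimates of the same quantity (e.g., range to the lead vehicle) from LiDAR and camera. One common approach, shown here as a hedged sketch and not as the paper's actual fusion algorithm, is inverse-variance weighting:

```python
def fuse(z1, var1, z2, var2):
    """Fuse two noisy measurements of the same quantity by weighting each
    with the inverse of its variance; the more certain sensor dominates."""
    w1, w2 = 1.0 / var1, 1.0 / var2
    fused = (w1 * z1 + w2 * z2) / (w1 + w2)
    fused_var = 1.0 / (w1 + w2)  # fused estimate is tighter than either input
    return fused, fused_var
```

Fusing a LiDAR range of 10.0 m (variance 0.1) with a camera range of 12.0 m (variance 0.4) yields an estimate of 10.4 m, pulled toward the more certain LiDAR reading, with a reduced variance of 0.08.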

    Dominant Feature Pooling for Multi Camera Object Detection and Optimization of Retinex Algorithm

    Ph.D. dissertation -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, August 2021. Advisor: Lee Hyuk-Jae.
    This dissertation proposes a novel dominant feature pooling method, applied in the detection phase, for multi-camera object detection CNNs. Multi-camera systems capture images of objects from various perspectives and can therefore exploit more of an object's important features for detection; pooling the features from multiple cameras can thus improve detection accuracy. The proposed method constructs a new feature map by selecting and pooling the dominant features, i.e., those that provide more information among the feature vectors obtained from the different viewpoints of an object. The method is based on the single-camera YOLOv3 network and requires no additional training for multi-camera systems. To demonstrate the effectiveness of dominant feature pooling, a novel method of visualizing feature vectors is also proposed in this work. Furthermore, because object detection CNNs are vulnerable in low-light environments, a method of applying Retinex algorithms to improve their robustness is proposed. Although improvements can be obtained by training directly on low-light images, experiments show that Retinex enhancement is essential because the degree of illumination cannot be predicted in practical environments. Finally, since the Retinex algorithm is effective but computationally complex, this work proposes optimizing it through a hardware (HW) design: efficient implementations of the exponentiation operation and the Gaussian filtering, both essential to Retinex, enable real-time operation at high resolution.
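The pooling step described above can be pictured as scoring the per-camera feature vectors and keeping the most informative one at each spatial position. A simplified pure-Python sketch, assuming an L1-activation score (the dissertation evaluates several scoring methods; this particular choice is an assumption):

```python
def dominant_feature_pooling(camera_features):
    """camera_features: one feature map per camera, each a list of feature
    vectors (one per spatial cell). For every cell, keep the vector whose
    total activation (L1 score) is largest across cameras."""
    n_cells = len(camera_features[0])
    pooled = []
    for cell in range(n_cells):
        vectors = [fmap[cell] for fmap in camera_features]
        # The dominant feature is the vector carrying the most activation.
        pooled.append(max(vectors, key=lambda v: sum(abs(x) for x in v)))
    return pooled
```

Because the pooled map has the same shape as a single-camera feature map, it can be fed back into the unmodified detection layers, which is what lets the method reuse the single-camera network without retraining.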

    Deep Learning Detected Nutrient Deficiency in Chili Plant

    Chili is a staple commodity that affects the Indonesian economy due to high market demand; in June 2019, chili contributed 0.20% of Indonesia's 0.55% inflation. One factor behind such price pressure is crop failure due to plant malnutrition. This study explores deep learning technology in agriculture to help farmers diagnose their plants so that nutrient deficiencies can be corrected. The system uses the R-CNN algorithm as its architecture and a dataset of 270 images in four categories. The dataset consists of primary data sampled from curly chili plants in Boyolali Regency, Indonesia. The result is a system that can recognize nutrient deficiencies in chili plants from input images, with a best testing accuracy of 82.61% and a best mAP value of 15.57%.
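The 82.61% figure above is ordinary classification accuracy over the deficiency categories. A trivial sketch of the computation; the category names here are illustrative stand-ins, since the abstract does not name the four classes:

```python
def accuracy(predictions, labels):
    """Fraction of test images whose predicted deficiency category
    matches the ground-truth label."""
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)
```

With hypothetical predictions over four test images where one is misclassified, the score is 0.75.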

    Radial Basis Function Neural Network in Identifying The Types of Mangoes

    Mango (Mangifera indica L.) is a fruit species whose varieties can be distinguished by their different color and texture characteristics. Identification of mango types is traditionally done manually, through direct visual observation of the fruit to be classified; because this human judgment is subjective, different observers can reach different conclusions. Information technology makes it possible instead to classify mangoes based on their texture using a computerized system. In this work, images are acquired with a camera and processed as follows: texture features are extracted from samples of various mango types using Gabor filters, and the resulting feature values are fed to an artificial neural network (ANN). A Radial Basis Function network produces the weight values that are then used to classify the mango types. The accuracy obtained with this combination of feature extraction and learning methods is 100%.
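The classification step passes the Gabor feature vector through Gaussian radial basis units and a weighted output layer. A minimal forward-pass sketch; the centres, weights, and spread used in the example are illustrative assumptions, not the trained values from the paper:

```python
import math

def rbf_forward(x, centers, weights, sigma=1.0):
    """Radial Basis Function network forward pass: each hidden unit responds
    with exp(-||x - c||^2 / (2 sigma^2)); the output is a weighted sum
    of those responses."""
    activations = []
    for c in centers:
        dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        activations.append(math.exp(-dist_sq / (2 * sigma ** 2)))
    return sum(w * a for w, a in zip(weights, activations))
```

When the input coincides with one centre and lies far from the others, that unit's response is 1 while the rest vanish, so the output approaches that unit's weight, which is how the network carves the feature space into class regions.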