
    MLPerf Inference Benchmark

    Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devices to data-center solutions. Fueling the hardware are a dozen or more software frameworks and libraries. The myriad combinations of ML hardware and ML software make assessing ML-system performance in an architecture-neutral, representative, and reproducible manner challenging. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf Inference answers that call. In this paper, we present our benchmarking method for evaluating ML inference systems. Driven by more than 30 organizations as well as more than 200 ML engineers and practitioners, MLPerf prescribes a set of rules and best practices to ensure comparability across systems with wildly differing architectures. The first call for submissions garnered more than 600 reproducible inference-performance measurements from 14 organizations, representing over 30 systems that showcase a wide range of capabilities. The submissions attest to the benchmark's flexibility and adaptability.
    Comment: ISCA 202
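    At its core, the kind of reproducible inference measurement the abstract describes comes down to timing a fixed workload and reporting latency and throughput statistics. The sketch below is a minimal, hypothetical harness, not MLPerf's actual LoadGen; `infer` and the toy model are placeholders:

```python
import time
import statistics

def benchmark(infer, samples, warmup=10):
    """Time one inference call per sample and summarize the latencies.

    `infer` is any callable standing in for a real model; `samples`
    stands in for a real dataset. A few warmup calls are discarded so
    cold-start effects do not skew the measurements.
    """
    for s in samples[:warmup]:
        infer(s)
    latencies = []
    for s in samples:
        start = time.perf_counter()
        infer(s)
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    return {
        "mean": statistics.mean(latencies),
        "p90": latencies[int(0.9 * len(latencies))],  # tail latency
        "throughput": len(latencies) / sum(latencies),  # samples/sec
    }

# Toy "model": squares its input.
stats = benchmark(lambda x: x * x, list(range(1000)))
```

    Reporting a tail percentile alongside the mean matters for comparability: two systems with equal mean latency can behave very differently under a latency-bounded serving scenario.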

    Classifying BISINDO Alphabet using TensorFlow Object Detection API

    Indonesian Sign Language (BISINDO) is one of the sign languages used in Indonesia. BISINDO classification can be performed by exploiting advances in computer technology such as deep learning. This study applies the MobileNet V2 FPNLite SSD model through the TensorFlow Object Detection API to classify the BISINDO letters A-Z and to measure the accuracy, precision, recall, and cross-validation performance of the model. The dataset consists of 4054 images covering 26 letter classes, collected by the researchers under several research scenarios and limitations. The steps carried out were: dividing the dataset at an 80:20 ratio and applying cross-validation (k-fold = 5). Real-time testing was conducted under two scenarios, bright light of 500 lux and dim light of 50 lux, with an average processing rate of 30 frames per second (fps). Over the five iterations of the 80:20 split, the first iteration yielded a precision of 0.758 and a recall of 0.790; the second a precision of 0.635, a recall of 0.77, and an accuracy of 0.712; the third a recall of 0.746; the fourth a precision of 0.713 and a recall of 0.751; and the fifth a precision of 0.742 and a recall of 0.773. The overall average precision is 0.712 and the overall average recall is 0.747, indicating that the model built performs well.
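    The evaluation protocol above (an 80:20 split evaluated with k-fold = 5) can be sketched with plain Python; this is a generic illustration, not the authors' code, and only the dataset size 4054 is taken from the abstract:

```python
def kfold_splits(n, k=5):
    """Return k (train_indices, val_indices) pairs.

    Each fold holds n/k samples out for validation (20% when k=5),
    with the remainder spread over the first folds so every sample
    appears in exactly one validation set.
    """
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    idx = list(range(n))
    folds, start = [], 0
    for s in sizes:
        val = idx[start:start + s]
        train = idx[:start] + idx[start + s:]
        folds.append((train, val))
        start += s
    return folds

# 4054 images, as in the abstract; 5 folds of roughly 811 images each.
folds = kfold_splits(4054, k=5)
```

    Per-fold precision and recall would then be computed on each validation fold and averaged, which is how the overall 0.712/0.747 figures in the abstract arise.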

    Vehicle Collision Detection based on Synthetic Data using Deep Learning

    Computer vision and deep learning methods that process visual data have considerably improved during the last decade. This progress has also affected the development of so-called autonomous vehicles, which are able to act independently in traffic. One notable hindrance facing any deep learning application is the amount of quality data that is available. Data means the corpus of information from which the models learn new skills. Lack of good data is often the most significant hurdle a deep learning project faces. When considering autonomous vehicles and traffic generally, this problem is particularly evident in a collision context, as there is very little accident data available for public use and research, particularly when the data should be both consistent and of good quality. This thesis presents a solution in which real data is substituted with data generated in a video game environment. The solution proposed in this thesis can learn collision detection by looking at the synthetic data and then apply the learned information to detecting real collisions. The presented solution consists of three phases. The first two phases are object detection and object tracking, which are used to identify and follow vehicles moving in the video footage using deep learning. Information obtained in these phases is then transferred to the third phase, a collision detector, which attempts to infer whether the tracked vehicle is moving normally or participating in a collision. Initial results indicate a promising although limited connection between synthetic and real-world data, and the proposed model is able to slightly surpass the performance of a trivial baseline.
    However, the generated synthetic training data is not entirely representative of its real-world counterpart, which results in some of the collision events being very difficult to detect properly.
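    The three-phase pipeline (detect, track, classify) ultimately reduces each tracked vehicle to a trajectory that the collision detector must label. The sketch below illustrates that interface with a hypothetical abrupt-deceleration heuristic standing in for the thesis's learned detector; the function names and threshold are invented for illustration:

```python
import math

def speeds(track):
    """Per-frame speeds from a list of (x, y) centroids."""
    return [math.dist(a, b) for a, b in zip(track, track[1:])]

def looks_like_collision(track, drop_ratio=0.3):
    """Flag a track whose speed suddenly drops below drop_ratio of
    its recent average -- a crude kinematic stand-in for a learned
    collision classifier operating on tracker output."""
    v = speeds(track)
    for i in range(3, len(v)):
        recent = sum(v[i - 3:i]) / 3
        if recent > 0 and v[i] < drop_ratio * recent:
            return True
    return False

normal = [(float(t), 0.0) for t in range(10)]            # constant speed
crash = [(float(t), 0.0) for t in range(6)] + [(5.1, 0.0)] * 4  # abrupt stop
```

    A learned model replaces the fixed threshold with a decision boundary fit to (synthetic) examples, but it consumes the same per-track kinematic signal.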

    Impact of Ground Truth Annotation Quality on Performance of Semantic Image Segmentation of Traffic Conditions

    Preparation of high-quality datasets for urban scene understanding is a labor-intensive task, especially for datasets designed for autonomous driving applications. Applying the coarse ground truth (GT) annotations of these datasets without detriment to the accuracy of semantic image segmentation (measured by the mean intersection over union, mIoU) could simplify and speed up dataset preparation and model fine-tuning before practical application. Here, the results of a comparative analysis of semantic segmentation accuracy obtained with the PSPNet deep learning architecture are presented for fine and coarse annotated images from the Cityscapes dataset. Two scenarios were investigated: scenario 1 uses fine GT images for both training and prediction, and scenario 2 uses fine GT images for training and coarse GT images for prediction. The obtained results demonstrate that for the most important classes the mean accuracy values of semantic image segmentation are higher for coarse GT annotations than for fine ones, while the standard deviation values show the opposite trend. This means that for some applications some unimportant classes can be excluded and the model can be tuned further for particular classes and specific regions on the coarse GT dataset, even without loss of accuracy. Moreover, this opens the perspective of using deep neural networks for the preparation of such coarse GT datasets.
    Comment: 10 pages, 6 figures, 2 tables, The Second International Conference on Computer Science, Engineering and Education Applications (ICCSEEA2019), 26-27 January 2019, Kiev, Ukraine
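    The mIoU metric used for the fine-versus-coarse comparison is computed from per-class intersections and unions of the predicted and GT label maps. A plain-Python sketch over flattened label arrays (the toy labels below are invented for illustration):

```python
def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union across classes.

    For each class c, IoU = |pred==c AND gt==c| / |pred==c OR gt==c|;
    classes absent from both prediction and GT are skipped.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, g in zip(pred, gt) if p == c and g == c)
        union = sum(1 for p, g in zip(pred, gt) if p == c or g == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Toy flattened label maps with 3 classes.
gt = [0, 0, 1, 1, 2, 2]
pred = [0, 0, 1, 2, 2, 2]
score = mean_iou(pred, gt, 3)
```

    Because each class contributes equally regardless of its pixel count, mIoU is sensitive to exactly the small or rare classes where coarse annotations diverge most from fine ones.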