
    Reduced Memory Region Based Deep Convolutional Neural Network Detection

    Accurate pedestrian detection plays a primary role in automotive safety: for example, by issuing warnings to the driver or acting on the car's brakes, it helps decrease the probability of injuries and human fatalities. To achieve very high accuracy, recent pedestrian detectors have been based on Convolutional Neural Networks (CNNs). Unfortunately, such approaches require vast amounts of computational power and memory, preventing efficient implementations on embedded systems. This work proposes a CNN-based detector that adapts a general-purpose convolutional network to the task at hand. By thoroughly analyzing and optimizing each step of the detection pipeline, we develop an architecture that outperforms methods based on traditional image features and achieves an accuracy close to the state of the art while having low computational complexity. Furthermore, the model is compressed to fit the tight constraints of low-power devices with a limited amount of embedded memory. This paper makes two main contributions: (1) it shows that a region-based deep neural network can be finely tuned to achieve adequate accuracy for pedestrian detection; (2) it achieves very low memory usage without reducing detection accuracy on the Caltech Pedestrian dataset. Comment: IEEE 2016 ICCE-Berli

    An In-Depth Study on Open-Set Camera Model Identification

    Camera model identification refers to the problem of linking a picture to the camera model used to shoot it. As this can be an enabling factor in forensic applications that aim to single out possible suspects (e.g., detecting the author of child abuse or terrorist propaganda material), many accurate camera model attribution methods have been developed in the literature. One of their main drawbacks, however, is the typical closed-set assumption of the problem. This means that an investigated photograph is always assigned to one camera model within a set of known ones present during investigation, i.e., training time, and the fact that the picture can come from a completely unrelated camera model during actual testing is usually ignored. Under realistic conditions, it is not possible to assume that every picture under analysis belongs to one of the available camera models. To deal with this issue, in this paper we present the first in-depth study on the possibility of solving the camera model identification problem in open-set scenarios. Given a photograph, we aim to detect whether it comes from one of the known camera models of interest or from an unknown one. We compare different feature extraction algorithms and classifiers specifically targeting open-set recognition. We also evaluate possible open-set training protocols that can be applied along with any open-set classifier, observing that a simple one among those alternatives obtains the best results. Thorough testing on independent datasets shows that it is possible to leverage a recently proposed convolutional neural network as a feature extractor, paired with a properly trained open-set classifier, to solve the open-set camera model attribution problem even on small-scale image patches, improving over available state-of-the-art solutions. Comment: Published through IEEE Access journa

    Multi-view coding of local features in visual sensor networks

    Local visual features extracted from multiple camera views are nowadays employed in several application scenarios, such as object recognition, disparity matching, image stitching and many others. In several cases, local features need to be transmitted or stored on resource-limited devices, thus calling for efficient coding techniques. While recent works have addressed the problem of efficiently compressing local features extracted from still images or video sequences, in this paper we propose and evaluate an architecture for coding features extracted from multiple, overlapping views. The proposed Multi-View Feature Coding architecture can be applied to either real-valued or binary features, and achieves bitrate reductions on the order of 10-20% with respect to simulcast coding.

    Aligned and Non-Aligned Double JPEG Detection Using Convolutional Neural Networks

    Due to the wide diffusion of the JPEG coding standard, the image forensic community has devoted significant attention to the development of double JPEG (DJPEG) compression detectors through the years. The ability to detect whether an image has been compressed twice provides paramount information toward image authenticity assessment. Given the momentum recently gained by convolutional neural networks (CNNs) in many computer vision tasks, in this paper we propose to use CNNs for aligned and non-aligned double JPEG compression detection. In particular, we explore the capability of CNNs to capture DJPEG artifacts directly from images. Results show that the proposed CNN-based detectors achieve good performance even with small images (i.e., 64x64), outperforming state-of-the-art solutions, especially in the non-aligned case. Moreover, good results are also achieved in the commonly recognized challenging case in which the first quality factor is larger than the second one. Comment: Submitted to Journal of Visual Communication and Image Representation (first submission: March 20, 2017; second submission: August 2, 2017

    A Visual Sensor Network for Parking Lot Occupancy Detection in Smart Cities

    Technology is quickly revolutionizing our everyday lives, helping us to perform complex tasks. The Internet of Things (IoT) paradigm is becoming increasingly popular and is key to the development of Smart Cities. Among all the applications of IoT in the context of Smart Cities, real-time parking lot occupancy detection has recently gained a lot of attention. Solutions based on computer vision yield good performance in terms of accuracy and are deployable on top of visual sensor networks. Since the problem of detecting vacant parking lots is usually distributed over multiple cameras, ad hoc algorithms for content acquisition and transmission need to be devised. A traditional paradigm consists of acquiring and encoding images or videos and transmitting them to a central controller, which is responsible for analyzing such content. A novel paradigm, which moves part of the analysis to the sensing devices, is quickly becoming popular. We propose a system for distributed parking lot occupancy detection based on the latter paradigm, showing that onboard analysis and transmission of simple features outperform the traditional paradigm in terms of overall rate-energy-accuracy performance.

    Synthesis and DNA binding tests of a fluorescent pyrene bearing a Pt(II) pyridineimino complex

    Despite the long time that has passed since the discovery of the anticancer activity of cis-platinum, a huge amount of research is still devoted to the design of new Pt(II) complexes with enhanced biological activity [1-3]. The work presented here concerns the synthesis of a fluorescent pyridinimino platinum(II) complex, in which a cis-platinum moiety linked to an extended aromatic residue could provide interesting properties for binding to biosubstrates. In fact, covalent Pt(II) binding can occur, and it would be strengthened by the anchoring offered by possible intercalation of the pyrene fragment in nucleic acids. Antiproliferative properties have been described for some pyridinimino [4] and pyridinamino [5] platinum(II) complexes. Moreover, similar bifunctional systems have already been tested with interesting performances [7,8]. The chelating iminopyridine ligand was prepared by a condensation reaction between pyridine-2-carboxyaldehyde and the suitably O-alkylated aminoalcohol. The platinum complex was then synthesized starting from cis-[PtCl2(DMSO)2] and purified by crystallization. The pure complex (elemental analysis) was characterized spectroscopically (IR, 1H-, 13C- and 195Pt NMR). It is highly soluble in DMSO and in DMSO/H2O mixtures, where its stability was checked by 1H- and 195Pt NMR. The absorbance and fluorescence features of the dye were also checked. Afterwards, the target Pt(II) complex was allowed to interact with natural double-stranded DNA to check its reactivity towards this biosubstrate. Spectrophotometric and spectrofluorometric titrations show that binding does indeed occur. In the absorbance data, hypochromic and bathochromic effects suggest intercalative binding. However, the absence of a defined isosbestic point indicates multiple equilibria. Interestingly, and in agreement with this observation, the light emission behavior of the dye/DNA system is complex. Opposite fluorescence change trends are observed at different temperatures, likely related to a different contribution of DNA-templated dye aggregation. Under the conditions explored so far, the binding is so strong as to be quantitative. Further experiments are ongoing to better elucidate the binding mechanism.
    References: [1] S. X. Chong, S. C. F. Au-Yeung, K. K. W. To, Current Medicinal Chemistry 2016, 23(12), 1268-12. [2] L. Cai, C. Yu, L. Ba, Q. Liu, Y. Qian, B. Yang, C. Gao, Applied Organometallic Chemistry 2018, 32(4). [3] M. Hanif, C. G. Hartinger, Future Medicinal Chemistry 2018, 10(6), 615-617. [4] B. A. Miles, A. E. Patterson, C. M. Vogels, A. Decken, J. C. Waller, P. Jr. Morin, S. A. Westcott, Polyhedron 2016, 108, 23-29. [5] S. Karmakar, K. Purkait, S. Chatterjee, A. Mukherjee, Dalton Trans. 2016, 45, 3599-3615. [6] S. Hochreuther, R. van Eldik, Inorg. Chem. 2012, 51(5), 3025-3038. [7] C. Bazzicalupi, A. Bencini, A. Bianchi, T. Biver, A. Boggioni, S. Bonacchi, A. Danesi, C. Giorgi, P. Gratteri, A. Marchal Ingraín, F. Secco, C. Sissi, B. Valtancoli, M. Venturini, Chemistry – A European Journal 2008, 14(1), 184-196. [8] S. Biagini, A. Bianchi, T. Biver, A. Boggioni, I. V. Nikolayenko, F. Secco, M. Venturini, Journal of Inorganic Biochemistry 2011, 105, 558-562.

    Two vs. Four-Channel Sound Event Localization and Detection

    Sound event localization and detection (SELD) systems estimate both the direction-of-arrival (DOA) and class of sound sources over time. In the DCASE 2022 SELD Challenge (Task 3), models are designed to operate in a 4-channel setting. While this is beneficial for furthering the development of SELD systems that use a multichannel recording setup such as first-order Ambisonics (FOA), consumer electronics devices are rarely able to record using more than two channels. For this reason, in this work we investigate the performance of the DCASE 2022 SELD baseline model using three audio input representations: FOA, binaural, and stereo. We perform a novel comparative analysis illustrating the effect of these audio input representations on SELD performance. Crucially, we show that binaural and stereo (i.e., 2-channel) audio-based SELD models are still able to localize and detect sound sources laterally quite well, despite overall performance degrading as less audio information is provided. Further, we segment our analysis by scenes containing varying degrees of sound source polyphony to better understand the effect of audio input representation on localization and detection performance as scene conditions become increasingly complex.