    Joint-Based Action Progress Prediction

    Action understanding is a fundamental branch of computer vision with applications ranging from surveillance to robotics. Most works deal with localizing and recognizing the action in both time and space, without characterizing its evolution. Recent works have addressed the prediction of action progress, which is an estimate of how far the action has advanced as it is performed. In this paper, we propose to predict action progress using a different modality from previous methods: body joints. Human body joints carry very precise information about human poses, which we believe is a much more lightweight and effective way of characterizing actions and therefore their execution. Action progress can in fact be estimated from an understanding of how key poses follow each other during the development of an activity. We show how an action progress prediction model can exploit body joints and how it can be integrated with modules providing keypoint and action information so that it runs directly from raw pixels. The proposed method is experimentally validated on the Penn Action Dataset.

    Multiple future prediction leveraging synthetic trajectories

    Multi-axis Stiffness Sensing Device for Medical Palpation

    This paper presents an innovative hand-held device able to compute stiffness when interacting with a soft object. The device is composed of four linear indenters and a USB camera. The stiffness is computed in real time by tracking the movements of spherical features in the camera image. Those movements correspond to the movements of the four indenters when interacting with a soft surface. Since the indenters are connected to springs with different spring constants, the displacement of the indenters varies when interacting with a soft object. The proposed multi-indenting device allows measuring the object's stiffness as well as the pan and tilt angles between the sensor and the surface of the soft object. Tests were performed to evaluate the accuracy of the proposed palpation mechanism against commercial springs of known stiffness. Results show that the accuracy and sensitivity of the proposed device increase with the softness of the examined object. Preliminary tests with silicone show the ability of the sensing mechanism to characterize phantom soft tissue for small indentations. The results are not affected by the orientation of the device when probing the surface. The proposed sensing device can be used in different applications, such as external palpation for diagnosis or, if miniaturized, embedded in an endoscopic camera and used in Minimally Invasive Surgery (MIS).
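
    One way to see why springs with different constants are useful is an idealized series-spring model. The sketch below is an illustration under stated assumptions, not the device's actual algorithm: each indenter is treated as an ideal spring pressed against a linearly elastic surface, and two indenters are assumed to share the same unknown travel d. The function name, units, and readings are hypothetical.

```python
def stiffness_from_pair(k1, x1, k2, x2):
    """Estimate object stiffness from two indenters with different springs.

    Assumed model (not taken from the paper): for each indenter i,
    the spring force equals the surface force,
        k_i * x_i = k_obj * (d - x_i),
    where x_i is the spring compression and d is the shared travel.
    Eliminating the unknown d gives
        k_obj = (k2 * x2 - k1 * x1) / (x1 - x2).
    """
    if x1 == x2:
        raise ValueError("compressions must differ to eliminate the travel d")
    return (k2 * x2 - k1 * x1) / (x1 - x2)


# Hypothetical readings: spring constants in N/m, compressions in m.
k_obj = stiffness_from_pair(100.0, 0.004, 400.0, 0.002)
print(f"estimated stiffness: {k_obj:.1f} N/m")
```

    Under this model the indentation depth never has to be measured directly, which is one plausible reading of why the indenters carry different springs.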

    Credence attributes and the quest for a higher price – A hedonic stochastic frontier approach

    Food manufacturers that offer credence attributes, whose presence cannot be determined a priori, may fail to differentiate their products effectively and achieve higher prices if asymmetric information (on the producers' side) impairs their ability to reach consumers with a higher willingness to pay. In this article, we assess whether manufacturers carrying products with credence attributes in their portfolio are able to obtain higher prices. To this end, we use a large database of yoghurt sales in Italy and a hedonic price model estimated using a stochastic frontier estimator. The results indicate that manufacturers that offer more credence attributes in their portfolios are able to price their products at systematically higher levels.
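
    For readers unfamiliar with the estimator, a generic stochastic frontier form of a hedonic price equation can be written as below. This is the textbook composed-error specification, given only as an assumption about the general setup; the abstract does not state the article's exact model, and the symbols are illustrative.

```latex
% Generic hedonic stochastic frontier (illustrative, not the article's exact model)
% p_i : price of product i,  x_i : vector of product attributes
% v_i : symmetric noise,     u_i >= 0 : shortfall from the attainable price
\ln p_i = x_i^{\top}\beta + v_i - u_i,
\qquad v_i \sim \mathcal{N}(0, \sigma_v^2),
\qquad u_i \ge 0 .
```

    In this reading, the frontier is the highest price attainable for a given set of attributes, and u_i measures how far a product falls below it.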

    A Sparse and Locally Coherent Morphable Face Model for Dense Semantic Correspondence Across Heterogeneous 3D Faces

    The 3D Morphable Model (3DMM) is a powerful statistical tool for representing 3D face shapes. To build a 3DMM, a training set of face scans in full point-to-point correspondence is required, and its modeling capabilities directly depend on the variability contained in the training data. Thus, to increase the descriptive power of the 3DMM, establishing a dense correspondence across heterogeneous scans with sufficient diversity in terms of identities, ethnicities, or expressions becomes essential. In this manuscript, we present a fully automatic approach that leverages a 3DMM to transfer its dense semantic annotation across raw 3D faces, establishing a dense correspondence between them. We propose a novel formulation to learn a set of sparse deformation components with local support on the face that, together with an original non-rigid deformation algorithm, allow the 3DMM to precisely fit unseen faces and transfer its semantic annotation. We extensively evaluated our approach, showing that it can effectively generalize to highly diverse samples and accurately establish a dense correspondence even in the presence of complex facial expressions. The accuracy of the dense registration is demonstrated by building a heterogeneous, large-scale 3DMM from more than 9,000 fully registered scans obtained by joining three large datasets.

    Explaining autonomous driving by learning end-to-end visual attention

    Increasing Video Perceptual Quality with GANs and Semantic Coding

    We have seen a rise in video-based user communication in the last year, unfortunately fueled by the spread of COVID-19. Efficient low-latency video transmission is a challenging problem, which must also cope with the segmented nature of network infrastructure, which does not always allow high throughput. Lossy video compression is a basic requirement for enabling such technology widely. While this may compromise the quality of the streamed video, recent deep learning based solutions can restore the quality of lossy compressed video. Given the very nature of video conferencing, bitrate allocation in video streaming could be driven semantically, differentiating quality between the talking subjects and the background. To date, no work has studied the restoration of semantically coded video using deep learning. In this work we show how such videos can be efficiently generated by shifting bitrate with masks derived via computer vision, and how a deep generative adversarial network can be trained to restore video quality. Our study shows that the combination of semantic coding and learning-based video restoration can provide superior results.

    Preserving low-quality video through deep learning

    Lossy video stream compression is performed to reduce bandwidth and storage requirements. Image compression is likewise needed in many circumstances. Older archives are often stored at low resolution and with a compression rate suited to the technology available when the video was created. Unfortunately, lossy compression algorithms cause artifacts. Such artifacts usually damage higher-frequency details and also add noise or novel image patterns. This phenomenon raises several issues. Low-quality images can be less pleasant to viewers, and object detection algorithms may perform worse on them. Therefore, given a perturbed version of an image, we aim to remove such artifacts and recover the original image. To do so, one must reverse the compression process through a complicated non-linear image transformation. We propose a deep neural network able to improve image quality. We show that this model can be optimized either traditionally, by directly optimizing an image similarity loss (SSIM), or using a generative adversarial approach (GAN). Our restored images have more photorealistic details than those produced by traditional image enhancement networks. Our training procedure, based on sub-patches, is novel. Moreover, we propose a novel testing protocol to evaluate restored images quantitatively. Unlike previously proposed approaches, we are able to remove artifacts generated at any quality by inferring the image quality directly from data. Human evaluation and quantitative experiments in object detection show that our GAN generates images with finer, consistent details, and that these details make a difference for both machines and humans.
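
    As a rough illustration of what an image similarity loss based on SSIM computes, the sketch below evaluates the standard SSIM formula over a single global window on flat lists of pixel values. This is a simplification chosen for brevity: practical SSIM averages the score over local windows, and the abstract does not say how the authors compute it. The function names are hypothetical; the constants 0.01 and 0.03 are the usual SSIM defaults.

```python
def global_ssim(x, y, data_range=255.0):
    """SSIM over one global window for two equal-length pixel lists."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Stabilizing constants from the usual SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_loss(restored, original):
    """Loss that decreases as the restored image approaches the original."""
    return 1.0 - global_ssim(restored, original)


original = [10, 50, 90, 130, 170, 210]
restored = [12, 47, 95, 128, 160, 215]  # hypothetical restored pixels
print(ssim_loss(restored, original))
```

    Minimizing such a loss pushes the restored image toward the original in mean, contrast, and structure, rather than only in per-pixel error.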

    Space-time Zernike Moments and Pyramid Kernel Descriptors for Action Classification

    Action recognition in videos is a relevant and challenging task in automatic semantic video analysis. Most successful approaches exploit local space-time descriptors. These descriptors are usually carefully engineered to obtain feature invariance to photometric and geometric variations. The main drawbacks of space-time descriptors are high dimensionality and low efficiency. In this paper we propose a novel descriptor based on 3D Zernike moments computed for space-time patches. Moments are by construction not redundant and therefore optimal for compactness. Given the hierarchical structure of our descriptor, we propose a novel similarity procedure that exploits this structure by comparing features as pyramids. The approach is tested on a public dataset and compared with state-of-the-art descriptors.

    Forgery detection from printed images: a tool in crime scene analysis

    The preliminary analysis of the genuineness of a photo has become, over time, the first step of any forensic examination involving images whenever their authenticity is not certain. Digital cameras have largely replaced film-based devices, yet until some years ago, in some countries, only images made from film negatives were considered fully reliable in court. There was a widespread prejudice that a digital image could never be considered legal proof because of its “inconsistent digital nature”. The forensic science community has made great efforts in this field and, over the years, different approaches have been developed to uncover possible malicious frauds, and thus to establish whether an image is authentic or, at least, to assess the probability of its “pureness”. Nowadays it is easy to manipulate digital images using powerful photo editing tools. Copy-move forgery is one of the most common ways of manipulating content in order to alter the original meaning of an image. With this technique, a portion of the image is copied and pasted one or more times elsewhere in the same image to hide something or change its real meaning. Whenever a digital image (or a printed image) is presented as evidence in court, the document should be analyzed with a forensic approach to determine whether it contains traces of manipulation. The image forensics literature offers several examples of detectors for such manipulation; among them, the most recent and effective are those based on Zernike moments and those based on the Scale Invariant Feature Transform (SIFT). In particular, the capability of SIFT to discover correspondences among similar visual contents allows forensic analysis to detect even very accurate and realistic copy-move forgeries.
In some situations, however, only the analog version of a document may be available instead of the digital one. It is interesting to ask whether tampering can be identified from a printed picture rather than its digital counterpart. Documents scanned or recaptured from print with a digital camera are widely used in a number of scenarios, from medical imaging and law enforcement to banking and daily consumer use. In this paper, therefore, the problem of identifying copy-move forgery from a printed picture is investigated. The copy-move manipulation is detected by proving the presence of copy-move patches in the scanned image using a tool named CADET (Cloned Area DETector), based on our previous methodology, which has been adapted into a version tailored to the printed-image case (e.g. choice of the minimum number of matched keypoints, size of the input image, etc.). A real murder case is presented, in which an image of a crime scene, submitted as printed documentary evidence, had been modified by the defense advisors to refute the prosecutor's theory. The goal of this paper is to experimentally investigate the set of requirements under which reliable copy-move forgery detection is possible on printed images, so that the forgery test becomes the very first step of an appropriate operational checklist.
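
    The core idea of copy-move detection, finding regions of one image that match each other, can be illustrated with a toy example. The sketch below is not CADET and does not use SIFT: it swaps in exact matching of small pixel blocks on a tiny grayscale grid, which would not survive printing and scanning. The function name and the test image are hypothetical.

```python
from collections import defaultdict


def find_duplicate_blocks(image, block=2):
    """Return pairs of top-left (row, col) positions of identical patches.

    Toy stand-in for keypoint matching: patches are compared exactly,
    and overlapping patch pairs are ignored.
    """
    height, width = len(image), len(image[0])
    seen = defaultdict(list)
    for r in range(height - block + 1):
        for c in range(width - block + 1):
            patch = tuple(tuple(image[r + i][c:c + block]) for i in range(block))
            seen[patch].append((r, c))
    pairs = []
    for locations in seen.values():
        for i in range(len(locations)):
            for j in range(i + 1, len(locations)):
                (r1, c1), (r2, c2) = locations[i], locations[j]
                # Keep only pairs whose patches do not overlap.
                if abs(r1 - r2) >= block or abs(c1 - c2) >= block:
                    pairs.append((locations[i], locations[j]))
    return pairs


# Build a 4x6 image with unique pixel values, then clone a 2x2 region.
image = [[r * 6 + c for c in range(6)] for r in range(4)]
for i in range(2):
    for j in range(2):
        image[2 + i][4 + j] = image[i][j]

print(find_duplicate_blocks(image))
```

    Exact matching also flags uniform regions such as sky or walls, which is one reason practical detectors rely on robust local features and a minimum number of matched keypoints.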