122 research outputs found

    On the translation-invariance of image distance metric

    Get PDF

    Predicting Audio Advertisement Quality

    Full text link
    Online audio advertising is a particular form of advertising used abundantly in online music streaming services. These platforms tend to host tens of thousands of unique audio advertisements (ads), so providing high-quality ads ensures a better user experience and results in longer user engagement. Therefore, the automatic assessment of these ads is an important step toward audio ad ranking and better audio ad creation. In this paper we propose one way to measure the quality of audio ads using a proxy metric called Long Click Rate (LCR), defined as the amount of time a user engages with the follow-up display ad (shown while the audio ad is playing) divided by the number of impressions. We then focus on predicting audio ad quality using only acoustic features such as harmony, rhythm, and timbre, extracted from the raw waveform. We discuss how the characteristics of the sound can be connected to concepts such as the clarity of the audio ad message, its trustworthiness, etc. Finally, we propose a new deep learning model for audio ad quality prediction, which outperforms the other discussed models trained on hand-crafted features. To the best of our knowledge, this is the first large-scale audio ad quality prediction study.
    Comment: WSDM '18 Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, 9 pages
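    The LCR proxy metric described in the abstract can be sketched in a few lines. This is a hypothetical illustration of the stated definition (engagement time with the follow-up display ad divided by impressions); the function name and aggregation details are assumptions, not the paper's implementation.

    ```python
    # Hypothetical sketch of the Long Click Rate (LCR) proxy metric:
    # total engagement time with the follow-up display ad, divided by the
    # number of impressions of the audio ad. Names are illustrative only.

    def long_click_rate(engagement_seconds, impressions):
        """Return LCR: summed display-ad engagement time per impression."""
        if impressions <= 0:
            raise ValueError("impressions must be positive")
        return sum(engagement_seconds) / impressions

    # Example: three logged engagements across 100 impressions of one audio ad.
    lcr = long_click_rate([12.0, 30.5, 7.5], impressions=100)
    print(round(lcr, 2))  # 0.5
    ```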

    Perceptual recognition of familiar objects in different orientations

    Get PDF
    Recent approaches to object recognition have suggested that representations are view-dependent and not object-centred, as was previously asserted by Marr (Marr and Nishihara, 1978). The exact nature of these view-centred representations, however, is not consistent across the different theories. Palmer suggested that a single canonical view represents an object in memory (Palmer et al., 1981), whereas other studies have shown that each object may have more than one view-point representation (Tarr and Pinker, 1989). A set of experiments was run to determine the nature of the visual representation in memory of rigid, familiar objects presented foveally and in peripheral vision. In the initial set of experiments, recognition times were measured for a selection of common, elongated objects rotated in increments of 30° about the three different axes and their combinations. Significant main effects of orientation were found in all experiments. This effect was attributed to the delay in recognising objects when foreshortened. Objects with strong gravitational uprights yielded the same orientation effects as objects without gravitational uprights. Recognition times for objects rotated around the picture plane were found to be independent of orientation. The results were not dependent on practice with the objects. There was no benefit found for shaded objects over silhouetted objects. The findings were highly consistent across the experiments. Four experiments were also carried out which tested the detectability of objects presented foveally among a set of similar objects. The subjects viewed an object picture (target) surrounded by eight search pictures arranged in a circular array. The task was to locate the picture-match of the target object (which was sometimes absent) as fast as possible. All of the objects had prominent elongated axes and were viewed perpendicular to this axis.
    When the object was present in the search array, it could appear in one of five orientations: in its original orientation, rotated in the picture plane by 30° or 60°, or rotated by 30° or 60° in depth. Highly consistent results were found across the four experiments. Objects rotated in depth by 60° took longer to find and were less likely to be found in the first saccade than at all other orientations. These findings were independent of the type of display (i.e. randomly rotated distractors or aligned distractors) and also of the task (matching to a picture or a name of an object). It was concluded that there was no evidence that an abstract 3-dimensional representation was used in searching for an object. The results from these experiments are compatible with the notion of multiple-view representations of objects in memory. There was no evidence found that objects were stored as single, object-centred representations. It was found that representations are initially based on the familiar views of the objects, but with practice on other views, those views which hold the maximum information about the object are stored. Novel views of objects are transformed to match these stored views, and different candidates for the transformation process are discussed.

    IEEE/NASA Workshop on Leveraging Applications of Formal Methods, Verification, and Validation

    Get PDF
    This volume contains the Preliminary Proceedings of the 2005 IEEE ISoLA Workshop on Leveraging Applications of Formal Methods, Verification, and Validation, with a special track on the theme of Formal Methods in Human and Robotic Space Exploration. The workshop was held on 23-24 September 2005 at the Loyola College Graduate Center, Columbia, MD, USA. The idea behind the workshop arose from the experience and feedback of ISoLA 2004, the 1st International Symposium on Leveraging Applications of Formal Methods, held in Paphos (Cyprus) in October-November 2004. ISoLA 2004 served the need of providing a forum for developers, users, and researchers to discuss issues related to the adoption and use of rigorous tools and methods for the specification, analysis, verification, certification, construction, test, and maintenance of systems from the point of view of their different application domains.

    State of the Art in Face Recognition

    Get PDF
    Notwithstanding the tremendous effort to solve the face recognition problem, it is not yet possible to design a face recognition system with performance close to that of humans. New computer vision and pattern recognition approaches need to be investigated. Moreover, new knowledge and perspectives from different fields, such as psychology and neuroscience, must be incorporated into the current field of face recognition to design a robust face recognition system. Indeed, many more efforts are required to arrive at a human-like face recognition system. This book is an effort to reduce the gap between the current state of face recognition research and the state still to be reached.

    Listening for context – Interpretation, abstraction & the real

    Get PDF
    This paper seeks to explore the proposed notion of ‘context-based composition’ by examining the nature of ‘real-world’ context. It does this by studying the way in which listeners interpret sounds, working towards a deeper understanding of what we mean by ‘real-world’ sound and context-based composition. These discussions are then used to explore what it means to compose context-based works, suggesting that new potentials are opened up by a closer examination of the definition of context-based composition, one which liberates itself from a concern over an absolute physical nature of sounds and which embraces the use of both abstract and referential sounds. This journey highlights the importance of memory and experience within processes of interpretation and the creation of context-based compositions, and questions divisions between the virtual and the real.

    Body objectified? Phenomenological perspective on patient objectification in teleconsultation

    Get PDF
    The global crisis of the COVID-19 pandemic has considerably accelerated the use of teleconsultation (consultation between the patient and the doctor via video platforms). While it has some obvious benefits and drawbacks for both the patient and the doctor, it is important to consider how teleconsultation impacts the quality of the patient-doctor relationship. I will approach this question through the lens of phenomenology of the body, focusing on what happens to patient objectification in teleconsultation. To answer this question I will adopt a phenomenological approach combining insights drawn from the phenomenological tradition, i.e., the concepts of the lived body and the object body, with the results of a phenomenologically informed qualitative research study on the patient experience of teleconsultation. The theoretical background against which I have developed this study comprises discussions within the field of phenomenology of medicine regarding the different sources of patient objectification within the clinical encounter and the arguments concerning the negative impact that objectification has on the quality of care. I will argue that a factor frequently identified within phenomenology of medicine as the main source of patient objectification in clinical encounters, namely the internalized gaze of the clinician, is diminished during teleconsultation, increasing the patient’s sense of agency, decreasing her sense of alienation, and opening up the possibility of a closer relationship between the patient and the health care provider, all of which lead to the transformation of the hierarchical patient-health care professional relationship.
    This research is funded by the European Regional Development Fund, University of Latvia and the State budget/Post-Doctoral Research Aid, 4th Stage/1.1.1.2/VIAA/4/20/622/Healing at a distance: phenomenological analysis of patient experience of clinical encounter in telemedicine

    Design and Development of Robotic Part Assembly System under Vision Guidance

    Get PDF
    Robots are widely used for part assembly across manufacturing industries to attain high productivity through automation. The automated mechanical part assembly system contributes a major share of the production process. An appropriate vision-guided robotic assembly system further minimizes the lead time and improves the quality of the end product through suitable object detection methods and robot control strategies. An approach to the development of a robotic part assembly system with the aid of an industrial vision system is presented. This approach is accomplished in three phases. The first phase of the research is focused on feature extraction and object detection techniques. A hybrid edge detection method is developed by combining fuzzy inference rules and the wavelet transform. The performance of this edge detector is quantitatively analysed and compared with widely used edge detectors such as Canny, Sobel, Prewitt, Roberts, Laplacian of Gaussian, and mathematical-morphology- and wavelet-transform-based methods. A comparative study is performed to choose a suitable corner detection method; the corner detection techniques considered are the curvature scale space, Wang-Brady, and Harris methods. The successful implementation of a vision-guided robotic system depends on the system configuration, such as eye-in-hand or eye-to-hand. In these configurations, the captured images of the parts may be corrupted by geometric transformations such as scaling, rotation, translation, and blurring due to camera or robot motion. Considering this issue, an image reconstruction method is proposed using orthogonal Zernike moment invariants. The suggested method uses a selection process over moment orders to reconstruct the affected image, which keeps the object detection method efficient. In the second phase, the proposed system is developed by integrating the vision system and the robot system.
    The proposed feature extraction and object detection methods are tested and found efficient for the purpose. In the third phase, robot navigation based on visual feedback is proposed. In the control scheme, general moment invariants, Legendre moments, and Zernike moment invariants are used. The best combination of visual features is selected by measuring the Hamming distance between all possible combinations of visual features; this yields the combination that makes the image-based visual servoing control efficient. An indirect method is employed to determine the moment invariants for the Legendre and Zernike moments, which are used because they are robust to noise. The control laws, based on these three global image features, perform efficiently in navigating the robot in the desired environment.
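    The combinatorial feature-selection step mentioned above can be sketched as follows. This is a loose, hypothetical reading of the abstract: candidate visual-feature subsets are encoded as bitmasks, pairwise Hamming distances are computed over all subsets, and the most dissimilar pair is kept. The encoding and the selection criterion are assumptions; the thesis does not spell out its exact procedure.

    ```python
    # Hypothetical sketch: enumerate subsets of candidate visual features as
    # bitmasks and compare them by Hamming distance. Feature names are
    # illustrative, taken from the moments mentioned in the abstract.
    from itertools import combinations

    FEATURES = ["general_moment", "legendre_moment", "zernike_moment"]

    def hamming(a: int, b: int) -> int:
        """Number of differing bits between two subset bitmasks."""
        return bin(a ^ b).count("1")

    def most_dissimilar_pair(n_features: int):
        """Return the pair of non-empty feature subsets with maximal Hamming distance."""
        subsets = range(1, 1 << n_features)  # all non-empty bitmasks
        return max(combinations(subsets, 2), key=lambda p: hamming(*p))

    a, b = most_dissimilar_pair(len(FEATURES))
    print(hamming(a, b))  # 3: the two subsets differ in every feature
    ```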