539 research outputs found

    Shape Representations Using Nested Descriptors

    Get PDF
    The problem of shape representation is a core problem in computer vision. It can be argued that shape representation is the most central representational problem for computer vision, since unlike texture or color, shape alone can be used for perceptual tasks such as image matching, object detection and object categorization. This dissertation introduces a new shape representation called the nested descriptor. A nested descriptor represents shape both globally and locally by pooling salient scaled and oriented complex gradients in a large nested support set. We show that this nesting property introduces a nested correlation structure that enables a new local distance function called the nesting distance, which provides a provably robust similarity function for image matching. Furthermore, the nesting property suggests an elegant flower like normalization strategy called a log-spiral difference. We show that this normalization enables a compact binary representation and is equivalent to a form a bottom up saliency. This suggests that the nested descriptor representational power is due to representing salient edges, which makes a fundamental connection between the saliency and local feature descriptor literature. In this dissertation, we introduce three examples of shape representation using nested descriptors: nested shape descriptors for imagery, nested motion descriptors for video and nested pooling for activities. We show evaluation results for these representations that demonstrate state-of-the-art performance for image matching, wide baseline stereo and activity recognition tasks

    Analyzing Human-Human Interactions: A Survey

    Full text link
    Many videos depict people, and it is their interactions that inform us of their activities, relation to one another and the cultural and social setting. With advances in human action recognition, researchers have begun to address the automated recognition of these human-human interactions from video. The main challenges stem from dealing with the considerable variation in recording setting, the appearance of the people depicted and the coordinated performance of their interaction. This survey provides a summary of these challenges and datasets to address these, followed by an in-depth discussion of relevant vision-based recognition and detection methods. We focus on recent, promising work based on deep learning and convolutional neural networks (CNNs). Finally, we outline directions to overcome the limitations of the current state-of-the-art to analyze and, eventually, understand social human actions

    Promoting Phonological Awareness in Young Children through At-Home Activities: A Video Curriculum

    Get PDF
    Research relating phonological awareness, beginning reading acquisition, and parental involvement in children\u27s literacy development was read, evaluated, and summarized. A positive relationship between phonological awareness and learning to read was indicated from this review, and a correlation between parental literacy activities and children\u27s language and reading acquisition was found. Studies suggesting the existence of a developmental sequence of phonological skills were examined. The literature review provided a rationale and design for phonological awareness instruction. A research supported curriculum containing a teacher\u27s manual, take-home interactive video activities and activity sheets, and assessments was created

    Semantic multimedia analysis using knowledge and context

    Get PDF
    PhDThe difficulty of semantic multimedia analysis can be attributed to the extended diversity in form and appearance exhibited by the majority of semantic concepts and the difficulty to express them using a finite number of patterns. In meeting this challenge there has been a scientific debate on whether the problem should be addressed from the perspective of using overwhelming amounts of training data to capture all possible instantiations of a concept, or from the perspective of using explicit knowledge about the concepts’ relations to infer their presence. In this thesis we address three problems of pattern recognition and propose solutions that combine the knowledge extracted implicitly from training data with the knowledge provided explicitly in structured form. First, we propose a BNs modeling approach that defines a conceptual space where both domain related evi- dence and evidence derived from content analysis can be jointly considered to support or disprove a hypothesis. The use of this space leads to sig- nificant gains in performance compared to analysis methods that can not handle combined knowledge. Then, we present an unsupervised method that exploits the collective nature of social media to automatically obtain large amounts of annotated image regions. By proving that the quality of the obtained samples can be almost as good as manually annotated images when working with large datasets, we significantly contribute towards scal- able object detection. Finally, we introduce a method that treats images, visual features and tags as the three observable variables of an aspect model and extracts a set of latent topics that incorporates the semantics of both visual and tag information space. By showing that the cross-modal depen- dencies of tagged images can be exploited to increase the semantic capacity of the resulting space, we advocate the use of all existing information facets in the semantic analysis of social media

    Natural Language Processing in-and-for Design Research

    Full text link
    We review the scholarly contributions that utilise Natural Language Processing (NLP) methods to support the design process. Using a heuristic approach, we collected 223 articles published in 32 journals and within the period 1991-present. We present state-of-the-art NLP in-and-for design research by reviewing these articles according to the type of natural language text sources: internal reports, design concepts, discourse transcripts, technical publications, consumer opinions, and others. Upon summarizing and identifying the gaps in these contributions, we utilise an existing design innovation framework to identify the applications that are currently being supported by NLP. We then propose a few methodological and theoretical directions for future NLP in-and-for design research

    A comparison of statistical machine learning methods in heartbeat detection and classification

    Get PDF
    In health care, patients with heart problems require quick responsiveness in a clinical setting or in the operating theatre. Towards that end, automated classification of heartbeats is vital as some heartbeat irregularities are time consuming to detect. Therefore, analysis of electro-cardiogram (ECG) signals is an active area of research. The methods proposed in the literature depend on the structure of a heartbeat cycle. In this paper, we use interval and amplitude based features together with a few samples from the ECG signal as a feature vector. We studied a variety of classification algorithms focused especially on a type of arrhythmia known as the ventricular ectopic fibrillation (VEB). We compare the performance of the classifiers against algorithms proposed in the literature and make recommendations regarding features, sampling rate, and choice of the classifier to apply in a real-time clinical setting. The extensive study is based on the MIT-BIH arrhythmia database. Our main contribution is the evaluation of existing classifiers over a range sampling rates, recommendation of a detection methodology to employ in a practical setting, and extend the notion of a mixture of experts to a larger class of algorithms

    Movement Analytics: Current Status, Application to Manufacturing, and Future Prospects from an AI Perspective

    Full text link
    Data-driven decision making is becoming an integral part of manufacturing companies. Data is collected and commonly used to improve efficiency and produce high quality items for the customers. IoT-based and other forms of object tracking are an emerging tool for collecting movement data of objects/entities (e.g. human workers, moving vehicles, trolleys etc.) over space and time. Movement data can provide valuable insights like process bottlenecks, resource utilization, effective working time etc. that can be used for decision making and improving efficiency. Turning movement data into valuable information for industrial management and decision making requires analysis methods. We refer to this process as movement analytics. The purpose of this document is to review the current state of work for movement analytics both in manufacturing and more broadly. We survey relevant work from both a theoretical perspective and an application perspective. From the theoretical perspective, we put an emphasis on useful methods from two research areas: machine learning, and logic-based knowledge representation. We also review their combinations in view of movement analytics, and we discuss promising areas for future development and application. Furthermore, we touch on constraint optimization. From an application perspective, we review applications of these methods to movement analytics in a general sense and across various industries. We also describe currently available commercial off-the-shelf products for tracking in manufacturing, and we overview main concepts of digital twins and their applications

    Semantics-aware image understanding

    Get PDF
    L'abstract è presente nell'allegato / the abstract is in the attachmen
    corecore