539 research outputs found
Shape Representations Using Nested Descriptors
The problem of shape representation is a core problem in computer vision. It can be argued that shape representation is the most central representational problem for computer vision, since unlike texture or color, shape alone can be used for perceptual tasks such as image matching, object detection and object categorization.
This dissertation introduces a new shape representation called the nested descriptor. A nested descriptor represents shape both globally and locally by pooling salient scaled and oriented complex gradients in a large nested support set. We show that this nesting property introduces a nested correlation structure that enables a new local distance function called the nesting distance, which provides a provably robust similarity function for image matching. Furthermore, the nesting property suggests an elegant flower like normalization strategy called a log-spiral difference. We show that this normalization enables a compact binary representation and is equivalent to a form a bottom up saliency. This suggests that the nested descriptor representational power is due to representing salient edges, which makes a fundamental connection between the saliency and local feature descriptor literature. In this dissertation, we introduce three examples of shape representation using nested descriptors: nested shape descriptors for imagery, nested motion descriptors for video and nested pooling for activities. We show evaluation results for these representations that demonstrate state-of-the-art performance for image matching, wide baseline stereo and activity recognition tasks
Analyzing Human-Human Interactions: A Survey
Many videos depict people, and it is their interactions that inform us of
their activities, relation to one another and the cultural and social setting.
With advances in human action recognition, researchers have begun to address
the automated recognition of these human-human interactions from video. The
main challenges stem from dealing with the considerable variation in recording
setting, the appearance of the people depicted and the coordinated performance
of their interaction. This survey provides a summary of these challenges and
datasets to address these, followed by an in-depth discussion of relevant
vision-based recognition and detection methods. We focus on recent, promising
work based on deep learning and convolutional neural networks (CNNs). Finally,
we outline directions to overcome the limitations of the current
state-of-the-art to analyze and, eventually, understand social human actions
Promoting Phonological Awareness in Young Children through At-Home Activities: A Video Curriculum
Research relating phonological awareness, beginning reading acquisition, and parental involvement in children\u27s literacy development was read, evaluated, and summarized. A positive relationship between phonological awareness and learning to read was indicated from this review, and a correlation between parental literacy activities and children\u27s language and reading acquisition was found. Studies suggesting the existence of a developmental sequence of phonological skills were examined. The literature review provided a rationale and design for phonological awareness instruction. A research supported curriculum containing a teacher\u27s manual, take-home interactive video activities and activity sheets, and assessments was created
Semantic multimedia analysis using knowledge and context
PhDThe difficulty of semantic multimedia analysis can be attributed to the
extended diversity in form and appearance exhibited by the majority of
semantic concepts and the difficulty to express them using a finite number
of patterns. In meeting this challenge there has been a scientific debate
on whether the problem should be addressed from the perspective of using
overwhelming amounts of training data to capture all possible instantiations
of a concept, or from the perspective of using explicit knowledge about
the concepts’ relations to infer their presence. In this thesis we address
three problems of pattern recognition and propose solutions that combine
the knowledge extracted implicitly from training data with the knowledge
provided explicitly in structured form. First, we propose a BNs modeling
approach that defines a conceptual space where both domain related evi-
dence and evidence derived from content analysis can be jointly considered
to support or disprove a hypothesis. The use of this space leads to sig-
nificant gains in performance compared to analysis methods that can not
handle combined knowledge. Then, we present an unsupervised method
that exploits the collective nature of social media to automatically obtain
large amounts of annotated image regions. By proving that the quality of
the obtained samples can be almost as good as manually annotated images
when working with large datasets, we significantly contribute towards scal-
able object detection. Finally, we introduce a method that treats images,
visual features and tags as the three observable variables of an aspect model
and extracts a set of latent topics that incorporates the semantics of both
visual and tag information space. By showing that the cross-modal depen-
dencies of tagged images can be exploited to increase the semantic capacity
of the resulting space, we advocate the use of all existing information facets
in the semantic analysis of social media
Natural Language Processing in-and-for Design Research
We review the scholarly contributions that utilise Natural Language
Processing (NLP) methods to support the design process. Using a heuristic
approach, we collected 223 articles published in 32 journals and within the
period 1991-present. We present state-of-the-art NLP in-and-for design research
by reviewing these articles according to the type of natural language text
sources: internal reports, design concepts, discourse transcripts, technical
publications, consumer opinions, and others. Upon summarizing and identifying
the gaps in these contributions, we utilise an existing design innovation
framework to identify the applications that are currently being supported by
NLP. We then propose a few methodological and theoretical directions for future
NLP in-and-for design research
A comparison of statistical machine learning methods in heartbeat detection and classification
In health care, patients with heart problems require quick responsiveness in a clinical setting or in the operating theatre. Towards that end, automated classification of heartbeats is vital as some heartbeat irregularities are time consuming to detect. Therefore, analysis of electro-cardiogram (ECG) signals is an active area of research. The methods proposed in the literature depend on the structure of a heartbeat cycle. In this paper, we use interval and amplitude based features together with a few samples from the ECG signal as a feature vector. We studied a variety of classification algorithms focused especially on a type of arrhythmia known as the ventricular ectopic fibrillation (VEB). We compare the performance of the classifiers against algorithms proposed in the literature and make recommendations regarding features, sampling rate, and choice of the classifier to apply in a real-time clinical setting. The extensive study is based on the MIT-BIH arrhythmia database. Our main contribution is the evaluation of existing classifiers over a range sampling rates, recommendation of a detection methodology to employ in a practical setting, and extend the notion of a mixture of experts to a larger class of algorithms
Movement Analytics: Current Status, Application to Manufacturing, and Future Prospects from an AI Perspective
Data-driven decision making is becoming an integral part of manufacturing
companies. Data is collected and commonly used to improve efficiency and
produce high quality items for the customers. IoT-based and other forms of
object tracking are an emerging tool for collecting movement data of
objects/entities (e.g. human workers, moving vehicles, trolleys etc.) over
space and time. Movement data can provide valuable insights like process
bottlenecks, resource utilization, effective working time etc. that can be used
for decision making and improving efficiency.
Turning movement data into valuable information for industrial management and
decision making requires analysis methods. We refer to this process as movement
analytics. The purpose of this document is to review the current state of work
for movement analytics both in manufacturing and more broadly.
We survey relevant work from both a theoretical perspective and an
application perspective. From the theoretical perspective, we put an emphasis
on useful methods from two research areas: machine learning, and logic-based
knowledge representation. We also review their combinations in view of movement
analytics, and we discuss promising areas for future development and
application. Furthermore, we touch on constraint optimization.
From an application perspective, we review applications of these methods to
movement analytics in a general sense and across various industries. We also
describe currently available commercial off-the-shelf products for tracking in
manufacturing, and we overview main concepts of digital twins and their
applications
Semantics-aware image understanding
L'abstract è presente nell'allegato / the abstract is in the attachmen
- …