Comprehensive Security Framework for Global Threats Analysis
Cybercriminal activities are changing and becoming increasingly professional. With the growth of financial flows through the Internet and the Information System (IS), new kinds of threats arise, involving complex scenarios spread across multiple IS components. IS information modeling and Behavioral Analysis are emerging as solutions to normalize IS information and counter these new threats. This paper presents a framework that details the principal and necessary steps for monitoring an IS. We present the architecture of the framework, i.e. an ontology of activities carried out within an IS to model security information, and User Behavioral Analysis. The results of experiments performed on real data show that the modeling reduces the number of events by 91%. The User Behavioral Analysis on uniformly modeled data is also effective, detecting more than 80% of legitimate actions of attack scenarios.
Context-Aware Mixed Reality: A Framework for Ubiquitous Interaction
Mixed Reality (MR) is a powerful interactive technology that yields new types
of user experience. We present a semantic-based interactive MR framework that
exceeds the current geometry-level approaches, a step change in generating
high-level context-aware interactions. Our key insight is to build semantic
understanding in MR that not only can greatly enhance user experience through
object-specific behaviours, but also pave the way for solving complex
interaction design challenges. The framework generates semantic properties of
the real world environment through dense scene reconstruction and deep image
understanding. We demonstrate our approach with a material-aware prototype
system for generating context-aware physical interactions between the real and
the virtual objects. Quantitative and qualitative evaluations are carried out
and the results show that the framework delivers accurate and fast semantic
information in an interactive MR environment, providing effective semantic-level
interactions.
Visual Affordance and Function Understanding: A Survey
Nowadays, robots are dominating the manufacturing, entertainment and
healthcare industries. Robot vision aims to equip robots with the ability to
discover information, understand it and interact with the environment. These
capabilities require an agent to effectively understand object affordances and
functionalities in complex visual domains. In this literature survey, we first
focus on Visual affordances and summarize the state of the art as well as open
problems and research gaps. Specifically, we discuss sub-problems such as
affordance detection, categorization, segmentation and high-level reasoning.
Furthermore, we cover functional scene understanding and the prevalent
functional descriptors used in the literature. The survey also provides
necessary background to the problem, sheds light on its significance and
highlights the existing challenges for affordance and functionality learning.
Comment: 26 pages, 22 images
Query-free Clothing Retrieval via Implicit Relevance Feedback
Image-based clothing retrieval is receiving increasing interest with the
growth of online shopping. In practice, users may often have a desired piece of
clothing in mind (e.g., either having seen it before on the street or requiring
certain specific clothing attributes) but may be unable to supply an image as a
query. We model this problem as a new type of image retrieval task in which the
target image resides only in the user's mind (called "mental image retrieval"
hereafter). Because of the absence of an explicit query image, we propose to
solve this problem through relevance feedback. Specifically, a new Bayesian
formulation is proposed that simultaneously models the retrieval target and its
high-level representation in the mind of the user (called the "user metric"
hereafter) as posterior distributions of pre-fetched shop images and
heterogeneous features extracted from multiple clothing attributes,
respectively. Requiring only clicks as user feedback, the proposed algorithm is
able to account for the variability in human decision-making. Experiments with
real users demonstrate the effectiveness of the proposed algorithm.
Comment: 12 pages, under review at IEEE Transactions on Multimedia
Tracking by Prediction: A Deep Generative Model for Multi-Person localisation and Tracking
Current multi-person localisation and tracking systems have an over-reliance
on the use of appearance models for target re-identification and almost no
approaches employ a complete deep learning solution for both objectives. We
present a novel, complete deep learning framework for multi-person localisation
and tracking. In this context, we first introduce a lightweight sequential
Generative Adversarial Network architecture for person localisation, which
overcomes issues related to occlusions and noisy detections, typically found in
a multi-person environment. In the proposed tracking framework, we build upon
recent advances in pedestrian trajectory prediction approaches and propose a
novel data association scheme based on predicted trajectories. This removes the
need for computationally expensive person re-identification systems based on
appearance features and generates human-like trajectories with minimal
fragmentation. The proposed method is evaluated on multiple public benchmarks
including both static and dynamic cameras and is capable of generating
outstanding performance, especially among other recently proposed deep neural
network-based approaches.
Comment: To appear in IEEE Winter Conference on Applications of Computer
Vision (WACV), 201
Language Bootstrapping: Learning Word Meanings From Perception-Action Association
We address the problem of bootstrapping language acquisition for an
artificial system similarly to what is observed in experiments with human
infants. Our method works by associating meanings to words in manipulation
tasks, as a robot interacts with objects and listens to verbal descriptions of
the interactions. The model is based on an affordance network, i.e., a mapping
between robot actions, robot perceptions, and the perceived effects of these
actions upon objects. We extend the affordance model to incorporate spoken
words, which allows us to ground the verbal symbols to the execution of actions
and the perception of the environment. The model takes verbal descriptions of a
task as the input and uses temporal co-occurrence to create links between
speech utterances and the involved objects, actions, and effects. We show that
the robot is able to form useful word-to-meaning associations, even without
considering grammatical structure in the learning process and in the presence
of recognition errors. These word-to-meaning associations are embedded in the
robot's own understanding of its actions. Thus, they can be directly used to
instruct the robot to perform tasks and also allow context to be incorporated in
the speech recognition task. We believe that the encouraging results with our
approach may afford robots the capacity to acquire language descriptors in
their operating environment, as well as shed some light on how this
challenging process develops in human infants.
Comment: code available at
https://github.com/giampierosalvi/AffordancesAndSpeec
Crowded Scene Analysis: A Survey
Automated scene analysis has been a topic of great interest in computer
vision and cognitive science. Recently, with the growth of crowd phenomena in
the real world, crowded scene analysis has attracted much attention. However,
the visual occlusions and ambiguities in crowded scenes, as well as the complex
behaviors and scene semantics, make the analysis a challenging task. In the
past few years, an increasing number of works on crowded scene analysis have
been reported, covering different aspects including crowd motion pattern
learning, crowd behavior and activity analysis, and anomaly detection in
crowds. This paper surveys the state-of-the-art techniques on this topic. We
first provide the background knowledge and the available features related to
crowded scenes. Then, existing models, popular algorithms, evaluation
protocols, as well as system performance are provided corresponding to
different aspects of crowded scene analysis. We also outline the available
datasets for performance evaluation. Finally, some research problems and
promising future directions are presented with discussions.
Comment: 20 pages in IEEE Transactions on Circuits and Systems for Video
Technology, 201
SERKET: An Architecture for Connecting Stochastic Models to Realize a Large-Scale Cognitive Model
To realize human-like robot intelligence, a large-scale cognitive
architecture is required for robots to understand the environment through a
variety of sensors with which they are equipped. In this paper, we propose a
novel framework named Serket that makes it easy to construct a large-scale
generative model and perform its inference by connecting sub-modules, allowing
robots to acquire various capabilities through interaction with their
environments and others. We consider that large-scale cognitive models can be
constructed by connecting smaller fundamental models hierarchically while
maintaining their programmatic independence. Moreover, connected modules are
dependent on each other, and parameters are required to be optimized as a
whole. Conventionally, the equations for parameter estimation have to be
derived and implemented depending on the models. However, it becomes harder to
derive and implement those of a larger scale model. To solve these problems, in
this paper, we propose a method for parameter estimation by communicating the
minimal parameters between various modules while maintaining their programmatic
independence. Therefore, Serket makes it easy to construct large-scale models
and estimate their parameters via the connection of modules. Experimental
results demonstrated that the model can be constructed by connecting modules,
the parameters can be optimized as a whole, and they are comparable with the
original models that we have proposed.
Conversation as Action Under Uncertainty
Conversations abound with uncertainties of various kinds. Treating
conversation as inference and decision making under uncertainty, we propose a
task-independent, multimodal architecture for supporting robust continuous
spoken dialog called Quartet. We introduce four interdependent levels of
analysis, and describe representations, inference procedures, and decision
strategies for managing uncertainties within and between the levels. We
highlight the approach by reviewing interactions between a user and two spoken
dialog systems developed using the Quartet architecture: Presenter, a prototype
system for navigating Microsoft PowerPoint presentations, and the Bayesian
Receptionist, a prototype system for dealing with tasks typically handled by
front desk receptionists at the Microsoft corporate campus.
Comment: Appears in Proceedings of the Sixteenth Conference on Uncertainty in
Artificial Intelligence (UAI2000)
Computational models: Bottom-up and top-down aspects
Computational models of visual attention have become popular over the past
decade, we believe primarily for two reasons: first, models make testable
predictions that can be explored by experimentalists as well as theoreticians;
second, models have practical and technological applications of interest to the
applied science and engineering communities. In this chapter, we take a
critical look at recent attention modeling efforts. We focus on {\em
computational models of attention} as defined by Tsotsos \& Rothenstein
\shortcite{Tsotsos_Rothenstein11}: Models which can process any visual stimulus
(typically, an image or video clip), which can possibly also be given some task
definition, and which make predictions that can be compared to human or animal
behavioral or physiological responses elicited by the same stimulus and task.
Thus, we here place less emphasis on abstract models, phenomenological models,
purely data-driven fitting or extrapolation models, or models specifically
designed for a single task or for a restricted class of stimuli. For
theoretical models, we refer the reader to a number of previous reviews that
address attention theories and models more generally
\cite{Itti_Koch01nrn,Paletta_etal05,Frintrop_etal10,Rothenstein_Tsotsos08,Gottlieb_Balan10,Toet11,Borji_Itti12pami}
- …