Deep Affordance-grounded Sensorimotor Object Recognition
It is well-established by cognitive neuroscience that human perception of
objects constitutes a complex process, where object appearance information is
combined with evidence about the so-called object "affordances", namely the
types of actions that humans typically perform when interacting with them. This
fact has recently motivated the "sensorimotor" approach to the challenging task
of automatic object recognition, where both information sources are fused to
improve robustness. In this work, the aforementioned paradigm is adopted,
surpassing current limitations of sensorimotor object recognition research.
Specifically, the deep learning paradigm is introduced to the problem for the
first time, developing a number of novel neuro-biologically and
neuro-physiologically inspired architectures that utilize state-of-the-art
neural networks for fusing the available information sources in multiple ways.
The proposed methods are evaluated using a large RGB-D corpus, which is
specifically collected for the task of sensorimotor object recognition and is
made publicly available. Experimental results demonstrate the utility of
affordance information for object recognition, achieving up to a 29% relative
error reduction through its inclusion.
Comment: 9 pages, 7 figures, dataset link included, accepted to CVPR 201
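As a rough illustration of the fusion idea described above, the following is a minimal sketch, assuming PyTorch and purely illustrative feature dimensions (it is not the authors' architecture), of late fusion of an appearance stream and an affordance stream for object classification:

```python
# Hypothetical late-fusion classifier combining an appearance embedding with an
# affordance embedding; layer sizes and dimensions are illustrative only.
import torch
import torch.nn as nn

class SensorimotorFusionNet(nn.Module):
    def __init__(self, appearance_dim=512, affordance_dim=128, num_classes=10):
        super().__init__()
        self.appearance_head = nn.Sequential(nn.Linear(appearance_dim, 256), nn.ReLU())
        self.affordance_head = nn.Sequential(nn.Linear(affordance_dim, 64), nn.ReLU())
        self.classifier = nn.Linear(256 + 64, num_classes)

    def forward(self, appearance_feat, affordance_feat):
        a = self.appearance_head(appearance_feat)   # appearance evidence
        b = self.affordance_head(affordance_feat)   # action/affordance evidence
        return self.classifier(torch.cat([a, b], dim=-1))  # fused prediction

# Example: a batch of 4 objects with precomputed stream features.
logits = SensorimotorFusionNet()(torch.randn(4, 512), torch.randn(4, 128))
```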
Distributed maze exploration using multiple agents and optimal goal assignment
Robotic exploration has long captivated researchers aiming to map complex
environments efficiently. Techniques such as potential fields and frontier
exploration have traditionally been employed in this pursuit, primarily
focusing on solitary agents. Recent advancements have shifted towards
optimizing exploration efficiency through multiagent systems. However, many
existing approaches overlook critical real-world factors, such as broadcast
range limitations, communication costs, and coverage overlap. This paper
addresses these gaps by proposing a distributed maze exploration strategy
(CU-LVP) that assumes constrained broadcast ranges and utilizes Voronoi
diagrams for better area partitioning. By adapting traditional multiagent
methods to distributed environments with limited broadcast ranges, this study
evaluates their performance across diverse maze topologies, demonstrating the
efficacy and practical applicability of the proposed method. The code and
experimental results supporting this study are available in the following
repository: https://github.com/manouslinard/multiagent-exploration/.
Comment: 11 pages, 9 figures
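As a rough illustration of the Voronoi-based area partitioning mentioned above, here is a minimal sketch (a hypothetical helper, not the CU-LVP code from the repository) that assigns each free maze cell to the nearest agent by Manhattan distance:

```python
# Hypothetical Voronoi-style partitioning: each free cell is assigned to the
# nearest agent (Manhattan distance), approximating an area split computed
# before per-agent exploration. Grid and positions are illustrative only.
def voronoi_partition(free_cells, agent_positions):
    """Map each agent index to the set of cells closest to it."""
    partition = {i: set() for i in range(len(agent_positions))}
    for cell in free_cells:
        nearest = min(
            range(len(agent_positions)),
            key=lambda i: abs(cell[0] - agent_positions[i][0])
                        + abs(cell[1] - agent_positions[i][1]),
        )
        partition[nearest].add(cell)
    return partition

# Example: a 5x5 grid with two agents at opposite corners.
cells = [(r, c) for r in range(5) for c in range(5)]
print(voronoi_partition(cells, [(0, 0), (4, 4)]))
```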
Visual inspection for illicit items in X-ray images using Deep Learning
Automated detection of contraband items in X-ray images can significantly
increase public safety, by enhancing the productivity and alleviating the
mental load of security officers in airports, subways, customs/post offices,
etc. The large volume and high throughput of passengers, mailed parcels, etc.,
during rush hours practically make it a Big Data problem. Modern computer
vision algorithms relying on Deep Neural Networks (DNNs) have proven capable of
undertaking this task even under resource-constrained and embedded execution
scenarios, e.g., as is the case with fast, single-stage object detectors.
However, no comparative experimental assessment of the various relevant DNN
components/methods has been performed under a common evaluation protocol, which
means that reliable cross-method comparisons are missing. This paper presents
exactly such a comparative assessment, utilizing a relevant public dataset and
a well-defined methodology for selecting the specific DNN components/modules
that are being evaluated. The results indicate the superiority of Transformer
detectors, the obsolescence of auxiliary neural modules developed in the past
few years for security applications, and the efficiency of the CSP-DarkNet
backbone CNN.
Comment: arXiv admin note: substantial text overlap with arXiv:2305.0193
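As an illustration of the kind of common evaluation protocol such a comparison relies on, below is a minimal sketch (a generic helper, not taken from the paper) of the IoU-based matching criterion underlying standard object-detection metrics:

```python
# Hypothetical helper for a common detection evaluation protocol: IoU between
# axis-aligned boxes in (x1, y1, x2, y2) format, used to match predicted
# contraband boxes to ground truth before computing per-class metrics.
def iou(box_a, box_b):
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

# A prediction typically counts as a true positive when IoU >= 0.5 with an
# unmatched ground-truth box of the same class.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # ~0.14
```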
Multimodal Explainable Artificial Intelligence: A Comprehensive Review of Methodological Advances and Future Research Directions
The current study focuses on systematically analyzing the recent advances in
the field of Multimodal eXplainable Artificial Intelligence (MXAI). In
particular, the relevant primary prediction tasks and publicly available
datasets are initially described. Subsequently, a structured presentation of
the MXAI methods of the literature is provided, taking into account the
following criteria: a) The number of involved modalities, b) The stage at
which explanations are produced, and c) The type of the adopted methodology
(i.e., mathematical formalism). Then, the metrics used for MXAI evaluation are
discussed. Finally, a comprehensive analysis of current challenges and future
research directions is provided.
Comment: 26 pages, 11 figures
Self-supervised visual learning in the low-data regime: a comparative evaluation
Self-Supervised Learning (SSL) is a valuable and robust training methodology
for contemporary Deep Neural Networks (DNNs), enabling unsupervised pretraining
on a 'pretext task' that does not require ground-truth labels/annotation. This
allows efficient representation learning from massive amounts of unlabeled
training data, which in turn leads to increased accuracy in a 'downstream task'
by exploiting supervised transfer learning. Despite the relatively
straightforward conceptualization and applicability of SSL, it is not always
feasible to collect and/or to utilize very large pretraining datasets,
especially when it comes to real-world application settings. In particular, in
cases of specialized and domain-specific application scenarios, it may not be
achievable or practical to assemble a relevant image pretraining dataset on the
order of millions of instances, or it could be computationally infeasible to
pretrain at this scale. This motivates an investigation into the effectiveness of
common SSL pretext tasks, when the pretraining dataset is of relatively
limited/constrained size. In this context, this work introduces a taxonomy of
modern visual SSL methods, accompanied by detailed explanations and insights
regarding the main categories of approaches, and, subsequently, conducts a
thorough comparative experimental evaluation in the low-data regime, aiming to
identify: a) what is learnt via low-data SSL pretraining, and b) how different
SSL categories behave in such training scenarios. Interestingly, for
domain-specific downstream tasks, in-domain low-data SSL pretraining
outperforms the common approach of large-scale pretraining on general datasets.
Grounded in the obtained results, valuable insights are highlighted regarding
the performance of each category of SSL methods, which in turn suggest
straightforward future research directions in the field.
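As a rough illustration of SSL pretraining on a pretext task, the following is a minimal sketch assuming PyTorch and a generic rotation-prediction pretext (one possible instance, not any specific method from the taxonomy); the encoder, head, and data are all illustrative stand-ins:

```python
# Hypothetical rotation-prediction pretext: pseudo-labels are generated from
# the data itself (the applied rotation), so no manual annotation is needed.
import torch
import torch.nn as nn

def rotate_batch(images):
    """Rotate each image by a random multiple of 90 degrees; the rotation index is the pseudo-label."""
    labels = torch.randint(0, 4, (images.size(0),))
    rotated = torch.stack([torch.rot90(img, k=int(k), dims=(1, 2))
                           for img, k in zip(images, labels)])
    return rotated, labels

encoder = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten())
pretext_head = nn.Linear(16, 4)  # predicts which of the 4 rotations was applied
optimizer = torch.optim.Adam(list(encoder.parameters()) + list(pretext_head.parameters()))

images = torch.randn(8, 3, 64, 64)             # stand-in for unlabeled images
rotated, pseudo_labels = rotate_batch(images)
loss = nn.functional.cross_entropy(pretext_head(encoder(rotated)), pseudo_labels)
loss.backward()
optimizer.step()  # after pretraining, the encoder is transferred to the downstream task
```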
Knowledge-based semantic annotation and retrieval of multimedia content
aceMedia is a four-year EC part-funded FP6 Integrated Project, ending in December 2007. The project has developed tools to enable users to manage and share both personal and purchased content across PC, STB and mobile platforms. Knowledge-based analysis and ontologies have been successfully exploited in an end-to-end system to enable automated semantic annotation and retrieval of multimedia content. The paper briefly describes the objectives of aceMedia and the application of knowledge-based analysis in the project.
StatAvg: Mitigating Data Heterogeneity in Federated Learning for Intrusion Detection Systems
Federated learning (FL) is a decentralized learning technique that enables
participating devices to collaboratively build a shared Machine Learning (ML) or
Deep Learning (DL) model without revealing their raw data to a third party. Due
to its privacy-preserving nature, FL has sparked widespread attention for
building Intrusion Detection Systems (IDS) within the realm of cybersecurity.
However, the data heterogeneity across participating domains and entities
presents significant challenges for the reliable implementation of an FL-based
IDS. In this paper, we propose an effective method called Statistical Averaging
(StatAvg) to alleviate non-independently and identically distributed (non-iid)
features across local clients' data in FL. In particular, StatAvg allows the FL
clients to share their individual data statistics with the server, which then
aggregates this information to produce global statistics. The latter are shared
with the clients and used for universal data normalisation. It is worth
mentioning that StatAvg can seamlessly integrate with any FL aggregation
strategy, as it occurs before the actual FL training process. The proposed
method is evaluated against baseline approaches using datasets for network and
host Artificial Intelligence (AI)-powered IDS. The experimental results
demonstrate the efficiency of StatAvg in mitigating non-iid feature
distributions across the FL clients compared to the baseline methods.
Comment: 10 pages, 8 figures
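As a rough illustration of the described flow, here is a minimal sketch (hypothetical code using a standard pooled mean/variance combination, which may differ from the paper's exact aggregation rule) of clients sending local statistics, the server aggregating them, and clients normalising with the resulting global statistics before FL training:

```python
# Hypothetical StatAvg-style flow: per-client statistics are pooled into global
# per-feature statistics, then reused by every client for data normalisation.
import numpy as np

def local_statistics(x):
    """Each client computes the count, per-feature mean and variance of its data."""
    return len(x), x.mean(axis=0), x.var(axis=0)

def aggregate_statistics(stats):
    """Server pools client statistics into global per-feature mean/variance."""
    total = sum(n for n, _, _ in stats)
    g_mean = sum(n * m for n, m, _ in stats) / total
    g_var = sum(n * (v + m ** 2) for n, m, v in stats) / total - g_mean ** 2
    return g_mean, g_var

def normalise(x, g_mean, g_var, eps=1e-8):
    """Clients apply the same global normalisation before FL training starts."""
    return (x - g_mean) / np.sqrt(g_var + eps)

# Example: two clients with differently distributed (non-iid) features.
clients = [np.random.normal(0, 1, (100, 5)), np.random.normal(5, 3, (80, 5))]
g_mean, g_var = aggregate_statistics([local_statistics(x) for x in clients])
normalised = [normalise(x, g_mean, g_var) for x in clients]
```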
Strategies to prevent intraoperative lung injury during cardiopulmonary bypass
During open heart surgery, a series of factors such as cardiopulmonary bypass (CPB), hypothermia, the operation and anaesthesia, as well as medication and transfusion, can cause diffuse trauma in the lungs. This injury mostly leads to postoperative interstitial pulmonary oedema and abnormal gas exchange. Substantial improvements in all of the above-mentioned factors may lead to better postoperative lung function. Beneficial effects on lung function have been reported when CPB is avoided, its duration is reduced, or the extracorporeal surface area is minimized through the use of miniaturized CPB circuits. In addition, better postoperative lung function is observed when the circuit surface is replaced with biocompatible, e.g. heparin-coated, surfaces and when material-independent sources of blood activation are addressed. Meticulous myocardial protection using hypothermia and cardioplegia during ischemia and reperfusion remains one of the cornerstones of postoperative lung function. Partial restoration of pulmonary artery perfusion during CPB possibly contributes to preventing pulmonary ischemia and lung dysfunction. Medication such as corticosteroids and aprotinin, which protect the lungs during CPB, and leukocyte depletion filters for operations expected to exceed 90 minutes of CPB time appear to be protective against the toxic impact of CPB on the lungs. Newer methods of ultrafiltration used to scavenge pro-inflammatory factors also seem to protect lung function. Similarly, reducing the use of the cardiotomy suction device, as well as the contact time between free blood and the pericardium, is expected to improve postoperative lung function.
MUSiC: a model-unspecific search for new physics in proton-proton collisions at √s = 13 TeV
Results of the Model Unspecific Search in CMS (MUSiC), using proton-proton collision data recorded at the LHC at a centre-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 35.9 fb⁻¹, are presented. The MUSiC analysis searches for anomalies that could be signatures of physics beyond the standard model. The analysis is based on the comparison of observed data with the standard model prediction, as determined from simulation, in several hundred final states and multiple kinematic distributions. Events containing at least one electron or muon are classified based on their final state topology, and an automated search algorithm surveys the observed data for deviations from the prediction. The sensitivity of the search is validated using multiple methods. No significant deviations from the predictions have been observed. For a wide range of final state topologies, agreement is found between the data and the standard model simulation. This analysis complements dedicated search analyses by significantly expanding the range of final states covered using a model independent approach with the largest data set to date to probe phase space regions beyond the reach of previous general searches.
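As a rough illustration of such a deviation scan, below is a minimal, heavily simplified sketch (hypothetical; the actual MUSiC algorithm also accounts for systematic uncertainties and the look-elsewhere effect) that assigns a Poisson p-value to each event class from its observed and simulated expected counts:

```python
# Hypothetical simplified deviation scan: one Poisson p-value per event class,
# ignoring systematic uncertainties and trial-factor corrections.
from scipy.stats import poisson

def scan(event_classes):
    """event_classes: {name: (observed_count, expected_count)} -> p-values."""
    results = {}
    for name, (observed, expected) in event_classes.items():
        if observed >= expected:                      # excess of events
            p = poisson.sf(observed - 1, expected)    # P(X >= observed)
        else:                                         # deficit of events
            p = poisson.cdf(observed, expected)       # P(X <= observed)
        results[name] = p
    return results

# Example with illustrative event classes and counts.
print(scan({"1e + 2jets": (42, 30.5), "2mu + MET": (7, 11.2)}))
```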