In-situ surface porosity prediction in DED (directed energy deposition) printed SS316L parts using multimodal sensor fusion
This study aims to relate the time-frequency patterns of acoustic emission (AE) and other multimodal sensor data collected in a hybrid directed energy deposition (DED) process to pore formation at high spatial (0.5 mm) and temporal (< 1 ms) resolution. By adapting an explainable AI method, LIME (Local Interpretable Model-Agnostic Explanations), certain high-frequency AE waveform signatures are attributed to two major pathways for pore formation in a DED process: spatter events and insufficient fusion between adjacent printing tracks caused by low heat input. This approach opens the exciting possibility of predicting, in real time, the presence of a pore in every voxel (0.5 mm in size) as it is printed, a major leap forward compared to prior efforts. Synchronized multimodal sensor data including force, AE, vibration and temperature were gathered while an SS316L sample was printed and subsequently machined. A deep convolutional neural network classifier was used to identify the presence of pores on a voxel surface based on time-frequency patterns (spectrograms) of the sensor data collected during the process chain. The results suggest that signals collected during DED were more sensitive than those from machining for detecting porosity in voxels (classification test accuracy of 87%). The explanations drawn from the LIME analysis suggest that the energy captured in high-frequency AE waveforms is 33% lower for porous voxels, indicating relatively weaker laser-material interaction in the melt pool and hence insufficient fusion and poor overlap between adjacent printing tracks. Porous voxels for which spatter events were prevalent during printing had about 27% higher energy content in the high-frequency AE band than other porous voxels. These AE signatures can further the understanding of pore formation from spatter and insufficient fusion.
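As a rough illustration of the kind of band-energy feature the abstract describes, the sketch below compares high-frequency spectral energy between a simulated "dense" and "porous" voxel window. The sampling rate, band limits, and signal model are illustrative assumptions, not the study's actual setup.

```python
import numpy as np

def band_energy(signal, fs, f_lo, f_hi):
    """Spectral energy of `signal` within the band [f_lo, f_hi] Hz."""
    spec = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    return spec[(freqs >= f_lo) & (freqs <= f_hi)].sum()

rng = np.random.default_rng(0)
fs = 100_000                      # hypothetical 100 kHz AE sampling rate
t = np.arange(0, 0.01, 1.0 / fs)  # one short time window per voxel

# Toy model: the porous voxel's high-frequency AE component is attenuated,
# mimicking weaker laser-material interaction in the melt pool.
dense = np.sin(2 * np.pi * 30_000 * t) + 0.1 * rng.standard_normal(t.size)
porous = 0.5 * np.sin(2 * np.pi * 30_000 * t) + 0.1 * rng.standard_normal(t.size)

e_dense = band_energy(dense, fs, 20_000, 40_000)
e_porous = band_energy(porous, fs, 20_000, 40_000)
print(e_porous < e_dense)  # True: the porous voxel carries less band energy
```

In the actual study such band energies would be one of many spectrogram features fed to the CNN classifier rather than thresholded directly.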
Intelligent Feature Extraction, Data Fusion and Detection of Concrete Bridge Cracks: Current Development and Challenges
As a common appearance defect of concrete bridges, cracks are important
indices for bridge structure health assessment. Although there has been much
research on crack identification, research on the evolution mechanism of bridge
cracks is still far from practical applications. In this paper, the
state-of-the-art research on intelligent theories and methodologies for
intelligent feature extraction, data fusion and crack detection based on
data-driven approaches is comprehensively reviewed. The research is discussed
from three aspects: the feature extraction level of the multimodal parameters
of bridge cracks, the description level and the diagnosis level of the bridge
crack damage states. We focus on previous research concerning the quantitative
characterization problems of multimodal parameters of bridge cracks and their
implementation in crack identification, while highlighting some of their major
drawbacks. In addition, the current challenges and potential future research
directions are discussed.
Comment: Published in Intelligence & Robotics; its copyright belongs to the author.
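As a concrete (if toy) example of the hand-crafted feature extraction such reviews contrast with learned, data-driven features, the snippet below computes two simple crack descriptors from a grayscale patch. The threshold value and the synthetic patch are illustrative assumptions.

```python
import numpy as np

def crack_features(img, thresh=0.5):
    """Hand-crafted crack descriptors for a grayscale patch:
    crack-pixel area ratio and a dominant gradient orientation."""
    crack = img < thresh            # cracks appear darker than concrete
    ratio = crack.mean()            # area fraction covered by crack pixels
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)          # intensity gradients peak at crack edges
    ang = np.arctan2(gy, gx)
    mask = mag > mag.mean()
    dominant = ang[mask].mean() if mask.any() else 0.0
    return ratio, dominant

# Synthetic patch: bright concrete with a dark two-pixel-wide vertical crack
patch = np.ones((32, 32))
patch[:, 15:17] = 0.0
ratio, dominant = crack_features(patch)
print(ratio)  # 2 of 32 columns are crack pixels -> 0.0625
```

Descriptors like these feed the "feature extraction level" the review discusses; deep models instead learn such filters from labeled crack images.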
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS think tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and socio-economic perspective.
The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives that measure the performance of multimedia search engines.
From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research.
Computational approaches to alleviate alarm fatigue in intensive care medicine: A systematic literature review
Patient monitoring technology has been used in the intensive care unit (ICU) for decades to guide therapy and alert staff when a vital sign leaves a predefined range. However, large numbers of technically false or clinically irrelevant alarms provoke alarm fatigue in staff, leading to desensitisation towards critical alarms. In this systematic review, we follow the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist to summarise scientific efforts aimed at developing IT systems to reduce alarm fatigue in ICUs. 69 peer-reviewed publications were included. The majority targeted the avoidance of technically false alarms, while the remainder focused on the prediction of patient deterioration or on alarm presentation. The investigated alarm types were mostly associated with heart rate or arrhythmia, followed by arterial blood pressure, oxygen saturation, and respiratory rate. Most publications focused on the development of software solutions; some addressed wearables, smartphones, or head-mounted displays for delivering alarms to staff. The most commonly used statistical models were tree-based. In conclusion, we found strong evidence that alarm fatigue can be alleviated by IT-based solutions. However, future efforts should focus more on the avoidance of clinically non-actionable alarms, which could be accelerated by improving data availability.
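A minimal sketch of one common false-alarm reduction strategy in this literature: delay-based filtering, which raises an alarm only when a vital sign stays out of range for several consecutive samples. The thresholds, hold time, and heart-rate trace below are invented for illustration.

```python
def suppress_transients(values, lo, hi, hold=3):
    """Raise an alarm only when a vital sign stays outside [lo, hi] for
    `hold` consecutive samples, discarding short artefactual spikes."""
    run = 0
    alarms = []
    for v in values:
        run = run + 1 if not (lo <= v <= hi) else 0
        alarms.append(run >= hold)
    return alarms

# Heart-rate trace with one movement artefact and one sustained bradycardia
hr = [72, 75, 180, 74, 73, 40, 41, 39, 42, 70]
flags = suppress_transients(hr, lo=50, hi=120, hold=3)
print(flags)  # the single 180 bpm spike is suppressed; the 40 bpm run alarms
```

The tree-based models the review highlights learn more nuanced versions of such rules from annotated alarm data, but the trade-off is the same: sensitivity to true events versus suppression of artefacts.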
Morphogenetic Principles of Brain Organisation in Health and Disease
Non-invasive neuroimaging methods, such as MRI, provide a window into the structure of the mammalian brain. However, despite the ubiquity of these methods, the biological interpretation of the information obtained using these tools remains elusive. In order to accurately link this macroscale data to microscale measurements, it is critical that the construct validity is high. This thesis provides novel analyses, pipelines and methods to: i) generate and validate maps of brain organisation obtained via MRI, and ii) demonstrate the utility of these methods in capturing elements of cognition and psychopathology.
First, in Chapter 1, I review some of the neuroscientific context for the new methods presented, from cytoarchitecture to gene expression to connectomes. Chapters 2-4 introduce a new method, “Morphometric Similarity Mapping”, which captures the brain organisation of an individual by mapping the relationships of multiple features of the cerebral cortex. Chapter 2 focuses on the development of the analysis pipeline and the graph theoretical features of the resulting morphometric similarity networks (MSNs), with an emphasis on reproducibility. Chapter 3 highlights the generalisability of MSNs to the macaque monkey, linking MSNs to ex vivo tract tracing experiments and presenting new tools for processing non-human imaging data; as well as evidence that MSN topography is organised by cytoarchitectonic features. Chapter 4 is focused on determining the transcriptomic correlates of MSNs using publicly available gene expression maps, and on applying MSNs to examine the relationship between brain organisation and intelligence.
Chapter 5 is dedicated to rigorous evaluation of the applicability of MSNs to measure specific disease-relevant phenotypes in 8 rare genetic disorder cohorts. This includes the validation of novel methods for utilising data from both single-cell sequencing technologies and differential gene expression experiments (in multiple tissue types) in analysing neuroimaging and bulk transcriptomic brain maps.
Chapter 6 provides a brief summary and presents some ongoing and future projects expanding on this original work. It also importantly discusses a general framework of comparing brain maps, including MSNs and gene expression, as well as other canonical maps of brain structure and function.
Altogether, this thesis presents and evaluates novel methods and applications for integrating multimodal neuroimaging data with genetic data derived from multiple tissue types and through various acquisition strategies. It also includes tools for performing these analyses in non-human primates, and pipelines for statistically comparing brain maps. These results not only provide insight into the manifestation of brain-related changes due to various components of human variation, but also provide a framework for evaluating this variation at multiple biological scales purely from non-invasive neuroimaging data.
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
This paper surveys the current state of the art in Natural Language
Generation (NLG), defined as the task of generating text or speech from
non-linguistic input. A survey of NLG is timely in view of the changes that the
field has undergone over the past decade or so, especially in relation to new
(usually data-driven) methods, as well as new applications of NLG technology.
This survey therefore aims to (a) give an up-to-date synthesis of research on
the core tasks in NLG and the architectures adopted in which such tasks are
organised; (b) highlight a number of relatively recent research topics that
have arisen partly as a result of growing synergies between NLG and other areas
of artificial intelligence; (c) draw attention to the challenges in NLG
evaluation, relating them to similar challenges faced in other areas of Natural
Language Processing, with an emphasis on different evaluation methods and the
relationships between them.
Comment: Published in Journal of AI Research (JAIR), volume 61, pp. 75-170. 118 pages, 8 figures, 1 table.
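A toy example of the rule-based, template-driven end of the NLG spectrum that the survey contrasts with newer data-driven methods: content is selected from a structured record and slotted into a surface template. The record fields and template are invented for illustration.

```python
def realise(record):
    """Minimal rule-based data-to-text realiser: picks content from a
    structured record and fills a fixed surface template."""
    trend = "rose" if record["delta"] > 0 else "fell"
    return (f"{record['city']} temperatures {trend} by "
            f"{abs(record['delta'])} degrees to {record['temp']} C.")

print(realise({"city": "Valletta", "temp": 31, "delta": 2}))
# -> Valletta temperatures rose by 2 degrees to 31 C.
```

Classic NLG architectures split this into stages (content selection, sentence planning, surface realisation); neural approaches learn the mapping from data to text end to end, at the cost of the controllability a template gives.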
Large-scale Multi-Modal Pre-trained Models: A Comprehensive Survey
With the urgent demand for generalized deep models, many pre-trained big models have been proposed, such as BERT, ViT, and GPT. Inspired by the success of these models in single domains (such as computer vision and natural language processing), multi-modal pre-trained big models have drawn increasing attention in recent years. In this work, we give a comprehensive survey of these models and hope this paper provides new insights and helps new researchers track the most cutting-edge work. Specifically, we first introduce the background of multi-modal pre-training by reviewing conventional deep learning and pre-training work in natural language processing, computer vision, and speech. Then, we introduce the task definition, key challenges, and advantages of multi-modal pre-trained models (MM-PTMs), and discuss MM-PTMs with a focus on data, objectives, network architectures, and knowledge-enhanced pre-training. After that, we introduce the downstream tasks used for the validation of large-scale MM-PTMs, including generative, classification, and regression tasks. We also give a visualization and analysis of the model parameters and results on representative downstream tasks. Finally, we point out possible research directions for this topic that may benefit future work. In addition, we maintain a continuously updated paper list for large-scale pre-trained multi-modal big models: https://github.com/wangxiao5791509/MultiModal_BigModels_Survey
Comment: Accepted by Machine Intelligence Research.
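One of the pre-training objectives such surveys cover is the symmetric contrastive (InfoNCE) loss used by CLIP-style image-text models. The NumPy sketch below is a simplified stand-in: fixed temperature, tiny batch, and precomputed embeddings in place of learned encoders.

```python
import numpy as np

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings:
    the i-th image should score highest against the i-th caption, and vice versa."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature      # scaled pairwise cosine similarities
    labels = np.arange(len(logits))         # matching pairs sit on the diagonal

    def xent(l):                            # mean cross-entropy over rows
        l = l - l.max(axis=1, keepdims=True)
        p = np.exp(l) / np.exp(l).sum(axis=1, keepdims=True)
        return -np.log(p[labels, labels]).mean()

    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
pairs = rng.standard_normal((4, 8))
# Perfectly aligned pairs score a much lower loss than a random pairing.
aligned = contrastive_loss(pairs, pairs)
random_ = contrastive_loss(pairs, rng.standard_normal((4, 8)))
print(aligned < random_)
```

Real MM-PTMs combine such contrastive objectives with masked-modeling and matching losses, trained over batches of thousands of pairs with a learnable temperature.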
Hashing for Multimedia Similarity Modeling and Large-Scale Retrieval
In recent years, the amount of multimedia data such as images, text, and video has been growing rapidly on the Internet. Motivated by this trend, this thesis is dedicated to exploiting hashing-based solutions to reveal multimedia data correlations and to support intra-media and inter-media similarity search over huge volumes of multimedia data. We start by investigating a hashing-based solution for audio-visual similarity modeling and apply it to the audio-visual sound source localization problem. We show that synchronized signals in the audio and visual modalities exhibit similar temporal changing patterns in certain feature spaces. We propose a permutation-based random hashing technique that captures the temporal order dynamics of audio and visual features by hashing them along the temporal axis into a common Hamming space. In this way, the audio-visual correlation problem is transformed into a similarity search problem in the Hamming space. Our hashing-based audio-visual similarity modeling has shown superior performance in the localization and segmentation of sounding objects in videos. The success of the permutation-based hashing method motivates us to generalize and formally define the supervised ranking-based hashing problem and to study its application to large-scale image retrieval. Specifically, we propose an effective supervised learning procedure to learn optimized ranking-based hash functions for large-scale similarity search. Compared with the randomized version, the optimized ranking-based hash codes are much more compact and discriminative. Moreover, the method can easily be extended to kernel space to discover more complex ranking structures that cannot be revealed in linear subspaces. Experiments on large image datasets demonstrate the effectiveness of the proposed method for image retrieval. We further study the ranking-based hashing method for the cross-media similarity search problem.
Specifically, we propose two optimization methods to jointly learn two groups of linear subspaces, one per media type, so that the features' ranking orders in different linear subspaces maximally preserve the cross-media similarities. Additionally, we develop this ranking-based hashing method in the cross-media context into a flexible hashing framework with a more general solution. Extensive experiments on several real-world datasets demonstrate that the proposed cross-media hashing method achieves superior cross-media retrieval performance against several state-of-the-art algorithms. Lastly, to make better use of the supervisory label information and to further improve the efficiency and accuracy of supervised hashing, we propose a novel multimedia discrete hashing framework that optimizes an instance-wise loss objective, as opposed to pairwise losses, using an efficient discrete optimization method. In addition, the proposed method decouples binary code learning and hash function learning into two separate stages, making it equally applicable to both single-media and cross-media search. Extensive experiments on both single-media and cross-media retrieval tasks demonstrate the effectiveness of the proposed method.
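The permutation/ranking-based hashing idea the thesis builds on can be illustrated with a Winner-Take-All style hash, whose codes depend only on the rank order of feature values and are compared by Hamming distance. The vectors, window size, and number of permutations here are illustrative.

```python
import random

def wta_hash(vec, perms, window=3):
    """Winner-Take-All ranking hash: for each random permutation, record
    which of the first `window` permuted entries is largest. Codes depend
    only on rank order, so any monotonic rescaling hashes identically."""
    code = []
    for p in perms:
        head = [vec[i] for i in p[:window]]
        code.append(head.index(max(head)))
    return code

def hamming(a, b):
    """Number of positions at which two codes differ."""
    return sum(x != y for x, y in zip(a, b))

random.seed(0)
dim, n_perms = 10, 16
perms = [random.sample(range(dim), dim) for _ in range(n_perms)]

x = [0.1, 0.9, 0.3, 0.7, 0.5, 0.2, 0.8, 0.4, 0.6, 0.0]
y = [2 * v + 1 for v in x]   # monotone transform: same ranking structure
z = list(reversed(x))        # different ranking structure

print(hamming(wta_hash(x, perms), wta_hash(y, perms)))      # 0: identical codes
print(hamming(wta_hash(x, perms), wta_hash(z, perms)) > 0)  # codes diverge
```

The supervised variants in the thesis replace the random permutations with learned subspaces optimized so that ranking agreement tracks semantic (and, in the cross-media case, inter-modality) similarity.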