Hybrid LSTM and Encoder-Decoder Architecture for Detection of Image Forgeries
With advanced image journaling tools, one can easily alter the semantic
meaning of an image by exploiting certain manipulation techniques such as
copy-clone, object splicing, and removal, which mislead the viewers. In
contrast, the identification of these manipulations becomes a very challenging
task as manipulated regions are not visually apparent. This paper proposes a
high-confidence manipulation localization architecture which utilizes
resampling features, Long Short-Term Memory (LSTM) cells, and an encoder-decoder
network to segment out manipulated regions from non-manipulated ones.
Resampling features are used to capture artifacts like JPEG quality loss,
upsampling, downsampling, rotation, and shearing. The proposed network exploits
larger receptive fields (spatial maps) and frequency domain correlation to
analyze the discriminative characteristics between manipulated and
non-manipulated regions by incorporating an encoder and an LSTM network. Finally,
a decoder network learns the mapping from low-resolution feature maps to
pixel-wise predictions for image tamper localization. With the predicted mask
provided by the final (softmax) layer of the proposed architecture, end-to-end
training is performed to learn the network parameters through back-propagation
using ground-truth masks. Furthermore, a large image splicing dataset is
introduced to guide the training process. The proposed method is capable of
localizing image manipulations at pixel level with high precision, which is
demonstrated through rigorous experimentation on three diverse datasets.
Deep Learning for Genomics: A Concise Overview
Advancements in genomic research such as high-throughput sequencing
techniques have driven modern genomic studies into "big data" disciplines. This
data explosion is constantly challenging conventional methods used in genomics.
In parallel with the urgent demand for robust algorithms, deep learning has
succeeded in a variety of fields such as vision, speech, and text processing.
Yet genomics poses unique challenges for deep learning, since we expect deep
learning to provide a superhuman intelligence that explores beyond our knowledge
to interpret the genome. A powerful deep learning model should rely on
insightful utilization of task-specific knowledge. In this paper, we briefly
discuss the strengths of different deep learning models from a genomic
perspective so as to fit each particular task with a proper deep architecture,
and remark on practical considerations of developing modern deep learning
architectures for genomics. We also provide a concise review of deep learning
applications in various aspects of genomic research, as well as pointing out
potential opportunities and obstacles for future genomics applications.
Comment: Invited chapter for Springer Book: Handbook of Deep Learning
Application
TriPINet: Tripartite Progressive Integration Network for Image Manipulation Localization
Image manipulation localization aims at distinguishing forged regions from
the whole test image. Although many outstanding prior methods have been proposed
for this task, there are still two issues that need to be further studied: 1)
how to fuse diverse types of features with forgery clues; 2) how to
progressively integrate multistage features for better localization
performance. In this paper, we propose a tripartite progressive integration
network (TriPINet) for end-to-end image manipulation localization. First, we
extract both visually perceptible information, e.g., RGB input images, and visually
imperceptible features, e.g., frequency and noise traces, for forensic feature
learning. Second, we develop a guided cross-modality dual-attention (gCMDA)
module to fuse different types of forgery clues. Third, we design a set of
progressive integration squeeze-and-excitation (PI-SE) modules to improve
localization performance by appropriately incorporating multiscale features in
the decoder. Extensive experiments are conducted to compare our method with
state-of-the-art image forensics approaches. The proposed TriPINet obtains
competitive results on several benchmark datasets.
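The squeeze-and-excitation idea that the PI-SE modules build on can be illustrated with a minimal sketch. This is the generic squeeze-and-excitation channel gating, not the TriPINet PI-SE module itself, whose details the abstract does not give; the function name `squeeze_excite` and the weight matrices `w1` and `w2` are hypothetical, and biases are omitted for brevity.

```python
import math

def squeeze_excite(feature_maps, w1, w2):
    """Generic squeeze-and-excitation gating (illustrative sketch only).

    feature_maps: list of C channels, each an H x W nested list of floats.
    w1: hypothetical bottleneck weights, shape (C // r) x C.
    w2: hypothetical expansion weights, shape C x (C // r).
    Returns (rescaled feature maps, per-channel gates).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
         for ch in feature_maps]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch]
              for g, ch in zip(gates, feature_maps)]
    return scaled, gates
```

Each gate lies in (0, 1), so a channel is attenuated rather than amplified; a network learns `w1` and `w2` so that informative channels keep gates close to 1.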
Opportunities and obstacles for deep learning in biology and medicine
Deep learning describes a class of machine learning algorithms that are capable of combining raw inputs into layers of intermediate features. These algorithms have recently shown impressive results across a variety of domains. Biology and medicine are data-rich disciplines, but the data are complex and often ill-understood. Hence, deep learning techniques may be particularly well suited to solve problems of these fields. We examine applications of deep learning to a variety of biomedical problems (patient classification, fundamental biological processes, and treatment of patients) and discuss whether deep learning will be able to transform these tasks or if the biomedical sphere poses unique challenges. Following from an extensive literature review, we find that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art. Even though improvements over previous baselines have been modest in general, the recent progress indicates that deep learning methods will provide valuable means for speeding up or aiding human investigation. Though progress has been made linking a specific neural network's prediction to input features, understanding how users should interpret these models to make testable hypotheses about the system under study remains an open challenge. Furthermore, the limited amount of labelled data for training presents problems in some domains, as do legal and privacy constraints on work with sensitive health records. Nonetheless, we foresee deep learning enabling changes at both bench and bedside with the potential to transform several areas of biology and medicine.
Machine Learning-based Predictive Maintenance for Optical Networks
Optical networks provide the backbone of modern telecommunications by connecting the world faster than ever before. However, such networks are susceptible to several failures (e.g., optical fiber cuts, malfunctioning optical devices), which might result in degradation in the network operation, massive data loss, and network disruption. It is challenging to accurately and quickly detect and localize such failures due to the complexity of such networks, the time required to identify the fault and pinpoint it using conventional approaches, and the lack of proactive efficient fault management mechanisms. Therefore, it is highly beneficial to perform fault management in optical communication systems in order to reduce the mean time to repair, to meet service level agreements more easily, and to enhance the network reliability. In this thesis, the aforementioned challenges and needs are tackled by investigating the use of machine learning (ML) techniques for implementing efficient proactive fault detection, diagnosis, and localization schemes for optical communication systems. In particular, the adoption of ML methods for solving the following problems is explored:
- Degradation prediction of semiconductor lasers,
- Lifetime (mean time to failure) prediction of semiconductor lasers,
- Remaining useful life (the length of time a machine is likely to operate before it requires repair or replacement) prediction of semiconductor lasers,
- Optical fiber fault detection, localization, characterization, and identification for different optical network architectures,
- Anomaly detection in optical fiber monitoring.
Such ML approaches outperform the conventionally employed methods for all the investigated use cases by achieving better prediction accuracy and earlier prediction or detection capability.
Explicit Visual Prompting for Universal Foreground Segmentations
Foreground segmentation is a fundamental problem in computer vision, which
includes salient object detection, forgery detection, defocus blur detection,
shadow detection, and camouflage object detection. Previous works have
typically relied on domain-specific solutions to address accuracy and
robustness issues in those applications. In this paper, we present a unified
framework for a number of foreground segmentation tasks without any
task-specific designs. We take inspiration from the widely-used pre-training
and then prompt tuning protocols in NLP and propose a new visual prompting
model, named Explicit Visual Prompting (EVP). Unlike previous visual
prompting, which is typically a dataset-level implicit embedding, our key
insight is to make the tunable parameters focus on the explicit visual
content of each individual image, i.e., the features from frozen patch
embeddings and high-frequency components. Our method freezes a pre-trained
model and then learns task-specific knowledge using a few extra parameters.
Despite introducing only a small number of tunable parameters, EVP achieves
better performance than full fine-tuning and other parameter-efficient
fine-tuning methods. Experiments on fourteen datasets across five tasks show
that the proposed method outperforms other task-specific methods while being
considerably simpler. The proposed method scales across
different architectures, pre-trained weights, and tasks. The code is available
at: https://github.com/NiFangBaAGe/Explicit-Visual-Prompt.
Comment: arXiv admin note: substantial text overlap with arXiv:2303.1088
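As a rough illustration of what the "high-frequency components" of an image are, the sketch below removes the low-frequency part of a small grayscale image with a discrete Fourier transform. It is a generic high-pass filter under stated assumptions, not the EVP implementation: the abstract does not say how the components are computed, and the square mask, the `cutoff` parameter, and both function names are hypothetical. The naive transform is only practical for tiny inputs.

```python
import cmath

def dft2(img, inverse=False):
    """Naive 2D discrete Fourier transform (O(N^4); tiny inputs only)."""
    h, w = len(img), len(img[0])
    sign = 1 if inverse else -1
    out = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            total = 0j
            for y in range(h):
                for x in range(w):
                    angle = sign * 2 * cmath.pi * (u * y / h + v * x / w)
                    total += img[y][x] * cmath.exp(1j * angle)
            # The inverse transform carries the 1 / (h * w) normalization.
            out[u][v] = total / (h * w) if inverse else total
    return out

def high_frequency_component(img, cutoff):
    """Zero the low-frequency coefficients, then transform back.

    A coefficient counts as low-frequency when its wrapped distance from
    the zero-frequency (DC) term is at most `cutoff` on both axes.
    `cutoff` is a hypothetical parameter chosen for this sketch.
    """
    h, w = len(img), len(img[0])
    spec = dft2(img)
    for u in range(h):
        for v in range(w):
            fu = min(u, h - u)  # wrapped frequency index along rows
            fv = min(v, w - v)  # wrapped frequency index along columns
            if fu <= cutoff and fv <= cutoff:
                spec[u][v] = 0j
    return [[val.real for val in row] for row in dft2(spec, inverse=True)]
```

A constant image has no high-frequency content and maps to (numerically) zero, while a pixel-level checkerboard sits entirely at the highest frequency and passes through unchanged for small `cutoff` values.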