textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior
We address two challenges of probabilistic topic modelling in order to better
estimate the probability of a word in a given context, i.e., P(word|context):
(1) No Language Structure in Context: probabilistic topic models (TMs) ignore
word order by summarizing a given context as a "bag-of-words", and consequently
the semantics of the words in the context are lost. An LSTM language model
(LSTM-LM), in contrast, learns a vector-space representation of each word by
accounting for word order in local collocation patterns and models complex
characteristics of language (e.g., syntax and semantics), while the TM
simultaneously learns a latent representation of the entire document and
discovers its underlying thematic structure. We unite these two complementary
paradigms of learning the meaning of word occurrences by combining a TM (e.g.,
DocNADE) and a LM in a unified probabilistic framework, named ctx-DocNADE.
(2) Limited Context and/or a Small Training Corpus: in settings with few word
occurrences (i.e., lack of context) in short texts, or data sparsity in a
corpus of few documents, applying TMs is challenging. We address this challenge
by incorporating external knowledge into neural autoregressive topic models via
a language-modelling approach: we use word embeddings as input to an LSTM-LM
with the aim of improving the word-topic mapping on a smaller and/or short-text
corpus. The proposed DocNADE extension is named ctx-DocNADEe.
We present novel neural autoregressive topic model variants coupled with
neural LMs and embedding priors that consistently outperform state-of-the-art
generative TMs in terms of generalization (perplexity), interpretability (topic
coherence), and applicability (retrieval and classification) over 6 long-text
and 8 short-text datasets from diverse domains.
Comment: Published at ICLR 2019 (International Conference on Learning
Representations).
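A minimal sketch (in PyTorch) of the fusion idea described above: an LSTM-LM
hidden state, which captures local word order, is combined with a
document-level topic representation before the next-word softmax. The layer
sizes, the sigmoid bag-of-words topic encoder, and the additive mixing are
illustrative assumptions, not the paper's exact DocNADE formulation.

```python
# Sketch: fuse an LSTM-LM's local view with a global topic vector.
import torch
import torch.nn as nn

class CtxTopicLM(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=128, hidden_dim=256, n_topics=50):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Topic encoder over the document's bag-of-words (assumed form).
        self.topic_enc = nn.Linear(vocab_size, n_topics)
        self.topic_proj = nn.Linear(n_topics, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, bow):
        # tokens: (batch, seq_len) word ids; bow: (batch, vocab) counts.
        h_lm, _ = self.lstm(self.embed(tokens))           # local word order
        h_topic = torch.sigmoid(self.topic_enc(bow))      # global theme
        h = h_lm + self.topic_proj(h_topic).unsqueeze(1)  # fuse both views
        return self.out(h)                                # next-word logits

model = CtxTopicLM()
tokens = torch.randint(0, 5000, (2, 10))
bow = torch.zeros(2, 5000).scatter_(1, tokens, 1.0)
logits = model(tokens, bow)   # (2, 10, 5000)
```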
Robust Spatial Filtering with Graph Convolutional Neural Networks
Convolutional Neural Networks (CNNs) have recently led to incredible
breakthroughs on a variety of pattern recognition problems. Banks of finite
impulse response filters are learned on a hierarchy of layers, each
contributing more abstract information than the previous layer. The simplicity
and elegance of the convolutional filtering process make CNNs a natural fit for
structured problems such as image, video, or voice, where vertices are
homogeneous in the sense of number, location, and strength of neighbors. The
vast majority of classification problems, for example in the pharmaceutical,
homeland security, and financial domains, are unstructured. As these problems
are formulated into unstructured graphs, the heterogeneity of these problems,
such as number of vertices, number of connections per vertex, and edge
strength, cannot be tackled with standard convolutional techniques. We propose
a novel neural learning framework that is capable of handling both homogeneous
and heterogeneous data, while retaining the benefits of traditional CNN
successes.
Recently, researchers have proposed variations of CNNs that can handle graph
data. In an effort to create learnable filter banks of graphs, these methods
either induce constraints on the data or require preprocessing. As opposed to
spectral methods, our framework, which we term Graph-CNNs, defines filters as
polynomials of functions of the graph adjacency matrix. Graph-CNNs can handle
both heterogeneous and homogeneous graph data, including graphs having entirely
different vertex or edge sets. We perform experiments to validate the
applicability of Graph-CNNs to a variety of structured and unstructured
classification problems and demonstrate state-of-the-art results on document
and molecule classification problems.
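The filters-as-polynomials idea above admits a compact sketch: a layer computes
H_out = sum_k A^k X W_k, where A is the graph adjacency matrix and the W_k are
learned. The polynomial degree, feature sizes, and toy graph below are
illustrative assumptions.

```python
# Sketch: a graph filter defined as a polynomial of the adjacency matrix.
import torch
import torch.nn as nn

class PolyGraphConv(nn.Module):
    def __init__(self, in_dim, out_dim, degree=2):
        super().__init__()
        self.weights = nn.ModuleList(
            [nn.Linear(in_dim, out_dim, bias=False) for _ in range(degree + 1)]
        )

    def forward(self, A, X):
        # A: (n, n) adjacency; X: (n, in_dim) vertex features.
        out, Ak = 0, torch.eye(A.size(0))
        for W in self.weights:
            out = out + W(Ak @ X)   # contribution of the k-hop term
            Ak = A @ Ak             # next power of A
        return out

A = torch.rand(6, 6)
A = ((A + A.T) / 2 > 0.5).float()   # toy symmetric adjacency
X = torch.randn(6, 8)
H = PolyGraphConv(8, 16)(A, X)      # (6, 16)
```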
Efficient Gender Classification Using a Deep LDA-Pruned Net
Many real-time tasks, such as human-computer interaction, require fast and
efficient facial gender classification. Although deep CNNs have been very
effective for a multitude of classification tasks, their high space and time
demands make them impractical for personal computers and mobile devices without
a powerful GPU. In this paper, we develop a 16-layer, yet lightweight, neural
network which boosts efficiency while maintaining high accuracy. Our net is
pruned from the VGG-16 model starting from the last convolutional (conv) layer
where we find that neuron activations are highly decorrelated given the gender.
Through Fisher's Linear Discriminant Analysis (LDA), we show that this high
decorrelation makes it safe to directly discard last-conv-layer neurons with
high within-class variance and low between-class variance. Combined with either
Support Vector Machines (SVM) or Bayesian classification, the reduced CNNs
achieve accuracies on the LFW and CelebA datasets comparable to (or even higher
than) those of the original net with fully connected layers. On LFW, only four
Conv5_3 neurons suffice to maintain a comparably high recognition accuracy,
reducing the total network size by a factor of 70 with an 11-fold speedup.
Comparisons with a state-of-the-art pruning method as well as two smaller nets,
in terms of accuracy loss and convolutional-layer pruning rate, are also
provided.
Comment: The only difference from the previous version (v2) is the title on the
arXiv page. I am changing it back to the original title of v1 because otherwise
Google Scholar cannot track the citations to this arXiv paper correctly. You
could cite either the conference version or this arXiv version; they are
equivalent.
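The pruning criterion lends itself to a short sketch: score each
last-conv-layer neuron by Fisher's ratio of between-class to within-class
variance, keep the top-scoring few, and train an SVM (here scikit-learn's
LinearSVC) on the surviving activations. The random data stands in for real
activations, and the exact scoring details are assumptions.

```python
# Sketch: rank neurons by Fisher's discriminant ratio, prune, then SVM.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
acts = rng.normal(size=(200, 512))        # stand-in activations (samples, neurons)
labels = rng.integers(0, 2, size=200)     # binary gender labels

mu = acts.mean(axis=0)
between = np.zeros(acts.shape[1])
for c in (0, 1):
    cls = acts[labels == c]
    between += len(cls) * (cls.mean(axis=0) - mu) ** 2      # between-class scatter
within = sum(acts[labels == c].var(axis=0) * (labels == c).sum() for c in (0, 1))
ratios = between / (within + 1e-8)        # Fisher ratio per neuron

keep = np.argsort(ratios)[-4:]            # e.g. retain only 4 neurons
svm = LinearSVC().fit(acts[:, keep], labels)
print("train accuracy:", svm.score(acts[:, keep], labels))
```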
BranchConnect: Large-Scale Visual Recognition with Learned Branch Connections
We introduce an architecture for large-scale image categorization that
enables the end-to-end learning of separate visual features for the different
classes to distinguish. The proposed model consists of a deep CNN shaped like a
tree. The stem of the tree includes a sequence of convolutional layers common
to all classes. The stem then splits into multiple branches implementing
parallel feature extractors, which are ultimately connected to the final
classification layer via learned gated connections. These learned gates
determine for each individual class the subset of features to use. Such a
scheme naturally encourages the learning of a heterogeneous set of specialized
features through the separate branches, and it allows each class to use the
subset of features that is optimal for its recognition. We show the generality
of our proposed method by reshaping several popular CNNs from the literature
into our proposed architecture. Our experiments on the CIFAR100, CIFAR10, and
Synth datasets show that in each case our resulting model yields a substantial
improvement in accuracy over the original CNN. Our empirical analysis also
suggests that our scheme acts as a form of beneficial regularization improving
generalization performance.
Comment: WACV 201
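The gating mechanism can be sketched as one learnable gate per (class, branch)
pair that weights how much each branch contributes to each class logit. The
stem is omitted, and the branch bodies, branch count, and sigmoid gating are
illustrative assumptions rather than the paper's exact design.

```python
# Sketch: per-class gated connections over parallel branches.
import torch
import torch.nn as nn

class BranchGate(nn.Module):
    def __init__(self, in_dim=64, n_branches=4, n_classes=10):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, n_classes))
             for _ in range(n_branches)]
        )
        # One learnable gate logit per (class, branch) pair.
        self.gates = nn.Parameter(torch.zeros(n_classes, n_branches))

    def forward(self, stem_features):
        # Stack branch scores: (batch, n_classes, n_branches).
        scores = torch.stack([b(stem_features) for b in self.branches], dim=-1)
        return (torch.sigmoid(self.gates) * scores).sum(dim=-1)  # gated sum

logits = BranchGate()(torch.randn(8, 64))   # (8, 10)
```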
DecomposeMe: Simplifying ConvNets for End-to-End Learning
Deep learning and convolutional neural networks (ConvNets) have been
successfully applied to most relevant tasks in the computer vision community.
However, these networks are computationally demanding and not suitable for
embedded devices, where memory and time consumption are critical constraints.
In this paper, we propose DecomposeMe, a simple but effective technique to
learn features using 1D convolutions. The proposed architecture enables both
simplicity and filter sharing, leading to increased learning capacity. A
comprehensive set of large-scale experiments on ImageNet and Places2
demonstrates the ability of our method to improve performance while
significantly reducing the number of parameters required. Notably, on Places2,
we obtain an improvement in relative top-1 classification accuracy of 7.7%
with an architecture that requires 92% fewer parameters than VGG-B. The
proposed network is also demonstrated to generalize to other tasks by
converting existing networks.
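The underlying decomposition can be sketched by replacing a k x k convolution
with a 1 x k convolution followed by a k x 1 convolution; the channel sizes and
the nonlinearity between the two are assumptions.

```python
# Sketch: a k x k conv decomposed into a pair of 1D convs.
import torch
import torch.nn as nn

def decomposed_conv(in_ch, out_ch, k=3):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=(1, k), padding=(0, k // 2)),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=(k, 1), padding=(k // 2, 0)),
        nn.ReLU(inplace=True),
    )

block = decomposed_conv(64, 64)
y = block(torch.randn(1, 64, 32, 32))   # (1, 64, 32, 32)
# For C -> C channels, a k x k conv needs C*C*k*k weights; the pair needs
# 2*C*C*k, a ~k/2 reduction (here 3/2) at the same receptive field.
```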
p-FP: Extraction, Classification, and Prediction of Website Fingerprints with Deep Learning
Recent advances in learning Deep Neural Network (DNN) architectures have
received a great deal of attention due to their ability to outperform
state-of-the-art classifiers across a wide range of applications, with little
or no feature engineering. In this paper, we broadly study the applicability of
deep learning to website fingerprinting. We show that unsupervised DNNs can be
used to extract low-dimensional feature vectors that improve the performance of
state-of-the-art website fingerprinting attacks. When used as classifiers, we
show that they can match or exceed the performance of existing attacks across a
range of application scenarios, including fingerprinting Tor website traces,
fingerprinting search engine queries over Tor, defeating fingerprinting
defenses, and fingerprinting TLS-encrypted websites. Finally, we show that DNNs
can be used to predict the fingerprintability of a website based on its
contents, achieving 99% accuracy on a data set of 4500 website downloads.
Comment: Under submission.
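A minimal sketch of the unsupervised feature-extraction step: an autoencoder
compresses a fixed-length traffic trace into a low-dimensional vector that any
downstream fingerprinting classifier can consume. The trace length, dimensions,
and plain MLP autoencoder are illustrative assumptions.

```python
# Sketch: learn low-dimensional trace features by reconstruction.
import torch
import torch.nn as nn

trace_len, feat_dim = 500, 32
encoder = nn.Sequential(nn.Linear(trace_len, 128), nn.ReLU(), nn.Linear(128, feat_dim))
decoder = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, trace_len))

traces = torch.randn(64, trace_len)           # stand-in for packet traces
opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)
for _ in range(100):                          # reconstruction training
    opt.zero_grad()
    loss = nn.functional.mse_loss(decoder(encoder(traces)), traces)
    loss.backward()
    opt.step()

features = encoder(traces).detach()           # low-dimensional fingerprints
# `features` can now feed any fingerprinting classifier (SVM, k-NN, MLP).
```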
Scale-Adaptive Neural Dense Features: Learning via Hierarchical Context Aggregation
How do computers and intelligent agents view the world around them? Feature
extraction and representation constitute one of the basic building blocks towards
answering this question. Traditionally, this has been done with carefully
engineered hand-crafted techniques such as HOG, SIFT or ORB. However, there is
no "one size fits all" approach that satisfies all requirements. In recent
years, the rising popularity of deep learning has resulted in a myriad of
end-to-end solutions to many computer vision problems. These approaches, while
successful, tend to lack scalability and cannot easily exploit information
learned by other systems. Instead, we propose SAND features, a dedicated deep
learning solution to feature extraction capable of providing hierarchical
context information. This is achieved by employing sparse relative labels
indicating relationships of similarity/dissimilarity between image locations.
The nature of these labels results in an almost infinite set of dissimilar
examples to choose from. We demonstrate how the selection of negative examples
during training can be used to modify the feature space and vary its
properties. To demonstrate the generality of this approach, we apply the
proposed features to a multitude of tasks, each requiring different properties.
This includes disparity estimation, semantic segmentation, self-localisation
and SLAM. In all cases, we show how incorporating SAND features results in
better or comparable results to the baseline, whilst requiring little to no
additional training. Code can be found at:
https://github.com/jspenmar/SAND_features
Comment: CVPR201
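The sparse relative labels can be sketched as a hinge-style contrastive loss
over sampled pixel pairs: features of "similar" locations are pulled together,
"dissimilar" ones pushed apart by a margin. The backbone, margin, and pair
sampling below are illustrative assumptions.

```python
# Sketch: dense features trained from similar/dissimilar pixel pairs.
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Conv2d(3, 16, 3, padding=1)     # stand-in dense feature net

def pair_loss(feats, pos_pairs, neg_pairs, margin=1.0):
    # feats: (C, H, W); each pair is ((y1, x1), (y2, x2)) pixel locations.
    def d(p):
        (y1, x1), (y2, x2) = p
        return (feats[:, y1, x1] - feats[:, y2, x2]).norm()
    pos = sum(d(p) ** 2 for p in pos_pairs)                   # pull together
    neg = sum(F.relu(margin - d(p)) ** 2 for p in neg_pairs)  # push apart
    return (pos + neg) / (len(pos_pairs) + len(neg_pairs))

feats = backbone(torch.randn(1, 3, 64, 64))[0]
loss = pair_loss(feats, [((5, 5), (5, 6))], [((5, 5), (40, 40))])
loss.backward()
```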
Surface Defect Classification in Real-Time Using Convolutional Neural Networks
Surface inspection systems are an important application domain for computer
vision, as they are used for defect detection and classification in the
manufacturing industry. Existing systems use hand-crafted features which
require extensive domain knowledge to create. Even though convolutional neural
networks (CNNs) have proven successful in many large-scale challenges,
industrial inspection systems have barely realized their potential due to
two significant challenges: real-time processing speed requirements and
specialized narrow domain-specific datasets which are sometimes limited in
size. In this paper, we propose CNN models that are specifically designed to
handle capacity and real-time speed requirements of surface inspection systems.
To train and evaluate our network models, we created a surface image dataset
containing more than 22000 labeled images with many types of surface materials
and achieved 98.0% accuracy in binary defect classification. To solve the class
imbalance problem in our datasets, we introduce neural data augmentation
methods which are also applicable to similar domains that suffer from the same
problem. Our results show that deep-learning-based methods are feasible for use
in surface inspection systems and outperform traditional methods in both
accuracy and inference time by considerable margins.
Comment: Supplementary material will follow.
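A minimal sketch of the two ingredients named above: a compact CNN sized for
real-time inference and a class-weighted loss against imbalance. Depth, widths,
and the 1:9 weighting are illustrative assumptions, not the paper's models.

```python
# Sketch: small real-time CNN + class-weighted loss for rare defects.
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),    # small and fast
    nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 2),                                      # ok / defect
)

# Weight the rare "defect" class more heavily (assumed 1:9 imbalance).
loss_fn = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 9.0]))
images = torch.randn(16, 1, 128, 128)          # stand-in surface patches
labels = torch.randint(0, 2, (16,))
loss = loss_fn(net(images), labels)
loss.backward()
```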
Automated Website Fingerprinting through Deep Learning
Several studies have shown that the network traffic that is generated by a
visit to a website over Tor reveals information specific to the website through
the timing and sizes of network packets. By capturing traffic traces between
users and their Tor entry guard, a network eavesdropper can leverage this
meta-data to reveal which website Tor users are visiting. The success of such
attacks heavily depends on the particular set of traffic features that are used
to construct the fingerprint. Typically, these features are manually engineered
and, as such, any change introduced to the Tor network can render these
carefully constructed features ineffective. In this paper, we show that an
adversary can automate the feature engineering process, and thus automatically
deanonymize Tor traffic by applying our novel method based on deep learning. We
collect a dataset comprising more than three million network traces, which is
the largest dataset of web traffic ever used for website fingerprinting, and
find that the performance achieved by our deep learning approaches is
comparable to that of known methods, which represent various research efforts
spanning multiple years. The obtained success rate exceeds 96% for a closed world
of 100 websites and 94% for our biggest closed world of 900 classes. In our
open world evaluation, the most performant deep learning model is 2% more
accurate than the state-of-the-art attack. Furthermore, we show that the
implicit features automatically learned by our approach are far more resilient
to dynamic changes of web content over time. We conclude that the ability to
automatically construct the most relevant traffic features and perform accurate
traffic recognition makes our deep-learning-based approach an efficient,
flexible, and robust technique for website fingerprinting.
Comment: To appear in the 25th Symposium on Network and Distributed System
Security (NDSS 2018).
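The automated feature learning can be sketched as a 1D CNN applied directly to
a fixed-length sequence of packet directions (+1/-1), with a closed-world
softmax head. Filter sizes, depth, and trace length are illustrative
assumptions rather than the paper's exact architecture.

```python
# Sketch: learn fingerprint features directly from raw direction traces.
import torch
import torch.nn as nn

n_sites, trace_len = 100, 5000
net = nn.Sequential(
    nn.Conv1d(1, 32, kernel_size=8, stride=4), nn.ReLU(),
    nn.Conv1d(32, 64, kernel_size=8, stride=4), nn.ReLU(),
    nn.AdaptiveMaxPool1d(8), nn.Flatten(),
    nn.Linear(64 * 8, n_sites),                # closed-world classifier
)

traces = torch.randint(0, 2, (4, 1, trace_len)).float() * 2 - 1  # +-1 dirs
logits = net(traces)                           # (4, n_sites)
```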
Accurate Tissue Interface Segmentation via Adversarial Pre-Segmentation of Anterior Segment OCT Images
Optical Coherence Tomography (OCT) is an imaging modality that has been
widely adopted for visualizing corneal, retinal and limbal tissue structure
with micron resolution. It can be used to diagnose pathological conditions of
the eye, and for developing pre-operative surgical plans. In contrast to the
posterior retina, imaging the anterior tissue structures, such as the limbus
and cornea, results in B-scans that exhibit increased speckle noise patterns
and imaging artifacts. These artifacts, such as shadowing and specularity, pose
a challenge during the analysis of the acquired volumes as they substantially
obfuscate the location of tissue interfaces. To deal with the artifacts and
speckle noise patterns and accurately segment the shallowest tissue interface,
we propose a cascaded neural network framework, which comprises a
conditional Generative Adversarial Network (cGAN) and a Tissue Interface
Segmentation Network (TISN). The cGAN pre-segments OCT B-scans by removing
undesired specular artifacts and speckle noise patterns just above the
shallowest tissue interface, and the TISN combines the original OCT image with
the pre-segmentation to segment the shallowest interface. We show the
applicability of the cascaded framework to corneal datasets, demonstrate that
it precisely segments the shallowest corneal interface, and also show its
generalization capacity to limbal datasets. We also propose a hybrid framework,
wherein the cGAN pre-segmentation is passed to a traditional image
analysis-based segmentation algorithm, and describe the improved segmentation
performance. To the best of our knowledge, this is the first approach to remove
severe specular artifacts and speckle noise patterns (prior to the shallowest
interface) that affect the interpretation of anterior segment OCT datasets,
thereby resulting in the accurate segmentation of the shallowest tissue
interface.
Comment: First two authors contributed equally. Biomedical Optics Express
journal submission. 27 pages, 15 figures. Submitted to the journal on May 6th,
2019 at 11:38pm.
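The cascade itself reduces to a simple wiring pattern: the pre-segmentation
network cleans the B-scan, and the segmentation network consumes the original
image concatenated with that pre-segmentation. Both networks below are toy
convolutional stand-ins, not the paper's cGAN or TISN architectures.

```python
# Sketch: cascade a pre-segmentation net into an interface segmenter.
import torch
import torch.nn as nn

pre_seg = nn.Sequential(                      # stands in for the cGAN generator
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
)
tisn = nn.Sequential(                         # stands in for the TISN
    nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
)

bscan = torch.randn(1, 1, 256, 256)               # stand-in OCT B-scan
cleaned = pre_seg(bscan)                          # artifact-suppressed image
mask = tisn(torch.cat([bscan, cleaned], dim=1))   # interface probability map
```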