Table Detection in the Wild: A Novel Diverse Table Detection Dataset and Method
Recent deep learning approaches in table detection achieved outstanding
performance and proved to be effective in identifying document layouts.
Currently available table detection benchmarks have many limitations,
including a lack of sample diversity, simple table structures, a lack of
training cases, and poor sample quality. In this paper, we introduce a diverse
large-scale dataset for table detection with more than seven thousand samples
containing a wide variety of table structures collected from many diverse
sources. In addition to that, we also present baseline results using a
convolutional neural network-based method to detect table structure in
documents. Experimental results show the superiority of applying convolutional
deep learning methods over classical computer vision-based methods. The
introduction of this diverse table detection dataset will enable the community
to develop high throughput deep learning methods for understanding document
layout and tabular data processing.
Comment: Open source Table detection dataset and baseline result
Deep learning networks find unique mammographic differences in previous negative mammograms between interval and screen-detected cancers: a case-case study.
Background: To determine if mammographic features from deep learning networks can be applied in breast cancer to identify groups at interval invasive cancer risk due to masking beyond using traditional breast density measures.
Methods: Full-field digital screening mammograms acquired in our clinics between 2006 and 2015 were reviewed. Transfer learning of a deep learning network with weights initialized from ImageNet was performed to classify mammograms that were followed by an invasive interval or screen-detected cancer within 12 months of the mammogram. Hyperparameter optimization was performed and the network was visualized through saliency maps. Prediction loss and accuracy were calculated using this deep learning network. Receiver operating characteristic (ROC) curves and area under the curve (AUC) values were generated with the outcome of interval cancer using the deep learning network and compared to predictions from conditional logistic regression with errors quantified through contingency tables.
Results: Pre-cancer mammograms of 182 interval and 173 screen-detected cancers were split into training/test cases at an 80/20 ratio. Using Breast Imaging-Reporting and Data System (BI-RADS) density alone, the ability to correctly classify interval cancers was moderate (AUC = 0.65). The optimized deep learning model achieved an AUC of 0.82. Contingency table analysis showed the network was correctly classifying 75.2% of the mammograms and that incorrect classifications were slightly more common for the interval cancer mammograms. Saliency maps of each cancer case found that local information could highly drive classification of cases more than global image information.
Conclusions: Pre-cancerous mammograms contain imaging information beyond breast density that can be identified with deep learning networks to predict the probability of breast cancer detection
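For readers unfamiliar with the AUC figures quoted in this abstract, the following is a minimal sketch of how an area under the ROC curve can be computed from binary labels and classifier scores. It uses the pairwise-ranking definition (the fraction of positive/negative pairs in which the positive case scores higher, with ties counted as half). The function name `roc_auc` and the toy data are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = interval cancer, 0 = screen-detected cancer).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the abstract's 0.65 and 0.82 should be read.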
Oversampling Log Messages Using a Sequence Generative Adversarial Network for Anomaly Detection and Classification
Dealing with imbalanced data is one of the main challenges in machine/deep
learning algorithms for classification. This issue is more important with log
message data as it is typically very imbalanced and negative logs are rare. In
this paper, a model is proposed to generate text log messages using a SeqGAN
network. Then features are extracted using an Autoencoder and anomaly detection
is done using a GRU network. The proposed model is evaluated with two
imbalanced log data sets, namely BGL and Openstack. Results are presented which
show that oversampling and balancing data increases the accuracy of anomaly
detection and classification.
Comment: 14 pages, 4 figures, 2 table
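The general idea of balancing classes by oversampling can be sketched as follows. This is not the SeqGAN approach described in the abstract: it substitutes plain random duplication of minority-class samples, which is the simplest form of oversampling. The function name `random_oversample` and the example log lines are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, count in counts.items():
        # All original samples belonging to this class.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - count)]
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels


# Hypothetical, heavily imbalanced log data: four normal lines, one anomaly.
logs = ["ok", "ok", "ok", "ok", "kernel panic"]
y = ["normal", "normal", "normal", "normal", "anomaly"]
bal_logs, bal_y = random_oversample(logs, y)
print(Counter(bal_y))
```

A generative model such as the one in the abstract would replace the duplication step with newly synthesized log messages, so that the minority class gains variety rather than exact copies.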