
    Lip-reading with Densely Connected Temporal Convolutional Networks

    In this work, we present the Densely Connected Temporal Convolutional Network (DC-TCN) for lip-reading of isolated words. Although Temporal Convolutional Networks (TCN) have recently demonstrated great potential in many vision tasks, their receptive fields are not dense enough to model the complex temporal dynamics in lip-reading scenarios. To address this problem, we introduce dense connections into the network to capture more robust temporal features. Moreover, our approach utilises the Squeeze-and-Excitation block, a light-weight attention mechanism, to further enhance the model's classification power. Without bells and whistles, our DC-TCN method achieves 88.36% accuracy on the Lip Reading in the Wild (LRW) dataset and 43.65% on the LRW-1000 dataset, surpassing all the baseline methods and setting a new state of the art on both datasets. Comment: WACV 202

    Imaging through glass diffusers using densely connected convolutional networks

    Computational imaging through scatter generally is accomplished by first characterizing the scattering medium so that its forward operator is obtained and then imposing additional priors in the form of regularizers on the reconstruction functional to improve the condition of the originally ill-posed inverse problem. In the functional, the forward operator and regularizer must be entered explicitly or parametrically (e.g., scattering matrices and dictionaries, respectively). However, the process of determining these representations is often incomplete, prone to errors, or infeasible. Recently, deep learning architectures have been proposed to instead learn both the forward operator and regularizer through examples. Here, we propose for the first time, to our knowledge, a convolutional neural network architecture called “IDiffNet” for the problem of imaging through diffuse media and demonstrate that IDiffNet has superior generalization capability through extensive tests with well-calibrated diffusers. We also introduce the negative Pearson correlation coefficient (NPCC) loss function for neural net training and show that the NPCC is more appropriate for spatially sparse objects and strong scattering conditions. Our results show that the convolutional architecture is robust to the choice of prior, as demonstrated by the use of multiple training and testing object databases, and capable of achieving higher space–bandwidth product reconstructions than previously reported. Singapore-MIT Alliance; United States. Office of the Director of National Intelligence. Rapid Analysis of Various Emerging Nanoelectronics; United States. Department of Energy (DE-FG02-97ER25308); United States. Department of Energy. Computational Science Graduate Fellowship Progra
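    The NPCC loss named in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact formulation used for IDiffNet, so this assumes the standard Pearson correlation coefficient, negated, computed over flattened pixel values; the function name `npcc` and the toy pixel lists are hypothetical.

```python
import math


def npcc(pred, target):
    """Negative Pearson correlation between two equal-length pixel sequences.

    Assumption: standard Pearson definition, negated, so that a perfect
    linear match gives -1.0 (the minimum of the loss).
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    # Covariance and variances (unnormalised; the 1/n factors cancel).
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in target)
    denom = math.sqrt(var_p * var_t)
    if denom == 0:
        raise ValueError("NPCC is undefined for a constant input")
    return -cov / denom


# Toy example: a sparse object and a scaled reconstruction of it.
sparse_object = [0.0, 1.0, 0.0, 0.0]
reconstruction = [0.0, 2.0, 0.0, 0.0]
loss = npcc(reconstruction, sparse_object)
```

    Because the Pearson coefficient is invariant to scale and offset, this loss compares the spatial pattern rather than absolute intensity, which may be one reason it suits sparse objects.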

    Densely Connected Convolutional Neural Networks for Natural Language Processing

    Densely connected convolutional neural networks are currently among the best object recognition algorithms. Given the plasticity of neural networks, the DenseNet algorithm should perform similarly well on NLP tasks. To verify whether DenseNet can yield equally impressive results on NLP tasks, this paper modifies the algorithm and tests it on text classification. For this purpose, three differently sized datasets were each encoded as TF-IDF vectors and as word vectors, and DenseNet's performance on these feature sets was compared with more conventional methods, including Naïve Bayes classifiers and other neural networks. The paper finds that DenseNets can perform on par with these algorithms and scale especially well with large datasets and semantically rich features.

    Rethinking densely connected convolutional networks for diagnosing infectious diseases.

    Due to its high transmissibility, the COVID-19 pandemic has placed an unprecedented burden on healthcare systems worldwide. X-ray imaging of the chest has emerged as a valuable and cost-effective tool for detecting and diagnosing COVID-19 patients. In this study, we developed a deep learning model using transfer learning with optimized DenseNet-169 and DenseNet-201 models for three-class classification, utilizing the Nadam optimizer. We modified the traditional DenseNet architecture and tuned the hyperparameters to improve the model's performance. The model was evaluated on a novel dataset of 3312 X-ray images from publicly available datasets, using metrics such as accuracy, recall, precision, F1-score, and the area under the receiver operating characteristics curve. For COVID-19 patients, the model reached an accuracy of 95.98% and a recall of 96% using DenseNet-169, and 96.18% and 99% using DenseNet-201. Unique layer configurations and the Nadam optimization algorithm enabled our deep learning model to achieve high rates of accuracy not only for detecting COVID-19 patients but also for identifying normal and pneumonia-affected patients. The model's ability to detect lung problems early on, as well as its low false-positive and false-negative rates, suggests that it has the potential to serve as a reliable diagnostic tool for a variety of lung diseases.
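    As a rough illustration of the evaluation metrics listed in the abstract above, the sketch below computes accuracy and per-class precision, recall, and F1-score for a three-class problem. The function names, class labels, and toy predictions are hypothetical and are not taken from the paper's dataset or results.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision_recall_f1(y_true, y_pred, positive):
    """One-vs-rest precision, recall, and F1 for the class `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators by reporting 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy labels only (hypothetical, six samples).
y_true = ["covid", "covid", "normal", "pneumonia", "covid", "normal"]
y_pred = ["covid", "normal", "normal", "pneumonia", "covid", "covid"]

acc = accuracy(y_true, y_pred)
covid_precision, covid_recall, covid_f1 = precision_recall_f1(
    y_true, y_pred, positive="covid"
)
```

    Here a false negative is a COVID-19 image labelled as another class, and a false positive is the reverse, so low rates of both correspond to high recall and high precision for that class.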