
    Bridging languages through images with deep partial canonical correlation analysis

    We present a deep neural network that leverages images to improve bilingual text embeddings. Relying on bilingual image tags and descriptions, our approach conditions text embedding induction on the shared visual information for both languages, producing highly correlated bilingual embeddings. In particular, we propose a novel model based on Partial Canonical Correlation Analysis (PCCA). While the original PCCA finds linear projections of two views in order to maximize their canonical correlation conditioned on a shared third variable, we introduce a non-linear Deep PCCA (DPCCA) model, and develop a new stochastic iterative algorithm for its optimization. We evaluate PCCA and DPCCA on multilingual word similarity and cross-lingual image description retrieval. Our models outperform a large variety of previous methods, despite not having access to any visual signal during test time inference. Our code and data are available at: https://github.com/rotmanguy/DPCCA
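The linear PCCA objective described in this abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (their code is at the URL above) and it does not cover the non-linear DPCCA model or its stochastic optimizer. It assumes the textbook formulation of partial CCA: regress the shared third variable out of both views, then run ordinary CCA on the residuals. The function name `partial_cca` and the parameters `k` and `reg` are hypothetical.

```python
import numpy as np


def partial_cca(X, Y, Z, k=1, reg=1e-8):
    """Linear partial CCA: correlate X and Y after regressing out Z.

    X, Y, Z are (n_samples, n_features) arrays. Returns projection
    matrices A and B and the top-k partial canonical correlations.
    The name, k, and reg are illustrative assumptions.
    """
    # Center every view.
    X, Y, Z = (M - M.mean(axis=0) for M in (X, Y, Z))

    # Remove the part of each view that is linearly predictable from Z.
    Rx = X - Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    Ry = Y - Z @ np.linalg.lstsq(Z, Y, rcond=None)[0]

    # Residual covariances, with a small ridge term for stability.
    n = X.shape[0]
    Sxx = Rx.T @ Rx / (n - 1) + reg * np.eye(Rx.shape[1])
    Syy = Ry.T @ Ry / (n - 1) + reg * np.eye(Ry.shape[1])
    Sxy = Rx.T @ Ry / (n - 1)

    # Whiten both residual views with Cholesky factors, then take the SVD
    # of the whitened cross-covariance; its singular values are the
    # partial canonical correlations.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    T = np.linalg.solve(Lx, Sxy) @ np.linalg.inv(Ly).T
    U, s, Vt = np.linalg.svd(T)

    # Map the singular vectors back to the original feature spaces.
    A = np.linalg.solve(Lx.T, U[:, :k])
    B = np.linalg.solve(Ly.T, Vt.T[:, :k])
    return A, B, s[:k]
```

In the setting of the abstract, `X` and `Y` would play the role of the text representations in the two languages and `Z` the shared image representation. At test time only `A` and `B` are applied to text, so no visual input is needed.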

    Transfer learning: bridging the gap between deep learning and domain-specific text mining

    Inspired by the success of deep learning techniques in Natural Language Processing (NLP), this dissertation tackles domain-specific text mining problems for which generic deep learning approaches would fail. Specifically, the problems are: (1) success prediction in crowdfunding, (2) variant identification in biomedical literature, and (3) text data augmentation for low-resource domains. In the first part, multimodal transfer learning is used to predict project success on a crowdfunding application. Even though the information in a project profile spans different modalities such as text, images, and metadata, most existing prediction approaches leverage only the text modality. Using the images in project profiles makes it possible to examine how they contribute to success prediction. An advanced neural network scheme that combines information learned from different modalities is designed and evaluated for project success prediction. In the second part, transfer learning is combined with deep learning techniques to solve genomic variant Named Entity Recognition (NER) problems in biomedical literature. Most advanced generic NER algorithms can fail due to the restricted training corpus. However, those generic deep learning algorithms are capable of learning from a canonical corpus without any feature engineering. This work aims to build an end-to-end deep learning approach that transfers domain-specific knowledge to those advanced generic NER algorithms, addressing the challenges of low-resource training and requiring neither hand-crafted features nor post-processing rules. In the last part, transfer learning with knowledge distillation and active learning is used to solve text augmentation for low-resource domains. Most recent text augmentation methods rely heavily on large external resources. This work is dedicated to solving the text augmentation problem adaptively and consistently with minimal resources for token-level tasks like NER. The solution can also assure the reliability of machine labels for noisy data and enhance training consistency with noisy labels. Each piece of work is evaluated on its own domain-specific benchmarks. Experimental results demonstrate the effectiveness of the proposed methods. The advantages also indicate promising potential for transfer learning in domain-specific applications.

    U-MENTALISM PATENT: THE BEGINNING OF CINEMATIC SUPERCOMPUTATION

    This paper discloses in synthesis a super-computation computer architecture (CA) model, presently a provisional Patent Application at INPI (nº 116408). The outline is focused on a method to perform computation at or near the speed of light, resorting to an inversion of the Princeton CA. It expands from isomorphic binary/RGB (typical) digital “images”, in a network of (UTM)s over Turing-machines (M)s. From the binary/RGB code, an arithmetic theory of (typical) digital images permits fully synchronous/orthogonal calculus in parallelism, wherefrom an exponential surplus is achieved. One such architecture depends on any “cell”-like exponential-prone basis such as the “pixel”, or rather the RGB “octet-byte”, limited as it may be, once it is congruent with any wave-particle duality principle in observable objects under the electromagnetic spectrum and reprogrammable designed. Well-ordered instructions in binary/RGB modules are, further, programming composed to alter the structure of the Internet, in virtual/virtuous eternal recursion/recurrence, under man-machine/machine-machine communication ontology.

    Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective

    This paper takes a problem-oriented perspective and presents a comprehensive review of transfer learning methods, both shallow and deep, for cross-dataset visual recognition. Specifically, it categorises cross-dataset recognition into seventeen problems based on a set of carefully chosen data and label attributes. This problem-oriented taxonomy allows us to examine how different transfer learning approaches tackle each problem and how well each problem has been researched to date. The review reveals not only the challenges in transfer learning for visual recognition, but also the problems (e.g. eight of the seventeen) that have been scarcely studied. The survey offers researchers an up-to-date technical review, and offers machine learning practitioners a systematic approach and a reference for categorising a real problem and looking up a possible solution.

    Bootstrapping Disjoint Datasets for Multilingual Multimodal Representation Learning

    Recent work has highlighted the advantage of jointly learning grounded sentence representations from multiple languages. However, the data used in these studies has been limited to an aligned scenario: the same images annotated with sentences in multiple languages. We focus on the more realistic disjoint scenario in which there is no overlap between the images in multilingual image-caption datasets. We confirm that training with aligned data results in better grounded sentence representations than training with disjoint data, as measured by image-sentence retrieval performance. In order to close this gap in performance, we propose a pseudopairing method to generate synthetically aligned English-German-image triplets from the disjoint sets. The method works by first training a model on the disjoint data, and then creating new triples across datasets using sentence similarity under the learned model. Experiments show that pseudopairs improve image-sentence retrieval performance compared to disjoint training, despite requiring no external data or models. However, we do find that using an external machine translation model to generate the synthetic data sets results in better performance.
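The pseudopairing step can be pictured with a small, hypothetical sketch. The abstract says only that new triples are created "using sentence similarity under the learned model". The cosine similarity, the nearest-neighbour selection, the `min_sim` threshold, and the function name `pseudopairs` below are all assumptions for illustration, not the authors' procedure.

```python
import numpy as np


def pseudopairs(query_emb, candidate_emb, min_sim=0.0):
    """Match each query sentence to its most similar candidate sentence.

    query_emb: (n_query, d) sentence embeddings from one dataset.
    candidate_emb: (n_cand, d) embeddings from the other-language dataset,
    produced by the same learned model. Returns a list of
    (query_index, candidate_index, cosine_similarity) tuples.
    """
    # L2-normalise so that dot products are cosine similarities.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    c = candidate_emb / np.linalg.norm(candidate_emb, axis=1, keepdims=True)
    sims = q @ c.T

    # Nearest candidate for every query sentence.
    best = sims.argmax(axis=1)
    best_sim = sims[np.arange(len(q)), best]

    # Keep only sufficiently similar matches (threshold is an assumption).
    return [
        (i, int(j), float(s))
        for i, (j, s) in enumerate(zip(best, best_sim))
        if s >= min_sim
    ]
```

Under these assumptions, a returned pair `(i, j, s)` would let the image attached to query sentence `i` be reused with candidate sentence `j`, giving a synthetic triple across the two disjoint datasets.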