Bridging languages through images with deep partial canonical correlation analysis
We present a deep neural network that leverages images to improve bilingual text embeddings. Relying on bilingual image tags and descriptions, our approach conditions text embedding induction on the shared visual information for both languages, producing highly correlated bilingual embeddings. In particular, we propose a novel model based on Partial Canonical Correlation Analysis (PCCA). While the original PCCA finds linear projections of two views in order to maximize their canonical correlation conditioned on a shared third variable, we introduce a non-linear Deep PCCA (DPCCA) model, and develop a new stochastic iterative algorithm for its optimization. We evaluate PCCA and DPCCA on multilingual word similarity and cross-lingual image description retrieval. Our models outperform a large variety of previous methods, despite not having access to any visual signal during test-time inference. Our code and data are available at: https://github.com/rotmanguy/DPCCA
Transfer learning: bridging the gap between deep learning and domain-specific text mining
Inspired by the success of deep learning techniques in Natural Language Processing (NLP), this dissertation tackles domain-specific text mining problems for which generic deep learning approaches would fail. More specifically, the domain-specific problems are: (1) success prediction in crowdfunding, (2) variant identification in biomedical literature, and (3) text data augmentation for low-resource domains.
In the first part, transfer learning from a multimodal perspective is utilized to solve project success prediction in the crowdfunding application. Even though the information in a project profile can be of different modalities, such as text, images, and metadata, most existing prediction approaches leverage only the text modality. This motivates exploring how the visual images in project profiles can contribute to success prediction. An advanced neural network scheme that combines information learned from the different modalities is designed and evaluated for project success prediction.
In the second part, transfer learning is combined with deep learning techniques to solve genomic variants Named Entity Recognition (NER) problems in biomedical literature. Most of the advanced generic NER algorithms can fail due to the restricted training corpus. However, those generic deep learning algorithms are capable of learning from a canonical corpus, without any effort on feature engineering. This work aims to build an end-to-end deep learning approach to transfer the domain-specific knowledge to those advanced generic NER algorithms, addressing the challenges in low-resource training and requiring neither hand-crafted features nor post-processing rules.
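Token-level NER systems of the kind described emit a per-token tag sequence, commonly in BIO format. A minimal, hypothetical helper for turning such a sequence into entity spans (the function name and tag labels are illustrative assumptions, not from this dissertation):

```python
def bio_to_spans(tags):
    """Convert a BIO tag sequence into (start, end, label) entity spans,
    with `end` exclusive. E.g. B-VAR I-VAR O -> one VAR span covering
    tokens 0..2."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            if start is not None:          # close any open span first
                spans.append((start, i, label))
            start, label = i, tag[2:]
        elif tag.startswith("I-") and start is not None and tag[2:] == label:
            continue                       # same entity continues
        else:                              # "O" or an inconsistent I- tag
            if start is not None:
                spans.append((start, i, label))
            start, label = None, None
    if start is not None:                  # span running to the end
        spans.append((start, len(tags), label))
    return spans
```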
For the last part, transfer learning with knowledge distillation and active learning is utilized to solve text augmentation for low-resource domains. Most recent text augmentation methods rely heavily on large external resources. This work is dedicated to solving the text augmentation problem adaptively and consistently with minimal resources for token-level tasks like NER. The solution can also assure the reliability of machine labels for noisy data and can enhance training consistency with noisy labels.
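Knowledge distillation, as used here, generally trains a student model to match a teacher's temperature-softened output distribution. A minimal sketch of the standard (Hinton-style) distillation loss; the function names and toy logits are illustrative, not this dissertation's implementation:

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T**2 so gradients keep a comparable magnitude across T."""
    p = softmax(teacher_logits, T)          # soft teacher targets
    q = softmax(student_logits, T)          # student predictions
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return (T ** 2) * kl.mean()
```

A confidence threshold on the teacher's soft labels is one common way to filter noisy machine labels, consistent with the reliability goal stated above.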
Each part is evaluated on its own domain-specific benchmarks. Experimental results demonstrate the effectiveness of the proposed methods and indicate promising potential for transfer learning in domain-specific applications.
U-MENTALISM PATENT: THE BEGINNING OF CINEMATIC SUPERCOMPUTATION
This paper discloses in synthesis a super-computation computer architecture (CA) model, presently a provisional Patent Application at INPI (n.º 116408). The outline is focused on a method to perform computation at or near the speed of light, resorting to an inversion of the Princeton CA. It expands from isomorphic binary/RGB (typical) digital "images", in a network of universal Turing machines (UTMs) over Turing machines (Ms). From the binary/RGB code, an arithmetic theory of (typical) digital images permits fully synchronous/orthogonal calculus in parallelism, wherefrom an exponential surplus is achieved. One such architecture depends on any "cell"-like exponential-prone basis such as the "pixel", or rather the RGB "octet-byte", limited as it may be, once it is congruent with any wave-particle duality principle in observable objects under the electromagnetic spectrum and reprogrammably designed. Well-ordered instructions in binary/RGB modules are, further, compositionally programmed to alter the structure of the Internet, in virtual/virtuous eternal recursion/recurrence, under a man-machine/machine-machine communication ontology.
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The comprehensive problem-oriented review
of the advances in transfer learning with respect to the problem has not only
revealed the challenges in transfer learning for visual recognition, but also
the problems (e.g. eight of the seventeen problems) that have been scarcely
studied. This survey not only presents an up-to-date technical review for
researchers, but also a systematic approach and a reference for a machine
learning practitioner to categorise a real problem and to look up for a
possible solution accordingly
The role of HG in the analysis of temporal iteration and interaural correlation
Bootstrapping Disjoint Datasets for Multilingual Multimodal Representation Learning
Recent work has highlighted the advantage of jointly learning grounded
sentence representations from multiple languages. However, the data used in
these studies has been limited to an aligned scenario: the same images
annotated with sentences in multiple languages. We focus on the more realistic
disjoint scenario in which there is no overlap between the images in
multilingual image--caption datasets. We confirm that training with aligned
data results in better grounded sentence representations than training with
disjoint data, as measured by image--sentence retrieval performance. In order
to close this gap in performance, we propose a pseudopairing method to generate
synthetically aligned English--German--image triplets from the disjoint sets.
The method works by first training a model on the disjoint data, and then
creating new triples across datasets using sentence similarity under the
learned model. Experiments show that pseudopairs improve image--sentence
retrieval performance compared to disjoint training, despite requiring no
external data or models. However, we do find that using an external machine
translation model to generate the synthetic data sets results in better
performance.
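The pseudopairing step can be sketched as nearest-neighbour matching of sentence embeddings across the two disjoint datasets under the learned model. A minimal illustration; the helper name, identifiers, and the assumption that pairing is one-directional cosine nearest-neighbour are ours, not necessarily the paper's exact procedure:

```python
import numpy as np

def pseudopairs(emb_a, emb_b, ids_a, ids_b):
    """For each sentence embedding in dataset A, find the most similar
    sentence in dataset B under the learned model, yielding synthetic
    cross-dataset pairs (hypothetical helper)."""
    # L2-normalize so the dot product equals cosine similarity.
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    sim = a @ b.T                       # (len_a, len_b) similarity matrix
    nearest = sim.argmax(axis=1)        # best B match for each A sentence
    return [(ids_a[i], ids_b[j]) for i, j in enumerate(nearest)]
```

In the paper's setting, each synthetic pair would then be joined with the corresponding image to form an English--German--image triplet for retraining.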
The natverse, a versatile toolbox for combining and analysing neuroanatomical data.
To analyse neuron data at scale, neuroscientists expend substantial effort reading documentation, installing dependencies and moving between analysis and visualisation environments. To facilitate this, we have developed a suite of interoperable open-source R packages called the natverse. The natverse allows users to read local and remote data and to perform popular analyses, including visualisation, clustering, and graph-theoretic analysis of neuronal branching. Unlike most tools, the natverse enables comparison of morphology and connectivity across many neurons after imaging or co-registration within a common template space. The natverse also enables transformations between different template spaces and imaging modalities. We demonstrate tools that integrate the vast majority of Drosophila neuroanatomical light microscopy and electron microscopy connectomic datasets. The natverse is an easy-to-use environment for neuroscientists to solve complex, large-scale analysis challenges as well as an open platform to create new code and packages to share with the community.
- …