
144 research outputs found

Temporal Cross-Media Retrieval with Soft-Smoothing

Author: Andrew Galen
Benevenuto Fabricio
Blei David M.
He Kaiming
Herbrich Ralf
Hu D.
Ngiam Jiquan
Srivastava Nitish
Uricchio Tiberio
Wang L.
Yan F.
Zhan M.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 10/10/2018

Multimedia information has strong temporal correlations that shape the way modalities co-occur over time. In this paper we study the dynamic nature of multimedia and social-media information, where the temporal dimension emerges as a strong source of evidence for learning the temporal correlations across visual and textual modalities. So far, cross-media retrieval models have explored the correlations between different modalities (e.g. text and image) to learn a common subspace in which semantically similar instances lie in the same neighbourhood. Building on such knowledge, we propose a novel temporal cross-media neural architecture that departs from standard cross-media methods by explicitly accounting for the temporal dimension through temporal subspace learning. The model is softly constrained with temporal and inter-modality constraints that guide the new subspace learning task by favouring temporal correlations between semantically similar and temporally close instances. Experiments on three distinct datasets show that accounting for time is important for cross-media retrieval. Namely, the proposed method outperforms a set of baselines on the task of temporal cross-media retrieval, demonstrating its effectiveness for performing temporal subspace learning. Comment: To appear in ACM MM 201

arXiv.org e-Print Archive

Crossref

Automatic Prediction of Building Age from Photographs

Author: He Kaiming
Ioffe Sergey
Li Y. L. Y.
Liu Fei
Lowe David G
Muhr V.
Xu Z.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 19/04/2018

We present a first method for the automated age estimation of buildings from unconstrained photographs. To this end, we propose a two-stage approach that first learns characteristic visual patterns for different building epochs at patch level and then globally aggregates patch-level age estimates over the building. We compile evaluation datasets from different sources and perform a detailed evaluation of our approach, its sensitivity to parameters, and the capabilities of the employed deep networks to learn characteristic visual age-related patterns. Results show that our approach is able to estimate building age at a surprisingly high level that even outperforms human evaluators and thereby sets a new performance baseline. This work represents a first step towards the automated assessment of building parameters for automated price prediction. Comment: Preprint of paper to appear in ACM International Conference on Multimedia Retrieval (ICMR) 2018 Conference

arXiv.org e-Print Archive

Crossref

The biological and clinical significance of emerging SARS-CoV-2 variants.

Author: de Oliveira Tulio
Fera Daniela
Gupta Ravindra K
Kosakovsky Pond Sergei L
Nouhin Janin
Shafer Robert W
Tao Kaiming
Tzou Philip L
Publication venue: Nat Rev Genet
Publication date: 01/01/2021

The past several months have witnessed the emergence of SARS-CoV-2 variants with novel spike protein mutations that are influencing the epidemiological and clinical aspects of the COVID-19 pandemic. These variants can increase rates of virus transmission and/or increase the risk of reinfection and reduce the protection afforded by neutralizing monoclonal antibodies and vaccination. These variants can therefore enable SARS-CoV-2 to continue its spread in the face of rising population immunity while maintaining or increasing its replication fitness. The identification of four rapidly expanding virus lineages since December 2020, designated variants of concern, has ushered in a new stage of the pandemic. The four variants of concern, the Alpha variant (originally identified in the UK), the Beta variant (originally identified in South Africa), the Gamma variant (originally identified in Brazil) and the Delta variant (originally identified in India), share several mutations with one another as well as with an increasing number of other recently identified SARS-CoV-2 variants. Collectively, these SARS-CoV-2 variants complicate the COVID-19 research agenda and necessitate additional avenues of laboratory, epidemiological and clinical research.

Works

Apollo (Cambridge)

Multi-view Face Detection Using Deep Convolutional Neural Networks

Author: Garcia C.
Girshick R. B.
Kaiming He S. R.
Krizhevsky A.
Martin Koestinger P. M. R.
Osadchy M.
Osadchy R.
Ramanan D.
Saberian M.
Sermanet P.
Sun Y.
Szegedy C.
Tompson Y. L.
Vaillant R.
Viola M.
Wu B.
Publication date: 20/04/2015

In this paper we consider the problem of multi-view face detection. While there has been significant research on this problem, current state-of-the-art approaches for this task require annotation of facial landmarks, e.g. TSM [25], or annotation of face poses [28, 22]. They also require training dozens of models to fully capture faces in all orientations, e.g. 22 models in the HeadHunter method [22]. In this paper we propose Deep Dense Face Detector (DDFD), a method that does not require pose/landmark annotation and is able to detect faces in a wide range of orientations using a single model based on deep convolutional neural networks. The proposed method has minimal complexity; unlike other recent deep learning object detection methods [9], it does not require additional components such as segmentation, bounding-box regression, or SVM classifiers. Furthermore, we analyzed the scores of the proposed face detector for faces in different orientations and found that 1) the proposed method is able to detect faces from different angles and can handle occlusion to some extent, and 2) there seems to be a correlation between the distribution of positive examples in the training set and the scores of the proposed face detector. The latter suggests that the proposed method's performance can be further improved by using better sampling strategies and more sophisticated data augmentation techniques. Evaluations on popular face detection benchmark datasets show that our single-model face detector algorithm has similar or better performance compared to the previous methods, which are more complex and require annotations of either different poses or facial landmarks. Comment: in International Conference on Multimedia Retrieval 2015 (ICMR)

arXiv.org e-Print Archive

Crossref

Deep Impression: Audiovisual Deep Residual Networks for Multimodal Apparent Personality Trait Recognition

Author: A Todorov
A Vinciarelli
AG Wright
Arulkumar Subramaniam
B Schuller
CY Olivola
F Mairesse
GL Lorenzo
J Schmidhuber
J Willis
JI Biel
Kaiming He
L Teijeiro-Mosquera
LP Naumann
N Srivastava
P Borkenau
RJW Vernon
S Hochreiter
Y LeCun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 16/09/2016

Here, we develop an audiovisual deep residual network for multimodal apparent personality trait recognition. The network is trained end-to-end to predict the Big Five personality traits of people from their videos. That is, the network does not require any feature engineering or visual analysis such as face detection, face landmark alignment or facial expression recognition. Recently, the network won third place in the ChaLearn First Impressions Challenge with a test accuracy of 0.9109.

arXiv.org e-Print Archive

Crossref

Application of nanoimprinting technique for fabrication of trifocal diffractive lens with sine-like radial profile

Author: Evgeni A. Bezus
Greg Swadener
James S. W. Wolffsohn
Kaiming Zhou
Kameel Sawalha
Leonid L. Doskolovich
Soifer
Tom Drew
Vladimir Osipov
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 24/02/2015

The fabrication of a submicron-height sine-like relief of a trifocal diffractive zone plate using a nanoimprinting technique is studied. The zone plate is intended for use in combined trifocal diffractive-refractive lenses and makes it possible to form trifocal intraocular lenses with a predetermined light intensity distribution between foci. The optical properties of the designed zone plate, which has optical powers of 3 D, 0 and -3 D in the three main diffraction orders, are theoretically and experimentally investigated. The results of the theoretical investigations are in good agreement with experimental measurements. The effects of the pupil size (lens diameter) as well as the wavelength-dependent behavior of the zone plate are also discussed.

Crossref

Aston Publications Explorer

The IRE1α/XBP1 pathway sustains cytokine responses of group 3 innate lymphoid cells in inflammatory bowel disease

Author: Cai Zhangying
Cao Siyan
Ciorba Matthew A
Colonna Marco
Deepak Parakkal
Fachi Jose L
Kaufman Randal J
Ma Kaiming
Ulezko Antonova Alina
Wang Qianli
Publication venue: Digital Commons@Becker
Publication date: 09/05/2024

Group 3 innate lymphoid cells (ILC3s) are key players in intestinal homeostasis. ER stress is linked to inflammatory bowel disease (IBD). Here, we used cell culture, mouse models, and human specimens to determine whether ER stress in ILC3s affects IBD pathophysiology. We show that mouse intestinal ILC3s exhibited a 24-hour rhythmic expression pattern of the master ER stress response regulator inositol-requiring kinase 1α/X-box-binding protein 1 (IRE1α/XBP1). Proinflammatory cytokine IL-23 selectively stimulated IRE1α/XBP1 in mouse ILC3s through mitochondrial ROS (mtROS). IRE1α/XBP1 was activated in ILC3s from mice exposed to experimental colitis and in inflamed human IBD specimens. Mice with Ire1α deletion in ILC3s (Ire1αΔRorc) showed reduced expression of the ER stress response and cytokine genes including Il22 in ILC3s and were highly vulnerable to infections and colitis. Administration of IL-22 counteracted their colitis susceptibility. In human ILC3s, IRE1 inhibitors suppressed cytokine production, which was upregulated by an IRE1 activator. Moreover, the frequencies of intestinal XBP1s+ ILC3s in patients with Crohn's disease before administration of ustekinumab, an anti-IL-12/IL-23 antibody, positively correlated with the response to treatment. We demonstrate that a noncanonical mtROS-IRE1α/XBP1 pathway augmented cytokine production by ILC3s and identify XBP1s+ ILC3s as a potential biomarker for predicting the response to anti-IL-23 therapies in IBD.

Digital Commons@Becker

Diachronic cross-modal embeddings

Author: Andrew Galen
Bamler Robert
Bengio Yoshua
David
He Kaiming
Herbrich Ralf
Huang Xin
Kim Gunhee
Lau Jey Han
Mikolov Tomas
Mithun Niluthpol Chowdhury
Uricchio Tiberio
Wang L.
Yao T.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 30/09/2019

This work has been partially funded by the CMU Portugal research project GoLocal Ref. CMUP-ERI/TIC/0046/2014, by the H2020 ICT project COGNITUS with the grant agreement no 687605 and by the FCT project NOVA LINCS Ref. UID/CEC/04516/2019. We also gratefully acknowledge the support of NVIDIA Corporation with the donation of the GPUs used for this research.

Understanding the semantic shifts of multimodal information is only possible with models that capture cross-modal interactions over time. Under this paradigm, a new embedding is needed that structures visual-textual interactions according to the temporal dimension, thus preserving the data's original temporal organisation. This paper introduces a novel diachronic cross-modal embedding (DCM), where cross-modal correlations are represented in embedding space throughout the temporal dimension, preserving semantic similarity at each instant t. To achieve this, we trained a neural cross-modal architecture under a novel ranking loss strategy that, for each multimodal instance, enforces neighbour instances' temporal alignment through subspace structuring constraints based on a temporal alignment window. Experimental results show that our DCM embedding successfully organises instances over time. Quantitative experiments confirm that DCM is able to preserve semantic cross-modal correlations at each instant t while also providing better alignment capabilities. Qualitative experiments unveil new ways to browse multimodal content and hint that multimodal understanding tasks can benefit from this new embedding.

arXiv.org e-Print Archive

Crossref

Repositório da Universidade Nova de Lisboa

Observation of out-of-plane spin texture in a SrTiO3 (111) two-dimensional electron gas

Author: Bahramy M. S.
Baumberger F.
Bruno F. Y.
Cai Kaiming
He Pan
Heinonen Olle
Lee Jongmin
Ramaswamy Rajagopalan
Vignale Giovanni
Walker S. McKeown
Yang Hyunsoo
Zhang Steven S. -L.
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2018

We explore the second-order bilinear magnetoelectric resistance (BMER) effect in the d-electron-based two-dimensional electron gas (2DEG) at the SrTiO3 (111) surface. We find evidence of a spin-split band structure with the archetypal spin-momentum locking of the Rashba effect for the in-plane component. Under an out-of-plane magnetic field, we find a BMER signal that breaks the six-fold symmetry of the electronic dispersion, which is a fingerprint of the presence of a momentum-dependent out-of-plane spin component. Relativistic electronic structure calculations reproduce this spin texture and indicate that the out-of-plane component is a ubiquitous property of oxide 2DEGs arising from strong crystal field effects. We further show that the BMER response of the SrTiO3 (111) 2DEG is tunable and unexpectedly large. Comment: 16 pages, 4 figures

arXiv.org e-Print Archive

The University of Manchester - Institutional Repository

Archive ouverte UNIGE

Connectome-scale assessments of structural and functional connectivity in MCI: Structural and Functional Connectivity in MCI

Author: Li Kaiming
Liu Tianming
Miller L. Stephen
Puente A. Nicholas
Shen Dinggang
Terry Douglas P.
Wang Lihong
Zhu Dajiang
Publication date: 01/01/2014

Mild cognitive impairment (MCI) has received increasing attention not only because of its potential as a precursor for Alzheimer's disease (AD), but also as a predictor of conversion to other neurodegenerative diseases. Although MCI has been defined clinically, accurate and efficient diagnosis is still challenging. While neuroimaging techniques hold promise, neuroimaging biomarkers are less well validated than commonly used biomarkers such as amyloid plaques, tau protein levels and brain tissue atrophy. In the present paper, we propose a connectome-scale assessment of structural and functional connectivity in MCI via two independent multimodal DTI/fMRI datasets. We first used DTI-derived structural profiles to explore and tailor the most common and consistent landmarks, then applied them in a whole-brain functional connectivity analysis. The next step fused the results from the two independent datasets and produced a set of functional connectomes with the most differentiation power, hence named “connectome signatures”. Our results indicate that these “connectome signatures” achieve high MCI-vs-controls classification accuracy, at more than 95%. Interestingly, through functional meta-analysis, we found that the majority of “connectome signatures” are derived from the interactions among different functional networks, e.g., cognition-perception and cognition-action domains, rather than from within a single network. Our work provides support for using functional “connectome signatures” as neuroimaging biomarkers of MCI.

Carolina Digital Repository