Self-supervised Contrastive Video-Speech Representation Learning for Ultrasound
In medical imaging, manual annotations can be expensive to acquire and
sometimes infeasible to access, making conventional deep learning-based models
difficult to scale. As a result, it would be beneficial if useful
representations could be derived from raw data without the need for manual
annotations. In this paper, we propose to address the problem of
self-supervised representation learning with multi-modal ultrasound
video-speech raw data. For this case, we assume that there is a high
correlation between the ultrasound video and the corresponding narrative speech
audio of the sonographer. In order to learn meaningful representations, the
model needs to identify such correlation and at the same time understand the
underlying anatomical features. We designed a framework to model the
correspondence between video and audio without any kind of human annotations.
Within this framework, we introduce cross-modal contrastive learning and an
affinity-aware self-paced learning scheme to enhance correlation modelling.
Experimental evaluations on multi-modal fetal ultrasound video and audio show
that the proposed approach is able to learn strong representations and
transfers well to downstream tasks of standard plane detection and eye-gaze
prediction.
Comment: MICCAI 2020 (early acceptance)
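The cross-modal contrastive learning described in this abstract can be illustrated with a symmetric InfoNCE-style loss between paired video and audio embeddings. The sketch below is a minimal numpy formulation under assumed names and a chosen temperature, not the paper's actual implementation:

```python
import numpy as np

def info_nce(video_emb, audio_emb, temperature=0.1):
    """Symmetric InfoNCE loss between paired video/audio embeddings.

    video_emb, audio_emb: (N, D) arrays; row i of each is a matched pair.
    The loss is low when matched rows are more similar than mismatched ones.
    """
    # L2-normalize so dot products are cosine similarities.
    v = video_emb / np.linalg.norm(video_emb, axis=1, keepdims=True)
    a = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)
    logits = v @ a.T / temperature          # (N, N) similarity matrix
    idx = np.arange(len(v))                 # matched pairs sit on the diagonal

    def xent(l):
        # cross-entropy with the diagonal as the target class
        l = l - l.max(axis=1, keepdims=True)             # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[idx, idx].mean()

    # both directions: video -> audio and audio -> video
    return 0.5 * (xent(logits) + xent(logits.T))
```

Correlated video/audio pairs score a lower loss than shuffled (mismatched) pairs, which is the signal the self-supervised model learns from.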
Problems with Saliency Maps
Despite the popularity that saliency models have gained in the computer vision community, they are most often conceived, exploited and benchmarked without taking heed of a number of problems and subtle issues they bring about. When saliency maps are used as proxies for the likelihood of fixating a location in a viewed scene, one such issue is the temporal dimension of visual attention deployment. Through a simple simulation it is shown how neglecting this dimension leads to results that at best cast shadows on the predictive performance of a model and its assessment via benchmarking procedures.
How to look next? A data-driven approach for scanpath prediction
By and large, current visual attention models mostly rely, when considering static stimuli, on the following procedure. Given an image, a saliency map is computed, which, in turn, might serve the purpose of predicting a sequence of gaze shifts, namely a scanpath instantiating the dynamics of visual attention deployment. The temporal pattern of attention unfolding is thus confined to the scanpath generation stage, whilst salience is conceived as a static map, at best conflating a number of factors (bottom-up information, top-down, spatial biases, etc.). In this note we propose a novel sequential scheme consisting of three processing stages that rely on a center-bias model, a context/layout model, and an object-based model, respectively. Each stage contributes, at different times, to the sequential sampling of the final scanpath. We compare the method against classic scanpath generation that exploits a state-of-the-art static saliency model. Results show that accounting for the structure of the temporal unfolding leads to gaze dynamics close to human gaze behaviour.
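The three-stage sequential sampling described above can be sketched as a time-varying mixture of the three maps: early fixations are dominated by the center bias, later ones by context and object information. The linear mixing schedule and all names below are illustrative assumptions, not the paper's actual scheme:

```python
import numpy as np

def sample_scanpath(center_bias, context_map, object_map, n_fix=5, seed=0):
    """Sample a scanpath by mixing three saliency stages over time.

    Each map is a non-negative (H, W) array. The mixing weight shifts
    linearly from the center bias (t = 0) toward context/object maps.
    """
    rng = np.random.default_rng(seed)
    maps = [m / m.sum() for m in (center_bias, context_map, object_map)]
    h, w = maps[0].shape
    path = []
    for t in range(n_fix):
        alpha = min(t / max(n_fix - 1, 1), 1.0)
        weights = np.array([1 - alpha, alpha / 2, alpha / 2])
        mix = sum(wi * mi for wi, mi in zip(weights, maps))
        # sample one fixation from the mixed distribution
        idx = rng.choice(h * w, p=mix.ravel() / mix.sum())
        path.append((idx // w, idx % w))
    return path
```

With this schedule the first fixation is drawn from the center bias alone, so spatial biases dominate onset while scene content shapes the rest of the scanpath.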
Saliency Benchmarking Made Easy: Separating Models, Maps and Metrics
Dozens of new models on fixation prediction are published every year and
compared on open benchmarks such as MIT300 and LSUN. However, progress in the
field can be difficult to judge because models are compared using a variety of
inconsistent metrics. Here we show that no single saliency map can perform well
under all metrics. Instead, we propose a principled approach to solve the
benchmarking problem by separating the notions of saliency models, maps and
metrics. Inspired by Bayesian decision theory, we define a saliency model to be
a probabilistic model of fixation density prediction and a saliency map to be a
metric-specific prediction derived from the model density which maximizes the
expected performance on that metric given the model density. We derive these
optimal saliency maps for the most commonly used saliency metrics (AUC, sAUC,
NSS, CC, SIM, KL-Div) and show that they can be computed analytically or
approximated with high precision. We show that this leads to consistent
rankings in all metrics and avoids the penalties of using one saliency map for
all metrics. Our method allows researchers to have their model compete on many
different metrics with state-of-the-art in those metrics: "good" models will
perform well in all metrics.
Comment: published at ECCV 2018
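The separation of model, map and metric can be illustrated by deriving several maps from one predicted fixation density. The transforms below are simplified stand-ins, not the paper's closed-form optimal maps: any strictly monotone transform preserves AUC rankings, z-scoring targets NSS-style evaluation, and a probability-normalised map targets SIM/KL-style comparison.

```python
import numpy as np

def saliency_maps_from_density(density):
    """Derive illustrative metric-specific maps from one fixation density.

    The model is the density itself; each "map" is just a different
    read-out of it tailored to a metric family.
    """
    d = np.asarray(density, dtype=float)
    d = d / d.sum()                          # the model: a probability density
    return {
        "AUC": np.log(d + 1e-12),            # monotone transform, same ROC curve
        "NSS": (d - d.mean()) / d.std(),     # zero mean, unit variance
        "SIM": d,                            # stays a distribution
    }
```

One probabilistic model thus yields a consistent family of maps, rather than a single map being penalised by metrics it was never shaped for.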
Velocity tuning of friction with two trapped atoms
Our ability to control friction remains modest, as our understanding of the underlying microscopic processes is incomplete. Atomic force experiments have provided a wealth of results on the dependence of nanofriction on structure, velocity, and temperature, but limitations in the dynamic range, time resolution, and control at the single-atom level have hampered a description from first principles. Here, using an ion-crystal system with single-atom, single-substrate-site spatial and single-slip temporal resolution, we measure the friction force over nearly five orders of magnitude in velocity, and contiguously observe four distinct regimes, while controlling temperature and dissipation. We elucidate the interplay between thermal and structural lubricity for two coupled atoms, and provide a simple explanation in terms of the Peierls–Nabarro potential. This extensive control at the atomic scale enables fundamental studies of the interaction of many-atom surfaces, possibly into the quantum regime.
Unified Image and Video Saliency Modeling
Visual saliency modeling for images and videos is treated as two independent
tasks in recent computer vision literature. While image saliency modeling is a
well-studied problem and progress on benchmarks like SALICON and MIT300 is
slowing, video saliency models have shown rapid gains on the recent DHF1K
benchmark. Here, we take a step back and ask: Can image and video saliency
modeling be approached via a unified model, with mutual benefit? We identify
different sources of domain shift between image and video saliency data and
between different video saliency datasets as a key challenge for effective
joint modelling. To address this we propose four novel domain adaptation
techniques - Domain-Adaptive Priors, Domain-Adaptive Fusion, Domain-Adaptive
Smoothing and Bypass-RNN - in addition to an improved formulation of learned
Gaussian priors. We integrate these techniques into a simple and lightweight
encoder-RNN-decoder-style network, UNISAL, and train it jointly with image and
video saliency data. We evaluate our method on the video saliency datasets
DHF1K, Hollywood-2 and UCF-Sports, and the image saliency datasets SALICON and
MIT300. With one set of parameters, UNISAL achieves state-of-the-art
performance on all video saliency datasets and is on par with the
state-of-the-art for image saliency datasets, despite faster runtime and a 5 to
20-fold smaller model size compared to all competing deep methods. We provide
retrospective analyses and ablation studies which confirm the importance of the
domain shift modeling. The code is available at
https://github.com/rdroste/unisal
Comment: Presented at the European Conference on Computer Vision (ECCV) 2020.
R. Droste and J. Jiao contributed equally to this work. v3: Updated Fig. 5a)
and added new MIT300 benchmark results to supp. material
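Of the techniques listed, the learned Gaussian priors are the easiest to sketch: each dataset (domain) gets its own spatial bias map parameterised by a learnable mean and spread. The function below is an illustrative, non-learned version; the parameter values and dictionary of domains are invented, not taken from UNISAL:

```python
import numpy as np

def gaussian_prior(h, w, mu_x=0.5, mu_y=0.5, sigma_x=0.25, sigma_y=0.25):
    """Separable Gaussian prior map in normalised image coordinates.

    In a UNISAL-style model (mu, sigma) would be trained per domain so
    each dataset learns its own center bias.
    """
    ys = (np.arange(h) + 0.5) / h
    xs = (np.arange(w) + 0.5) / w
    gy = np.exp(-0.5 * ((ys - mu_y) / sigma_y) ** 2)
    gx = np.exp(-0.5 * ((xs - mu_x) / sigma_x) ** 2)
    return np.outer(gy, gx)                  # (h, w) separable Gaussian

# A "domain-adaptive" bank: one prior per dataset (parameter values invented).
priors = {
    "SALICON": gaussian_prior(9, 9, sigma_x=0.30, sigma_y=0.30),
    "DHF1K":   gaussian_prior(9, 9, sigma_x=0.20, sigma_y=0.35),
}
```

At run time the network would select the prior matching the input's dataset, which is one concrete way to absorb per-domain spatial bias without duplicating the whole model.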
Topological Qubits with Majorana Fermions in Trapped Ions
We propose a method of encoding a topologically-protected qubit using
Majorana fermions in a trapped-ion chain. This qubit is protected against major
sources of decoherence, while local operations and measurements can be
realized. Furthermore, we show that an efficient quantum interface and memory
for arbitrary multiqubit photonic states can be built, encoding them into a set
of entangled Majorana-fermion qubits inside cavities.
Comment: 9 pages, 2 figures
Kinks and nanofriction: Structural phases in few-atom chains
The frictional dynamics of interacting surfaces under forced translation are critically dependent on lattice commensurability. The highly nonlinear system of an elastic atomic chain sliding on an incommensurate periodic potential exhibits topological defects, known as kinks, that govern the frictional and translational dynamics. Performing experiments in a trapped-ion friction emulator, we observe two distinct structural and frictional phases: a commensurate high-friction phase where the ions stick-slip simultaneously over the lattice, and an incommensurate low-friction phase where the propagation of a kink breaks that simultaneity. We experimentally track the kink's propagation with atom-by-atom and sublattice site resolution and show that its velocity increases with commensurability. Our results elucidate the commensurate-incommensurate transition and the connection between the appearance of kinks and the reduction of friction in a finite system, with important consequences for controlling friction at nanocontacts.
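The elastic-chain-on-a-periodic-potential picture in this abstract is the textbook Frenkel-Kontorova model, in which kinks arise. A minimal energy function for that model (with illustrative parameters, not the trapped-ion experiment's) is:

```python
import numpy as np

def fk_energy(x, K=1.0, a=1.0, b=1.0):
    """Potential energy of a Frenkel-Kontorova chain.

    x: positions of the atoms along the chain.
    Springs of stiffness K and natural length a couple neighbours; a
    sinusoidal substrate of period b pins each atom. Kinks are localized
    mismatches between the chain spacing a and the substrate period b.
    """
    spring = 0.5 * K * np.sum((np.diff(x) - a) ** 2)
    substrate = np.sum(1 - np.cos(2 * np.pi * x / b))
    return spring + substrate
```

In the commensurate case (a = b) the whole chain can sit in substrate minima at zero energy and must stick-slip collectively; an incommensurate spacing forces some atoms up the potential, and the resulting kink can propagate at low energetic cost, which is the low-friction phase the abstract describes.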
A Crowdsourced Alternative to Eye-tracking for Visualization Understanding