3,069 research outputs found
Omni-Line-of-Sight Imaging for Holistic Shape Reconstruction
We introduce Omni-LOS, a neural computational imaging method for conducting
holistic shape reconstruction (HSR) of complex objects utilizing a
Single-Photon Avalanche Diode (SPAD)-based time-of-flight sensor. As
illustrated in Fig. 1, our method enables new capabilities to reconstruct
near- surrounding geometry of an object from a single scan spot. In
such a scenario, traditional line-of-sight (LOS) imaging methods only see the
front part of the object and typically fail to recover the occluded back
regions. Inspired by recent advances of non-line-of-sight (NLOS) imaging
techniques which have demonstrated great power to reconstruct occluded objects,
Omni-LOS marries LOS and NLOS together, leveraging their complementary
advantages to jointly recover the holistic shape of the object from a single
scan position. The core of our method is to put the object nearby diffuse walls
and augment the LOS scan in the front view with the NLOS scans from the
surrounding walls, which serve as virtual ``mirrors'' to trap lights toward the
object. Instead of separately recovering the LOS and NLOS signals, we adopt an
implicit neural network to represent the object, analogous to NeRF and NeTF.
While transients are measured along straight rays in LOS but over the spherical
wavefronts in NLOS, we derive differentiable ray propagation models to
simultaneously model both types of transient measurements so that the NLOS
reconstruction also takes into account the direct LOS measurements and vice
versa. We further develop a proof-of-concept Omni-LOS hardware prototype for
real-world validation. Comprehensive experiments on various wall settings
demonstrate that Omni-LOS successfully resolves shape ambiguities caused by
occlusions, achieves high-fidelity 3D scan quality, and manages to recover
objects of various scales and complexity
PlaNet - Photo Geolocation with Convolutional Neural Networks
Is it possible to build a system to determine the location where a photo was
taken using just its pixels? In general, the problem seems exceptionally
difficult: it is trivial to construct situations where no location can be
inferred. Yet images often contain informative cues such as landmarks, weather
patterns, vegetation, road markings, and architectural details, which in
combination may allow one to determine an approximate location and occasionally
an exact location. Websites such as GeoGuessr and View from your Window suggest
that humans are relatively good at integrating these cues to geolocate images,
especially en-masse. In computer vision, the photo geolocation problem is
usually approached using image retrieval methods. In contrast, we pose the
problem as one of classification by subdividing the surface of the earth into
thousands of multi-scale geographic cells, and train a deep network using
millions of geotagged images. While previous approaches only recognize
landmarks or perform approximate matching using global image descriptors, our
model is able to use and integrate multiple visible cues. We show that the
resulting model, called PlaNet, outperforms previous approaches and even
attains superhuman levels of accuracy in some cases. Moreover, we extend our
model to photo albums by combining it with a long short-term memory (LSTM)
architecture. By learning to exploit temporal coherence to geolocate uncertain
photos, we demonstrate that this model achieves a 50% performance improvement
over the single-image model
The Challenge of Machine Learning in Space Weather Nowcasting and Forecasting
The numerous recent breakthroughs in machine learning (ML) make imperative to
carefully ponder how the scientific community can benefit from a technology
that, although not necessarily new, is today living its golden age. This Grand
Challenge review paper is focused on the present and future role of machine
learning in space weather. The purpose is twofold. On one hand, we will discuss
previous works that use ML for space weather forecasting, focusing in
particular on the few areas that have seen most activity: the forecasting of
geomagnetic indices, of relativistic electrons at geosynchronous orbits, of
solar flares occurrence, of coronal mass ejection propagation time, and of
solar wind speed. On the other hand, this paper serves as a gentle introduction
to the field of machine learning tailored to the space weather community and as
a pointer to a number of open challenges that we believe the community should
undertake in the next decade. The recurring themes throughout the review are
the need to shift our forecasting paradigm to a probabilistic approach focused
on the reliable assessment of uncertainties, and the combination of
physics-based and machine learning approaches, known as gray-box.Comment: under revie
Tracking and modeling focus of attention in meetings [online]
Abstract
This thesis addresses the problem of tracking the focus of
attention of people. In particular, a system to track the focus
of attention of participants in meetings is developed. Obtaining
knowledge about a person\u27s focus of attention is an important
step towards a better understanding of what people do, how and
with what or whom they interact or to what they refer. In
meetings, focus of attention can be used to disambiguate the
addressees of speech acts, to analyze interaction and for
indexing of meeting transcripts. Tracking a user\u27s focus of
attention also greatly contributes to the improvement of
humancomputer interfaces since it can be used to build interfaces
and environments that become aware of what the user is paying
attention to or with what or whom he is interacting.
The direction in which people look; i.e., their gaze, is closely
related to their focus of attention. In this thesis, we estimate
a subject\u27s focus of attention based on his or her head
orientation. While the direction in which someone looks is
determined by head orientation and eye gaze, relevant literature
suggests that head orientation alone is a su#cient cue for the
detection of someone\u27s direction of attention during social
interaction. We present experimental results from a user study
and from several recorded meetings that support this hypothesis.
We have developed a Bayesian approach to model at whom or what
someone is look ing based on his or her head orientation. To
estimate head orientations in meetings, the participants\u27 faces
are automatically tracked in the view of a panoramic camera and
neural networks are used to estimate their head orientations
from preprocessed images of their faces. Using this approach,
the focus of attention target of subjects could be correctly
identified during 73% of the time in a number of evaluation meet
ings with four participants.
In addition, we have investigated whether a person\u27s focus of
attention can be predicted from other cues. Our results show
that focus of attention is correlated to who is speaking in a
meeting and that it is possible to predict a person\u27s focus of
attention
based on the information of who is talking or was talking before
a given moment.
We have trained neural networks to predict at whom a person is
looking, based on information about who was speaking. Using this
approach we were able to predict who is looking at whom with 63%
accuracy on the evaluation meetings using only information about
who was speaking. We show that by using both head orientation
and speaker information to estimate a person\u27s focus, the
accuracy of focus detection can be improved compared to just
using one of the modalities for focus estimation.
To demonstrate the generality of our approach, we have built a
prototype system to demonstrate focusaware interaction with a
household robot and other smart appliances in a room using the
developed components for focus of attention tracking. In the
demonstration environment, a subject could interact with a
simulated household robot, a speechenabled VCR or with other
people in the room, and the recipient of the subject\u27s speech
was disambiguated based on the user\u27s direction of attention.
Zusammenfassung
Die vorliegende Arbeit beschäftigt sich mit der automatischen
Bestimmung und Verfolgung des Aufmerksamkeitsfokus von Personen
in Besprechungen.
Die Bestimmung des Aufmerksamkeitsfokus von Personen ist zum
Verständnis und zur automatischen Auswertung von
Besprechungsprotokollen sehr wichtig. So kann damit
beispielsweise herausgefunden werden, wer zu einem bestimmten
Zeitpunkt wen angesprochen hat beziehungsweise wer wem zugehört
hat. Die automatische Bestimmung des Aufmerksamkeitsfokus kann
desweiteren zur Verbesserung von Mensch-MaschineSchnittstellen
benutzt werden.
Ein wichtiger Hinweis auf die Richtung, in welche eine Person
ihre Aufmerksamkeit richtet, ist die Kopfstellung der Person.
Daher wurde ein Verfahren zur Bestimmung der Kopfstellungen von
Personen entwickelt. Hierzu wurden künstliche neuronale Netze
benutzt, welche als Eingaben vorverarbeitete Bilder des Kopfes
einer Person erhalten, und als Ausgabe eine Schätzung der
Kopfstellung berechnen. Mit den trainierten Netzen wurde auf
Bilddaten neuer Personen, also Personen, deren Bilder nicht in
der Trainingsmenge enthalten waren, ein mittlerer Fehler von
neun bis zehn Grad für die Bestimmung der horizontalen und
vertikalen Kopfstellung erreicht.
Desweiteren wird ein probabilistischer Ansatz zur Bestimmung von
Aufmerksamkeitszielen vorgestellt. Es wird hierbei ein
Bayes\u27scher Ansatzes verwendet um die Aposterior
iWahrscheinlichkeiten verschiedener Aufmerksamkteitsziele,
gegeben beobachteter Kopfstellungen einer Person, zu bestimmen.
Die entwickelten Ansätze wurden auf mehren Besprechungen mit
vier bis fünf Teilnehmern evaluiert.
Ein weiterer Beitrag dieser Arbeit ist die Untersuchung,
inwieweit sich die Blickrichtung der Besprechungsteilnehmer
basierend darauf, wer gerade spricht, vorhersagen läßt. Es wurde
ein Verfahren entwickelt um mit Hilfe von neuronalen Netzen den
Fokus einer Person basierend auf einer kurzen Historie der
Sprecherkonstellationen zu schätzen.
Wir zeigen, dass durch Kombination der bildbasierten und der
sprecherbasierten Schätzung des Aufmerksamkeitsfokus eine
deutliche verbesserte Schätzung erreicht werden kann.
Insgesamt wurde mit dieser Arbeit erstmals ein System
vorgestellt um automatisch die Aufmerksamkeit von Personen in
einem Besprechungsraum zu verfolgen.
Die entwickelten Ansätze und Methoden können auch zur Bestimmung
der Aufmerksamkeit von Personen in anderen Bereichen,
insbesondere zur Steuerung von computerisierten, interaktiven
Umgebungen, verwendet werden. Dies wird an einer
Beispielapplikation gezeigt
- …