2D+3D Indoor Scene Understanding from a Single Monocular Image
Scene understanding, as a broad field encompassing many
subtopics, has gained great interest in recent years. Among these
subtopics, indoor scene understanding, having its own specific
attributes and challenges compared to outdoor scene
understanding, has drawn a lot of attention. It has potential
applications in a wide variety of domains, such as robotic
navigation, object grasping for personal robotics, augmented
reality, etc. To our knowledge, existing research on indoor
scenes typically makes use of depth sensors, such as Kinect, which
are, however, not always available.
In this thesis, we focused on addressing the indoor scene
understanding tasks in a general case, where only a monocular
color image of the scene is available. Specifically, we first
studied the problem of estimating a detailed depth map from a
monocular image. Then, benefiting from deep-learning-based depth
estimation, we tackled the higher-level tasks of 3D box proposal
generation, and scene parsing with instance segmentation,
semantic labeling and support relationship inference from a
monocular image. Our research on indoor scene understanding
provides a comprehensive scene interpretation from various
perspectives and at various scales.
For monocular image depth estimation, previous approaches are
limited in that they only reason about depth locally on a single
scale, and do not utilize the important information of geometric
scene structures. Here, we developed a novel graphical model,
which reasons about detailed depth while leveraging geometric
scene structures at multiple scales.
For 3D box proposals, to the best of our knowledge, our approach
constitutes the first attempt to reason about class-independent
3D box proposals from a single monocular image. To this end, we
developed a novel integrated, differentiable framework that
estimates depth, extracts a volumetric scene representation and
generates 3D proposals. At the core of this framework lies a
novel residual, differentiable truncated signed distance function
module, which is able to handle the relatively low accuracy of
the predicted depth map.
For scene parsing, we tackled its three subtasks of instance
segmentation, semantic labeling, and support relationship
inference on instances. Existing work typically reasons about
these individual subtasks independently. Here, we leveraged the
fact that they bear strong connections, which can facilitate
addressing these subtasks if modeled properly. To this end, we
developed an integrated graphical model that reasons about the
mutual relationships of the above subtasks.
In summary, in this thesis, we introduced novel and effective
methodologies for each of three indoor scene understanding tasks,
i.e., depth estimation, 3D box proposal generation, and scene
parsing, and exploited the dependence of the latter two tasks on
depth estimates. Evaluation on several benchmark datasets
demonstrated the effectiveness of our algorithms and the benefits
of utilizing depth estimates for higher-level tasks.
Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing
Linguistic typology aims to capture structural and semantic variation across
the world's languages. A large-scale typology could provide excellent guidance
for multilingual Natural Language Processing (NLP), particularly for languages
that suffer from the lack of human-labeled resources. We present an extensive
literature survey on the use of typological information in the development of
NLP techniques. Our survey demonstrates that to date, the use of information in
existing typological databases has resulted in consistent but modest
improvements in system performance. We show that this is due to both intrinsic
limitations of databases (in terms of coverage and feature granularity) and
under-employment of the typological features included in them. We advocate for
a new approach that adapts the broad and discrete nature of typological
categories to the contextual and continuous nature of machine learning
algorithms used in contemporary NLP. In particular, we suggest that such an
approach could be facilitated by recent developments in data-driven induction
of typological knowledge.