171 research outputs found
Social Scene Understanding: End-to-End Multi-Person Action Localization and Collective Activity Recognition
We present a unified framework for understanding human social behaviors in
raw image sequences. Our model jointly detects multiple individuals, infers
their social actions, and estimates the collective actions with a single
feed-forward pass through a neural network. We propose a single architecture
that does not rely on external detection algorithms but rather is trained
end-to-end to generate dense proposal maps that are refined via a novel
inference scheme. The temporal consistency is handled via a person-level
matching Recurrent Neural Network. The complete model takes as input a sequence
of frames and outputs detections along with the estimates of individual actions
and collective activities. We demonstrate state-of-the-art performance of our
algorithm on multiple publicly available benchmarks.
NPC: Neural Point Characters from Video
High-fidelity human 3D models can now be learned directly from videos,
typically by combining a template-based surface model with neural
representations. However, obtaining a template surface requires expensive
multi-view capture systems, laser scans, or strictly controlled conditions.
Previous methods avoid using a template but rely on a costly or ill-posed
mapping from observation to canonical space. We propose a hybrid point-based
representation for reconstructing animatable characters that does not require
an explicit surface model, while being generalizable to novel poses. For a
given video, our method automatically produces an explicit set of 3D points
representing approximate canonical geometry, and learns an articulated
deformation model that produces pose-dependent point transformations. The
points serve both as a scaffold for high-frequency neural features and an
anchor for efficiently mapping between observation and canonical space. We
demonstrate on established benchmarks that our representation overcomes
limitations of prior work operating in either canonical or in observation
space. Moreover, our automatic point extraction approach enables learning
models of human and animal characters alike, matching the performance of the
methods using rigged surface templates despite being more general. Project
website: https://lemonatsu.github.io/npc/
Masksembles for Uncertainty Estimation
Deep neural networks have amply demonstrated their prowess but estimating the
reliability of their predictions remains challenging. Deep Ensembles are widely
considered one of the best methods for generating uncertainty
estimates but are very expensive to train and evaluate. MC-Dropout is another
popular alternative, which is less expensive, but also less reliable. Our
central intuition is that there is a continuous spectrum of ensemble-like
models of which MC-Dropout and Deep Ensembles are extreme examples. The first
uses an effectively infinite number of highly correlated models while the
second relies on a finite number of independent models.
To combine the benefits of both, we introduce Masksembles. Instead of
randomly dropping parts of the network as in MC-Dropout, Masksembles relies on a
fixed number of binary masks, which are parameterized in a way that allows
changing the correlations between individual models. Namely, by controlling the
overlap between the masks and their density one can choose the optimal
configuration for the task at hand. This leads to a simple, easy-to-implement
method with performance on par with Ensembles at a fraction of the
cost. We experimentally validate Masksembles on two widely used datasets,
CIFAR10 and ImageNet.
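The idea of a fixed set of binary masks with tunable density and overlap can be illustrated with a minimal sketch. This is not the mask-generation procedure from the paper: the function names (make_masks, mean_overlap, apply_mask) are hypothetical, masks are drawn at random with a fixed seed, and overlap is measured as the mean pairwise intersection-over-union of kept channels.

```python
import random
from itertools import combinations


def make_masks(n_masks, n_channels, keep, seed=0):
    """Draw n_masks fixed binary masks, each keeping exactly `keep` channels.

    Illustrative assumption: masks are sampled uniformly at random once and
    then reused, rather than resampled on every forward pass as in dropout.
    """
    rng = random.Random(seed)
    masks = []
    for _ in range(n_masks):
        kept = set(rng.sample(range(n_channels), keep))
        masks.append([1 if c in kept else 0 for c in range(n_channels)])
    return masks


def mean_overlap(masks):
    """Mean pairwise IoU of kept channels (needs at least two masks).

    1.0 means all masks are identical; lower values mean the sub-models
    share fewer channels and are therefore less correlated.
    """
    ious = []
    for a, b in combinations(masks, 2):
        inter = sum(x & y for x, y in zip(a, b))
        union = sum(x | y for x, y in zip(a, b))
        ious.append(inter / union)
    return sum(ious) / len(ious)


def apply_mask(features, mask):
    """Zero out the channels that the given mask drops."""
    return [f * m for f, m in zip(features, mask)]


masks = make_masks(n_masks=4, n_channels=16, keep=8)
print(mean_overlap(masks))
```

Raising `keep` toward `n_channels` pushes the overlap toward 1, so the sub-models become more strongly correlated; lowering it makes them share fewer channels, which is the trade-off the abstract describes between the MC-Dropout and Deep Ensembles extremes.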
Variational Methods for Human Modeling
A large part of computer vision research is devoted to building models
and algorithms aimed at understanding human appearance and behaviour
from images and videos. Ultimately, we want to build automated systems
that are at least as capable as people when it comes to
interpreting humans. Most of the tasks that we want these systems to
solve can be posed as a problem of inference in probabilistic
models. Although probabilistic inference in general is a very hard
problem of its own, there exists a very powerful class of inference
algorithms, variational inference, which allows us to build efficient
solutions for a wide range of problems.
In this thesis, we consider a variety of computer vision problems
targeted at modeling human appearance and behaviour, including
detection, activity recognition, semantic segmentation and facial
geometry modeling. For each of those problems, we develop novel methods
that use variational inference to improve the capabilities
of the existing systems.
First, we introduce a novel method for detecting multiple potentially
occluded people in depth images, which we call DPOM. Unlike many other
approaches, our method performs probabilistic reasoning jointly,
and thus allows knowledge about one part of the image evidence to be
propagated when reasoning about the rest. This is particularly
important in crowded scenes involving many people, since it helps to
handle ambiguous situations resulting from severe occlusions. We
demonstrate that our approach outperforms existing methods on multiple
datasets.
Second, we develop a new algorithm for variational inference that
works for a large class of probabilistic models, which includes, among
others, DPOM and some of the state-of-the-art models for semantic
segmentation. We provide a formal proof that our method converges,
and demonstrate experimentally that it brings better performance than
the state-of-the-art on several real-world tasks, which include
semantic segmentation and people detection. Importantly, we show that
parallel variational inference in discrete random fields can be seen
as a special case of proximal gradient descent, which allows us to
benefit from many of the advances in gradient-based optimization.
Third, we propose a unified framework for multi-human scene
understanding which simultaneously solves three tasks: multi-person
detection, individual action recognition and collective activity
recognition. Within our framework, we introduce a novel multi-person
detection scheme, which relies on variational inference and
jointly refines detection hypotheses instead of relying on
suboptimal post-processing. Ultimately, our model takes as input a
frame sequence and produces a comprehensive description of the
scene. Finally, we experimentally demonstrate that our method brings
better performance than the state-of-the-art.
Fourth, we propose a new approach for learning facial geometry with
deep probabilistic models and variational methods. Our model is based
on a variational autoencoder with multiple sets of hidden variables,
which capture various levels of deformations, ranging from
global to local, high-frequency ones. We experimentally demonstrate
the power of the model on a variety of fitting tasks. Our model is
completely data-driven and can be learned from a relatively small
number of individuals.
Morphological indicators of subcutaneous implantation of monofilament mesh (Morfološki indikatori potkožne implantacije monofilamentne mrežice)
Hernias are a significant, non-infectious animal condition. In productive animals, failure to provide surgical treatment leads to premature rejection and potential loss of their productive longevity. In small pets, this becomes a social problem for pet owners related to the keeping and death of affected animals. The aim of this study was to examine the histological parameters of tissues during implantation of monofilament mesh in cattle for periods of up to four months. The study was conducted on eight bulls of the Black Motley breed, divided into two groups of four animals. In the first group, four bulls received a subcutaneous implant of hernioplasty mesh made of polypropylene monofilament (Herniamesh S.R.I. Via CiRie 22 / A, San Maruro Torinese, Torino, Italy) in the area of the lateral soft abdominal wall on the right and left sides. In the second group, four bulls received implants in the middle third of the neck on the right and left sides. Thus, the subject of research was 16 wounds with implanted mesh. A sterile piece of monofilament mesh, 1 x 2 cm in size and folded in half along the longitudinal side, was inserted vertically into the formed hypodermic pocket on the right side of the wound, in which it was possible to freely place the specified mesh. The wound was closed with Polycon No. 4 thread using three interrupted knotted sutures. To ensure fixation of the mesh, it was stitched centrally. The material for histological studies was taken by biopsy at one, two, three and four months after implantation. Tissue was embedded in paraffin blocks, and sections were stained with haematoxylin-eosin and picrofuchsin according to Van Gieson. The results indicated that after subcutaneous implantation of monofilament mesh in the neck and abdominal wall in cattle, wound healing occurs by primary intention.
It was revealed that during the first month of the histological study, the monofilament mesh first becomes overgrown with loose connective tissue. By the end of the study, after four months, this tissue has gradually differentiated into dense connective tissue. No significant differences were observed between the abdominal wall and neck area as sites of implantation, and morphological processes in both sites proceeded in the same way. Thus, the conducted studies allow us to conclude that monofilament mesh is a suitable material for closing the hernial ring in cattle where it is not possible to use their own tissues for this purpose.
Semi‐Automated Quantification of Retinal and Choroidal Biomarkers in Retinal Vascular Diseases: Agreement of Spectral‐Domain Optical Coherence Tomography with and without Enhanced Depth Imaging Mode
Background: We compared optical coherence tomography (OCT) with and without enhanced depth imaging (EDI) mode for semi-automated quantification of retinal and choroidal biomarkers in patients with diabetic retinopathy (DR) or retinal vein occlusion (RVO) complicated by macular edema. We chose to study three OCT biomarkers: the number of hyperreflective foci (HF), the ellipsoid zone reflectivity ratio (EZR) and the choroidal vascularity index (CVI), all known to be correlated with visual acuity changes or treatment outcomes. Methods: In a single examination, one eye of each patient (n = 60; diabetic retinopathy: n = 27, retinal vein occlusion: n = 33) underwent macular 870 nm spectral domain-OCT (SD-OCT) B-scans without and with EDI mode. Semi-automated quantification of HF, EZR and CVI was applied according to preexisting published protocols. Paired Student’s t-test or Wilcoxon rank-sum test was used to test for differences in subgroups. Intraclass correlation coefficient (ICC) and Bland–Altman plots were applied to describe the agreement between quantification in EDI and conventional OCT mode. The effect of macular edema on semi-automated quantification was evaluated. Results: For the entire cohort, quantification of all three biomarkers was not significantly different in SD-OCT scans with and without EDI mode (p > 0.05). ICC was 0.78, 0.90 and 0.80 for HF, EZR and CVI, respectively. The presence of macular edema led to significant differences in the quantification of hyperreflective foci (without EDI: 80.00 ± 33.70, with EDI: 92.08 ± 38.11; mean difference: 12.09, p = 0.03), but not in the quantification of EZR and CVI (p > 0.05). Conclusion: Quantification of EZR and CVI was comparable whether or not EDI mode was used. Both retinal and choroidal biomarkers can be quantified from a single 870 nm SD-OCT EDI image.
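For readers unfamiliar with Bland–Altman agreement analysis, the quantities behind such a plot can be sketched as follows. This is an illustration only, not the study's analysis code: the function name bland_altman is hypothetical, the paired values are made up, and the limits of agreement use the common convention of bias ± 1.96 standard deviations of the paired differences.

```python
from statistics import mean, stdev


def bland_altman(without_edi, with_edi):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the paired differences (with minus without); the
    limits of agreement are bias +/- 1.96 sample standard deviations.
    """
    diffs = [b - a for a, b in zip(without_edi, with_edi)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up paired counts for illustration; NOT data from the study.
without_edi = [78, 64, 91, 55, 102]
with_edi = [85, 60, 97, 61, 110]

bias, lower, upper = bland_altman(without_edi, with_edi)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A bias near zero with narrow limits indicates that the two acquisition modes can be used interchangeably for that biomarker; a bias clearly away from zero points to a systematic difference between the modes.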
Drivable 3D Gaussian Avatars
We present Drivable 3D Gaussian Avatars (D3GA), the first 3D controllable
model for human bodies rendered with Gaussian splats. Current photorealistic
drivable avatars require either accurate 3D registrations during training,
dense input images during testing, or both. The ones based on neural radiance
fields also tend to be prohibitively slow for telepresence applications. This
work uses the recently presented 3D Gaussian Splatting (3DGS) technique to
render realistic humans at real-time framerates, using dense calibrated
multi-view videos as input. To deform those primitives, we depart from the
commonly used point deformation method of linear blend skinning (LBS) and use a
classic volumetric deformation method: cage deformations. Given their smaller
size, we drive these deformations with joint angles and keypoints, which are
more suitable for communication applications. Our experiments on nine subjects
with varied body shapes, clothes, and motions obtain higher-quality results
than state-of-the-art methods when using the same training and test data.
Website: https://zielon.github.io/d3ga
To The Question Of Urbanization In The Territory Of The Pre-Mongol Volga Bulgaria
The achievements of archaeological research in the Republic of Tatarstan of the Russian Federation and in the adjacent territories allow us to reach a new level of generalization and coverage of medieval urbanization in the Volga-Kama region of Eastern Europe. The purpose of the article is to cover the role of the early feudal state of Volgian (Volga-Kama) Bulgaria from the 10th century to the first third of the 13th century. Written sources offer only limited information for understanding the process of urbanization, which was closely related to the formation of the state of Volga Bulgaria. The basic source for the reconstruction of historical events is the material of archaeological research. The main result of the study is the determination of the development vectors of the Bulgarian urban structures in the general historical context. The influence of the urbanization of Volga Bulgaria on the historical destinies of the Turkic-speaking, Finno-Ugric and Slavic peoples of Eastern Europe is reflected. This is due, among other things, to the fact that Volga Bulgaria in the 10th century became the northernmost Muslim region on the periphery of the Islamic world. A comparative analysis was carried out with the processes of urbanization taking place in Russia in the same chronological period. The materials of the article can be useful for specialists dealing with the medieval history and archeology of Eastern Europe, as well as in the reconstruction of ethno-cultural processes that led to the formation of state entities.