171 research outputs found

    Social Scene Understanding: End-to-End Multi-Person Action Localization and Collective Activity Recognition

    We present a unified framework for understanding human social behaviors in raw image sequences. Our model jointly detects multiple individuals, infers their social actions, and estimates the collective actions with a single feed-forward pass through a neural network. We propose a single architecture that does not rely on external detection algorithms but rather is trained end-to-end to generate dense proposal maps that are refined via a novel inference scheme. The temporal consistency is handled via a person-level matching Recurrent Neural Network. The complete model takes as input a sequence of frames and outputs detections along with the estimates of individual actions and collective activities. We demonstrate state-of-the-art performance of our algorithm on multiple publicly available benchmarks

    NPC: Neural Point Characters from Video

    High-fidelity human 3D models can now be learned directly from videos, typically by combining a template-based surface model with neural representations. However, obtaining a template surface requires expensive multi-view capture systems, laser scans, or strictly controlled conditions. Previous methods avoid using a template but rely on a costly or ill-posed mapping from observation to canonical space. We propose a hybrid point-based representation for reconstructing animatable characters that does not require an explicit surface model, while being generalizable to novel poses. For a given video, our method automatically produces an explicit set of 3D points representing approximate canonical geometry, and learns an articulated deformation model that produces pose-dependent point transformations. The points serve both as a scaffold for high-frequency neural features and an anchor for efficiently mapping between observation and canonical space. We demonstrate on established benchmarks that our representation overcomes limitations of prior work operating in either canonical or in observation space. Moreover, our automatic point extraction approach enables learning models of human and animal characters alike, matching the performance of the methods using rigged surface templates despite being more general. Project website: https://lemonatsu.github.io/npc/

    Masksembles for Uncertainty Estimation

    Deep neural networks have amply demonstrated their prowess, but estimating the reliability of their predictions remains challenging. Deep Ensembles are widely considered one of the best methods for generating uncertainty estimates but are very expensive to train and evaluate. MC-Dropout is another popular alternative, which is less expensive, but also less reliable. Our central intuition is that there is a continuous spectrum of ensemble-like models of which MC-Dropout and Deep Ensembles are extreme examples. The first uses an effectively infinite number of highly correlated models while the second relies on a finite number of independent models. To combine the benefits of both, we introduce Masksembles. Instead of randomly dropping parts of the network as in MC-Dropout, Masksembles relies on a fixed number of binary masks, which are parameterized in a way that allows the correlations between individual models to be changed. Namely, by controlling the overlap between the masks and their density, one can choose the optimal configuration for the task at hand. This leads to a simple and easy-to-implement method with performance on par with Ensembles at a fraction of the cost. We experimentally validate Masksembles on two widely used datasets, CIFAR10 and ImageNet.
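The idea of a fixed set of binary masks with controllable density and overlap can be illustrated with a minimal sketch. This is not the mask-generation procedure from the Masksembles paper: the masks here are simply drawn at random once and then frozen, and the names `make_fixed_masks`, `mean_overlap`, and `apply_mask` are hypothetical helpers introduced only for this illustration.

```python
import random

def make_fixed_masks(num_masks, num_features, density, seed=0):
    """Draw `num_masks` binary masks once and keep them fixed.

    Each mask keeps round(density * num_features) features (at least one).
    Assumption: uniform random sampling, not the paper's generation scheme.
    """
    rng = random.Random(seed)
    keep = max(1, round(density * num_features))
    masks = []
    for _ in range(num_masks):
        kept = set(rng.sample(range(num_features), keep))
        masks.append([1 if i in kept else 0 for i in range(num_features)])
    return masks

def mean_overlap(masks):
    """Average intersection-over-union over all pairs of masks."""
    ious = []
    for a in range(len(masks)):
        for b in range(a + 1, len(masks)):
            inter = sum(x & y for x, y in zip(masks[a], masks[b]))
            union = sum(x | y for x, y in zip(masks[a], masks[b]))
            ious.append(inter / union)
    return sum(ious) / len(ious)

def apply_mask(features, mask):
    """Zero out the features that a given ensemble member does not use."""
    return [f * m for f, m in zip(features, mask)]

# Four fixed "members" sharing one feature vector of length 16.
masks = make_fixed_masks(num_masks=4, num_features=16, density=0.5)
features = [float(i) for i in range(16)]
member_inputs = [apply_mask(features, m) for m in masks]
```

In this sketch, raising `density` toward 1.0 makes the masks overlap more, so the members become more correlated, while lower densities make them more independent. That is the trade-off between MC-Dropout-like and ensemble-like behaviour that the abstract describes.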

    Variational Methods for Human Modeling

    A large part of computer vision research is devoted to building models and algorithms aimed at understanding human appearance and behaviour from images and videos. Ultimately, we want to build automated systems that are at least as capable as people when it comes to interpreting humans. Most of the tasks that we want these systems to solve can be posed as a problem of inference in probabilistic models. Although probabilistic inference in general is a very hard problem of its own, there exists a very powerful class of inference algorithms, variational inference, which allows us to build efficient solutions for a wide range of problems. In this thesis, we consider a variety of computer vision problems targeted at modeling human appearance and behaviour, including detection, activity recognition, semantic segmentation and facial geometry modeling. For each of those problems, we develop novel methods that use variational inference to improve the capabilities of the existing systems. First, we introduce a novel method for detecting multiple potentially occluded people in depth images, which we call DPOM. Unlike many other approaches, our method does probabilistic reasoning jointly, and thus allows knowledge about one part of the image evidence to be propagated to reason about the rest. This is particularly important in crowded scenes involving many people, since it helps to handle ambiguous situations resulting from severe occlusions. We demonstrate that our approach outperforms existing methods on multiple datasets. Second, we develop a new algorithm for variational inference that works for a large class of probabilistic models, which includes, among others, DPOM and some of the state-of-the-art models for semantic segmentation. We provide a formal proof that our method converges, and demonstrate experimentally that it brings better performance than the state-of-the-art on several real-world tasks, which include semantic segmentation and people detection. Importantly, we show that parallel variational inference in discrete random fields can be seen as a special case of proximal gradient descent, which allows us to benefit from many of the advances in gradient-based optimization. Third, we propose a unified framework for multi-human scene understanding which simultaneously solves three tasks: multi-person detection, individual action recognition and collective activity recognition. Within our framework, we introduce a novel multi-person detection scheme, which relies on variational inference and jointly refines detection hypotheses instead of relying on suboptimal post-processing. Ultimately, our model takes a frame sequence as input and produces a comprehensive description of the scene. Finally, we experimentally demonstrate that our method brings better performance than the state-of-the-art. Fourth, we propose a new approach for learning facial geometry with deep probabilistic models and variational methods. Our model is based on a variational autoencoder with multiple sets of hidden variables, which capture various levels of deformations, ranging from global to local, high-frequency ones. We experimentally demonstrate the power of the model on a variety of fitting tasks. Our model is completely data-driven and can be learned from a relatively small number of individuals.

    Morphological indicators of subcutaneous implantation of monofilament mesh

    Hernias are a significant, non-infectious animal condition. In productive animals, failure to provide surgical treatment leads to premature rejection and potential loss of their productive longevity. In small pets, this becomes a social problem for pet owners related to the keeping and death of affected animals. The aim of this study was to examine the histological parameters of tissues during implantation of monofilament mesh in cattle for periods up to four months. The study was conducted on eight bulls of the Black Motley breed, divided into two groups of four animals. In the first group, four bulls received a subcutaneous implant of hernioplasty mesh made of polypropylene monofilament (Herniamesh S.R.I. Via CiRie 22 / A, San Maruro Torinese, Torino, Italy) in the area of the lateral soft abdominal wall on the right and left sides. In the second group, four bulls received implants in the middle third of the neck on the right and left sides. Thus, the subject of research was 16 wounds with implanted mesh. A sterile piece of monofilament mesh, 1 x 2 cm in size and folded in half along the longitudinal side, was inserted vertically into the formed hypodermic pocket on the right side of the wound, in which it was possible to freely place the specified mesh. During the course of the study, Polycon No. 4 thread with intermittent knotted seams was used, and three sutures were applied. To ensure fixation of the mesh, it was stitched centrally. The material for histological studies was taken by biopsy at one, two, three and four months after implantation. Tissue was embedded in paraffin blocks, and sections were stained with haematoxylin-eosin and picrofuchsin according to Van Gieson. The results indicated that after subcutaneous implantation of monofilament mesh in the neck and abdominal wall in cattle, wound healing occurs by primary intention. It was revealed that from the beginning of the histological study to one month, the monofilament mesh is first overgrown with loose connective tissue. By the end of the study, after four months, this is sequentially differentiated into dense connective tissue. No significant differences were observed between the abdominal wall and neck area as sites of implantation, and morphological processes in both sites proceeded in the same way. Thus, the conducted studies allow us to conclude that monofilament mesh is a suitable material for closing the hernial ring in cattle when it is not possible to use their own tissues for these purposes.

    Semi‐Automated Quantification of Retinal and Choroidal Biomarkers in Retinal Vascular Diseases: Agreement of Spectral‐Domain Optical Coherence Tomography with and without Enhanced Depth Imaging Mode

    Background: We compared semi-automated quantification of retinal and choroidal biomarkers in optical coherence tomography (OCT) with and without enhanced depth imaging (EDI) mode in patients with diabetic retinopathy (DR) or retinal vein occlusion (RVO) complicated by macular edema. We chose to study three OCT biomarkers: the number of hyperreflective foci (HF), the ellipsoid zone reflectivity ratio (EZR) and the choroidal vascularity index (CVI), all known to be correlated with visual acuity changes or treatment outcomes. Methods: In a single examination, one eye of each patient (n = 60; diabetic retinopathy: n = 27, retinal vein occlusion: n = 33) underwent macular 870 nm spectral domain-OCT (SD-OCT) B-scans without and with EDI mode. Semi-automated quantification of HF, EZR and CVI was applied according to preexisting published protocols. Paired Student’s t-test or Wilcoxon rank-sum test was used to test for differences in subgroups. Intraclass correlation coefficient (ICC) and Bland–Altman plots were applied to describe the agreement between quantification in EDI and conventional OCT mode. The effect of macular edema on semi-automated quantification was evaluated. Results: For the entire cohort, quantification of all three biomarkers was not significantly different in SD-OCT scans with and without EDI mode (p > 0.05). ICC was 0.78, 0.90 and 0.80 for HF, EZR and CVI, respectively. The presence of macular edema led to significant differences in the quantification of hyperreflective foci (without EDI: 80.00 ± 33.70, with EDI: 92.08 ± 38.11; mean difference: 12.09, p = 0.03), but not in the quantification of EZR and CVI (p > 0.05). Conclusion: Quantification of EZR and CVI was comparable whether or not EDI mode was used. In conclusion, both retinal and choroidal biomarkers can be quantified from one single 870 nm SD-OCT EDI image.
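The Bland–Altman agreement analysis mentioned in the abstract above can be sketched in a few lines. This is a generic illustration of the standard computation (mean difference and 95% limits of agreement, taken as the mean difference plus or minus 1.96 standard deviations), not the authors' analysis code. The function name `bland_altman` is hypothetical and the paired readings are made-up numbers, not data from the study.

```python
import statistics

def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the paired differences (b - a); the limits of
    agreement are bias -/+ 1.96 * sample standard deviation of the differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("measurements must be paired")
    diffs = [b - a for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Made-up paired readings (e.g. a biomarker count without and with EDI mode).
without_edi = [78.0, 85.0, 60.0, 102.0, 91.0]
with_edi = [80.0, 90.0, 63.0, 101.0, 99.0]
bias, lower, upper = bland_altman(without_edi, with_edi)
```

A bias close to zero with narrow limits suggests that the two acquisition modes could be used interchangeably for that biomarker; a clearly non-zero bias would indicate a systematic offset between them.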

    Drivable 3D Gaussian Avatars

    We present Drivable 3D Gaussian Avatars (D3GA), the first 3D controllable model for human bodies rendered with Gaussian splats. Current photorealistic drivable avatars require either accurate 3D registrations during training, dense input images during testing, or both. The ones based on neural radiance fields also tend to be prohibitively slow for telepresence applications. This work uses the recently presented 3D Gaussian Splatting (3DGS) technique to render realistic humans at real-time framerates, using dense calibrated multi-view videos as input. To deform those primitives, we depart from the commonly used point deformation method of linear blend skinning (LBS) and use a classic volumetric deformation method: cage deformations. Given their smaller size, we drive these deformations with joint angles and keypoints, which are more suitable for communication applications. Our experiments on nine subjects with varied body shapes, clothes, and motions obtain higher-quality results than state-of-the-art methods when using the same training and test data. Website: https://zielon.github.io/d3ga

    To The Question Of Urbanization In The Territory Of The Pre-Mongol Volga Bulgaria

    The achievements of archaeological research in the Republic of Tatarstan of the Russian Federation and in the adjacent territories allow us to reach a new level of generalization and coverage of medieval urbanization in the Volga-Kama region of Eastern Europe. The purpose of the article is to cover the role of the early feudal state of Volga (Volga-Kama) Bulgaria from the 10th century to the first third of the 13th century. Written sources provide only limited information for understanding the process of urbanization, which was closely related to the formation of the state of Volga Bulgaria. The basic source for the reconstruction of historical events is the material of archaeological research. The main result of the study is the determination of the development vectors of the Bulgarian urban structures in the general historical context. The influence of the urbanization of Volga Bulgaria on the historical destinies of the Turkic-speaking, Finno-Ugric and Slavic peoples of Eastern Europe is reflected. This is due, among other things, to the fact that Volga Bulgaria in the 10th century became the northernmost Muslim region on the periphery of the Islamic world. A comparative analysis was carried out with the processes of urbanization taking place in Russia in the same chronological period. The materials of the article can be useful for specialists dealing with the medieval history and archaeology of Eastern Europe, as well as in the reconstruction of ethno-cultural processes that led to the formation of state entities.