1,112 research outputs found

    Anatomy-guided domain adaptation for 3D in-bed human pose estimation

    Full text link
    3D human pose estimation is a key component of clinical monitoring systems. The clinical applicability of deep pose estimation models, however, is limited by their poor generalization under domain shifts along with their need for sufficient labeled training data. As a remedy, we present a novel domain adaptation method, adapting a model from a labeled source to a shifted unlabeled target domain. Our method comprises two complementary adaptation strategies based on prior knowledge about human anatomy. First, we guide the learning process in the target domain by constraining predictions to the space of anatomically plausible poses. To this end, we embed the prior knowledge into an anatomical loss function that penalizes asymmetric limb lengths, implausible bone lengths, and implausible joint angles. Second, we propose to filter pseudo labels for self-training according to their anatomical plausibility and incorporate the concept into the Mean Teacher paradigm. We unify both strategies in a point cloud-based framework applicable to unsupervised and source-free domain adaptation. Evaluation is performed for in-bed pose estimation under two adaptation scenarios, using the public SLP dataset and a newly created dataset. Our method consistently outperforms various state-of-the-art domain adaptation methods, surpasses the baseline model by 31%/66%, and reduces the domain gap by 65%/82%. Source code is available at https://github.com/multimodallearning/da-3dhpe-anatomy. Comment: submitted to Medical Image Analysis
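    As a hedged illustration of the anatomical loss described above, the sketch below penalizes out-of-range bone lengths and left/right limb asymmetry; the skeleton indices, length bounds, and symmetric pairs are placeholder assumptions, and the joint-angle term is omitted for brevity:

```python
import torch

# Placeholder skeleton definition: bone index pairs, plausible length
# bounds, and left/right correspondences are illustrative assumptions,
# not the paper's actual configuration.
BONES = [(0, 1), (1, 2), (3, 4), (4, 5)]                 # (parent, child) joints
BONE_RANGE = torch.tensor([[0.25, 0.45]] * len(BONES))   # min/max length (m)
SYMMETRIC = [(0, 2), (1, 3)]                             # left/right bone pairs

def anatomical_loss(joints: torch.Tensor) -> torch.Tensor:
    """joints: (B, J, 3) predicted 3D joint positions."""
    lengths = torch.stack(
        [(joints[:, a] - joints[:, b]).norm(dim=-1) for a, b in BONES], dim=1
    )                                                    # (B, num_bones)
    lo, hi = BONE_RANGE[:, 0], BONE_RANGE[:, 1]
    # Hinge penalty for bone lengths outside the plausible range.
    range_pen = (lo - lengths).clamp(min=0) + (lengths - hi).clamp(min=0)
    # Penalty for asymmetric lengths of corresponding left/right limbs.
    sym_pen = torch.stack(
        [(lengths[:, l] - lengths[:, r]).abs() for l, r in SYMMETRIC], dim=1
    )
    return range_pen.mean() + sym_pen.mean()
```

    The same per-pose penalty could also serve as the plausibility score for filtering pseudo labels in the Mean Teacher setup the abstract mentions, keeping only sufficiently plausible teacher predictions for self-training.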

    Adversarial content manipulation for analyzing and improving model robustness

    Get PDF
    The recent rapid progress in machine learning systems has opened up many real-world applications --- from recommendation engines on web platforms to safety-critical systems like autonomous vehicles. A model deployed in the real world will often encounter inputs far from its training distribution. For example, a self-driving car might come across a black stop sign in the wild. To ensure safe operation, it is vital to quantify the robustness of machine learning models to such out-of-distribution data before releasing them into the real world. However, the standard paradigm of benchmarking machine learning models with fixed-size test sets drawn from the same distribution as the training data is insufficient to identify these corner cases efficiently. In principle, if we could generate all valid variations of an input and measure the model response, we could quantify and guarantee model robustness locally. Yet doing this with real-world data is not scalable. In this thesis, we propose an alternative: using generative models to create synthetic data variations at scale and test the robustness of target models to these variations. We explore methods to generate semantic data variations in a controlled fashion across visual and text modalities. We build generative models capable of performing controlled manipulation of data, such as changing visual context, editing the appearance of an object in images, or changing the writing style of text. Leveraging these generative models, we propose tools to study the robustness of computer vision systems to input variations and systematically identify failure modes. In the text domain, we deploy these generative models to improve the diversity of image captioning systems and perform writing-style manipulation to obfuscate private attributes of the user. Our studies quantifying model robustness explore two kinds of input manipulations, model-agnostic and model-targeted. Model-agnostic manipulations leverage human knowledge to choose the kinds of changes without considering the target model being tested. This includes automatically editing images to remove objects not directly relevant to the task and creating variations in visual context. Alternatively, in the model-targeted approach, the input variations are directly adversarially guided by the target model. For example, we adversarially manipulate the appearance of an object in the image to fool an object detector, guided by the gradients of the detector. Using these methods, we measure and improve the robustness of various computer vision systems -- specifically image classification, segmentation, object detection, and visual question answering systems -- to semantic input variations.
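    To make the model-targeted setting concrete, here is a generic sketch of a gradient-guided appearance attack; the differentiable renderer, detector score function, and appearance parameters are hypothetical stand-ins rather than the thesis's actual pipeline:

```python
import torch

def attack_appearance(render, detect_score, theta0, steps=50, lr=0.05):
    """Model-targeted manipulation: adjust appearance parameters so a
    detector's confidence drops, guided by the detector's own gradients.

    render:       differentiable renderer, theta -> image   (assumed)
    detect_score: image -> scalar detection confidence      (assumed)
    theta0:       initial appearance parameters (e.g. object texture)
    """
    theta = theta0.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([theta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        score = detect_score(render(theta))  # target detector in the loop
        score.backward()                     # gradients w.r.t. appearance
        opt.step()                           # descend: lower the confidence
    return theta.detach()
```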

    A Survey on Continual Semantic Segmentation: Theory, Challenge, Method and Application

    Full text link
    Continual learning, also known as incremental learning or life-long learning, stands at the forefront of deep learning and AI systems. It breaks through the obstacle of one-way training on closed sets and enables continuous adaptive learning under open-set conditions. In the recent decade, continual learning has been explored and applied in multiple fields, especially in computer vision, covering classification, detection, and segmentation tasks. Continual semantic segmentation (CSS) is a challenging, intricate, and burgeoning task owing to its dense prediction peculiarity. In this paper, we present a review of CSS, committing to building a comprehensive survey on problem formulations, primary challenges, universal datasets, neoteric theories, and multifarious applications. Concretely, we begin by elucidating the problem definitions and primary challenges. Based on an in-depth investigation of relevant approaches, we sort out and categorize current CSS models into two main branches: data-replay and data-free sets. In each branch, the corresponding approaches are clustered by similarity and thoroughly analyzed, followed by qualitative comparison and quantitative reproductions on relevant datasets. Besides, we introduce four CSS specialities with diverse application scenarios and development tendencies. Furthermore, we develop a benchmark for CSS encompassing representative references, evaluation results, and reproductions, which is available at https://github.com/YBIO/SurveyCSS. We hope this survey can serve as a reference-worthy and stimulating contribution to the advancement of the life-long learning field, while also providing valuable perspectives for related fields. Comment: 20 pages, 12 figures. Undergoing Review
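    To ground the data-replay branch, the sketch below shows its common underlying idea in the simplest form: a reservoir-sampled exemplar store whose samples are mixed into current-task batches. It is an assumed, generic illustration, not any specific surveyed method:

```python
import random

class ReplayBuffer:
    """Minimal exemplar store for data-replay CSS: retain a small,
    uniformly sampled subset of past-task samples (reservoir sampling,
    Algorithm R) to interleave with the current task's training batches."""

    def __init__(self, capacity=200):
        self.capacity, self.items, self.seen = capacity, [], 0

    def add(self, sample):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(sample)
        else:
            # Replace a random slot with probability capacity / seen,
            # keeping the buffer a uniform sample of the whole stream.
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = sample

    def draw(self, k):
        """Sample up to k stored exemplars to mix into the current batch."""
        return random.sample(self.items, min(k, len(self.items)))
```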

    Federated Domain Generalization: A Survey

    Full text link
    Machine learning typically relies on the assumption that training and testing distributions are identical and that data is centrally stored for training and testing. However, in real-world scenarios, distributions may differ significantly and data is often distributed across different devices, organizations, or edge nodes. Consequently, it is imperative to develop models that can effectively generalize to unseen distributions when data is distributed across different domains. In response to this challenge, there has been a surge of interest in federated domain generalization (FDG) in recent years. FDG combines the strengths of federated learning (FL) and domain generalization (DG) techniques to enable multiple source domains to collaboratively learn a model capable of directly generalizing to unseen domains while preserving data privacy. However, generalizing the federated model under domain shifts is a technically challenging problem that has so far received scant attention in the research area. This paper presents the first survey of recent advances in this area. Initially, we discuss the development process from traditional machine learning to domain adaptation and domain generalization, leading to FDG, and provide the corresponding formal definition. Then, we categorize recent methodologies into four classes: federated domain alignment, data manipulation, learning strategies, and aggregation optimization, and present suitable algorithms in detail for each category. Next, we introduce commonly used datasets, applications, evaluations, and benchmarks. Finally, we conclude this survey by providing some potential research topics for the future.
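    For reference, the vanilla aggregation step that the aggregation-optimization class refines is FedAvg-style weighted averaging of client models; the sketch below shows only this generic baseline, with hypothetical client state dicts:

```python
from typing import Dict, List
import torch

def fedavg(client_states: List[Dict[str, torch.Tensor]],
           client_sizes: List[int]) -> Dict[str, torch.Tensor]:
    """Standard FedAvg aggregation: average each parameter tensor across
    clients (here, source domains), weighting every client by its local
    sample count. This is the plain baseline, not an FDG-specific rule."""
    total = sum(client_sizes)
    keys = client_states[0].keys()
    return {
        k: sum(s[k] * (n / total) for s, n in zip(client_states, client_sizes))
        for k in keys
    }
```

    Methods in the aggregation-optimization class intervene precisely at this weighting/combination step, while the surrounding federated training loop stays the same.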

    SoK: Chasing Accuracy and Privacy, and Catching Both in Differentially Private Histogram Publication

    Get PDF
    Histograms and synthetic data are of key importance in data analysis. However, researchers have shown that even aggregated data such as histograms, containing no obvious sensitive attributes, can result in privacy leakage. To enable data analysis, a strong notion of privacy is required to avoid risking unintended privacy violations. Such a strong notion of privacy is differential privacy, a statistical notion of privacy that makes privacy leakage quantifiable. The caveat of differential privacy is that its strong privacy guarantees come at a cost in accuracy. Despite this trade-off being a central and important issue in the adoption of differential privacy, there exists a gap in the literature regarding an understanding of the trade-off and how to address it appropriately. Through a systematic literature review (SLR), we investigate the state of the art in accuracy-improving differentially private algorithms for histogram and synthetic data publishing. Our contribution is two-fold: 1) we identify trends and connections in the contributions to the field of differential privacy for histograms and synthetic data, and 2) we provide an understanding of the privacy/accuracy trade-off challenge by crystallizing different dimensions of accuracy improvement. Accordingly, we position and visualize the ideas in relation to each other and to external work, and deconstruct each algorithm to examine its building blocks separately with the aim of pinpointing which dimension of accuracy improvement each technique/approach targets. Hence, this systematization of knowledge (SoK) provides an understanding of in which dimensions, and how, accuracy improvement can be pursued without sacrificing privacy.
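    The accuracy baseline that the surveyed algorithms improve on is the plain Laplace mechanism for histogram release; a minimal sketch, assuming add/remove-one neighboring datasets so each bin has L1 sensitivity 1:

```python
import numpy as np

def dp_histogram(counts: np.ndarray, epsilon: float) -> np.ndarray:
    """epsilon-DP histogram via the Laplace mechanism: adding or removing
    one record changes exactly one bin by 1 (L1 sensitivity 1), so i.i.d.
    Laplace noise with scale 1/epsilon per bin suffices."""
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon, size=counts.shape)
    return counts + noise
```

    The expected per-bin error of this baseline grows as 1/epsilon, which is the accuracy cost that the dimensions of improvement identified in the SoK aim to reduce.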

    Use of Synthetic Data for 3D Hand Pose Estimation

    Get PDF
    Ph.D. dissertation -- Seoul National University, Graduate School of Convergence Science and Technology, Department of Transdisciplinary Studies (Intelligent Convergence Systems), August 2021. Advisor: 양한열. 3D hand pose estimation (HPE) based on RGB images has been studied for a long time. Relevant methods have focused mainly on optimizing neural frameworks for graphically connected finger joints. RGB-based HPE models have been difficult to train because of the scarcity of RGB hand pose datasets; unlike human body pose datasets, the finger joints that span hand postures are structured delicately and exquisitely. Such structure makes it difficult to accurately annotate each joint with unique 3D world coordinates, which is why many conventional methods rely on synthetic data samples to cover large variations of hand postures. A synthetic dataset provides very precise ground-truth annotations and further allows control over the variety of data samples, letting a learning model be trained over a large pose space. Most studies, however, have performed frame-by-frame estimation based on independent static images. Synthetic visual data can provide practically infinite diversity and rich labels, while avoiding ethical issues with privacy and bias. However, for many tasks, current models trained on synthetic data generalize poorly to real data. The task of 3D human hand pose estimation is a particularly interesting example of this synthetic-to-real problem, because learning-based approaches perform reasonably well given real training data, yet labeled 3D poses are extremely difficult to obtain in the wild, limiting scalability. In this dissertation, we attempt not only to consider the appearance of a hand but also to incorporate the temporal movement information of a hand in motion into the learning framework for better 3D hand pose estimation performance, which leads to the necessity of a large-scale dataset with sequential RGB hand images. We propose a novel method that generates a synthetic dataset mimicking natural human hand movements by re-engineering annotations of an extant static hand pose dataset into pose-flows. With the generated dataset, we train a newly proposed recurrent framework, exploiting visuo-temporal features from sequential images of synthetic hands in motion and emphasizing temporal smoothness of estimations with a temporal consistency constraint. Our novel training strategy of detaching the recurrent layer of the framework during domain finetuning from synthetic to real allows preservation of the visuo-temporal features learned from sequential synthetic hand images. Hand poses that are sequentially estimated consequently produce natural and smooth hand movements, which leads to more robust estimations. We show that utilizing temporal information for 3D hand pose estimation significantly enhances general pose estimation, outperforming state-of-the-art methods in experiments on hand pose estimation benchmarks. Since a fixed dataset provides a finite distribution of data samples, the generalization of a learned pose estimation network is limited in terms of pose, RGB, and viewpoint spaces. We further propose to augment the data automatically such that augmented pose sampling is performed in favor of the pose estimator's generalization performance. Such auto-augmentation of poses is performed within a learned feature space in order to avoid the computational burden of generating a synthetic sample for every update iteration.
The proposed effort can be considered as generating and utilizing synthetic samples for network training in the feature space. This improves training efficiency by requiring fewer real data samples, and enhances both generalization across multiple dataset domains and estimation performance through efficient augmentation. Contents: 1. Introduction; 2. Related Works; 3. Preliminaries: 3D Hand Mesh Model; 4. SeqHAND: RGB-sequence-based 3D Hand Pose and Shape Estimation; 5. Hand Pose Auto-Augment; 6. Conclusion; Abstract (Korean); Acknowledgements.
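    As a concrete reading of the temporal consistency constraint mentioned above, one plausible form is a finite-difference smoothness penalty over the estimated pose sequence; the second-order version below is an assumed illustration, not necessarily the dissertation's exact formulation:

```python
import torch

def temporal_consistency(pose_seq: torch.Tensor) -> torch.Tensor:
    """pose_seq: (T, J, 3) hand poses estimated over T sequential frames.
    Penalizes frame-to-frame acceleration so consecutive estimates form a
    smooth, natural motion instead of jittering independently."""
    vel = pose_seq[1:] - pose_seq[:-1]   # per-frame joint velocities
    acc = vel[1:] - vel[:-1]             # per-frame joint accelerations
    return acc.norm(dim=-1).mean()
```

    Added to a per-frame pose loss, such a term favors sequences of estimates that move smoothly, matching the natural hand movements the abstract reports.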