25 research outputs found
DeePoint: Pointing Recognition and Direction Estimation From A Fixed View
In this paper, we realize automatic visual recognition and direction estimation of pointing. We introduce the first neural pointing understanding method, based on two key contributions. The first is the DP Dataset, a first-of-its-kind large-scale dataset for pointing recognition and direction estimation. The DP Dataset consists of more than 2 million frames of 33 people pointing in various styles, annotated for each frame with pointing timings and 3D directions. The second is DeePoint, a novel deep network model for joint recognition and 3D direction estimation of pointing. DeePoint is a Transformer-based network that fully leverages the spatio-temporal coordination of the body parts, not just the hands. Through extensive experiments, we demonstrate the accuracy and efficiency of DeePoint. We believe the DP Dataset and DeePoint will serve as a sound foundation for visual human intention understanding.
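As a concrete illustration of the joint formulation, the following is a minimal sketch, not the authors' implementation: a Transformer that treats each body joint in each frame as a token and predicts a pointing/not-pointing label together with a 3D direction. The joint count, layer sizes, and two-head design are assumptions.

import torch
import torch.nn as nn

# Illustrative sketch only: a Transformer over per-frame body-joint tokens
# with two heads (pointing classification, 3D direction). All dimensions
# are assumed, not taken from the DeePoint paper.
class PointingTransformer(nn.Module):
    def __init__(self, n_joints=17, joint_dim=2, d_model=128, n_frames=16):
        super().__init__()
        self.embed = nn.Linear(joint_dim, d_model)   # one token per joint per frame
        self.pos = nn.Parameter(torch.zeros(n_frames * n_joints, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.cls_head = nn.Linear(d_model, 2)        # pointing vs. not pointing
        self.dir_head = nn.Linear(d_model, 3)        # 3D pointing direction

    def forward(self, joints):                       # joints: (B, T, J, joint_dim)
        B, T, J, D = joints.shape
        x = self.embed(joints.reshape(B, T * J, D)) + self.pos
        x = self.encoder(x).mean(dim=1)              # pool spatio-temporal tokens
        direction = self.dir_head(x)
        direction = direction / (direction.norm(dim=-1, keepdim=True) + 1e-8)
        return self.cls_head(x), direction           # logits, unit direction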
Attribute-Aware Loss Function for Accurate Semantic Segmentation Considering the Pedestrian Orientations
Numerous applications such as autonomous driving, satellite imagery sensing, and biomedical imaging use computer vision as an important tool for perception tasks. Intelligent Transportation Systems (ITS) require precise recognition and localization of scenes in sensor data. Semantic segmentation is one of the computer vision methods intended for such tasks. However, existing semantic segmentation tasks label each pixel with a single object class. Recognizing object attributes, e.g., pedestrian orientation, would be more informative and support better scene understanding. Thus, we propose a method that performs semantic segmentation and pedestrian attribute recognition simultaneously. We introduce an attribute-aware loss function that can be applied to an arbitrary base model. Furthermore, a re-annotation of the existing Cityscapes dataset enriches the ground-truth labels with pedestrian orientation attributes. We implement the proposed method and compare the experimental results with those of other methods. The attribute-aware semantic segmentation outperforms baseline methods both in the traditional object segmentation task and in the expanded attribute detection task.
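To make the loss design concrete, below is a minimal sketch of an attribute-aware loss that adds an orientation term on pedestrian pixels to the usual per-pixel cross-entropy. The tensor shapes, the pedestrian label id, and the 0.5 weighting are assumptions for illustration, not the paper's exact formulation.

import torch
import torch.nn.functional as F

# Illustrative sketch: standard per-pixel class cross-entropy plus an
# orientation (attribute) term evaluated only on pedestrian pixels.
# pedestrian_id and attr_weight are assumed values.
def attribute_aware_loss(class_logits, attr_logits, class_gt, attr_gt,
                         pedestrian_id=11, attr_weight=0.5):
    # class_logits: (B, C, H, W); attr_logits: (B, A, H, W)
    # class_gt, attr_gt: (B, H, W) integer label maps
    seg_loss = F.cross_entropy(class_logits, class_gt)
    mask = class_gt == pedestrian_id             # supervise attributes only here
    if mask.any():
        attr_loss = F.cross_entropy(
            attr_logits.permute(0, 2, 3, 1)[mask],   # (N_pedestrian_pixels, A)
            attr_gt[mask])
        return seg_loss + attr_weight * attr_loss
    return seg_loss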
Predicting reliable H2 column density maps from molecular line data using machine learning
The total mass estimate of molecular clouds suffers from the uncertainty in the H2-CO conversion factor, the so-called X factor, which is used to convert the 12CO (1-0) integrated intensity to the H2 column density. We demonstrate machine learning's ability to predict the H2 column density from the 12CO, 13CO, and C18O (1-0) data set of four star-forming molecular clouds: Orion A, Orion B, Aquila, and M17. When the training is performed on a subset of each cloud, the overall distribution of the predicted column density is consistent with that of the Herschel column density. The predicted and observed total column densities are consistent within 10%, suggesting that the machine learning prediction provides a reasonable total mass estimate of each cloud. However, the distribution of the column density at high values, which correspond to the dense gas, could not be predicted well. This indicates that molecular line observations tracing the dense gas are required for the training. We also found a significant difference between the predicted and observed column density when the model was trained on data from different clouds. This highlights the presence of different X factors between the clouds, and further training on various clouds is required to correct for these variations. We also demonstrated that this method can predict the column density toward areas not observed by Herschel if the molecular line and column density maps are available for a small portion and the molecular line data are available for the larger areas. Comment: Accepted for publication in MNRAS
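The prediction setup lends itself to a compact sketch. The example below is a hedged illustration with synthetic stand-in data and a random forest regressor (the paper's actual model may differ): it maps per-pixel 12CO/13CO/C18O integrated intensities to an H2 column density estimate.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for per-pixel line intensities and column density;
# real inputs would come from the observed maps.
rng = np.random.default_rng(0)
X = rng.lognormal(size=(10_000, 3))          # [12CO, 13CO, C18O] intensities
y = (X @ np.array([1.0, 2.0, 5.0])) ** 0.8   # stand-in H2 column density

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = RandomForestRegressor(n_estimators=200).fit(X_train, y_train)
print("R^2 on held-out pixels:", model.score(X_test, y_test))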
Distance determination of molecular clouds in the 1st quadrant of the Galactic plane using deep learning: I. Method and Results
Machine learning has been successfully applied in various fields, but whether it is a viable tool for determining the distance to molecular clouds in the Galaxy is an open question. In the Galaxy, the kinematic distance is commonly employed as the distance to a molecular cloud. However, for the inner Galaxy, two different solutions, the "Near" solution and the "Far" solution, can be derived simultaneously. We attempted to construct a two-class ("Near" or "Far") inference model using a Convolutional Neural Network (CNN), a form of deep learning that can capture spatial features well. In this study, we used the CO dataset toward the 1st quadrant of the Galactic plane obtained with the Nobeyama 45-m radio telescope (l = 62-10 degrees, |b| < 1 degree). In the model, we applied the three-dimensional (position-position-velocity) distribution of the 12CO (J=1-0) emission as the main input. The dataset with "Near" or "Far" annotations was made from the HII region catalog of the infrared astronomy satellite WISE to train the model. As a result, we constructed a CNN model with a 76% accuracy rate on the training dataset. Using the model, we determined the distance to molecular clouds identified by the CLUMPFIND algorithm. We found that the mass of the molecular clouds at distances < 8.15 kpc identified in the 12CO data follows a power-law distribution with an index of about -2.3 in the mass range of M > 10^3 Msun. We also determined the detailed molecular gas distribution of the Galaxy as seen from the Galactic North Pole. Comment: 29 pages, 12 figures
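For concreteness, here is a minimal sketch of a two-class ("Near"/"Far") 3D CNN over a position-position-velocity cube; the cube size and layer widths are assumptions rather than the paper's architecture.

import torch
import torch.nn as nn

# Illustrative sketch: a small 3D CNN that classifies a PPV cube as
# "Near" or "Far". Layer widths and input size are assumed.
class NearFarCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Linear(32, 2)    # "Near" vs. "Far"

    def forward(self, cube):                  # cube: (B, 1, V, Y, X)
        return self.classifier(self.features(cube).flatten(1))

# Usage on a dummy 32x64x64 (velocity x lat x lon) cube:
logits = NearFarCNN()(torch.randn(2, 1, 32, 64, 64))
print(logits.shape)                           # torch.Size([2, 2])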
MVA2023 Small Object Detection Challenge for Spotting Birds: Dataset, Methods, and Results
Small Object Detection (SOD) is an important machine vision topic because (i)
a variety of real-world applications require object detection for distant
objects and (ii) SOD is a challenging task due to the noisy, blurred, and
less-informative image appearances of small objects. This paper proposes a new
SOD dataset consisting of 39,070 images including 137,121 bird instances, which
is called the Small Object Detection for Spotting Birds (SOD4SB) dataset. The
details of the challenge based on the SOD4SB dataset are introduced in this paper. In total, 223 participants joined the challenge. This paper briefly introduces the award-winning methods. The dataset, the baseline code, and the website for evaluation on the public test set are publicly available. Comment: This paper is included in the proceedings of the 18th International Conference on Machine Vision Applications (MVA2023). It will be officially published at a later date. Project page: https://www.mva-org.jp/mva2023/challeng
Background Image Estimation for Fixed Cameras in Dynamically Changing Environments
Doctoral dissertation (Doctor of Informatics), Kyoto University, Graduate School of Informatics, Department of Intelligence Science and Technology. Examination committee: Prof. Michihiko Minoh (chair), Prof. Takashi Matsuyama, Prof. Yuichi Nakamura.
Future Pose Prediction from 3D Human Skeleton Sequence with Surrounding Situation
Human pose prediction is vital for robot applications such as human–robot interaction and autonomous robot control. Recent prediction methods often use deep learning and predict future poses from a 3D human skeleton sequence. However, even when the starting motions of two skeleton sequences are very similar, their future poses can diverge widely, which makes it difficult to predict future poses from a skeleton sequence alone. Meanwhile, careful observation shows that human motion is often affected by objects or other people around the target person, so we consider the surrounding situation an important clue for prediction. This paper proposes a method for predicting the future skeleton sequence by incorporating the surrounding situation into the prediction model, using a feature of the image around the target person as the surrounding information. We confirmed the performance improvement of the proposed method through evaluations on publicly available datasets; prediction accuracy improved for object-related and human-related motions.
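One plausible way to wire the surrounding-image feature into a skeleton predictor is sketched below, assuming a GRU encoder-decoder and simple concatenation fusion; the paper's actual architecture may differ.

import torch
import torch.nn as nn

# Illustrative sketch: encode past poses, fuse with an image feature of
# the person's surroundings, and decode future poses. All module choices
# and sizes are assumptions.
class PosePredictor(nn.Module):
    def __init__(self, n_joints=17, img_feat_dim=512, hidden=256, horizon=10):
        super().__init__()
        self.horizon = horizon
        self.encoder = nn.GRU(n_joints * 3, hidden, batch_first=True)
        self.fuse = nn.Linear(hidden + img_feat_dim, hidden)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_joints * 3)

    def forward(self, skeletons, img_feat):
        # skeletons: (B, T, J*3) past poses; img_feat: (B, img_feat_dim)
        _, h = self.encoder(skeletons)                    # h: (1, B, hidden)
        ctx = torch.relu(self.fuse(torch.cat([h[-1], img_feat], dim=-1)))
        steps = ctx.unsqueeze(1).expand(-1, self.horizon, -1)
        dec, _ = self.decoder(steps)                      # fused context each step
        return self.out(dec)                              # (B, horizon, J*3)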
SDOF-Tracker: Fast and Accurate Multiple Human Tracking by Skipped-Detection and Optical-Flow
Multiple human tracking is a fundamental problem for scene understanding.
Although both accuracy and speed are required in real-world applications,
recent tracking methods based on deep learning have focused on accuracy and
require substantial running time. This study aims to improve running speed by
performing human detection at a certain frame interval because it accounts for
most of the running time. The question is how to maintain accuracy while
skipping human detection. In this paper, we propose a method that complements
the detection results with optical flow, based on the fact that a person's appearance changes little between adjacent frames. To maintain tracking accuracy, we introduce robust interest-point selection within human regions and a tracking termination metric calculated from the distribution of the
interest points. On the MOT20 dataset in the MOTChallenge, the proposed
SDOF-Tracker achieved the best performance in terms of the total running speed
while maintaining the MOTA metric. Our code is available at
https://github.com/hitottiez/sdof-tracker
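The skipped-detection idea itself can be sketched with standard OpenCV calls: run the detector every k-th frame and carry each person box with Lucas-Kanade optical flow in between. The interest-point selection and median-shift update below are simplifications of the paper's method, not its implementation.

import cv2
import numpy as np

def propagate_box(prev_gray, next_gray, box):
    """Shift box = (x, y, w, h) by the median optical flow inside it.
    prev_gray/next_gray are uint8 grayscale frames. Simplified sketch."""
    x, y, w, h = box
    mask = np.zeros_like(prev_gray)
    mask[y:y + h, x:x + w] = 255                 # select points inside the box
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=50,
                                  qualityLevel=0.01, minDistance=3, mask=mask)
    if pts is None:
        return box                               # candidate for track termination
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, pts, None)
    good = status.ravel() == 1
    if not good.any():
        return box
    dx, dy = np.median((nxt[good] - pts[good]).reshape(-1, 2), axis=0)
    return (int(x + dx), int(y + dy), w, h)      # box carried to the next frame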
Machine Learning-Based Interpretable Modeling for Subjective Emotional Dynamics Sensing Using Facial EMG
Understanding the association between subjective emotional experiences and physiological signals is of practical and theoretical significance. Previous psychophysiological studies have shown a linear relationship between dynamic emotional valence experiences and facial electromyography (EMG) activity. However, whether and how subjective emotional valence dynamics relate nonlinearly to facial EMG changes remains unknown. To investigate this issue, we re-analyzed data from two previous studies that measured dynamic valence ratings and facial EMG of the corrugator supercilii and zygomaticus major muscles from 50 participants who viewed emotional film clips. We employed multiple linear regression analyses and two nonlinear machine learning (ML) models: random forest and long short-term memory. In cross-validation, these ML models outperformed linear regression in terms of mean squared error and correlation coefficient. Interpretation of the random forest model using the SHapley Additive exPlanations (SHAP) tool revealed nonlinear and interactive associations between several EMG features and subjective valence dynamics. These findings suggest that nonlinear ML models can fit the relationship between subjective emotional valence dynamics and facial EMG better than conventional linear models, and they highlight a nonlinear and complex relationship. The findings encourage emotion sensing using facial EMG and offer insight into the subjective-physiological association.
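As an illustration of this kind of analysis, the sketch below fits a random forest from two synthetic EMG-style features to a nonlinearly generated valence signal and inspects it with SHAP; the data and features are stand-ins, not the study's.

import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-ins for EMG features and valence ratings; the real
# analysis would use features derived from corrugator/zygomaticus EMG.
rng = np.random.default_rng(0)
X = rng.normal(size=(2_000, 2))              # [corrugator, zygomaticus]
y = -0.8 * X[:, 0] + 0.5 * np.tanh(X[:, 1])  # nonlinear stand-in valence

model = RandomForestRegressor(n_estimators=200).fit(X, y)
shap_values = shap.TreeExplainer(model).shap_values(X)
print(shap_values.shape)                     # per-sample, per-feature effects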