4 research outputs found

    Self-Labeling Online Learning for Mobile Robot Grasping

    ํ•™์œ„๋…ผ๋ฌธ(์„์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต๋Œ€ํ•™์› : ์ž์—ฐ๊ณผํ•™๋Œ€ํ•™ ํ˜‘๋™๊ณผ์ • ๋‡Œ๊ณผํ•™์ „๊ณต, 2022.2. ์žฅ๋ณ‘ํƒ.๋™์ ์ธ ํ™˜๊ฒฝ์—์„œ ๋ฌผ์ฒด ํŒŒ์ง€๋ฅผ ์ •ํ™•ํ•˜๊ณ  ๊ฒฌ๊ณ ํ•˜๊ฒŒ ํ•˜๋Š” ๊ฒƒ์€ ๋ชจ๋ฐ”์ผ ์กฐ์ž‘ ๋กœ๋ด‡์ด ์„ฑ๊ณต์ ์œผ๋กœ ๊ณผ์—…์„ ์ˆ˜ํ–‰ํ•˜๋Š”๋ฐ ํ•„์ˆ˜์ ์ด๋‹ค. ๊ณผ๊ฑฐ ์•” ๋กœ๋ด‡์˜ ์กฐ์ž‘ ์—ฐ๊ตฌ์—์„  ํŒŒ์ง€ ์ธ์‹์„ ์œ„ํ•ด ์ด‰๊ฐ ์„ผ์„œ๋‚˜ ์‹œ๊ฐ ์„ผ์„œ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์ด๋ฅผ ํ•ด๊ฒฐํ•˜๊ณ ์ž ํ–ˆ๋‹ค. ํ•˜์ง€๋งŒ ์ด๋™ํ˜• ๋กœ๋ด‡์€ ๋ณ€ํ™”ํ•˜๋Š” ํ™˜๊ฒฝ์—์„œ ์›€์ง์ž„์œผ๋กœ ์ธํ•ด ๋…ธ์ด์ฆˆ๊ฐ€ ๋ฐœ์ƒํ•จ์„ ๊ณ ๋ คํ•ด์•ผ ํ•œ๋‹ค. ์ตœ๊ทผ ํŒŒ์ง€ ์ธ์‹ ์—ฐ๊ตฌ๋Š” ํ•™์Šต ๊ธฐ๋ฐ˜ ์•Œ๊ณ ๋ฆฌ์ฆ˜์— ์˜์กดํ•˜๊ณ  ์žˆ๋‹ค. ํ•™์Šต ๊ธฐ๋ฐ˜ ๋ฐฉ๋ฒ•์€ ๋ฐ์ดํ„ฐ๋ฅผ ์ˆ˜์ง‘ํ•˜๊ณ  ๋ผ๋ฒจ์„ ์ž…๋ ฅํ•˜๋Š”๋ฐ ๋งŽ์€ ์‹œ๊ฐ„๊ณผ ๋…ธ๋ ฅ์ด ํ•„์š”ํ•œ ์ œํ•œ์ด ์žˆ๋‹ค. ๋”ฐ๋ผ์„œ ๋ณธ ๋…ผ๋ฌธ์€ ๋กœ๋ด‡์˜ ํŒŒ์ง€์ธ์‹ํ•™์Šต์„ ์œ„ํ•ด, ์Šค์Šค๋กœ ๋ผ๋ฒจ๋ง์„ ์ˆ˜ํ–‰ํ•˜๋ฉฐ ์˜จ๋ผ์ธ ํ•™์Šตํ•˜๋Š” ๊ณผ์ •์„ ์ž๋™ํ™”ํ•˜๋Š” ์ข…๋‹จ๊ฐ„(end-to-end) ๋ฐฉ๋ฒ•์„ ์ œ์‹œํ•œ๋‹ค. ์…€ํ”„ ๋ผ๋ฒจ๋ง์€ ๋กœ๋ด‡์ด ๋ฌผ์ฒด๊ฐ€ ํŒŒ์ง€ ํ›„ ์‚ฌ๋ผ์กŒ๋Š”์ง€ ์—ฌ๋ถ€๋ฅผ ์นด๋ฉ”๋ผ๋ฅผ ํ†ตํ•œ ๋ฌผ์ฒด ์ธ์‹์œผ๋กœ ํ™•์ธํ•˜์—ฌ ์ˆ˜ํ–‰ํ•œ๋‹ค. ํŒŒ์ง€ ์ธ์‹์€ ๋ฉ€ํ‹ฐ๋ชจ๋‹ฌ ํŒŒ์ง€ ์ธ์‹ ๋„คํŠธ์›Œํฌ๋ฅผ ํ†ตํ•ด ํ•™์Šต๋˜๋ฉฐ, ์ด๋•Œ ์ž…๋ ฅ ๋ฐ์ดํ„ฐ๋Š” ์นด๋ฉ”๋ผ์™€ ๊ทธ๋ฆฌํผ์˜ ์—ฌ๋Ÿฌ ์„ผ์„œ๋ฅผ ํ†ตํ•ด ์–ป์€ ๋ฐ์ดํ„ฐ๋ฅผ ํ™œ์šฉํ•œ๋‹ค. ์ œ์•ˆํ•œ ๋ฐฉ๋ฒ•์„ ๊ฒ€์ฆํ•˜๊ธฐ ์œ„ํ•ด ์‹ค๋‚ด ๊ฑฐ์‹ค ํ™˜๊ฒฝ์—์„œ ์ •๋ฆฌ์ •๋ˆ ๊ณผ์—…์„ ์ˆ˜ํ–‰ํ•˜๋Š” ์‹คํ—˜์„ ์„ค๊ณ„ํ•˜์˜€๋‹ค. HSR ๋กœ๋ด‡์„ ํ™œ์šฉํ•ด 11๊ฐœ์˜ ๋ฌผ์ฒด๋ฅผ ์ •๋ฆฌ์ •๋ˆํ•˜๋Š” ๋‘๊ฐ€์ง€ ๋น„๊ต์‹คํ—˜์„ ์ง„ํ–‰ํ•˜์˜€๊ณ , ํŒŒ์ง€ ์ธ์‹ ๋„คํŠธ์›Œํฌ๋ฅผ ์‚ฌ์šฉํ•œ ์‹คํ—˜์ด ์‚ฌ์šฉํ•˜์ง€ ์•Š์€ ์‹คํ—˜๋Œ€๋น„ ํŒŒ์ง€ ์‹คํŒจ๊ฐ€ 3ํšŒ, 5ํšŒ ๋ฐœ์ƒํ–ˆ์„ ๋•Œ ๊ณผ์—… ์ˆ˜ํ–‰ ์‹œ๊ฐ„์—์„œ ๊ฐ๊ฐ 10.7%์™€ 14.7%์˜ ํ–ฅ์ƒ์„ ๋ณด์—ฌ ์ œ์•ˆํ•œ ๋ฐฉ๋ฒ•์˜ ํšจ์œจ์„ฑ์„ ์ž…์ฆํ•˜์˜€๋‹ค.In this paper, we proposed a new grasp perception method for mobile manipulation robot that utilizes both self-labeling and online learning. Self-labeling is implemented by using object detection as supervision, and online learning was achieved by training the model with a randomly sampled batch from a queue-based memory. For robust perception, the GPN model is trained by processing four types of sensory data, and shows high accuracy in performance with various objects. To demonstrate our self-labeling online learning framework, we designed a pick-and-place experiment in a real-world environment with everyday objects. We verified the effectiveness of the GPN by a comparative experiment that measured the task performance by comparing time within two demos: one using the GPN-trained model, and the other using a simple logical method. 
As a result, it was confirmed that using the GPN does contribute in saving time for picking and placing the objects, especially if more failures occur, or the time spent in delivering the objects increases.์ œ 1 ์žฅ ์„œ ๋ก  1 ์ œ 1 ์ ˆ ํŒŒ์ง€ ์ธ์‹ ์—ฐ๊ตฌ์˜ ํ•„์š”์„ฑ ๋ฐ ์—ฐ๊ตฌ ๋™ํ–ฅ 1 ์ œ 2 ์ ˆ ๋ฐ์ดํ„ฐ ๋ผ๋ฒจ๋ง์˜ ์ž๋™ํ™” ํ•„์š”์„ฑ ๋ฐ ๋ฐฉ์•ˆ ์ œ์‹œ 3 ์ œ 3 ์ ˆ ์—ฐ๊ตฌ์˜ ๋‚ด์šฉ 4 ์ œ 2 ์žฅ ๋ฐฐ๊ฒฝ ์—ฐ๊ตฌ 6 ์ œ 1 ์ ˆ ๋ฌผ์ฒด ํŒŒ์ง€ ์ธ์‹ 6 ์ œ 2 ์ ˆ ์˜จ๋ผ์ธ ํ•™์Šต 6 ์ œ 3 ์ ˆ ์ž๊ธฐ์ง€๋„ ํ•™์Šต๊ณผ ์…€ํ”„ ๋ผ๋ฒจ๋ง 7 ์ œ 3 ์žฅ ์…€ํ”„ ๋ผ๋ฒจ๋ง์„ ํ†ตํ•œ ํŒŒ์ง€ ์ธ์‹ ์˜จ๋ผ์ธ ํ•™์Šต 8 ์ œ 1 ์ ˆ ๋กœ๋ด‡์„ ์ด์šฉํ•œ ์…€ํ”„ ๋ผ๋ฒจ๋ง 8 ์ œ 2 ์ ˆ ๋ฉ”๋ชจ๋ฆฌ ๊ธฐ๋ฐ˜ ์˜จ๋ผ์ธ ํ•™์Šต 11 ์ œ 3 ์ ˆ ํŒŒ์ง€ ์ธ์‹ ๋„คํŠธ์›Œํฌ 12 ์ œ 4 ์žฅ ์‹คํ—˜ ์„ค์ • 13 ์ œ 1 ์ ˆ ๋กœ๋ด‡ ํ”Œ๋žซํผ 13 ์ œ 2 ์ ˆ ํŒŒ์ง€ ์ž‘์—…์„ ์œ„ํ•œ ๋ฌผ์ฒด ๋ชฉ๋ก 15 ์ œ 3 ์ ˆ RGB-D ์นด๋ฉ”๋ผ ๊ธฐ๋ฐ˜ ๊ฑฐ๋ฆฌ ๊ณ„์‚ฐ 18 ์ œ 4 ์ ˆ ์‹คํ—˜ ๋ฐฉ๋ฒ• 20 ์ œ 5 ์žฅ ์‹คํ—˜ ๊ฒฐ๊ณผ 22 ์ œ 1 ์ ˆ ์˜จ๋ผ์ธ ํ•™์Šต์„ ํ†ตํ•œ ํŒŒ์ง€ ์ธ์‹ ๋„คํŠธ์›Œํฌ ํ•™์Šต 22 ์ œ 2 ์ ˆ ํŒŒ์ง€ ์ธ์‹ ๋„คํŠธ์›Œํฌ ์‚ฌ์šฉ์— ๋”ฐ๋ฅธ ์„ฑ๋Šฅ ๋น„๊ต 25 ์ œ 6 ์žฅ ๊ณ ์ฐฐ ๋ฐ ๊ฒฐ๋ก  28 ์ œ 1 ์ ˆ ์—ฐ๊ตฌ์˜ ์ •๋ฆฌ 28 ์ œ 2 ์ ˆ ์—ฐ๊ตฌ์˜ ๊ณ ์ฐฐ 29 ์ฐธ๊ณ ๋ฌธํ—Œ 31 Abstract 38์„
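    The abstract above describes two mechanisms: self-labeling of grasp outcomes from camera-based object detection, and online learning on batches randomly sampled from a queue-based memory. The Python sketch below is only an illustration of how such a loop could be wired together under those assumptions; the names (GraspPerceptionNet, self_label, online_update) and all constants are hypothetical, not code from the thesis.

        # Hypothetical sketch of the self-labeling + queue-based online learning
        # loop described in the abstract; names and constants are illustrative.
        import random
        from collections import deque

        MEMORY_SIZE = 512   # assumed capacity of the queue-based memory
        BATCH_SIZE = 16     # assumed size of each randomly sampled batch

        memory = deque(maxlen=MEMORY_SIZE)  # FIFO queue: oldest samples are evicted


        class GraspPerceptionNet:
            """Stand-in for the multimodal grasp perception network (GPN); the real
            model consumes several sensory modalities (camera and gripper sensors)."""

            def train_on_batch(self, batch):
                features, labels = zip(*batch)
                # ... a gradient update on (features, labels) would go here ...


        def self_label(visible_before: bool, visible_after: bool) -> int:
            """Label a grasp attempt from object detection alone: if the object was
            visible before the grasp and is gone afterwards, the grasp is assumed
            successful (1); otherwise it is labeled as failed (0)."""
            return int(visible_before and not visible_after)


        def online_update(model, grasp_features, label):
            """Store the newly self-labeled sample, then train on a random batch."""
            memory.append((grasp_features, label))
            if len(memory) >= BATCH_SIZE:
                batch = random.sample(list(memory), BATCH_SIZE)
                model.train_on_batch(batch)


        # Example of a single grasp attempt (illustrative sensor values only):
        model = GraspPerceptionNet()
        features = {"gripper_force": 0.8, "gripper_width": 0.03}
        label = self_label(visible_before=True, visible_after=False)
        online_update(model, features, label)

    Sampling a random batch from the FIFO memory, rather than training on each new sample in isolation, reduces the temporal correlation between consecutive grasp attempts, while the bounded queue keeps training focused on recent experience as the environment changes.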

    Starting engagement detection towards a companion robot using multimodal features

    Recognition of intentions is a subconscious cognitive process vital to human communication. This skill enables anticipation and increases the quality of interactions between humans. In the context of engagement, non-verbal signals are used to communicate the intention of starting an interaction with a partner. In this paper, we investigate methods to detect these signals so that a robot can know when it is about to be addressed. The originality of our approach lies in taking inspiration from the social and cognitive sciences to perform this perception task. We investigate meaningful, i.e. human-readable, features and determine which of them are important for recognizing someone's intention to start an interaction. Classically, spatial information such as the person's position and speed and the human-robot distance is used to detect engagement. Our approach integrates multimodal features gathered with a companion robot equipped with a Kinect. Evaluation on our corpus, collected in spontaneous conditions, highlights the robustness of the approach and validates its use in a real environment. Experimental validation shows that the multimodal feature set gives better precision and recall than spatial and speed features alone. We also demonstrate that 7 selected features are sufficient to provide a good starting-engagement detection score. In a final investigation, we show that, for our full set of 99 features, feature-space reduction is not a solved task. This result opens new research perspectives on multimodal engagement detection.
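    The abstract reports that 7 selected features out of 99 are sufficient for good starting-engagement detection. The sketch below is not the paper's pipeline; it merely illustrates, with scikit-learn stand-ins (SelectKBest for univariate feature selection and a random forest classifier) and synthetic placeholder data, how a reduced feature set can be compared against the full 99-feature set on precision and recall.

        # Illustrative comparison of a reduced multimodal feature set (7 of 99)
        # against the full set; the data, selector and classifier are stand-ins,
        # not the paper's actual features or model.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.feature_selection import SelectKBest, f_classif
        from sklearn.model_selection import cross_validate
        from sklearn.pipeline import make_pipeline

        rng = np.random.default_rng(0)
        X = rng.normal(size=(300, 99))    # placeholder for the 99 multimodal features
        y = rng.integers(0, 2, size=300)  # placeholder engagement labels (0 / 1)

        # Pipeline A: keep only the 7 most informative features (univariate F-test).
        reduced = make_pipeline(SelectKBest(f_classif, k=7),
                                RandomForestClassifier(random_state=0))
        # Pipeline B: train on all 99 features.
        full = make_pipeline(RandomForestClassifier(random_state=0))

        for name, pipe in [("7 selected features", reduced), ("all 99 features", full)]:
            scores = cross_validate(pipe, X, y, cv=5, scoring=("precision", "recall"))
            print(name,
                  "precision=%.2f" % scores["test_precision"].mean(),
                  "recall=%.2f" % scores["test_recall"].mean())

    With real engagement data in place of the synthetic arrays, the same structure could be used to examine the precision/recall trade-off the abstract discusses between the selected and full feature sets.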