Gait recognition and understanding based on hierarchical temporal memory using 3D gait semantic folding
Gait recognition and understanding systems have shown wide-ranging application prospects. However, their reliance on unstructured image and video data has limited their performance, e.g., they are easily affected by multiple views, occlusion, clothing, and object-carrying conditions. This paper addresses these problems using realistic 3-dimensional (3D) human structural data and a sequential pattern learning framework with a top-down attention modulating mechanism based on Hierarchical Temporal Memory (HTM). First, an accurate method for estimating 3D human body pose and shape semantic parameters from 2-dimensional (2D) data is proposed, which exploits the advantages of an instance-level body parsing model and a virtual dressing method. Second, by using gait semantic folding, the estimated body parameters are encoded as a sparse 2D matrix to construct the structural gait semantic image. In order to achieve time-based gait recognition, an HTM Network is constructed to obtain the sequence-level gait sparse distribution representations (SL-GSDRs). A top-down attention mechanism is introduced to deal with various conditions, including multiple views, by refining the SL-GSDRs according to prior knowledge. The proposed gait learning model not only helps gait recognition tasks overcome the difficulties of real application scenarios but also provides structured gait semantic images for visual cognition. Experimental analyses on the CMU MoBo, CASIA B, TUM-IITKGP, and KY4D datasets show a significant performance gain in terms of accuracy and robustness.
TrackletMapper: Ground Surface Segmentation and Mapping from Traffic Participant Trajectories
Robustly classifying ground infrastructure such as roads and street crossings
is an essential task for mobile robots operating alongside pedestrians. While
many semantic segmentation datasets are available for autonomous vehicles,
models trained on such datasets exhibit a large domain gap when deployed on
robots operating in pedestrian spaces. Manually annotating images recorded from
pedestrian viewpoints is both expensive and time-consuming. To overcome this
challenge, we propose TrackletMapper, a framework for annotating ground surface
types such as sidewalks, roads, and street crossings from object tracklets
without requiring human-annotated data. To this end, we project the robot
ego-trajectory and the paths of other traffic participants into the ego-view
camera images, creating sparse semantic annotations for multiple types of
ground surfaces from which a ground segmentation model can be trained. We
further show that the model can be self-distilled for additional performance
benefits by aggregating a ground surface map and projecting it into the camera
images, creating a denser set of training annotations compared to the sparse
tracklet annotations. We qualitatively and quantitatively validate our findings
on a novel large-scale dataset for mobile robots operating in pedestrian areas.
Code and dataset will be made available at
http://trackletmapper.cs.uni-freiburg.de.
Comment: 19 pages, 14 figures, CoRL 2022 v
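The projection step described in this abstract (trajectory points projected into ego-view camera images to create sparse labels) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an ideal pinhole camera without lens distortion, and the function names, the class ids, and the choice of which trajectory maps to which surface class are all hypothetical.

```python
# Minimal sketch, assuming an ideal pinhole camera (no distortion).
# All names and class ids here are hypothetical, not taken from TrackletMapper.

IGNORE = 255  # label for pixels that received no annotation


def project_point(p_world, R, t, fx, fy, cx, cy):
    """Project one 3D world point to pixel coordinates, or None if behind the camera.

    R (3x3 nested lists) and t (length 3) map world to camera coordinates:
    X_cam = R * X_world + t.
    """
    xc = [sum(R[i][j] * p_world[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 1e-6:  # point is behind or on the image plane
        return None
    u = fx * xc[0] / xc[2] + cx
    v = fy * xc[1] / xc[2] + cy
    return u, v


def sparse_label_image(tracks, R, t, intrinsics, width, height):
    """Rasterize labelled trajectories into a sparse label image.

    tracks: list of (class_id, [3D points]) pairs; the class assigned to each
    trajectory is chosen by the caller (an assumption of this sketch).
    """
    fx, fy, cx, cy = intrinsics
    labels = [[IGNORE] * width for _ in range(height)]
    for class_id, points in tracks:
        for p in points:
            uv = project_point(p, R, t, fx, fy, cx, cy)
            if uv is None:
                continue
            col, row = int(uv[0]), int(uv[1])
            if 0 <= col < width and 0 <= row < height:
                labels[row][col] = class_id
    return labels


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    tracks = [(1, [(0.0, 0.0, 2.0)]), (2, [(0.0, 0.0, -1.0), (10.0, 0.0, 1.0)])]
    img = sparse_label_image(tracks, identity, [0, 0, 0], (100.0, 100.0, 50.0, 40.0), 100, 80)
    print(img[40][50])
```

Pixels that no trajectory touches keep the `IGNORE` value, so a segmentation loss could skip them during training; only the sparse projected points carry a class.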
Supporting Story Synthesis: Bridging the Gap between Visual Analytics and Storytelling
Visual analytics usually deals with complex data and uses sophisticated algorithmic, visual, and interactive techniques. Findings of the analysis often need to be communicated to an audience that lacks visual analytics expertise. This requires analysis outcomes to be presented in simpler ways than those typically used in visual analytics systems. However, not only may the analytical visualizations be too complex for the target audience, but so may the information that needs to be presented. Hence, there exists a gap on the path from obtaining analysis findings to communicating them, which involves two aspects: information complexity and display complexity. We propose a general framework where data analysis and result presentation are linked by story synthesis, in which the analyst creates and organizes story contents. Unlike previous research, where analytic findings are represented by stored display states, we treat findings as data constructs. In story synthesis, findings are selected, assembled, and arranged in views using meaningful layouts that take into account the structure of the information and the inherent properties of its components. We propose a workflow for applying the proposed framework in designing visual analytics systems and demonstrate the generality of the approach by applying it to two domains: social media and movement analysis.