Rethinking pose estimation in crowds: overcoming the detection information-bottleneck and ambiguity
Frequent interactions between individuals are a fundamental challenge for
pose estimation algorithms. Current pipelines either use an object detector
together with a pose estimator (top-down approach), or localize all body parts
first and then link them to predict the pose of individuals (bottom-up). Yet,
when individuals closely interact, top-down methods are ill-defined due to
overlapping individuals, and bottom-up methods often falsely infer connections
to distant body parts. Thus, we propose a novel pipeline called bottom-up
conditioned top-down pose estimation (BUCTD) that combines the strengths of
bottom-up and top-down methods. Specifically, we propose to use a bottom-up
model as the detector, which in addition to an estimated bounding box provides
a pose proposal that is fed as condition to an attention-based top-down model.
We demonstrate the performance and efficiency of our approach on animal and
human pose estimation benchmarks. On CrowdPose and OCHuman, we outperform
previous state-of-the-art models by a significant margin. We achieve 78.5 AP on
CrowdPose and 48.5 AP on OCHuman, an improvement of 8.6% and 7.8% over the
prior art, respectively. Furthermore, we show that our method strongly improves
the performance on multi-animal benchmarks involving fish and monkeys. The code
is available at https://github.com/amathislab/BUCTD
Comment: Published at ICCV 2023; Code at https://github.com/amathislab/BUCTD;
Video at https://www.youtube.com/watch?v=BHZnA-CZeZ
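The conditioning idea at the heart of BUCTD can be sketched in a few lines. This is not the paper's attention-based architecture, just an illustration of the input side: the bottom-up pose proposal is rendered as per-keypoint Gaussian heatmaps and stacked with the image crop as extra channels, so the top-down stage knows which individual it should estimate. All shapes and function names here are illustrative.

```python
import numpy as np

def pose_to_heatmaps(pose, hw, sigma=2.0):
    """Render (K, 2) keypoint proposals as K Gaussian heatmaps (the 'condition')."""
    h, w = hw
    ys, xs = np.mgrid[0:h, 0:w]
    maps = np.zeros((len(pose), h, w), dtype=np.float32)
    for k, (x, y) in enumerate(pose):
        maps[k] = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
    return maps

def conditioned_input(image_crop, bottom_up_pose):
    """Stack the image crop with the proposal heatmaps along the channel axis,
    so the top-down model sees which individual to estimate."""
    cond = pose_to_heatmaps(bottom_up_pose, image_crop.shape[1:])
    return np.concatenate([image_crop, cond], axis=0)

# toy example: a 3-channel 64x64 crop with a 17-keypoint COCO-style proposal
crop = np.random.rand(3, 64, 64).astype(np.float32)
proposal = np.random.rand(17, 2) * 64
x = conditioned_input(crop, proposal)
print(x.shape)  # (20, 64, 64)
```

In a crowd, the same crop can be fed several times with different pose proposals, yielding one prediction per individual rather than one ambiguous prediction per box.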
Learnable latent embeddings for joint behavioral and neural analysis
Mapping behavioral actions to neural activity is a fundamental goal of
neuroscience. As our ability to record large neural and behavioral data
increases, there is growing interest in modeling neural dynamics during
adaptive behaviors to probe neural representations. In particular, neural
latent embeddings can reveal underlying correlates of behavior, yet, we lack
non-linear techniques that can explicitly and flexibly leverage joint behavior
and neural data. Here, we fill this gap with a novel method, CEBRA, that
jointly uses behavioral and neural data in a hypothesis- or discovery-driven
manner to produce consistent, high-performance latent spaces. We validate its
accuracy and demonstrate our tool's utility for both calcium and
electrophysiology datasets, across sensory and motor tasks, and in simple or
complex behaviors across species. It allows for single and multi-session
datasets to be leveraged for hypothesis testing or can be used label-free.
Lastly, we show that CEBRA can be used for the mapping of space, uncovering
complex kinematic features, and rapid, high-accuracy decoding of natural movies
from visual cortex.
Comment: Website: cebra.a
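CEBRA itself learns a non-linear embedding; as a rough illustration of the underlying principle, here is a generic InfoNCE-style contrastive objective where the positive sample is chosen by behavioral similarity (the "hypothesis-driven" use of labels). This is a toy sketch with made-up data, not CEBRA's actual implementation.

```python
import numpy as np

def infonce_loss(anchor, positive, negatives, temperature=1.0):
    """InfoNCE-style contrastive loss: pull the positive embedding toward the
    anchor, push negatives away (similarity = dot product)."""
    pos = anchor @ positive / temperature
    negs = negatives @ anchor / temperature
    logits = np.concatenate([[pos], negs])
    # negative log-softmax assigned to the positive pair
    return -pos + np.log(np.sum(np.exp(logits)))

rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 8))       # toy neural embeddings
behavior = rng.normal(size=(100, 1))  # toy behavioral variable

i = 0
# positive: the sample whose behavior label is closest to the anchor's
dist = np.abs(behavior - behavior[i]).ravel()
dist[i] = np.inf
j = int(np.argmin(dist))
neg_idx = rng.choice([k for k in range(100) if k not in (i, j)],
                     size=10, replace=False)
loss = infonce_loss(emb[i], emb[j], emb[neg_idx])
print(float(loss) > 0)
```

Minimizing such a loss over a learned encoder makes embeddings of behaviorally similar moments cluster together; drawing positives by temporal proximity instead of labels gives the label-free, discovery-driven mode.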
Perspectives in machine learning for wildlife conservation
Data acquisition in animal ecology is rapidly accelerating due to inexpensive
and accessible sensors such as smartphones, drones, satellites, audio recorders
and bio-logging devices. These new technologies and the data they generate hold
great potential for large-scale environmental monitoring and understanding, but
are limited by current data processing approaches which are inefficient in how
they ingest, digest, and distill data into relevant information. We argue that
machine learning, and especially deep learning approaches, can meet this
analytic challenge to enhance our understanding, monitoring capacity, and
conservation of wildlife species. Incorporating machine learning into
ecological workflows could improve inputs for population and behavior models
and eventually lead to integrated hybrid modeling tools, with ecological models
acting as constraints for machine learning models and the latter providing
data-supported insights. In essence, by combining new machine learning
approaches with ecological domain knowledge, animal ecologists can capitalize
on the abundance of data generated by modern sensor technologies in order to
reliably estimate population abundances, study animal behavior and mitigate
human/wildlife conflicts. To succeed, this approach will require close
collaboration and cross-disciplinary education between the computer science and
animal ecology communities in order to ensure the quality of machine learning
approaches and train a new generation of data scientists in ecology and
conservation.
A new spin on fidgets
We express decisions through movements, but not all movements matter to the outcome. For example, fidgeting is a common yet "nonessential" behavior we exhibit. New evidence suggests that this non-task-related movement profoundly shapes neural activity in expert mice performing tasks.
Somatosensory Cortex Plays an Essential Role in Forelimb Motor Adaptation in Mice
Our motor outputs are constantly re-calibrated to adapt to systematic perturbations. This motor adaptation is thought to depend on the ability to form a memory of a systematic perturbation, often called an internal model. However, the mechanisms underlying the formation, storage, and expression of such models remain unknown. Here, we developed a mouse model to study forelimb adaptation to force field perturbations. We found that temporally precise photoinhibition of somatosensory cortex (S1) applied concurrently with the force field abolished the ability to update subsequent motor commands needed to reduce motor errors. This S1 photoinhibition did not impair basic motor patterns, post-perturbation completion of the action, or performance in a reward-based learning task. Moreover, S1 photoinhibition after partial adaptation blocked further adaptation, but did not affect the expression of already-adapted motor commands. Thus, S1 is critically involved in updating the memory about the perturbation that is essential for forelimb motor adaptation.
Panoptic animal pose estimators are zero-shot performers
Animal pose estimation is critical in applications ranging from life science
research, agriculture, to veterinary medicine. Compared to human pose
estimation, the performance of animal pose estimation is limited by the size of
available datasets and the generalization of a model across datasets. Typically,
different keypoints are labeled across datasets, regardless of whether the
species are the same, leaving animal pose datasets with disjoint or partially
overlapping keypoint sets. As a consequence, a model cannot be used as a
plug-and-play solution
across datasets. This reality motivates us to develop panoptic animal pose
estimation models that are able to predict keypoints defined in all datasets.
In this work we propose a simple yet effective way to merge differentially
labeled datasets to obtain the largest quadruped and lab mouse pose dataset.
Using a gradient-masking technique, the so-called SuperAnimal models are able to
predict keypoints that are distributed across datasets and exhibit strong
zero-shot performance. The models can be further improved by (pseudo) labeled
fine-tuning. These models outperform ImageNet-initialized models.
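The gradient-masking idea can be sketched as follows: each training sample carries a mask marking which keypoints of the merged (union) keypoint set its source dataset actually annotates, and only those keypoints contribute to the loss, so gradients for unannotated keypoints are zeroed. A minimal numpy sketch with illustrative shapes (the actual SuperAnimal training loop is more involved):

```python
import numpy as np

def masked_keypoint_loss(pred, target, labeled_mask):
    """MSE over the union keypoint set, but only keypoints annotated in the
    sample's source dataset contribute; the rest are masked to zero loss."""
    per_kpt = np.mean((pred - target) ** 2, axis=-1)  # (K,) error per keypoint
    masked = per_kpt * labeled_mask                   # unlabeled keypoints -> 0
    return masked.sum() / max(labeled_mask.sum(), 1)

# union of two toy datasets: dataset A labels keypoints 0-3, dataset B labels 2-5
K = 6
pred = np.zeros((K, 2))
target = np.ones((K, 2))
mask_A = np.array([1, 1, 1, 1, 0, 0], dtype=float)
loss_A = masked_keypoint_loss(pred, target, mask_A)
print(loss_A)  # 1.0 (only the four labeled keypoints are penalized)
```

Because unlabeled keypoints never produce a gradient, a single model can be trained on the merged dataset and still learn every keypoint that appears in at least one source dataset.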
Contrasting action and posture coding with hierarchical deep neural network models of proprioception
Biological motor control is versatile, efficient, and depends on proprioceptive feedback. Muscles are flexible and undergo continuous changes, requiring distributed adaptive control mechanisms that continuously account for the body's state. The canonical role of proprioception is representing the body state. We hypothesize that the proprioceptive system could also be critical for high-level tasks such as action recognition. To test this theory, we pursued a task-driven modeling approach, which allowed us to isolate the study of proprioception. We generated a large synthetic dataset of human arm trajectories tracing characters of the Latin alphabet in 3D space, together with muscle activities obtained from a musculoskeletal model and model-based muscle spindle activity. Next, we compared two classes of tasks: trajectory decoding and action recognition, which allowed us to train hierarchical models to decode either the position and velocity of the end-effector of one's posture or the character (action) identity from the spindle firing patterns. We found that artificial neural networks could robustly solve both tasks, and the networks' units show tuning properties similar to neurons in the primate somatosensory cortex and the brainstem. Remarkably, we found uniformly distributed directional selective units only with the action-recognition-trained models and not the trajectory-decoding-trained models. This suggests that proprioceptive encoding is additionally associated with higher-level functions such as action recognition and therefore provides new, experimentally testable hypotheses of how proprioception aids in adaptive motor control.
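The two task classes being compared can be illustrated with a toy example: the same simulated "spindle" features feed either a continuous trajectory readout or a character classifier. Everything here is a stand-in with made-up dimensions (the paper uses musculoskeletal-model spindle activity and deep hierarchical networks, not linear readouts on random data):

```python
import numpy as np

rng = np.random.default_rng(2)

# toy stand-in for muscle-spindle firing: trials x timesteps x spindles
# (hypothetical dimensions; the paper derives these from a musculoskeletal model)
n_trials, T, S = 300, 20, 25
spindles = rng.normal(size=(n_trials, T, S))
X = spindles.reshape(n_trials, -1)  # flatten time for a simple linear readout

# task 1: trajectory decoding -> regress a continuous end-effector position
positions = rng.normal(size=(n_trials, 2))
W_traj, *_ = np.linalg.lstsq(X, positions, rcond=None)

# task 2: action recognition -> classify which character was traced
labels = rng.integers(0, 26, size=n_trials)  # 26 Latin characters
onehot = np.eye(26)[labels]
W_char, *_ = np.linalg.lstsq(X, onehot, rcond=None)

pred_chars = np.argmax(X @ W_char, axis=1)
print((pred_chars == labels).mean())  # training accuracy of the toy readout
```

The paper's point is that networks trained on these two objectives develop measurably different unit tuning, which is what makes the task comparison informative about proprioceptive coding.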
Task-driven hierarchical deep neural network models of the proprioceptive pathway
Biological motor control is versatile and efficient. Muscles are flexible and undergo continuous changes, requiring distributed adaptive control mechanisms. How proprioception solves this problem in the brain is unknown. Here we pursue a task-driven modeling approach that has provided important insights into other sensory systems. However, unlike for vision and audition, where large annotated datasets of raw images or sound are readily available, data of relevant proprioceptive stimuli are not. We generated a large-scale dataset of human arm trajectories as the hand traces the alphabet in 3D space, and then, using a musculoskeletal model, derived the spindle firing rates during these movements. We propose an action recognition task that allows training of hierarchical models to classify the character identity from the spindle firing patterns. Artificial neural networks could robustly solve this task, and the networks' units show directional movement tuning akin to neurons in the primate somatosensory cortex. The same architectures with random weights also show similar kinematic feature tuning but do not reproduce the diversity of preferred directional tuning, nor do they have invariant tuning across 3D space. Taken together, our model is the first to link tuning properties in the proprioceptive system to the behavioral level.
DeepLabCut: markerless pose estimation of user-defined body parts with deep learning
Quantifying behavior is crucial for many applications in neuroscience. Videography provides easy methods for the observation and recording of animal behavior in diverse settings, yet extracting particular aspects of a behavior for further analysis can be highly time consuming. In motor control studies, humans or other animals are often marked with reflective markers to assist with computer-based tracking, but markers are intrusive, and the number and location of the markers must be determined a priori. Here we present an efficient method for markerless pose estimation based on transfer learning with deep neural networks that achieves excellent results with minimal training data. We demonstrate the versatility of this framework by tracking various body parts in multiple species across a broad collection of behaviors. Remarkably, even when only a small number of frames are labeled (~200), the algorithm achieves excellent tracking performance on test frames that is comparable to human accuracy.
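The transfer-learning recipe behind this data efficiency is: reuse a feature extractor pretrained on a large dataset and fit only a lightweight readout on the small labeled set. A toy numpy illustration, with a frozen random projection standing in for a pretrained backbone (all shapes and the least-squares readout are illustrative, not DeepLabCut's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(1)

# "pretrained backbone": a frozen feature extractor (stand-in for deep features)
W_backbone = rng.normal(size=(256, 64))

def features(frames):
    """Frozen ReLU projection; its weights are never updated."""
    return np.maximum(frames @ W_backbone, 0.0)

# small labeled set (on the order of ~200 frames in the paper; toy data here)
frames = rng.normal(size=(200, 256))
keypoints = rng.normal(size=(200, 2))  # one (x, y) body part per frame

# transfer learning = fit only a lightweight readout on top of frozen features
F = features(frames)
W_head, *_ = np.linalg.lstsq(F, keypoints, rcond=None)

pred = features(frames) @ W_head
print(pred.shape)  # (200, 2)
```

Because only the small readout is estimated from the labeled frames, far fewer annotations are needed than when training a deep network from scratch.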