A review of deep learning algorithms for computer vision systems in livestock.
In livestock operations, systematically monitoring animal body weight, biometric body measurements, animal behavior, feed bunk, and other difficult-to-measure phenotypes is manually unfeasible due to labor, costs, and animal stress. Applications of computer vision are growing in importance in livestock systems due to their ability to generate real-time, non-invasive, and accurate animal-level information. However, the development of a computer vision system requires sophisticated statistical and computational approaches for efficient data management and appropriate data mining, as it involves massive datasets. This article aims to provide an overview of how deep learning has been implemented in computer vision systems used in livestock, and how such implementation can be an effective tool to predict animal phenotypes and to accelerate the development of predictive modeling for precise management decisions. First, we reviewed the most recent milestones achieved with computer vision systems and their respective deep learning algorithms implemented in Animal Science studies. Second, we reviewed the published research studies in Animal Science that used deep learning algorithms as the primary analytical strategy for image classification, object detection, object segmentation, and feature extraction. The large number of reviewed articles published in the last few years demonstrates the high interest in and rapid development of deep learning algorithms in computer vision systems across livestock species. Deep learning algorithms for computer vision systems, such as Mask R-CNN, Faster R-CNN, YOLO (v3 and v4), DeepLab v3, U-Net, and others, have been used in Animal Science research studies. Additionally, network architectures such as ResNet, Inception, Xception, and VGG16 have been implemented in several studies across livestock species.
The strong performance of these deep learning algorithms suggests improved predictive ability in livestock applications and faster inference. However, only a few articles fully described the deep learning algorithms and their implementation. Thus, information regarding hyperparameter tuning, pre-trained weights, deep learning backbones, and hierarchical data structure was missing. We summarized peer-reviewed articles by computer vision task (image classification, object detection, and object segmentation), deep learning algorithm, species, and phenotype, including animal identification and behavior, feed intake, animal body weight, and many others. Understanding the principles of computer vision and the algorithms used for each application is crucial to develop efficient systems in livestock operations. Such development will potentially have a major impact on the livestock industry by predicting real-time and accurate phenotypes, which could be used in the future to improve farm management decisions, breeding programs through high-throughput phenotyping, and optimized data-driven interventions.
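Most of the object-detection and segmentation studies surveyed above score predictions by intersection-over-union (IoU) against ground-truth annotations. A minimal sketch of that metric, assuming axis-aligned `(x1, y1, x2, y2)` boxes and hypothetical coordinates:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)  # overlap area, 0 if disjoint
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

# Two hypothetical boxes around the same animal, half-overlapping:
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # → 0.333...
```

A detection typically counts as correct when its IoU with a ground-truth box exceeds a threshold such as 0.5.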
Automatic Monitoring of dairy cows’ lying behaviour using a computer vision system in open barns
Received: January 31st, 2023; Accepted: April 9th, 2023; Published: April 27th, 2023; Correspondence: [email protected]
Precision Livestock Farming offers opportunities for automated, continuous monitoring of animals, their productivity, welfare and health. The video-based assessment of animal behaviour is an automated, non-invasive and promising application. The aim of this study is to identify possible parameters in dairy cows' lying behaviour that form the basis for a holistic computer vision-based system to assess animal health and welfare. Based on expert interviews and a literature review, we define parameters and their optima in the form of gold standards to evaluate lying behaviour automatically. These include quantitative parameters such as daily lying time, lying period length and lying period frequency, and qualitative parameters such as extension of the front and hind legs, standing in the lying cubicles, or total lateral position. Lying behaviour serves as an example within the research context for the development of a computer vision-based tool for automated detection of animal behaviour and appropriate housing design.
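Once a vision system labels each video interval with a posture, the quantitative parameters named above (daily lying time, lying period length and frequency) reduce to simple run-length statistics over the label sequence. A hypothetical sketch, assuming one posture label per fixed 60-second interval:

```python
def lying_statistics(posture, interval_s=60):
    """Daily lying time, lying period count and mean period length
    from a per-interval posture sequence ('lying' / 'standing' / ...)."""
    bouts = []       # lengths (in intervals) of each lying period
    current = 0
    for p in posture:
        if p == "lying":
            current += 1
        elif current:
            bouts.append(current)
            current = 0
    if current:      # sequence ended while the cow was still lying
        bouts.append(current)
    total_s = sum(bouts) * interval_s
    return {
        "lying_time_h": total_s / 3600,
        "bout_count": len(bouts),
        "mean_bout_min": (sum(bouts) / len(bouts)) * interval_s / 60 if bouts else 0.0,
    }

# Hypothetical day fragment: 90 min lying, 30 min standing, 60 min lying.
day = ["lying"] * 90 + ["standing"] * 30 + ["lying"] * 60
print(lying_statistics(day))  # 2.5 h lying in 2 periods, mean 75 min
```

The qualitative parameters (leg extension, lateral position) would need per-frame pose estimation rather than a single posture label.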
Advances in Sensors, Big Data and Machine Learning in Intelligent Animal Farming
Animal production (e.g., milk, meat, and eggs) provides valuable protein for human beings and animals. However, animal production is facing several challenges worldwide, such as environmental impacts and animal welfare/health concerns. In animal farming operations, accurate and efficient monitoring of animal information and behavior can help analyze the health and welfare status of animals and identify sick or abnormal individuals at an early stage to reduce economic losses and protect animal welfare. In recent years, there has been growing interest in animal welfare. At present, sensors, big data, machine learning, and artificial intelligence are used to improve management efficiency, reduce production costs, and enhance animal welfare. Although these technologies still have challenges and limitations, their application and exploration on animal farms will greatly promote the intelligent management of farms. Therefore, this Special Issue will collect original papers with novel contributions based on technologies such as sensors, big data, machine learning, and artificial intelligence that study animal behavior monitoring and recognition, environmental monitoring, health evaluation, etc., to promote intelligent and accurate animal farm management.
Detect to Track and Track to Detect
Recent approaches for high-accuracy detection and tracking of object categories in video consist of complex multistage solutions that become more cumbersome each year. In this paper we propose a ConvNet architecture that jointly performs detection and tracking, solving the task in a simple and effective way. Our contributions are threefold: (i) we set up a ConvNet architecture for simultaneous detection and tracking, using a multi-task objective for frame-based object detection and across-frame track regression; (ii) we introduce correlation features that represent object co-occurrences across time to aid the ConvNet during tracking; and (iii) we link the frame-level detections based on our across-frame tracklets to produce high-accuracy detections at the video level. Our ConvNet architecture for spatiotemporal object detection is evaluated on the large-scale ImageNet VID dataset, where it achieves state-of-the-art results. Our approach provides better single-model performance than the winning method of the last ImageNet challenge while being conceptually much simpler. Finally, we show that by increasing the temporal stride we can dramatically increase the tracker speed.
Comment: ICCV 2017. Code and models: https://github.com/feichtenhofer/Detect-Track Results: https://www.robots.ox.ac.uk/~vgg/research/detect-track
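The third contribution, linking frame-level detections into video-level tracks, can be illustrated with a toy greedy linker that chains detections across adjacent frames by IoU overlap. This is a deliberate simplification for intuition, not the authors' actual tracklet algorithm:

```python
def box_iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def link_tracks(frames, thresh=0.5):
    """Greedily chain per-frame detections into tracks by IoU overlap.
    frames: list (per frame) of lists of boxes. Returns lists of (t, box)."""
    tracks = []
    for t, boxes in enumerate(frames):
        unmatched = list(boxes)
        for track in tracks:
            last_t, last_box = track[-1]
            if last_t != t - 1 or not unmatched:
                continue  # track already ended, or nothing left to match
            best = max(unmatched, key=lambda b: box_iou(last_box, b))
            if box_iou(last_box, best) >= thresh:
                track.append((t, best))
                unmatched.remove(best)
        tracks.extend([(t, b)] for b in unmatched)  # start new tracks
    return tracks

# One hypothetical object drifting rightward across three frames:
frames = [[(0, 0, 10, 10)], [(1, 0, 11, 10)], [(2, 0, 12, 10)]]
print(len(link_tracks(frames)))  # → 1 (a single three-frame track)
```

The paper replaces the hand-set IoU threshold with learned across-frame track regression and correlation features.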
Neural Baby Talk
We introduce a novel framework for image captioning that can produce natural language explicitly grounded in entities that object detectors find in the image. Our approach reconciles classical slot filling approaches (that are generally better grounded in images) with modern neural captioning approaches (that are generally more natural sounding and accurate). Our approach first generates a sentence `template' with slot locations explicitly tied to specific image regions. These slots are then filled in by visual concepts identified in the regions by object detectors. The entire architecture (sentence template generation and slot filling with object detectors) is end-to-end differentiable. We verify the effectiveness of our proposed model on different image captioning tasks. On standard image captioning and novel object captioning, our model reaches state-of-the-art on both COCO and Flickr30k datasets. We also demonstrate that our model has unique advantages when the train and test distributions of scene compositions -- and hence language priors of associated captions -- are different. Code has been made available at: https://github.com/jiasenlu/NeuralBabyTalk
Comment: 12 pages, 7 figures, CVPR 2018
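The slot-filling idea can be sketched outside any neural network: given a template with region-tied slots and per-region detector outputs, each slot takes the highest-scoring visual concept. A hypothetical, non-differentiable illustration (region names, concepts, and scores are made up; the actual model learns template generation and filling end-to-end):

```python
import re

def fill_template(template, detections):
    """Fill <slot> placeholders in a caption template with the
    highest-scoring detected concept for that region."""
    def best(match):
        region = match.group(1)
        candidates = detections.get(region, [])  # [(concept, score), ...]
        return max(candidates, key=lambda c: c[1])[0] if candidates else region
    return re.sub(r"<(\w+)>", best, template)

template = "A <r1> is sitting at a <r2> with a <r3>."
detections = {
    "r1": [("person", 0.9), ("man", 0.7)],
    "r2": [("table", 0.8)],
    "r3": [("cake", 0.6), ("pie", 0.4)],
}
print(fill_template(template, detections))
# → "A person is sitting at a table with a cake."
```

Because the slots are tied to image regions, swapping in a detector for novel objects changes the caption without retraining the language side.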
Unveiling the frontiers of deep learning: innovations shaping diverse domains
Deep learning (DL) enables the development of computer models that are capable of learning, visualizing, optimizing, refining, and predicting data. In recent years, DL has been applied in a range of fields, including audio-visual data processing, agriculture, transportation prediction, natural language, biomedicine, disaster management, bioinformatics, drug design, genomics, face recognition, and ecology. To explore the current state of deep learning, it is necessary to investigate the latest developments and applications of deep learning in these disciplines. However, the literature is lacking in exploring the applications of deep learning across all potential sectors. This paper thus extensively investigates the potential applications of deep learning across all major fields of study, as well as the associated benefits and challenges. As evidenced in the literature, DL exhibits accuracy in prediction and analysis, which makes it a powerful computational tool, and it can learn and optimize its own representations, making it effective at processing data without extensive manual feature engineering. At the same time, deep learning necessitates massive amounts of data for effective analysis and processing. To handle the challenge of compiling huge amounts of medical, scientific, healthcare, and environmental data for use in deep learning, gated architectures like LSTMs and GRUs can be utilized. For multimodal learning, shared neurons in the neural network for all activities and specialized neurons for particular tasks are necessary.
Comment: 64 pages, 3 figures, 3 tables
Object Part Localization Using Exemplar-based Models
Object part localization is a fundamental problem in computer vision, which aims to let machines understand an object in an image as a configuration of parts. As the visual features at parts are usually weak and misleading, spatial models are needed to constrain the part configuration, ensuring that the estimated part locations respect both image cues and the shape prior. Unlike most state-of-the-art techniques, which employ parametric spatial models, we turn to non-parametric exemplars of part configurations. The benefit is twofold: instead of assuming any parametric yet imprecise distributions on the spatial relations of parts, exemplars literally encode such relations present in the training samples; and exemplars allow us to prune the search space of part configurations with high confidence.
This thesis consists of two parts: fine-grained classification and object part localization. We first verify the efficacy of parts in fine-grained classification, where we build working systems that automatically identify dog breeds, fish species, and bird species using localized parts on the object. Then we explore multiple ways to enhance exemplar-based models, such that they can be applied well to deformable objects such as birds and the human body. Specifically, we propose to enforce pose and subcategory consistency in exemplar matching, thus obtaining more reliable hypotheses of configuration. We also propose a part-pair representation that features novel shape composing with multiple promising hypotheses. In the end, we adapt exemplars to a hierarchical representation, and design a principled formulation to predict the part configuration based on multi-scale image cues and multi-level exemplars. These efforts consistently improve the accuracy of object part localization.
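The non-parametric idea is easy to sketch: instead of scoring a candidate part configuration under a parametric shape prior, compare it directly against stored training exemplars and keep the nearest one. A toy illustration with made-up 2-D part locations (real exemplar matching also normalizes for scale and uses image evidence):

```python
import math

def match_exemplar(candidate, exemplars):
    """Return the training exemplar (a list of part (x, y) locations)
    closest to the candidate configuration in mean point distance."""
    def cost(exemplar):
        return sum(math.dist(p, q) for p, q in zip(candidate, exemplar)) / len(candidate)
    return min(exemplars, key=cost)

# Hypothetical 3-part configurations (e.g. head, back, tail):
exemplars = [
    [(0, 0), (5, 1), (10, 0)],  # stretched pose
    [(0, 0), (3, 4), (6, 0)],   # arched pose
]
candidate = [(0, 1), (5, 2), (9, 0)]
print(match_exemplar(candidate, exemplars))  # nearest: the stretched pose
```

The matched exemplar then serves both as a shape prior and as a way to prune implausible configurations before fine localization.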
A Comparative Study of YOLOv5 and YOLOv7 Modifications for Face Detection on a Custom Dataset
Face detection is a fundamental task in computer vision with applications spanning facial recognition, pose estimation, and human-robot interaction. This thesis presents a comprehensive comparative study of two modified versions of the YOLO (You Only Look Once) algorithm, YOLOv5face and YOLOv7face, tailored for landmark detection on a custom dataset of human faces. The study evaluates these models on various aspects, including architecture, accuracy, speed, generalization capability, and specific features.
YOLOv5face strikes a balance between accuracy and speed, rendering it suitable for real-time or near-real-time applications. Equipped with a landmark regression head, it excels in tasks requiring precise facial landmark detection. YOLOv7face, on the other hand, outperforms YOLOv5face in accuracy, even in challenging conditions like occlusion and varying lighting. Its robustness positions it as a reliable choice for real-world applications.
The comparative analysis underscores the importance of selecting the right model based on specific requirements. YOLOv5face offers efficiency and versatility, while YOLOv7face prioritizes accuracy and robustness. Future research directions include diversifying datasets, fine-tuning, real-world testing, efficiency improvements, and applications in human-robot interaction.
This study contributes to the advancement of facial keypoint detection algorithms and guides researchers and practitioners in choosing appropriate models for various computer vision tasks.
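A post-processing step shared by YOLO-style detectors is non-maximum suppression, which collapses overlapping candidate boxes for the same face into a single detection. A minimal sketch with hypothetical boxes and threshold (not either model's exact implementation):

```python
def nms(detections, iou_thresh=0.5):
    """Non-maximum suppression over (x1, y1, x2, y2, score) boxes:
    keep the highest-scoring box among heavily overlapping detections."""
    def iou(a, b):
        x1, y1 = max(a[0], b[0]), max(a[1], b[1])
        x2, y2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, x2 - x1) * max(0, y2 - y1)
        union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
        return inter / union if union else 0.0

    kept = []
    for det in sorted(detections, key=lambda d: d[4], reverse=True):
        if all(iou(det, k) < iou_thresh for k in kept):
            kept.append(det)
    return kept

# Two overlapping candidates for one face plus one distinct face:
faces = [(10, 10, 50, 50, 0.9), (12, 12, 52, 52, 0.8), (100, 100, 140, 140, 0.7)]
print(len(nms(faces)))  # → 2
```

The IoU threshold trades duplicate suppression against merging genuinely distinct, closely spaced faces, which matters in crowded scenes.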
Detecting cow behaviours associated with parturition using computer vision
Monitoring of dairy cows and their calves during parturition is essential in determining whether there are any associated problems for mother and offspring. This is a critical period in the productive life of both. A difficult and assisted calving can impact the subsequent milk production, health and fertility of a cow, and its potential survival. Furthermore, an alert to the need for any assistance would enhance animal and stockperson wellbeing. Manual monitoring of animal behaviour from images has been used for decades, but is very labour intensive. Recent technological advances in the field of Computer Vision based on Deep Learning now make automated monitoring of surveillance video feeds feasible. The benefits of using image analysis compared to other monitoring systems are that image analysis relies upon neither transponder attachments nor invasive tools, and may provide more information at a relatively low cost. Image analysis can also detect and track the calf, which is not possible using other monitoring methods. Cameras are commonly used to monitor animals; however, automated detection of behaviours is new, especially for livestock.
Using the latest state-of-the-art techniques in Computer Vision, and in particular the ground-breaking technique of Deep Learning, this thesis develops a vision-based model to detect the progress of parturition in dairy cows. A large-scale dataset of cow behaviour annotations was created, covering over 46 individual cow calvings and comprising approximately 690 hours of video footage with over 2,500 video clips, each between 3 and 10 seconds long. The model was trained on seven different behaviours: standing, walking, shuffling, lying, eating, drinking, and contractions while lying. The developed network correctly classified the seven behaviours with an accuracy of between 80 and 95%. The accuracy in predicting contractions while lying down was 83%, which in itself can serve as an early calving alert, as all cows start contractions one to two hours before giving birth. The performance of the developed model was also comparable to methods for human action classification on the Kinetics dataset.
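Per-behaviour accuracies like those reported above come from a confusion table of true versus predicted labels. A small sketch with hypothetical counts for three of the seven behaviours (not the thesis's actual results):

```python
def per_class_accuracy(confusion):
    """Per-class recall from a {true_label: {predicted_label: count}} table."""
    return {
        label: row.get(label, 0) / sum(row.values())
        for label, row in confusion.items()
    }

# Hypothetical counts: each row is one true behaviour's predictions.
confusion = {
    "lying":        {"lying": 95, "standing": 5},
    "eating":       {"eating": 88, "drinking": 12},
    "contractions": {"contractions": 83, "lying": 17},
}
print(per_class_accuracy(confusion))
# → {'lying': 0.95, 'eating': 0.88, 'contractions': 0.83}
```

Reporting the full table, rather than one overall accuracy, shows which behaviours are confused with each other, e.g. contractions while lying versus plain lying.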