9 research outputs found

    Optimized Segmentation of Cellular Tomography through Organelles' Morphology and Image Features

    Get PDF
    Computational tracing of cellular images generally requires painstaking parameter optimization. By incorporating prior knowledge about an organelle’s morphology and image features, the amount of parameter tweaking can be reduced substantially. In practical applications, however, the general image features of organelles are often known in advance, while their actual morphology is not well characterized. This paper makes two primary contributions: first, the classification of insulin granules based on their image features and morphology for accurate segmentation, focused mainly on pre-processing for image segmentation; and second, a new hybrid meshing quantification. The proposed method is validated on a set of manually defined ground truths. The study of insulin granules, in particular their location and image features, has also opened up options for future studies.
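
    As a rough illustration of the morphology-guided pre-processing idea, the sketch below uses scikit-image to threshold a micrograph and keep only blobs whose size and roundness match a granule-like prior. The `min_area` and `max_eccentricity` values are hypothetical stand-ins for the priors described in the abstract, not the paper's actual parameters, and the hybrid meshing quantification is not shown.

```python
import numpy as np
from skimage import filters, measure, morphology

def segment_granules(image, min_area=50, max_eccentricity=0.8):
    """Morphology-guided pre-processing segmentation (illustrative priors only)."""
    # Global Otsu threshold as a low-parameter starting point.
    mask = image > filters.threshold_otsu(image)
    # Discard speckle smaller than the expected granule footprint.
    mask = morphology.remove_small_objects(mask, min_size=min_area)
    labels = measure.label(mask)
    keep = np.zeros_like(mask, dtype=bool)
    for region in measure.regionprops(labels):
        # Insulin granules are roughly round, so reject elongated blobs.
        if region.eccentricity <= max_eccentricity:
            keep[labels == region.label] = True
    return keep
```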

    An Improved Object Detection and Trajectory Prediction Method for Traffic Conflicts Analysis

    Get PDF
    Although computer vision-based methods have seen broad utilisation in evaluating traffic situations, there is a lack of research on the assessment and prediction of near misses in traffic. In addition, most object detection algorithms perform poorly on small targets. This study proposes a combination of object detection and tracking algorithms, Inverse Perspective Mapping (IPM), and trajectory prediction mechanisms to assess near-miss events. First, an instance segmentation head is proposed to improve the accuracy of the bounding-box detection phase. Secondly, IPM is applied to all detection results. The relationships between the mapped objects are then examined based on their distances to determine whether a near-miss event has occurred, with the moving speed of each target considered as a parameter. Finally, a Kalman filter is used to predict each object's trajectory and determine whether there will be a near miss in the next few seconds. Experiments on Closed-Circuit Television (CCTV) datasets achieved 0.94 mAP, compared with other state-of-the-art methods. Beyond the improved detection accuracy, the advantage of fusing instance segmentation with object detection for small-target detection is validated. The results can therefore be used to analyse near misses more accurately.
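
    Two of the building blocks named in the abstract are easy to sketch generically: mapping a detection's ground-contact point to road-plane coordinates with a homography (IPM), and extrapolating positions with a constant-velocity Kalman filter to flag a near miss. The snippet below is a minimal sketch of those two pieces using OpenCV; the homography `H`, the frame interval, the prediction horizon, and the 2 m distance threshold are invented for illustration and are not the paper's values.

```python
import numpy as np
import cv2

# Hypothetical homography mapping image pixels to ground-plane metres (IPM).
H = np.array([[0.02, 0.0, -5.0],
              [0.0, 0.05, -8.0],
              [0.0, 0.0, 1.0]], dtype=np.float32)

def to_ground(point_px):
    """Project the bottom-centre of a detection box onto the road plane."""
    src = np.array([[point_px]], dtype=np.float32)   # shape (1, 1, 2)
    return cv2.perspectiveTransform(src, H)[0, 0]    # (x, y) in metres

def make_cv_kalman(dt=0.04):
    """Constant-velocity Kalman filter over ground-plane coordinates."""
    kf = cv2.KalmanFilter(4, 2)                      # state: x, y, vx, vy
    kf.transitionMatrix = np.array([[1, 0, dt, 0],
                                    [0, 1, 0, dt],
                                    [0, 0, 1, 0],
                                    [0, 0, 0, 1]], dtype=np.float32)
    kf.measurementMatrix = np.eye(2, 4, dtype=np.float32)
    kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
    kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1
    return kf

def near_miss(track_a, track_b, horizon=50, threshold_m=2.0):
    """Flag a near miss if two filters' predicted positions come within threshold_m.

    track_a and track_b are cv2.KalmanFilter objects already updated with
    recent ground-plane observations via correct(); note that repeated
    predict() calls advance the filter state, so in practice run this on copies.
    """
    for _ in range(horizon):                         # look a few seconds ahead
        pa = track_a.predict()[:2].ravel()
        pb = track_b.predict()[:2].ravel()
        if np.linalg.norm(pa - pb) < threshold_m:
            return True
    return False
```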

    Towards Modelling Trust in Voice at Zero Acquaintance

    Get PDF
    Trust is essential in many human relationships, especially where there is an element of inter-dependency. However, humans tend to make quick judgements about trusting other individuals, even those met at zero acquaintance. Past studies have shown the significance of voice in perceived trustworthiness, but research associating trustworthiness with different vocal features, such as speech rate and fundamental frequency (f0), has yet to yield consistent results. Therefore, this paper proposes a method to investigate 1) the association between trustworthiness and different vocal features, 2) the vocal characteristics on which Malaysian ethnic groups base their judgement of trustworthiness, and 3) a neural network model that predicts the degree of trustworthiness in a human voice. In the proposed method, a reliable set of audio clips will be obtained and analyzed with SoundGen to determine their acoustic characteristics. The audio clips will then be distributed to a large group of untrained respondents, who will rate their degree of trust in the speaker of each clip. Participants will be able to choose from 30 sets of audio clips, each consisting of 6 clips. The acoustic characteristics will be analyzed and compared with the ratings to determine whether there are any correlations between the acoustic characteristics and the trustworthiness ratings. A neural network model will then be built on the collected data to predict the trustworthiness of a person’s voice. Keywords: prosody, trust, voice, vocal cues, zero acquaintance.
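
    Since the study is proposed rather than completed, the sketch below only illustrates the general shape of the final step: a small feed-forward regressor mapping per-clip acoustic features to a mean trust rating. The feature set (mean f0, f0 range, speech rate, jitter), the 7-point rating scale, and the random placeholder data are all assumptions for illustration, not the study's design.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Placeholder data: rows = audio clips, columns = acoustic features such as
# mean f0, f0 range, speech rate, jitter (real values would come from SoundGen).
rng = np.random.default_rng(0)
X = rng.normal(size=(180, 4))
y = rng.uniform(1.0, 7.0, size=180)     # mean trust rating per clip (assumed 7-point scale)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = make_pipeline(
    StandardScaler(),                    # acoustic features live on very different scales
    MLPRegressor(hidden_layer_sizes=(16, 8), max_iter=2000, random_state=0),
)
model.fit(X_train, y_train)
# Near-zero R^2 is expected here because the placeholder data is random noise.
print("R^2 on held-out clips:", model.score(X_test, y_test))
```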

    EfficientNet-Lite and Hybrid CNN-KNN Implementation for Facial Expression Recognition on Raspberry Pi

    Get PDF
    Facial expression recognition (FER) is the task of determining a person’s current emotion. It plays an important role in healthcare, marketing, and counselling. With advances in deep learning algorithms such as the Convolutional Neural Network (CNN), the accuracy of such systems is improving, and a hybrid CNN and k-Nearest Neighbour (KNN) model can improve FER accuracy further. This paper presents a hybrid CNN-KNN model for FER on the Raspberry Pi 4, where the CNN performs feature extraction and the KNN performs expression recognition. We use transfer learning to build the system on an EfficientNet-Lite model, replacing the Softmax layer in the EfficientNet with the KNN. We train the model on the FER-2013 dataset and compare its performance with different architectures trained on the same dataset. We optimize the Fully Connected layer, loss function, loss optimizer, optimizer learning rate, class weights, and the KNN distance function together with the k-value. Despite running on Raspberry Pi hardware with very limited processing power, low memory capacity, and small storage, our proposed model achieves an accuracy of 75.26%, comparable to (and a slight 0.06% improvement over) the state-of-the-art Ensemble of 8 CNNs model.
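
    The hybrid idea of the abstract, a CNN backbone for embeddings with a KNN in place of the Softmax head, can be sketched roughly as below. Keras ships EfficientNetB0 rather than EfficientNet-Lite, so it is used here as a stand-in backbone, and the face crops, labels, input size, and k-value are placeholders, not the paper's tuned configuration.

```python
import numpy as np
import tensorflow as tf
from sklearn.neighbors import KNeighborsClassifier

# Stand-in backbone: Keras ships EfficientNetB0; an EfficientNet-Lite variant
# would be swapped in for the configuration described in the abstract.
backbone = tf.keras.applications.EfficientNetB0(
    include_top=False, weights="imagenet", pooling="avg"   # downloads pretrained weights
)

def extract_features(images):
    """Map a batch of face crops (N, 224, 224, 3) to embedding vectors."""
    x = tf.keras.applications.efficientnet.preprocess_input(images)
    return backbone.predict(x, verbose=0)

# Placeholder arrays standing in for pre-processed FER-2013 faces and labels.
train_images = np.zeros((32, 224, 224, 3), dtype=np.float32)
train_labels = np.zeros(32, dtype=np.int64)

# The KNN replaces the Softmax classification head.
knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean")
knn.fit(extract_features(train_images), train_labels)
predicted = knn.predict(extract_features(train_images[:4]))
```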

    Attention Relational Network for Skeleton-Based Group Activity Recognition

    No full text
    Group activity recognition is a significant and challenging task in computer vision. Existing approaches to group activity prediction rely on traditional hand-crafted features, RGB video features, or skeleton-based deep learning architectures such as Graph Convolutional Networks (GCNs), Recurrent Neural Networks (RNNs), and Long Short-Term Memory networks (LSTMs). However, they rarely explore pose information and rarely use relational networks to reason about group activity behavior. In this work, we leverage minimal prior knowledge about the skeleton information to reason about the interactions within a group activity. The objective is to obtain discriminative representations and filter out ambiguous actions to enhance the performance of group activity recognition. Our contribution is a proposed Attention Relation Network (ARN) that fuses attention mechanisms and joint vector sequences into the relation network. The skeleton joint vector sequences are previously unexplored pose information, and the attention mechanism assigns greater significance to the individuals most relevant for distinguishing the group activity. First, our model focuses on edge-level information (encompassing both edge and edge motion data) within the skeleton data, considering directionality, to analyze the spatiotemporal aspects of the action. Second, recognizing the inherent directionality of motion, we establish diverse directions for skeleton edges and extract distinct motion features (including translation and rotation information) aligned with these orientations, thereby making fuller use of motion attributes related to the action. We also introduce a representation of human motion obtained by combining relational networks and examining their integrated characteristics. Extensive experiments on the Hockey and UT-Interaction datasets show that our method achieves performance competitive with the state of the art. The results demonstrate the modeling potential of a skeleton-based method for group activity recognition.
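
    To make the relation-network-with-attention idea concrete, the PyTorch sketch below scores all pairs of per-person skeleton embeddings with a shared MLP, weights the pair representations with a learned attention score, and pools them into a group-level descriptor for classification. The module name, dimensions, and pooling scheme are illustrative; this is not the paper's ARN architecture.

```python
import torch
import torch.nn as nn

class AttentionRelationBlock(nn.Module):
    """Toy relation module: score pairs of per-person skeleton embeddings,
    weight them with attention, and pool into a group-level descriptor."""

    def __init__(self, feat_dim=64, hidden=128, num_classes=6):
        super().__init__()
        self.pair_mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden)
        )
        self.attn = nn.Linear(hidden, 1)          # scalar attention per pair
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, persons):                   # persons: (B, N, feat_dim)
        b, n, d = persons.shape
        # Build all ordered pairs (i, j) of person embeddings.
        a = persons.unsqueeze(2).expand(b, n, n, d)
        c = persons.unsqueeze(1).expand(b, n, n, d)
        pairs = torch.cat([a, c], dim=-1).reshape(b, n * n, 2 * d)
        rel = self.pair_mlp(pairs)                # (B, N*N, hidden)
        weights = torch.softmax(self.attn(rel), dim=1)
        pooled = (weights * rel).sum(dim=1)       # attention-weighted pooling
        return self.classifier(pooled)            # group-activity logits

# Example: batch of 2 clips, 4 people each, 64-dim skeleton embeddings.
logits = AttentionRelationBlock()(torch.randn(2, 4, 64))
```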

    Effects of Different Parameter Settings for 3D Data Smoothing and Mesh Simplification on Near Real-Time 3D Reconstruction of High Resolution Bioceramic Bone Void Filling Medical Images

    No full text
    Three-dimensional reconstruction plays a vital role in assisting doctors and surgeons in diagnosing the healing progress of bone defects. Common three-dimensional reconstruction methods include surface and volume rendering. As the focus here is on the shape of the bone, this study omits volume rendering. Many improvements have been made to surface rendering methods such as Marching Cubes and Marching Tetrahedra, but few work towards real-time or near real-time surface rendering for large medical images or study the effects of different parameter settings on those improvements. Hence, this study attempts near real-time surface rendering for large medical images. Different parameter values are tested to study their effect on reconstruction accuracy, reconstruction and rendering time, and the number of vertices and faces. The proposed improvement, which combines three-dimensional data smoothing with a Gaussian convolution kernel of size 5 and mesh simplification with a reduction factor of 0.1, is the best parameter combination for balancing high reconstruction accuracy, low total execution time, and a low number of vertices and faces. It increases reconstruction accuracy by 0.0235%, decreases the total execution time by 69.81%, and decreases the number of vertices and faces by 86.57% and 86.61%, respectively.
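
    A generic version of the pipeline named in the abstract, Gaussian smoothing of the volume, isosurface extraction, then mesh simplification to roughly 10% of the faces, can be sketched with SciPy, scikit-image, and Open3D as below. Open3D's quadric decimation and the toy sphere volume are stand-ins chosen for illustration; the paper's exact smoothing, surface rendering, and simplification implementations are not reproduced.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.measure import marching_cubes
import open3d as o3d

def reconstruct(volume, iso=0.5, sigma=1.0, reduction=0.1):
    """Smooth, extract an isosurface, then decimate to `reduction` of the faces."""
    # 3D Gaussian smoothing; truncate=2.0 with sigma=1.0 gives roughly a 5-voxel kernel.
    smoothed = gaussian_filter(volume.astype(np.float32), sigma=sigma, truncate=2.0)
    verts, faces, _, _ = marching_cubes(smoothed, level=iso)
    mesh = o3d.geometry.TriangleMesh(
        o3d.utility.Vector3dVector(verts),
        o3d.utility.Vector3iVector(faces.astype(np.int32)),
    )
    target = max(4, int(len(faces) * reduction))   # keep ~10% of the triangles
    return mesh.simplify_quadric_decimation(target_number_of_triangles=target)

# Toy volume: a solid sphere in a 64^3 grid standing in for a medical scan.
z, y, x = np.mgrid[:64, :64, :64]
volume = ((x - 32) ** 2 + (y - 32) ** 2 + (z - 32) ** 2 < 20 ** 2).astype(np.float32)
mesh = reconstruct(volume)
```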

    Implementation of Kinetic and Kinematic Variables in Ergonomic Risk Assessment Using Motion Capture Simulation: A Review

    No full text
    Work-related musculoskeletal disorders (WMSDs) are among the most common disorders in any work sector and industry. Ergonomic risk assessment can reduce the risk of WMSDs. Motion capture, which can provide accurate, real-time quantitative data, has been widely used as a tool for ergonomic risk assessment. However, most ergonomic risk assessments that use motion capture still depend on traditional assessment methods that focus on qualitative data. Therefore, this article reviews ergonomic risk assessment and the application of current motion capture technology to incorporate classical mechanics quantities, including velocity, acceleration, force, and momentum, into the assessment. The review suggests that using motion capture together with kinetic and kinematic variables, such as velocity, acceleration, and force, can help avoid inconsistency and produce more reliable results in ergonomic risk assessment. Most studies that perform physical measurement with motion capture prefer non-optical systems because of their low cost and simple experimental setup; however, the present review finds that optical motion capture can provide more accurate data.
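
    The kinetic and kinematic variables the review highlights can be derived from captured marker positions by finite differences, as in the small sketch below. The 100 Hz sampling rate and the 4.5 kg segment/load mass are assumptions made purely for illustration.

```python
import numpy as np

def kinematics(positions, fs=100.0, segment_mass=4.5):
    """Derive velocity, acceleration, force and momentum from a marker trajectory.

    positions is an (N, 3) array of metres sampled at fs Hz; segment_mass (kg)
    is a hypothetical hand/load mass used for the kinetic quantities.
    """
    dt = 1.0 / fs
    velocity = np.gradient(positions, dt, axis=0)       # m/s
    acceleration = np.gradient(velocity, dt, axis=0)    # m/s^2
    force = segment_mass * acceleration                 # Newton's second law, N
    momentum = segment_mass * velocity                  # kg*m/s
    return velocity, acceleration, force, momentum

# Example: a wrist marker tracing a simple sinusoidal lift over 2 seconds.
t = np.linspace(0, 2, 200)[:, None]
trajectory = np.hstack([np.zeros_like(t), np.zeros_like(t), 0.3 * np.sin(np.pi * t)])
v, a, f, p = kinematics(trajectory)
```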

    Biomechanics Analysis of the Firefighters’ Thorax Movement on Personal Protective Equipment during Lifting Task Using Inertial Measurement Unit Motion Capture

    No full text
    Back injury is a common musculoskeletal injury reported among firefighters (FFs) due to the nature of their work and their personal protective equipment (PPE). Work involving heavy lifting tasks further increases FFs’ risk of back injury. This study aimed to assess the biomechanics of FFs’ movement in personal protective equipment during a lifting task. A set of questionnaires was used to identify the prevalence of musculoskeletal pain experienced by FFs. Inertial measurement unit (IMU) motion capture was used to record the body angle deviation and angular acceleration of the FFs’ thorax extension. Descriptive analysis was used to examine the relationship between the FFs’ age and body mass index (BMI) and their thorax movement during the lifting task with and without PPE. Based on the musculoskeletal pain questionnaire, sixty-three percent of FFs reported lower back pain during work. The biomechanics analysis of thorax angle deviation and angular acceleration showed that wearing PPE significantly restricts FFs’ movement and limits their mobility. Regarding human factors, FFs’ age influences the angle deviation while wearing PPE, and FFs’ BMI influences the angular acceleration without PPE during the lifting activity.
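
    The two measures reported here, thorax angle deviation and angular acceleration, could in principle be derived from an IMU's orientation stream as in the sketch below, which uses SciPy's Rotation utilities. The quaternion ordering, the 60 Hz sampling rate, and the synthetic flexion ramp are assumptions; the study's own IMU software pipeline is not reproduced.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def thorax_metrics(quaternions, fs=60.0):
    """Angle deviation from the first (upright) pose and its angular acceleration.

    quaternions is an (N, 4) array in (x, y, z, w) order from a thorax IMU
    sampled at fs Hz; both the ordering and the rate are assumptions.
    """
    rot = R.from_quat(quaternions)
    # Relative rotation of every sample with respect to the starting posture.
    relative = rot[0].inv() * rot
    angle_deg = np.degrees(relative.magnitude())              # deviation angle per sample
    dt = 1.0 / fs
    angular_velocity = np.gradient(angle_deg, dt)             # deg/s
    angular_acceleration = np.gradient(angular_velocity, dt)  # deg/s^2
    return angle_deg, angular_acceleration

# Example: an upright-to-45-degree flexion ramp standing in for a lifting trial.
angles = np.linspace(0, 45, 120)
quats = R.from_euler("x", angles, degrees=True).as_quat()
deviation, alpha = thorax_metrics(quats)
```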

    SFFSORT Multi-Object Tracking by Shallow Feature Fusion for Vehicle Counting

    No full text
    Standard Multi-Object Tracking (MOT) frameworks currently fall into three categories: tracking-by-detection, joint detection and tracking, and attention mechanisms. However, the latter two frameworks often require substantial computing resources, which makes real-time tracking difficult to achieve for vehicle detection at traffic intersections. Vehicle tracking and detection at traffic intersections must not only meet real-time requirements but also address common MOT issues such as target occlusion, duplicate detections, and detection errors. Tracking-by-detection therefore has a great deal of potential. This study proposes SFFSORT, a shallow feature fusion tracking algorithm based on SORT, and develops an architecture for vehicle monitoring and counting built on tracking-by-detection. The proposed tracker is more efficient than both SORT and DeepSORT, achieving 60.9% MOTA and 65.5% IDF1 on MOT16, and 60.1% MOTA and 64.7% IDF1 on MOT17. Using this tracker as a foundation, we developed a vehicle counting framework and successfully applied it to road traffic videos sourced from the Malaysian transportation department. The tracking algorithm effectively addresses the challenges arising from detection errors and inaccuracies, providing a robust solution. The experimental findings demonstrate that the deep learning framework is capable of lane-level vehicle counting even in scenarios with limited labelled data.
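
    Two generic ingredients of a SORT-style tracker with appearance fusion and of line-based vehicle counting are sketched below: a track-to-detection cost that blends bounding-box IoU with a shallow appearance similarity (solved with the Hungarian algorithm), and a simple rule that counts a vehicle when its track centre crosses a virtual line. The fusion weight, feature vectors, and line position are illustrative; SFFSORT's actual shallow feature design is not reproduced.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def fused_assignment(track_boxes, track_feats, det_boxes, det_feats, alpha=0.7):
    """Match tracks to detections with an IoU + shallow-appearance cost matrix."""
    cost = np.zeros((len(track_boxes), len(det_boxes)))
    for i, (tb, tf) in enumerate(zip(track_boxes, track_feats)):
        for j, (db, df) in enumerate(zip(det_boxes, det_feats)):
            # Cosine similarity of shallow appearance features (e.g. colour histograms).
            appearance = np.dot(tf, df) / (np.linalg.norm(tf) * np.linalg.norm(df) + 1e-9)
            cost[i, j] = 1.0 - (alpha * iou(tb, db) + (1 - alpha) * appearance)
    return linear_sum_assignment(cost)       # optimal (track_idx, det_idx) pairs

def crossed_line(prev_cy, curr_cy, line_y=400):
    """Count a vehicle when its box centre crosses a virtual counting line."""
    return prev_cy < line_y <= curr_cy
```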