Search CORE

130 research outputs found

Effective Uni-Modal to Multi-Modal Crowd Estimation based on Deep Neural Networks

Author: SAJID USMAN
Publication venue: 'Paleontological Institute at The University of Kansas'
Publication date: 01/01/2021

Crowd estimation is a vital component of crowd analysis. It finds many applications in real-world scenarios, e.g. management of huge gatherings such as Hajj, sporting and musical events, or political rallies. Automated crowd counting facilitates better and more effective management of such events and consequently helps prevent undesired situations. This is a very challenging problem in practice, since there are significant differences in crowd number within and across images, varying image resolution, large perspective, severe occlusions, and dense crowd-like cluttered background regions. Current approaches do not handle huge crowd diversity well and thus perform poorly in cases ranging from extremely low to high crowd density, yielding huge crowd underestimation or overestimation. Also, manual crowd counting proves infeasible because it is very slow and inaccurate. To address these major crowd counting issues and challenges, we investigate two different types of input data: uni-modal (image) and multi-modal (image and audio). In the uni-modal setting, we propose and analyze four novel end-to-end crowd counting networks, ranging from multi-scale fusion-based models to uni-scale one-pass and two-pass multitask networks. The multi-scale networks employ the attention mechanism to enhance the model efficacy. On the other hand, the uni-scale models are equipped with a novel, simple yet effective patch re-scaling module (PRM) that functions identically to multi-scale approaches but is more lightweight. Experimental evaluation demonstrates that the proposed networks outperform the state-of-the-art in the majority of cases on four different benchmark datasets, with up to 12.6% improvement for the RMSE evaluation metric. The better cross-dataset performance also validates the better generalization ability of our schemes. For the multi-modal input, effective feature-extraction (FE) and strong information fusion between two modalities remain a big challenge.
Thus, the novel multi-modal network design focuses on investigating different feature fusion techniques while improving the FE. Based on comprehensive experimental evaluation, the proposed multi-modal network improves performance under all standard evaluation criteria, with up to 33.8% improvement in comparison to the state-of-the-art. The application of multi-scale uni-modal attention networks also proves more effective in other deep learning domains, as demonstrated successfully on seven different scene-text recognition task datasets with better performance.
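
The abstract above reports gains in terms of the RMSE evaluation metric. As a minimal sketch of how such figures are commonly computed for crowd counting, the following assumes per-image predicted and ground-truth counts; the function names and sample numbers are illustrative assumptions and are not taken from the thesis.

```python
import math


def mae(predicted, actual):
    """Mean absolute error between predicted and ground-truth counts."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error between predicted and ground-truth counts."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def relative_improvement(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical per-image counts, for illustration only.
predicted_counts = [3, 5]
true_counts = [0, 1]
print(mae(predicted_counts, true_counts))
print(rmse(predicted_counts, true_counts))
```

A lower RMSE penalizes large per-image miscounts more heavily than MAE does, which is one reason both are usually reported together.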

KU ScholarWorks

Crowd Localization from Gaussian Mixture Scoped Knowledge and Scoped Teacher

Author: Gao Junyu
Wang Juncheng
Wang Qi
Yuan Yuan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/02/2023

Crowd localization aims to predict the head position of each instance in crowd scenarios. Since the distances of instances from the camera vary, there are tremendous gaps among the scales of instances within an image, which is called the intrinsic scale shift. Intrinsic scale shift is one of the most essential issues in crowd localization because it is ubiquitous in crowd scenes and makes the scale distribution chaotic. To this end, the paper concentrates on tackling the chaos of the scale distribution incurred by intrinsic scale shift. We propose Gaussian Mixture Scope (GMS) to regularize the chaotic scale distribution. Concretely, the GMS utilizes a Gaussian mixture distribution to adapt to the scale distribution and decouples the mixture model into sub-normal distributions to regularize the chaos within the sub-distributions. Then, an alignment is introduced to regularize the chaos among sub-distributions. However, although GMS is effective in regularizing the data distribution, it amounts to dislodging the hard samples in the training set, which incurs overfitting. We assert that this is caused by a block in transferring the latent knowledge exploited by GMS from data to model. Therefore, a Scoped Teacher that acts as a bridge in knowledge transform is proposed. What's more, consistency regularization is also introduced to implement knowledge transform. To that effect, further constraints are deployed on the Scoped Teacher to derive feature consistency between the teacher and student ends. With the proposed GMS and Scoped Teacher implemented on five mainstream crowd localization datasets, extensive experiments demonstrate the superiority of our work. Moreover, compared with existing crowd locators, our work achieves state-of-the-art F1-measure comprehensively on five datasets. Comment: Accepted by IEEE TI

arXiv.org e-Print Archive

Deep learning in crowd counting: A survey

Author: Deng Lijia
Gorriz Sáez Juan Manuel
Wang Shuihua
Zhang Yudong
Zhou Qinghua
Publication venue: Wiley
Publication date: 14/06/2023

Counting high-density objects quickly and accurately is a popular area of research. Crowd counting has significant social and economic value and is a major focus in artificial intelligence. Despite many advancements in this field, many of them are not widely known, especially in terms of research data. The authors proposed a three-tier standardised dataset taxonomy (TSDT). The Taxonomy divides datasets into small-scale, large-scale and hyper-scale, according to different application scenarios. This theory can help researchers make more efficient use of datasets and improve the performance of AI algorithms in specific fields. Additionally, the authors proposed a new evaluation index for the clarity of the dataset: average pixel occupied by each object (APO). This new evaluation index is more suitable for evaluating the clarity of the dataset in the object counting task than the image resolution. Moreover, the authors classified the crowd counting methods from a data-driven perspective: multi-scale networks, single-column networks, multi-column networks, multi-task networks, attention networks and weak-supervised networks and introduced the classic crowd counting methods of each class. The authors classified the existing 36 datasets according to the theory of three-tier standardised dataset taxonomy and discussed and evaluated these datasets. The authors evaluated the performance of more than 100 methods in the past five years on different levels of popular datasets. Recently, progress in research on small-scale datasets has slowed down. There are few new datasets and algorithms on small-scale datasets. The studies focused on large or hyper-scale datasets appear to be reaching a saturation point. The combined use of multiple approaches began to be a major research direction. The authors discussed the theoretical and practical challenges of crowd counting from the perspective of data, algorithms and computing resources. 
The field of crowd counting is moving towards combining multiple methods and requires fresh, targeted datasets. Despite advancements, the field still faces challenges such as handling real-world scenarios and processing large crowds in real-time. Researchers are exploring transfer learning to overcome the limitations of small datasets. The development of effective algorithms for crowd counting remains a challenging and important task in computer vision and AI, with many opportunities for future research. Funding: BHF, AA/18/3/34220; Hope Foundation for Cancer Research, RM60G0680; GCRF, P202PF11; Sino‐UK Industrial Fund, RP202G0289; LIAS, P202ED10, P202RE969; Data Science Enhancement Fund, P202RE237; Sino‐UK Education Fund, OP202006; Fight for Sight, 24NN201; Royal Society International Exchanges Cost Share Award, RP202G0230; MRC, MC_PC_17171; BBSRC, RM32G0178B
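
The survey abstract above proposes an evaluation index called average pixel occupied by each object (APO) but does not give its formula. As a rough sketch, the following assumes APO is the total pixel count of a dataset divided by its total number of annotated objects; that definition, the function name, and the sample image sizes are assumptions made for illustration and may differ from the authors' exact formulation.

```python
def average_pixels_per_object(images):
    """Assumed APO: total pixels across images divided by total object count.

    `images` is an iterable of (width, height, object_count) tuples.
    """
    images = list(images)
    total_pixels = sum(width * height for width, height, _ in images)
    total_objects = sum(count for _, _, count in images)
    if total_objects == 0:
        raise ValueError("dataset contains no annotated objects")
    return total_pixels / total_objects


# Hypothetical dataset: two images with 100 and 50 annotated heads.
sample = [(1920, 1080, 100), (1280, 720, 50)]
print(average_pixels_per_object(sample))
```

Under this assumed definition, two datasets with the same image resolution can have very different APO values if one is far denser, which matches the abstract's point that resolution alone is a poor indicator of clarity for counting tasks.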

Repositorio Institucional Universidad de Granada

Colonoscopy polyp detection and classification: Dataset creation and comparative evaluations

Author: Bansal Ajay
Fathan Mohammad I
Li Kaidong
Patel Krushi
Rastogi Amit
Wang Guanghui
Wang Jean S
Zhang Tianxiao
Zhong Cuncong
Publication venue: Digital Commons@Becker
Publication date: 01/01/2021

Colorectal cancer (CRC) is one of the most common types of cancer with a high mortality rate. Colonoscopy is the preferred procedure for CRC screening and has proven to be effective in reducing CRC mortality. Thus, a reliable computer-aided polyp detection and classification system can significantly increase the effectiveness of colonoscopy. In this paper, we create an endoscopic dataset collected from various sources and annotate the ground truth of polyp location and classification results with the help of experienced gastroenterologists. The dataset can serve as a benchmark platform to train and evaluate the machine learning models for polyp classification. We have also compared the performance of eight state-of-the-art deep learning-based object detection models. The results demonstrate that deep CNN models are promising in CRC screening. This work can serve as a baseline for future research in polyp detection and classification

KU ScholarWorks

Digital Commons@Becker

PubMed Central

MolecularRift, a Gesture Based Interaction Tool for Controlling Molecules in 3-D

Author: Norrby Magnus
Publication venue: Lunds universitet/Ergonomi och aerosolteknologi
Publication date: 01/01/2015

Visualization of molecular models is a vital part of modern drug design. Improved visualization methods increase conceptual understanding and enable faster and better decision making. The introduction of virtual reality goggles such as Oculus Rift has created new opportunities for such visualizations. A new interactive visualization tool (MolecularRift), which lets the user experience molecular models in a virtual reality environment, was developed in collaboration with AstraZeneca. In an attempt to create a more natural way to interact with the tool, users can steer and control molecules through hand gestures. The gestures are recorded using depth data from a Microsoft Kinect v2 sensor and interpreted using per-pixel algorithms, which focus only on the captured frames, thus freeing the user from additional devices such as cursor, keyboard, touchpad or even piezoresistive gloves. MolecularRift was developed from a usability perspective using an iterative development process and test group evaluations. The iterations allowed an agile process in which features could easily be evaluated to monitor behavior and performance, resulting in a user-optimized tool. We conclude with reflections on virtual reality's capabilities in chemistry and possibilities for future projects. Virtual reality is the future. New technologies are constantly being developed, and as computing capacity improves we find new ways to use them together. We have developed a new interactive visualization tool (Molecular Rift) that lets the user experience molecular models in virtual reality. Today's pharmaceutical industry is in constant need of new methods for visualizing potential drugs in 3-D. Several tools are used today to visualize molecules in 3-D stereo. Our newly developed virtual reality techniques offer drug developers the opportunity to "step into" the molecular structures and experience them in an entirely new way.

Capillary flow of dense colloidal suspensions

Author: Isa Lucio
Publication date: 01/01/2008

The purpose of this thesis is to study the flow of dense colloidal suspensions into micron-sized capillaries at the particle level. Understanding the flow of complex fluids in terms of their constituents (colloids, polymers, or surfactants) poses deep fundamental challenges, and has wide applications in many industrial processes. Through the use of a novel experimental procedure we find results contrasting with the predicted bulk rheological behaviour of dense colloidal systems and propose an alternative approach based on the analogy with granular systems. Quantitative predictions which successfully explain the data are obtained. In order to obtain quantitative information on the dynamics of the system, we image the flow using a fast confocal microscope and identify the trajectories of each particle. Due to the nature of the flow, conventional techniques for locating and tracking the particles fail to yield satisfactory results. To overcome this limitation, we have developed a novel technique which allows us to successfully track the particles in strongly non-uniform flow fields (published, 2006). We focus our attention on three main aspects of the flow: concentration gradients, velocity profiles and time behaviour. We initially discuss the occurrence of concentration gradients along the flow direction and relate them to the local flow profiles. We observe high density regions where the velocity is uniform across the channel (complete plugs) and lower density regions where shear is present. The observed concentration profiles can be qualitatively explained by considering the relative flow between the solvent and the suspended particles. The flow profiles in the presence of shear consist of a plug in the centre, while shear is localized adjacent to the channel walls, reminiscent of yield-stress fluid behaviour. However, the observed scaling of the velocity profiles with the flow rate strongly contrasts with yield-stress fluid predictions.
Instead, the velocity profiles can be captured by a theory of stress fluctuations originally developed for chute flow of dry granular media (published, 2007). We extend the model to our case and discuss it as a function of a series of parameters (boundary conditions, volume fraction, channel size, etc.), highlighting differences and similarities with granular media. Finally, we discuss the time behaviour of complete plug flows, relating it to the microscopic dynamics of the particles. At variance with dilute systems, dense systems exhibit velocity fluctuations when driven into channels by a constant pressure difference. We find that there exists a threshold value of the flow rate below which oscillations in the velocity are absent and above which their frequency scales as a power law of the flow rate. Although quantitative predictions on this issue are still missing, we present a microscopic description of the phenomenon, highlighting the interplay between the particles and the solvent.
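
The abstract above states that, above a threshold flow rate, the oscillation frequency scales as a power law of the flow rate, without giving the exponent. As a minimal sketch of how such an exponent could be estimated, the following fits a straight line to log-frequency versus log-flow-rate by least squares. The function name and the synthetic data (prefactor 2.0, exponent 1.5) are invented for illustration and are not values from the thesis; only points above the threshold should be passed in.

```python
import math


def fit_power_law(flow_rates, frequencies):
    """Fit f = a * Q**b by least squares in log-log space; returns (a, b)."""
    xs = [math.log(q) for q in flow_rates]
    ys = [math.log(f) for f in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return prefactor, exponent


# Synthetic, noise-free data generated from an assumed law f = 2.0 * Q**1.5.
rates = [1.0, 2.0, 4.0, 8.0]
freqs = [2.0 * q ** 1.5 for q in rates]
prefactor, exponent = fit_power_law(rates, freqs)
print(prefactor, exponent)
```

With real measurements the fit would carry uncertainty, so the residuals in log-log space are worth inspecting before quoting an exponent.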

Edinburgh Research Archive

Reframing 'refugeeness' through the digital in the UK context: Beyond statist ways of seeing

Author: Kaur Seerat
Publication date: 09/05/2023

Explore Bristol Research