Attribute Artifacts Removal for Geometry-based Point Cloud Compression
Geometry-based point cloud compression (G-PCC) can achieve remarkable
compression efficiency for point clouds. However, it still leads to serious
attribute compression artifacts, especially under low bitrate scenarios. In
this paper, we propose a Multi-Scale Graph Attention Network (MS-GAT) to remove
the artifacts of point cloud attributes compressed by G-PCC. We first construct
a graph based on point cloud geometry coordinates and then use the Chebyshev
graph convolutions to extract features of point cloud attributes. Considering
that one point may be correlated with points both near and far away from it, we
propose a multi-scale scheme to capture the short- and long-range correlations
between the current point and its neighboring and distant points. To address
the problem that various points may have different degrees of artifacts caused
by adaptive quantization, we introduce the quantization step per point as an
extra input to the proposed network. We also incorporate a weighted graph
attentional layer into the network to pay special attention to the points with
more attribute artifacts. To the best of our knowledge, this is the first
attribute artifacts removal method for G-PCC. We validate the effectiveness of
our method over various point clouds. Objective comparison results show that
our proposed method achieves an average of 9.74% BD-rate reduction compared
with Predlift and 10.13% BD-rate reduction compared with RAHT. Subjective
comparison results show that visual artifacts such as color shifting,
blurring, and quantization noise are reduced.
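The Chebyshev graph convolution the abstract relies on can be illustrated with a minimal NumPy sketch. This is not the authors' MS-GAT implementation: the kNN graph construction, the filter order `K`, and the random filter weights `theta` are all illustrative assumptions.

```python
import numpy as np

def chebyshev_graph_filter(coords, attrs, K=3, k_neighbors=4, seed=0):
    """Filter per-point attributes with a K-term Chebyshev polynomial of the
    rescaled graph Laplacian (a sketch; the weights theta are hypothetical)."""
    n = coords.shape[0]
    # Build a symmetric k-nearest-neighbour adjacency from geometry coordinates.
    d = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
    A = np.zeros((n, n))
    for i in range(n):
        nn = np.argsort(d[i])[1:k_neighbors + 1]    # skip the point itself
        A[i, nn] = 1.0
    A = np.maximum(A, A.T)                          # symmetrise
    deg = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
    L = np.eye(n) - D_inv_sqrt @ A @ D_inv_sqrt     # normalized Laplacian
    L_tilde = L - np.eye(n)                         # rescale spectrum to ~[-1, 1]
    # Chebyshev recurrence: T0 = X, T1 = L~X, Tk = 2 L~ T(k-1) - T(k-2)
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(K)                  # hypothetical learned weights
    T_prev, T_curr = attrs, L_tilde @ attrs
    out = theta[0] * T_prev + theta[1] * T_curr
    for k in range(2, K):
        T_next = 2 * L_tilde @ T_curr - T_prev
        out += theta[k] * T_next
        T_prev, T_curr = T_curr, T_next
    return out
```

The recurrence keeps the filter strictly K-hop localised on the graph, which is why stacking several scales (different K or neighbourhood sizes) captures both short- and long-range correlations.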
AutoSeqRec: Autoencoder for Efficient Sequential Recommendation
Sequential recommendation demonstrates the capability to recommend items by
modeling the sequential behavior of users. Traditional methods typically treat
users as sequences of items, overlooking the collaborative relationships among
them. Graph-based methods incorporate collaborative information by utilizing
the user-item interaction graph. However, these methods sometimes face
challenges in terms of time complexity and computational efficiency. To address
these limitations, this paper presents AutoSeqRec, an incremental
recommendation model specifically designed for sequential recommendation tasks.
AutoSeqRec is based on autoencoders and consists of an encoder and three
decoders within the autoencoder architecture. These components consider both
the user-item interaction matrix and the rows and columns of the item
transition matrix. The reconstruction of the user-item interaction matrix
captures user long-term preferences through collaborative filtering. In
addition, the rows and columns of the item transition matrix represent the item
out-degree and in-degree hopping behavior, which allows for modeling the user's
short-term interests. When making incremental recommendations, only the input
matrices need to be updated, without the need to update parameters, which makes
AutoSeqRec very efficient. Comprehensive evaluations demonstrate that
AutoSeqRec outperforms existing methods in terms of accuracy, while showcasing
its robustness and efficiency.
Comment: 10 pages, accepted by CIKM 202
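The two matrices the abstract describes can be sketched in a few lines of NumPy. This is a simplified illustration, not AutoSeqRec itself: the scoring rule below blends a naive item co-occurrence signal with the transition row of the last item, and the `alpha` blend weight is a made-up parameter.

```python
import numpy as np

def build_matrices(sequences, n_items):
    """Build the user-item interaction matrix R and the item-item transition
    matrix T from per-user item sequences (a sketch)."""
    R = np.zeros((len(sequences), n_items))
    T = np.zeros((n_items, n_items))
    for u, seq in enumerate(sequences):
        for item in seq:
            R[u, item] = 1.0
        for a, b in zip(seq, seq[1:]):
            T[a, b] += 1.0   # row a = out-degree hops, column b = in-degree hops
    return R, T

def score(R, T, u, last_item, alpha=0.5):
    """Blend long-term collaborative scores with short-term transition scores."""
    long_term = R.T @ R @ R[u]       # simple item co-occurrence signal
    short_term = T[last_item]        # where users tend to go after last_item
    s = alpha * long_term / max(long_term.max(), 1e-12) \
        + (1 - alpha) * short_term / max(short_term.max(), 1e-12)
    s[R[u] > 0] = -np.inf            # do not re-recommend seen items
    return s
```

The incremental property the abstract claims corresponds here to updating `R` and `T` with new interactions; no model parameters need retraining before scoring again.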
A computational model of visual attention
Visual attention is a process by which the Human Visual System (HVS) selects the most important information from a scene. Visual attention models are computational or mathematical models developed to predict this information. The performance of state-of-the-art visual attention models is limited in terms of prediction accuracy and computational complexity. In spite of a significant amount of active research in this area, modelling visual attention is still an open research challenge. This thesis proposes a novel computational model of visual attention that achieves higher prediction accuracy with low computational complexity. A new bottom-up visual attention model based on in-focus regions is proposed. To develop the model, an image dataset is created by capturing images with in-focus and out-of-focus regions. The Discrete Cosine Transform (DCT) spectrum of these images is investigated qualitatively and quantitatively to discover the key frequency coefficients that correspond to the in-focus regions. The model detects these key coefficients by formulating a novel relation between the in-focus and out-of-focus regions in the frequency domain. These frequency coefficients are used to detect the salient in-focus regions. The simulation results show that this attention model achieves good prediction accuracy with low complexity. The prediction accuracy of the proposed in-focus visual attention model is further improved by incorporating the sensitivity of the HVS towards the image centre and human faces. Moreover, the computational complexity is further reduced by using the Integer Cosine Transform (ICT). The model is parameter-tuned using a hill-climbing approach to optimise accuracy. The performance has been analysed qualitatively and quantitatively using two large image datasets with eye-tracking fixation ground truth.
The results show that the model achieves higher prediction accuracy with lower computational complexity than state-of-the-art visual attention models. The proposed model is useful for predicting human fixations in computationally constrained environments, mainly in applications such as perceptual video coding, image quality assessment, object recognition, and image segmentation.
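The core idea of using DCT coefficients to separate in-focus from out-of-focus regions can be sketched as follows. This is only a generic block-DCT focus measure built under the assumption that in-focus patches retain more non-DC (higher-frequency) energy; the thesis's actual coefficient selection rule is not reproduced here, and the `patch` size is an arbitrary choice.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    M = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    M[0] *= 1 / np.sqrt(n)
    M[1:] *= np.sqrt(2 / n)
    return M

def focus_map(image, patch=8):
    """Score each patch by its share of non-DC DCT energy: in-focus
    (sharp) patches keep more energy away from the DC corner."""
    M = dct_matrix(patch)
    h, w = image.shape
    out = np.zeros((h // patch, w // patch))
    for r in range(h // patch):
        for c in range(w // patch):
            blk = image[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch]
            coef = M @ blk @ M.T                 # 2-D DCT-II of the patch
            energy = coef ** 2
            high = energy.sum() - energy[0, 0]   # drop the DC term
            out[r, c] = high / max(energy.sum(), 1e-12)
    return out
```

For example, a patch containing sharp texture scores near its full energy share, while a blurred or flat patch concentrates its energy in the DC coefficient and scores near zero.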
Applying psychological science to the CCTV review process: a review of cognitive and ergonomic literature
As CCTV cameras are used more and more often to increase security in communities, police are spending a larger proportion of their resources, including time, in processing CCTV images when investigating crimes that have occurred (Levesley & Martin, 2005; Nichols, 2001). As with all tasks, there are ways to approach this task that will facilitate performance and other approaches that will degrade performance, either by increasing errors or by unnecessarily prolonging the process. A clearer understanding of the psychological factors influencing the effectiveness of footage review will facilitate future training in best practice with respect to the review of CCTV footage. The goal of this report is to provide such understanding by reviewing research on footage review, research on related tasks that require similar skills, and experimental laboratory research on the cognitive skills underpinning the task. The report is organised to address five challenges to the effectiveness of CCTV review: the effects of the degraded nature of CCTV footage, distractions and interruptions, the length of the task, inappropriate mindset, and variability in people’s abilities and experience. Recommendations for optimising CCTV footage review include (1) doing a cognitive task analysis to increase understanding of the ways in which performance might be limited, (2) exploiting technology advances to maximise the perceptual quality of the footage, (3) training people to improve the flexibility of their mindset as they perceive and interpret the images seen, (4) monitoring performance either on an ongoing basis, by using psychophysiological measures of alertness, or periodically, by testing screeners’ ability to find evidence in footage developed for such testing, and (5) evaluating the relevance of possible selection tests to distinguish effective from ineffective screeners.
Temporal unpredictability detection of real-time video sequence