1,864 research outputs found

    Self-Selective Correlation Ship Tracking Method for Smart Ocean System

    Full text link
    In recent years, with the development of the marine industry, navigation environment becomes more complicated. Some artificial intelligence technologies, such as computer vision, can recognize, track and count the sailing ships to ensure the maritime security and facilitates the management for Smart Ocean System. Aiming at the scaling problem and boundary effect problem of traditional correlation filtering methods, we propose a self-selective correlation filtering method based on box regression (BRCF). The proposed method mainly include: 1) A self-selective model with negative samples mining method which effectively reduces the boundary effect in strengthening the classification ability of classifier at the same time; 2) A bounding box regression method combined with a key points matching method for the scale prediction, leading to a fast and efficient calculation. The experimental results show that the proposed method can effectively deal with the problem of ship size changes and background interference. The success rates and precisions were higher than Discriminative Scale Space Tracking (DSST) by over 8 percentage points on the marine traffic dataset of our laboratory. In terms of processing speed, the proposed method is higher than DSST by nearly 22 Frames Per Second (FPS)

    Security and blockchain convergence with internet of multimedia things : current trends, research challenges and future directions

    Get PDF
    The Internet of Multimedia Things (IoMT) orchestration enables the integration of systems, software, cloud, and smart sensors into a single platform. The IoMT deals with scalar as well as multimedia data. In these networks, sensor-embedded devices and their data face numerous challenges when it comes to security. In this paper, a comprehensive review of the existing literature for IoMT is presented in the context of security and blockchain. The latest literature on all three aspects of security, i.e., authentication, privacy, and trust is provided to explore the challenges experienced by multimedia data. The convergence of blockchain and IoMT along with multimedia-enabled blockchain platforms are discussed for emerging applications. To highlight the significance of this survey, large-scale commercial projects focused on security and blockchain for multimedia applications are reviewed. The shortcomings of these projects are explored and suggestions for further improvement are provided. Based on the aforementioned discussion, we present our own case study for healthcare industry: a theoretical framework having security and blockchain as key enablers. The case study reflects the importance of security and blockchain in multimedia applications of healthcare sector. Finally, we discuss the convergence of emerging technologies with security, blockchain and IoMT to visualize the future of tomorrow's applications. © 2020 Elsevier Lt

    Applying psychological science to the CCTV review process: a review of cognitive and ergonomic literature

    Get PDF
    As CCTV cameras are used more and more often to increase security in communities, police are spending a larger proportion of their resources, including time, in processing CCTV images when investigating crimes that have occurred (Levesley & Martin, 2005; Nichols, 2001). As with all tasks, there are ways to approach this task that will facilitate performance and other approaches that will degrade performance, either by increasing errors or by unnecessarily prolonging the process. A clearer understanding of psychological factors influencing the effectiveness of footage review will facilitate future training in best practice with respect to the review of CCTV footage. The goal of this report is to provide such understanding by reviewing research on footage review, research on related tasks that require similar skills, and experimental laboratory research about the cognitive skills underpinning the task. The report is organised to address five challenges to effectiveness of CCTV review: the effects of the degraded nature of CCTV footage, distractions and interrupts, the length of the task, inappropriate mindset, and variability in people’s abilities and experience. Recommendations for optimising CCTV footage review include (1) doing a cognitive task analysis to increase understanding of the ways in which performance might be limited, (2) exploiting technology advances to maximise the perceptual quality of the footage (3) training people to improve the flexibility of their mindset as they perceive and interpret the images seen, (4) monitoring performance either on an ongoing basis, by using psychophysiological measures of alertness, or periodically, by testing screeners’ ability to find evidence in footage developed for such testing, and (5) evaluating the relevance of possible selection tests to screen effective from ineffective screener

    Improving The Efficiency Of Video Transmission In Computer Networks

    Get PDF
    In-depth examination of current techniques for enhancing the efficiency of video transmission over digital networks is provided in this study. Due to the growing need for high-quality video content, optimizing video transmission is an important area of research. This review categorizes and in-depth examines a range of methods proposed in the literature to enhance video transmission effectiveness. ABR, DNN architecture, adaptive streaming, Quality of Service (QoS), error resilience, congestion control, video compression, and hardware acceleration for video provisioning are just a few of the cutting-edge techniques that are covered in the discussion, which ranges from the more traditional to the cutting-edge. This essay provides a methodical evaluation of the numerous tactics that are available, along with an analysis of their guiding principles, advantages, and disadvantages. The paper also offers a comparative analysis of various approaches, highlighting trends, gaps, and potential future research directions in this crucial domain, all of which help to create more efficient video compression and transmission paradigms in computer networks

    Scalable Video Coding for Humans and Machines

    Full text link
    Video content is watched not only by humans, but increasingly also by machines. For example, machine learning models analyze surveillance video for security and traffic monitoring, search through YouTube videos for inappropriate content, and so on. In this paper, we propose a scalable video coding framework that supports machine vision (specifically, object detection) through its base layer bitstream and human vision via its enhancement layer bitstream. The proposed framework includes components from both conventional and Deep Neural Network (DNN)-based video coding. The results show that on object detection, the proposed framework achieves 13-19% bit savings compared to state-of-the-art video codecs, while remaining competitive in terms of MS-SSIM on the human vision task.Comment: 6 pages, 5 figures, IEEE MMSP 202

    Prioritizing Content of Interest in Multimedia Data Compression

    Get PDF
    Image and video compression techniques make data transmission and storage in digital multimedia systems more efficient and feasible for the system's limited storage and bandwidth. Many generic image and video compression techniques such as JPEG and H.264/AVC have been standardized and are now widely adopted. Despite their great success, we observe that these standard compression techniques are not the best solution for data compression in special types of multimedia systems such as microscopy videos and low-power wireless broadcast systems. In these application-specific systems where the content of interest in the multimedia data is known and well-defined, we should re-think the design of a data compression pipeline. We hypothesize that by identifying and prioritizing multimedia data's content of interest, new compression methods can be invented that are far more effective than standard techniques. In this dissertation, a set of new data compression methods based on the idea of prioritizing the content of interest has been proposed for three different kinds of multimedia systems. I will show that the key to designing efficient compression techniques in these three cases is to prioritize the content of interest in the data. The definition of the content of interest of multimedia data depends on the application. First, I show that for microscopy videos, the content of interest is defined as the spatial regions in the video frame with pixels that don't only contain noise. Keeping data in those regions with high quality and throwing out other information yields to a novel microscopy video compression technique. Second, I show that for a Bluetooth low energy beacon based system, practical multimedia data storage and transmission is possible by prioritizing content of interest. I designed custom image compression techniques that preserve edges in a binary image, or foreground regions of a color image of indoor or outdoor objects. Last, I present a new indoor Bluetooth low energy beacon based augmented reality system that integrates a 3D moving object compression method that prioritizes the content of interest.Doctor of Philosoph

    Domain-Specific Computing Architectures and Paradigms

    Full text link
    We live in an exciting era where artificial intelligence (AI) is fundamentally shifting the dynamics of industries and businesses around the world. AI algorithms such as deep learning (DL) have drastically advanced the state-of-the-art cognition and learning capabilities. However, the power of modern AI algorithms can only be enabled if the underlying domain-specific computing hardware can deliver orders of magnitude more performance and energy efficiency. This work focuses on this goal and explores three parts of the domain-specific computing acceleration problem; encapsulating specialized hardware and software architectures and paradigms that support the ever-growing processing demand of modern AI applications from the edge to the cloud. This first part of this work investigates the optimizations of a sparse spatio-temporal (ST) cognitive system-on-a-chip (SoC). This design extracts ST features from videos and leverages sparse inference and kernel compression to efficiently perform action classification and motion tracking. The second part of this work explores the significance of dataflows and reduction mechanisms for sparse deep neural network (DNN) acceleration. This design features a dynamic, look-ahead index matching unit in hardware to efficiently discover fine-grained parallelism, achieving high energy efficiency and low control complexity for a wide variety of DNN layers. Lastly, this work expands the scope to real-time machine learning (RTML) acceleration. A new high-level architecture modeling framework is proposed. Specifically, this framework consists of a set of high-performance RTML-specific architecture design templates, and a Python-based high-level modeling and compiler tool chain for efficient cross-stack architecture design and exploration.PHDElectrical and Computer EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/162870/1/lchingen_1.pd

    Identification, indexing, and retrieval of cardio-pulmonary resuscitation (CPR) video scenes of simulated medical crisis.

    Get PDF
    Medical simulations, where uncommon clinical situations can be replicated, have proved to provide a more comprehensive training. Simulations involve the use of patient simulators, which are lifelike mannequins. After each session, the physician must manually review and annotate the recordings and then debrief the trainees. This process can be tedious and retrieval of specific video segments should be automated. In this dissertation, we propose a machine learning based approach to detect and classify scenes that involve rhythmic activities such as Cardio-Pulmonary Resuscitation (CPR) from training video sessions simulating medical crises. This applications requires different preprocessing techniques from other video applications. In particular, most processing steps require the integration of multiple features such as motion, color and spatial and temporal constrains. The first step of our approach consists of segmenting the video into shots. This is achieved by extracting color and motion information from each frame and identifying locations where consecutive frames have different features. We propose two different methods to identify shot boundaries. The first one is based on simple thresholding while the second one uses unsupervised learning techniques. The second step of our approach consists of selecting one key frame from each shot and segmenting it into homogeneous regions. Then few regions of interest are identified for further processing. These regions are selected based on the type of motion of their pixels and their likelihood to be skin-like regions. The regions of interest are tracked and a sequence of observations that encode their motion throughout the shot is extracted. The next step of our approach uses an HMM classiffier to discriminate between regions that involve CPR actions and other regions. We experiment with both continuous and discrete HMM. Finally, to improve the accuracy of our system, we also detect faces in each key frame, track them throughout the shot, and fuse their HMM confidence with the region\u27s confidence. To allow the user to view and analyze the video training session much more efficiently, we have also developed a graphical user interface (GUI) for CPR video scene retrieval and analysis with several desirable features. To validate our proposed approach to detect CPR scenes, we use one video simulation session recorded by the SPARC group to train the HMM classifiers and learn the system\u27s parameters. Then, we analyze the proposed system on other video recordings. We show that our approach can identify most CPR scenes with few false alarms
    • …
    corecore