9 research outputs found
Space-Time Attention with Shifted Non-Local Search
Efficiently computing attention maps for videos is challenging due to the
motion of objects between frames. While a standard non-local search is
high-quality for a window surrounding each query point, the window's small size
cannot accommodate motion. Methods for long-range motion use an auxiliary
network to predict the most similar key coordinates as offsets from each query
location. However, accurately predicting this flow field of offsets remains
challenging, even for large-scale networks. Small spatial inaccuracies
significantly impact the attention module's quality. This paper proposes a
search strategy that combines the quality of a non-local search with the range
of predicted offsets. The method, named Shifted Non-Local Search, executes a
small grid search surrounding the predicted offsets to correct small spatial
errors. Our method's in-place computation consumes 10 times less memory and is
over 3 times faster than previous work. Experimentally, correcting the small
spatial errors improves the video frame alignment quality by over 3 dB PSNR.
Our search upgrades existing space-time attention modules, which improves video
denoising results by 0.30 dB PSNR for a 7.5% increase in overall runtime. We
integrate our space-time attention module into a UNet-like architecture to
achieve state-of-the-art results on video denoising.
Comment: 15 pages, 12 figures
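A minimal sketch of the search idea described in the abstract above, assuming hypothetical helper names and NumPy feature maps rather than the authors' released implementation: for each query pixel, a predicted flow offset shifts the search window into the key frame, and a small grid search around that shifted center picks the best-matching location, correcting small errors in the predicted offsets.

import numpy as np

def shifted_non_local_search(query, keys, flow, radius=1, patch=3):
    # query, keys: (H, W, C) feature maps from two frames.
    # flow: (H, W, 2) predicted offsets (dy, dx) from the query frame to the key frame.
    # Returns corrected (H, W, 2) offsets chosen by a (2*radius+1)^2 grid search.
    H, W, _ = query.shape
    pad = patch // 2
    qp = np.pad(query, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    kp = np.pad(keys, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    corrected = np.zeros((H, W, 2))
    for y in range(H):
        for x in range(W):
            q_patch = qp[y:y + patch, x:x + patch]
            cy = int(round(y + flow[y, x, 0]))  # shifted search center (row)
            cx = int(round(x + flow[y, x, 1]))  # shifted search center (col)
            best, best_off = -np.inf, (0.0, 0.0)
            for dy in range(-radius, radius + 1):      # small grid search
                for dx in range(-radius, radius + 1):  # around the shifted center
                    ky = min(max(cy + dy, 0), H - 1)
                    kx = min(max(cx + dx, 0), W - 1)
                    k_patch = kp[ky:ky + patch, kx:kx + patch]
                    score = float((q_patch * k_patch).sum())  # patch similarity
                    if score > best:
                        best, best_off = score, (ky - y, kx - x)
            corrected[y, x] = best_off
    return corrected

Attention weights would then be computed only over the keys selected by this corrected search, which is what keeps the window small while still covering long-range motion.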
Investigating Dataset Distinctiveness
Just as a human might struggle to interpret another human’s handwriting, a computer vision program might fail when asked to perform one task in two different domains. To be more specific, picture a self-driving car as a human driver who had only ever driven on clear, sunny days, during daylight hours. This driver – the self-driving car – would inevitably face a significant challenge when asked to drive in violent rain or fog at night, putting the safety of its passengers in danger. An extensive understanding of the data we use to teach computer vision models – such as those that will be driving our cars in the years to come – is necessary as these complex systems find their way into everyday human life. This study works to develop a quantitative definition of the style of a dataset: for image data in computer vision, the analogue of the difference between cursive and print lettering. We accomplished this by asking a machine learning model to predict which commonly used dataset a particular image belongs to, based on detailed features of the images. If the model performed well at classifying images by their source dataset, that dataset was considered distinct. We then developed a linear relationship between this distinctiveness metric and a model’s ability to learn from one dataset and test on another, so as to better understand how a computer vision system will perform in a given context before it is trained.
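A minimal sketch of the distinctiveness measurement described above, assuming scikit-learn and generic image feature vectors rather than the study's actual features or model: a classifier is trained to predict which source dataset an image came from, and its held-out accuracy serves as the distinctiveness score.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def dataset_distinctiveness(features_by_dataset, seed=0):
    # features_by_dataset: list of (n_i, d) arrays, one per source dataset.
    X = np.vstack(features_by_dataset)
    y = np.concatenate([np.full(len(f), i) for i, f in enumerate(features_by_dataset)])
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y)
    clf = RandomForestClassifier(n_estimators=100, random_state=seed).fit(X_tr, y_tr)
    # High accuracy means the datasets are easy to tell apart, i.e. more distinct.
    return accuracy_score(y_te, clf.predict(X_te))

# Toy usage with synthetic features standing in for real image descriptors.
rng = np.random.default_rng(0)
score = dataset_distinctiveness([rng.normal(0.0, 1.0, (200, 16)),
                                 rng.normal(0.5, 1.0, (200, 16))])

The linear relationship mentioned in the abstract would then be fitted between this score and measured cross-dataset test performance.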
Comparison of Visual Datasets for Machine Learning
One of the greatest technological improvements in recent years is the rapid progress in using machine learning to process visual data. Among the factors that contribute to this development, labeled datasets play a crucial role. Several datasets are widely reused for investigating and analyzing different machine learning solutions. Many systems, such as autonomous vehicles, rely on machine learning components for recognizing objects. This paper compares different visual datasets and frameworks for machine learning. The comparison is both qualitative and quantitative and investigates object detection labels with respect to size, location, and contextual information. This paper also presents a new approach to creating datasets from real-time, geo-tagged visual data, greatly improving the contextual information of the data. The data could be automatically labeled by cross-referencing information from other sources (such as weather).
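A minimal sketch of the cross-referencing idea from the last sentence above, using an invented frame schema and a placeholder weather_lookup callable (the paper does not specify these details): each geo-tagged, timestamped frame is joined against an external record source to obtain a label without manual annotation.

from dataclasses import dataclass
from datetime import datetime

@dataclass
class Frame:
    camera_id: str
    timestamp: datetime
    lat: float
    lon: float

def auto_label(frames, weather_lookup):
    # weather_lookup(lat, lon, timestamp) -> label string such as "rain" or "clear";
    # it stands in for whatever external source (weather records, etc.) is queried.
    return [(f, weather_lookup(f.lat, f.lon, f.timestamp)) for f in frames]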
See the World through Network Cameras
Millions of network cameras have been deployed worldwide. Real-time data from many network cameras can offer instant views of multiple locations, with applications in public safety, transportation management, urban planning, agriculture, forestry, social sciences, atmospheric information, and more. First, this paper describes the real-time data available from worldwide network cameras and its potential applications. Second, it outlines the CAM2 System available to users at https://www.cam2project.net/, including strategies to discover network cameras and to create the camera database, user interface, and computing platforms. Third, it describes the many opportunities provided by data from network cameras and the challenges to be addressed.
Dynamic Sampling in Convolutional Neural Networks for Imbalanced Data Classification
Many multimedia systems stream real-time visual data continuously for a wide variety of applications. These systems can produce vast amounts of data, but few studies take advantage of such versatile, real-time data. This paper presents a novel model based on Convolutional Neural Networks (CNNs) to handle such imbalanced and heterogeneous data and to successfully identify the semantic concepts in these multimedia systems. The proposed model can discover semantic concepts from data with a skewed distribution using a dynamic sampling technique. The paper also presents a system that can retrieve real-time visual data from heterogeneous cameras, with a run-time environment that allows the analysis programs to process data from thousands of cameras simultaneously. Evaluation results in comparison with several state-of-the-art methods demonstrate the ability and effectiveness of the proposed model on visual data captured by public network cameras.
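A minimal sketch of a dynamic sampling rule in the spirit of the abstract above, with an assumed update based on per-class recall (the paper's exact rule may differ): classes the model currently handles poorly are sampled more often in the next training epoch.

import numpy as np

def update_sampling_weights(per_class_recall, smoothing=0.05):
    # per_class_recall: recall per class from the latest validation pass.
    # Classes with low recall receive proportionally larger sampling probability.
    error = 1.0 - np.asarray(per_class_recall, dtype=float) + smoothing
    return error / error.sum()

def sample_epoch_indices(labels, class_probs, epoch_size, rng=None):
    # Draw training indices so each class appears in proportion to class_probs,
    # oversampling rare or poorly learned classes with replacement.
    rng = rng or np.random.default_rng()
    labels = np.asarray(labels)
    counts = rng.multinomial(epoch_size, class_probs)
    picks = [rng.choice(np.flatnonzero(labels == cls), size=n, replace=True)
             for cls, n in enumerate(counts) if n > 0 and np.any(labels == cls)]
    return np.concatenate(picks) if picks else np.array([], dtype=int)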
Low-Power Computer Vision: Status, Challenges, Opportunities
Computer vision has achieved impressive progress in recent years. Meanwhile, mobile phones have become the primary computing platforms for millions of people. In addition to mobile phones, many autonomous systems rely on visual data for making decisions, and some of these systems have limited energy, such as unmanned aerial vehicles (also called drones) and mobile robots. These systems rely on batteries, and energy efficiency is critical. This article serves two main purposes: (1) Examine the state of the art in low-power solutions for detecting objects in images. Since 2015, the IEEE Annual International Low-Power Image Recognition Challenge (LPIRC) has been held to identify the most energy-efficient computer vision solutions. This article summarizes the 2018 winners' solutions. (2) Suggest directions for research as well as opportunities in low-power computer vision.