Simple vs complex temporal recurrences for video saliency prediction
This paper investigates modifying an existing neural network architecture for static saliency prediction using two types of recurrences that integrate information from the temporal domain. The first modification is the addition of a ConvLSTM within the architecture, while the second is a conceptually simple exponential moving average of an internal convolutional state. We use weights pre-trained on the SALICON dataset and fine-tune our model on DHF1K. Our results show that both modifications achieve state-of-the-art results and produce similar saliency maps. Source code is available at https://git.io/fjPiB
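The second recurrence described above, an exponential moving average of an internal convolutional state, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `ema_states`, the smoothing factor `alpha`, and the use of flat lists of floats in place of convolutional feature maps are all assumptions made for illustration.

```python
def ema_states(frame_features, alpha=0.1):
    """Yield an exponentially smoothed state for each frame.

    frame_features: iterable of equal-length lists of floats, standing in
    for the flattened convolutional state of each video frame (assumption).
    alpha: weight given to the current frame (hypothetical parameter).
    """
    state = None
    for feat in frame_features:
        if state is None:
            # First frame: the state is initialised to the frame's features.
            state = list(feat)
        else:
            # state_t = (1 - alpha) * state_{t-1} + alpha * feat_t
            state = [(1 - alpha) * s + alpha * f for s, f in zip(state, feat)]
        yield state


if __name__ == "__main__":
    frames = [[0.0, 2.0], [1.0, 2.0], [1.0, 0.0]]
    for t, s in enumerate(ema_states(frames, alpha=0.5)):
        print(t, s)
```

Unlike a ConvLSTM, this recurrence adds at most one parameter (`alpha`), which is why the abstract calls it conceptually simple.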
Bags of local convolutional features for scalable instance search
This work proposes a simple instance retrieval pipeline based on encoding the convolutional features of a CNN using the bag of words (BoW) aggregation scheme. Assigning each local array of activations in a convolutional layer to a visual word produces an assignment map, a compact representation that relates regions of an image to visual words. We use the assignment map for fast spatial reranking, obtaining object localizations that are used for query expansion. We demonstrate the suitability of the BoW representation based on local CNN features for instance retrieval, achieving competitive performance on the Oxford and Paris buildings benchmarks. We show that our proposed system for CNN feature aggregation with BoW outperforms state-of-the-art techniques using sum pooling on a subset of the challenging TRECVid INS benchmark
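The assignment map described above can be sketched as follows. This is an illustration under stated assumptions rather than the paper's pipeline: the names `assignment_map` and `bow_histogram`, the squared Euclidean nearest-codeword rule, and the tiny hand-written codebook are all hypothetical.

```python
def assignment_map(feature_map, codebook):
    """Map each local feature vector to the index of its nearest visual word.

    feature_map: H x W nested list of D-dimensional vectors (assumption:
    one vector per spatial location of a convolutional layer).
    codebook: list of D-dimensional visual-word centroids (hypothetical).
    """
    def nearest(vec):
        # Squared Euclidean distance to every centroid; keep the closest.
        return min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(vec, codebook[k])),
        )
    return [[nearest(vec) for vec in row] for row in feature_map]


def bow_histogram(amap, vocab_size, box=None):
    """Count visual words in the whole map or inside a window.

    box: optional (row_start, row_end, col_start, col_end), end-exclusive,
    which is how a candidate region could be scored during reranking.
    """
    r0, r1, c0, c1 = box if box else (0, len(amap), 0, len(amap[0]))
    hist = [0] * vocab_size
    for row in amap[r0:r1]:
        for word in row[c0:c1]:
            hist[word] += 1
    return hist


if __name__ == "__main__":
    codebook = [[0.0, 0.0], [1.0, 1.0]]
    fmap = [[[0.1, 0.0], [0.9, 1.0]], [[1.0, 0.8], [0.2, 0.1]]]
    amap = assignment_map(fmap, codebook)
    print(amap, bow_histogram(amap, len(codebook)))
```

Because the map keeps one word index per spatial location, a histogram for any window can be computed without touching the original features, which is what makes spatial reranking cheap.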
Dublin's participation in the predicting media memorability task at MediaEval 2018
This paper outlines six approaches to computing video memorability for the MediaEval media memorability task. The approaches are based on video features, an end-to-end approach, saliency, aesthetics, neural feedback, and an ensemble of all approaches
Exploring EEG for object detection and retrieval
This paper explores the potential for using Brain Computer Interfaces (BCI) as a relevance feedback mechanism in content-based image retrieval. Several experiments are performed using a rapid serial visual presentation (RSVP) of images at different rates (5Hz and 10Hz) on 8 users with different degrees of familiarization with BCI and the dataset. We compare the feedback from the BCI and mouse-based interfaces on a subset of TRECVid images, finding that, when users have limited time to annotate the images, both interfaces are comparable in performance. Comparing our best users in a retrieval task, we found that EEG-based relevance feedback can outperform mouse-based feedback
Freeform Fresnel RXI-RR Köhler design with spectrum-splitting for photovoltaics
The development of a novel optical design for a high concentration photovoltaics (HCPV) nonimaging concentrator (>500x) that utilizes a built-in spectrum-splitting concept is presented. The primary optical element (POE) is a flat Fresnel lens and the secondary optical element (SOE) is a free-form RXI-type concentrator with a band-pass filter embedded in it. The POE and SOE perform Köhler integration to produce light homogenization on the receiver. The system uses a combination of a commercial concentration GaInP/GaInAs/Ge 3J cell and a concentration Back-Point-Contact (BPC) silicon cell for efficient spectral utilization, and an external confinement technique for recovering the 3J cell’s reflection. A design target of an “equivalent” cell efficiency of ~46% is predicted using commercial 39% 3J and 26% Si cells. A projected CPV module efficiency of greater than 38% is achievable at a concentration level greater than 500x with a wide acceptance angle of ±1°. A first proof-of-concept receiver prototype has been manufactured using a simpler optical architecture (with a lower concentration, ~100x, and lower simulated added efficiency), and experimental measurements have shown up to 39.8% 4J receiver efficiency using a 3J cell with a peak efficiency of 36.9%
Insight Centre for Data Analytics (DCU) at TRECVid 2014: instance search and semantic indexing tasks
Insight-DCU participated in the instance search (INS) and semantic indexing (SIN) tasks in 2014. Two very different approaches were submitted for instance search, one based on features extracted using pre-trained deep convolutional neural networks (CNNs), and another based on local SIFT features, large vocabulary visual bag-of-words aggregation, inverted index-based lookup, and geometric verification on the top-N retrieved results. Two interactive runs and two automatic runs were submitted; the best interactive run achieved a mAP of 0.135 and the best automatic run 0.12. Our semantic indexing runs were also based on convolutional neural network features, and on Support Vector Machine classifiers with linear and RBF kernels. One run was submitted to the main task, two to the no-annotation task, and one to the progress task. Data for the no-annotation task was gathered from Google Images and ImageNet. The main task run achieved a mAP of 0.086; the best no-annotation run came close to the main run with a mAP of 0.080, while the progress run achieved 0.043
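The mAP figures quoted above are mean average precision scores. As a reminder of what such a score measures, here is a minimal sketch of the textbook definition. It is an assumption that this matches the official evaluation: TRECVid scoring uses its own tools and may rely on sampled or inferred variants. The function names and the binary relevance lists are hypothetical.

```python
def average_precision(relevance, num_relevant=None):
    """Average precision of one ranked list.

    relevance: list of 0/1 flags in rank order (1 = relevant result).
    num_relevant: total relevant items for the query; defaults to the
    number of relevant items found in the list (assumption).
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            # Precision measured at the rank of each relevant result.
            precision_sum += hits / rank
    total = num_relevant if num_relevant is not None else hits
    return precision_sum / total if total else 0.0


def mean_average_precision(ranked_lists):
    """Mean of the per-query average precision values."""
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)


if __name__ == "__main__":
    print(mean_average_precision([[1, 0, 1], [0, 1, 0, 0]]))
```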
The influence of semantic and phonological factors on syntactic decisions: An event-related brain potential study
During language production and comprehension, information about a word's syntactic properties is sometimes needed. While the decision about the grammatical gender of a word requires access to syntactic knowledge, it has also been hypothesized that semantic (i.e., biological gender) or phonological information (i.e., sound regularities) may influence this decision. Event-related potentials (ERPs) were measured while native speakers of German processed written words that were or were not semantically and/or phonologically marked for gender. Behavioral and ERP results showed that participants were faster in making a gender decision when words were semantically and/or phonologically gender marked than when this was not the case, although the phonological effects were less clear. In conclusion, our data provide evidence that even though participants performed a grammatical gender decision, this task can be influenced by semantic and phonological factors
Analysis of frame-compatible subsampling structures for efficient 3DTV broadcast
The evolution of the television market is led by 3DTV technology, and this tendency may accelerate over the next few years according to expert forecasts. However, 3DTV delivery by broadcast networks is not yet sufficiently developed, and acts as a bottleneck for the complete deployment of the technology. Thus, increasing interest is dedicated to stereo 3DTV formats compatible with current HDTV video equipment and infrastructure, as they may greatly encourage 3D acceptance. In this paper, different subsampling schemes for HDTV-compatible transmission of both progressive and interlaced stereo 3DTV are studied and compared. The frequency characteristics and preserved frequency content of each scheme are analyzed, and a simple interpolation filter is specially designed. Finally, the advantages and disadvantages of the different schemes and filters are evaluated through quality testing on several progressive and interlaced video sequences
