Multimedia information technology and the annotation of video
The state of the art in multimedia information technology has not progressed to the point where a single solution meets all reasonable needs of documentalists and users of video archives. In general, we do not take an optimistic view of the usability of new technology in this domain, but digitization and digital processing power can be expected to cause a small revolution in the area of video archiving. The volume of data leads to two views of the future: on the pessimistic side, the overload of data will outstrip annotation capacity; on the optimistic side, there will be enough data from which to learn selected concepts that can be deployed to support automatic annotation. At the threshold of this interesting era, we attempt to describe the state of the art in the technology, sampling progress in text, sound, and image processing, as well as in machine learning.
High-resolution SAR images for fire susceptibility estimation in urban forestry
We present an adaptive system for the automatic assessment of both physical and anthropic fire impact factors in periurban forests. The aim is to provide an integrated methodology exploiting a complex data structure built upon a multi-resolution grid that gathers historical land-exploitation and meteorological data, records of human habits, suitably segmented and interpreted high-resolution X-SAR images, and several other information sources. The contribution and novelty of the model lie mainly in the definition of a learning schema that lifts different factors and aspects of fire causes, including physical, social, and behavioural ones, into the design of a fire susceptibility map for a specific urban forest. The outcome is an integrated geospatial database providing an infrastructure that merges cartography, heterogeneous data, and complex analysis, thereby establishing a digital environment in which users and tools are interactively connected in an efficient and flexible way.
Imaging time series for the classification of EMI discharge sources
In this work, we aim to classify a wider range of Electromagnetic Interference (EMI) discharge sources collected from new power-plant sites across multiple assets, which makes for a more complex and challenging classification task. The study investigates and develops new and improved feature extraction and dimensionality reduction algorithms based on image-processing techniques. The approach exploits the Gramian Angular Field (GAF) technique to map each measured EMI time signal to an image, from which the significant information is extracted while redundancy is removed; the image of each discharge type contains a unique fingerprint. Two feature-reduction methods, the Local Binary Pattern (LBP) and the Local Phase Quantisation (LPQ), are then applied to the mapped images, yielding feature vectors that are fed into a Random Forest (RF) classifier. The performance of a previous method and the two newly proposed ones is compared on the new database in terms of classification accuracy, precision, recall, and F-measure. Results show that the new methods outperform the previous one, with LBP features achieving the best outcome.
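The time-signal-to-image pipeline described here (GAF mapping, LBP feature reduction, RF classification) can be sketched roughly as follows. This is a minimal illustration only, assuming a summation-type GAF and a basic 3×3 LBP; the paper's exact preprocessing, parameterisation, and LPQ variant are not reproduced, and the toy signals are invented for demonstration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def gramian_angular_field(x):
    """Map a 1-D signal to a (summation-type) Gramian Angular Field image."""
    x = np.asarray(x, dtype=float)
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1   # rescale to [-1, 1]
    phi = np.arccos(np.clip(x, -1.0, 1.0))            # polar angular encoding
    return np.cos(phi[:, None] + phi[None, :])        # GASF: cos(phi_i + phi_j)

def lbp_histogram(img, bins=256):
    """Basic 3x3 LBP: encode each interior pixel by comparing its 8 neighbours,
    then summarise the image as a normalised code histogram (feature vector)."""
    c = img[1:-1, 1:-1]
    codes = np.zeros_like(c, dtype=np.uint8)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        n = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        codes |= (n >= c).astype(np.uint8) << bit
    hist, _ = np.histogram(codes, bins=bins, range=(0, 256), density=True)
    return hist

# Hypothetical toy usage: two classes of synthetic "discharge" signals.
rng = np.random.default_rng(0)
X, y = [], []
for label, freq in enumerate([5.0, 13.0]):
    for _ in range(10):
        t = np.linspace(0, 1, 64)
        sig = np.sin(2 * np.pi * freq * t) + 0.1 * rng.standard_normal(64)
        X.append(lbp_histogram(gramian_angular_field(sig)))
        y.append(label)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
```

The same histogram features could equally be produced by a library LBP implementation (e.g. scikit-image); the hand-rolled version above is only to keep the sketch self-contained.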
CHORUS Deliverable 3.4: Vision Document
The goal of the CHORUS Vision Document is to set out a high-level vision of audio-visual search engines in order to guide future R&D work in this area and to highlight trends and challenges in the domain. The vision of CHORUS is strongly connected to the CHORUS Roadmap Document (D2.3). A concise document integrating the outcomes of the two deliverables will be prepared for the end of the project (NEM Summit).
Improving Generalization of Synthetically Trained Sonar Image Descriptors for Underwater Place Recognition
Autonomous navigation in underwater environments presents challenges due to factors such as light absorption and water turbidity, which limit the effectiveness of optical sensors. Sonar systems are commonly used for perception in underwater operations as they are unaffected by these limitations. Traditional computer vision algorithms are less effective when applied to sonar-generated acoustic images, while convolutional neural networks (CNNs) typically require large amounts of labeled training data that are often unavailable or difficult to acquire. To this end, we propose a novel compact deep sonar descriptor pipeline that can generalize to real scenarios while being trained exclusively on synthetic data. Our architecture is based on a ResNet18 back-end and a properly parameterized random Gaussian projection layer, whereas input sonar data is enhanced with standard ad-hoc normalization/prefiltering techniques. A customized synthetic data generation procedure is also presented. The proposed method has been evaluated extensively on both synthetic and publicly available real data, demonstrating its effectiveness compared to state-of-the-art methods.

Comment: This paper has been accepted for publication at the 14th International Conference on Computer Vision Systems (ICVS 2023).
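The random Gaussian projection layer mentioned in this abstract can be illustrated in isolation. The following is a hedged numpy sketch, assuming 512-dimensional backbone features (the output width of a ResNet18 global pool) and a hypothetical 64-dimensional descriptor size; the paper's actual parameterization and training setup are not specified here:

```python
import numpy as np

class RandomGaussianProjector:
    """Fixed random Gaussian layer that compresses high-dimensional backbone
    features into a compact, unit-norm descriptor for place matching."""

    def __init__(self, in_dim, out_dim=64, seed=0):
        rng = np.random.default_rng(seed)
        # Entries ~ N(0, 1/out_dim): random projections of this form
        # approximately preserve pairwise distances (Johnson-Lindenstrauss).
        self.W = rng.standard_normal((in_dim, out_dim)) / np.sqrt(out_dim)

    def __call__(self, feats):
        z = np.atleast_2d(feats) @ self.W
        # L2-normalize so descriptors compare by dot product (cosine similarity).
        return np.squeeze(z / np.linalg.norm(z, axis=1, keepdims=True))
```

A map database of descriptors can then be matched against a query descriptor with a single matrix-vector product, since the dot product of unit vectors is their cosine similarity.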