
    Picture processing for enhancement and recognition

    Recent years have been characterized by an incredible growth in computing power, storage capability, communication speed, and bandwidth availability, for both desktop platforms and mobile devices. The combination of these factors has led to a new era of multimedia applications: browsing of huge image archives, consultation of online video databases, location-based services, and many others. Multimedia is almost everywhere and requires high-quality data, easy retrieval of multimedia content, and increased network access capacity and bandwidth per user. Meeting all of these requirements demands effort in several research areas, ranging from signal processing and image and video analysis to communication protocols. The research activity developed during these three years concerns the field of multimedia signal processing, with particular attention to image and video analysis and processing. Two main topics have been addressed: the first relates to image and video reconstruction/restoration (using super-resolution techniques) in web-based applications for the consumption of multimedia content; the second relates to image analysis for location-based systems in indoor scenarios.
    The first topic concerns image and video processing; in particular, the focus has been on developing algorithms for super-resolution reconstruction of images and video sequences, in order to ease the consumption of multimedia data over the web. On one hand, recent years have seen an incredible proliferation and surprising success of user-generated multimedia content, as well as of distributed and collaborative multimedia databases on the web. This has raised serious management and maintenance issues: bandwidth limitations and service costs are important factors in mobile consumption of multimedia content. On the other hand, the current multimedia consumer market is characterized by the advent of cheap but rather high-quality high-definition displays. This trend is only partially supported by the deployment of high-resolution multimedia services, so the resulting disparity between content and display formats has to be addressed: older productions need to be either re-mastered or post-processed in order to be broadcast in HD. In this scenario, super-resolution reconstruction represents a major solution. Image and video super-resolution techniques restore the original spatial resolution from low-resolution compressed data; in this way, content providers, service providers, and end users alike are relieved of the burden of providing and supporting large multimedia data transfers.
    The second topic addressed during my PhD research activity is the implementation of an image-based positioning system for an indoor navigator. As modern mobile devices become faster, classical signal processing can be applied to new applications such as location-based services. The exponential growth of wearable devices such as smartphones and PDAs, equipped with embedded motion (accelerometer) and rotation (gyroscope) sensors, Internet connectivity, and high-resolution cameras, makes them ideal for INS (Inertial Navigation System) applications that support the localization and navigation of objects and/or users in indoor environments, where common localization systems such as GPS (Global Positioning System) fail; hence the need for alternative positioning techniques.
    A series of intensive tests has been carried out, showing how modern signal processing techniques can be successfully applied in different scenarios, from image and video enhancement to image recognition for localization purposes, providing low-cost solutions and ensuring real-time performance.
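
    To make the super-resolution idea above concrete, the following is a minimal sketch of multi-frame super-resolution by iterative back-projection, one classical technique in this family. It assumes pre-registered low-resolution frames whose dimensions divide evenly by the scale factor; all names and parameters are illustrative, not the thesis's actual algorithm.

```python
# Minimal iterative back-projection sketch (illustrative only).
import numpy as np
from scipy.ndimage import zoom

def back_project_sr(low_res_frames, scale=2, n_iters=20, step=0.1):
    """Estimate a high-resolution image from registered low-res frames."""
    # Initial guess: bicubic upsampling of the first frame.
    hr = zoom(low_res_frames[0].astype(float), scale, order=3)
    for _ in range(n_iters):
        for lr in low_res_frames:
            # Simulate the imaging model by downsampling the estimate...
            simulated = zoom(hr, 1.0 / scale, order=3)
            # ...and back-project the residual error into the HR estimate.
            error = lr.astype(float) - simulated
            hr += step * zoom(error, scale, order=3)
    return np.clip(hr, 0, 255)

# Usage: hr = back_project_sr([frame0, frame1, frame2], scale=2)
```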

    Multiple Media Correlation: Theory and Applications

    This thesis introduces multiple media correlation, a new technology for the automatic alignment of multiple media objects such as text, audio, and video. This research began with the question: what can be learned when multiple multimedia components are analyzed simultaneously? Most ongoing research in computational multimedia has focused on queries, indexing, and retrieval within a single media type. Video is compressed and searched independently of audio; text is indexed without regard to temporal relationships it may have to other media data. Multiple media correlation provides a framework for locating and exploiting correlations between multiple, potentially heterogeneous, media streams. The goal is computed synchronization: the determination of temporal and spatial alignments that optimize a correlation function and indicate commonality and synchronization between media objects. The model also provides a basis for comparison of media in unrelated domains. There are many real-world applications for this technology, including speaker localization, musical score alignment, and degraded media realignment. Two applications, text-to-speech alignment and parallel text alignment, are described in detail with experimental validation. Text-to-speech alignment computes the alignment between a textual transcript and speech-based audio. The presented solutions are effective for a wide variety of content and are useful not only for retrieval of content but also in support of automatic captioning of movies and video. Parallel text alignment provides a tool for the comparison of alternative translations of the same document that is particularly useful to the classics scholar interested in comparing translation techniques or styles. The results presented in this thesis include (a) new media models more useful in analysis applications, (b) a theoretical model for multiple media correlation, (c) two practical application solutions with widespread applicability, and (d) Xtrieve, a multimedia database retrieval system that demonstrates this new technology and its application to information retrieval. This thesis demonstrates that computed alignment of media objects is practical and can provide immediate solutions to many information retrieval and content presentation problems. It also introduces a new area for research in media data analysis.
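
    As one concrete illustration of computed synchronization, the sketch below aligns two feature sequences with dynamic time warping, a standard technique for monotonic temporal alignment; the feature vectors and function names here are assumptions for illustration, not the media models or correlation functions developed in the thesis.

```python
# Dynamic time warping sketch for aligning two media feature streams.
import numpy as np

def dtw_align(features_a, features_b):
    """Return a monotonic alignment path between two feature sequences."""
    a = np.asarray(features_a, dtype=float)
    b = np.asarray(features_b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])   # local dissimilarity
            cost[i, j] = d + min(cost[i - 1, j],       # advance stream A only
                                 cost[i, j - 1],       # advance stream B only
                                 cost[i - 1, j - 1])   # advance both
    # Backtrack from the end to recover the optimal warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        i, j = min(((i - 1, j), (i, j - 1), (i - 1, j - 1)),
                   key=lambda p: cost[p])
    return path[::-1]

# Usage: pair per-frame audio features with per-word transcript features.
# alignment = dtw_align(audio_feats, text_feats)
```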

    Exploring Sparse, Unstructured Video Collections of Places

    The abundance of mobile devices and digital cameras with video capture makes it easy to obtain large collections of video clips that contain the same location, environment, or event. However, such an unstructured collection is difficult to comprehend and explore. We propose a system that analyses collections of unstructured but related video data to create a Videoscape: a data structure that enables interactive exploration of video collections by visually navigating, spatially and/or temporally, between different clips. We automatically identify transition opportunities, or portals. From these portals, we construct the Videoscape, a graph whose edges are video clips and whose nodes are portals between clips. Once structured, the videos can be interactively explored by walking the graph or via a geographic map. Given this system, we gauge preference for different video transition styles in a user study, and generate heuristics that automatically choose an appropriate transition style. We evaluate our system in three further user studies, which allow us to conclude that the Videoscape provides significant benefits over related methods. Our system leads to previously unseen ways of interactive spatio-temporal exploration of casually captured videos, and we demonstrate this on several video collections.
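
    The graph described above, with portals as nodes and clip segments as edges, might be represented as in this sketch; all class and field names are hypothetical, not the authors' implementation.

```python
# Illustrative Videoscape graph: nodes are portals, edges are clip segments.
from dataclasses import dataclass

@dataclass(frozen=True)
class Portal:
    portal_id: int      # a visually matched moment shared by several clips
    geo: tuple          # optional (lat, lon) for map-based exploration

@dataclass
class ClipEdge:
    clip_path: str      # source video file
    start_s: float      # segment start time within the clip
    end_s: float        # segment end time within the clip

class Videoscape:
    def __init__(self):
        self.adjacency = {}  # Portal -> list of (ClipEdge, Portal)

    def add_segment(self, a: Portal, edge: ClipEdge, b: Portal):
        self.adjacency.setdefault(a, []).append((edge, b))
        self.adjacency.setdefault(b, []).append((edge, a))

    def walk_options(self, here: Portal):
        """Clips the viewer can transition into from a given portal."""
        return self.adjacency.get(here, [])

# Usage: two clips meet at the same doorway (a portal).
door, plaza = Portal(1, (51.50, -0.12)), Portal(2, (51.51, -0.11))
vs = Videoscape()
vs.add_segment(door, ClipEdge("clip_a.mp4", 3.0, 12.5), plaza)
```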

    Computer Generation of Integral Images using Interpolative Shading Techniques

    Research to produce artificial 3D images that duplicate human stereo vision has been ongoing for hundreds of years. What has taken millions of years to evolve in humans is proving elusive even for present-day technological advancements, and the difficulties are compounded when real-time generation is contemplated. The problem is one of depth. When perceiving the world around us, the sense of depth has been shown to result from many different factors, which can be described as monocular and binocular. Monocular depth cues include overlapping or occlusion, shading and shadows, texture, etc. Another monocular cue (and binocular to some extent) is accommodation, where the focal length of the crystalline lens is adjusted to view an image. The important binocular cues are convergence and parallax. Convergence allows the observer to judge distance by the difference in angle between the viewing axes of the left and right eyes when both are focussing on a point. Parallax relates to the fact that each eye sees a slightly shifted view of the image. If a system can be produced that requires the observer to use all of these cues, as when viewing the real world, then the transition to and from viewing a 3D display will be seamless. However, for many of the 3D imaging techniques towards which current work is primarily directed, this is not the case, and this raises a serious issue of viewer comfort: physiological disturbances that can cause nausea in some observers will not be acceptable. Researchers worldwide, in universities and industry, are pursuing their own approaches to the development of 3D systems.
    The ideal 3D system would require, as a minimum, accurate depth reproduction, multi-viewer capability, and all-round seamless viewing. Not having to wear stereoscopic or polarising glasses would be ideal, and lack of viewer fatigue is essential. Finally, whatever the use of the system, be it CAD, medical, scientific visualisation, remote inspection, etc. on the one hand, or consumer markets such as 3D video games and 3DTV on the other, the system has to be relatively inexpensive. Integral photography is a 'real camera' system that attempts to comply with this ideal; it was invented in 1908 but, for technological reasons, was not then capable of serving as a useful autostereoscopic system. More recently, with advances in technology, it has become a more attractive proposition for those interested in developing a suitable system for 3DTV.
    The fast computer generation of integral images is the subject of this thesis, the adjective 'fast' distinguishing it from the much slower technique of ray tracing integral images. These two techniques mirror the standards in monoscopic computer graphics, where ray tracing generates photo-realistic images and the fast forward geometric approach using interpolative shading techniques is the method used for real-time generation. Before this work began it was not known whether it was possible to create volumetric integral images using a fast approach similar to that employed by standard computer graphics, but it soon became apparent that it would be successful and hence a valuable contribution to this area. Presented herein is a full description of the development of two derived methods for producing rendered integral image animations using interpolative shading. The main body of the work is the development of code to put these methods into practice, along with many observations and discoveries that the author made during this task.
    This work was supported by the Defence Evaluation and Research Agency (DERA), by a contract (LAIRD) under the European Link/EPSRC photonics initiative, and by DTI/EPSRC sponsorship within the PROMETHEUS project.
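
    The fast generation approach can be pictured as rendering one small sub-image per lenslet from a laterally shifted viewpoint and tiling the results into the final integral image. In the sketch below, render_view() is a hypothetical stub standing in for an interpolative-shading rasterizer; none of this is the thesis code.

```python
# Illustrative tiling loop for computer-generated integral images.
import numpy as np

def render_view(offset_x, offset_y, size):
    """Hypothetical stub: rasterize the scene from a shifted camera."""
    # Placeholder gradient so the sketch runs end to end.
    y, x = np.mgrid[0:size, 0:size]
    return ((x + offset_x * 10 + y + offset_y * 10) % 256).astype(np.uint8)

def integral_image(n_lenses=32, sub_size=16, pitch_mm=1.0):
    out = np.zeros((n_lenses * sub_size, n_lenses * sub_size), np.uint8)
    for row in range(n_lenses):
        for col in range(n_lenses):
            # Each lenslet sees the scene from its own array position.
            sub = render_view(col * pitch_mm, row * pitch_mm, sub_size)
            out[row * sub_size:(row + 1) * sub_size,
                col * sub_size:(col + 1) * sub_size] = sub
    return out
```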

    Towards Making JavaScript Applications Secure and Private

    JavaScript is a popular programming language widely used on both the browser and the server sides. Researchers have extensively studied different aspects of the security and privacy of JavaScript, for instance, vulnerability detection in server-side Node.js applications and browser-side fingerprinting techniques. Despite these research efforts, multiple challenges remain unsolved: on the server side, existing vulnerability detection approaches do not generalize to a wide range of popular vulnerabilities and their detection rates are not satisfactory; on the client side, service providers can only fingerprint users within a single browser, not across different browsers. In this dissertation, we propose a flow-, branch- and context-sensitive static analysis approach that generates a novel graph structure, named the Object Dependence Graph (ODG), to address the server-side vulnerability detection challenges, and a cross-browser fingerprinting method that utilizes multiple novel OS- and hardware-level features to solve the client-side fingerprinting challenge. On the server side, the ODG represents JavaScript objects as nodes and their relations with the Abstract Syntax Tree (AST) as edges, and allows users to detect multiple types of vulnerabilities during and after ODG generation via graph queries. Our evaluation shows that, for server-side vulnerability detection, our approach outperforms all state-of-the-art JavaScript vulnerability detection tools in terms of false-positive and false-negative rates. We apply our tool to detect six types of vulnerabilities in an NPM package dataset; it correctly reports 241 zero-day vulnerable packages, 81 of which have been assigned CVE identifiers. On the client side, our approach utilizes multiple novel OS- and hardware-level features, such as those from graphics cards and CPUs, to achieve better accuracy and stability. The evaluation shows that our approach can identify 99.24% of browsers and 84.64% of devices, as opposed to 90.83% and 68.98%, respectively, for the state-of-the-art approaches.
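
    As a toy illustration of detecting a vulnerability by graph query, the sketch below checks an ODG-like structure for a dependency path from a user-controlled source to a dangerous sink; node labels and method names are hypothetical, and the dissertation's actual ODG (flow-, branch- and context-sensitive, with AST-derived edges) is far richer.

```python
# Toy object-dependence graph with a taint-style reachability query.
from collections import defaultdict, deque

class ObjectDependenceGraph:
    def __init__(self):
        self.edges = defaultdict(set)   # object node -> dependent nodes

    def add_dependency(self, src, dst):
        """Record that object `dst` is derived from object `src`."""
        self.edges[src].add(dst)

    def flows(self, source, sink):
        """Graph query: does any dependency path link source to sink?"""
        seen, queue = {source}, deque([source])
        while queue:
            node = queue.popleft()
            if node == sink:
                return True
            for nxt in self.edges[node] - seen:
                seen.add(nxt)
                queue.append(nxt)
        return False

# Example: user input flows through a string into an eval argument.
odg = ObjectDependenceGraph()
odg.add_dependency("req.query.cmd", "cmdString")
odg.add_dependency("cmdString", "eval#arg0")
assert odg.flows("req.query.cmd", "eval#arg0")  # potential code injection
```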

    Diverse Contributions to Implicit Human-Computer Interaction

    When people interact with computers, a great deal of information is provided unintentionally. By studying these implicit interactions, it is possible to understand which characteristics of the user interface are beneficial (or not), thereby deriving implications for the design of future interactive systems. The main advantage of leveraging implicit user data in computer applications is that any interaction with the system can contribute to improving its usefulness. Furthermore, such data remove the cost of having to interrupt users so that they explicitly submit information about a topic that, in principle, need not be related to their intention in using the system. On the other hand, implicit interactions do not always provide clear and concrete data, so special attention must be paid to how this source of information is managed. The purpose of this research is twofold: 1) to apply a new vision to both the design and the development of applications that can react appropriately to users' implicit interactions, and 2) to provide a series of methodologies for the evaluation of such interactive systems. Five scenarios illustrate the feasibility and suitability of the thesis framework. Empirical results with real users demonstrate that leveraging implicit interaction is both an adequate and a convenient means of improving interactive systems in multiple ways.
    Leiva Torres, LA. (2012). Diverse Contributions to Implicit Human-Computer Interaction [unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/17803