Iconic Indexing for Video Search
Submitted for the degree of Doctor of Philosophy, Queen Mary, University of London
A framework for interrogating social media images to reveal an emergent archive of war
The visual image has long been central to how war is seen, contested and legitimised, remembered and forgotten. Archives are pivotal to these ends, as is their ownership and access, from state and other official repositories through to the countless photographs scattered and hidden from a collective understanding of what war looks like, in individual collections and dusty attics. With the advent and rapid development of social media, however, the amateur and the professional, the illicit and the sanctioned, the personal and the official, and the past and the present all seem to inhabit the same connected and chaotic space. However, to even begin to render intelligible the complexity, scale and volume of what war looks like in social media archives is a considerable task, given the limitations of any traditional human-based method of collection and analysis. We thus propose the production of a series of "snapshots", using computer-aided extraction and identification techniques to offer an experimental way into conceiving a new imaginary of war. We were particularly interested in testing whether twentieth-century wars, obviously initially captured via pre-digital means, had become more "settled" over time in terms of their remediated presence today, through their visual representations and connections on social media, compared with wars fought in digital media ecologies (i.e. those fought and initially represented amidst the volume and pervasiveness of social media images). To this end, we developed a framework for automatically extracting and analysing war images that appear in social media, using both the features of the images themselves and the text and metadata associated with each image. The framework utilises a workflow comprising four core stages: (1) information retrieval, (2) data pre-processing, (3) feature extraction, and (4) machine learning. Our corpus was drawn from the social media platforms Facebook and Flickr.
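The four-stage workflow can be sketched as a simple pipeline. Everything below is illustrative: the function names, the toy records and the keyword-based stand-in for a learned classifier are our own assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the four-stage workflow described above.
# All names and the toy "classifier" are illustrative assumptions.

def retrieve(platforms):
    # Stage 1: information retrieval -- in practice, query each
    # platform's API for war-related images (stubbed here).
    return [{"platform": p, "text": f"archive photo from {p}", "pixels": [0.1, 0.9]}
            for p in platforms]

def preprocess(records):
    # Stage 2: data pre-processing -- drop records with no text.
    return [r for r in records if r["text"]]

def extract_features(record):
    # Stage 3: feature extraction -- combine visual and textual features.
    return {"visual": record["pixels"], "tokens": record["text"].lower().split()}

def classify(features):
    # Stage 4: machine learning -- a keyword stand-in for a trained model.
    return "pre-digital" if "archive" in features["tokens"] else "digital-native"

def run_pipeline(platforms):
    records = preprocess(retrieve(platforms))
    return [classify(extract_features(r)) for r in records]

print(run_pipeline(["Facebook", "Flickr"]))
```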
Indexing, browsing and searching of digital video
Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a "piece" of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 of this chapter. In modern society, video is ver…
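The units listed above form a natural containment hierarchy. A minimal sketch, assuming a frames → shots → scenes → programme nesting (the class names follow the text; the nesting itself is our assumption):

```python
# Illustrative containment hierarchy for the video units named above.

from dataclasses import dataclass, field

@dataclass
class Shot:
    frames: list  # contiguous frames from a single camera take

@dataclass
class Scene:
    shots: list = field(default_factory=list)

@dataclass
class Programme:
    scenes: list = field(default_factory=list)

    def frame_count(self):
        # Total length of the programme, measured in frames.
        return sum(len(shot.frames) for scene in self.scenes for shot in scene.shots)

prog = Programme(scenes=[Scene(shots=[Shot(frames=[1, 2, 3]), Shot(frames=[4, 5])])])
print(prog.frame_count())  # 5
```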
Trends integration process as input data for Kansei Engineering Systems
This paper studies new ways of integrating users into "emotional design" or "Kansei engineering systems". The main goal of this study was the integration of the trend factor in design, with an early emotional evaluation of "Trend cards" produced by the designers. After a definition of the study context, we explain the experimental protocol that was followed. It was based on a questionnaire method involving 56 French subjects and applied in the field of shoe design. The data analysis was mainly carried out by means of a Principal Component Analysis. The expected results were centred on the emotional evaluation of the Trend cards in order to establish further design rules for a Kansei engineering system. In conclusion, we can recognise important semantic effects and influences which can be used as information for the implementation of the design-elements database.
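Principal Component Analysis, as applied to the questionnaire ratings, finds the direction of maximum variance in mean-centred data. A toy stdlib-only illustration (the ratings are invented; a real study would use a statistics package):

```python
# Toy PCA: find the first principal component of mean-centred data
# by power iteration on the covariance structure. The ratings are
# invented for illustration.

def first_principal_component(rows, iters=200):
    n, d = len(rows), len(rows[0])
    # Mean-centre each column.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    X = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Power iteration: repeatedly apply X^T X and renormalise.
    v = [1.0] * d
    for _ in range(iters):
        Xv = [sum(x[j] * v[j] for j in range(d)) for x in X]
        w = [sum(X[i][j] * Xv[i] for i in range(n)) for j in range(d)]
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]
    return v

# Two strongly correlated semantic-scale ratings: the leading
# component loads equally on both (about 0.707 each).
ratings = [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]]
pc1 = first_principal_component(ratings)
print(pc1)
```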
Digital Image Access & Retrieval
The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.
Application of Machine Learning within Visual Content Production
We are living in an era where digital content is being produced at a dazzling pace. The heterogeneity of contents and contexts is so varied that numerous applications have been created to respond to people's and market demands. The visual content production pipeline is the generalisation of the process that allows a content editor to create and evaluate their product, such as a video, an image, a 3D model, etc. Such data is then displayed on one or more devices such as TVs, PC monitors, virtual reality head-mounted displays, tablets, mobiles, or even smartwatches. Content creation can be as simple as clicking a button to film a video and then share it on a social network, or as complex as managing a dense user interface full of parameters, using keyboard and mouse to generate a realistic 3D model for a VR game. In this second example, such sophistication results in a steep learning curve for beginner-level users. In contrast, expert users regularly need to refine their skills via expensive lessons, time-consuming tutorials, or experience. Thus, user interaction plays an essential role in the diffusion of content creation software, particularly when it is targeted at untrained people. In particular, with the fast spread of virtual reality devices into the consumer market, new opportunities for designing reliable and intuitive interfaces have been created. Such new interactions need to take a step beyond the point-and-click interaction typical of the 2D desktop environment. The interactions need to be smart, intuitive and reliable, to interpret 3D gestures, and therefore more accurate algorithms are needed to recognise patterns. In recent years, machine learning, and in particular deep learning, has achieved outstanding results in many branches of computer science, such as computer graphics and human-computer interaction, outperforming algorithms that were considered state of the art; however, there have been only fleeting efforts to translate this into virtual reality.
In this thesis, we seek to apply and take advantage of deep learning models in two different areas of the content production pipeline, embracing the following subjects of interest: advanced methods for user interaction and visual quality assessment. First, we focus on 3D sketching to retrieve models from an extensive database of complex geometries and textures, while the user is immersed in a virtual environment. We explore both 2D and 3D strokes as tools for model retrieval in VR. We implement a novel system for improving accuracy in searching for a 3D model, and contribute an efficient method to describe models through a 3D sketch via iterative descriptor generation, focusing both on accuracy and user experience. To evaluate it, we design a user study to compare different interactions for sketch generation. Second, we explore the combination of sketch input and vocal description to correct and fine-tune the search for 3D models in a database containing fine-grained variation. We analyse sketch and speech queries, identifying a way to incorporate both of them into our system's interaction loop. Third, in the context of the visual content production pipeline, we present a detailed study of visual metrics. We propose a novel method for detecting rendering-based artefacts in images; it exploits deep learning algorithms analogous to those used when extracting features from sketches.
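Descriptor-based model retrieval of the kind described above can be reduced to nearest-neighbour search in an embedding space. A schematic sketch, with toy hand-written descriptors standing in for the output of a deep network:

```python
# Schematic sketch-to-model retrieval: embed the query sketch and the
# database models into a shared vector space and return the model with
# the highest cosine similarity. The vectors below are toy values.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb)

def retrieve(query_descriptor, database):
    # database maps model name -> descriptor vector.
    return max(database, key=lambda name: cosine(query_descriptor, database[name]))

models = {
    "chair": [0.9, 0.1, 0.0],
    "table": [0.1, 0.9, 0.1],
    "lamp":  [0.0, 0.2, 0.9],
}
print(retrieve([0.8, 0.2, 0.1], models))  # chair
```

In an iterative descriptor-generation loop, the query vector would be re-embedded and refined after each stroke, narrowing the search as the sketch grows.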
Contributions to Content-Based Image Retrieval Using Pictorial Queries
Mass access to digital cameras, personal computers and the Internet has led to the creation of large volumes of data in digital format. In this context, tools designed to organise information and to facilitate its search are becoming ever more relevant. Images are a particular case of data that require specific description and indexing techniques. The area of computer vision devoted to the study of these techniques is known as Content-Based Image Retrieval (CBIR). CBIR systems do not use text-based descriptions; instead, they rely on features extracted from the images themselves. In contrast to the more than 6,000 languages spoken in the world, descriptions based on visual features represent a universal means of expression. Intensive research in the field of CBIR systems has been applied in very diverse areas of knowledge: CBIR applications have been developed in medicine, intellectual property protection, journalism, graphic design, Internet search, cultural heritage preservation, and so on. One of the key points of a CBIR application lies in the design of the user-facing functions. The user is responsible for formulating the queries from which the image search is carried out. We have focused our attention on those systems in which the query is formulated from a pictorial representation. We propose a taxonomy of query systems composed of four different paradigms: Query-by-Selection, Query-by-Iconic-Composition, Query-by-Sketch and Query-by-Illustration. Each paradigm incorporates a different level of the user's expressive power.
From the simple selection of an image to the creation of a colour illustration, it is the user who takes control of the system's input data. Throughout the chapters of this thesis, we have analysed the influence that each query paradigm exerts on the internal processes of a CBIR system. In doing so, we have also proposed a set of contributions, which we illustrate from a practical point of view by means of a final application.
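The four query paradigms and their increasing expressive levels could be encoded as, for example (the ordering follows the abstract; the enum itself is our own illustration):

```python
# The four pictorial-query paradigms from the taxonomy above, ordered
# by the expressive power they give the user.

from enum import IntEnum

class QueryParadigm(IntEnum):
    QUERY_BY_SELECTION = 1           # pick an existing image
    QUERY_BY_ICONIC_COMPOSITION = 2  # arrange icons on a canvas
    QUERY_BY_SKETCH = 3              # draw the outline of the target
    QUERY_BY_ILLUSTRATION = 4        # create a full colour illustration

most_expressive = max(QueryParadigm)
print(most_expressive.name)  # QUERY_BY_ILLUSTRATION
```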
Strategies for image visualisation and browsing
PhD
The exploration of large information spaces has remained a challenging task, even though the proliferation of database management systems and state-of-the-art retrieval algorithms is becoming pervasive. Significant research attention in the multimedia domain is focused on finding automatic algorithms for organising digital image collections into meaningful structures and providing high-semantic image indices. On the other hand, the utilisation of graphical and interactive methods from the information visualisation domain provides a promising direction for creating efficient user-oriented systems for image management. Methods such as exploratory browsing and querying, as well as intuitive visual overviews of an image collection, can assist users in finding patterns and developing an understanding of structures and content in complex image data-sets.
The focus of the thesis is combining the features of automatic data-processing algorithms with information visualisation. The first part of this thesis focuses on a layout method for displaying a collection of images indexed by low-level visual descriptors. The proposed solution generates a graphical overview of the data-set as a combination of a similarity-based visualisation and a random layout approach.
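Such a combination of a similarity-based layout with a random layout could be sketched as a weighted blend of the two position sources. The projection below (simply taking the first two descriptor values) is a deliberate simplification of whatever dimensionality reduction the thesis actually uses:

```python
# Illustrative blend of similarity-based placement with a random layout:
# each image's 2D position is a weighted mix of a similarity projection
# and a random point. alpha=1 gives a pure similarity layout, alpha=0 a
# pure random layout. The "projection" is a stand-in simplification.

import random

def blended_layout(descriptors, alpha=0.7, seed=0):
    rng = random.Random(seed)  # fixed seed -> reproducible layout
    layout = []
    for d in descriptors:
        sx, sy = d[0], d[1]              # stand-in similarity projection
        rx, ry = rng.random(), rng.random()
        layout.append((alpha * sx + (1 - alpha) * rx,
                       alpha * sy + (1 - alpha) * ry))
    return layout

positions = blended_layout([[0.2, 0.8, 0.1], [0.9, 0.3, 0.5]])
print(positions)
```

The random component spreads out images whose descriptors are nearly identical, which would otherwise overlap in a pure similarity layout.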
The second part of the thesis deals with the problem of visualisation and exploration for hierarchical organisations of images. In the absence of semantic information, the images themselves are considered the only source of high-level information. A content preview and a display of the hierarchical structure are combined in order to support image retrieval. In addition, novel exploration and navigation methods are proposed to enable the user to find their way through the database structure and retrieve the content.
On the other hand, semantic information is available in cases where automatic or semi-automatic image classifiers are employed. The automatic annotation of image items provides what is referred to as higher-level information. This type of information is the cornerstone of the multi-concept visualisation framework developed as the third part of this thesis. This solution enables the dynamic generation of user queries by combining semantic concepts, supported by a content overview and information filtering.
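Multi-concept querying of automatically annotated images can be modelled as set containment: an image matches when it carries every concept in the query. A minimal sketch with invented labels:

```python
# Minimal model of multi-concept querying: images carry automatically
# assigned concept labels, and a query is a set of concepts that every
# retrieved image must contain. Labels are invented for illustration.

def multi_concept_query(annotations, required):
    # annotations maps image id -> set of concept labels.
    return sorted(img for img, concepts in annotations.items()
                  if required <= concepts)  # subset test

db = {
    "img1": {"beach", "people"},
    "img2": {"beach", "sunset"},
    "img3": {"beach", "people", "sunset"},
}
print(multi_concept_query(db, {"beach", "people"}))  # ['img1', 'img3']
```

Information filtering then amounts to progressively adding concepts to `required`, shrinking the overview to the matching subset.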
Comparative analysis and user tests, performed to evaluate the proposed solutions, focus on the ways information visualisation affects image-content exploration and retrieval; how efficient and comfortable users are when using different interaction methods; and the ways users seek information through different types of database organisation.
- β¦