490 research outputs found

    An ontology-based framework for the automated analysis and interpretation of comic books' images

    Since the beginning of the twenty-first century, the cultural industry has been through a massive and historic mutation induced by the rise of digital technologies. The comic books industry keeps looking for the right solution and has not yet produced anything as convincing as the music or movie industries have. A lot of energy has been spent so far on transferring printed material to digital supports. The specificities of those supports are not always exploited to the best of their capabilities, while they could potentially be used to create new reading conventions. In spite of the needs induced by the large amount of data created since the beginning of comics history, content indexing has been left behind. It is indeed quite a challenge to index such a composition of textual and visual information. While a growing number of researchers are working on comic books' image analysis from a low-level point of view, only a few are tackling the issue of representing the content at a high semantic level. We propose in this article a framework to handle the content of a comic book, to support the automatic extraction of its visual components and to formalize the semantics of the domain's codes. We tested our framework on two applications: 1) the unsupervised content discovery of comic books' images, and 2) its capability to handle complex layouts and to produce a respectful browsing experience for the digital comics reader.

    Conceptual graph-based knowledge representation for supporting reasoning in African traditional medicine

    Although African patients use both conventional (modern) and traditional healthcare simultaneously, it has been proven that 80% of people rely on African traditional medicine (ATM). ATM includes medical activities stemming from practices, customs and traditions which were integral to the distinctive African cultures. It is based mainly on the oral transfer of knowledge, with the risk of losing critical knowledge. Moreover, practices differ according to the regions and the availability of medicinal plants. Therefore, it is necessary to compile tacit, disseminated and complex knowledge from various Tradi-Practitioners (TP) in order to determine interesting patterns for treating a given disease. Knowledge engineering methods for traditional medicine are useful to model complex information needs suitably, formalize the knowledge of domain experts and highlight the effective practices for their integration into conventional medicine. The work described in this paper presents an approach which addresses two issues. First, it aims at proposing a formal representation model of ATM knowledge and practices to facilitate their sharing and reuse. Then, it aims at providing a visual reasoning mechanism for selecting the best available procedures and medicinal plants to treat diseases. The approach is based on the use of the Delphi method for capturing knowledge from various experts, which necessitates reaching a consensus. Conceptual graph formalism is used to model ATM knowledge with visual reasoning capabilities and processes. The nested conceptual graphs are used to visually express the semantic meaning of Computation Tree Logic (CTL) constructs that are useful for the formal specification of temporal properties of ATM domain knowledge. Our approach presents the advantage of mitigating knowledge loss with conceptual development assistance to improve the quality of ATM care (medical diagnosis and therapeutics), but also patient safety (drug monitoring).

    What Can Human Sketches Do for Object Detection?

    Sketches are highly expressive, inherently capturing subjective and fine-grained visual cues. The exploration of such innate properties of human sketches has, however, been limited to that of image retrieval. In this paper, for the first time, we cultivate the expressiveness of sketches but for the fundamental vision task of object detection. The end result is a sketch-enabled object detection framework that detects based on what you sketch -- that "zebra" (e.g., one that is eating the grass) in a herd of zebras (instance-aware detection), and only the part (e.g., "head" of a "zebra") that you desire (part-aware detection). We further dictate that our model works without (i) knowing which category to expect at testing (zero-shot) and (ii) requiring additional bounding boxes (as per fully supervised) and class labels (as per weakly supervised). Instead of devising a model from the ground up, we show an intuitive synergy between foundation models (e.g., CLIP) and existing sketch models built for sketch-based image retrieval (SBIR), which can already elegantly solve the task -- CLIP to provide model generalisation, and SBIR to bridge the (sketch→photo) gap. In particular, we first perform independent prompting on both sketch and photo branches of an SBIR model to build highly generalisable sketch and photo encoders on the back of the generalisation ability of CLIP. We then devise a training paradigm to adapt the learned encoders for object detection, such that the region embeddings of detected boxes are aligned with the sketch and photo embeddings from SBIR. Evaluated on standard object detection datasets like PASCAL-VOC and MS-COCO, our framework outperforms both supervised (SOD) and weakly-supervised object detectors (WSOD) on zero-shot setups. Project Page: https://pinakinathc.github.io/sketch-detect
    Comment: Accepted as Top 12 Best Papers. Will be presented in special single-track plenary sessions to all attendees in Computer Vision and Pattern Recognition (CVPR), 2023. Project Page: www.pinakinathc.me/sketch-detec

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-tanks and workshops, this document reports on the state of the art related to multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research.

    Deficient Human Aspects in Current Multimedia Indexing and Retrieval (MIR) of Large Social Networks Databases

    An inside look at the contents of social network databases shows a significant divergence from traditional database contents and functionality. There is also ample evidence that social networks are changing the way multimedia content is shared on the web, by allowing users to upload their photos, videos, and audio content, produced by any kind of digital recorder, such as mobile phones, smartphones, webcams, and digital cameras. This article presents a detailed overview of multimedia indexing and searching algorithms, following the data growth curve. The paper concludes with the social aspects of, and new, interesting views on, multimedia retrieval in large social media databases.
    Keywords: multimedia, indexing, social media, algorithms, social networks, databases, retrieval

    Enhancing Accessibility to Heterogeneous Sri Lankan Cultural Heritage Information across Museums through Metadata Aggregation

    Thesis (Master of Science in Library and Information Studies)--University of Tsukuba, no. 36035, 2016.8.3

    Iconic Indexing for Video Search

    Submitted for the degree of Doctor of Philosophy, Queen Mary, University of London

    Pocket Monsters And Pirate Treasure: Fantasy And Social Platforms In The 21st Century

    POCKET MONSTERS AND PIRATE TREASURE: FANTASY AND SOCIAL PLATFORMS IN THE 21ST CENTURY is an anthropology project examining media, fantasy, ideology, and social groups in order to build a better foundation for understanding the ways in which economic and social changes influence social networking, popular media, and values, using the anime-manga subculture as an example. The thesis draws from three major theorists, Thomas LaMarre, Anne Allison, and Ian Condry, as well as major anthropological theorists such as Pierre Bourdieu. As an ethnography, the project was split into two sections: one consisting of interviews with eight anime-manga subculture participants drawn primarily from the University of Mississippi Anime Club, and the second constructed from participant observation in a variety of activities important for constructing the community, such as conventions and group watching of animated series. I conclude that the synthesis of different strains of contemporary ideas, along with my own contribution of theory in the form of a redefinition of Levi-Strauss’s concept of bricolage, yields a better way of understanding both resistance in the consumption of popular media and the formation of group cultures in social networks. Larger conclusions in this regard are posed as ongoing studies and challenges to the fields of media studies and anthropology, and as targets of further research.

    Meaning above (and in) the head: Combinatorial visual morphology from comics and emoji

    Compositionality is a primary feature of language, but graphics can also create combinatorial meaning, like with items above faces (e.g., lightbulbs to mean inspiration). We posit that these “upfixes” (i.e., upwards affixes) involve a productive schema enabling both stored and novel face–upfix dyads. In two experiments, participants viewed either conventional (e.g., lightbulb) or unconventional (e.g., clover-leaves) upfixes with faces which either matched (e.g., lightbulb/smile) or mismatched (e.g., lightbulb/frown). In Experiment 1, matching dyads sponsored higher comprehensibility ratings and faster response times, modulated by conventionality. In Experiment 2, event-related brain potentials (ERPs) revealed conventional upfixes, regardless of matching, evoked larger N250s, indicating perceptual expertise, but mismatching and unconventional dyads elicited larger semantic processing costs (N400) than conventional-matching dyads. Yet mismatches evoked a late negativity, suggesting congruent novel dyads remained construable compared with violations. These results support that combinatorial graphics involve a constrained productive schema, similar to the lexicon of language.