4,511 research outputs found

    Auto-Encoding Scene Graphs for Image Captioning

    We propose the Scene Graph Auto-Encoder (SGAE), which incorporates the language inductive bias into the encoder-decoder image captioning framework to produce more human-like captions. Intuitively, we humans use this inductive bias to compose collocations and make contextual inferences in discourse. For example, when we see the relation 'person on bike', it is natural to replace 'on' with 'ride' and infer 'person riding bike on a road' even if the 'road' is not evident. Exploiting such bias as a language prior is therefore expected to make conventional encoder-decoder models less likely to overfit to dataset bias and more focused on reasoning. Specifically, we use the scene graph, a directed graph ($\mathcal{G}$) where an object node is connected by adjective nodes and relationship nodes, to represent the complex structural layout of both image ($\mathcal{I}$) and sentence ($\mathcal{S}$). In the textual domain, we use SGAE to learn a dictionary ($\mathcal{D}$) that helps to reconstruct sentences in the $\mathcal{S}\rightarrow \mathcal{G} \rightarrow \mathcal{D} \rightarrow \mathcal{S}$ pipeline, where $\mathcal{D}$ encodes the desired language prior; in the vision-language domain, we use the shared $\mathcal{D}$ to guide the encoder-decoder in the $\mathcal{I}\rightarrow \mathcal{G}\rightarrow \mathcal{D} \rightarrow \mathcal{S}$ pipeline. Thanks to the scene graph representation and the shared dictionary, the inductive bias is transferred across domains in principle. We validate the effectiveness of SGAE on the challenging MS-COCO image captioning benchmark: our SGAE-based single model achieves a new state-of-the-art 127.8 CIDEr-D on the Karpathy split, and a competitive 125.5 CIDEr-D (c40) on the official server, even compared to other ensemble models.

    Adaptive content mapping for internet navigation

    The Internet, as the biggest human library ever assembled, keeps on growing. Although all kinds of information carriers (e.g. audio/video/hybrid file formats) are available, text-based documents dominate. It is estimated that about 80% of all information stored electronically worldwide exists in (or can be converted into) text form. More and more documents of all kinds are generated by means of a text processing system and are therefore available electronically. Nowadays, many printed journals are also published online and may even cease to appear in print in the future. This development has many convincing advantages: the documents are available both faster (cf. prepress services) and more cheaply, they can be searched more easily, physical storage needs only a fraction of the space previously necessary, and the medium does not age. For most people, fast and easy access is the most interesting feature of the new age; computer-aided search for specific documents or Web pages becomes the basic tool for information-oriented work. But this tool has problems. The current keyword-based search engines available on the Internet are not really appropriate for such a task; either (far) too many documents matching the specified keywords are presented, or none at all. The problem lies in the fact that it is often very difficult to choose appropriate terms describing the desired topic in the first place. This contribution discusses the current state-of-the-art techniques in content-based searching (along with common visualization/browsing approaches) and proposes a particular adaptive solution for intuitive Internet document navigation, which not only enables the user to provide full texts (if available) instead of manually selected keywords, but also allows him/her to explore the whole database.

    Cross-lingual C*ST*RD: English access to Hindi information

    We present C*ST*RD, a cross-language information delivery system that supports cross-language information retrieval, information space visualization and navigation, machine translation, and text summarization of single documents and clusters of documents. C*ST*RD was assembled and trained within 1 month, in the context of DARPA’s Surprise Language Exercise, which selected as its source a heretofore unstudied language, Hindi. Given the brief time, we could not create deep Hindi capabilities for all the modules, but instead experimented with combining shallow Hindi capabilities, or even English-only modules, into one integrated system. Various possible configurations, with different tradeoffs in processing speed and ease of use, enable the rapid deployment of C*ST*RD to new languages under various conditions.

    Object-aware Inversion and Reassembly for Image Editing

    By comparing the original and target prompts in an editing task, we can obtain numerous editing pairs, each comprising an object and its corresponding editing target. To allow editability while maintaining fidelity to the input image, existing editing methods typically involve a fixed number of inversion steps that project the whole input image to its noisier latent representation, followed by a denoising process guided by the target prompt. However, we find that the optimal number of inversion steps for achieving ideal editing results varies significantly among different editing pairs, owing to varying editing difficulties. Therefore, the current literature, which relies on a fixed number of inversion steps, produces sub-optimal generation quality, especially when handling multiple editing pairs in a natural image. To this end, we propose a new image editing paradigm, dubbed Object-aware Inversion and Reassembly (OIR), to enable object-level fine-grained editing. Specifically, we design a new search metric, which determines the optimal number of inversion steps for each editing pair by jointly considering the editability of the target and the fidelity of the non-editing region. We use this search metric to find the optimal inversion step for each editing pair when editing an image. We then edit these editing pairs separately to avoid concept mismatch. Subsequently, we propose an additional reassembly step to seamlessly integrate the respective editing results and the non-editing region to obtain the final edited image. To systematically evaluate the effectiveness of our method, we collect two datasets for benchmarking single- and multi-object editing, respectively. Experiments demonstrate that our method achieves superior performance in editing object shapes, colors, materials, categories, etc., especially in multi-object editing scenarios.
    Comment: Project Page: https://aim-uofa.github.io/OIR-Diffusion

    A Portuguese Case Study

    There is a high national dependency on Position, Navigation and Timing (PNT) systems among the individuals, services and organisations that depend on this information on a daily basis. Those who rely on precise, accurate and continuous information need resilient systems in order to be highly efficient and reliable. A resilient structure and constantly available systems make it easier to predict a threat or to recover rapidly in a hazardous environment. One of these organisations is the Portuguese Navy, whose main purposes are combat and the maintenance of maritime safety. In combat, resilient PNT systems are needed to provide robustness in case of a threat or even a simple occasional system failure. To guarantee maritime safety, for example in Search and Rescue missions, the need for PNT information is constant and indispensable for positioning control. The large diversity of PNT-dependent equipment developed over the last two decades is a valid showcase for the high dependency on GPS seen nowadays, a system that is vulnerable to various factors such as interference, jamming, spoofing and ionospheric conditions. The recent interest in integrated PNT system solutions is related to the search for redundancy, accuracy, precision, availability, low cost, coverage, reliability and continuity. This study aimed to build a current picture of PNT in Portugal based on stakeholder analysis and interviews; to assess the vulnerability of those who depend mainly on GPS for PNT information; and to find out what the next steps should be in order to create a National PNT Strategy.
    There is a high national dependency on Position, Navigation and Timing (PNT) systems among the many individuals, services and organisations that depend on this information in their day-to-day activities. All those who depend on precise, accurate and continuous information need resilient systems in order to be highly efficient and reliable. A resilient structure and continuously available systems make it easier to anticipate possible threats or to restore functionality quickly in hostile environments. One of these organisations is the Portuguese Navy, whose main functions are combat, the safeguarding of human life at sea, and maritime and navigation safety. For combat, resilient PNT systems are needed that offer robustness in the event of a simple threat or a temporary system failure. To fulfil the mission, the need for reliable and up-to-date PNT information is constant and indispensable for precise and accurate position control. A naval unit, in order to remain continuously at sea, maintain its readiness, train its crew or be deployed in a war scenario, needs to know its position and time reference with confidence and without error. The wide range of systems that depend on PNT information has grown substantially over the last two decades and increasingly underpins the high dependency on GPS, which is vulnerable to several sources of error, such as interference, jamming, spoofing and ionospheric conditions. Currently, the strong interest in creating integrated PNT systems is associated with the search for redundancy, accuracy, precision, availability, low cost, coverage, reliability and continuity. This study aimed to build the current picture of PNT systems in Portugal, based on a stakeholder analysis and interviews; to assess the vulnerability of organisations and services that depend exclusively on GPS as their source of PNT information; and to propose a possible path towards creating a National PNT Strategy.

    Exploratory Search on Mobile Devices

    The goal of this thesis is to provide a general framework (MobEx) for exploratory search, especially on mobile devices. The central part is the design, implementation, and evaluation of several core modules for on-demand unsupervised information extraction that are well suited to exploratory search on mobile devices and that make up the MobEx framework. These core processing elements, combined with a multitouch-able user interface specially designed for two families of mobile devices, i.e. smartphones and tablets, have been implemented in a research prototype. The initial information request, in the form of a query topic description, is issued online by a user to the system. The system then retrieves web snippets by using standard search engines. These snippets are passed through a chain of NLP components which perform on-demand or ad-hoc interactive Query Disambiguation, Named Entity Recognition, and Relation Extraction tasks. By on-demand or ad-hoc we mean that the components are capable of performing their operations on an unrestricted open domain within special time constraints. The result of the whole process is a topic graph containing the detected associated topics as nodes and the extracted relationships as labelled edges between the nodes. The topic graph is presented to the user in different ways depending on the size of the device she is using. Various evaluations have been conducted that help us to understand the potential and limitations of the framework and the prototype.

    Structures of authority : postwar masculinity and the British police

    The British police procedural novel of the 1950s has attracted little critical attention, perhaps because the decade is seen as a ‘golden age’ of police legitimacy (Loader and Mulcahy, 2003). This perception is reinforced by the cinema of the period, where the police are predominantly represented as embodying traditional masculinities and demonstrating familiar national virtues. They are also shown to be policing a society that was itself fundamentally homogeneous. Yet this template bore little resemblance to the realities of crime in the late 1940s and early 1950s, and it needs to be set against developments in the crime novel. While cinema used the genre to reassure, it is less clear whether the police procedural of the period attempted or achieved the same end. This hypothesis is explored through an examination of John Creasey’s popular Gideon books. Characterised by open endings and a disturbing level of violence, these novels demonstrate a significant transition in the representation of the police in British crime fiction, suggesting that the 1950s procedural was not a source of reassurance, but a textual space that recognised and negotiated the pressures of a changing society.

    Volume 39- Issue 3- December 1929

    The Rose Thorn, Rose-Hulman's independent student newspaper.