212 research outputs found
Cloud-Based Benchmarking of Medical Image Analysis
Medical imagin
Feature based dynamic intra-video indexing
A thesis submitted in partial fulfillment for the degree of Doctor of Philosophy. With the advent of digital imagery and its widespread application in all walks of life, it has become an important component in the world of communication. Video content from broadcast news, sports, personal videos, surveillance, movies, entertainment and similar domains is increasing exponentially in quantity, and retrieving content of interest from these corpora is becoming a challenge. This has led to increased interest among researchers in video structure analysis, feature extraction, content annotation, tagging, video indexing, querying and retrieval. However, most previous work is confined to a specific domain and constrained by quality, processing and storage capabilities. This thesis presents a novel framework that combines established approaches, from feature extraction to browsing, in one system for content-based video retrieval. The proposed framework fills the identified gap while satisfying the imposed constraints on processing, storage, quality and retrieval time. The output comprises a framework, a methodology and a prototype application that allow the user to efficiently and effectively retrieve content of interest, such as age, gender and activity, by specifying the relevant query. Experiments have shown plausible results, with an average precision and recall of 0.91 and 0.92 respectively for face detection using a Haar-wavelet-based approach. Precision for age ranges from 0.82 to 0.91 and recall from 0.78 to 0.84. Gender recognition gives better precision for males (0.89) than for females, while recall is higher for females (0.92). The activity of the subject has been detected using the Hough transform and classified using a Hidden Markov Model. A comprehensive dataset to support similar studies has also been developed as part of the research process.
A Graphical User Interface (GUI) providing a friendly and intuitive interface has been integrated into the developed system to facilitate the retrieval process. A comparison using the intraclass correlation coefficient (ICC) shows that the performance of the system closely resembles that of the human annotator. The performance has been optimised for time and error rate.
Towards Robust Long-form Text Generation Systems
Text generation is an important emerging AI technology that has seen significant research advances in recent years. Due to its closeness to how humans communicate, mastering text generation technology can unlock several important applications such as intelligent chat-bots, creative writing assistance, or newer applications like task-agnostic few-shot learning. Most recently, the rapid scaling of large language models (LLMs) has resulted in systems like ChatGPT, capable of generating fluent, coherent and human-like text. However, despite their remarkable capabilities, LLMs still suffer from several limitations, particularly when generating long-form text. In particular, (1) long-form generated text is filled with factual inconsistencies to world knowledge and the input prompt; (2) it is difficult to accurately evaluate the quality of long-form generated text; (3) it is difficult to identify whether a piece of long-form text was AI-generated, a task necessary to prevent widespread misinformation and plagiarism.
In this thesis I design algorithms aimed at making progress towards these three issues in current LLMs. I will first describe a retrieval-augmented system we built for long-form question answering, to improve factual correctness of long-form generated text. However, a careful empirical analysis reveals issues related to input/output consistency of generated text, and an inherent difficulty in evaluation. I will then describe our model RankGen, which uses large-scale contrastive learning on documents to significantly outperform competing long-form text generation methods to generate text more faithful to the input. Next, I will describe our efforts to improve human evaluation of long-form generation (issue #2) by proposing the LongEval guidelines. LongEval is a set of three simple empirically-motivated ideas to make human evaluation of long-form generation more consistent, less expensive, and cognitively easier for evaluators. Finally, I describe my work on AI-generated text detection (issue #3), and showcase the brittleness of existing methods to paraphrasing attacks I designed. I will describe a simple new AI-generated text detection algorithm using information retrieval, which is significantly more robust to paraphrasing attacks.
Finally, I conclude this thesis with some future research directions that I am excited about, including plan-based long-form text generation, and a deeper dive into understanding large language model training dynamics
Proceedings of the Conference on Natural Language Processing 2010
This book contains state-of-the-art contributions to the 10th
conference on Natural Language Processing, KONVENS 2010
(Konferenz zur Verarbeitung natürlicher Sprache), with a focus
on semantic processing.
The KONVENS in general aims at offering a broad perspective
on current research and developments within the interdisciplinary
field of natural language processing. The central theme
draws specific attention towards addressing linguistic aspects
of meaning, covering deep as well as shallow approaches to semantic
processing. The contributions address both knowledge-based
and data-driven methods for modelling and acquiring
semantic information, and discuss the role of semantic information
in applications of language technology.
The articles demonstrate the importance of semantic processing,
and present novel and creative approaches to natural
language processing in general. Some contributions put their
focus on developing and improving NLP systems for tasks like
Named Entity Recognition or Word Sense Disambiguation, or
focus on semantic knowledge acquisition and exploitation with
respect to collaboratively built resources, or harvesting semantic
information in virtual games. Others are set within the
context of real-world applications, such as Authoring Aids, Text
Summarisation and Information Retrieval. The collection highlights
the importance of semantic processing for different areas
and applications in Natural Language Processing, and provides
the reader with an overview of current research in this field
Enhancing scene text recognition with visual context information
This thesis addresses the problem of improving text spotting systems, which aim to detect and recognize text in unrestricted images (e.g. a street sign, an advertisement, a bus destination, etc.). The goal is to improve the performance of off-the-shelf vision systems by exploiting the semantic information derived from the image itself. The rationale is that knowing the content of the image or the visual context can help to decide which words are the correct candidate words.
For example, the fact that an image shows a coffee shop makes it more likely that a word on a signboard reads as Dunkin and not unkind.
We address this problem by drawing on successful developments in natural language processing and machine learning, in particular, learning to re-rank and neural networks, to present post-process frameworks that improve state-of-the-art text spotting systems without the need for costly data-driven re-training or tuning procedures.
Discovering the degree of semantic relatedness of candidate words and their image context is a task related to assessing the semantic similarity between words or text fragments. However, semantic relatedness is more general than similarity (e.g. car, road, and traffic light are related but not similar) and requires certain adaptations. To meet the requirements of these broader perspectives of semantic similarity, we develop two approaches to learn the semantic relatedness of the spotted word and its environmental context: word-to-word (object) or word-to-sentence (caption). In the word-to-word approach, word-embedding-based re-rankers are developed. The re-ranker takes the words from the text spotting baseline and re-ranks them based on the visual context from the object classifier. For the second, an end-to-end neural approach is designed to exploit the image description (caption) at the sentence level as well as the word level (objects) and to re-rank the candidate words based not only on the visual context but also on the co-occurrence between them.
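The word-to-word re-ranking idea can be illustrated with a minimal sketch. This is not the thesis implementation: the toy three-dimensional vectors, the `rerank` function, the use of the maximum cosine similarity as the relatedness score, and the linear `weight` mixing are all assumptions chosen for illustration. A real system would use pretrained word embeddings and the scores of an actual text spotting baseline.

```python
import math

# Hypothetical toy embeddings (assumption); real systems load pretrained vectors.
TOY_EMBEDDINGS = {
    "dunkin": [0.9, 0.1, 0.0],
    "unkind": [0.0, 0.2, 0.9],
    "coffee": [0.8, 0.3, 0.1],
    "shop": [0.7, 0.4, 0.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rerank(candidates, context_words, embeddings, weight=0.5):
    """Re-rank (word, baseline_score) pairs using relatedness to the visual context.

    The final score mixes the baseline score with the highest cosine
    similarity between the candidate and any context word (an assumed
    combination rule, not the one used in the thesis).
    """
    scored = []
    for word, baseline_score in candidates:
        vec = embeddings.get(word)
        sims = [
            cosine(vec, embeddings[ctx])
            for ctx in context_words
            if vec is not None and ctx in embeddings
        ]
        relatedness = max(sims) if sims else 0.0
        scored.append((word, (1 - weight) * baseline_score + weight * relatedness))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# The baseline prefers "unkind"; the visual context (a coffee shop) flips the order.
candidates = [("unkind", 0.55), ("dunkin", 0.45)]
ranked = rerank(candidates, ["coffee", "shop"], TOY_EMBEDDINGS)
print([word for word, _ in ranked])
```

With these toy vectors the context words pull "dunkin" ahead of "unkind", mirroring the coffee shop example in the abstract.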
As an additional contribution, to meet the requirements of data-driven approaches such as neural networks, we propose a visual context dataset for this task, in which the publicly available COCO-text dataset [Veit et al. 2016] has been extended with information about the scene (including the objects and places appearing in the image) to enable researchers to include the semantic relations between texts and scene in their text spotting systems, and to offer a common evaluation baseline for such approaches.
Semi-automated Ontology Generation for Biocuration and Semantic Search
Background:
In the life sciences, the amount of literature and experimental data grows at a tremendous rate. In order to effectively access and integrate these data, biomedical ontologies – controlled, hierarchical vocabularies – are being developed.
Creating and maintaining such ontologies is a difficult, labour-intensive, manual process. Many computational methods which can support ontology construction have been proposed in the past. However, good, validated systems are largely missing.
Motivation:
The biocuration community plays a central role in the development of ontologies. Any method that can support their efforts has the potential to have a huge impact in the life sciences.
Recently, a number of semantic search engines were created that make use of biomedical ontologies for document retrieval. To transfer the technology to other knowledge domains, suitable ontologies need to be created. One area where ontologies may prove particularly useful is the search for alternative methods to animal testing, an area where comprehensive search is of special interest to determine the availability or unavailability of alternative methods.
Results:
The Dresden Ontology Generator for Directed Acyclic Graphs (DOG4DAG) developed in this thesis is a system which supports the creation and extension of ontologies by semi-automatically generating terms, definitions, and parent-child relations from text in PubMed, the web, and PDF repositories. The system is seamlessly integrated into OBO-Edit and Protégé, two widely used ontology editors in the life sciences. DOG4DAG generates terms by identifying statistically significant noun-phrases in text. For definitions and parent-child relations it employs pattern-based web searches. Each generation step has been systematically evaluated using manually validated benchmarks. The term generation leads to high quality terms also found in manually created ontologies. Definitions can be retrieved for up to 78% of terms, child ancestor relations for up to 54%. No other validated system exists that achieves comparable results.
To improve the search for information on alternative methods to animal testing, an ontology has been developed that contains 17,151 terms, of which 10% were newly created and 90% were re-used from existing resources. This ontology is the core of Go3R, the first semantic search engine in this field. When a user performs a search query with Go3R, the search engine expands this request using the structure and terminology of the ontology. The machine classification employed in Go3R is capable of distinguishing documents related to alternative methods from those which are not with an F-measure of 90% on a manual benchmark. Approximately 200,000 of the 19 million documents listed in PubMed were identified as relevant, either because a specific term was contained or due to the automatic classification. The Go3R search engine is available online at www.Go3R.org.
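Ontology-based query expansion of the kind described above can be sketched as follows. The abstract does not specify how Go3R expands a request, so this is only one plausible reading: a query term is expanded with all of its descendants in the parent-child hierarchy. The `TOY_ONTOLOGY` fragment, its terms, and the `expand_query` function are hypothetical.

```python
# Hypothetical ontology fragment (assumption): parent term -> list of child terms.
TOY_ONTOLOGY = {
    "alternative method": ["in vitro test", "in silico model"],
    "in vitro test": ["cell culture assay"],
}


def expand_query(term, children):
    """Return the term plus all of its descendants, sorted alphabetically.

    The traversal keeps a visited set, so a term that is reachable through
    several parents in a directed acyclic graph is only included once.
    """
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(children.get(current, []))
    return sorted(seen)


expanded = expand_query("alternative method", TOY_ONTOLOGY)
print(expanded)
```

A search for the broad term then also matches documents that mention only one of its narrower terms, which is the practical benefit of expanding a request through the ontology structure.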
Semantic Interaction in Web-based Retrieval Systems : Adopting Semantic Web Technologies and Social Networking Paradigms for Interacting with Semi-structured Web Data
Existing web retrieval models for exploration and interaction with web data do not take into account semantic information, nor do they allow for new forms of interaction by employing meaningful interaction and navigation metaphors in 2D/3D. This thesis researches means for introducing a semantic dimension into the search and exploration process of web content to enable a significantly positive user experience. Therefore, an inherently dynamic view beyond single concepts and models from semantic information processing, information extraction and human-machine interaction is adopted. Essential tasks for semantic interaction such as semantic annotation, semantic mediation and semantic human-computer interaction were identified and elaborated for two general application scenarios in web retrieval: Web-based Question Answering in a knowledge-based dialogue system and semantic exploration of information spaces in 2D/3D