
    Embedded Electronics In Medical Applications

    Proceedings of"Conference on Recent Advances in Biomaterials Dec 17-18 '10"Held at Saveetha School of Engineering, Saveetha University, Thandalam, Chennai-602 105, Tamilnadu, IndiaTheme 10Embedded Electronics In Medical Application

    Integrating Deep Contextualized Word Embeddings into Text Summarization Systems

    In this thesis, deep learning techniques are used to tackle one of the hardest problems in natural language processing: automatic text summarization. Given a text corpus, the goal is to generate a summary that distills and compresses the information of the entire source text. Early approaches tried to capture the meaning of the text through human-written rules. After this symbolic, rule-based era, statistical approaches took over. In recent years deep learning has positively impacted every area of natural language processing, including automatic summarization. In this work, pointer-generator models [See et al., 2017] are used in combination with pre-trained deep contextualized word embeddings [Peters et al., 2018]. The approach is evaluated on the two largest summarization datasets currently available: the CNN/Daily Mail dataset and the Newsroom dataset. The CNN/Daily Mail dataset was generated from the Question Answering dataset published by DeepMind [Hermann et al., 2015], concatenating the highlight sentences of the news articles to form multi-sentence summaries. The Newsroom dataset [Grusky et al., 2018] is, instead, the first dataset explicitly built for automatic summarization. It comprises one million article-summary pairs with different degrees of extractiveness/abstractiveness and different compression ratios. The approach is evaluated on the test sets using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metric. It yields a substantial performance gain on the Newsroom dataset, reaching state-of-the-art ROUGE-1 and competitive ROUGE-2 and ROUGE-L scores.
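
    The ROUGE-1 score mentioned above is essentially clipped unigram overlap between a system summary and a reference summary. As a rough illustration only (the thesis presumably uses the standard ROUGE toolkit, not this code), a minimal Python sketch of ROUGE-1 recall, precision, and F1 could look like this:

```python
from collections import Counter

def rouge_1(candidate: str, reference: str) -> dict:
    """Illustrative ROUGE-1: clipped unigram overlap between candidate and reference.
    Real evaluations use the official ROUGE toolkit with stemming/stopword options."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    # Each candidate unigram is credited at most as often as it occurs in the reference.
    overlap = sum(min(c, ref_counts[w]) for w, c in cand_counts.items())
    recall = overlap / max(len(ref), 1)
    precision = overlap / max(len(cand), 1)
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}

print(rouge_1("the cat sat on the mat", "a cat was sitting on the mat"))
```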

    Text-image synergy for multimodal retrieval and annotation

    Text and images are the two most common data modalities found on the Internet. Understanding the synergy between text and images, that is, seamlessly analyzing information from these modalities, may be trivial for humans but is challenging for software systems. In this dissertation we study problems where deciphering text-image synergy is crucial for finding solutions. We propose methods and ideas that establish semantic connections between text and images in multimodal content, and empirically show their effectiveness in four interconnected problems:
    • Image Retrieval. Whether images are found for text-based queries depends strongly on whether the text near the image matches the query. Images without textual context, or with thematically fitting context but without direct keyword matches to the query, often cannot be found. As a remedy, we propose combining three kinds of information: visual information (in the form of automatically generated image descriptions), textual information (keywords from previous search queries), and commonsense knowledge.
    • Image Tag Refinement. Object detection by computer vision systems frequently produces misdetections and incoherent labels, yet correctly identifying image content is an important prerequisite for retrieving images via textual queries. To reduce these errors, we propose incorporating commonsense knowledge: additional image annotations that common sense deems thematically plausible help avoid many erroneous and incoherent detections.
    • Image-Text Alignment. On web pages that combine text and images (such as news sites, blog posts, and social media articles), images are usually placed at semantically meaningful positions in the text flow. We exploit this to propose a framework that selects relevant images and associates them with the matching passages of a text.
    • Image Captioning. Images that accompany multimodal content to improve the readability of a text typically carry captions that fit the context of the surrounding text. We propose to take this context into account when generating captions automatically, whereas usually only the image itself is analyzed; we introduce context-aware image caption generation.
    Our promising results and observations open up interesting avenues for future research involving text-image data understanding.
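
    For the image retrieval problem above, combining visual, textual, and commonsense evidence can be pictured as a weighted score over the three sources. The sketch below is purely illustrative and under my own assumptions; the function names, weights, and similarity measure are hypothetical and not taken from the dissertation:

```python
# Illustrative sketch (not the dissertation's implementation): score an image
# for a text query by combining three hypothetical evidence sources.
def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if (a | b) else 0.0

def image_score(query: str, auto_caption: str, past_query_terms: set,
                commonsense_expansions: set, weights=(0.5, 0.3, 0.2)) -> float:
    q = set(query.lower().split())
    visual = jaccard(q, set(auto_caption.lower().split()))   # automatically generated caption
    textual = jaccard(q, past_query_terms)                    # keywords from past queries
    knowledge = jaccard(q, commonsense_expansions)            # commonsense expansions
    w_v, w_t, w_k = weights
    return w_v * visual + w_t * textual + w_k * knowledge

print(image_score("dog playing fetch in a park",
                  auto_caption="a dog catching a frisbee on grass",
                  past_query_terms={"dog", "park", "frisbee"},
                  commonsense_expansions={"pet", "playing", "outdoors"}))
```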

    Arabic multi-document text summarisation

    Multi-document summarisation is the process of producing a single summary of a collection of related documents. Much of the current work on multi-document text summarisation is concerned with the English language; relevant resources are numerous and readily available. These resources include human generated (gold-standard) and automatic summaries. Arabic multi-document summarisation is still in its infancy. One of the obstacles to progress is the limited availability of Arabic resources to support this research. When we started our research there were no publicly available Arabic multi-document gold-standard summaries, which are needed to automatically evaluate system generated summaries. The Document Understanding Conference (DUC) and Text Analysis Conference (TAC) at that time provided resources such as gold-standard extractive and abstractive summaries (both human and system generated) that were only available in English. Our aim was to push forward the state-of-the-art in Arabic multi-document summarisation. This required advancements in at least two areas. The first area was the creation of Arabic test collections. The second area was concerned with the actual summarisation process, to find methods that improve the quality of Arabic summaries. To address both points we created single and multi-document Arabic test collections, both automatically, from a commonly used English dataset, and manually, with human participants. We developed extractive language dependent and language independent single and multi-document summarisers, both for Arabic and English. In our work we provided state-of-the-art approaches for Arabic multi-document summarisation. We succeeded in including Arabic in one of the leading summarisation conferences, the Text Analysis Conference (TAC). Researchers on Arabic multi-document summarisation now have resources and tools that can be used to advance research in this field.
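
    As a rough, language-independent illustration of the extractive approach described above (a sketch under my own assumptions, not the thesis's actual summarisers), sentences can be scored by the corpus frequency of the words they contain and the top-scoring ones selected:

```python
import re
from collections import Counter

def extractive_summary(documents: list[str], num_sentences: int = 3) -> str:
    """Frequency-based multi-document extractive summariser (illustrative sketch).
    Language independent insofar as tokens are whitespace-separated."""
    sentences = []
    for doc in documents:
        sentences.extend(s.strip() for s in re.split(r"[.!?\n]+", doc) if s.strip())
    word_freq = Counter(w.lower() for s in sentences for w in s.split())
    def score(sentence: str) -> float:
        words = sentence.split()
        return sum(word_freq[w.lower()] for w in words) / max(len(words), 1)
    selected = sorted(sentences, key=score, reverse=True)[:num_sentences]
    # Emit the selected sentences in their original document order.
    return ". ".join(s for s in sentences if s in selected) + "."
```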

    A joint text mining-rank size investigation of the rhetoric structures of the US Presidents’ speeches

    © 2019 Elsevier Ltd. This work presents a text mining framework and its use for a deep analysis of the messages delivered by politicians. Specifically, we deal with an expert systems-based exploration of the rhetoric dynamics of a large collection of US Presidents' speeches, ranging from Washington to Trump. In particular, speeches are viewed as complex expert systems whose structures can be effectively analyzed through rank-size laws. The methodological contribution of the paper is twofold. First, we develop a text mining-based procedure for constructing the dataset, using a web scraping routine on the Miller Center website, the repository collecting the speeches. Second, we explore the implicit structure of the discourse data by implementing a rank-size procedure over the individual speeches, with the words of each speech ranked by frequency. The scientific significance of the proposed combination of text mining and rank-size approaches lies in its flexibility and generality, which make it reproducible across a wide set of expert systems and text mining contexts. The usefulness of the proposed method and of the speeches analysis is demonstrated by the findings themselves. In terms of impact, interesting conclusions of a social, political and linguistic nature can be drawn on how the 45 United States Presidents, from April 30, 1789 to February 28, 2017, delivered political messages. The proposed analysis shows some remarkable regularities, not only within a given speech, but also across different speeches. Moreover, from a purely methodological perspective, the presented contribution suggests possible ways of generating a linguistic decision-making algorithm.
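
    The rank-size step described above amounts to ranking each speech's word frequencies and fitting a power law to the rank-frequency curve. A minimal sketch under my own assumptions (the paper's exact rank-size law and fitting procedure may differ), using a log-log least-squares fit:

```python
import numpy as np
from collections import Counter

def rank_size_fit(speech: str):
    """Rank the words of a speech by frequency and fit frequency ~ C * rank^(-alpha)
    by least squares in log-log space (illustrative sketch)."""
    freqs = np.array(sorted(Counter(speech.lower().split()).values(), reverse=True),
                     dtype=float)
    ranks = np.arange(1, len(freqs) + 1, dtype=float)
    # Fit log(freq) = log(C) - alpha * log(rank)
    slope, intercept = np.polyfit(np.log(ranks), np.log(freqs), 1)
    return -slope, np.exp(intercept)   # (alpha, C)

alpha, C = rank_size_fit("government of the people by the people for the people")
print(f"alpha={alpha:.2f}, C={C:.2f}")
```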

    The Best Explanation: Beyond Right and Wrong in Question Answering


    Synthetic steganography: Methods for generating and detecting covert channels in generated media

    Issues of privacy in communication are becoming increasingly important. For many people and businesses, the use of strong cryptographic protocols is sufficient to protect their communications. However, the overt use of strong cryptography may be prohibited or individual entities may be prohibited from communicating directly. In these cases, a secure alternative to the overt use of strong cryptography is required. One promising alternative is to hide the use of cryptography by transforming ciphertext into innocuous-seeming messages to be transmitted in the clear. In this dissertation, we consider the problem of synthetic steganography: generating and detecting covert channels in generated media. We start by demonstrating how to generate synthetic time series data that not only mimic an authentic source of the data, but also hide data at any of several different locations in the reversible generation process. We then design a steganographic context-sensitive tiling system capable of hiding secret data in a variety of procedurally-generated multimedia objects. Next, we show how to securely hide data in the structure of a Huffman tree without affecting the length of the codes. Next, we present a method for hiding data in Sudoku puzzles, both in the solved board and the clue configuration. Finally, we present a general framework for exploiting steganographic capacity in structured interactions like online multiplayer games, network protocols, auctions, and negotiations. Recognizing that structured interactions represent a vast field of novel media for steganography, we also design and implement an open-source extensible software testbed for analyzing steganographic interactions and use it to measure the steganographic capacity of several classic games. We analyze the steganographic capacity and security of each method that we present and show that existing steganalysis techniques cannot accurately detect the usage of the covert channels. We develop targeted steganalysis techniques which improve detection accuracy and then use the insights gained from those methods to improve the security of the steganographic systems. We find that secure synthetic steganography, and accurate steganalysis thereof, depends on having access to an accurate model of the cover media.
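
    One of the channels listed above, hiding data in the structure of a Huffman tree without affecting code lengths, can be pictured through the observation that swapping the two children of any internal node changes the codewords but not their lengths. The sketch below is my own illustration of that general idea, not the dissertation's construction; a real system would also need a canonical child ordering so the receiver can extract the bits:

```python
import heapq

def huffman_tree(freqs: dict):
    """Build a Huffman tree; leaves are symbols, internal nodes are [left, right] lists."""
    heap = [(f, i, sym) for i, (sym, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, [a, b]))
        counter += 1
    return heap[0][2]

def embed_bits(node, bits, i=0):
    """Hide one secret bit per internal node via the left/right ordering of its children.
    Swapping children never changes any codeword length, only the code bits."""
    if isinstance(node, list):
        if i < len(bits) and bits[i] == 1:
            node[0], node[1] = node[1], node[0]
        i += 1
        i = embed_bits(node[0], bits, i)
        i = embed_bits(node[1], bits, i)
    return i

tree = huffman_tree({"a": 5, "b": 2, "c": 1, "d": 1})
embed_bits(tree, [1, 0, 1])
print(tree)
```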

    Study of result presentation and interaction for aggregated search

    The World Wide Web has always attracted researchers and commercial search engine companies due to the enormous amount of information available on it. "Searching" the web has become an integral part of today's world, and many people rely on it when looking for information. The amount and diversity of information available on the Web have also increased dramatically, and researchers and search engine companies are making constant efforts to make this information effectively accessible. Not only has the amount and diversity of online information increased; users are now often seeking information on broader topics. Users seeking information on broad topics gather it from various information sources (e.g. image, video, news, blog). For such requests, not only web results but also results from different document genres and multimedia content become relevant. For instance, users looking for information on "Glasgow" might be interested in web results about Glasgow, maps of Glasgow, images of Glasgow, news about Glasgow, and so on. Aggregated search aims to provide access to this diverse information in a unified manner by aggregating results from different information sources on a single result page, thereby making the information-gathering process easier for broad topics. This thesis explores aggregated search from the users' perspective. It first and foremost focuses on understanding and describing the phenomena related to the users' search process in the context of aggregated search. The goal is to contribute to building theories and understanding constraints, as well as to provide insights into the interface design space. In building this understanding, the thesis focuses on click behavior, information need, source relevance, and the dynamics of search intents. This understanding comes partly from conducting user studies and partly from analyzing search engine log data. While the thematic (or topical) relevance of documents is important, this thesis argues that the "source type" (source orientation) may also be an important dimension of the relevance space to investigate in aggregated search; relevance is therefore multi-dimensional (topical and source-oriented) within the context of aggregated search. Results from the study suggest that source orientation was a significant factor in an aggregated search scenario, adding another dimension to the relevance space. The thesis further presents an effective method that combines rule-based and machine learning techniques to identify the source orientation behind a user query. Furthermore, after analyzing log data from a search engine company and conducting user study experiments, several design issues that may arise with respect to the aggregated search interface are identified, and suitable design guidelines that can be beneficial from the interface perspective are suggested. To conclude, the aim of this thesis is to explore emerging aggregated search from the users' perspective, since it is very important for front-end technologies. An additional goal is to provide empirical evidence for the influence of aggregated search on users' search behavior and to identify some of the key challenges of aggregated search. In the course of this work several aspects of aggregated search are uncovered. Furthermore, this thesis provides a foundation for future research in aggregated search and highlights potential research directions.
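
    The source-orientation identification mentioned above combines rules with a learned classifier. A minimal sketch under my own assumptions (the vertical names, trigger words, and fallback classifier are hypothetical, not the thesis's actual method):

```python
# Hypothetical verticals and trigger words; a real system would learn these
# from query logs rather than hard-code them.
RULES = {
    "image": {"image", "images", "photo", "photos", "picture", "pictures"},
    "video": {"video", "videos", "trailer", "clip"},
    "news":  {"news", "latest", "breaking", "headlines"},
    "map":   {"map", "maps", "directions", "route"},
}

def source_orientation(query: str, fallback_classifier=None) -> str:
    """Return the vertical a query is oriented towards.
    Rules fire on explicit trigger words; otherwise defer to a trained model."""
    tokens = set(query.lower().split())
    for vertical, triggers in RULES.items():
        if tokens & triggers:
            return vertical
    if fallback_classifier is not None:
        return fallback_classifier(query)   # e.g. a supervised text classifier
    return "web"                            # default: general web results

print(source_orientation("pictures of Glasgow"))   # -> "image"
print(source_orientation("Glasgow"))               # -> "web"
```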