77 research outputs found

    Automated scholarly paper review: Technologies and challenges

    Peer review is a widely accepted mechanism for research evaluation, playing a pivotal role in scholarly publishing. However, criticisms have long been leveled at this mechanism, mostly because of its inefficiency and subjectivity. Recent years have seen the application of artificial intelligence (AI) to assist the peer review process. Nonetheless, with humans still in the loop, such limitations remain inevitable. In this review paper, we propose the concept and pipeline of automated scholarly paper review (ASPR) and survey the relevant literature and technologies for achieving a fully computerized review process. On the basis of this review and discussion, we conclude that there is already corresponding research and implementation at each stage of ASPR. We further examine the challenges ASPR faces with existing technologies. The major difficulties lie in imperfect document parsing and representation, inadequate data, defective human-computer interaction, and flawed deep logical reasoning. Moreover, we discuss possible moral and ethical issues and point out future directions for ASPR. In the foreseeable future, ASPR and peer review will coexist in a reinforcing manner before ASPR is able to fully take over the reviewing workload from humans.
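
    As a rough illustration of the staged pipeline the paper proposes (parse, assess, generate a review), a minimal sketch follows; all class names, stage signatures, and score aspects here are hypothetical stand-ins, not the paper's actual design.

```python
# Hypothetical sketch of an ASPR-style pipeline as composable stages.
# Stage names, fields, and score aspects are illustrative, not from the paper.
from dataclasses import dataclass, field

@dataclass
class Manuscript:
    raw_pdf: bytes
    sections: dict = field(default_factory=dict)   # filled by parse()
    scores: dict = field(default_factory=dict)     # filled by assess()

def parse(ms: Manuscript) -> Manuscript:
    """Document parsing: PDF bytes -> structured sections."""
    ms.sections = {"title": "...", "abstract": "...", "methods": "..."}
    return ms

def assess(ms: Manuscript) -> Manuscript:
    """Score aspects a human reviewer would judge (placeholder values)."""
    ms.scores = {"novelty": 0.0, "soundness": 0.0, "clarity": 0.0}
    return ms

def generate_review(ms: Manuscript) -> str:
    """Turn aspect scores into a natural-language review summary."""
    return "\n".join(f"{k}: {v:.1f}" for k, v in ms.scores.items())

review = generate_review(assess(parse(Manuscript(raw_pdf=b"%PDF..."))))
print(review)
```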

    End-to-end Learning for Mining Text and Network Data

    A wealth of literature studies user behavior in online communities, e.g., how users respond to information spreading over social networks. One way to study user responses is to analyze user-generated text, identifying attitudes towards target topics. Another is to analyze the information diffusion networks over the users involved. Conventional methods require manual encoding of world knowledge, which is ineffective in many cases. Therefore, to push research forward, we design end-to-end deep learning algorithms that learn high-level representations directly from data and optimize for particular tasks, relieving humans from hand-coding features or rules while achieving better performance. Specifically, I study attitude identification in the text mining domain and important prediction tasks in the network domain. The key roles of text and networks in understanding user behavior in online communities are not the only reason to study them together: compared with other types of data (e.g., images and speech), text and networks are both discrete and thus may share similar challenges and solutions.

    Attitude identification is conventionally decomposed into two separate subtasks: target detection, which identifies whether a given target is mentioned in the text, and polarity classification, which classifies the exact sentiment polarity. However, this decomposition fails to capture interactions between the subtasks. To remedy this, we developed an end-to-end deep learning architecture in which the two subtasks are interleaved by a memory network. Moreover, as the learned representations may share the same semantics for some targets but vary for others, our model also incorporates interactions among entities.

    For information networks, we aim to learn representations of network structure in order to solve valuable prediction tasks in the network community. One example is network growth prediction, which assists decision makers in optimizing strategies. Instead of handcrafting features, which can lead to severe loss of structural information, we propose to learn graph representations through a deep end-to-end prediction model. By finding "signatures" for graphs, we convert graphs into matrices to which convolutional neural networks can be applied. In addition to topology, information networks are often associated with different sources of information. We specifically consider the task of cascade prediction, where global context, the text content associated with nodes and items, and diffusion graphs all play important roles. Conventional methods require manual specification of the interactions among different information sources, which easily misses key information. We present a novel end-to-end deep learning architecture named DeepCas, which first represents a cascade graph as a set of cascade paths sampled through random walks. Such a representation not only allows incorporation of the global context but also bounds the loss of structural information. After modeling the global context, we equip DeepCas with the ability to jointly model text and network in a unified framework. We present a gating mechanism to dynamically fuse the structural and textual representations of nodes based on their respective properties. To incorporate the text information associated with both diffusion items and nodes, attention mechanisms are employed over node text based on its interactions with item text.

    PhD dissertation, Information, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/140791/1/lichengz_1.pd
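
    The core representational move in DeepCas described above, turning a cascade graph into a bag of random-walk node sequences, can be sketched as follows. This is an illustrative reconstruction using networkx, not the authors' code; the number and length of walks are arbitrary example parameters.

```python
# Illustrative sketch of DeepCas-style cascade sampling: represent a
# cascade graph as a set of node paths drawn by random walks, which a
# sequence model would consume downstream. Not the authors' code.
import random
import networkx as nx

def sample_cascade_paths(g: nx.DiGraph, num_walks: int = 200,
                         walk_len: int = 10, seed: int = 0) -> list[list]:
    rng = random.Random(seed)
    nodes = list(g.nodes)
    paths = []
    for _ in range(num_walks):
        node = rng.choice(nodes)          # start each walk at a random node
        path = [node]
        for _ in range(walk_len - 1):
            nbrs = list(g.successors(node))
            if not nbrs:                  # dead end: stop this walk early
                break
            node = rng.choice(nbrs)
            path.append(node)
        paths.append(path)
    return paths

cascade = nx.DiGraph([("a", "b"), ("a", "c"), ("b", "d")])
print(sample_cascade_paths(cascade, num_walks=3, walk_len=4))
```

    Sampling a fixed number of bounded-length walks is what lets the representation cap the loss of structural information while keeping the input size to the sequence model constant.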

    Development of a Corpus for User-based Scientific Question Answering

    Master's thesis, Bioinformática e Biologia Computacional, Universidade de Lisboa, Faculdade de Ciências, 2021. In recent years, Question Answering (QA) tasks have become particularly relevant in the research field of natural language understanding. However, the lack of good-quality datasets has been an important limiting factor in the quest for better models. Particularly in the biomedical domain, the scarcity of gold-standard labelled datasets is a recognized obstacle, given that the domain's idiosyncrasies and complexities often require skilled domain-specific experts to produce such datasets. To address this issue, a method for automatically gathering Question-Answer pairs from online biomedical QA forums has been suggested, yielding a corpus named BiQA. Its authors describe several strategies to validate this new dataset, but no manual human verification has been conducted. With this in mind, this dissertation set out to perform a manual verification of a sample of 1200 questions from BiQA and to expand these questions, by adding features, into a new corpus, BiQA2, with the goal of contributing a new corpus for biomedical QA research. Regarding the manual verification of BiQA, a methodology for its characterization was laid out, allowing the identification of an array of potential problems related to the nature of its questions and the aptness of its answers, for which possible improvements were proposed. Concomitantly, the proposed BiQA2 corpus, built upon the validated questions and answers from the perused BiQA samples, adds new features similar to those observed in other biomedical corpora such as the BioASQ dataset. Both BiQA and BiQA2 were applied to deep learning strategies previously submitted to the BioASQ competition to assess their performance as a source of training data. Although the results achieved with the models trained on BiQA2 exhibit limited capability on the BioASQ challenge, they also show some potential to contribute positively to model training in tasks such as document re-ranking and answering 'yes/no' questions.
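
    A minimal sketch of the kind of forum mining BiQA describes, pairing questions with accepted and sufficiently upvoted answers, might look as follows; the thread schema and score threshold are hypothetical assumptions, not the corpus authors' actual pipeline.

```python
# Hedged sketch of Question-Answer pair extraction from forum threads:
# keep questions that have an accepted, well-scored answer.
# The thread dictionary schema below is a hypothetical illustration.
def extract_qa_pairs(threads: list[dict], min_score: int = 2) -> list[dict]:
    pairs = []
    for t in threads:
        accepted = [a for a in t.get("answers", [])
                    if a.get("accepted") and a.get("score", 0) >= min_score]
        if accepted:
            pairs.append({"question": t["title"] + " " + t["body"],
                          "answer": accepted[0]["body"]})
    return pairs

threads = [{"title": "How to normalize RNA-seq counts?",
            "body": "Which method should I use for comparing samples?",
            "answers": [{"body": "Use TPM or DESeq2 size factors.",
                         "accepted": True, "score": 5}]}]
print(extract_qa_pairs(threads))
```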

    Basque and Spanish Multilingual TTS Model for Speech-to-Speech Translation

    Lately, multiple Text-to-Speech models have emerged that use deep neural networks to synthesize audio from text. In this work, a state-of-the-art multilingual, multi-speaker Text-to-Speech model has been trained in Basque, Spanish, Catalan, and Galician. The research consisted of gathering the datasets, pre-processing their audio and text data, training the model on the languages in different steps, and evaluating the results at each point. For the training step, a transfer learning approach was used, starting from a model already trained in three languages: English, Portuguese, and French. The final model created here therefore supports a total of seven languages. Moreover, these models also support zero-shot voice conversion, using an input audio file as a reference. Finally, a prototype application was created for Speech-to-Speech Translation, putting together the models trained here and other models from the community. Along the way, Deep Speech Speech-to-Text models were also generated for Basque and Galician.
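
    The transfer-learning step described above hinges on reusing weights from the English/Portuguese/French model while adding entries for the new languages. A conceptual PyTorch sketch of extending a language-embedding table follows; the embedding size is arbitrary and the snippet makes no claim to match the actual model's architecture.

```python
# Conceptual sketch (PyTorch): copy pretrained language embeddings into a
# larger table so new languages can be learned during fine-tuning.
# Dimensions and the checkpoint path are hypothetical.
import torch
import torch.nn as nn

old_langs = ["en", "pt", "fr"]
new_langs = old_langs + ["eu", "es", "ca", "gl"]   # Basque, Spanish, Catalan, Galician

emb_dim = 64
old_emb = nn.Embedding(len(old_langs), emb_dim)
# old_emb.load_state_dict(torch.load("ckpt_lang_emb.pt"))  # pretrained weights

new_emb = nn.Embedding(len(new_langs), emb_dim)
with torch.no_grad():
    new_emb.weight[: len(old_langs)] = old_emb.weight  # reuse known languages
    # rows for eu/es/ca/gl keep their random init and are fit in fine-tuning
```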

    Collision Avoidance on Unmanned Aerial Vehicles using Deep Neural Networks

    Unmanned Aerial Vehicles (UAVs), although hardly a new technology, have recently gained a prominent role in many industries, being widely used not only by enthusiastic consumers but also in highly demanding professional settings, and will have a massive societal impact over the coming years. However, the operation of UAVs carries serious safety risks, such as collisions with dynamic obstacles (birds, other UAVs, or randomly thrown objects). These collision scenarios are complex to analyze in real time, sometimes being computationally intractable for existing State of the Art (SoA) algorithms, which makes the use of UAVs an operational hazard and significantly reduces their commercial applicability in urban environments. In this work, a conceptual framework for both stand-alone and swarm (networked) UAVs is introduced, focusing on the architectural requirements of the collision avoidance subsystem needed to achieve acceptable levels of safety and reliability. First, the SoA principles for collision avoidance against stationary objects are reviewed. Afterward, a novel image processing approach that uses deep learning and optical flow is presented; it is capable of detecting potential collisions with dynamic objects and generating escape trajectories. Finally, novel combinations of models and algorithms were tested, providing a new approach to UAV collision avoidance using Deep Neural Networks. The feasibility of the proposed approach was demonstrated through experimental tests using a UAV built from scratch with the developed framework.
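
    A minimal sketch of the optical-flow ingredient of this approach, using OpenCV's dense Farneback flow, with mean flow magnitude as a crude proxy signal for fast relative motion; the heuristic and its parameters are illustrative assumptions, not the dissertation's algorithm.

```python
# Sketch: dense Farneback optical flow between consecutive grayscale
# frames; a large mean flow magnitude hints at fast relative motion of
# nearby objects. Thresholding/escape logic is left out deliberately.
import cv2
import numpy as np

def approaching_object_score(prev_gray: np.ndarray, gray: np.ndarray) -> float:
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, _ = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    return float(mag.mean())

# Tiny synthetic demo; real frames would come from the UAV camera
# (e.g. via cv2.VideoCapture).
prev = np.zeros((120, 160), np.uint8)
prev[40:80, 40:80] = 255
nxt = np.roll(prev, 5, axis=1)        # simulate the patch moving right
print(approaching_object_score(prev, nxt))
```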

    Accelerating scientific research in the digital era: intelligent assessment and retrieval of research content

    Efficient, effective, and timely access to the scientific literature is crucial for accelerating research and discovery. Nowadays, research articles are almost exclusively published in digital form and stored in digital libraries accessible over the Web. Using digital libraries to store scientific literature is advantageous, as it enables access to articles at any time and place; furthermore, digital libraries can leverage information management systems and artificial intelligence techniques to manage, retrieve, and analyze research content. Given the large size of these libraries and their fast growth, intelligent systems that can effectively retrieve and analyze research content are crucial for improving the productivity of researchers. In this thesis, we focus on improving literature search engines by addressing some of their limitations.

    One limitation of current literature search engines is that they mainly treat articles as the retrieval unit and do not support direct search for an article's elements, such as figures, tables, and formulas. In this thesis, we study how to let researchers access research collections through the figures of articles. Figures play an essential role in scientific communication, so literature systems can use them directly to facilitate and accelerate research. As a first step in this direction, we propose and study the novel task of figure retrieval from collections of research articles, where the goal is to retrieve figures using keyword queries. We focus on textual bag-of-words representations of queries and figures, and study the effectiveness of different retrieval models and of various ways to represent figures using text data. The empirical study shows the benefit of using multiple textual inputs to represent a figure and of combining different retrieval models; the results also shed light on the challenges of this novel task.

    Next, we address the limitations of the text-based bag-of-words representation of figures by proposing and studying a new view of representation, namely deep neural network-based distributed representations. Specifically, we use image data and text to learn figure representations with different model architectures and loss functions, to understand how sensitive the embeddings are to the learning approach and the features used. We also develop a novel weak supervision technique for training these networks that leverages the citation network of articles to generate large quantities of training examples. The experimental results show that figure representations learned with our weak supervision approach are effective and outperform bag-of-words representations and pre-trained neural networks.

    Current systems also offer minimal support for queries on which a search engine performs poorly due to ineffective formulation by the user. Poor-performing queries can occur when a researcher faces a new or fast-evolving topic, creating a significant vocabulary gap between the query and the relevant articles. We address this problem with a novel strategy for collaborative query construction, in which the search engine actively engages the user in an iterative process of query revision. We propose a specific implementation of this strategy in which the search engine and the user work together to expand a query: in every iteration, the system generates expansion terms, drawing on the user's history of interactions with it, which the user can add to the query to approach an "ideal query". Experimental results attest to the effectiveness of this approach in improving poor-performing queries with minimal user effort.

    The last limitation we address is that current systems usually rely on citation counts rather than content analysis for the quality assessment of articles. We study the task of automatic quality assessment of research articles, where the goal is to assess an article along aspects such as clarity, originality, and soundness. Automating quality assessment could improve literature systems, which could use the generated quality scores to support the search and analysis of research articles. Previous work applied supervised machine learning to this task, learning from examples of articles reviewed by humans. We study the effectiveness of topics as features and propose a novel strategy for constructing multi-view topical features. Experimental results show that such features compare favorably with deep neural network-based features and bag-of-words features.

    Finally, to facilitate further evaluation of the approaches proposed in this thesis with real users and realistic tasks, we developed AcademicExplorer, a general system that supports the retrieval and exploration of research articles using several new functions enabled by the proposed algorithms, such as exploring research collections using figure embeddings, sorting articles by automatically generated review scores, and interactive query formulation. As an open-source system, AcademicExplorer can help advance research, evaluation, and application development in this area.
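
    To make the figure-retrieval setup concrete, here is a minimal sketch of BM25 ranking over textual figure representations (caption plus in-text mentions), using the rank_bm25 package on a made-up two-figure corpus; the field names and text sources are assumptions for illustration, not the thesis's implementation.

```python
# Hedged sketch: represent each figure by concatenated text from several
# sources, tokenize into bags of words, and rank figures against a
# keyword query with BM25. The tiny corpus below is fabricated.
from rank_bm25 import BM25Okapi

figures = [
    {"caption": "accuracy vs training epochs", "mentions": "figure 2 shows convergence"},
    {"caption": "system architecture overview", "mentions": "figure 1 depicts the pipeline"},
]
docs = [(f["caption"] + " " + f["mentions"]).split() for f in figures]

bm25 = BM25Okapi(docs)
query = "training convergence curve".split()
print(bm25.get_scores(query))   # higher score = better-matching figure
```

    Concatenating multiple textual inputs per figure mirrors the abstract's finding that combining several text sources for a figure representation helps retrieval.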

    Towards Developing Computer Vision Algorithms and Architectures for Real-world Applications

    Computer vision technology automatically extracts high-level, meaningful information from visual data such as images or videos, and object recognition and detection algorithms are essential in most computer vision applications. In this dissertation, we focus on algorithms for real-life computer vision applications, presenting innovative algorithms for object segmentation and feature extraction for object and action recognition in video data, sparse feature selection algorithms for medical image analysis, and automated feature extraction using convolutional neural networks for blood cancer grading.

    To detect and classify objects in video, the objects have to be separated from the background, and discriminant features are then extracted from the region of interest before being fed to a classifier. Effective object segmentation and feature extraction are often application specific and pose major challenges for object detection and classification tasks. We present an effective flow-based ROI generation algorithm for segmenting moving objects in video data, applicable in surveillance and self-driving vehicles. Optical flow can also serve as a feature for human action recognition, and we show that feeding optical flow features into a pre-trained convolutional neural network improves the performance of human action recognition algorithms. Both algorithms outperformed the state of the art at the time.

    Medical images and videos pose unique challenges for image understanding, mainly because tissues and cells are often irregularly shaped, colored, and textured, and hand-selecting the most discriminant features is difficult; an automated feature selection method is therefore desired. Sparse learning is a technique to extract the most discriminant and representative features from raw visual data. However, sparse learning with L1 regularization only takes sparsity in the feature dimension into consideration; we improve the algorithm so that it also selects the type of features, entirely removing less important or noisy feature types from the feature set. We demonstrate this algorithm on endoscopy images to detect unhealthy abnormalities of the esophagus and stomach, such as ulcers and cancer. Besides the sparsity constraint, other application-specific constraints and prior knowledge may also need to be incorporated into the loss function to obtain the desired results. We show how to incorporate a similar-inhibition constraint and gaze and attention priors in sparse dictionary selection for gastroscopic video summarization, enabling intelligent key-frame extraction from gastroscopic video data. With recent advances in multi-layer neural networks, automatic end-to-end feature learning has become feasible. Convolutional neural networks mimic the mammalian visual cortex and can extract the most discriminant features automatically from training samples. We present a convolutional neural network with a hierarchical classifier to grade the severity of follicular lymphoma, a type of blood cancer; it reaches 91% accuracy, on par with analysis by expert pathologists.

    Developing real-world computer vision applications is more than developing core vision algorithms to extract and understand information from visual data; it is also subject to many practical requirements and constraints, such as hardware and computing infrastructure, cost, robustness to lighting changes and deformation, and ease of use and deployment. The processing pipelines and system architectures of computer vision applications share many design principles. We developed common processing components and a generic framework for computer vision applications, along with a versatile scale-adaptive template matching algorithm for object detection. We demonstrate these design principles and best practices by developing and deploying a complete computer vision application, a multi-channel water level monitoring system, whose techniques and design methodology generalize to other real-life applications. General software engineering principles, such as modularity, abstraction, robustness to requirement changes, and generality, are all demonstrated in this research.

    Doctoral Dissertation, Computer Science, 201
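
    A minimal sketch of scale-adaptive template matching in the spirit of the versatile matcher mentioned above: run normalized cross-correlation at several template scales and keep the best response. The scale grid and demo data are arbitrary choices, not the dissertation's actual algorithm.

```python
# Sketch: multi-scale template matching with OpenCV. Resize the template
# over a scale grid, correlate each against the image, keep the best hit.
import cv2
import numpy as np

def match_multiscale(image_gray, template_gray, scales=np.linspace(0.5, 1.5, 11)):
    best = (-1.0, None, 1.0)              # (score, top-left location, scale)
    for s in scales:
        t = cv2.resize(template_gray, None, fx=s, fy=s)
        if t.shape[0] > image_gray.shape[0] or t.shape[1] > image_gray.shape[1]:
            continue                       # scaled template exceeds image size
        res = cv2.matchTemplate(image_gray, t, cv2.TM_CCOEFF_NORMED)
        _, max_val, _, max_loc = cv2.minMaxLoc(res)
        if max_val > best[0]:
            best = (max_val, max_loc, s)
    return best

# Demo on synthetic data: a shrunken crop of the image should still match.
img = np.random.default_rng(0).integers(0, 255, (100, 100), dtype=np.uint8)
tmpl = cv2.resize(img[30:60, 30:60], None, fx=0.8, fy=0.8)
score, loc, scale = match_multiscale(img, tmpl)
print(score, loc, scale)
```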

    The Artificial Intelligence in Digital Pathology and Digital Radiology: Where Are We?

    This book is a reprint of the Special Issue entitled "The Artificial Intelligence in Digital Pathology and Digital Radiology: Where Are We?". Artificial intelligence is extending into the world of both digital radiology and digital pathology, and involves many scholars in the areas of biomedicine, technology, and bioethics. There is a particular need for scholars to focus on both the innovations in this field and the problems hampering their integration into a robust and effective process within stable health care models. Many professionals involved in these fields of digital health were encouraged to contribute their experiences, and this book contains contributions from experts across different fields. Aspects of integration in the health domain are addressed, with particular space dedicated to overviewing the challenges, opportunities, and problems in both radiology and pathology. In-depth clinical contributions cover cardiology, the histopathology of breast cancer, and colonoscopy. Dedicated studies, based on surveys, investigated students' and insiders' opinions, attitudes, and self-perception regarding the integration of artificial intelligence in this field.