4 research outputs found

    Ant colony optimization on runtime reconfigurable architectures

    Get PDF

    Neural Representations of Concepts and Texts for Biomedical Information Retrieval

    Get PDF
    Information retrieval (IR) methods are an indispensable tool in the current landscape of exponentially increasing textual data, especially on the Web. A typical IR task involves fetching and ranking a set of documents (from a large corpus) in terms of relevance to a user\u27s query, which is often expressed as a short phrase. IR methods are the backbone of modern search engines where additional system-level aspects including fault tolerance, scale, user interfaces, and session maintenance are also addressed. In addition to fetching documents, modern search systems may also identify snippets within the documents that are potentially most relevant to the input query. Furthermore, current systems may also maintain preprocessed structured knowledge derived from textual data as so called knowledge graphs, so certain types of queries that are posed as questions can be parsed as such; a response can be an output of one or more named entities instead of a ranked list of documents (e.g., what diseases are associated with EGFR mutations? ). This refined setup is often termed as question answering (QA) in the IR and natural language processing (NLP) communities. In biomedicine and healthcare, specialized corpora are often at play including research articles by scientists, clinical notes generated by healthcare professionals, consumer forums for specific conditions (e.g., cancer survivors network), and clinical trial protocols (e.g., www.clinicaltrials.gov). Biomedical IR is specialized given the types of queries and the variations in the texts are different from that of general Web documents. For example, scientific articles are more formal with longer sentences but clinical notes tend to have less grammatical conformity and are rife with abbreviations. There is also a mismatch between the vocabulary of consumers and the lingo of domain experts and professionals. Queries are also different and can range from simple phrases (e.g., COVID-19 symptoms ) to more complex implicitly fielded queries (e.g., chemotherapy regimens for stage IV lung cancer patients with ALK mutations ). Hence, developing methods for different configurations (corpus, query type, user type) needs more deliberate attention in biomedical IR. Representations of documents and queries are at the core of IR methods and retrieval methodology involves coming up with these representations and matching queries with documents based on them. Traditional IR systems follow the approach of keyword based indexing of documents (the so called inverted index) and matching query phrases against the document index. It is not difficult to see that this keyword based matching ignores the semantics of texts (synonymy at the lexeme level and entailment at phrase/clause/sentence levels) and this has lead to dimensionality reduction methods such as latent semantic indexing that generally have scale-related concerns; such methods also do not address similarity at the sentence level. Since the resurgence of neural network methods in NLP, the IR field has also moved to incorporate advances in neural networks into current IR methods. This dissertation presents four specific methodological efforts toward improving biomedical IR. Neural methods always begin with dense embeddings for words and concepts to overcome the limitations of one-hot encoding in traditional NLP/IR. In the first effort, we present a new neural pre-training approach to jointly learn word and concept embeddings for downstream use in applications. In the second study, we present a joint neural model for two essential subtasks of information extraction (IE): named entity recognition (NER) and entity normalization (EN). Our method detects biomedical concept phrases in texts and links them to the corresponding semantic types and entity codes. These first two studies provide essential tools to model textual representations as compositions of both surface forms (lexical units) and high level concepts with potential downstream use in QA. In the third effort, we present a document reranking model that can help surface documents that are likely to contain answers (e.g, factoids, lists) to a question in a QA task. The model is essentially a sentence matching neural network that learns the relevance of a candidate answer sentence to the given question parametrized with a bilinear map. In the fourth effort, we present another document reranking approach that is tailored for precision medicine use-cases. It combines neural query-document matching and faceted text summarization. The main distinction of this effort from previous efforts is to pivot from a query manipulation setup to transforming candidate documents into pseudo-queries via neural text summarization. Overall, our contributions constitute nontrivial advances in biomedical IR using neural representations of concepts and texts

    Elements of Ion Linear Accelerators, Calm in The Resonances, Other_Tales

    Full text link
    The main part of this book, Elements of Linear Accelerators, outlines in Part 1 a framework for non-relativistic linear accelerator focusing and accelerating channel design, simulation, optimization and analysis where space charge is an important factor. Part 1 is the most important part of the book; grasping the framework is essential to fully understand and appreciate the elements within it, and the myriad application details of the following Parts. The treatment concentrates on all linacs, large or small, intended for high-intensity, very low beam loss, factory-type application. The Radio-Frequency-Quadrupole (RFQ) is especially developed as a representative and the most complicated linac form (from dc to bunched and accelerated beam), extending to practical design of long, high energy linacs, including space charge resonances and beam halo formation, and some challenges for future work. Also a practical method is presented for designing Alternating-Phase- Focused (APF) linacs with long sequences and high energy gain. Full open-source software is available. The following part, Calm in the Resonances and Other Tales, contains eyewitness accounts of nearly 60 years of participation in accelerator technology. (September 2023) The LINACS codes are released at no cost and, as always,with fully open-source coding. (p.2 & Ch 19.10)Comment: 652 pages. Some hundreds of figures - all images, there is no data in the figures. (September 2023) The LINACS codes are released at no cost and, as always,with fully open-source coding. (p.2 & Ch 19.10

    Clustering using Self Organized Map on Rmesh

    No full text
    Clustering finds its applications in image processing, business, and bioinformatics. In this paper two Self Organized map algorithms on Reconfigurable mesh with buses will be presented. These algorithms use M×K PEs and M×N×K PEs. The former has time complexity of 0(logM)for process one pattern. The latter has time complexity of O(logMN) for processing N patterns
    corecore