395 research outputs found

    Top K Relevant Passage Retrieval for Biomedical Question Answering

    Question answering (QA) is the task of answering factoid questions using a large collection of documents. It aims to provide precise answers to a user's natural-language questions. QA relies on efficient passage retrieval to select candidate contexts, where traditional sparse vector space models, such as TF-IDF or BM25, are the de facto method. On the web, no single article provides every possible answer to a user's question. The existing Dense Passage Retrieval (DPR) model was trained on the Wikipedia dump from Dec. 20, 2018, as the source documents for answering questions. QA has made big strides with several open-domain and machine comprehension systems built using large-scale annotated datasets. However, in the clinical domain, this problem remains relatively unexplored. According to multiple surveys, biomedical questions cannot be answered correctly from Wikipedia articles. In this work, we adapt the existing DPR framework to the biomedical domain and retrieve answers from PubMed articles, which are a reliable source for answering medical questions. When evaluated on a BioASQ QA dataset, our fine-tuned dense retriever achieves a 0.81 F1 score. Comment: 6 pages, 5 figures. arXiv admin note: text overlap with arXiv:2004.04906 by other authors
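To make the sparse baseline mentioned in this abstract concrete, here is a minimal sketch of top-k passage retrieval with BM25, written from the standard formula. It is not the paper's dense retriever. The function name `bm25_top_k`, the whitespace tokenizer, the toy passages, and the parameter defaults `k1=1.5` and `b=0.75` are all assumptions made for illustration.

```python
import math
from collections import Counter

def bm25_top_k(query, passages, k=2, k1=1.5, b=0.75):
    """Rank passages against a query with BM25 and return the top k as (index, score)."""
    # Naive lowercase whitespace tokenization (an assumption; real systems use better tokenizers).
    docs = [p.lower().split() for p in passages]
    n_docs = len(docs)  # assumes at least one passage
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many passages each term appears.
    doc_freq = Counter(term for d in docs for term in set(d))

    scored = []
    for idx, doc in enumerate(docs):
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed inverse document frequency (always positive in this variant).
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scored.append((idx, score))

    # Highest score first; ties broken by passage index.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]

passages = [
    "aspirin reduces fever and relieves mild headache pain",
    "insulin regulates blood glucose levels in the body",
    "the capital of france is paris",
]
for idx, score in bm25_top_k("what relieves headache pain", passages):
    print(idx, round(score, 3), passages[idx])
```

A dense retriever replaces the term-matching score with an inner product between learned question and passage embeddings, but the top-k selection step has the same shape.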

    LANGUAGE MODELS FOR RARE DISEASE INFORMATION EXTRACTION: EMPIRICAL INSIGHTS AND MODEL COMPARISONS

    End-to-end relation extraction (E2ERE) is a crucial task in natural language processing (NLP) that involves identifying and classifying semantic relationships between entities in text. This thesis compares three paradigms for E2ERE in biomedicine, focusing on rare diseases with discontinuous and nested entities. We evaluate Named Entity Recognition (NER) to Relation Extraction (RE) pipelines, sequence-to-sequence models, and generative pre-trained transformer (GPT) models using the RareDis information extraction dataset. Our findings indicate that pipeline models are the most effective, followed closely by sequence-to-sequence models. GPT models, despite having eight times as many parameters, perform worse than sequence-to-sequence models and lag significantly behind pipeline models. Our results also hold for a second E2ERE dataset for chemical-protein interactions.

    An Ethereum-based Product Identification System for Anti-counterfeits

    Fake products are items that are marketed and sold as genuine, high-quality products but are counterfeit or low-quality knockoffs. These products are often designed to closely mimic the appearance and branding of the genuine product to deceive consumers into thinking they are purchasing the real thing. Fake products can range from clothing and accessories to electronics and other goods and can be found in a variety of settings, including online marketplaces and brick-and-mortar stores. Blockchain technology can be used to help detect fake products in a few different ways. One of the most common ways is through the use of smart contracts, which are self-executing contracts with the terms of the agreement between buyer and seller written directly into lines of code. This allows for a high level of transparency and traceability in supply chain transactions, making it easier to identify and prevent the sale of fake products. Another way is the use of unique product identifiers, such as serial numbers or QR codes, that are recorded on the blockchain. This allows consumers to easily verify the authenticity of a product by scanning the code and checking it against the information recorded on the blockchain. In this study, we use smart contracts to detect fake products and evaluate each implementation based on gas cost and ether used. Comment: 5 pages, 5 figures

    CSSXC: Context-sensitive Sanitization Framework for Web Applications against XSS Vulnerabilities in Cloud Environments

    This paper presents a context-sensitive, sanitization-based XSS defensive framework for the cloud environment. It discovers all the hidden injection points in HTML5-based web applications deployed on cloud platforms and sanitizes the XSS attack payloads injected at such points in a context-sensitive manner. Identifying these injection points lets our technique retrieve each possible web page of the application, allowing a wider exploration and accelerating the process of applying the sanitizers to the untrusted variables of the web application. The XSS attack mitigation capability of our framework was evaluated on web applications deployed for cloud users in the cloud environment. The experimental results reveal that this technique detects XSS attack payloads with a minimal rate of false negatives and low runtime overhead.
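The idea of context-sensitive sanitization can be illustrated with a small sketch: the same untrusted value is escaped differently depending on where it lands in the page. This is not the CSSXC framework itself. The function `sanitize` and its four context names are hypothetical, and the escaping rules shown are a simplified illustration rather than a complete XSS defense.

```python
import html
import json
from urllib.parse import quote

def sanitize(value: str, context: str) -> str:
    """Escape an untrusted value according to the output context it will be placed in."""
    if context == "html_body":
        # Text between tags: neutralize &, < and >.
        return html.escape(value, quote=False)
    if context == "html_attr":
        # Inside a quoted attribute value: also escape double and single quotes.
        return html.escape(value, quote=True)
    if context == "url_param":
        # A query-string component: percent-encode everything outside the unreserved set.
        return quote(value, safe="")
    if context == "js_string":
        # A JavaScript string literal inside an inline script block: emit a quoted
        # literal, then hide <, > and & so that "</script>" cannot end the block.
        literal = json.dumps(value)
        return (literal.replace("<", "\\u003c")
                       .replace(">", "\\u003e")
                       .replace("&", "\\u0026"))
    raise ValueError(f"unknown context: {context}")

payload = '"><script>alert(1)</script>'
for ctx in ("html_body", "html_attr", "url_param", "js_string"):
    print(f"{ctx:10s} {sanitize(payload, ctx)}")
```

The point of dispatching on context is that an escaper that is safe in one position, such as HTML body text, can leave a payload live in another, such as an attribute value or a script block.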

    CIMTDetect: A Community Infused Matrix-Tensor Coupled Factorization Based Method for Fake News Detection

    Detecting whether a news article is fake or genuine is a crucial task in today's digital world, where it is easy to create and spread a misleading news article. This is especially true of news stories shared on social media, since they do not undergo the stringent journalistic checking associated with mainstream media. Given the inherent human tendency to share information with social connections at a mouse-click, fake news articles masquerading as real ones tend to spread widely and virally. The presence of echo chambers (people sharing the same beliefs) in social networks only adds to the problem of widespread fake news on social media. In this paper, we tackle the problem of fake news detection on social media by exploiting the very presence of echo chambers within the social network of users to obtain an efficient and informative latent representation of the news article. By modeling the echo chambers as closely connected communities within the social network, we represent a news article as a 3-mode tensor of the structure - and propose a tensor factorization based method to encode the news article in a latent embedding space preserving the community structure. We also propose an extension of the above method, which jointly models the community and content information of the news article through a coupled matrix-tensor factorization framework. We empirically demonstrate the efficacy of our method for the task of Fake News Detection over two real-world datasets. Further, we validate the generalization of the resulting embeddings over two other auxiliary tasks, namely: 1) News Cohort Analysis and 2) Collaborative News Recommendation. Our proposed method outperforms appropriate baselines for both tasks, establishing its generalization. Comment: Presented at ASONAM'1

    Quantum entropy expansion using n-qubit permutation matrices in Galois field

    Random numbers are critical for any cryptographic application. However, the data that is flowing through the internet is not secure because of entropy-deprived pseudo-random number generators and unencrypted IoT devices. In this work, we address the issue of the low entropy of several data formats. Specifically, we use the large information space associated with the n-qubit permutation matrices to expand the entropy of any data without increasing the size of the data. We take English text with entropy in the range of 4 to 5 bits per byte. We manipulate the data using a set of n-qubit (n ≤ 10) permutation matrices and observe the expansion of the entropy in the manipulated data (to more than 7.9 bits per byte). We also observe similar behaviour with other data formats such as image and audio (n ≤ 15).
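The "bits per byte" figures in this abstract can be read as Shannon entropy over byte values, which ranges from 0 to 8. The sketch below assumes that reading; the abstract does not state how the authors computed entropy, and the function name `entropy_bits_per_byte` and the sample inputs are illustrative. It does not implement the permutation-matrix expansion itself.

```python
import math
from collections import Counter

def entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte-value distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # iterating over bytes yields integers 0..255
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Repetitive English text uses few distinct byte values, so its entropy is low.
text = b"Random numbers are critical for any cryptographic application. " * 50
print(f"English-like text: {entropy_bits_per_byte(text):.2f} bits per byte")

# Every byte value equally often: the maximum of 8 bits per byte.
uniform = bytes(range(256)) * 16
print(f"Uniform bytes:     {entropy_bits_per_byte(uniform):.2f} bits per byte")
```

Under this measure, a value close to 8 means the byte histogram is nearly flat, which is the sense in which the manipulated data in the abstract approaches the maximum.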

    Cryptanalysis of quantum permutation pad

    Cryptanalysis increases the level of confidence in cryptographic algorithms. We analyze the security of a symmetric cryptographic algorithm, the quantum permutation pad (QPP) [8]. We found instances in which the ciphertext is the same as the plaintext even after the action of QPP, with probability 1/N when the entire set of permutation matrices of dimension N is used and with probability 1/N^m when an incomplete set of m permutation matrices of dimension N is used. We visually show such instances in a cipher image created by QPP with 256 permutation matrices of different dimensions. For any practical usage of QPP, we recommend a set of 256 permutation matrices of dimension greater than or equal to 2048. Comment: 7 pages, 1 figure, comments are welcome
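The 1/N figure for the full set of permutations can be checked by brute force on small cases: among all N! permutations of N symbols, the fraction that map a given symbol to itself is (N-1)!/N! = 1/N. The sketch below is a toy model of that counting argument only, not of QPP or of the 1/N^m case; the function name `fixed_point_probability` is illustrative.

```python
from fractions import Fraction
from itertools import permutations

def fixed_point_probability(n: int, symbol: int = 0) -> Fraction:
    """Fraction of all n! permutations of range(n) that map `symbol` to itself."""
    total = 0
    fixed = 0
    for perm in permutations(range(n)):
        total += 1
        # perm[i] is the image of i; a fixed point leaves the symbol unchanged.
        if perm[symbol] == symbol:
            fixed += 1
    return Fraction(fixed, total)

# Exhaustive enumeration is only feasible for small n because n! grows quickly.
for n in range(2, 7):
    print(n, fixed_point_probability(n))
```

Exact fractions are used instead of floats so that each result can be compared directly with 1/N.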