173,973 research outputs found

    A Cognitive Study on Semantic Similarity Analysis of Large Corpora: A Transformer-based Approach

    Full text link
    Semantic similarity analysis and modeling is a fundamentally acclaimed task in many pioneering applications of natural language processing today. Owing to the sensation of sequential pattern recognition, many neural networks like RNNs and LSTMs have achieved satisfactory results in semantic similarity modeling. However, these solutions are considered inefficient due to their inability to process information in a non-sequential manner, thus leading to the improper extraction of context. Transformers function as the state-of-the-art architecture due to their advantages like non-sequential data processing and self-attention. In this paper, we perform semantic similarity analysis and modeling on the U.S Patent Phrase to Phrase Matching Dataset using both traditional and transformer-based techniques. We experiment upon four different variants of the Decoding Enhanced BERT - DeBERTa and enhance its performance by performing K-Fold Cross-Validation. The experimental results demonstrate our methodology's enhanced performance compared to traditional techniques, with an average Pearson correlation score of 0.79

    Improving Text Semantic Similarity Modeling through a 3D Siamese Network

    Full text link
    Siamese networks have gained popularity as a method for modeling text semantic similarity. Traditional methods rely on pooling operation to compress the semantic representations from Transformer blocks in encoding, resulting in two-dimensional semantic vectors and the loss of hierarchical semantic information from Transformer blocks. Moreover, this limited structure of semantic vectors is akin to a flattened landscape, which restricts the methods that can be applied in downstream modeling, as they can only navigate this flat terrain. To address this issue, we propose a novel 3D Siamese network for text semantic similarity modeling, which maps semantic information to a higher-dimensional space. The three-dimensional semantic tensors not only retains more precise spatial and feature domain information but also provides the necessary structural condition for comprehensive downstream modeling strategies to capture them. Leveraging this structural advantage, we introduce several modules to reinforce this 3D framework, focusing on three aspects: feature extraction, attention, and feature fusion. Our extensive experiments on four text semantic similarity benchmarks demonstrate the effectiveness and efficiency of our 3D Siamese Network

    ARCHITECTURE, MODELS, AND ALGORITHMS FOR TEXTUAL SIMILARITY

    Get PDF
    Identifying similar pieces of texts remains one of the fundamental problems in computational linguistics. This dissertation focuses on the textual similarity measurement and identification problem by studying a variety of major tasks that share common properties, and presents our efforts to address 7 closely-related similarity tasks given over 20 public benchmarks, including paraphrase identification, answer selection for question answering, pairwise learning to rank, monolingual/cross-lingual semantic textual similarity measurement, insight extraction on biomedical literature, and high performance cross-lingual pattern matching for machine translation on GPUs. We investigate how to make textual similarity measurement more accurate with deep neural networks. Traditional approaches are either based on feature engineering which leads to disconnected solutions, or the Siamese architecture which treats inputs independently, utilizes single representation view and straightforward similarity comparison. In contrast, we focus on modeling stronger interactions between inputs and develop interaction-based neural modeling that explicitly encodes the alignments of input words or aggregated sentence representations into our models. As a result, our multiple deep neural networks show highly competitive performance on many textual similarity measurement public benchmarks we evaluated. Our multi-perspective convolutional neural networks (MPCNN) uses a multiplicity of perspectives to process input sentences with multiple parallel convolutional neural networks, is able to extract salient sentence-level features automatically at multiple granularities with different types of pooling. Our novel structured similarity layer encourages stronger input interactions by comparing local regions of both sentence representations. This model is the first example of our interaction-based neural modeling. We also provide an attention-based input interaction layer on top of the MPCNN model. The input interaction layer models a closer relationship of input words by converting two separate sentences into an inter-related sentence pair. This layer utilizes the attention mechanism in a straightforward way, and is another example of our interaction-based neural modeling. We then provide our pairwise word interaction model with very deep neural networks (PWI). This model directly encodes input word interactions with novel pairwise word interaction modeling and a novel similarity focus layer. The use of very deep architecture in this model is the first example in NLP domain for better textual similarity modeling. Our PWI model outperforms the Siamese architecture and feature engineering approach on multiple tasks, and is another example of our interaction-based neural modeling. We also focus on the question answering task with a pairwise ranking approach. Unlike traditional pointwise approach of the task, our pairwise ranking approach with the use of negative sampling focuses on modeling interactions between two pairs of question and answer inputs, then learns a relative order of the pairs to predict which answer is more relevant to the question. We demonstrate its high effectiveness against competitive previous pointwise baselines. For the insight extraction on biomedical literature task, we develop neural networks with similarity modeling for better causality/correlation relation extraction, as we convert the extraction task into a similarity measurement task. Our approach innovates in that it explicitly models the interactions among the trio: named entities, entity relations and contexts, and then measures both relational and contextual similarity among them, finally integrate both similarity evaluations into considerations for insight extraction. We also build an end-to-end system to extract insights, with human evaluations we show our system is able to extract insights with high human acceptance accuracy. Lastly, we explore how to exploit massive parallelism offered by modern GPUs for high-efficiency pattern matching. We take advantage of GPU hardware advances and develop a massive parallelism approach. We firstly work on phrase-based SMT, where we enable phrase lookup and extraction on suffix arrays to be massively parallelized and vastly many queries to be carried out in parallel. We then work on computationally expensive hierarchical SMT model, which requires matching grammar patterns that contain ''gaps''. In order to get high efficiency for the similarity identification task on GPUs, we show developing massively parallel algorithms on GPUs is the most important approach to fully utilize GPU's raw processing power, and developing compact data structures on GPUs is helpful to lower GPU's memory latency. Compared to a highly-optimized, state-of-the-art multi-threaded CPU implementation, our techniques achieve orders of magnitude improvement in terms of throughput

    On Quantifying Qualitative Geospatial Data: A Probabilistic Approach

    Full text link
    Living in the era of data deluge, we have witnessed a web content explosion, largely due to the massive availability of User-Generated Content (UGC). In this work, we specifically consider the problem of geospatial information extraction and representation, where one can exploit diverse sources of information (such as image and audio data, text data, etc), going beyond traditional volunteered geographic information. Our ambition is to include available narrative information in an effort to better explain geospatial relationships: with spatial reasoning being a basic form of human cognition, narratives expressing such experiences typically contain qualitative spatial data, i.e., spatial objects and spatial relationships. To this end, we formulate a quantitative approach for the representation of qualitative spatial relations extracted from UGC in the form of texts. The proposed method quantifies such relations based on multiple text observations. Such observations provide distance and orientation features which are utilized by a greedy Expectation Maximization-based (EM) algorithm to infer a probability distribution over predefined spatial relationships; the latter represent the quantified relationships under user-defined probabilistic assumptions. We evaluate the applicability and quality of the proposed approach using real UGC data originating from an actual travel blog text corpus. To verify the quality of the result, we generate grid-based maps visualizing the spatial extent of the various relations
    • …
    corecore