196 research outputs found

    GLIN: A Lightweight Learned Indexing Mechanism for Complex Geometries

    Although spatial index structures shorten query response time, they rely on complex tree structures to narrow down the search space. Such structures in turn incur additional storage overhead and take a toll on index maintenance. Recently, there has been a flurry of works attempting to leverage machine learning (ML) models to simplify index structures. Some follow-up works extend the idea to support geospatial point data. These approaches partition the multidimensional space into cells and assign IDs to these cells using a space-filling curve (e.g., the Z-order curve) or mathematical equations. They work well for geospatial points but cannot handle complex geometries such as polygons and trajectories, which are widely available in geospatial data. This paper introduces GLIN, a lightweight learned index for spatial range queries on complex geometries. To achieve that, GLIN transforms geometries to Z-address intervals and builds a hierarchical model to learn the cumulative distribution function between these intervals and the record positions. The lightweight hierarchical model greatly shortens the index probing time. Furthermore, GLIN augments spatial query windows using an add-on function to guarantee query accuracy for both Contains and Intersects spatial relationships. Our experiments on real-world and synthetic datasets show that GLIN incurs 40-70 times less storage overhead than popular spatial indexes such as Quad-Tree while still showing similar query response time for medium-selectivity queries. Moreover, GLIN's maintenance speed is around 1.5 times higher on insertion and 3-5 times higher on deletion.
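
    The mapping from a geometry to a Z-address interval mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each geometry is summarized by its bounding box, already snapped to a non-negative integer grid, and that the interval endpoints are the Z-order (Morton) codes of the box's minimum and maximum corners. The function names `interleave_bits` and `z_interval` are hypothetical and are not taken from GLIN; the paper's actual transformation may differ.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order (Morton) code."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)      # bits of x go to even positions
        z |= ((y >> i) & 1) << (2 * i + 1)  # bits of y go to odd positions
    return z


def z_interval(min_x: int, min_y: int, max_x: int, max_y: int, bits: int = 16):
    """Map a grid-aligned bounding box to a (z_min, z_max) interval.

    Assumption for illustration: the interval is defined by the Z-codes of
    the box's minimum and maximum corners.
    """
    return (interleave_bits(min_x, min_y, bits),
            interleave_bits(max_x, max_y, bits))


if __name__ == "__main__":
    lo, hi = z_interval(1, 1, 3, 5)
    print(lo, hi)
```

    Because the Morton code is monotone in each coordinate, every grid cell inside the box has a code between the two endpoints, so the interval is a conservative (possibly loose) cover of the geometry.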

    Continuous Prompt Tuning Based Textual Entailment Model for E-commerce Entity Typing

    The explosion of e-commerce has created a need to process and analyze product titles, for example through entity typing. However, rapid activity in e-commerce leads to the rapid emergence of new entities, which general entity typing handles poorly. In addition, product titles in e-commerce differ greatly in language style from general-domain text. To handle new entities in product titles and address their distinctive language style, we propose a textual entailment model with continuous prompt tuning based hypotheses and fusion embeddings for e-commerce entity typing. First, we reformulate the entity typing task as a textual entailment problem to handle new entities that are not present during training. Second, we design a model that automatically generates textual entailment hypotheses using a continuous prompt tuning method, which produces better hypotheses without manual design. Third, we use fusion embeddings that combine BERT and CharacterBERT embeddings, followed by a two-layer MLP classifier, to address the difference in language style between e-commerce product titles and general-domain text. To analyze the effect of each contribution, we compare the performance of the entity typing and textual entailment models, and conduct ablation studies on continuous prompt tuning and fusion embeddings. We also evaluate the impact of different prompt template initializations for continuous prompt tuning. We show that our proposed model improves the average F1 score by around 2% compared to the baseline BERT entity typing model.

    M&A goodwill, information asymmetry and stock price crash risk

    The collapse of stock prices has a huge negative impact on financial markets and the real economy, and the mechanisms and prevention of stock market crashes have become a focus of academic attention. This article takes Chinese A-share listed companies from 2008 to 2016 as its sample and investigates the impact of M&A goodwill on the risk of stock price crashes. The study finds that, compared with non-goodwill companies, companies with goodwill have a greater risk of future stock price crashes; as goodwill value (GW) increases, the risk of future stock price crashes increases significantly. Further research shows that GW affects the risk of stock price crashes through information asymmetry at the corporate and market levels. This article not only deepens research on the factors influencing the risk of stock price crashes, but also helps in understanding the role of M&A goodwill in the capital market, how to prevent stock price crashes, and how to promote the orderly development of the capital market.

    A Game-Theoretic Approach for Improving Generalization Ability of TSP Solvers

    In this paper, we introduce a two-player zero-sum framework between a trainable Solver and a Data Generator to improve the generalization ability of deep learning-based solvers for the Traveling Salesman Problem (TSP). Grounded in Policy Space Response Oracle (PSRO) methods, our two-player framework outputs a population of best-responding Solvers, over which we can mix and output a combined model that achieves the least exploitability against the Generator, and thereby the most generalizable performance on different TSP tasks. We conduct experiments on a variety of TSP instances with different types and sizes. Results suggest that our Solvers achieve state-of-the-art performance even on tasks the Solver has never encountered, whilst the performance of other deep learning-based Solvers drops sharply due to over-fitting. To demonstrate the principle of our framework, we study the learning outcome of the proposed two-player game and demonstrate that the exploitability of the Solver population decreases during training, and that it eventually approximates the Nash equilibrium along with the Generator. Comment: ICLR2022 Gamification and Multiagent Solutions Workshop Spotlight Presentation
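
    The notion of exploitability used in the abstract above can be illustrated on a toy two-player zero-sum matrix game. The sketch below is not the paper's Solver/Generator setup: it assumes a small payoff matrix given from the row player's point of view and defines exploitability as the sum of both players' best-response gains (some works halve this quantity). The function name `exploitability` and the rock-paper-scissors matrix are illustrative choices only.

```python
from typing import List, Sequence


def exploitability(payoff: Sequence[Sequence[float]],
                   p: Sequence[float],
                   q: Sequence[float]) -> float:
    """Exploitability of the mixed-strategy pair (p, q) in a zero-sum matrix game.

    payoff[i][j] is the row player's payoff when row i meets column j.
    The value returned is max_i (A q)_i - min_j (p^T A)_j, which is zero
    exactly at a Nash equilibrium and positive otherwise.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Expected payoff of each pure row strategy against the column mix q.
    row_values: List[float] = [
        sum(payoff[i][j] * q[j] for j in range(n_cols)) for i in range(n_rows)
    ]
    # Expected payoff (to the row player) of each pure column against the row mix p.
    col_values: List[float] = [
        sum(p[i] * payoff[i][j] for i in range(n_rows)) for j in range(n_cols)
    ]
    # Row player's best response maximizes; column player's best response minimizes.
    return max(row_values) - min(col_values)


if __name__ == "__main__":
    rps = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # rock, paper, scissors
    uniform = [1 / 3, 1 / 3, 1 / 3]
    print(exploitability(rps, uniform, uniform))    # equilibrium pair
    print(exploitability(rps, [1, 0, 0], uniform))  # always playing rock
```

    In this toy game the uniform strategy pair is the Nash equilibrium, so its exploitability is zero, while a row player who always plays rock can be exploited by a column player who switches to paper.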

    Subconscious processing reveals dissociable contextual modulations of visual size perception

    Visual size perception is highly context-dependent. In a series of experiments reported here, we demonstrated that the contextual modulation of visual size processing could occur independently of conscious awareness. Specifically, the Ebbinghaus illusion, which is mediated by lateral connections within the early visual processing stream, persisted even when the surrounding inducers were rendered invisible. Moreover, when the central target was initially interocularly suppressed, the identical target emerged from suppression faster when surrounded by small relative to large inducers, with the suppression time difference well predicted by the strength of the illusion. By contrast, there were no such subconscious contextual modulation effects associated with the Ponzo illusion, which largely relies on feedback projections to the early visual cortices. These results indicate that contextual information can modulate visual size perception without conscious awareness, and the dissociated modulation effects further suggest that subconscious contextual modulation takes place in the early visual processing stream and is largely independent of high-level feedback influences.

    Prioritization of risk genes for Alzheimer’s disease: an analysis framework using spatial and temporal gene expression data in the human brain based on support vector machine

    Background: Alzheimer’s disease (AD) is a complex disorder, and its risk is influenced by multiple genetic and environmental factors. In this study, an AD risk gene prediction framework based on spatial and temporal features of gene expression data (STGE) was proposed. Methods: We proposed an AD risk gene prediction framework based on spatial and temporal features of gene expression data. The gene expression data of providers of different tissues and ages were used as model features. Human genes were classified as AD risk or non-risk sets based on information extracted from relevant databases. Support vector machine (SVM) models were constructed to capture the expression patterns of genes believed to contribute to the risk of AD. Results: The recursive feature elimination (RFE) method was utilized for feature selection. Data for 64 tissue-age features were obtained before feature selection, and this number was reduced to 19 after RFE was performed. The SVM models were built and evaluated using 19 selected and full features. The area under curve (AUC) values for the SVM model based on 19 selected features (0.740 [0.690–0.790]) and full feature sets (0.730 [0.678–0.769]) were very similar. Fifteen genes predicted to be risk genes for AD with a probability greater than 90% were obtained. Conclusion: The newly proposed framework performed comparably to previous prediction methods based on protein-protein interaction (PPI) network properties. A list of 15 candidate genes for AD risk was also generated to provide data support for further studies on the genetic etiology of AD.