
    EnSiam: Self-Supervised Learning With Ensemble Representations

    Recently, contrastive self-supervised learning, where the proximity of representations is determined based on the identities of samples, has made remarkable progress in unsupervised representation learning. SimSiam is a well-known example in this area, known for its simplicity yet powerful performance. However, due to its structural characteristics, it is sensitive to changes in training configurations such as hyperparameters and augmentation settings. To address this issue, we focus on the similarity between contrastive learning and the teacher-student framework in knowledge distillation. Inspired by ensemble-based knowledge distillation, the proposed method, EnSiam, improves the contrastive learning procedure using ensemble representations. These provide stable pseudo-labels and, in turn, better performance. Experiments demonstrate that EnSiam outperforms previous state-of-the-art methods in most cases, including on ImageNet, which shows that EnSiam is capable of learning high-quality representations.
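    Although the paper's implementation is not shown here, the core idea can be illustrated with a short sketch. The following hypothetical PyTorch snippet (names such as encoder, predictor, and views are illustrative, not the authors' API) averages the stop-gradient targets over several augmented views instead of taking them from a single view, in the spirit of the ensemble pseudo-labels described above.

    import torch
    import torch.nn.functional as F

    def ensiam_style_loss(encoder, predictor, views):
        # views: a list of differently augmented batches of the same images.
        with torch.no_grad():
            # Ensemble target: average the detached representations of all views,
            # instead of using a single view as in plain SimSiam.
            target = torch.stack([encoder(v) for v in views]).mean(dim=0)
            target = F.normalize(target, dim=-1)
        loss = 0.0
        for v in views:
            p = F.normalize(predictor(encoder(v)), dim=-1)
            loss = loss - (p * target).sum(dim=-1).mean()  # negative cosine similarity
        return loss / len(views)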

    Guaranteeing the Õ(AGM/OUT) Runtime for Uniform Sampling and OUT Size Estimation over Joins

    We propose a new method for estimating the number of answers OUT of a small join query Q in a large database D, and for uniform sampling over joins. Our method is the first to satisfy all of the following:
    - It supports arbitrary Q, which can be either acyclic or cyclic and can contain binary and non-binary relations.
    - It guarantees an arbitrarily small error with high probability, always in Õ(AGM/OUT) time, where AGM is the AGM bound of Q (an upper bound on OUT) and Õ hides polylogarithmic factors of the input size.
    We also explain previous join size estimators in a unified framework. All methods, including ours, rely on certain indexes on the relations in D, which take linear time to build offline. Additionally, we extend our method using generalized hypertree decompositions (GHDs) to achieve a lower complexity than Õ(AGM/OUT) when OUT is small, and we present optimization techniques for improving estimation efficiency and accuracy.
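    As a toy illustration of degree-weighted uniform sampling over a join (not the paper's algorithm, which handles arbitrary acyclic and cyclic queries within the Õ(AGM/OUT) bound), consider a two-table join R(a, b) ⋈ S(b, c): weighting each R-tuple by its number of S-partners and then picking a partner uniformly yields a uniform sample of the join output.

    import random
    from collections import defaultdict

    def build_index(S):
        # Offline, linear-time index: join key b -> list of matching c values.
        idx = defaultdict(list)
        for b, c in S:
            idx[b].append(c)
        return idx

    def sample_join(R, S_idx):
        # P(a, b, c) = (w / total) * (1 / w) = 1 / |join output|, i.e., uniform.
        weights = [len(S_idx[b]) for _, b in R]
        total = sum(weights)
        if total == 0:
            return None  # empty join
        a, b = random.choices(R, weights=weights, k=1)[0]
        return (a, b, random.choice(S_idx[b]))

    R = [(1, "x"), (2, "x"), (3, "y")]
    S = [("x", 10), ("x", 11), ("y", 12)]
    print(sample_join(R, build_index(S)))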

    Point Cloud Resampling by Simulating Electric Charges on Metallic Surfaces

    3D point cloud resampling based on computational geometry remains a challenging problem. In this paper, we propose a point cloud resampling algorithm inspired by the physics of the repulsion forces between electrons. The points in the point cloud are treated as electrons residing on a virtual metallic surface, and we iteratively update their positions by simulating the electromagnetic forces between them. Intuitively, the repulsive forces spread the input point cloud into an even distribution. We further adopt acceleration and damping terms in the simulation; the system can then be viewed as a momentum method in mathematical optimization, which improves convergence stability and uniformity. The net repulsion force may contain a component normal to the local surface, which can push points away from the surface. To prevent this, we introduce a simple restriction that limits the repulsion forces between points to an approximated local plane, mimicking the natural phenomenon in which charges cannot escape a metallic surface. Because surfaces are often curved rather than strictly planar, this is still an approximation, so we project the points onto the nearest local surface after each movement. In addition, we approximate the net repulsion force using the K nearest neighbors to accelerate the algorithm. Furthermore, we propose a new criterion that measures the uniformity of the resampled point cloud, which we use to compare the proposed algorithm with baselines. In experiments, our algorithm demonstrates superior performance in terms of uniformity, convergence, and run time.
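    A minimal sketch of one simulation step, assuming NumPy/SciPy and invented parameter values: each point is repelled by its K nearest neighbors, the update uses a momentum (acceleration plus damping) term, and the force is restricted to an estimated local tangent plane, as described above.

    import numpy as np
    from scipy.spatial import cKDTree

    def resample_step(points, velocity, k=8, step=0.01, damping=0.9):
        tree = cKDTree(points)
        _, nn = tree.query(points, k=k + 1)  # first neighbor is the point itself
        force = np.zeros_like(points)
        for i, idx in enumerate(nn):
            nbrs = points[idx[1:]]
            diff = points[i] - nbrs                                   # away from neighbors
            dist = np.linalg.norm(diff, axis=1, keepdims=True) + 1e-8
            f = (diff / dist**3).sum(axis=0)                          # Coulomb-like 1/r^2
            # Restrict the force to the local tangent plane: drop the component
            # along the normal, estimated as the smallest singular vector.
            normal = np.linalg.svd(nbrs - nbrs.mean(axis=0))[2][-1]
            force[i] = f - np.dot(f, normal) * normal
        velocity = damping * velocity + step * force                  # momentum update
        return points + velocity, velocity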

    Combining Sampling and Synopses with Worst-Case Optimal Runtime and Quality Guarantees for Graph Pattern Cardinality Estimation

    Graph pattern cardinality estimation is the problem of estimating the number of embeddings |M| of a query graph in a data graph. This fundamental problem arises, for example, during query planning in subgraph matching algorithms. There are two major approaches: sampling and synopses. Synopsis (or summary)-based methods are fast and accurate if the synopses capture the graphs' information well, but they suffer from large errors due to the loss of information during summarization and their inherent assumptions. Sampling-based methods are unbiased but suffer from large estimation variance due to the large sample space. To address these limitations, we propose Alley, a hybrid method that combines sampling and synopses. Alley employs 1) a novel sampling strategy, random walk with intersection, which effectively reduces the sample space, 2) branching to further reduce variance, and 3) a novel mining approach that extracts and indexes, as synopses, the tangled patterns that are inherently difficult to estimate by sampling. By using these synopses in the online estimation phase, we can effectively reduce the sample space while still ensuring unbiasedness. We establish that Alley has worst-case optimal runtime and approximation-quality guarantees for any given error bound and required confidence. Beyond these theoretical results, our extensive experiments show that Alley outperforms the state-of-the-art methods, achieving up to orders-of-magnitude higher accuracy with similar efficiency.
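    The random-walk-with-intersection strategy can be sketched as follows (a simplified homomorphism-counting version with invented data structures; Alley's actual estimator additionally handles embeddings, branching, and the mined synopses). Each walk matches query vertices in a fixed order, drawing candidates from the intersection of the neighbor sets of already-matched query neighbors; the product of candidate-set sizes is an unbiased estimate, with dead-end walks contributing zero.

    import random

    def one_walk(adj, order, qnbrs):
        # adj: data vertex -> set of neighbors; qnbrs: query vertex -> its query
        # neighbors that appear earlier in the matching order.
        match, estimate = {}, 1
        for q in order:
            matched = [match[p] for p in qnbrs.get(q, [])]
            if matched:
                cand = set.intersection(*(adj[v] for v in matched))
            else:
                cand = set(adj)  # first query vertex: any data vertex
            if not cand:
                return 0         # dead end; zero keeps the estimator unbiased
            estimate *= len(cand)
            match[q] = random.choice(sorted(cand))
        return estimate

    def estimate_count(adj, order, qnbrs, walks=1000):
        return sum(one_walk(adj, order, qnbrs) for _ in range(walks)) / walks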

    Heterogeneous Value of Water: Empirical Evidence in South Korea

    Anthropogenic pressures have degraded self-sustaining river services, and growing concern over sustaining river systems has become a global issue that motivates river restoration projects. In such projects, governing the diverse needs and desires of stakeholders who hold different water values is key to a project's success. The Korean government's river restoration effort, which constructed 16 weirs across four major rivers, may have failed to achieve the project's main goal of improving water quality. In this study, principal component analysis and a multinomial logit model were used to investigate which socioeconomic variables influence water values with respect to sustainability in Korea. We found evidence that age, income, education level, and city dwelling are the most effective variables for estimating water values. In addition, a one-size-fits-all water development project and a myopic view could cause widespread public discontent and may lead to the failure of water governance; unfortunately, the latter may be observed in Korea as one of the reasons for the recent amplification of major conflicts.
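    As an illustration of the analysis pipeline named above (on synthetic stand-in data, not the study's survey), one could reduce the socioeconomic covariates with principal component analysis and fit a multinomial logit over water-value categories:

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 6))    # stand-ins for age, income, education, city, ...
    y = rng.integers(0, 3, size=500) # three hypothetical water-value classes

    Z = PCA(n_components=3).fit_transform(X)             # reduce correlated covariates
    model = LogisticRegression(max_iter=1000).fit(Z, y)  # multinomial logit (softmax)
    print(model.coef_)               # per-class effects of each principal component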

    Application Issues of Impacted As-Planned Schedule for Delay Analysis

    Most construction projects are delayed, and many are subject to claims or disputes. Delay analysis is therefore a critical component of any construction project for determining who is responsible for delays. This research examines four different techniques for estimating delay impacts using the impacted as-planned (IAP) method. A sample network is introduced as an example to discuss several concerns; the advantages and limitations of each approach are identified, and recommendations are given for each. When inserting activities representing delay events in IAP, one must consider both constraints and logical relations among the delay events, their logical predecessors, and their successors. Constraints representing the actual dates of delay events are the simplest and easiest option. However, constraints should not be used in the “single insertion” and “inserting only owner- or contractor-caused delays” approaches. In addition, when using constraints, it is critical to check whether the impact of a delay event is less than its duration; in that scenario, constraints should be avoided, and the delay event should instead be logically connected to its predecessors and successors without constraints. This study also shows, through an example, that inserting delay events by logic alone can produce incorrect analysis results. The results of this study will help delay analysts identify what kinds of problems occur in IAP methods and how to prevent them.
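    The mechanics of the IAP method can be illustrated with a toy critical-path computation (hypothetical activities and durations, not the paper's sample network): compute the as-planned completion, insert a delay activity using logic links rather than date constraints, and read off the change in completion.

    def cpm_finish(durations, preds):
        # Early-finish forward pass over an activity-on-node network.
        memo = {}
        def ef(a):
            if a not in memo:
                start = max((ef(p) for p in preds.get(a, [])), default=0)
                memo[a] = start + durations[a]
            return memo[a]
        return max(ef(a) for a in durations)

    durations = {"A": 5, "B": 10, "C": 4}
    preds = {"B": ["A"], "C": ["B"]}
    planned = cpm_finish(durations, preds)     # as-planned completion: day 19

    # Insert a 7-day owner-caused delay between A and B via logic links only.
    durations["D1"] = 7
    preds["D1"] = ["A"]
    preds["B"] = ["A", "D1"]
    impacted = cpm_finish(durations, preds)    # impacted completion: day 26
    print(impacted - planned)                  # delay impact: 7 days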

    G-CARE: A Framework for Performance Benchmarking of Cardinality Estimation Techniques for Subgraph Matching

    Despite the crucial role of cardinality estimation in query optimization, there has been no systematic, in-depth study of the existing cardinality estimation techniques for subgraph matching queries. In this paper, for the first time, we present a comprehensive study of these techniques, scaling far beyond the original experiments. We first introduce a novel framework called G-CARE that enables us to realize all existing techniques on top of it and that provides insights into their performance. Using G-CARE, we reimplement representative cardinality estimation techniques for graph databases as well as relational databases. We then evaluate these techniques with respect to accuracy on RDF and non-RDF graphs from different domains, using subgraph matching queries of the various topologies considered so far. Surprisingly, our results reveal that all existing techniques have serious accuracy problems across various scenarios and datasets. Intriguingly, a simple sampling method based on an online aggregation technique designed for relational data consistently outperforms all existing techniques.
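    In the spirit of such a benchmark (generic stand-ins, not G-CARE's actual API), a harness can run each estimator over a query workload and report accuracy, for instance via the symmetric q-error:

    def q_error(estimate, truth):
        # Symmetric over/under-estimation factor; 1.0 is a perfect estimate.
        estimate, truth = max(estimate, 1e-9), max(truth, 1e-9)
        return max(estimate / truth, truth / estimate)

    def benchmark(estimators, queries, true_counts):
        # estimators: name -> function mapping a query to an estimated cardinality.
        return {
            name: sum(q_error(est(q), true_counts[q]) for q in queries) / len(queries)
            for name, est in estimators.items()
        }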

    Reinforcement Learning Guided by Double Replay Memory

    Experience replay memory in reinforcement learning enables agents to remember and reuse past experiences. Most reinforcement learning models rely on a single experience replay memory to operate agents. In this article, we propose a framework with a doubled experience replay memory that exploits both important transitions and new transitions simultaneously. In numerical studies, deep Q-networks (DQN) equipped with the double experience replay memory are examined under various scenarios. A self-driving car requires an automated agent that decides, in real time, when it is appropriate to change lanes; to this end, we apply the proposed agent to Simulation of Urban MObility (SUMO) experiments. We also verify its applicability to reinforcement learning with discrete action spaces (e.g., computer game environments). Taken together, we conclude that the proposed framework outperforms previously known reinforcement learning models by virtue of its double experience replay memory.
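    The double-memory idea can be sketched as follows (an illustrative design, not the paper's exact rules): one buffer retains transitions judged important, e.g., by TD error, another retains the newest transitions, and each training batch mixes samples from both.

    import random
    from collections import deque

    class DoubleReplay:
        def __init__(self, capacity=10_000, td_threshold=1.0):
            self.important = deque(maxlen=capacity)  # high-TD-error transitions
            self.recent = deque(maxlen=capacity)     # newest transitions
            self.td_threshold = td_threshold

        def push(self, transition, td_error):
            self.recent.append(transition)
            if abs(td_error) > self.td_threshold:    # crude "importance" rule
                self.important.append(transition)

        def sample(self, batch_size, mix=0.5):
            n_imp = min(int(batch_size * mix), len(self.important))
            batch = random.sample(list(self.important), n_imp)
            n_rec = min(batch_size - n_imp, len(self.recent))
            return batch + random.sample(list(self.recent), n_rec)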

    A multimodal screening system for elderly neurological diseases based on deep learning

    In this paper, we propose a deep-learning-based algorithm for screening neurological diseases. We designed various examination protocols for screening neurological diseases and collected data by video-recording persons performing these protocols. We converted the video data into human landmarks, which capture action information at a much smaller data dimension, and we also used voice data, another effective indicator of neurological disorders. We designed a subnetwork for each protocol to extract features from the landmarks or voice, and a feature aggregator that combines all the information extracted from the protocols to make a final decision. Multitask learning was applied to screen two neurological diseases. To capture meaningful information from the landmarks and voices, we applied pre-trained models to extract preliminary features: the spatiotemporal characteristics of landmarks are extracted using a pre-trained graph neural network, and voice features are extracted using a pre-trained time-delay neural network. These high-level features are then passed to the subnetworks and a feature aggregator, which are trained simultaneously. We also used various data augmentation techniques to overcome the shortage of data, and a frame-length staticizer that considers the characteristics of the data, allowing momentary tremors to be captured without wasting information. Finally, we examine the effectiveness of the different protocols and modalities (different body parts and voice) through extensive experiments. The proposed method achieves AUC scores of 0.802 for stroke and 0.780 for Parkinson's disease, demonstrating its effectiveness as a screening system.
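    The overall architecture can be sketched in PyTorch (layer sizes and module names are invented): one subnetwork per protocol consumes the pre-extracted landmark or voice features, a shared aggregator fuses them, and two heads produce the multitask screening outputs.

    import torch
    import torch.nn as nn

    class Screener(nn.Module):
        def __init__(self, protocol_dims, hidden=128):
            super().__init__()
            # One subnetwork per examination protocol.
            self.subnets = nn.ModuleList(
                nn.Sequential(nn.Linear(d, hidden), nn.ReLU()) for d in protocol_dims
            )
            # Feature aggregator combining all protocol features.
            self.aggregator = nn.Sequential(
                nn.Linear(hidden * len(protocol_dims), hidden), nn.ReLU()
            )
            self.stroke_head = nn.Linear(hidden, 1)     # task 1: stroke
            self.parkinson_head = nn.Linear(hidden, 1)  # task 2: Parkinson's

        def forward(self, features):  # features: one tensor per protocol
            parts = [net(f) for net, f in zip(self.subnets, features)]
            fused = self.aggregator(torch.cat(parts, dim=-1))
            return self.stroke_head(fused), self.parkinson_head(fused)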