
    Graph Homomorphism Revisited for Graph Matching

    In a variety of emerging applications one needs to decide whether a graph G matches another graph G_p, i.e., whether G has a topological structure similar to that of G_p. The traditional notions of graph homomorphism and isomorphism often fall short of capturing the structural similarity in these applications. This paper studies revisions of these notions, providing a full treatment from complexity to algorithms. (1) We propose p-homomorphism (p-hom) and 1-1 p-hom, which extend graph homomorphism and subgraph isomorphism, respectively, by mapping edges from one graph to paths in another, and by measuring the similarity of nodes. (2) We introduce metrics to measure graph similarity, and several optimization problems for p-hom and 1-1 p-hom. (3) We show that the decision problems for p-hom and 1-1 p-hom are NP-complete even for DAGs, and that the optimization problems are approximation-hard. (4) Nevertheless, we provide approximation algorithms with provable guarantees on match quality. We experimentally verify the effectiveness of the revised notions and the efficiency of our algorithms in Web site matching, using real-life and synthetic data.
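
The idea of mapping edges to paths while checking node similarity can be sketched as a verifier for a candidate mapping. This is only an illustration under stated assumptions, not the paper's algorithm: the function names, the similarity function, and the threshold are hypothetical, and the sketch checks a given mapping rather than searching for one (the search is the NP-complete part).

```python
from collections import deque


def reachable_by_nonempty_path(adj, src, dst):
    """True if dst can be reached from src using at least one edge."""
    seen = set()
    queue = deque(adj.get(src, ()))  # start from successors, so the path is nonempty
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        queue.extend(adj.get(node, ()))
    return False


def is_p_hom(edges, target_adj, mapping, similarity, threshold):
    """Check a candidate mapping (hypothetical helper, assumed semantics).

    edges      -- list of (u, v) edges of the graph being mapped
    target_adj -- adjacency dict of the graph being mapped into
    mapping    -- dict from every source node to a target node
    similarity -- assumed node-similarity function returning a number
    threshold  -- assumed minimum similarity for a node pair to count
    """
    # Node condition: each node must be similar enough to its image.
    for node, image in mapping.items():
        if similarity(node, image) < threshold:
            return False
    # Edge condition: each edge must map to a nonempty path.
    for u, v in edges:
        if not reachable_by_nonempty_path(target_adj, mapping[u], mapping[v]):
            return False
    return True


# Toy site structures: one edge on the left becomes a two-step path on the right.
site_edges = [("home", "books"), ("home", "music")]
other_site = {"index": ["shop"], "shop": ["books2", "cds"]}
candidate = {"home": "index", "books": "books2", "music": "cds"}
always_similar = lambda a, b: 1.0  # placeholder similarity for the example

print(is_p_hom(site_edges, other_site, candidate, always_similar, 0.5))
```

A plain graph homomorphism would reject this mapping because `index` has no direct edge to `books2`; allowing paths is what lets the two structures match.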

    Propagating functional dependencies with conditions

    The dependency propagation problem is to determine, given a view defined on data sources and a set of dependencies on the sources, whether another dependency is guaranteed to hold on the view. This paper investigates dependency propagation for recently proposed conditional functional dependencies (CFDs). The need for this study is evident in data integration, exchange and cleaning since dependencies on data sources often only hold conditionally on the view. We investigate dependency propagation for views defined in various fragments of relational algebra, CFDs as view dependencies, and for source dependencies given as either CFDs or traditional functional dependencies (FDs). (a) We establish lower and upper bounds, all matching, ranging from PTIME to undecidable. These not only provide the first results for CFD propagation, but also extend the classical work of FD propagation by giving new complexity bounds in the presence of finite domains. (b) We provide the first algorithm for computing a minimal cover of all CFDs propagated via SPC views; the algorithm has the same complexity as one of the most efficient algorithms for computing a cover of FDs propagated via a projection view, despite the increased expressive power of CFDs and SPC views. (c) We experimentally verify that the algorithm is efficient.
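
To make "holding conditionally" concrete, the following minimal sketch checks a single CFD against a small relation. It assumes the usual formulation of a CFD as an FD plus a pattern tuple of constants and wildcards; the function names, the `_` wildcard symbol, and the sample rows are illustrative assumptions, and the sketch does not address propagation through views.

```python
WILDCARD = "_"  # assumed symbol for "any value" in a pattern tuple


def matches(value, pattern_value):
    """A value matches a pattern entry if the entry is a wildcard or equal."""
    return pattern_value == WILDCARD or value == pattern_value


def cfd_violations(rows, lhs, rhs, pattern):
    """Return index pairs of rows that violate the CFD (lhs -> rhs, pattern).

    A pair (i, i) means row i alone contradicts a constant on the right-hand
    side; a pair (j, i) means rows j and i agree on lhs but differ on rhs.
    """
    violations = []
    first_seen = {}  # lhs values -> (row index, rhs values)
    for i, row in enumerate(rows):
        # The dependency only applies to rows matching the lhs pattern.
        if not all(matches(row[a], pattern[a]) for a in lhs):
            continue
        if not all(matches(row[a], pattern[a]) for a in rhs):
            violations.append((i, i))
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if key in first_seen:
            j, previous = first_seen[key]
            if previous != value:
                violations.append((j, i))
        else:
            first_seen[key] = (i, value)
    return violations


customers = [
    {"country": "44", "zip": "EH4", "street": "Mayfield"},
    {"country": "44", "zip": "EH4", "street": "Crichton"},
    {"country": "01", "zip": "07974", "street": "Tree Ave"},
    {"country": "01", "zip": "07974", "street": "Elm Str"},
]

# "zip determines street, but only when country is 44".
only_uk = {"country": "44", "zip": WILDCARD, "street": WILDCARD}
print(cfd_violations(customers, ["country", "zip"], ["street"], only_uk))
```

With an all-wildcard pattern the same function checks a traditional FD, which in this toy relation is also violated by the last two rows.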

    Water Pipeline Leakage Detection Based on Machine Learning and Wireless Sensor Networks

    The detection of water pipeline leakage is important to ensure that water supply networks can operate safely and conserve water resources. To address the lack of intelligence and the low efficiency of conventional leakage detection methods, this paper designs a leakage detection method based on machine learning and wireless sensor networks (WSNs). The system employs wireless sensors installed on pipelines to collect data and utilizes the 4G network to perform remote data transmission. A leakage-triggered networking method is proposed to reduce the wireless sensor network's energy consumption and effectively prolong the system life cycle. To enhance the precision and intelligence of leakage detection, we propose a leakage identification method that employs the intrinsic mode function, approximate entropy, and principal component analysis to construct a signal feature set and that uses a support vector machine (SVM) as a classifier to perform leakage detection. Simulation analysis and experimental results indicate that the proposed leakage identification method can effectively identify water pipeline leakage and that the proposed networking method has lower energy consumption than the networking methods used in conventional wireless sensor networks.
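
One of the features named above, approximate entropy, can be sketched as follows. This is a minimal illustration of the standard definition, not the paper's pipeline: the parameter defaults (`m=2`, an absolute tolerance `r=0.2`) are assumptions, and the decomposition into intrinsic mode functions, the PCA step, and the SVM classifier are not shown.

```python
import math


def approximate_entropy(signal, m=2, r=0.2):
    """Approximate entropy of a 1-D sequence (requires len(signal) > m + 1).

    m -- template length (assumed default)
    r -- absolute tolerance for two templates to count as similar (assumed
         default; in practice it is often set relative to the signal spread)
    """
    n = len(signal)

    def phi(width):
        templates = [signal[i:i + width] for i in range(n - width + 1)]
        total = 0.0
        for a in templates:
            # Count templates within tolerance r under the max-norm distance.
            # The self-match guarantees the count is at least 1.
            close = sum(
                1 for b in templates
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(close / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)


steady = [1.0] * 20          # perfectly regular signal
alternating = [0.0, 1.0] * 10  # regular, periodic signal
print(approximate_entropy(steady))
print(approximate_entropy(alternating))
```

Regular signals such as these give values at or near zero, while irregular signals give larger values, which is what makes the quantity usable as a feature.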

    Robust Sparse Mean Estimation via Incremental Learning

    In this paper, we study the problem of robust sparse mean estimation, where the goal is to estimate a $k$-sparse mean from a collection of partially corrupted samples drawn from a heavy-tailed distribution. Existing estimators face two critical challenges in this setting. First, they are limited by a conjectured computational-statistical tradeoff, implying that any computationally efficient algorithm needs $\tilde\Omega(k^2)$ samples, while its statistically-optimal counterpart only requires $\tilde O(k)$ samples. Second, the existing estimators fall short of practical use as they scale poorly with the ambient dimension. This paper presents a simple mean estimator that overcomes both challenges under moderate conditions: it runs in near-linear time and memory (both with respect to the ambient dimension) while requiring only $\tilde O(k)$ samples to recover the true mean. At the core of our method lies an incremental learning phenomenon: we introduce a simple nonconvex framework that can incrementally learn the top-$k$ nonzero elements of the mean while keeping the zero elements arbitrarily small. Unlike existing estimators, our method does not need any prior knowledge of the sparsity level $k$. We prove the optimality of our estimator by providing a matching information-theoretic lower bound. Finally, we conduct a series of simulations to corroborate our theoretical findings. Our code is available at https://github.com/huihui0902/Robust_mean_estimation

    Association between tea drinking and disability levels in older Chinese adults: a longitudinal analysis

    Objective: As the global population ages, disability among the elderly presents unprecedented challenges for healthcare systems. However, limited research has examined whether dietary interventions like tea consumption may alleviate and prevent disability in older adults. As an important dietary therapy, the health benefits of tea drinking have gained recognition across research disciplines. Therefore, this study aimed to investigate the association between tea drinking habits and disability levels in the elderly Chinese population. Methods: Leveraging data from the 2008 to 2018 waves of the Chinese Longitudinal Healthy Longevity Survey, we disaggregated tea drinking frequency and activities of daily living (ADL) measures and deployed fixed-effect ordered logit models to examine the tea-disability association for the first time. We statistically adjusted for potential confounders and conducted stratified analyses to assess heterogeneity across subpopulations. Results: Multivariable fixed-effect ordered logistic regression suggested tea drinking has protective effects against ADL disability. However, only daily tea drinking was associated with lower risks of basic activities of daily living (BADL) disability [odds ratio (OR) = 0.61; 95% confidence interval (CI), 0.41–0.92] and lower levels of instrumental activities of daily living (IADL) disability (OR = 0.78; 95% CI, 0.64–0.95). Stratified analyses indicated heterogeneous effects across age and income groups. Daily tea drinking protected against BADL (OR = 0.26 and OR = 0.28) and IADL disability (OR = 0.48 and OR = 0.45) for adults over 83 years old and high-income households, respectively. Conclusion: We found that drinking tea almost daily was protective against disability in elderly people, warranting further research into optimal dosages. Future studies should utilize more rigorous causal inference methods and control for confounders.

    Focus Is What You Need For Chinese Grammatical Error Correction

    Chinese Grammatical Error Correction (CGEC) aims to automatically detect and correct grammatical errors contained in Chinese text. Researchers have long regarded CGEC as a task with a certain degree of uncertainty, that is, an ungrammatical sentence may often have multiple references. However, we argue that even though this is a very reasonable hypothesis, it is too harsh for the intelligence of the mainstream models in this era. In this paper, we first discover that multiple references do not actually bring positive gains to model training. On the contrary, it is beneficial to the CGEC model if the model can pay attention to small but essential data during the training process. Furthermore, we propose a simple yet effective training strategy called OneTarget to improve the focus ability of the CGEC models and thus improve the CGEC performance. Extensive experiments and detailed analyses demonstrate the correctness of our discovery and the effectiveness of our proposed method. Comment: Submitted to ICASSP2023 (currently under review).

    Contextual Similarity is More Valuable than Character Similarity: Curriculum Learning for Chinese Spell Checking

    The Chinese Spell Checking (CSC) task aims to detect and correct Chinese spelling errors. In recent years, related research has focused on introducing character similarity from confusion sets to enhance CSC models, ignoring the context of characters, which contains richer information. To make better use of contextual similarity, we propose a simple yet effective curriculum learning framework for the CSC task. With the help of our model-agnostic framework, existing CSC models are trained from easy to difficult, as humans learn Chinese characters, and achieve further performance improvements. Extensive experiments and detailed analyses on widely used SIGHAN datasets show that our method outperforms previous state-of-the-art methods.

    LatEval: An Interactive LLMs Evaluation Benchmark with Incomplete Information from Lateral Thinking Puzzles

    With the continuous evolution and refinement of LLMs, they are endowed with impressive logical reasoning or vertical thinking capabilities. But can they think out of the box? Do they possess proficient lateral thinking abilities? Following the setup of Lateral Thinking Puzzles, we propose a novel evaluation benchmark, LatEval, which assesses the model's lateral thinking within an interactive framework. In our benchmark, we challenge LLMs on two aspects: the quality of questions posed by the model and the model's capability to integrate information for problem-solving. We find that nearly all LLMs struggle with employing lateral thinking during interactions. For example, even the most advanced model, GPT-4, exhibits the advantage to some extent, yet still maintains a noticeable gap when compared to humans. This evaluation benchmark provides LLMs with a highly challenging and distinctive task that is crucial to an effective AI assistant. Comment: Work in progress

    Modelling other agents through evolutionary behaviours

    Modelling other agents is a challenging topic in artificial intelligence research, particularly when a subject agent needs to optimise its own decisions by predicting their behaviours under uncertainty. Existing research often leads to a monotonic set of behaviours for other agents so that a subject agent cannot cope with unexpected decisions from the other agents. It requires creative ideas about developing diversity of behaviours so as to improve the subject agent's decision quality. In this paper, we resort to evolutionary computation approaches to generate a new set of behaviours for other agents and solve the complicated agents' behaviour search and evaluation issues. The new approach starts with the initial behaviours that are ascribed to the other agents and expands the behaviours by using a number of genetic operators in the behaviour evolution. This is the first time that evolutionary techniques are used to model other agents in a general multiagent decision framework. We examine the new methods in two well-studied problem domains and provide experimental results in support.