407 research outputs found

    Top-K Queries on Uncertain Data: On Score Distribution and Typical Answers

    Uncertain data arises in a number of domains, including data integration and sensor networks. Top-k queries that rank results according to some user-defined score are an important tool for exploring large uncertain data sets. As several recent papers have observed, the semantics of top-k queries on uncertain data can be ambiguous due to tradeoffs between reporting high-scoring tuples and tuples with a high probability of being in the resulting data set. In this paper, we demonstrate the need to present the score distribution of top-k vectors to allow the user to choose between results along the score and probability dimensions. One option would be to display the complete distribution of all potential top-k tuple vectors, but this set is too large to compute. Instead, we propose to provide a number of typical vectors that effectively sample this distribution. We propose efficient algorithms to compute these vectors. We also extend the semantics and algorithms to the scenario of score ties, which is not dealt with in previous work in the area. Our work includes a systematic empirical study on both real and synthetic datasets. Funding: National Natural Science Foundation (Grant numbers IIS-0086057, IIS-0325838, and IIS-0448124).
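To make the idea of a score distribution over top-k vectors concrete, the following is a minimal brute-force sketch, not the paper's algorithm. It assumes a tuple-independent model (each tuple exists independently with its own probability), distinct scores (no ties), and that possible worlds with fewer than k tuples contribute no top-k vector. The function name `topk_score_distribution` and the example data are hypothetical.

```python
from itertools import product


def topk_score_distribution(tuples, k):
    """Map each top-k vector to (total score, probability) by enumerating worlds.

    tuples: list of (name, score, probability); existence is assumed independent.
    """
    dist = {}
    for present in product([False, True], repeat=len(tuples)):
        # Probability of this possible world under the independence assumption.
        p = 1.0
        for (_, _, prob), here in zip(tuples, present):
            p *= prob if here else 1.0 - prob
        world = [t for t, here in zip(tuples, present) if here]
        if len(world) < k or p == 0.0:
            continue  # assumption: fewer than k tuples means no top-k vector
        top = sorted(world, key=lambda t: t[1], reverse=True)[:k]
        key = tuple(name for name, _, _ in top)
        total = sum(score for _, score, _ in top)
        previous = dist.get(key, (total, 0.0))[1]
        dist[key] = (total, previous + p)
    return dist


# Hypothetical data: (name, score, existence probability).
example = [("a", 10.0, 0.3), ("b", 8.0, 0.9), ("c", 5.0, 0.6)]
result = topk_score_distribution(example, k=2)
for vector, (score, prob) in sorted(result.items(), key=lambda kv: -kv[1][0]):
    print(vector, score, round(prob, 3))
```

In this toy example the highest-scoring vector is not the most probable one, which is the tradeoff the abstract describes. The enumeration is exponential in the number of tuples, which illustrates why the full distribution is too large to compute for realistic inputs.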

    What Does the Gradient Tell When Attacking the Graph Structure

    Recent research has revealed that Graph Neural Networks (GNNs) are susceptible to adversarial attacks targeting the graph structure. A malicious attacker can manipulate a limited number of edges, given the training labels, to impair the victim model's performance. Previous empirical studies indicate that gradient-based attackers tend to add edges rather than remove them. In this paper, we present a theoretical demonstration revealing that attackers tend to increase inter-class edges due to the message passing mechanism of GNNs, which explains some previous empirical observations. By connecting dissimilar nodes, attackers can more effectively corrupt node features, making such attacks more advantageous. However, we demonstrate that the inherent smoothness of GNNs' message passing tends to blur node dissimilarity in the feature space, leading to the loss of crucial information during the forward process. To address this issue, we propose a novel surrogate model with multi-level propagation that preserves the node dissimilarity information. This model parallelizes the propagation of unaggregated raw features and multi-hop aggregated features, while introducing batch normalization to enhance the dissimilarity in node representations and counteract the smoothness resulting from topological aggregation. Our experiments show significant improvement with our approach. Furthermore, both theoretical and experimental evidence suggest that adding inter-class edges constitutes an easily observable attack pattern. We propose an innovative attack loss that balances attack effectiveness and imperceptibility, sacrificing some attack effectiveness to attain greater imperceptibility. We also provide experiments to validate the compromise performance achieved through this attack loss.

    Decoupled Mixup for Data-efficient Learning

    Mixup is an efficient data augmentation approach that improves the generalization of neural networks by smoothing the decision boundary with mixed data. Recently, dynamic mixup methods have effectively improved on previous static policies (e.g., linear interpolation) by maximizing salient regions or maintaining the target in mixed samples. The difference is that the mixed samples generated by dynamic policies are more instance discriminative than the static ones, e.g., the foreground objects are decoupled from the background. However, optimizing mixup policies with dynamic methods in input space is computationally expensive compared to static ones. Hence, we transfer the decoupling mechanism of dynamic methods from the data level to the objective function level and propose the general decoupled mixup (DM) loss. The primary effect is that DM can adaptively focus on discriminative features without losing the original smoothness of mixup, while avoiding heavy computational overhead. As a result, DM enables static mixup methods to match or even exceed the performance of dynamic methods. This also leads to an interesting objective design problem for mixup training: we need to focus on both smoothing the decision boundaries and identifying discriminative features. Extensive experiments on supervised and semi-supervised learning benchmarks across seven classification datasets validate the effectiveness of DM by equipping it with various mixup methods. Comment: The preprint revision, 15 pages, 6 figures. The source code is available at https://github.com/Westlake-AI/openmixu
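For readers unfamiliar with the static policy mentioned above, here is a minimal sketch of mixup by linear interpolation. It is not the decoupled mixup (DM) loss, whose form the abstract does not give. Drawing the mixing ratio from a Beta(alpha, alpha) distribution is the usual choice in the mixup literature and is an assumption here, as are the function name `mixup` and the toy inputs.

```python
import random


def mixup(x1, y1, x2, y2, alpha=1.0, rng=random):
    """Static mixup: interpolate two samples and their one-hot labels."""
    lam = rng.betavariate(alpha, alpha)  # mixing ratio in [0, 1] (assumed Beta prior)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Toy two-feature samples from two classes, with a seeded generator.
x_mix, y_mix, lam = mixup([0.0, 0.0], [1.0, 0.0],
                          [1.0, 2.0], [0.0, 1.0],
                          rng=random.Random(0))
print(lam, x_mix, y_mix)
```

The mixed label stays a valid probability vector because both inputs are one-hot and the weights sum to one.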

    Cryo-EM model of the bullet-shaped vesicular stomatitis virus.

    Vesicular stomatitis virus (VSV) is a bullet-shaped rhabdovirus and a model system of negative-strand RNA viruses. Through direct visualization by means of cryo-electron microscopy, we show that each virion contains two nested, left-handed helices: an outer helix of matrix protein M and an inner helix of nucleoprotein N and RNA. M has a hub domain with four contact sites that link to neighboring M and N subunits, providing rigidity by clamping adjacent turns of the nucleocapsid. Side-by-side interactions between neighboring N subunits are critical for the nucleocapsid to form a bullet shape, and structure-based mutagenesis results support this description. Together, our data suggest a mechanism of VSV assembly in which the nucleocapsid spirals from the tip to become the helical trunk, both subsequently framed and rigidified by the M layer.

    Endothelial Stomatal and Fenestral Diaphragms in Normal Vessels and Angiogenesis

    Vascular endothelium lines the entire cardiovascular system, where it performs a series of vital functions including the control of microvascular permeability, coagulation, inflammation, and vascular tone, as well as the formation of new vessels via vasculogenesis and angiogenesis in normal and disease states. Normal endothelium consists of heterogeneous populations of cells differentiated according to the vascular bed and segment of the vascular tree where they occur. One of the cardinal features is the expression of specific subcellular structures such as plasmalemmal vesicles or caveolae, transendothelial channels, vesiculo-vacuolar organelles, endothelial pockets and fenestrae, whose presence defines several endothelial morphological types. A less explored observation is the differential expression of such structures in diverse settings of angiogenesis. This review will focus on the latest developments on the components, structure and function of these specific endothelial structures in normal endothelium as well as in diverse settings of angiogenesis.

    Re and 99mTc complexes of BodP3 – multi-modality imaging probes

    A fluorescent tridentate phosphine, BodP3 (2), forms rhenium complexes which effectively image cancer cells. Related technetium analogues are also readily prepared and have potential as dual SPECT/fluorescent biological probes.

    Random-phase approximation and its applications in computational chemistry and materials science

    The random-phase approximation (RPA) as an approach for computing the electronic correlation energy is reviewed. After a brief account of its basic concept and historical development, the paper is devoted to the theoretical formulations of RPA and its applications to realistic systems. With several illustrative applications, we discuss the implications of RPA for computational chemistry and materials science. The computational cost of RPA, which is critical for its widespread use in future applications, is also addressed. In addition, current correction schemes going beyond RPA and directions of further development are discussed. Comment: 25 pages, 11 figures, published online in J. Mater. Sci. (2012)

    Effective Rheology of Bubbles Moving in a Capillary Tube

    We calculate the average volumetric flux versus pressure drop of bubbles moving in a single capillary tube with varying diameter, finding a square-root relation by mapping the flow equations onto those of a driven overdamped pendulum. The calculation is based on a derivation of the equation of motion of a bubble train that considers the capillary forces and the entropy production associated with the viscous flow. We also calculate the configurational probability of the positions of the bubbles. Comment: 4 pages, 1 figure
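The square-root relation can be illustrated with the textbook driven overdamped pendulum, d(theta)/dt = F - sin(theta). This is a sketch under assumptions: the dimensionless form with threshold F = 1 is chosen for illustration and is not taken from the paper, whose actual flow equations are not given in the abstract. For F > 1 the time-averaged velocity is sqrt(F^2 - 1), which the code checks numerically by integrating the time needed for one full turn. The function name `mean_velocity` is hypothetical.

```python
import math


def mean_velocity(force, n=4000):
    """Time-averaged velocity of d(theta)/dt = force - sin(theta), for force > 1."""
    h = 2.0 * math.pi / n
    # Time for one full turn: integral of d(theta) / (force - sin(theta)),
    # evaluated with the midpoint rule over [0, 2*pi].
    period = sum(h / (force - math.sin((i + 0.5) * h)) for i in range(n))
    return 2.0 * math.pi / period


f = 2.0
numeric = mean_velocity(f)
closed_form = math.sqrt(f * f - 1.0)
print(numeric, closed_form)
```

Just above the threshold, sqrt(F^2 - 1) behaves like sqrt(2 (F - 1)), so the average velocity grows as the square root of the excess driving force in this model.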

    TRY plant trait database - enhanced coverage and open access

    Plant traits (the morphological, anatomical, physiological, biochemical and phenological characteristics of plants) determine how plants respond to environmental factors, affect other trophic levels, and influence ecosystem properties and their benefits and detriments to people. Plant trait data thus represent the basis for a vast area of research spanning from evolutionary biology, community and functional ecology, to biodiversity conservation, ecosystem and landscape management, restoration, biogeography and earth system modelling. Since its foundation in 2007, the TRY database of plant traits has grown continuously. It now provides unprecedented data coverage under an open access data policy and is the main plant trait database used by the research community worldwide. Increasingly, the TRY database also supports new frontiers of trait-based plant research, including the identification of data gaps and the subsequent mobilization or measurement of new data. To support this development, in this article we evaluate the extent of the trait data compiled in TRY and analyse emerging patterns of data coverage and representativeness. Best species coverage is achieved for categorical traits, with almost complete coverage for 'plant growth form'. However, most traits relevant for ecology and vegetation modelling are characterized by continuous intraspecific variation and trait-environmental relationships. These traits have to be measured on individual plants in their respective environment. Despite unprecedented data coverage, we observe a humbling lack of completeness and representativeness of these continuous traits in many aspects. We, therefore, conclude that reducing data gaps and biases in the TRY database remains a key challenge and requires a coordinated approach to data mobilization and trait measurements. This can only be achieved in collaboration with other initiatives.