23 research outputs found

    Mining Diversity on Networks

    Textually Relevant Spatial Skylines

    Privacy Aware Parallel Computation of Skyline Sets Queries from Distributed Databases

    A skyline query finds the objects in a given set that are not dominated by any other object in that set. Skyline queries help filter out unnecessary information efficiently and provide clues for various decision-making tasks. However, skyline queries cannot be used in a privacy-aware environment, since the values of individual records have to be hidden even when there is no ID information. Therefore, we considered skyline sets queries. A skyline sets query returns skyline sets from all possible sets, each of which is composed of some objects in a database. With the growth of network infrastructure, data are stored in distributed databases. In this paper, we expand the idea to compute skyline sets queries in a parallel fashion from distributed databases without disclosing individual records to others. The proposed method utilizes an agent-based parallel computing framework that can efficiently compute skyline sets queries and can solve the privacy problems of skyline queries in a distributed environment. The computation of skyline sets is performed simultaneously in all databases, which increases parallelism and reduces the computation time.
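
The dominance test behind a plain skyline query can be illustrated with a minimal Python sketch. It is not the paper's agent-based, privacy-aware method for skyline sets; it only shows the basic definition. The function names `dominates` and `skyline`, the sample data, and the convention that smaller values are better in every dimension are assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a dominates b.

    Assumed convention: smaller is better in every dimension.
    a dominates b when a is no worse everywhere and strictly
    better in at least one dimension.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical records: (price, distance), both to be minimised.
hotels = [(50, 8), (60, 3), (80, 1), (90, 5), (55, 9)]
result = skyline(hotels)
```

Here `(90, 5)` is dropped because `(60, 3)` is better on both attributes, and `(55, 9)` is dropped because `(50, 8)` is.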

    Event detection in high throughput social media

    Center for Computational Sciences, University of Tsukuba: FY2010 (Heisei 22) Annual Report

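    Contents: 1. FY2010 priority measures and improvement goals; 2. FY2010 activity report; 3. Reports from the research divisions: I. Division of Particle Physics; II. Division of Astrophysics and Nuclear Physics (II-1. Astrophysics; II-2. Nuclear Physics); III. Division of Quantum Condensed Matter Physics; IV. Division of Life Sciences (IV-1. Biological Function and Information; IV-2. Molecular Evolution); V. Division of Global Environmental Science; VI. Division of High Performance Computing Systems; VII. Division of Computational Informatics (VII-1. Data Infrastructure; VII-2. Computational Media)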

    StatBreak: identifying "lucky" data points through genetic algorithms

    Sometimes interesting statistical findings are produced by a small number of “lucky” data points within the tested sample. To address this issue, researchers and reviewers are encouraged to investigate outliers and influential data points. Here, we present StatBreak, an easy-to-apply method, based on a genetic algorithm, that identifies the observations that most strongly contributed to a finding (e.g., effect size, model fit, p value, Bayes factor). Within a given sample, StatBreak searches for the largest subsample in which a previously observed pattern is not present or is reduced below a specifiable threshold. Thus, it answers the following question: “Which (and how few) ‘lucky’ cases would need to be excluded from the sample for the data-based conclusion to change?” StatBreak consists of a simple R function and flags the luckiest data points for any form of statistical analysis. Here, we demonstrate the effectiveness of the method with simulated and real data across a range of study designs and analyses. Additionally, we describe StatBreak’s R function and explain how researchers and reviewers can apply the method to the data they are working with.

    The GNAR-edge model: A network autoregressive model for networks with time-varying edge weights

    In economic and financial applications, there is often a need to analyse multivariate time series comprising time series for a range of quantities. In some applications, such complex systems can be associated with an underlying network describing pairwise relationships among the quantities. Accounting for the underlying network structure in the analysis of this type of multivariate time series is required for assessing estimation error and can be particularly informative for forecasting. Our work is motivated by a dataset consisting of time series of industry-to-industry transactions. In this example, pairwise relationships between Standard Industrial Classification (SIC) codes can be represented using a network, with SIC codes as nodes and pairwise transactions between SIC codes as edges, while the observed time series of transaction amounts for each pair of SIC codes can be regarded as time-varying weights on the edges. Inspired by Knight et al. (2020), we introduce the GNAR-edge model, which allows modelling of multiple time series utilising the network structure, assuming that each edge weight depends not only on its past values, but also on past values of its neighbouring edges, for a range of neighbourhood stages. The method is validated through simulations. Results from the implementation of the GNAR-edge model on the real industry-to-industry data show good fitting and predictive performance of the model. The predictive performance is improved when sparsifying the network using a lead-lag analysis and thresholding edges according to a lead-lag score.
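
The idea that an edge weight depends on its own past value and on the past values of neighbouring edges can be sketched in Python as a one-step, lag-1 prediction. This is a toy illustration under stated assumptions, not the GNAR-edge model as specified by the authors: the names `edge_neighbours` and `predict_next`, the coefficients `alpha` and `beta`, the rule that two edges are stage-1 neighbours when they share a node, and the use of an unweighted neighbour mean are all hypothetical choices.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def edge_neighbours(edges: List[Edge]) -> Dict[Edge, List[Edge]]:
    """Assumed stage-1 neighbourhood: edges that share at least one node."""
    return {
        e: [f for f in edges if f != e and set(e) & set(f)]
        for e in edges
    }


def predict_next(weights: Dict[Edge, float], alpha: float, beta: float) -> Dict[Edge, float]:
    """One-step prediction for every edge weight.

    prediction(e) = alpha * own previous weight
                  + beta  * mean previous weight of neighbouring edges
    (alpha and beta are illustrative coefficients, not fitted values).
    """
    nbrs = edge_neighbours(list(weights))
    pred = {}
    for e, w in weights.items():
        ns = nbrs[e]
        nbr_mean = sum(weights[f] for f in ns) / len(ns) if ns else 0.0
        pred[e] = alpha * w + beta * nbr_mean
    return pred


# Hypothetical transaction amounts between industries A, B, C, D.
last_weights = {("A", "B"): 10.0, ("B", "C"): 20.0, ("C", "D"): 40.0}
forecast = predict_next(last_weights, alpha=0.5, beta=0.25)
```

In this sketch the edge `("B", "C")` has two neighbours, so its forecast mixes its own last value with the average of `("A", "B")` and `("C", "D")`. Higher lags and further neighbourhood stages, which the abstract mentions, would add more terms of the same shape.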

    Big data-driven multimodal traffic management : trends and challenges
