    Perspectives in astrophysical databases

    Astrophysics has become a domain extremely rich in scientific data. Data mining tools are needed to extract information from such large datasets. This calls for an approach to data management that emphasizes the efficiency and simplicity of data access: efficiency is obtained using multidimensional access methods, and simplicity is achieved by properly handling metadata. Moreover, clustering and classification techniques on large datasets pose additional requirements in terms of computational and memory scalability and interpretability of results. In this study we review some possible solutions.

    A supervised clustering approach for fMRI-based inference of brain states

    We propose a method that combines signals from many brain regions observed in functional Magnetic Resonance Imaging (fMRI) to predict the subject's behavior during a scanning session. Such predictions suffer from the huge number of brain regions sampled on the voxel grid of standard fMRI data sets: the curse of dimensionality. Dimensionality reduction is thus needed, but it is often performed using a univariate feature selection procedure that handles neither the spatial structure of the images nor the multivariate nature of the signal. By introducing a hierarchical clustering of the brain volume that incorporates connectivity constraints, we reduce the span of the possible spatial configurations to a single tree of nested regions tailored to the signal. We then prune the tree in a supervised setting, hence the name supervised clustering, in order to extract a parcellation (division of the volume) such that parcel-based signal averages best predict the target information. Dimensionality reduction is thus achieved by feature agglomeration, and the constructed features provide a multi-scale representation of the signal. Comparisons with reference methods on both simulated and real data show that our approach yields higher prediction accuracy than standard voxel-based approaches. Moreover, the method infers an explicit weighting of the regions involved in the regression or classification task.
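    The connectivity-constrained agglomeration step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes voxels laid out along a one-dimensional line (so each parcel has at most two neighbours), uses the squared distance between parcel-mean signals as a stand-in merge criterion, and stops at a fixed number of parcels instead of pruning the tree in a supervised way. The function names and the toy signals are hypothetical.

```python
def agglomerate_1d(signals, n_parcels):
    """Greedily merge spatially adjacent parcels until n_parcels remain.

    signals: one time series (list of floats) per voxel, voxels ordered
    along a line. Returns a list of parcels, each a list of voxel indices.
    """
    n_samples = len(signals[0])
    parcels = [[i] for i in range(len(signals))]

    def mean_series(parcel):
        # Average signal of all voxels in the parcel, per time point.
        return [sum(signals[v][t] for v in parcel) / len(parcel)
                for t in range(n_samples)]

    def distance(p, q):
        # Squared Euclidean distance between parcel means (assumed criterion).
        return sum((a - b) ** 2 for a, b in zip(mean_series(p), mean_series(q)))

    while len(parcels) > n_parcels:
        # Connectivity constraint: only neighbouring parcels are candidates.
        best = min(range(len(parcels) - 1),
                   key=lambda i: distance(parcels[i], parcels[i + 1]))
        parcels[best:best + 2] = [parcels[best] + parcels[best + 1]]
    return parcels


def parcel_features(signals, parcels):
    """Feature agglomeration: replace voxel signals by parcel averages."""
    n_samples = len(signals[0])
    return [[sum(signals[v][t] for v in p) / len(p) for t in range(n_samples)]
            for p in parcels]


# Toy data (made up): voxels 0-1 and 2-3 form two similar pairs; voxel 4
# resembles voxel 0 but is not adjacent to it.
signals = [
    [1.0, 2.0, 1.0],
    [1.1, 2.1, 0.9],
    [5.0, 0.0, 5.0],
    [5.2, 0.1, 4.9],
    [1.0, 2.0, 1.0],
]
parcels = agglomerate_1d(signals, n_parcels=3)
features = parcel_features(signals, parcels)
print(parcels)
```

    In this toy run voxel 4 stays in its own parcel even though its signal matches voxel 0, because the two are not spatially connected; that is the effect of the connectivity constraint. The five voxel signals are reduced to three parcel-average features.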

    Data Management and Mining in Astrophysical Databases

    We analyse the issues involved in the management and mining of astrophysical data. The traditional approach to data management in the astrophysical field is not able to keep up with the increasing size of the data gathered by modern detectors. An essential role in astrophysical research will be played by automatic tools for information extraction from large datasets, i.e. data mining techniques such as clustering and classification algorithms. This calls for an approach to data management based on data warehousing that emphasizes the efficiency and simplicity of data access: efficiency is obtained using multidimensional access methods, and simplicity is achieved by properly handling metadata. Clustering and classification techniques on large datasets pose additional requirements: computational and memory scalability with respect to the data size, and interpretability and objectivity of clustering or classification results. In this study we address some possible solutions. Comment: 10 pages, Late

    Energy Efficiency in Cache Enabled Small Cell Networks With Adaptive User Clustering

    Using a network of cache-enabled small cells, traffic during peak hours can be reduced considerably by proactively fetching the content that is most likely to be requested. In this paper, we explore the impact of proactive caching on an important metric for future generation networks, namely energy efficiency (EE). We argue that exploiting the correlation in user content popularity profiles, in addition to the spatial repartitions of users with comparable request patterns, can considerably improve the achievable energy efficiency of the network. The problem of optimizing EE is decoupled into two related subproblems. The first one addresses the issue of content popularity modeling. While most existing works assume similar popularity profiles for all users in the network, we consider an alternative caching framework in which users are clustered according to their content popularity profiles. In order to showcase the utility of the proposed clustering scheme, we use a statistical model selection criterion, namely the Akaike information criterion (AIC). Using stochastic geometry, we derive a closed-form expression of the achievable EE and find the optimal active small cell density vector that maximizes it. The second subproblem investigates the impact of exploiting the spatial repartitions of users with comparable request patterns. After considering a snapshot of the network, we formulate a combinatorial optimization problem that optimizes content placement such that the used transmission power is minimized. Numerical results show that the clustering scheme considerably improves the cache hit probability and consequently the EE compared with an unclustered approach. Simulations also show that the small base station allocation algorithm improves the energy efficiency and hit probability. Comment: 30 pages, 5 figures, submitted to Transactions on Wireless Communications (15-Dec-2016
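    The abstract names the Akaike information criterion as the tool for judging whether clustering users by popularity profile is worthwhile. As a rough illustration only (not the paper's procedure), AIC is defined as 2k - 2 ln L, where k is the number of free parameters and L is the maximized likelihood; the model with the lower AIC is preferred. The sketch below compares one shared categorical popularity profile against one profile per user group. The request counts, group split, and function names are all made up for the example.

```python
import math


def log_likelihood(counts, probs):
    """Log-likelihood of request counts under a categorical profile.

    The multinomial coefficient is omitted because it is identical for
    both models being compared and cancels in the AIC difference.
    """
    return sum(c * math.log(p) for c, p in zip(counts, probs) if c > 0)


def mle_profile(counts):
    """Maximum-likelihood popularity profile: relative request frequencies."""
    total = sum(counts)
    return [c / total for c in counts]


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_lik


# Hypothetical request counts over three content items for two user groups.
group_a = [80, 15, 5]
group_b = [5, 15, 80]
n_items = len(group_a)

# Model 1: a single profile shared by all users (n_items - 1 free parameters).
shared = mle_profile([a + b for a, b in zip(group_a, group_b)])
ll_shared = log_likelihood(group_a, shared) + log_likelihood(group_b, shared)
aic_shared = aic(ll_shared, n_params=n_items - 1)

# Model 2: one profile per cluster (twice as many free parameters).
ll_clustered = (log_likelihood(group_a, mle_profile(group_a))
                + log_likelihood(group_b, mle_profile(group_b)))
aic_clustered = aic(ll_clustered, n_params=2 * (n_items - 1))

print("shared profile AIC:   ", round(aic_shared, 2))
print("clustered profile AIC:", round(aic_clustered, 2))
```

    With these toy counts the two groups have nearly opposite preferences, so the clustered model attains the lower AIC despite its extra parameters. With groups whose profiles were close, the parameter penalty could favour the shared model instead.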