
    Improving multivariate data streams clustering.

    Clustering data streams is an important task in data mining research. Several algorithms have recently been proposed to cluster data streams as a whole, but only a few of them handle multivariate data streams. Even those merely aggregate the attributes without considering the correlations among them. To overcome this issue, we propose a new framework that clusters multivariate data streams based on their evolving behavior over time, exploring the correlations among their attributes by computing the fractal dimension. Experimental results with climate data streams show that cluster quality and compactness improve compared to the competing method, which suggests that attribute correlations cannot be ignored. In fact, cluster compactness is 7 to 25 times better with our method. Our framework also proves to be a useful tool to assist meteorologists in understanding climate behavior over a period of time. Edition of the Proceedings of the 16th International Conference on Computational Science, San Diego, 2016.
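
The abstract above does not say how the fractal dimension is computed. As a rough illustration only, the sketch below estimates a correlation fractal dimension by box counting: it overlays grids of cell side r on the attribute space, sums the squared cell occupancies S(r), and takes the slope of log S(r) against log r. The function name `correlation_fractal_dimension`, the `radii` parameter, and the choice of box counting are assumptions made for this example, not the authors' framework.

```python
import numpy as np

def correlation_fractal_dimension(points, radii):
    """Box-counting estimate of the correlation fractal dimension (illustrative).

    points: array of shape (n_samples, n_attributes)
    radii:  grid cell sides to evaluate (hypothetical parameter)
    """
    points = np.asarray(points, dtype=float)
    mins = points.min(axis=0)
    log_r, log_s = [], []
    for r in radii:
        # Assign every point to a grid cell of side r.
        cells = np.floor((points - mins) / r).astype(np.int64)
        # Count how many points fall into each occupied cell.
        _, counts = np.unique(cells, axis=0, return_counts=True)
        log_r.append(np.log(r))
        # S(r): sum of squared cell occupancies.
        log_s.append(np.log(np.sum(counts.astype(float) ** 2)))
    # The slope of log S(r) versus log r approximates the dimension.
    slope, _ = np.polyfit(log_r, log_s, 1)
    return slope
```

Under these assumptions, two perfectly correlated attributes (points lying on a line) should give a value near 1, while two independent uniform attributes should give a value near 2, which is how such an estimate can reflect attribute correlation.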

    Big Data and Reliability Applications: The Complexity Dimension

    Big data features not only large volumes of data but also data with complicated structures. Complexity imposes unique challenges in big data analytics. Meeker and Hong (2014, Quality Engineering, pp. 102-116) provided an extensive discussion of the opportunities and challenges in big data and reliability, and described engineering systems that can generate big data for use in reliability analysis. Meeker and Hong (2014) focused on large-scale system operating and environment data (i.e., high-frequency multivariate time series data), and provided examples of how to link such data as covariates to traditional reliability responses such as time to failure, time to recurrence of events, and degradation measurements. This paper extends that discussion by focusing on how to use data with complicated structures in reliability analysis. Such data types include high-dimensional sensor data, functional curve data, and image streams. We first review recent developments in those directions, and then discuss how analytical methods can be developed to tackle the challenges that arise from the complexity of big data in reliability applications. The use of modern statistical methods such as variable selection, functional data analysis, scalar-on-image regression, spatio-temporal data models, and machine learning techniques is also discussed. Comment: 28 pages, 7 figures

    Processing and Linking Audio Events in Large Multimedia Archives: The EU inEvent Project

    In the inEvent EU project [1], we aim to structure, retrieve, and share large archives of networked, dynamically changing multimedia recordings, mainly consisting of meetings, videoconferences, and lectures. More specifically, we are developing an integrated system that performs audiovisual processing of multimedia recordings and labels them in terms of interconnected “hyper-events” (a notion inspired by hypertexts). Each hyper-event is composed of simpler facets, including audio-video recordings and metadata, which are then easier to search, retrieve, and share. In the present paper, we mainly cover the audio processing aspects of the system, including speech recognition, speaker diarization and linking (across recordings), the use of these features for hyper-event indexing and recommendation, and the search portal. We present initial results for feature extraction from lecture recordings using the TED talks. Index Terms: Networked multimedia events; audio processing: speech recognition; speaker diarization and linking; multimedia indexing and searching; hyper-events.

    Multiple scales of biological variability in New Zealand streams : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Ecology at Massey University, Manawatū, New Zealand

    Stream fish communities in Taranaki, New Zealand, were studied for the patterns and drivers of their spatial ecology. The study focused on three main themes: a) complementarity between geography and land use in driving regional distribution patterns of stream fish, b) the impact of agriculture on the community composition, structure and variability of fish and invertebrates, and c) concordance among environmental distance and community dissimilarities of stream fish and invertebrates. Stream sampling and data collection for fish were conducted at the regional scale using 96 sites distributed in the protected forest (44 sites) of Egmont National Park in Taranaki and in surrounding farmlands (52 sites). Local scale sampling for fish and invertebrates was carried out at 15 stream sites in pasture (8 sites) and in adjacent forest (7 sites). Environmental data on geography, land use and local habitat were gathered concurrently with biological sampling. The regional scale survey reported fifteen fish species, dominated by longfin eels (Anguilla dieffenbachii), redfin bullies (Gobiomorphus huttoni) and koaro (Galaxias brevipinnis), while 12 fish species and 69 different invertebrate taxa were recorded from the 15 sites at the local scale. Regional scale spatial patterns of fish were mainly driven by land use pattern. Catchment land use (characterised by percentage cover of farming/native forest) effectively partitioned the stream fish community structure in Taranaki. Within each level of catchment land use (farming), abundance and richness of fish species were negatively correlated with altitude. Moreover, the upstream slope at high elevations and intensive farming downstream limited the distribution of stream fish across the region. Fish community composition differed significantly but weakly between forest and pasture in the immediate proximity.
The dissimilarity of fish communities between forest and pasture increased from regional to local scale, and a similar result was found with stream invertebrate dissimilarity at the local scale. Stream communities (fish and invertebrates) were equally variable among streams between the two land use classes at both regional and local scales. Although the land use difference did not affect within-stream variability of fish, invertebrate communities were less variable within a pasture stream. Trends in in-stream variability of invertebrates were influenced mainly by altitude, stream morphology, pH, and riparian native cover. In concordance analysis, Mantel and Procrustes tests were used to compare community matrices of fish and invertebrates and the environmental distance between stream sites. The spatial patterns of fish and invertebrates were significantly concordant with each other among the 15 streams at the local scale. Nevertheless, community concordance decreased at finer spatial scales, and the two communities were not concordant at local sites within a given stream. Agriculture had a negative impact on the concordance between fish and invertebrates among streams, and neither community correlated with the overall environmental distance between agricultural streams. Community concordance between fish and invertebrates was consistently higher than the community-environment links, and the lower trophic level (invertebrates) was linked to its environment more closely than the upper trophic level (fish). The overall results suggest a bottom-up control of the communities through the stream food web. Finally, to inform regional management and conservation decisions, stream sites were partitioned according to the most important bioenvironmental constraints.
The ecological similarity was measured by geography, land use pattern and the abundances of influential native fish species within the region, and the streams were clustered into seven distinct zones using the method of affinity propagation. Interestingly, the dichotomy in proximal land use was not generally represented between zones, and the species diversity gradients were not significantly different across the zonal stream clusters. The average elevation of a given zone did not influence the community variability, while upstream pasture significantly homogenised fish communities between streams within a zone. Nonetheless, the zones were based on river-system connectivity and geographical proximity. This study showed separate effects of confounding geography (altitude) and land use on stream fish community structure, which have not been explicitly explored by previous studies. Studies with a simultaneous focus on multiple biological (e.g. fish and invertebrates) and environmental (e.g. geography, land use, stream morphology) scales at varying spatial scales are not common in freshwater ecology. Therefore, this study contributes substantially to the understanding of the spatial ecology of stream communities linked with the control of geography, land use, environment and likely biological interactions between fish and invertebrates.
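
The thesis abstract mentions Mantel tests for comparing community and environmental distance matrices but gives no procedure. As a hedged sketch of the general technique only, the code below correlates the upper triangles of two square distance matrices and estimates a one-sided p-value by jointly permuting the rows and columns of one matrix. The function name `mantel_test`, its parameters, and the use of Pearson correlation are assumptions for illustration, not the settings used in the thesis.

```python
import numpy as np

def mantel_test(dist_a, dist_b, permutations=999, seed=0):
    """Permutation-based Mantel test between two distance matrices (illustrative).

    dist_a, dist_b: symmetric (n_sites, n_sites) distance matrices
    Returns the observed correlation and a one-sided permutation p-value.
    """
    a = np.asarray(dist_a, dtype=float)
    b = np.asarray(dist_b, dtype=float)
    n = a.shape[0]
    # Use each pair of sites once: the strict upper triangle.
    upper = np.triu_indices(n, k=1)
    observed = np.corrcoef(a[upper], b[upper])[0, 1]
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(permutations):
        # Relabel the sites of one matrix, permuting rows and columns together.
        order = rng.permutation(n)
        permuted = a[np.ix_(order, order)]
        if np.corrcoef(permuted[upper], b[upper])[0, 1] >= observed:
            hits += 1
    # Count the observed arrangement as one of the permutations.
    p_value = (hits + 1) / (permutations + 1)
    return observed, p_value
```

In this sketch, `dist_a` could hold pairwise fish community dissimilarities and `dist_b` pairwise invertebrate dissimilarities or environmental distances between the same sites; a small p-value would indicate that the two spatial patterns are concordant.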

    Fronthaul-Constrained Cloud Radio Access Networks: Insights and Challenges

    As a promising paradigm for fifth generation (5G) wireless communication systems, cloud radio access networks (C-RANs) have been shown to reduce both capital and operating expenditures, as well as to provide high spectral efficiency (SE) and energy efficiency (EE). The fronthaul in such networks, defined as the transmission link between a baseband unit (BBU) and a remote radio head (RRH), requires high capacity, but is often constrained. This article comprehensively surveys recent advances in fronthaul-constrained C-RANs, including system architectures and key techniques. In particular, key techniques for alleviating the impact of constrained fronthaul on SE/EE and quality of service for users, including compression and quantization, large-scale coordinated processing and clustering, and resource allocation optimization, are discussed. Open issues in terms of software-defined networking, network function virtualization, and partial centralization are also identified. Comment: 5 figures, accepted by IEEE Wireless Communications. arXiv admin note: text overlap with arXiv:1407.3855 by other authors