2,180 research outputs found

    Towards mining trapezoidal data streams

    © 2015 IEEE. We study a new problem of learning from doubly-streaming data, where both data volume and feature space increase over time. We refer to this problem as mining trapezoidal data streams. It is challenging because existing online learning, online feature selection and streaming feature selection algorithms are inapplicable when data volume and feature space grow simultaneously. We propose a new Sparse Trapezoidal Streaming Data mining algorithm (STSD) and two variants, which combine online learning and online feature selection to enable learning from trapezoidal data streams with infinite training instances and features. Specifically, when new training instances carrying new features arrive, the classifier updates the existing features by following the passive-aggressive update rule used in online learning and updates the new features according to the structural risk minimization principle. Feature sparsity is introduced using projected truncation techniques. Extensive experiments on UCI data sets demonstrate the performance of the proposed algorithms.
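
    The update described in this abstract can be illustrated with a rough sketch. This is not the paper's STSD algorithm: it applies a plain passive-aggressive (PA-I) step to a weight vector that is zero-padded whenever an instance arrives with more features, and it stands in a simple keep-the-largest-weights truncation for the projected truncation step. The function names `pa_update` and `truncate`, the aggressiveness parameter `C`, and the `keep` budget are assumptions made for illustration.

```python
def pa_update(w, x, y, C=1.0):
    """One PA-I step on instance (x, y), with y in {-1, +1}.

    Assumption: the feature space only grows, so len(x) >= len(w).
    Weights for newly arrived features start at zero.
    """
    if len(x) < len(w):
        raise ValueError("feature space is assumed to grow, never shrink")
    # Zero-pad the weight vector to cover the new features.
    w = list(w) + [0.0] * (len(x) - len(w))
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - margin)          # hinge loss
    sq_norm = sum(xi * xi for xi in x)
    if loss > 0.0 and sq_norm > 0.0:
        tau = min(C, loss / sq_norm)       # PA-I step size
        w = [wi + tau * y * xi for wi, xi in zip(w, x)]
    return w


def truncate(w, keep):
    """Zero all but the `keep` largest-magnitude weights (sparsity stand-in)."""
    if keep >= len(w):
        return list(w)
    ranked = sorted(range(len(w)), key=lambda i: abs(w[i]), reverse=True)
    kept = set(ranked[:keep])
    return [wi if i in kept else 0.0 for i, wi in enumerate(w)]
```

    In a streaming loop, one would call `pa_update` on each arriving instance and then `truncate` to bound the number of active features; how the paper balances the two steps is not specified in the abstract.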

    Personalized Temporal Medical Alert System

    The continuously growing needs in telemedicine and healthcare accentuate the need for well-adapted medical alert systems. Such alert systems may be used by a variety of patients and medical actors, and should allow monitoring of a wide range of medical variables. This paper proposes Tempas, a personalized temporal alert system. It facilitates customized alert configuration by using linguistic trends. The trend detection algorithm is based on data normalization, time series segmentation, and segment classification. It improves on the state of the art by treating irregular and regular time series appropriately, thanks to the introduction of a valid time for observation variables. Alert detection is enriched with quality and applicability measures, which allow personalized tuning of the system to help reduce false-negative and false-positive alerts.
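
    The three stages named in this abstract (normalization, segmentation, segment classification) can be sketched as follows. This is a hypothetical illustration, not the Tempas algorithm: min-max normalization, slope-based labels, the threshold `eps`, and the greedy merging of adjacent intervals with the same label are all assumptions. Slopes are computed from the actual time differences, so irregularly sampled series are handled the same way as regular ones.

```python
def normalize(values):
    """Min-max normalize to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def label(slope, eps):
    """Classify a slope into a linguistic trend (labels are illustrative)."""
    if slope > eps:
        return "increasing"
    if slope < -eps:
        return "decreasing"
    return "stable"


def detect_trends(times, values, eps=0.05):
    """Return a list of (start_time, end_time, trend_label) segments.

    Assumption: `times` is strictly increasing; spacing may be irregular.
    """
    norm = normalize(values)
    segments = []
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        lab = label((norm[i] - norm[i - 1]) / dt, eps)
        if segments and segments[-1][2] == lab:
            # Extend the current segment while the trend label is unchanged.
            segments[-1] = (segments[-1][0], times[i], lab)
        else:
            segments.append((times[i - 1], times[i], lab))
    return segments
```

    An alert rule could then be expressed over the returned labels, for example "raise an alert if the value has been increasing for more than a given duration"; the quality and applicability measures mentioned in the abstract are not modeled here.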

    Auto-tuning Distributed Stream Processing Systems using Reinforcement Learning

    Fine-tuning distributed systems is considered a craft that relies on intuition and experience. It becomes even more challenging when the systems need to react in near real time, as streaming engines must do to maintain pre-agreed service quality metrics. In this article, we present an automated approach that combines supervised and reinforcement learning methods to recommend the most appropriate lever configurations based on previous load. With this approach, streaming engines can be tuned automatically, without requiring a human to determine the right way and proper time to deploy them. This opens the door to new configurations that are not being applied today, since the complexity of managing these systems has surpassed the abilities of human experts. We show how reinforcement learning systems can find substantially better configurations in less time than their human counterparts and adapt to changing workloads.

    Evolving granular systems (Sistemas granulares evolutivos)

    Advisor: Fernando Antonio Campos Gomide. Thesis (doctorate), Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
    Abstract: In recent years there has been increasing interest in computational modeling approaches to deal with real-world data streams. Methods and algorithms have been proposed to uncover meaningful knowledge from very large (often unbounded) data sets that in principle have no apparent value. This thesis introduces a framework for evolving granular modeling of uncertain data streams. Evolving granular systems comprise an array of online modeling approaches inspired by the way in which humans deal with complexity. These systems explore the information flow in dynamic environments and derive from it models that can be linguistically understood. In particular, information granulation is a natural technique to dispense with unnecessary details and emphasize transparency, interpretability and scalability of information systems. Uncertain (granular) data arise from imprecise perception or description of the value of a variable. Broadly stated, various factors can affect one's choice of data representation such that the representing object conveys the meaning of the concept it is being used to represent.
    Of particular concern to this work are numerical, interval, and fuzzy types of granular data; and interval, fuzzy, and neurofuzzy modeling frameworks. Learning in evolving granular systems is based on incremental algorithms that build the model structure from scratch, without prior knowledge of the process, on a per-sample basis and adapt model parameters whenever necessary. This learning paradigm is particularly important because it avoids redesigning and retraining models when the environment changes. Application examples in classification, function approximation, time-series prediction and control using real and synthetic data illustrate the usefulness of the granular approaches and framework proposed. The behavior of nonstationary data streams with gradual and abrupt regime shifts is also analyzed in the realm of evolving granular computing. We shed light upon the role of interval, fuzzy, and neurofuzzy computing in processing uncertain data and providing high-quality approximate solutions and rule summaries of input-output data sets. The approaches and framework introduced constitute a natural extension of evolving intelligent systems over numeric data streams to evolving granular systems over granular data streams.
    Degree: Doctorate in Electrical Engineering (Automation).

    Do contaminants originating from state-of-the-art treated wastewater impact the ecological quality of surface waters?

    Since the 1980s, advances in wastewater treatment technology have led to considerably improved surface water quality in the urban areas of many high income countries. However, trace concentrations of organic wastewater-associated contaminants may still pose a key environmental hazard impairing the ecological quality of surface waters. To identify key impact factors, we analyzed the effects of a wide range of anthropogenic and environmental variables on the aquatic macroinvertebrate community. We assessed ecological water quality at 26 sampling sites in four urban German lowland river systems with a 0–100% load of state-of-the-art biological activated sludge treated wastewater. The chemical analysis suite comprised 12 organic contaminants (five phosphor organic flame retardants, two musk fragrances, bisphenol A, nonylphenol, octylphenol, diethyltoluamide, terbutryn), 16 polycyclic aromatic hydrocarbons, and 12 heavy metals. Non-metric multidimensional scaling identified organic contaminants that are mainly wastewater-associated (i.e., phosphor organic flame retardants, musk fragrances, and diethyltoluamide) as a major impact variable on macroinvertebrate species composition. The structural degradation of streams was also identified as a significant factor. Multiple linear regression models revealed a significant impact of organic contaminants on invertebrate populations, in particular on Ephemeroptera, Plecoptera, and Trichoptera species. Spearman rank correlation analyses confirmed wastewater-associated organic contaminants as the most significant variable negatively impacting the biodiversity of sensitive macroinvertebrate species. In addition to increased aquatic pollution with organic contaminants, a greater wastewater fraction was accompanied by a slight decrease in oxygen concentration and an increase in salinity. This study highlights the importance of reducing the wastewater-associated impact on surface waters. 
For aquatic ecosystems in urban areas this would lead to: (i) improvement of the ecological integrity, (ii) reduction of biodiversity loss, and (iii) faster achievement of objectives of legislative requirements, e.g., the European Water Framework Directive.
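
    For readers unfamiliar with the Spearman rank correlation used in this study, a minimal sketch follows. It computes the Pearson correlation of average ranks, which is the standard definition when ties are present. The function names are hypothetical, and nothing here reproduces the study's data or analysis pipeline.

```python
def average_ranks(xs):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(a), average_ranks(b))
```

    A coefficient near -1 for, say, a contaminant concentration against a count of sensitive taxa would indicate a strong monotonic negative association, which is the kind of relationship the abstract reports.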

    Online Deep Learning from Doubly-Streaming Data

    This paper investigates a new online learning problem with doubly-streaming data, where the data streams are described by feature spaces that constantly evolve, with new features emerging and old features fading away. A plausible idea to deal with such data streams is to establish a relationship between the old and new feature spaces, so that an online learner can leverage the knowledge learned from the old features to better the learning performance on the new features. Unfortunately, this idea does not scale up to high-dimensional multimedia data with complex feature interplay, which suffers a tradeoff between onlineness, which biases shallow learners, and expressiveness, which requires deep models. Motivated by this, we propose a novel OLD3S paradigm, where a shared latent subspace is discovered to summarize information from the old and new feature spaces, building an intermediate feature mapping relationship. A key trait of OLD3S is to treat the model capacity as a learnable semantics, aiming to yield optimal model depth and parameters jointly in accordance with the complexity and non-linearity of the input data streams in an online fashion. Both theoretical analysis and empirical studies substantiate the viability and effectiveness of our proposed approach. The code is available online at https://github.com/X1aoLian/OLD3S

    Catchment controls of denitrification and nitrous oxide production rates in headwater remediated agricultural streams

    Heavily modified headwater streams and open ditches carry high nitrogen loads from agricultural soils that sustain eutrophication and poor water quality in downstream aquatic ecosystems. To remediate agricultural streams and reduce the export of nitrate (NO3-), phosphorus and suspended sediments, two-stage ditches with constructed floodplains can be implemented as countermeasures. By extending hydrological connectivity between the stream channel and riparian corridor within constructed floodplains, these remediated ditches enhance the removal of NO3- via the microbial denitrification process. Ten remediated ditches were paired with upstream trapezoidal ditches in Sweden across different soils and land uses to measure the capacity for denitrification and nitrous oxide (N2O) production and yields under denitrifying conditions in stream and floodplain sediments. To examine the controls for denitrification, water quality was monitored monthly and flow discharge continuously along reaches. Floodplain sediments accounted for 33% of total denitrification capacity of remediated ditches, primarily controlled by inundation and stream NO3- concentrations. Despite reductions in flow-weighted NO3- concentrations along reaches, NO3- removal in remediated ditches via denitrification can be masked by inputs of nitrate-rich groundwaters, typical of intensively managed agricultural landscapes. Although N2O production rates were 50% lower in floodplains compared to the stream, remediated ditches emitted more N2O than conventional trapezoidal ditches. Higher denitrification rates and reductions of N2O proportions were predicted by catchments with loamy soils, higher proportions of agricultural land use and lower floodplain elevations. For realizing enhanced NO3- removal from floodplains and avoiding increased N2O emissions, soil type, land use and the design of floodplains need to be considered when implementing remediated streams.
Further, we stress the need for assessing the impact of stream remediation in the context of broader catchment processes, to determine the overall potential for improving water quality.