
    Data mining based cyber-attack detection


    Load curve data cleansing and imputation via sparsity and low rank

    The smart grid vision is to build an intelligent power network with an unprecedented level of situational awareness and controllability over its services and infrastructure. This paper advocates statistical inference methods to robustify power monitoring tasks against the outlier effects owing to faulty readings and malicious attacks, as well as against missing data due to privacy concerns and communication errors. In this context, a novel load cleansing and imputation scheme is developed leveraging the low intrinsic dimensionality of spatiotemporal load profiles and the sparse nature of "bad data." A robust estimator based on principal components pursuit (PCP) is adopted, which effects a twofold sparsity-promoting regularization through an ℓ1-norm of the outliers, and the nuclear norm of the nominal load profiles. Upon recasting the non-separable nuclear norm into a form amenable to decentralized optimization, a distributed (D-)PCP algorithm is developed to carry out the imputation and cleansing tasks using networked devices comprising the so-termed advanced metering infrastructure. If D-PCP converges and a qualification inequality is satisfied, the novel distributed estimator provably attains the performance of its centralized PCP counterpart, which has access to all networkwide data. Computer simulations and tests with real load curve data corroborate the convergence and effectiveness of the novel D-PCP algorithm.
    Comment: 8 figures; submitted to IEEE Transactions on Smart Grid, Special Issue on "Optimization Methods and Algorithms Applied to Smart Grid"
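
    For intuition, the centralized PCP estimator underlying D-PCP decomposes the observed load matrix M into a low-rank component L (nominal profiles) plus a sparse component S (outliers) by minimizing ||L||_* + λ||S||_1 subject to L + S = M. Below is a minimal single-machine sketch using the standard inexact augmented Lagrangian iteration; the step-size heuristics and variable names are illustrative assumptions, not the paper's distributed algorithm.

```python
# Minimal centralized PCP (robust PCA) sketch via the inexact augmented
# Lagrangian method. M = observed loads, L = low-rank nominal profiles,
# S = sparse outliers. Step sizes are common defaults assumed here for
# illustration; this is not the paper's distributed D-PCP algorithm.
import numpy as np

def shrink(X, tau):
    """Soft thresholding: proximal operator of the l1-norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_shrink(X, tau):
    """Singular-value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(shrink(s, tau)) @ Vt

def pcp(M, lam=None, mu=None, tol=1e-7, max_iter=500):
    m, n = M.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))       # standard PCP weight
    mu = mu or 0.25 * m * n / np.abs(M).sum()   # common step-size heuristic
    L = np.zeros_like(M); S = np.zeros_like(M); Y = np.zeros_like(M)
    for _ in range(max_iter):
        L = svd_shrink(M - S + Y / mu, 1.0 / mu)  # nuclear-norm step
        S = shrink(M - L + Y / mu, lam / mu)      # l1 outlier step
        R = M - L - S                             # constraint residual
        Y += mu * R                               # dual ascent
        if np.linalg.norm(R) <= tol * np.linalg.norm(M):
            break
    return L, S  # cleansed profiles and flagged "bad data"
```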

    Big Data Caching for Networking: Moving from Cloud to Edge

    In order to cope with the relentless data tsunami in 5G wireless networks, current approaches such as acquiring new spectrum, deploying more base stations (BSs) and increasing nodes in mobile packet core networks are becoming ineffective in terms of scalability, cost and flexibility. In this regard, context-aware 5G networks with edge/cloud computing and exploitation of big data analytics can yield significant gains to mobile operators. In this article, proactive content caching in 5G wireless networks is investigated and a big-data-enabled architecture is proposed. In this practical architecture, a vast amount of data is harnessed for content popularity estimation, and strategic contents are cached at the BSs to achieve higher user satisfaction and backhaul offloading. To validate the proposed solution, we consider a real-world case study in which several hours of mobile data traffic collected from a major telecom operator in Turkey are analyzed using tools from machine learning. Based on the available information and storage capacity, numerical studies show that gains are achieved both in terms of user satisfaction and backhaul offloading. For example, in the case of 16 BSs with 30% of content ratings and 13 Gbyte of storage size (78% of total library size), proactive caching yields 100% user satisfaction and offloads 98% of the backhaul.
    Comment: accepted for publication in IEEE Communications Magazine, Special Issue on Communications, Caching, and Computing for Content-Centric Mobile Network
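
    At its core, the proposed architecture estimates content popularity from collected traffic data and fills each BS cache with the most valuable contents subject to a storage budget. The toy sketch below ranks contents by requests per byte and caches greedily; the Zipf-distributed demand and the simple count-based popularity estimate are illustrative assumptions (the article uses collaborative filtering on sparse content ratings).

```python
# Toy sketch of proactive caching: rank contents by estimated popularity
# per unit size and fill the cache greedily until storage runs out.
# All numbers below are illustrative assumptions, not the case-study data.
import numpy as np

def cache_top_contents(request_counts, sizes, capacity):
    """Greedily cache contents with the highest requests-per-byte."""
    order = np.argsort(-request_counts / sizes)  # best value first
    cached, used = [], 0.0
    for c in order:
        if used + sizes[c] <= capacity:
            cached.append(c)
            used += sizes[c]
    return set(cached)

rng = np.random.default_rng(0)
n_contents = 100
requests = rng.zipf(1.8, size=n_contents).astype(float)  # skewed demand
sizes = rng.uniform(0.5, 2.0, size=n_contents)           # Gbyte per content
cache = cache_top_contents(requests, sizes, capacity=30.0)

hit = requests[list(cache)].sum() / requests.sum()
print(f"backhaul offloading from this cache: {hit:.1%}")
```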

    Towards a framework for designing full model selection and optimization systems

    People from a variety of industrial domains are beginning to realise that appropriate use of machine learning techniques for their data mining projects could bring great benefits. End-users now face the new problem of how to choose a combination of data processing tools and algorithms for a given dataset. This problem is usually termed the Full Model Selection (FMS) problem. Extending our previous work [10], in this paper we introduce a framework for designing FMS algorithms. Under this framework, we propose a novel algorithm named GPS (for GA-PSO-FMS) that combines genetic algorithms (GA) and particle swarm optimization (PSO): a GA searches for the optimal structure of a data mining solution, and PSO searches for the optimal parameters of a particular structure instance. Given a classification dataset, GPS outputs an FMS solution as a directed acyclic graph consisting of the diverse data mining operators available to the problem. Experimental results demonstrate the benefit of the algorithm. We also present, with detailed analysis, two model-tree-based variants for speeding up the GPS algorithm.
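
    The nesting described above, a GA over discrete pipeline structures with PSO tuning the continuous parameters of each candidate, can be sketched in a few lines. The tiny structure space, naive mutation, and synthetic fitness function below are illustrative assumptions, not the GPS operators from the paper.

```python
# Sketch of the GA-outer / PSO-inner nesting: the GA proposes pipeline
# structures, PSO tunes one continuous parameter per structure. The
# fitness function stands in for cross-validated accuracy (assumption).
import random

STRUCTURES = [("standardize", "svm"), ("standardize", "knn"),
              ("minmax", "svm"), ("minmax", "knn")]

def fitness(structure, param):
    """Stand-in for cross-validated accuracy of the assembled pipeline."""
    base = {"svm": 0.80, "knn": 0.75}[structure[1]]
    bonus = 0.05 if structure[0] == "standardize" else 0.0
    return base + bonus - (param - 1.0) ** 2  # peaked at param = 1.0

def pso_tune(structure, n_particles=10, iters=30):
    """Inner PSO loop: tune the parameter of a fixed structure."""
    pos = [random.uniform(0.0, 2.0) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = pos[:]
    gbest = max(pbest, key=lambda p: fitness(structure, p))
    for _ in range(iters):
        for i in range(n_particles):
            r1, r2 = random.random(), random.random()
            vel[i] = (0.7 * vel[i] + 1.5 * r1 * (pbest[i] - pos[i])
                      + 1.5 * r2 * (gbest - pos[i]))
            pos[i] += vel[i]
            if fitness(structure, pos[i]) > fitness(structure, pbest[i]):
                pbest[i] = pos[i]
        gbest = max(pbest, key=lambda p: fitness(structure, p))
    return gbest, fitness(structure, gbest)

def ga_search(generations=5, pop_size=4):
    """Outer GA loop: vary structures, score each via the PSO inner loop."""
    pop = random.sample(STRUCTURES, pop_size)
    best = max((pso_tune(s) + (s,) for s in pop), key=lambda t: t[1])
    for _ in range(generations):
        pop = [random.choice(STRUCTURES) for _ in pop]  # naive mutation
        cand = max((pso_tune(s) + (s,) for s in pop), key=lambda t: t[1])
        best = max(best, cand, key=lambda t: t[1])
    return best  # (best parameter, best score, best structure)

print(ga_search())
```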

    MARVEL: measured active rotational-vibrational energy levels

    An algorithm is proposed, based principally on an earlier proposition of Flaud and co-workers [Mol. Phys. 32 (1976) 499], that inverts the information contained in uniquely assigned experimental rotational-vibrational transitions in order to obtain measured active rotational-vibrational energy levels (MARVEL). The procedure starts with collecting, critically evaluating, selecting, and compiling all available measured transitions, including assignments and uncertainties, into a single database. Then, spectroscopic networks (SNs) are determined which contain all interconnecting rotational-vibrational energy levels supported by the grand database of the selected transitions. Adjustment of the uncertainties of the lines is performed next, with the help of a robust weighting strategy, until a self-consistent set of lines and uncertainties is achieved. Inversion of the transitions through a weighted least-squares-type procedure results in MARVEL energy levels and associated uncertainties. Local sensitivity coefficients can be computed for each energy level. The resulting set of MARVEL levels is called active because, when new experimental measurements become available, the same evaluation, adjustment, and inversion procedure should be repeated in order to obtain more dependable energy levels and uncertainties. MARVEL is tested on the H₂¹⁷O isotopologue of water, for which a list of 2736 dependable energy levels, based on 8369 transitions, has been obtained.
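
    Since each assigned transition constrains the difference of two energy levels, stacking all transitions yields a sparse linear system that a weighted least-squares solve inverts for the levels, with the ground state pinned at zero. A minimal sketch follows; the toy level labels, wavenumbers, and uncertainties are illustrative assumptions, not data from the paper.

```python
# MARVEL-style inversion sketch: each transition gives one linear
# constraint E_up - E_lo = sigma, weighted by 1/uncertainty. Toy data.
import numpy as np

# (upper level, lower level, wavenumber / cm^-1, uncertainty / cm^-1)
transitions = [(1, 0, 100.02, 0.01),
               (2, 1, 150.05, 0.02),
               (2, 0, 250.00, 0.02)]
n_levels = 3

A = np.zeros((len(transitions), n_levels))
b = np.zeros(len(transitions))
w = np.zeros(len(transitions))
for i, (up, lo, sigma, unc) in enumerate(transitions):
    A[i, up], A[i, lo] = 1.0, -1.0   # E_up - E_lo = sigma
    b[i], w[i] = sigma, 1.0 / unc    # weight each row by 1/uncertainty

# Pin level 0 at zero energy by dropping its column, then solve.
E, *_ = np.linalg.lstsq(w[:, None] * A[:, 1:], w * b, rcond=None)
print(np.concatenate([[0.0], E]))  # MARVEL-style energy levels
```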

    Predicting 'Attention Deficit Hyperactive Disorder' using large scale child data set

    Attention deficit hyperactivity disorder (ADHD) is a disorder found in children, affecting about 9.5% of American children aged 13 years or more. Every year, the number of children diagnosed with ADHD increases. There is no single test that can diagnose ADHD. In fact, a health practitioner has to analyze the behavior of the child to determine whether the child has ADHD, gathering information about the child and his/her behavior and environment. Because of these difficulties in diagnosis, I propose to use machine learning techniques to predict ADHD from a large-scale child data set. Machine learning offers a principled approach for developing sophisticated, automatic, and objective algorithms for the analysis of disease. Many new approaches have emerged that deepen understanding and enable advanced analysis, and the use of classification models has had a significant impact on the detection and diagnosis of diseases. I propose to use binary classification techniques for the detection and diagnosis of ADHD.
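
    A minimal sketch of such a binary-classification setup follows: train a classifier on per-child features with an ADHD label and report held-out performance. The synthetic features and the choice of logistic regression are illustrative assumptions; the thesis targets a real large-scale child data set.

```python
# Binary-classification sketch for ADHD prediction on synthetic data.
# Feature count, label rule, and model choice are assumptions for
# illustration only; real behavioral/survey features would replace X.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 20))                 # stand-in survey features
y = (X[:, 0] + 0.5 * X[:, 1]                    # synthetic ADHD label
     + rng.normal(scale=1.0, size=5000) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```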

    A Survey on IT-Techniques for a Dynamic Emergency Management in Large Infrastructures

    This deliverable is a survey of the IT techniques that are relevant to the three use cases of the project EMILI. It describes the state of the art in four complementary IT areas: data cleansing, supervisory control and data acquisition, wireless sensor networks, and complex event processing. Even though the deliverable's authors have tried to avoid overly technical language and to explain every concept referred to, the deliverable may still seem rather technical to readers as yet unfamiliar with the techniques it describes.

    Structural health monitoring of offshore wind turbines: A review through the Statistical Pattern Recognition Paradigm

    Offshore wind has become the most profitable renewable energy source due to the remarkable development it has experienced in Europe over the last decade. In this paper, a review of Structural Health Monitoring Systems (SHMS) for offshore wind turbines (OWT) is carried out, treating the topic as a Statistical Pattern Recognition problem. Accordingly, each stage of this paradigm is reviewed with a focus on OWT applications. These stages are: Operational Evaluation; Data Acquisition, Normalization and Cleansing; Feature Extraction and Information Condensation; and Statistical Model Development. It is expected that, by optimizing each stage, SHMS can contribute to the development of efficient Condition-Based Maintenance strategies. Optimizing this strategy will help reduce the labor costs of OWT inspection, avoid unnecessary maintenance, identify design weaknesses before failure, and improve the availability of power production while preventing wind turbine overloading, thereby maximizing the return on investment. In the forthcoming years, a growing interest in SHM technologies for OWT is expected, enhancing the potential of wind farm deployments further offshore. Increasing efficiency in operational management will contribute towards achieving the UK's 2020 and 2050 targets, ultimately by reducing the Levelised Cost of Energy (LCOE).
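
    As a concrete instance of the Statistical Model Development stage, a common novelty-detection baseline learns the distribution of features from the healthy structure and flags observations whose Mahalanobis distance exceeds a chi-square threshold. The synthetic vibration features and the 99% threshold below are illustrative assumptions, not a method prescribed by the review.

```python
# Novelty-detection sketch for SHM: fit a baseline feature distribution
# on healthy data, flag new observations by Mahalanobis distance.
# Synthetic features and the 99% chi-square threshold are assumptions.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)
healthy = rng.normal(size=(500, 4))             # baseline feature vectors
mu = healthy.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(healthy, rowvar=False))

def mahalanobis_sq(x):
    """Squared Mahalanobis distance to the healthy baseline."""
    d = x - mu
    return float(d @ cov_inv @ d)

threshold = chi2.ppf(0.99, df=4)                # novelty boundary
new_obs = rng.normal(loc=1.5, size=4)           # possibly damaged state
print("damage flagged:", mahalanobis_sq(new_obs) > threshold)
```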