35 research outputs found

    Outlier Detection Methods for Industrial Applications

    An outlier is an observation (or measurement) that differs markedly from the other values in a given dataset. Outliers can arise from several causes: the measurement may have been incorrectly observed, recorded, or entered into the process computer, or the observed datum may come from a different population than the normal situation, in which case it is correctly measured but represents a rare event. The literature offers several definitions of an outlier; the most commonly cited are:
    - "An outlier is an observation that deviates so much from other observations as to arouse suspicions that it was generated by a different mechanism" (Hawkins, 1980).
    - "An outlier is an observation (or subset of observations) which appear to be inconsistent with the remainder of the dataset" (Barnett & Lewis, 1994).
    - "An outlier is an observation that lies outside the overall pattern of a distribution" (Moore and McCabe, 1999).
    - "Outliers are those data records that do not follow any pattern in an application" (Chen et al., 2002).
    - "An outlier in a set of data is an observation or a point that is considerably dissimilar or inconsistent with the remainder of the data" (Ramaswamy et al., 2000).
    Many data mining algorithms try to minimize the influence of outliers, for instance on the final model being developed, or to eliminate them in the data pre-processing phase. However, a data miner should be careful when automatically detecting and eliminating outliers because, if the data are correct, their elimination can cause the loss of important hidden information (Kantardzic, 2003). Some data mining applications focus on outlier detection itself, and there the outliers are the essential result of the data analysis (Sane & Ghatol, 2006). Outlier detection techniques find applications in credit card fraud, network robustness analysis, network intrusion detection, financial applications and marketing (Han & Kamber, 2001).
    A more exhaustive list of applications that exploit outlier detection is provided below (Hodge, 2004):
    - Fraud detection: fraudulent applications for credit cards or state benefits, or fraudulent usage of credit cards or mobile phones.
    - Loan application processing: fraudulent applications or potentially problematic customers.
    - Intrusion detection, such as unauthorized access in computer networks

    Evolution of Retail Distribution and Consumer Behaviour: The Case of the Designer Outlet in Barberino del Mugello

    Starting from an analysis of the evolutionary stages of retail distribution and of consumer purchasing behaviour, this work aims to analyse the phenomenon of factory outlet centres and the socio-economic repercussions that these large retail structures have on the surrounding territory. To this end, it analyses the case of the Designer Outlet in Barberino del Mugello and the effects that it will have on the Mugello area.

    Improving the stability of sequential forward variables selection


    Learners Reliability Estimated Through Neural Networks Applied to Build a Novel Hybrid Ensemble Method

    This paper presents a novel hybrid ensemble method aimed at improving model accuracy in regression tasks. The basic idea is to build an ensemble composed of a strong learner, trained on the whole training dataset, and a set of specialised weak learners, each trained on data from a limited region of the input space determined by clustering based on a self-organising map. Several methods were tested for designing the learners, including a hierarchical approach. In the simulation phase, the strong and weak learners operate according to their pointwise self-estimated reliabilities, so as to exploit their strengths and overcome their weaknesses. The method was tested on literature and real-world datasets and achieved competitive results, outperforming other ensemble methods on most of the tested datasets and reducing the average absolute error by up to 10%.
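
    The strong-plus-weak-learner idea in the abstract above can be illustrated with a rough sketch. It is not the authors' implementation, and it rests on several assumptions that do not come from the paper: plain k-means stands in for the self-organising-map clustering, every learner is a simple least-squares linear model, and the "reliability" of a learner at a query point is approximated by the inverse of its mean absolute training error on the nearest training samples (the paper estimates reliabilities with neural networks instead). All function names and parameters (`fit_hybrid`, `predict_hybrid`, `k`, `m`, `eps`) are hypothetical.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares linear model with intercept; returns the coefficients."""
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    return np.hstack([X, np.ones((len(X), 1))]) @ coef


def assign(X, centres):
    """Index of the nearest centre for every row of X."""
    d = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
    return d.argmin(axis=1)


def kmeans(X, k, n_iter=20, seed=0):
    """Plain k-means (ASSUMPTION: a stand-in for SOM-based clustering)."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = assign(X, centres)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = X[labels == j].mean(axis=0)
    return centres


def fit_hybrid(X, y, k=2):
    centres = kmeans(X, k)
    labels = assign(X, centres)
    strong = fit_linear(X, y)          # strong learner: whole training set
    min_pts = X.shape[1] + 1           # too-small clusters reuse the strong model
    weak = [fit_linear(X[labels == j], y[labels == j])
            if np.sum(labels == j) >= min_pts else strong
            for j in range(k)]         # weak learners: one per region
    # Absolute training errors, used later as a crude reliability proxy.
    strong_err = np.abs(y - predict_linear(strong, X))
    weak_err = np.stack([np.abs(y - predict_linear(c, X)) for c in weak])
    return dict(X=X, centres=centres, strong=strong, weak=weak,
                strong_err=strong_err, weak_err=weak_err)


def predict_hybrid(model, Xq, m=5, eps=1e-9):
    labels = assign(Xq, model["centres"])          # region of each query point
    d = ((Xq[:, None, :] - model["X"][None, :, :]) ** 2).sum(axis=-1)
    nn = np.argsort(d, axis=1)[:, :m]              # m nearest training samples
    e_s = model["strong_err"][nn].mean(axis=1)     # local error, strong learner
    e_w = model["weak_err"][labels[:, None], nn].mean(axis=1)  # local error, weak
    w_s, w_w = 1.0 / (eps + e_s), 1.0 / (eps + e_w)  # ASSUMED reliability weights
    p_s = predict_linear(model["strong"], Xq)
    p_all = np.stack([predict_linear(c, Xq) for c in model["weak"]])
    p_w = p_all[labels, np.arange(len(Xq))]
    return (w_s * p_s + w_w * p_w) / (w_s + w_w)   # reliability-weighted blend
```

    On a target such as y = |x|, a single global linear model fits poorly, while the two regional models fit each branch well, so the reliability weights shift the blended prediction towards the regional learners. Whether this mirrors the behaviour reported in the paper depends on the assumptions listed above.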