77 research outputs found

    On the Usability of Probably Approximately Correct Implication Bases

    We revisit the notion of probably approximately correct implication bases from the literature and present a first formulation in the language of formal concept analysis, with the goal of investigating whether such bases are a suitable substitute for exact implication bases in practical use cases. To this end, we quantitatively examine the behavior of probably approximately correct implication bases on artificial and real-world data sets and compare their precision and recall with respect to their corresponding exact implication bases. Using a small example, we also provide qualitative insight that implications from probably approximately correct bases can still represent meaningful knowledge from a given data set. Comment: 17 pages, 8 figures; typos added, corrected x-label on graph

    Redundancy, Deduction Schemes, and Minimum-Size Bases for Association Rules

    Association rules are among the most widely employed data analysis methods in the field of Data Mining. An association rule is a form of partial implication between two sets of binary variables. In the most common approach, association rules are parameterized by a lower bound on their confidence, which is the empirical conditional probability of their consequent given the antecedent, and/or by some other parameter bounds such as "support" or deviation from independence. We study here notions of redundancy among association rules from a fundamental perspective. We see each transaction in a dataset as an interpretation (or model) in the propositional logic sense, and consider existing notions of redundancy, that is, of logical entailment, among association rules, of the form "any dataset in which this first rule holds must obey also that second rule, therefore the second is redundant". We discuss several existing alternative definitions of redundancy between association rules and provide new characterizations and relationships among them. We show that the main alternatives we discuss actually correspond to just two variants, which differ in the treatment of full-confidence implications. For each of these two notions of redundancy, we provide a sound and complete deduction calculus, and we show how to construct complete bases (that is, axiomatizations) of absolutely minimum size in terms of the number of rules. Finally, we explore an approach to redundancy with respect to several association rules, and fully characterize its simplest case of two partial premises. Comment: LMCS accepted paper
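As a minimal illustration of the two parameters named in the abstract above, the sketch below computes the support of an itemset and the confidence of a rule (the empirical conditional probability of the consequent given the antecedent) on a toy transaction set. The function names and the data are assumptions made for this example only; the paper itself concerns redundancy and minimum-size bases, not these computations.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        raise ValueError("no transactions given")
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Empirical conditional probability of the consequent given the antecedent."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    n_antecedent = sum(1 for t in transactions if a <= t)
    if n_antecedent == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    n_both = sum(1 for t in transactions if both <= t)
    return n_both / n_antecedent


# Hypothetical toy dataset: each transaction is a set of binary attributes.
TRANSACTIONS = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c"}),
]

if __name__ == "__main__":
    print(support(TRANSACTIONS, {"a", "b"}))       # 2 of 4 transactions
    print(confidence(TRANSACTIONS, {"a"}, {"b"}))  # 2 of the 3 transactions containing "a"
```

A rule is usually kept only if its confidence reaches a chosen lower bound; a rule with confidence 1 is a full implication, which is the case the two redundancy variants in the abstract treat differently.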

    Towards a Soft Evaluation and Refinement of Tagging in Digital Humanities

    In this paper we estimate the soundness of tagging in digital repositories within the field of Digital Humanities by studying the (semantic) conceptual structure behind the folksonomy. The association rules derived from this conceptual structure (the Stem and Luxenburger bases) make it possible to complete the tagging faithfully from a semantic point of view, or to suggest such a completion. Funding: Ministerio de Economía y Competitividad TIN2013-41086-P; Junta de Andalucía TIC-606

    On morphological hierarchical representations for image processing and spatial data clustering

    Hierarchical data representations in the context of classification and data clustering were put forward during the fifties. Recently, hierarchical image representations have gained renewed interest for segmentation purposes. In this paper, we briefly survey fundamental results on hierarchical clustering and then detail recent paradigms developed for the hierarchical representation of images in the framework of mathematical morphology: constrained connectivity and ultrametric watersheds. Constrained connectivity can be viewed as a way to constrain an initial hierarchy in such a way that a set of desired constraints are satisfied. The framework of ultrametric watersheds provides a generic scheme for computing any hierarchical connected clustering, in particular when such a hierarchy is constrained. The suitability of this framework for solving practical problems is illustrated with applications in remote sensing.

    A study of observation scales based on Felzenszwalb-Huttenlocher dissimilarity measure for hierarchical segmentation

    Hierarchical image segmentation provides a region-oriented scale-space, i.e., a set of image segmentations at different detail levels in which the segmentations at finer levels are nested with respect to those at coarser levels. Guimarães et al. proposed a hierarchical graph based image segmentation (HGB) method based on the Felzenszwalb-Huttenlocher dissimilarity. This HGB method computes, for each edge of a graph, the minimum scale in a hierarchy at which two regions linked by this edge should merge according to the dissimilarity. In order to generalize this method, we first propose an algorithm to compute the intervals which contain all the observation scales at which the associated regions should merge. Then, following the current trend in mathematical morphology to study criteria which are not increasing on a hierarchy, we present various strategies to select a significant observation scale in these intervals. We use the BSDS dataset to assess our observation scale selection methods. The experiments show that some of these strategies lead to better segmentation results than the ones obtained with the original HGB method.
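The notion of a "minimum scale at which two regions should merge" can be sketched for two fixed regions using the standard Felzenszwalb-Huttenlocher criterion, under which two regions joined by an edge of weight w merge at scale k when w <= min(Int(C1) + k/|C1|, Int(C2) + k/|C2|). The code below only solves this inequality for k; it is not the HGB algorithm or the interval computation proposed in the paper, and the function and parameter names are assumptions made for illustration.

```python
def should_merge(
    edge_weight: float,
    internal_diff_1: float,
    size_1: int,
    internal_diff_2: float,
    size_2: int,
    scale: float,
) -> bool:
    """Felzenszwalb-Huttenlocher merge test for two regions at a given scale k.

    The regions merge when the connecting edge weight does not exceed the
    smaller of the two relaxed internal differences Int(C) + k / |C|.
    """
    threshold_1 = internal_diff_1 + scale / size_1
    threshold_2 = internal_diff_2 + scale / size_2
    return edge_weight <= min(threshold_1, threshold_2)


def min_merge_scale(
    edge_weight: float,
    internal_diff_1: float,
    size_1: int,
    internal_diff_2: float,
    size_2: int,
) -> float:
    """Smallest non-negative scale k at which `should_merge` becomes true.

    Rearranging w <= Int(C) + k / |C| gives k >= (w - Int(C)) * |C| for each
    region; both conditions must hold, so the larger bound is taken.
    """
    bound_1 = (edge_weight - internal_diff_1) * size_1
    bound_2 = (edge_weight - internal_diff_2) * size_2
    return max(0.0, bound_1, bound_2)


if __name__ == "__main__":
    # Hypothetical regions: weight-10 edge, Int=4 with 3 pixels, Int=7 with 5 pixels.
    k = min_merge_scale(10.0, 4.0, 3, 7.0, 5)
    print(k, should_merge(10.0, 4.0, 3, 7.0, 5, k))
```

Because the test is monotone in k for fixed regions, every scale above the returned value also satisfies it; in a full hierarchy the regions themselves change with the scale, which is the situation the interval computation in the abstract addresses.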

    On the equivalence between hierarchical segmentations and ultrametric watersheds

    We study hierarchical segmentation in the framework of edge-weighted graphs. We define ultrametric watersheds as topological watersheds null on the minima. We prove that there exists a bijection between the set of ultrametric watersheds and the set of hierarchical segmentations. We end this paper by showing how to use the proposed framework in practice on the example of constrained connectivity; in particular, it allows such a hierarchy to be computed following a classical watershed-based morphological scheme, which provides an efficient algorithm to compute the whole hierarchy. Comment: 19 pages, double-column

    Large-scale unit commitment under uncertainty: an updated literature survey

    The Unit Commitment problem in energy management aims at finding the optimal production schedule of a set of generation units, while meeting various system-wide constraints. It has always been a large-scale, non-convex, difficult problem, especially in view of the fact that, due to operational requirements, it has to be solved in an unreasonably small time for its size. Recently, growing renewable energy shares have strongly increased the level of uncertainty in the system, making the (ideal) Unit Commitment model a large-scale, non-convex and uncertain (stochastic, robust, chance-constrained) program. We provide a survey of the literature on methods for the Uncertain Unit Commitment problem, in all its variants. We start with a review of the main contributions on solution methods for the deterministic versions of the problem, focussing on those based on mathematical programming techniques that are more relevant for the uncertain versions of the problem. We then present and categorize the approaches to the latter, while providing entry points to the relevant literature on optimization under uncertainty. This is an updated version of the paper "Large-scale Unit Commitment under uncertainty: a literature survey" that appeared in 4OR 13(2), 115--171 (2015); this version has over 170 more citations, most of which appeared in the last three years, proving how fast the literature on uncertain Unit Commitment evolves, and therefore the interest in this subject

    Les pièges à particules : principes, état de l'art et perspectives pour la surveillance des milieux aquatiques - focus sur les cours d'eaux

    Reviewers: Yari, A.; Ghestem, J.P. Particulate matter is central to the assessment of water bodies. A wide range of techniques is available for sampling particulate matter (TSS/sediment) in aquatic systems. The relevance of each technique is determined by the fluxes and dynamics of the particulate matter, the data requirements (limit of quantification, accuracy, uncertainty, representativeness, ...) and the available resources. These factors determine the sampling strategy and method to be adopted and how the sample should be handled (transported and stored) after collection. It is therefore essential to pay particular attention to the following question: which sampling approach(es) will provide the most representative sample? Sediment traps are collectors, boxes, or baskets placed in the water column that capture particulate matter continuously by settling. Once a trap is deployed, water passes through the system, where the flow velocity decreases, causing the particulate matter to settle in the device. The objective of this work is to evaluate the potential of sediment traps for monitoring the chemical contamination of aquatic environments.
The main observations show that sediment traps can be integrated into strategies for monitoring the chemical contamination of aquatic environments, in particular to: - Integrate the variability of contaminant concentrations in particulate matter; - Track trends in the chemical contamination of water bodies; - Improve the representativeness of the assessment of chemical contamination of aquatic environments through an integrated measurement, as a complement to integrative samplers; - Meet chemical-status monitoring requirements (EQS, whole-water fraction), as a complement to passive integrative samplers; - Estimate fluxes of particulate contaminants.

    Caractérisation de la matière organique d'eaux résiduaires et d'eaux de surface par les sondes spectrophotométriques UV-Visible

    In order to improve the chemical status assessment of surface waters, alternatives to conventional sampling and laboratory analysis methods are being developed. Among these alternatives, UV-visible spectrophotometric techniques are increasingly used because they allow in situ and continuous measurements. The UV-vis spectra obtained by commercial devices allow the quantification of suspended particulate matter (SPM), nitrates, chemical oxygen demand (COD) or total organic carbon (TOC). Numerous studies show that UV-vis spectra can also provide information on the quality of dissolved organic matter (DOM), such as the degree of aromaticity or the size distribution of organic molecules. This report investigates the potential of UV-visible spectrophotometric measurement to provide information on the quality of DOM in different types of water. The work builds on laboratory experiments and on the analysis of sets of UV-vis spectra acquired on wastewaters and surface waters. A preliminary experiment was carried out in the laboratory in 2016: eight wastewater samples (untreated raw water and treated water) from two treatment plants were physically fractionated (by sieving and filtration) to highlight the influence of different classes of particles and organic colloids on the UV-vis spectra measured by two field systems (the spectro::lyser probe and the Pastel Uviline system) and a laboratory spectrophotometer. The first results show good agreement between the two field systems for the analysis of major parameters such as SPM, nitrates, COD and TOC. These analyses could be improved by using a local calibration. A set of conventional indicators was estimated from the spectra: SUVA, absorbance ratios (250/365 nm; 465/665 nm), and spectral slopes over the 275-295 nm and 350-400 nm ranges. These indicators clearly show that the quality of DOM at the two stations is different, with changes over the sampling time for each station.
Finally, an analysis of all spectra using descriptive statistical tools (such as principal component analysis) clearly differentiates the waters and fractions according to their origin and the characteristics of DOM. The next part of the study consists of observing the variability of spectra acquired on wastewaters and surface waters and studying their spectral fingerprints (by integrating all the indicators). For this purpose, several spectral databases have been built from:
- 4 river samples and 59 wastewater samples collected along the treatment lines of 9 wastewater treatment plants (WWTPs) between 2017 and 2018. In addition to UV-vis spectra, complementary DOM characterization analyses (including the size distribution of DOM) were conducted. Comparing the spectra with the complementary data improves the interpretation of UV-vis indicators. Spectral fingerprints specific to water types (input and output waters of WWTPs) are constructed from these indicators.
- 279 spectra of wastewater (n=237) and surface water (n=42). The variability of spectral fingerprints is examined as a function of the type and origin of the water and therefore the potential source of DOM. This extensive database confirms that spectral fingerprints can be used to classify waters according to the quality of the DOM they contain.
- 2 time-series monitoring campaigns conducted on the Oise (November 2015 - February 2016) and Gier (May 2017 - May 2018) Rivers. After processing of the raw data (mainly turbidity correction), the time series of spectra illustrate the potential of UV-vis spectrophotometry for studying variations in DOM quality in rivers at scales ranging from a month to a day.
These different examples highlight critical steps to be improved (such as the turbidity compensation of spectra) to facilitate the use and interpretation of data from UV-vis spectrophotometric probes in order to accurately characterize DOM in wastewaters and surface waters.
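The conventional indicators named in the abstract above (SUVA, absorbance ratios, spectral slopes) can be sketched as follows. The abstract does not give formulas, so the definitions here are assumptions: SUVA is taken at 254 nm as absorbance per metre divided by dissolved organic carbon in mg/L, and the spectral slope is taken as the negative slope of a least-squares fit of ln(absorbance) against wavelength. All function names and the synthetic spectrum are hypothetical.

```python
import math
from typing import Dict


def absorbance_ratio(spectrum: Dict[int, float], num_nm: int, den_nm: int) -> float:
    """Ratio of absorbances at two wavelengths, e.g. 250/365 or 465/665 nm."""
    return spectrum[num_nm] / spectrum[den_nm]


def spectral_slope(spectrum: Dict[int, float], lo_nm: int, hi_nm: int) -> float:
    """Negative slope of ln(absorbance) versus wavelength over [lo_nm, hi_nm].

    Assumes an exponential decay model A(wl) ~ exp(-S * wl) and fits it by
    ordinary least squares on the log-transformed absorbances.
    """
    pts = [
        (float(wl), math.log(a))
        for wl, a in spectrum.items()
        if lo_nm <= wl <= hi_nm and a > 0.0
    ]
    if len(pts) < 2:
        raise ValueError("need at least two positive absorbances in the range")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return -sxy / sxx


def suva254(absorbance_254: float, path_length_cm: float, doc_mg_per_l: float) -> float:
    """SUVA at 254 nm in L mg^-1 m^-1 (assumed definition, see lead-in)."""
    absorbance_per_m = absorbance_254 * 100.0 / path_length_cm
    return absorbance_per_m / doc_mg_per_l


# Synthetic spectrum for illustration only: pure exponential decay, 5 nm steps.
TRUE_SLOPE = 0.015
SPECTRUM = {wl: 0.5 * math.exp(-TRUE_SLOPE * (wl - 250)) for wl in range(200, 701, 5)}

if __name__ == "__main__":
    print(absorbance_ratio(SPECTRUM, 250, 365))
    print(absorbance_ratio(SPECTRUM, 465, 665))
    print(spectral_slope(SPECTRUM, 275, 295))
    print(suva254(0.1, 1.0, 2.5))
```

On real probe data, turbidity compensation would have to be applied before computing these indicators, which is one of the critical steps the abstract singles out.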