58 research outputs found

    Towards development of fuzzy spatial datacubes : fundamental concepts with example for multidimensional coastal erosion risk assessment and representation

    Current Geospatial Business Intelligence (GeoBI) systems typically do not take into account the uncertainty related to vagueness and fuzziness of objects; they assume that the objects have well-defined and exact semantics, geometry, and temporality. Representing fuzzy zones by polygons with well-defined boundaries is an example of such an approximation. This thesis uses an application in Coastal Erosion Risk Analysis (CERA) to illustrate the problems. 
CERA polygons are created by aggregating a set of spatial units defined by either the stakeholders’ interests or national census divisions. Despite the spatiotemporal variation of the multiple criteria involved in estimating the extent of coastal erosion risk, each polygon typically has a single risk value attributed homogeneously across its spatial extent. In reality, the risk value changes gradually within polygons and from one polygon to the next. Therefore, the transition from one zone to another is not properly represented by crisp object models. The main objective of this thesis is to develop a new approach that combines the GeoBI paradigm with fuzzy concepts to account for spatial uncertainty in the representation of risk zones. Ultimately, we assume this should improve coastal erosion risk assessment. To do so, a comprehensive GeoBI-based conceptual framework is developed with an application to Coastal Erosion Risk Assessment (CERA). Then, a fuzzy-based risk representation approach is developed to handle the inherent spatial uncertainty related to the vagueness and fuzziness of objects. Fuzzy membership functions are defined from an expert-based vulnerability index. Instead of determining well-defined boundaries between risk zones, the proposed approach permits a smooth transition from one zone to another. The membership values of multiple indicators (e.g. slope and elevation of the region under study, infrastructure, houses, hydrological network and so on) are then aggregated using the risk formula and fuzzy IF-THEN rules to represent risk zones. The key elements of a fuzzy spatial datacube are also formally defined by combining fuzzy set theory and the GeoBI paradigm, along with some fuzzy spatial aggregation operators. The main contribution of this study is the combination of fuzzy set theory and GeoBI, which makes spatial knowledge discovery more compatible with human reasoning and perception. 
Hence, an analytical conceptual framework was proposed, based on the GeoBI paradigm, to develop a fuzzy spatial datacube within Spatial Online Analytical Processing (SOLAP) to assess coastal erosion risk. This requires developing a framework to design a conceptual model based on risk parameters, implementing fuzzy spatial objects in a spatial multidimensional database, and aggregating fuzzy spatial objects to support multi-scale representation of risk zones. To validate the proposed approach, it is applied to the Percé region (Eastern Quebec, Canada) as a case study.
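
The smooth transition between risk zones described in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's model: the function names, the slope and elevation thresholds, the single rule, and the use of the minimum as the fuzzy AND are all assumptions chosen for illustration only.

```python
def ramp_up(x, lo, hi):
    """Membership that rises linearly from 0 at `lo` to 1 at `hi`."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def ramp_down(x, lo, hi):
    """Membership that falls linearly from 1 at `lo` to 0 at `hi`."""
    return 1.0 - ramp_up(x, lo, hi)


def high_risk(slope_deg, elevation_m):
    """Degree of membership in a hypothetical 'high risk' zone.

    Illustrative rule (an assumption, not taken from the thesis):
    IF slope is steep AND elevation is low THEN risk is high.
    """
    steep = ramp_up(slope_deg, 5.0, 25.0)    # hypothetical thresholds, degrees
    low = ramp_down(elevation_m, 2.0, 10.0)  # hypothetical thresholds, metres
    return min(steep, low)                   # fuzzy AND taken as the minimum


if __name__ == "__main__":
    # Membership changes gradually instead of jumping at a polygon boundary.
    for slope, elevation in [(0.0, 1.0), (15.0, 6.0), (30.0, 1.0)]:
        print(slope, elevation, high_risk(slope, elevation))
```

A crisp model would assign each location to exactly one zone; here a location can belong to the "high risk" zone to a degree between 0 and 1.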

    ZigBee Pulse Oximeter

    This work presents a prototype that adapts a standard pulse oximeter into a wireless device using ZigBee. Patient data are extracted and transmitted to the server in real time through a Wireless Sensor Network, which is deployed in a mesh topology to maximize communication reliability. The pulse oximeter is based on a Nellcor DS-100a probe and is controlled by an Arduino FIO with an XBee wireless modem. The amplifier circuit designed to extract the signal from the pulse oximeter probe is also included in this work.

    Treatment of imprecision in data repositories with the aid of KNOLAP

    Traditional data repositories, introduced for the needs of business processing, typically focus on storing and querying crisp domains of data. As a result, current commercial data repositories have no facilities for either storing or querying imprecise or approximate data. No significant attempt has been made at a generic, application-independent representation of value imprecision, mainly as a property of the axes of analysis and also as part of a dynamic environment in which users may wish to define their “own” axes of analysis for querying either precise or imprecise facts. In such cases, measured values and facts are characterised by descriptive values drawn from a number of dimensions, and the values of a dimension are organised into hierarchical levels. A solution named H-IFS is presented that allows flexible hierarchies to be represented as part of the dimension structures. An extended multidimensional model named IF-Cube is put forward, which allows the representation of imprecision in facts and dimensions and the answering of queries based on imprecise hierarchical preferences. Based on the H-IFS and IF-Cube concepts, a post-relational OLAP environment is delivered whose implementation is DBMS-independent and whose performance depends solely on the underlying DBMS engine.

    Relative-fuzzy: a novel approach for handling complex ambiguity for software engineering of data mining models

    There are two main defined classes of uncertainty, namely fuzziness and ambiguity, where ambiguity is a ‘one-to-many’ relationship between the syntax and semantics of a proposition. This definition appears to ignore the ‘many-to-many’ type of ambiguity. In this thesis, we use the term complex-uncertainty for this many-to-many type of ambiguity. This research proposes a new approach for handling the complex ambiguity type of uncertainty that may exist in data, for the software engineering of predictive Data Mining (DM) classification models. The proposed approach is based on Relative-Fuzzy Logic (RFL), a novel type of fuzzy logic. RFL defines a new formulation of the problem of the ambiguity type of uncertainty in terms of States Of Proposition (SOP). RFL describes its membership (semantic) value using the new definition of Domain of Proposition (DOP), which is based on the relativity principle as defined by possible-worlds logic. To propose RFL, one question needs to be answered: how can these two approaches, fuzzy logic and possible-worlds logic, be combined to produce a new membership value set (and later a logic) that is able to handle fuzziness and multiple viewpoints at the same time? This goal is achieved by giving possible-worlds logic the ability to quantify multiple viewpoints, to model fuzziness in each of these viewpoints, and to express that in a new set of membership values. Furthermore, a new architecture of Hierarchical Neural Network (HNN), called ML/RFL-Based Net, has been developed in this research, along with a new learning algorithm and a new recalling algorithm. The architecture, learning algorithm and recalling algorithm of ML/RFL-Based Net follow the principles of RFL. This new type of HNN is considered to be an RFL computation machine. 
The ability of the Relative Fuzzy-based DM prediction model to tackle the problem of the complex ambiguity type of uncertainty has been tested. A special-purpose Integrated Development Environment (IDE), called RFL4ASR, which generates a DM prediction model for speech recognition, has also been developed in this research. This special-purpose IDE extends the definition of the traditional IDE. Using multiple sets of TIMIT speech data, the ML/RFL-Based Net prediction model achieves a classification accuracy of 69.2308%. This accuracy is higher than the best results of the WEKA data mining machines given the same speech data.

    Analytic Extensions to the Data Model for Management Analytics and Decision Support in the Big Data Environment

    From 2006 to 2016, an estimated average of 50% of big data analytics and decision support projects failed to deliver acceptable and actionable outputs to business users. The resulting management inefficiency came at a high cost, with wasted investments estimated at $2.7 trillion in 2016 for companies in the United States. The purpose of this quantitative descriptive study was to examine the data model of a typical data analytics project in a big data environment for opportunities to improve the information created for management problem-solving. The research questions focused on finding artifacts within enterprise data to model key business scenarios for management action. The foundations of the study were information and decision sciences theories, especially information entropy and high-dimensional utility theories. Design-based research in a nonexperimental format was used to examine the data model for the functional forms that mapped the available data to the conceptual formulation of the management problem by combining ontology learning, data engineering, and analytic formulation methodologies. Semantic, symbolic, and dimensional extensions emerged as key functional forms of analytic extension of the data model. The data-modeling approach was applied to a 15-terabyte secondary data set from a multinational medical product distribution company with a profit growth problem. The extended data model simplified the composition of acceptable analytic insights, the derivation of business solutions, and the design of programs to address the ill-defined management problem. The implication for positive social change was the potential for overall improvement in management efficiency and increasing participation in advocacy and sponsorship of social initiatives.

    Dynamic segmentation techniques applied to load profiles of electric energy consumption from domestic users

    [EN] The electricity sector is currently undergoing a process of liberalization and separation of roles, which is being implemented under the regulatory auspices of each Member State of the European Union and, therefore, at different speeds and with different perspectives and objectives that must converge on a common horizon, where Europe will benefit from an interconnected energy market in which producers and consumers can participate in free competition. This process has one major consequence, from which a second follows as a necessity. The major consequence is the increased complexity of managing and supervising the electrical system, which is increasingly interconnected and participatory, with distributed energy sources, many of them renewable, connected at different voltage levels and with different generation capacities at any point in the network. The second consequence is the need to communicate information between agents reliably, safely and quickly, and to analyze this information as effectively as possible, so that it feeds the decision-making processes that improve the observability and controllability of a system that is growing in complexity and in the number of agents involved. With the evolution of Information and Communication Technologies (ICT), and with investments both in improving the existing measurement and communications infrastructure and in extending measurement and actuation capacity to more points in medium- and low-voltage networks, the data available on the state of the network are increasingly abundant and complete. All these systems are part of the so-called Smart Grids, or intelligent networks of the future, a future that is not so far away. 
One such source of information is the energy consumption of customers, measured on a regular basis (every hour, half hour or quarter-hour) and sent to the Distribution System Operators from the Smart Meters by means of Advanced Metering Infrastructure (AMI). As a result, an increasing amount of information on customers' energy consumption is being stored in Big Data systems. This growing source of information demands specialized techniques that can take advantage of it and extract useful, summarized knowledge. This thesis deals with the use of this energy consumption information from Smart Meters, in particular with the application of data mining techniques to obtain temporal patterns that characterize the users of electrical energy and to group them according to these patterns into a small number of groups or clusters. These clusters make it possible to evaluate how users consume energy, both during the day and over a sequence of days, and so to assess trends and predict future scenarios. To this end, current techniques are studied and, after showing that existing work does not cover this objective, clustering or dynamic segmentation techniques applied to load profiles of electric energy consumption from domestic users are developed. These techniques are tested and validated on a database of hourly energy consumption values for a sample of residential customers in Spain during the years 2008 and 2009. 
The results make it possible to observe both the characterization of the consumption patterns of the different types of residential energy consumers and their evolution over time, and to assess, for example, how the regulatory changes that occurred in the Spanish electricity sector during those years influenced the temporal patterns of energy consumption.
Benítez Sánchez, IJ. (2015). Dynamic segmentation techniques applied to load profiles of electric energy consumption from domestic users [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/59236
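
Grouping users by the shape of their load profiles, as this abstract describes, can be sketched in Python as follows. The sketch uses plain k-means on normalized daily profiles, which is a stand-in and not the dynamic segmentation technique developed in the thesis; the function names, the four-point toy profiles, and the deterministic initialization are assumptions made for illustration.

```python
def normalize(profile):
    """Scale a load profile so its values sum to 1 (shape, not magnitude)."""
    total = sum(profile)
    return [v / total for v in profile] if total else list(profile)


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, iterations=20):
    """Plain k-means; the first k profiles seed the centroids (an assumption)."""
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assign each profile to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in profiles
        ]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(profiles, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    # Toy data: four readings per day instead of 24 hourly values.
    raw = [[4, 1, 1, 2], [1, 1, 4, 2], [8, 2, 2, 4], [2, 2, 8, 4]]
    labels, _ = kmeans([normalize(p) for p in raw], k=2)
    print(labels)
```

Normalizing first means that a small and a large household with the same daily shape fall into the same cluster, which matches the goal of characterizing how, and not how much, users consume.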

    New Fundamental Technologies in Data Mining

    The progress of data mining technology and its wide public popularity establish a need for a comprehensive text on the subject. The series of books entitled "Data Mining" addresses this need by presenting in-depth descriptions of novel mining algorithms and many useful applications. In addition to helping readers understand each section deeply, the two books present useful hints and strategies for solving problems in the following chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence lead to significant development in the field of data mining.

    The 5th Conference of PhD Students in Computer Science

    Decision Support Systems

    Decision support systems (DSS) have evolved over the past four decades from theoretical concepts into real-world computerized applications. DSS architecture contains three key components: a knowledge base, a computerized model, and a user interface. DSS simulate the cognitive decision-making functions of humans based on artificial intelligence methodologies (including expert systems, data mining, machine learning, connectionism, logistical reasoning, etc.) in order to perform decision support functions. The applications of DSS cover many domains, ranging from aviation monitoring, transportation safety, clinical diagnosis, weather forecasting and business management to internet search strategy. By combining knowledge bases with inference rules, DSS are able to provide suggestions to end users to improve decisions and outcomes. This book is written as a textbook so that it can be used in formal courses examining decision support systems. It may be used by both undergraduate and graduate students from diverse computer-related fields. It will also be of value to established professionals as a text for self-study or for reference.