146 research outputs found

    GENERIC FRAMEWORKS FOR INTERACTIVE PERSONALIZED INTERESTING PATTERN DISCOVERY

    Traditional frequent pattern mining algorithms generate an exponentially large number of patterns, a substantial portion of which are not significant for many data analysis endeavours. For this reason, discovering a small number of interesting patterns from the exponentially large set of frequent patterns, according to a particular user's interest, is an important task. Existing works on patter
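
The exponential growth described in the abstract above can be seen on a toy example. The following is a minimal Python sketch, not the thesis's method: the transaction data, the `frequent_itemsets` helper, and the `min_support` threshold are all hypothetical, and the enumeration is brute force rather than an optimized miner such as Apriori or FP-growth.

```python
from itertools import combinations

# Toy transaction database (hypothetical data, for illustration only).
transactions = [
    {"a", "b", "c", "d"},
    {"a", "b", "c", "d"},
    {"a", "b", "c"},
    {"a", "d"},
]


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset meeting min_support.

    Brute-force enumeration of all candidate itemsets; this only
    illustrates how quickly the number of frequent patterns grows.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            # Support = number of transactions containing every item.
            support = sum(1 for t in transactions if set(candidate) <= t)
            if support >= min_support:
                result[frozenset(candidate)] = support
    return result


patterns = frequent_itemsets(transactions, min_support=2)
print(len(patterns))
```

With only four items, every one of the 2^4 - 1 = 15 non-empty itemsets meets the threshold in this toy data, which suggests why filtering the result down to a few patterns of interest to a given user is worthwhile.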

    Process Mining for Smart Product Design

    Turning data into information: assessing and reporting GIS metadata integrity using integrated computing technologies

    A Geographic Information System (GIS) serves as the tangible and intangible means by which spatially related phenomena can be created, analyzed and rendered. GIS metadata serves as the formal framework to catalog information about a GIS data set. Metadata is independent of the encoded spatial and attribute information. GIS metadata is a subset of electronic metadata which catalogs electronic resources such as web pages and software applications. However, GIS metadata is inherently different from electronic media because each metadata file can be applied to a spatial component that is not implicit with other forms of metadata. Using open source technologies such as R, Perl and PHP, metadata information for large GIS data sets (thousands of layers) can be gleaned quickly and more efficiently than the human element. In doing so, metrics to express the integrity of both the metadata and GIS data can be captured, displayed and compared for use in the decision-making process. Supervised and unsupervised techniques allow users and computer algorithms to explore unseen trends about the GIS data not obvious to the human component. The validity of these analyses was tested using a Technology Acceptance Model (TAM). Responses from 40 GIS professionals about the results of this methodology were captured to find a relationship between this technology’s Perceived Ease of Use, Perceived Usefulness, Attitude Towards Using and the Intention to Further Use this technology.

    Toward Customizable Multi-tenant SaaS Applications

    Computing is now so pervasive that it has indeed become the fifth utility (after water, electricity, gas, and telephony), as Leonard Kleinrock once envisioned. Evolved from utility computing, cloud computing has emerged as a computing infrastructure that enables rapid delivery of computing resources as a utility in a dynamically scalable, virtualized manner. However, current industrial cloud computing implementations promote segregation among cloud providers, which leads to user lock-in because of prohibitive migration costs. On the other hand, Service-Oriented Computing (SOC), including service-oriented architecture (SOA) and Web Services (WS), promotes standardization and openness through its enabling standards and communication protocols. This thesis proposes a Service-Oriented Cloud Computing Architecture (SOCCA) that combines the best attributes of the two paradigms to promote an open, interoperable environment for cloud computing development. Multi-tenancy SaaS applications built on top of SOCCA have more flexibility and are not locked into a particular platform. Each tenant of a multi-tenant application appears to be the sole owner of the application and is not aware of the existence of others. A multi-tenant SaaS application accommodates each tenant’s unique requirements by allowing tenant-level customization. A complex SaaS application that supports hundreds or even thousands of tenants could have hundreds of customization points, each providing multiple options, which could result in a huge number of ways to customize the application. This dissertation also proposes innovative customization approaches, which study similar tenants’ customization choices and each individual user’s behavior, then provide a guided, semi-automated customization process for future tenants.
A semi-automated customization process could enable tenants to quickly implement the customization that best suits their business needs. Dissertation/Thesis; Doctoral Dissertation, Computer Science 201

    Fouille de séquences temporelles pour la maintenance prédictive : application aux données de véhicules traceurs ferroviaires

    To meet mounting social and economic demands, railway operators and manufacturers are striving for higher availability and better reliability of railway transportation systems. Commercial trains are being equipped with state-of-the-art onboard intelligent sensors that monitor various subsystems all over the train. These sensors provide a real-time flow of data, called floating train data, consisting of georeferenced events along with their spatial and temporal coordinates. Once ordered with respect to time, these events can be considered as long temporal sequences that can be mined for possible relationships. This has created a need for sequential data mining techniques that derive meaningful association rules or classification models from these data. Once discovered, these rules and models can be used to perform an on-line analysis of the incoming event stream in order to predict the occurrence of target events, i.e., severe failures that require immediate corrective maintenance actions. The work in this thesis tackles this data mining task. We aim to investigate and develop various methodologies to discover association rules and classification models that can help predict rare tilt and traction failures in sequences using past events that are less critical. The investigated techniques fall along two major axes: association analysis, which is temporal, and classification techniques, which are not. The main challenges that complicate this data mining task are the rarity of the target events to be predicted, the heavy redundancy of some events, and the frequent occurrence of data bursts.
The results obtained on real datasets collected from a fleet of trains highlight the effectiveness of the approaches and methodologies used.
Nowadays, to meet economic and social demands, railway transportation systems need to be operated with a high level of safety and reliability. In particular, there is a growing need for monitoring and maintenance-support tools that anticipate failures of rolling stock components. To develop such tools, commercial trains are equipped with intelligent sensors that send real-time information on the state of various subsystems. This information takes the form of long temporal sequences made up of a succession of events. Developing tools for the automatic analysis of these sequences will make it possible to identify significant associations between events, with the aim of predicting events that signal the onset of a severe failure. This thesis addresses the problem of mining temporal sequences for the prediction of rare events and is part of a broader effort to develop decision-support tools. We aim to study and develop various methods to discover association rules between events on the one hand, and to build classification models on the other. These rules and/or classifiers can then be used to analyze an incoming event stream online in order to predict the occurrence of target events corresponding to failures. Two methodologies are considered in this thesis work: the first is based on the search for association rules, which is a temporal approach, and the second is a pattern-recognition-based approach.
The main challenges facing this work are chiefly the rarity of the target events to be predicted, the heavy redundancy of some events, and the very frequent presence of bursts. The results obtained on real data collected by sensors on board a fleet of commercial trains highlight the effectiveness of the proposed approaches.
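
The idea of using less critical past events to predict a rare target event can be sketched as a single temporal association rule scored by its confidence. This is a minimal Python illustration under stated assumptions, not the thesis's algorithm: the event names, the timestamps, the `rule_confidence` helper, and the fixed time window are all hypothetical.

```python
def rule_confidence(events, antecedent, target, window):
    """Confidence of the rule "antecedent -> target within `window` time units".

    `events` is a list of (timestamp, name) pairs sorted by timestamp.
    Returns the fraction of antecedent occurrences that are followed by
    the target strictly later and no more than `window` units afterwards.
    """
    hits = 0
    total = 0
    for i, (t, name) in enumerate(events):
        if name != antecedent:
            continue
        total += 1
        # Look only at later events that fall inside the time window.
        if any(n == target and t < t2 <= t + window for t2, n in events[i + 1:]):
            hits += 1
    return hits / total if total else 0.0


# Hypothetical event sequence: (timestamp, event name).
events = [(0, "warn"), (3, "fail"), (10, "warn"), (25, "warn"), (27, "fail")]
conf = rule_confidence(events, "warn", "fail", window=5)
print(conf)
```

In this toy sequence, two of the three `warn` events are followed by a `fail` within five time units. On real data the rarity of the target and bursts of redundant events, both named in the abstract as the main challenges, would make such a naive count unreliable without further filtering.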

    Service-oriented discovery of knowledge : foundations, implementations and applications

    In this thesis we investigate how service orientation, a popular new approach to distributed computing, can be used within the field of Knowledge Discovery. We critically examine its principles and present models for developing within this paradigm. We then apply these models to create a web service called Fantom, which mines subgroups in a ranked list of identifiers based on their scores. The subgroups are described using ontologies, giving the scientist a description in a standardized and familiar language. Finally, Fantom is tested on two data sets from the life sciences: one concerning gene data, the other concerning SNP data. LEI Universiteit Leiden; Algorithm