17 research outputs found

    Exploring the Existing and Unknown Side Effects of Privacy Preserving Data Mining Algorithms

    The data mining sanitization process converts data by masking its sensitive parts before releasing it to the public domain. During sanitization, side effects such as hiding failure, missing cost, and artificial cost were observed. Privacy Preserving Data Mining (PPDM) algorithms were developed for the sanitization process to limit information loss while maintaining data integrity. Besides preserving privacy, these algorithms were designed to reduce the side effects that occur during sanitization, and many of them have been created on the basis of different PPDM techniques. However, previous studies have neither explored nor justified why non-traditional side effects were not given much importance. This study reports the side effects found for PPDM algorithms in a newly created web repository. The research methodology adopted was Design Science Research (DSR), and the research was conducted in four phases. The first phase addressed the characteristics, similarities, differences, and relationships of existing side effects. The second phase identified the characteristics of non-traditional side effects. The third phase used the Privacy Preservation and Security Framework (PPSF) tool to test whether non-traditional side effects occur in PPDM algorithms, and also attempted to find additional unknown side effects not reported in prior studies. The PPDM algorithms considered were Greedy, POS2DT, SIF_IDF, cpGA2DT, pGA2DT, and sGA2DT. The associated PPDM techniques were anonymization, perturbation, randomization, condensation, heuristic, reconstruction, and cryptography. The final phase created a new online web repository to report all the side effects found for the PPDM algorithms. The repository was built as a full-stack web application using the AngularJS, Spring, Spring Boot, and Hibernate frameworks. The results cover various PPDM algorithms and their side effects. Additionally, the relationship between hiding failure, missing cost, and artificial cost, and the impact they have on each other, was established. Interestingly, a relationship between the side effects and the type of data (sensitive, non-sensitive, or new) was also observed. The web repository acts as a quick reference for PPDM algorithms. Developing, improving, inventing, and reporting PPDM algorithms remains necessary. This study will influence researchers or organizations to report, use, reuse, or develop better PPDM algorithms.
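The three traditional side effects named in this abstract can be illustrated with a small sketch. It assumes the definitions commonly used in the PPDM literature, which the abstract itself does not spell out: hiding failure is a sensitive itemset that is still frequent after sanitization, missing cost is a non-sensitive frequent itemset that is lost, and artificial cost is an itemset that becomes frequent only after sanitization. The function name `side_effects` and the toy itemsets are hypothetical and are not taken from the study or from the PPSF tool.

```python
def side_effects(frequent_before, frequent_after, sensitive):
    """Classify itemsets into the three traditional PPDM side effects.

    All arguments are collections of frozensets (itemsets). The definitions
    used here are an assumption based on common PPDM usage, not the study's code.
    """
    frequent_before = set(frequent_before)
    frequent_after = set(frequent_after)
    sensitive = set(sensitive)

    # Sensitive itemsets that the sanitization failed to hide.
    hiding_failure = sensitive & frequent_after
    # Non-sensitive itemsets that were frequent before but are lost afterwards.
    missing_cost = (frequent_before - sensitive) - frequent_after
    # Itemsets that were not frequent before but appear after sanitization.
    artificial_cost = frequent_after - frequent_before

    return {
        "hiding_failure": hiding_failure,
        "missing_cost": missing_cost,
        "artificial_cost": artificial_cost,
    }


# Toy example with hypothetical itemsets.
before = {frozenset("a"), frozenset("b"), frozenset("c"), frozenset("ab")}
after = {frozenset("a"), frozenset("ab"), frozenset("d")}
secret = {frozenset("ab")}

result = side_effects(before, after, secret)
for name, itemsets in result.items():
    print(name, sorted(sorted(s) for s in itemsets))
```

In this reading, each side effect corresponds to one type of data mentioned in the abstract: sensitive itemsets (hiding failure), non-sensitive itemsets (missing cost), and new itemsets (artificial cost).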

    Efficient Learning Machines

    Computer science

    Tracking the Temporal-Evolution of Supernova Bubbles in Numerical Simulations

    The study of low-dimensional, noisy manifolds embedded in a higher-dimensional space has been extremely useful in many applications, from the chemical analysis of multi-phase flows to simulations of galactic mergers. Building a probabilistic model of the manifolds has helped in describing their essential properties and how they vary in space. However, when the manifold evolves through time, joint spatio-temporal modelling is needed to fully comprehend its nature. We propose a first-order Markovian process that propagates the spatial probabilistic model of a manifold at a fixed time to its adjacent temporal stages. The proposed methodology is demonstrated using a particle simulation of an interacting dwarf galaxy to describe the evolution of a cavity generated by a Supernova

    Machine Learning-Driven Decision Making based on Financial Time Series

    The abstract is in the attachment.

    Development of Context-Aware Recommenders of Sequences of Touristic Activities

    In recent years, recommender systems have become ubiquitous on the web. Many web services, including movie streaming, web search, and e-commerce, use recommender systems to aid human decision-making. Tourism is one industry that is highly represented on the web. Several web services (e.g. TripAdvisor, Yelp) benefit from integrating recommender systems to help tourists explore tourism destinations. This has increased research focused on improving tourism recommender systems and solving the main issues they face. This thesis proposes new algorithms for tourism recommender systems that learn tourist preferences from their social media data to suggest a sequence of touristic activities that fit various contexts and include related activities. To accomplish this, we propose methods for identifying tourists from their Twitter posts, identifying the activities experienced in these posts, and profiling similar tourists based on their interests, contextual information, and activity periods. User profiles are then combined with an association rule mining algorithm to capture implicit relationships between the points of interest that appear in each profile. Finally, a rule ranking and activity selection process produces a set of recommendable activities. The recommendations were evaluated for accuracy and for the effect of user profiling. We further order the set of activities using a multi-objective algorithm to enrich the tourist experience. We also carry out a second-stage analysis of tourist flows at destinations, which is beneficial to destination management organisations seeking to understand tourist mobility. Overall, the methods and algorithms proposed in this thesis are shown to be useful in various aspects of tourism recommender systems.
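The association rule mining step mentioned in this abstract can be sketched as follows. This is only a minimal illustration under stated assumptions: it mines pairwise rules between points of interest using plain support and confidence thresholds, whereas the thesis's actual algorithm, thresholds, and rule ranking are not described in the abstract. The function name `mine_pair_rules` and the sample visits are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(visits, min_support=0.5, min_confidence=0.6):
    """Mine rules 'lhs -> rhs' between single points of interest.

    visits: list of lists, each holding the points of interest in one profile.
    Returns (lhs, rhs, support, confidence) tuples, highest confidence first.
    """
    n = len(visits)
    item_counts = Counter()
    pair_counts = Counter()
    for visit in visits:
        items = sorted(set(visit))  # deduplicate and fix a canonical pair order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in both directions.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))

    # Rank by confidence, then support, then names for a stable order.
    rules.sort(key=lambda r: (-r[3], -r[2], r[0], r[1]))
    return rules


# Hypothetical profiles: points of interest visited together.
visits = [
    ["museum", "beach"],
    ["museum", "beach", "market"],
    ["museum", "market"],
    ["beach"],
]
rules = mine_pair_rules(visits)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

Ranking by confidence first is one simple choice for the rule ranking stage; a real recommender would likely also weigh context and user profile, as the abstract indicates.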

    Discovering and Utilising Expert Knowledge from Security Event Logs

    Security assessment and configuration is a methodology for protecting computer systems from malicious entities. It is a continuous process that depends heavily on human experts, who are widely reported to be in short supply. This can result in a system being left insecure because of the lack of easily accessible experience and specialist resources. While performing security tasks, human experts often refer to a system's event logs to determine its security status, such as failures, configuration modifications, and system operations. However, finding and exploiting knowledge from event logs is a challenging and time-consuming task for non-experts. Hence, there is a strong need for mechanisms that make the process easier for security experts, as well as tools for those with significantly less security expertise. Doing so automatically allows for persistent and methodical testing without an excessive amount of manual time and effort, and makes computer security more accessible to non-experts. In this thesis, we present a novel technique that processes the security event logs of a system that has been evaluated and configured by a security expert, extracts key domain knowledge indicative of human decision making, and automatically applies the acquired knowledge to previously unseen systems so that non-experts can receive recommended security improvements. The proposed solution uses association and causal rule mining techniques to automatically discover relationships in the event log entries. The relationships take the form of cause-and-effect rules that define security-related patterns. These rules and other relevant information are encoded into a PDDL-based domain action model. The domain model and a problem instance generated from any vulnerable system can then be used to produce a plan of action by employing a state-of-the-art automated planning algorithm. The plan can be used by non-professionals to identify security issues and make improvements. Empirical analysis is subsequently performed on 21 live, real-world event log datasets, where the acquired domain model and identified plans are closely examined. The solution's accuracy lies between 73% and 92%, and it gained a significant performance boost compared to the manual approach of identifying event relationships. The research presented in this thesis automates the extraction of knowledge from event data streams. Previous research and current industry practice suggest that this knowledge elicitation is performed by human experts. As evident from the empirical analysis, we present a promising line of work that has the capacity to be used in commercial settings. This would reduce (or even eliminate) the dire and immediate need for human resources and contribute towards financial savings.

    Personality Identification from Social Media Using Deep Learning: A Review

    Social media helps people scattered around the world share ideas and information, and thus helps create communities, groups, and virtual networks. Identifying personality is significant in many types of applications, such as detecting the mental state or character of a person, predicting job satisfaction and professional and personal relationship success, and building recommendation systems. Personality is also an important factor in determining individual variation in thoughts, feelings, and conduct. According to the Global social media research survey of 2018, there are approximately 3.196 billion social media users worldwide. This number is estimated to grow rapidly with the use of mobile smart devices and advances in technology. Support vector machine (SVM), Naive Bayes (NB), multilayer perceptron neural network, and convolutional neural network (CNN) are some of the machine learning techniques used for personality identification in the literature. This paper reviews studies that identify the personality of social media users with the help of machine learning approaches, including recent studies that aim to predict the personality of online social media (OSM) users.

    2017 GREAT Day Program

    SUNY Geneseo’s Eleventh Annual GREAT Day.https://knightscholar.geneseo.edu/program-2007/1011/thumbnail.jp