42 research outputs found

    New Fundamental Technologies in Data Mining

    The progress of data mining technology and its broad public popularity establish a need for a comprehensive text on the subject. The series of books entitled "Data Mining" addresses this need by presenting in-depth descriptions of novel mining algorithms and many useful applications. In addition to helping readers understand each section deeply, the two books present useful hints and strategies for solving the problems discussed in the following chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence lead to significant development in the field of data mining.

    Generic Architecture for Predictive Computational Modelling with Application to Financial Data Analysis: Integration of Semantic Approach and Machine Learning

    The PhD thesis introduces a Generic Architecture for Predictive Computational Modelling (PCM) capable of automating analytical conclusions regarding quantitative data structured as a data frame. The model involves heterogeneous data mining based on a semantic approach, graph-based methods (ontology, knowledge graphs, graph databases) and advanced machine learning methods. The main focus of my research is data pre-processing aimed at a more efficient selection of input features to the computational model. Since the model I propose is generic, it can be applied to data mining of any quantitative dataset (containing two-dimensional, size-mutable, heterogeneous tabular data); however, it is best suited to highly interconnected data. To adapt this generic model to a specific use case, an Ontology is needed as the formal conceptual representation of the relevant domain knowledge. I chose financial/market data for my use cases. In the course of practical experiments, the effectiveness of applying the PCM model to financial risk analysis of UK companies and to forecasting the FTSE100 market index was evaluated. The tests confirmed that the PCM model produces more accurate outcomes than stand-alone traditional machine learning methods. By critically evaluating this architecture, I proved its validity and suggested directions for future research.

    Developing Cyberspace Data Understanding: Using CRISP-DM for Host-based IDS Feature Mining

    Current intrusion detection systems generate a large number of specific alerts, but do not provide actionable information. Often these alerts must be analyzed by a network defender, a time-consuming and tedious task which can occur hours or days after an attack occurs. Improved understanding of the cyberspace domain can lead to great advancements in cyberspace situational awareness research and development. This thesis applies the Cross Industry Standard Process for Data Mining (CRISP-DM) to develop an understanding of a host system under attack. Data is generated by launching scans and exploits at a machine outfitted with a set of host-based data collectors. Through knowledge discovery, features are identified within the collected data which can be used to enhance host-based intrusion detection. By discovering relationships between the data collected and the events, human understanding of the activity is shown. This method of searching for hidden relationships between sensors greatly enhances understanding of new attacks and vulnerabilities, bolstering our ability to defend the cyberspace domain.

    Theory and Applications for Advanced Text Mining

    Due to the growth of computer and web technologies, we can easily collect and store large amounts of text data. It is reasonable to believe that these data contain useful knowledge. Text mining techniques have been studied intensively since the late 1990s in order to extract that knowledge from the data. Although many important techniques have been developed, the text mining research field continues to expand to meet the needs arising from various application fields. This book is composed of 9 chapters introducing advanced text mining techniques, ranging from relation extraction to techniques for under-resourced or less-resourced languages. I believe that this book will contribute new knowledge to the text mining field and help many readers open up new research fields of their own.

    PROCESS-ORIENTED KNOWLEDGE DISCOVERY TO SUPPORT PRODUCT DESIGN USING TEXT MINING

    Ph.D. DOCTOR OF PHILOSOPHY

    Data Mining Algorithms for Internet Data: from Transport to Application Layer

    Nowadays we live in a data-driven world. Advances in data generation, collection and storage technology have enabled organizations to gather data sets of massive size. Data mining is a discipline that blends traditional data analysis methods with sophisticated algorithms to handle the challenges posed by these new types of data sets. The Internet is a complex and dynamic system in which new protocols and applications arise at a constant pace. These characteristics make the Internet a valuable and challenging data source and application domain for research, both at the Transport layer, analyzing network traffic flows, and up at the Application layer, focusing on the ever-growing next generation web services: blogs, micro-blogs, on-line social networks, photo sharing services and many other applications (e.g., Twitter, Facebook, Flickr, etc.). In this thesis work we focus on the study, design and development of novel algorithms and frameworks to support large scale data mining activities over huge and heterogeneous data volumes, with a particular focus on Internet data as data source and targeting network traffic classification, on-line social network analysis, recommendation systems, cloud services and Big data.

    Unsupervised discovery of relations for analysis of textual data in digital forensics

    This dissertation addresses the problem of analysing digital data in digital forensics. It will be shown that text mining methods can be adapted and applied to digital forensics to aid analysts to more quickly, efficiently and accurately analyse data to reveal truly useful information. Investigators who wish to utilise digital evidence must examine and organise the data to piece together events and facts of a crime. The difficulty with finding relevant information quickly using the current tools and methods is that these tools rely very heavily on background knowledge for query terms and do not fully utilise the content of the data. A novel framework in which to perform evidence discovery is proposed in order to reduce the quantity of data to be analysed, aid the analysts' exploration of the data and enhance the intelligibility of the presentation of the data. The framework combines information extraction techniques with visual exploration techniques to provide a novel approach to performing evidence discovery, in the form of an evidence discovery system. By utilising unrestricted, unsupervised information extraction techniques, the investigator does not require input queries or keywords for searching, thus enabling the investigator to analyse portions of the data that may not have been identified by keyword searches. The evidence discovery system produces text graphs of the most important concepts and associations extracted from the full text to establish ties between the concepts and provide an overview and general representation of the text. Through an interactive visual interface the investigator can explore the data to identify suspects, events and the relations between suspects. Two models are proposed for performing the relation extraction process of the evidence discovery framework. The first model takes a statistical approach to discovering relations based on co-occurrences of complex concepts. 
The second model utilises a linguistic approach using named entity extraction and information extraction patterns. A preliminary study was performed to assess the usefulness of a text mining approach to digital forensics as against the traditional information retrieval approach. It was concluded that the novel approach to text analysis for evidence discovery presented in this dissertation is a viable and promising approach. The preliminary experiment showed that the results obtained from the evidence discovery system, using either of the relation extraction models, are sensible and useful. The approach advocated in this dissertation can therefore be successfully applied to the analysis of textual data for digital forensics. Dissertation (MSc)--University of Pretoria, 2010. Computer Science.

    Exploring the military role in support of development in Southern Africa

    Thesis (PhD)--Stellenbosch University, 2019. ENGLISH ABSTRACT: Abundant pieces of legislation and policy frameworks exist that link the military role and durable peace, and those that link durable peace and sustainable development. The linkage between the military role and sustainable development is absent in these source documents. The researcher submits that this “absence” constitutes both a theoretical and a policy-based gap that demands the attention of the policy practitioners and scholars in Public Administration. In attempting to close this gap, this study begins with the fundamental concepts that emerged from the literature review. Among others, they include regional administration and defence administration that led to the formulation of regional defence administration (RDA) as a higher-order construct. The concepts “operations other than war” (OOTW) and “operational activities for development” (OAD) led to the formulation of “military operational activities for development” (MOAD). In theorising the concept of MOAD, this study seeks to close the identified gaps. In closing this gap, this study depended on the grounded theory and methodological analysis using case studies selected from Southern Africa. The theoretical sampling method was used to generate data from various databases using three key terms, namely the military role, durable peace, and sustainable development. In analysing and synthesising the emerging data, the study focused on the most common words, utterances, concepts, properties, and categories to formulate the higher-order constructs. Furthermore, the study borrowed from biological studies to juxtapose the “unknown” with the “known” for purposes of theory building. In doing so, the study borrowed from systems thinking, biomimicry, metaphorical thinking, tensegrity systems, design by analogy to biology, and the theory of biological compressions and tensions. 
These theories helped the researcher establish the interdependence of civilian and military organisations that respond to worldwide complex emergencies. In doing so, the researcher argues that rapid responses and effective interventions in managing complex emergencies are a step in achieving the long-term Agenda for Sustainable Development. It is on the basis of this theoretical line of argument that the study establishes the military role in support of development. AFRIKAANS SUMMARY: Many laws and policy frameworks exist that link the military role to durable peace, and that link durable peace to sustainable development. The link between the military role and sustainable development is missing from these source documents. The researcher submits that this “absence” constitutes both a theoretical and a policy-based gap that requires the attention of policy practitioners and academics in Public Administration. In attempting to bridge this gap, this study begins with the fundamental concepts that emerged from the literature review. Among others, they include regional administration and defence administration, which led to the formulation of regional defence administration as a higher-order construct. The concepts “operations other than war” (OOTW) and “operational activities for development” (OAD) led to the formulation of “military operational activities for development” (MOAD). This study thus seeks to bridge the identified gaps by theorising the concept of MOAD. To bridge this gap, the study made use of grounded theory and methodological analysis of case studies from Southern Africa. 
The theoretical sampling method was used to generate data from various databases using three key terms, namely the military role, durable peace, and sustainable development. In analysing and synthesising the emerging data, the study focused on the most common words, expressions, concepts, properties, and categories to formulate the higher-order constructs. The study further borrowed from biological research to set the “unknown” beside the “known” for the purpose of theory building. The study thus made use of systems thinking, biomimicry, tensegrity systems, design by analogy to biology, and the theory of biological compressions and tensions. These theories enabled the researcher to establish the interdependence of civilian and military organisations that respond to worldwide complex emergencies. The researcher thereby argues that rapid response and effective intervention in managing complex emergencies are steps towards achieving the long-term “Agenda for Sustainable Development”. It is on the basis of this theoretical argument that this study establishes the military role in support of development.

    LIPIcs, Volume 277, GIScience 2023, Complete Volume
