100 research outputs found

    Network Intrusion Detection System:A systematic study of Machine Learning and Deep Learning approaches

    Get PDF
    The rapid advances in the internet and communication fields have resulted in ahuge increase in the network size and the corresponding data. As a result, manynovel attacks are being generated and have posed challenges for network secu-rity to accurately detect intrusions. Furthermore, the presence of the intruderswiththeaimtolaunchvariousattackswithinthenetworkcannotbeignored.Anintrusion detection system (IDS) is one such tool that prevents the network frompossible intrusions by inspecting the network traffic, to ensure its confidential-ity, integrity, and availability. Despite enormous efforts by the researchers, IDSstillfaceschallengesinimprovingdetectionaccuracywhilereducingfalsealarmrates and in detecting novel intrusions. Recently, machine learning (ML) anddeep learning (DL)-based IDS systems are being deployed as potential solutionsto detect intrusions across the network in an efficient manner. This article firstclarifiestheconceptofIDSandthenprovidesthetaxonomybasedonthenotableML and DL techniques adopted in designing network-based IDS (NIDS) sys-tems. A comprehensive review of the recent NIDS-based articles is provided bydiscussing the strengths and limitations of the proposed solutions. Then, recenttrends and advancements of ML and DL-based NIDS are provided in terms ofthe proposed methodology, evaluation metrics, and dataset selection. Using theshortcomings of the proposed methods, we highlighted various research chal-lenges and provided the future scope for the research in improving ML andDL-based NIDS

    Spatial-Temporal Data Mining for Ocean Science: Data, Methodologies, and Opportunities

    Full text link
    With the increasing amount of spatial-temporal~(ST) ocean data, numerous spatial-temporal data mining (STDM) studies have been conducted to address various oceanic issues, e.g., climate forecasting and disaster warning. Compared with typical ST data (e.g., traffic data), ST ocean data is more complicated with some unique characteristics, e.g., diverse regionality and high sparsity. These characteristics make it difficult to design and train STDM models. Unfortunately, an overview of these studies is still missing, hindering computer scientists to identify the research issues in ocean while discouraging researchers in ocean science from applying advanced STDM techniques. To remedy this situation, we provide a comprehensive survey to summarize existing STDM studies in ocean. Concretely, we first summarize the widely-used ST ocean datasets and identify their unique characteristics. Then, typical ST ocean data quality enhancement techniques are discussed. Next, we classify existing STDM studies for ocean into four types of tasks, i.e., prediction, event detection, pattern mining, and anomaly detection, and elaborate the techniques for these tasks. Finally, promising research opportunities are highlighted. This survey will help scientists from the fields of both computer science and ocean science have a better understanding of the fundamental concepts, key techniques, and open challenges of STDM in ocean

    Deep Neural Networks and Data for Automated Driving

    Get PDF
    This open access book brings together the latest developments from industry and research on automated driving and artificial intelligence. Environment perception for highly automated driving heavily employs deep neural networks, facing many challenges. How much data do we need for training and testing? How to use synthetic data to save labeling costs for training? How do we increase robustness and decrease memory usage? For inevitably poor conditions: How do we know that the network is uncertain about its decisions? Can we understand a bit more about what actually happens inside neural networks? This leads to a very practical problem particularly for DNNs employed in automated driving: What are useful validation techniques and how about safety? This book unites the views from both academia and industry, where computer vision and machine learning meet environment perception for highly automated driving. Naturally, aspects of data, robustness, uncertainty quantification, and, last but not least, safety are at the core of it. This book is unique: In its first part, an extended survey of all the relevant aspects is provided. The second part contains the detailed technical elaboration of the various questions mentioned above

    FIN-DM: finantsteenuste andmekaeve protsessi mudel

    Get PDF
    Andmekaeve hõlmab reeglite kogumit, protsesse ja algoritme, mis võimaldavad ettevõtetel iga päev kogutud andmetest rakendatavaid teadmisi ammutades suurendada tulusid, vähendada kulusid, optimeerida tooteid ja kliendisuhteid ning saavutada teisi eesmärke. Andmekaeves ja -analüütikas on vaja hästi määratletud metoodikat ja protsesse. Saadaval on mitu andmekaeve ja -analüütika standardset protsessimudelit. Kõige märkimisväärsem ja laialdaselt kasutusele võetud standardmudel on CRISP-DM. Tegu on tegevusalast sõltumatu protsessimudeliga, mida kohandatakse sageli sektorite erinõuetega. CRISP-DMi tegevusalast lähtuvaid kohandusi on pakutud mitmes valdkonnas, kaasa arvatud meditsiini-, haridus-, tööstus-, tarkvaraarendus- ja logistikavaldkonnas. Seni pole aga mudelit kohandatud finantsteenuste sektoris, millel on omad valdkonnapõhised erinõuded. Doktoritöös käsitletakse seda lünka finantsteenuste sektoripõhise andmekaeveprotsessi (FIN-DM) kavandamise, arendamise ja hindamise kaudu. Samuti uuritakse, kuidas kasutatakse andmekaeve standardprotsesse eri tegevussektorites ja finantsteenustes. Uurimise käigus tuvastati mitu tavapärase raamistiku kohandamise stsenaariumit. Lisaks ilmnes, et need meetodid ei keskendu piisavalt sellele, kuidas muuta andmekaevemudelid tarkvaratoodeteks, mida saab integreerida organisatsioonide IT-arhitektuuri ja äriprotsessi. Peamised finantsteenuste valdkonnas tuvastatud kohandamisstsenaariumid olid seotud andmekaeve tehnoloogiakesksete (skaleeritavus), ärikesksete (tegutsemisvõime) ja inimkesksete (diskrimineeriva mõju leevendus) aspektidega. Seejärel korraldati tegelikus finantsteenuste organisatsioonis juhtumiuuring, mis paljastas 18 tajutavat puudujääki CRISP- DMi protsessis. Uuringu andmete ja tulemuste abil esitatakse doktoritöös finantsvaldkonnale kohandatud CRISP-DM nimega FIN-DM ehk finantssektori andmekaeve protsess (Financial Industry Process for Data Mining). FIN-DM laiendab CRISP-DMi nii, et see toetab privaatsust säilitavat andmekaevet, ohjab tehisintellekti eetilisi ohte, täidab riskijuhtimisnõudeid ja hõlmab kvaliteedi tagamist kui osa andmekaeve elutsüklisData mining is a set of rules, processes, and algorithms that allow companies to increase revenues, reduce costs, optimize products and customer relationships, and achieve other business goals, by extracting actionable insights from the data they collect on a day-to-day basis. Data mining and analytics projects require well-defined methodology and processes. Several standard process models for conducting data mining and analytics projects are available. Among them, the most notable and widely adopted standard model is CRISP-DM. It is industry-agnostic and often is adapted to meet sector-specific requirements. Industry- specific adaptations of CRISP-DM have been proposed across several domains, including healthcare, education, industrial and software engineering, logistics, etc. However, until now, there is no existing adaptation of CRISP-DM for the financial services industry, which has its own set of domain-specific requirements. This PhD Thesis addresses this gap by designing, developing, and evaluating a sector-specific data mining process for financial services (FIN-DM). The PhD thesis investigates how standard data mining processes are used across various industry sectors and in financial services. The examination identified number of adaptations scenarios of traditional frameworks. It also suggested that these approaches do not pay sufficient attention to turning data mining models into software products integrated into the organizations' IT architectures and business processes. In the financial services domain, the main discovered adaptation scenarios concerned technology-centric aspects (scalability), business-centric aspects (actionability), and human-centric aspects (mitigating discriminatory effects) of data mining. Next, an examination by means of a case study in the actual financial services organization revealed 18 perceived gaps in the CRISP-DM process. Using the data and results from these studies, the PhD thesis outlines an adaptation of CRISP-DM for the financial sector, named the Financial Industry Process for Data Mining (FIN-DM). FIN-DM extends CRISP-DM to support privacy-compliant data mining, to tackle AI ethics risks, to fulfill risk management requirements, and to embed quality assurance as part of the data mining life-cyclehttps://www.ester.ee/record=b547227

    Advanced Topics in Systems Safety and Security

    Get PDF
    This book presents valuable research results in the challenging field of systems (cyber)security. It is a reprint of the Information (MDPI, Basel) - Special Issue (SI) on Advanced Topics in Systems Safety and Security. The competitive review process of MDPI journals guarantees the quality of the presented concepts and results. The SI comprises high-quality papers focused on cutting-edge research topics in cybersecurity of computer networks and industrial control systems. The contributions presented in this book are mainly the extended versions of selected papers presented at the 7th and the 8th editions of the International Workshop on Systems Safety and Security—IWSSS. These two editions took place in Romania in 2019 and respectively in 2020. In addition to the selected papers from IWSSS, the special issue includes other valuable and relevant contributions. The papers included in this reprint discuss various subjects ranging from cyberattack or criminal activities detection, evaluation of the attacker skills, modeling of the cyber-attacks, and mobile application security evaluation. Given this diversity of topics and the scientific level of papers, we consider this book a valuable reference for researchers in the security and safety of systems

    New Fundamental Technologies in Data Mining

    Get PDF
    The progress of data mining technology and large public popularity establish a need for a comprehensive text on the subject. The series of books entitled by "Data Mining" address the need by presenting in-depth description of novel mining algorithms and many useful applications. In addition to understanding each section deeply, the two books present useful hints and strategies to solving problems in the following chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence will lead to significant development in the field of data mining

    Digital Transformation in Smart Farm and Forest Operations Needs Human-Centered AI: Challenges and Future Directions

    Get PDF
    The main impetus for the global efforts toward the current digital transformation in almost all areas of our daily lives is due to the great successes of artificial intelligence (AI), and in particular, the workhorse of AI, statistical machine learning (ML). The intelligent analysis, modeling, and management of agricultural and forest ecosystems, and of the use and protection of soils, already play important roles in securing our planet for future generations and will become irreplaceable in the future. Technical solutions must encompass the entire agricultural and forestry value chain. The process of digital transformation is supported by cyber-physical systems enabled by advances in ML, the availability of big data and increasing computing power. For certain tasks, algorithms today achieve performances that exceed human levels. The challenge is to use multimodal information fusion, i.e., to integrate data from different sources (sensor data, images, *omics), and explain to an expert why a certain result was achieved. However, ML models often react to even small changes, and disturbances can have dramatic effects on their results. Therefore, the use of AI in areas that matter to human life (agriculture, forestry, climate, health, etc.) has led to an increased need for trustworthy AI with two main components: explainability and robustness. One step toward making AI more robust is to leverage expert knowledge. For example, a farmer/forester in the loop can often bring in experience and conceptual understanding to the AI pipeline—no AI can do this. Consequently, human-centered AI (HCAI) is a combination of “artificial intelligence” and “natural intelligence” to empower, amplify, and augment human performance, rather than replace people. To achieve practical success of HCAI in agriculture and forestry, this article identifies three important frontier research areas: (1) intelligent information fusion; (2) robotics and embodied intelligence; and (3) augmentation, explanation, and verification for trusted decision support. This goal will also require an agile, human-centered design approach for three generations (G). G1: Enabling easily realizable applications through immediate deployment of existing technology. G2: Medium-term modification of existing technology. G3: Advanced adaptation and evolution beyond state-of-the-art

    Mining Social Media and Structured Data in Urban Environmental Management to Develop Smart Cities

    Get PDF
    This research presented the deployment of data mining on social media and structured data in urban studies. We analyzed urban relocation, air quality and traffic parameters on multicity data as early work. We applied the data mining techniques of association rules, clustering and classification on urban legislative history. Results showed that data mining could produce meaningful knowledge to support urban management. We treated ordinances (local laws) and the tweets about them as indicators to assess urban policy and public opinion. Hence, we conducted ordinance and tweet mining including sentiment analysis of tweets. This part of the study focused on NYC with a goal of assessing how well it heads towards a smart city. We built domain-specific knowledge bases according to widely accepted smart city characteristics, incorporating commonsense knowledge sources for ordinance-tweet mapping. We developed decision support tools on multiple platforms using the knowledge discovered to guide urban management. Our research is a concrete step in harnessing the power of data mining in urban studies to enhance smart city development

    Mining a Small Medical Data Set by Integrating the Decision Tree and t-test

    Get PDF
    [[abstract]]Although several researchers have used statistical methods to prove that aspiration followed by the injection of 95% ethanol left in situ (retention) is an effective treatment for ovarian endometriomas, very few discuss the different conditions that could generate different recovery rates for the patients. Therefore, this study adopts the statistical method and decision tree techniques together to analyze the postoperative status of ovarian endometriosis patients under different conditions. Since our collected data set is small, containing only 212 records, we use all of these data as the training data. Therefore, instead of using a resultant tree to generate rules directly, we use the value of each node as a cut point to generate all possible rules from the tree first. Then, using t-test, we verify the rules to discover some useful description rules after all possible rules from the tree have been generated. Experimental results show that our approach can find some new interesting knowledge about recurrent ovarian endometriomas under different conditions.[[journaltype]]國外[[incitationindex]]EI[[booktype]]紙本[[countrycodes]]FI
    corecore