53 research outputs found
Machine learning for network based intrusion detection: an investigation into discrepancies in findings with the KDD cup '99 data set and multi-objective evolution of neural network classifier ensembles from imbalanced data.
For the last decade it has become commonplace to evaluate machine learning techniques for network based intrusion detection on the KDD Cup '99 data set. This data set has served well to demonstrate that machine learning can be useful in intrusion detection. However, it has undergone some criticism in the literature, and it is out of date. Therefore, some researchers question the validity of the findings reported based on this data set. Furthermore, as identified in this thesis, there are also discrepancies in the findings reported in the literature. In some cases the results are contradictory. Consequently, it is difficult to analyse the current body of research to determine the value in the findings. This thesis reports on an empirical investigation to determine the underlying causes of the discrepancies. Several methodological factors, such as choice of data subset, validation method and data preprocessing, are identified and are found to affect the results significantly. These findings have also enabled a better interpretation of the current body of research. Furthermore, the criticisms in the literature are addressed and future use of the data set is discussed, which is important since researchers continue to use it due to a lack of better publicly available alternatives. Due to the nature of the intrusion detection domain, there is an extreme imbalance among the classes in the KDD Cup '99 data set, which poses a significant challenge to machine learning. In other domains, researchers have demonstrated that well-known techniques such as Artificial Neural Networks (ANNs) and Decision Trees (DTs) often fail to learn the minor class(es) due to class imbalance. However, this has not previously been recognized as an issue in intrusion detection. This thesis reports on an empirical investigation that demonstrates that it is the class imbalance that causes the poor detection of some classes of intrusion reported in the literature.
An alternative approach to training ANNs is proposed in this thesis, using Genetic Algorithms (GAs) to evolve the weights of the ANNs, referred to as an Evolutionary Neural Network (ENN). When employing evaluation functions that calculate the fitness proportionally to the instances of each class, thereby avoiding a bias towards the major class(es) in the data set, significantly improved true positive rates are obtained whilst maintaining a low false positive rate. These findings demonstrate that the issues of learning from imbalanced data are not due to limitations of the ANNs, but rather to the training algorithm. Moreover, the ENN is capable of detecting a class of intrusion that has been reported in the literature to be undetectable by ANNs. One limitation of the ENN is a lack of control over the classification trade-off the ANNs obtain. This is identified as a general issue with current approaches to creating classifiers. Striving to create a single best classifier that obtains the highest accuracy may give an unfruitful classification trade-off, which is demonstrated clearly in this thesis. Therefore, an extension of the ENN is proposed, using a Multi-Objective GA (MOGA), which treats the classification rate on each class as a separate objective. This approach produces a Pareto front of non-dominated solutions that exhibit different classification trade-offs, from which the user can select one with the desired properties. The multi-objective approach is also utilised to evolve classifier ensembles, which yields an improved Pareto front of solutions. Furthermore, the selection of classifier members for the ensembles is investigated, demonstrating how this affects the performance of the resultant ensembles. This is a key to explaining why some classifier combinations fail to give fruitful solutions.
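The two ideas in this abstract, a fitness that weights every class equally and a Pareto front over per-class classification rates, can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names (`per_class_rates`, `balanced_fitness`, `pareto_front`) and the toy data are assumptions made here for illustration, and the mean of per-class rates is only one possible class-proportional evaluation function.

```python
from typing import List, Sequence, Tuple


def per_class_rates(y_true: Sequence[int], y_pred: Sequence[int],
                    n_classes: int) -> Tuple[float, ...]:
    """Fraction of correctly classified instances for each class."""
    correct = [0] * n_classes
    total = [0] * n_classes
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    # A class with no instances contributes a rate of 0.0 (assumption).
    return tuple(c / n if n else 0.0 for c, n in zip(correct, total))


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Plain accuracy, which is dominated by the major class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_fitness(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> float:
    """Hypothetical fitness: mean of per-class rates, so each class
    counts equally regardless of how many instances it has."""
    rates = per_class_rates(y_true, y_pred, n_classes)
    return sum(rates) / len(rates)


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every objective and
    strictly better on at least one (all objectives are maximised)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Toy imbalanced data: 95 instances of class 0, 5 of class 1.
y_true = [0] * 95 + [1] * 5
pred_major_only = [0] * 100                      # ignores the minor class
pred_both = [0] * 85 + [1] * 10 + [1] * 5        # finds the minor class

# Per-class rates of hypothetical classifiers, treated as objectives.
candidates = [(1.0, 0.0), (0.9, 0.8), (0.8, 0.9), (0.7, 0.7), (0.5, 1.0)]
front = pareto_front(candidates)
```

In the toy data, plain accuracy prefers the classifier that never predicts the minor class, while the class-balanced fitness prefers the one that detects it. The Pareto filter keeps every candidate whose trade-off is not strictly beaten by another, which is the set a user would choose from.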
HUMANE internal case study: eVACUATE #1
This case study was conducted on 14 December 2015. The purpose was to evaluate the usefulness of the HUMANE approach as perceived by relevant developers (software engineers), and additionally to ask whether the HUMANE typology facilitates cross-disciplinary understanding.
The files included here provide a summary of the analysis and the transcript from a semi-structured focus group.
HUMANE external case study: eVACUATE #2
This case study was conducted in September to October 2016 with the purpose of providing an external validation of the HUMANE typology and method. This eVACUATE case study comprises four different engagements in order to ensure a comprehensive evaluation: a quantitative online survey on the HUMANE design patterns; a quantitative survey on the HUMANE typology used for characterising Human-Machine Networks (HMNs); and two focus groups evaluating the HUMANE method (covering the profiling process, network diagramming, implication analysis, and design pattern approach).
A summary of results, along with focus group transcripts, surveys and survey results, is included here.
Towards critical event monitoring, detection and prediction for self-adaptive future Internet applications
The Future Internet (FI) will be composed of a multitude of diverse types of services that offer flexible, remote access to software features, content, computing resources, and middleware solutions through different cloud delivery models, such as IaaS, PaaS and SaaS. Ultimately, this means that loosely coupled Internet services will form a comprehensive base for developing value-added applications in an agile way. Unlike traditional application development, which uses computing resources and software components under local administrative control, FI applications will thus strongly depend on third-party services. To maintain their quality of service, those applications therefore need to dynamically and autonomously adapt to an unprecedented level of changes that may occur during runtime. In this paper, we present our recent experience with monitoring, detection, and prediction of critical events for both software services and multimedia applications. Based on these findings, we introduce potential directions for future research on self-adaptive FI applications, bringing together those research directions.
Jakten på Raud den Rame: et studie av makt og samhandling i Saltens yngre jernalder [The hunt for Raud den Rame: a study of power and interaction in the Late Iron Age of Salten].
The thesis deals with economy, power and interaction in the Late Iron Age, focusing on the areas at and inside Saltstraumen. I take as my starting point the saga literature's accounts of Olav Tryggvason's encounter with Raud den Rame from Salten. The saga makes several claims about the political and economic conditions in Salten. Raud was allied with the Sami population of the district. Snorre relates that Raud had several hundred Sami in his retinue who were at his disposal when he needed them. My research questions proceed from these claims. Through the archaeological material and other sources I attempt to answer three central questions. Can a centre of power be identified in this area, such as the one described in the saga literature? Is interaction between the Sami and the people of Hålogaland (håløyger) traceable in this area? How much can the saga literature really tell us about society at this time?
I argue that such a centre of power can be traced through the archaeological material, and with the help of other sources I believe I can narrow down considerably where this centre of power was located. It turns out that several of the claims in the saga literature are supported by the archaeological material. I emphasise that although one often wishes to place sites and finds in ethnic and cultural categories, this cannot always be done without further consideration.
Whether Raud den Rame existed, or is a product of the long-term distortion of oral legends and the saga authors' personal contributions to those legends, one can never know. Nevertheless, the story of Raud den Rame tells us much about sociopolitical, economic and cultic conditions in Iron Age Salten. The relations between the Sami and the håløyger are strongly emphasised in Snorre's text. I believe this is a deliberate choice by the saga author. I argue that the håløyger's contacts with the Sami population ran far deeper than purely economic considerations. The chieftains' contacts with the Sami had an important religious and sociopolitical significance. The demonisation of the Sami people and of Sami sorcery was perhaps an important device for eradicating the old customs and, by placing an enemy of the king and the church in an alliance with them, for underscoring the political points.
Automation in Human-Machine Networks: How Increasing Machine Agency Affects Human Agency
Efficient human-machine networks require productive interaction between human and machine actors. In this study, we address how a strengthening of machine agency, for example through increasing levels of automation, affects the human actors of the networks. Findings from case studies within air traffic management, crisis management, and crowd evacuation are presented, exemplifying how automation may strengthen the agency of human actors in the network through responsibility sharing and task allocation, and serve as a needed prerequisite of innovation and change.
Business Process Risk Management and Simulation Modelling for Digital Audio-Visual Media Preservation.
Digitised and born-digital Audio-Visual (AV) content presents new challenges for preservation and Quality Assurance (QA) to ensure that cultural heritage is accessible for the long term. Digital archives have developed strategies for avoiding, mitigating and recovering from digital AV loss using IT-based systems, involving QA tools before ingesting files into the archive and utilising file-based replication to repair files that may be damaged while in the archive. However, while existing strategies are effective for addressing issues related to media degradation, issues such as format obsolescence and failures in processes and people pose significant risk to the long-term value of digital AV content. We present a Business Process Risk management framework (BPRisk) designed to support preservation experts in managing risks to long-term digital media preservation. This framework combines workflow and risk specification within a single risk management process designed to support continual improvement of workflows. A semantic model has been developed that allows the framework to incorporate expert knowledge from both preservation and security experts in order to intelligently aid workflow designers in creating and optimising workflows. The framework also provides workflow simulation functionality, allowing users to a) understand the key vulnerabilities in the workflows, b) target investments to address those vulnerabilities, and c) minimise the economic consequences of risks. The application of the BPRisk framework is demonstrated on a use case with the Austrian Broadcasting Corporation (ORF), discussing simulation results and an evaluation against the outcomes of executing the planned workflow.
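The abstract does not describe how BPRisk simulates workflows, so the following is only a generic Monte Carlo sketch of the idea of simulating a workflow to find its most costly vulnerabilities. The `Step` type, the function names, and all probabilities and costs are hypothetical values chosen for illustration, and each step is assumed to fail independently.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Step:
    """One workflow activity with an assumed, independent risk."""
    name: str
    failure_probability: float  # chance the risk occurs in one execution
    cost_if_failed: float       # economic consequence when it occurs


def simulate_losses(steps: List[Step], runs: int = 10_000,
                    seed: int = 0) -> Dict[str, float]:
    """Estimate the mean loss per workflow execution for each step
    by running the workflow many times with random risk outcomes."""
    rng = random.Random(seed)  # seeded so results are reproducible
    totals = {step.name: 0.0 for step in steps}
    for _ in range(runs):
        for step in steps:
            # rng.random() is in [0.0, 1.0), so probability 1.0 always
            # triggers the risk and probability 0.0 never does.
            if rng.random() < step.failure_probability:
                totals[step.name] += step.cost_if_failed
    return {name: total / runs for name, total in totals.items()}


def most_costly_step(mean_losses: Dict[str, float]) -> str:
    """Name of the step with the highest estimated mean loss."""
    return max(mean_losses, key=mean_losses.get)


# Hypothetical digitisation workflow; the numbers are invented.
workflow = [
    Step("ingest", failure_probability=0.02, cost_if_failed=500.0),
    Step("transcode", failure_probability=0.10, cost_if_failed=2000.0),
    Step("quality_check", failure_probability=0.05, cost_if_failed=100.0),
]
mean_losses = simulate_losses(workflow)
print(most_costly_step(mean_losses))
```

Ranking steps by estimated mean loss is one simple way to decide where an investment in mitigation would reduce the expected economic consequences the most.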
The development of a web-based application to predict the risk of gastrointestinal cancer in iron deficiency anaemia; the IDIOM app
To facilitate the clinical use of an algorithm for predicting the risk of gastrointestinal malignancy in iron deficiency anaemia (the IDIOM score), a software application has been developed, with a view to providing free and simple access to healthcare professionals in the UK. A detailed requirements analysis for intended users of the application revealed the need for an automated decision-support tool in which anonymised, individual patient data is entered and gastrointestinal cancer risk is calculated and displayed immediately, which lends itself to use in busy clinical settings. Human-centred design was employed to develop the solution, focusing on the users and their needs, whilst ensuring that they are provided with sufficient details to appropriately interpret the risk score. The IDIOM App has been developed using R Shiny as a web-based application, enabling access from different platforms, with updates that can be carried out centrally through the host server. The application has been evaluated through literature search, internal/external validation, code testing, risk analysis, and usability assessments. Legal notices, a contact system for reaching the research and maintenance teams, and all the supporting information for the application, such as a description of the population and intended users, have been embedded within the application interface. With the purpose of providing a guide to developing standalone software medical devices in an academic setting, this paper presents the theoretical and practical aspects of developing, writing technical documentation for, and certifying standalone software medical devices, using the IDIOM App as an example.