Development of a cloud-assisted classification technique for the preservation of secure data storage in smart cities
Cloud computing is the most recent smart city advancement, enabled by the increasing volume of heterogeneous data produced by applications. Processing this volume of data requires more storage capacity and processing power. Data analytics is used to examine various datasets, both structured and unstructured. Nonetheless, as the complexity of data in the healthcare and biomedical communities grows, obtaining more precise results from analyses of medical datasets presents a number of challenges. Big data is abundant in the cloud environment and requires proper classification, which can be performed effectively using machine learning. Machine learning investigates algorithms for learning from data and making predictions. The Cleveland database is frequently used by machine learning researchers. The performance metrics used to compare the proposed and existing methodologies include execution time, defect detection rate, and accuracy. In this study, two supervised learning-based classifiers, SVM and a novel KNN, were proposed and used to analyse data from a benchmark database obtained from the UCI repository. Initially, intrusions were detected using the SVM classification method. The proposed study demonstrated how the novel KNN's distance measure outperformed previous studies. The accuracy of both approaches was evaluated. The results show that the proposed intrusion detection system (IDS) produces the best results, with a 98.98% accuracy rate.
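The abstract does not specify the distance measure used by the novel KNN; as a rough illustration of the classification step only, a minimal plain k-nearest-neighbour classifier for two-class (normal vs. intrusion) records could be sketched as follows (all names and data here are hypothetical, not the study's implementation):

```python
from collections import Counter
import math

def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: euclidean(train_X[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

A novel KNN variant would typically swap `euclidean` for a different distance function while keeping the voting step unchanged.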
MapReduce based RDF assisted distributed SVM for high throughput spam filtering
This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University. Electronic mail has become deeply embedded in our everyday lives. Billions of legitimate emails are sent on a daily basis. The widely established underlying infrastructure, its widespread availability, and its ease of use have all acted as catalysts for such pervasive proliferation. Unfortunately, the same can be said of unsolicited bulk email, or rather spam. Various methods, as well as enabling architectures, are available to try to mitigate the permeation of spam. In this respect, this dissertation complements existing survey work in this area by contributing an extensive literature review of traditional and emerging spam filtering approaches. Techniques, approaches and architectures employed for spam filtering are appraised, critically assessing their respective strengths and weaknesses.
Velocity, volume and variety are key characteristics of the spam challenge. MapReduce (M/R) has become increasingly popular as an Internet-scale, data-intensive processing platform. In the context of machine learning based spam filter training, support vector machine (SVM) based techniques have proven effective. SVM training is, however, a computationally intensive process. In this dissertation, a M/R based distributed SVM algorithm for scalable spam filter training, designated MRSMO, is presented. By distributing and processing subsets of the training data across multiple participating computing nodes, the distributed SVM reduces spam filter training time significantly. To mitigate the accuracy degradation introduced by the adopted approach, a Resource Description Framework (RDF) based feedback loop is evaluated. Experimental results demonstrate that this feedback loop improves the accuracy of the distributed SVM beyond that of the original sequential counterpart.
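MRSMO's actual SMO-based solver is not reproduced here; the following sketch only illustrates the data-parallel pattern the abstract describes, i.e. train on subsets, then combine, using a simple perceptron as a stand-in for the per-partition SVM training and merge-by-averaging as an assumed combination rule (in a real M/R job, each partition would be handled by a separate mapper):

```python
def train_partition(partition):
    """'Map' step: train a simple linear perceptron on one training subset
    (a stand-in for the per-partition SVM solver used by MRSMO)."""
    X, y = partition
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(20):                     # a few fixed epochs
        for xi, yi in zip(X, y):
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) <= 0:
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                b += yi
    return w, b

def merge(models):
    """'Reduce' step: combine per-partition models by averaging."""
    n = len(models)
    dim = len(models[0][0])
    w = [sum(m[0][j] for m in models) / n for j in range(dim)]
    b = sum(m[1] for m in models) / n
    return w, b
```

The accuracy loss of such subset training is exactly what the RDF feedback loop in the thesis is designed to recover.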
Effectively exploiting large scale, ‘Cloud’ based, heterogeneous processing capabilities for M/R in what can be considered a non-deterministic environment requires the consideration of a number of perspectives. In this work, gSched, a Hadoop M/R based, heterogeneity-aware task-to-node matching and allocation scheme is designed. Using MRSMO as a baseline, experimental evaluation indicates that gSched improves on the performance of the out-of-the-box Hadoop counterpart in a typical Cloud based infrastructure.
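gSched's actual cost model is not detailed in the abstract; a generic earliest-estimated-finish-time heuristic conveys the basic idea of heterogeneity-aware task-to-node matching (the per-node speed factors and the greedy largest-task-first rule below are illustrative assumptions, not gSched's algorithm):

```python
def schedule(tasks, nodes):
    """Greedy heterogeneity-aware matching: place each task on the node
    with the earliest estimated finish time, given per-node speeds.
    `tasks` maps task name -> work units; `nodes` maps node -> speed."""
    finish = {n: 0.0 for n in nodes}        # estimated busy time per node
    plan = {}
    for task, work in sorted(tasks.items(), key=lambda kv: -kv[1]):
        best = min(nodes, key=lambda n: finish[n] + work / nodes[n])
        finish[best] += work / nodes[best]
        plan[task] = best
    return plan
```

Stock Hadoop schedulers of the era assumed homogeneous nodes; accounting for differing node speeds is what such a matching scheme adds.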
The focal contribution to knowledge is a scalable, heterogeneous infrastructure and machine learning based spam filtering scheme, able to capitalize on collaborative accuracy improvements through RDF based, end user feedback.
Design for energy-efficient and reliable fog-assisted healthcare IoT systems
Cardiovascular disease and diabetes are two of the most dangerous diseases, as they are the leading causes of death across all ages. Unfortunately, they cannot be completely cured with current knowledge and existing technologies. However, they can be effectively managed by applying methods of continuous health monitoring. Nonetheless, it is difficult to achieve a high quality of healthcare with current health monitoring systems, which often have several limitations such as a lack of mobility support, energy inefficiency, and an insufficiency of advanced services. Therefore, this thesis presents a Fog computing approach focusing on four main tracks, and proposes it as a solution to the existing limitations. In the first track, the main goal is to introduce Fog computing and Fog services into remote health monitoring systems in order to enhance the quality of healthcare.
In the second track, a Fog approach providing mobility support in a real-time health monitoring IoT system is proposed. The handover mechanism run by Fog-assisted smart gateways helps to maintain the connection between sensor nodes and the gateways with minimal latency. Results show that the handover latency of the proposed Fog approach is 10%-50% lower than that of other state-of-the-art mobility support approaches.
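The thesis's handover mechanism is not detailed in this abstract; one common way to decide handovers between gateways while avoiding latency-costly ping-pong switching is a signal-strength comparison with hysteresis, sketched below (the function name, dictionary interface, and 5 dB threshold are all illustrative assumptions):

```python
def handover_target(current, rssi, hysteresis=5.0):
    """Return the gateway a sensor node should attach to: switch only when
    another gateway is stronger than the current one by at least
    `hysteresis` dB. `rssi` maps gateway name -> signal strength (dBm)."""
    best = max(rssi, key=rssi.get)
    if best != current and rssi[best] >= rssi[current] + hysteresis:
        return best
    return current
```

For example, a node attached to a gateway at -70 dBm would hand over to one at -62 dBm, but not to one at -68 dBm.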
In the third track, the designs of four energy-efficient health monitoring IoT systems are discussed and developed. Each energy-efficient system and its sensor nodes are designed to serve a specific purpose such as glucose monitoring, ECG monitoring, or fall detection; with the exception of the fourth system, which is an advanced, combined system for simultaneously monitoring several diseases such as diabetes and cardiovascular disease. Results show that these sensor nodes can work continuously for up to 70-155 hours, depending on the application, when using a 1000 mAh lithium battery.
The fourth track provides a Fog-assisted remote health monitoring IoT system for diabetic patients with cardiovascular disease. Via several proposed algorithms, such as QT interval extraction, activity status categorization, and fall detection algorithms, the system can process data and detect abnormalities in real time. Results show that the proposed system using Fog services is a promising approach for improving the treatment of diabetic patients with cardiovascular disease.
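The thesis's QT-interval and categorization algorithms are not reproduced in this abstract; as a hedged illustration of the fall-detection idea only, a simple accelerometer-magnitude heuristic (a high-g impact followed by near-stillness) might look like the following, where the function name and both thresholds are illustrative assumptions:

```python
import math

def detect_fall(samples, impact_g=2.5, still_g=0.3):
    """Flag a fall when a high-acceleration impact is followed only by
    near-stillness (magnitudes close to 1 g). `samples` is a list of
    (x, y, z) accelerometer readings in units of g; thresholds are
    illustrative, not clinically validated values."""
    for i, (x, y, z) in enumerate(samples):
        if math.sqrt(x * x + y * y + z * z) > impact_g:
            tail = samples[i + 1:]
            if tail and all(abs(math.sqrt(a * a + b * b + c * c) - 1.0) < still_g
                            for a, b, c in tail):
                return True
    return False
```

Running such a check on the Fog gateway rather than in the cloud is what enables the real-time abnormality detection the track describes.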
Fundamental Approaches to Software Engineering
This open access book constitutes the proceedings of the 25th International Conference on Fundamental Approaches to Software Engineering, FASE 2022, which was held during April 4-5, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 17 regular papers presented in this volume were carefully reviewed and selected from 64 submissions. The proceedings also contain 3 contributions from the Test-Comp Competition. The papers deal with the foundations on which software engineering is built, including topics like software engineering as an engineering discipline, requirements engineering, software architectures, software quality, model-driven development, software processes, software evolution, AI-based software engineering, and the specification, design, and implementation of particular classes of systems, such as (self-)adaptive, collaborative, AI, embedded, distributed, mobile, pervasive, cyber-physical, or service-oriented applications.
Processing of Polarization Patterns and Visual Self-Motion in the Locust Central Complex for Spatial Orientation
Despite their relatively small brains with comparatively low neuron counts, insects show complex navigation behavior such as seasonal long-range migration, path integration, and precise straight-line movement. Spatial navigation requires a sense of current heading, which must be tethered to prominent external cues and updated by internal cues that result from movement.
Global external cues such as the position of the sun may provide a reference frame for orientation. Sunlight is polarized by scattering in the atmosphere, which results in a sky-spanning polarization pattern that directly depends on the current solar position and makes polarization information, like the sun itself, useful as an external reference cue. Internally, moving through the environment generates optic flow (the motion of the viewed scenery on the retina), which may inform about turning maneuvers, movement speed, and covered distance. Many insects use these external and internal cues for orientation, and the neuronal center for spatial navigation is likely the central complex, a higher-order brain structure where sensory information is integrated to form an internal compass representation of the current heading.
This thesis addresses the question of how celestial compass cues, specifically the polarization pattern, and optic flow are processed in the central complex of the desert locust, a long-range migratory insect. All chapters except the last one are electrophysiological studies in which single central-complex neurons were recorded intracellularly while visual stimuli were presented. The neurons' anatomy was determined histologically by dye injection in order to infer their role in the neural network.
The studies in Chapters 1 and 2 show that the central complex contains a neuronal compass that robustly signals the sun direction based on direct sunlight and the integration of the whole solar polarization pattern. This shows that the locust brain uses all available skylight cues in order to form a unified compass signal, enabling robust navigation under different environmental conditions.
The study in Chapter 3 further examines how neurons at the input stage of the central complex process skylight cues. Already at this stage, single neurons integrate visual information from large areas of the sky and have receptive fields suitable to build the skylight compass.
Chapter 4 sheds light on the detection sensitivity for the angle of polarization, finding that central-complex neurons are highly sensitive in this regard, adapted to analyze the skylight polarization pattern almost in its entirety and under unfavorable environmental conditions.
In Chapter 5 the locust central complex was scanned for neurons that receive optic flow information. Neurons at virtually all network stages are sensitive to optic flow, mainly uncoupled from skylight-cue sensitivity. This highlights that sensory information is flexibly processed in the central complex, presumably depending on the animal's current behavioral demands. Further, the study hypothesizes how horizontal turning motion is processed in order to update the internal heading representation, backed up by a computational model that adheres to brain anatomy and physiological data.
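The chapter's anatomically constrained network model is not reproduced in this abstract; at its computational core, updating an internal heading representation from horizontal turning motion amounts to integrating the estimated yaw rate over time, which can be sketched as (function and parameter names are illustrative):

```python
def update_heading(heading_deg, yaw_rate_dps, dt):
    """Integrate an estimated horizontal turning rate (deg/s), e.g. derived
    from rotational optic flow, into the internal heading estimate,
    wrapped to the compass range [0, 360)."""
    return (heading_deg + yaw_rate_dps * dt) % 360.0
```

In the insect central complex this integration is hypothesized to be carried out by shifting activity around a ring-like population of heading neurons rather than by explicit arithmetic.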
Altogether, these studies advance the understanding of how external and internal cues are processed in the central-complex network in order to establish a sense of orientation in the insect brain.
Finally, I contributed data sets and programming code to the development of the InsectBrainDatabase (www.insectbraindb.org), a free online database tool designed to manage, share, and publish anatomical and functional research data (Chapter 6).
Code similarity and clone search in large-scale source code data
Software development has benefited tremendously from the Internet through online code corpora that enable instant sharing of source code, as well as online developer guides and documentation. Nowadays, duplicated code (i.e., code clones) exists not only within or across software projects but also between online code repositories and websites. We call these "online code clones." Like classic code clones between software systems, they can lead to license violations, bug propagation, and reuse of outdated code. Unfortunately, they are difficult to locate and fix, since the search space in online code corpora is large and no longer confined to a local repository. This thesis presents a combined study of code similarity and online code clones. We empirically show that many code snippets on Stack Overflow are cloned from open source projects. Several of them have become outdated or violate their original license and are potentially harmful to reuse. To develop a solution for finding online code clones, we study various code similarity techniques to gain insights into their strengths and weaknesses. A framework, called OCD, for evaluating code similarity and clone search tools is introduced and used to compare 34 state-of-the-art techniques on pervasively modified code and boiler-plate code. We also found that clone detection techniques can be enhanced by compilation and decompilation. Using the knowledge from the comparison of code similarity analysers, we create and evaluate Siamese, a scalable token-based clone search technique that uses multiple code representations. Our evaluation shows that Siamese scales to large-scale source code data of 365 million lines of code and offers high search precision and recall. Its clone search precision is comparable to that of seven state-of-the-art clone detection tools on the OCD framework.
Finally, we demonstrate the usefulness of Siamese by applying the tool to find online code clones, automatically analyse clone licenses, and recommend tests for reuse.
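Siamese's multi-representation indexing and ranking are not reproduced here; the core idea of token-based clone similarity can be illustrated by Jaccard similarity over token n-grams (whitespace tokenization and n = 3 are simplifying assumptions, not Siamese's actual pipeline):

```python
def ngrams(tokens, n=3):
    """All contiguous token n-grams of a token sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def similarity(code_a, code_b, n=3):
    """Jaccard similarity over token n-grams: a core measure behind
    token-based clone detection and search."""
    a, b = ngrams(code_a.split(), n), ngrams(code_b.split(), n)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

A clone search tool scales this up by indexing the n-grams of a corpus so that candidate clones can be retrieved without pairwise comparison against all 365 million lines.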
BIG DATA and Advanced Analytics: conference proceedings
The proceedings present the results of research and development in the field of BIG DATA and Advanced Analytics for optimising IT solutions and business solutions, as well as case studies in medicine, education, and ecology.