125 research outputs found

    Artificial Intelligence for the Edge Computing Paradigm.

    With modern technologies moving towards the Internet of Things, where seemingly every financial, private, commercial and medical transaction is carried out by portable and intelligent devices, Machine Learning has found its way into nearly every smart device and application. However, Machine Learning cannot be used directly on the edge due to the limited capabilities of small, battery-powered modules. This thesis therefore provides lightweight, automated Machine Learning models that run on a standard edge device, the Raspberry Pi: one framework limits parameter tuning while automating feature extraction, and a second performs conventional Machine Learning classification on the edge and can additionally be used for image-based explainable Artificial Intelligence. A commercial Artificial Intelligence software package has also been ported to a client/server setup on the Raspberry Pi board, where it is incorporated into all of the Machine Learning frameworks presented in this thesis. The dissertation further introduces multiple algorithms that convert images into time series for classification and explainability, novel time-series feature extraction algorithms applied to biomedical data, and the concept of the Activation Engine, a post-processing block that tunes Neural Networks without requiring particular experience in Machine Learning. Finally, a tree-based method for multiclass classification is introduced which outperforms the One-to-Many approach while being less complex than the One-to-One method.
    The results presented in this thesis exhibit high accuracy compared with the literature while remaining efficient in terms of power consumption and inference time. The concepts, methods and algorithms introduced are technically novel and include:
    • Feature extraction of professionally annotated and poorly annotated time series.
    • The introduction of the Activation Engine post-processing block.
    • A model for global image explainability with inference on the edge.
    • A tree-based algorithm for multiclass classification.
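
    The tree-based multiclass scheme mentioned above can be illustrated with a generic binary-tree decomposition: the class set is recursively split in two, and one binary classifier is trained per internal node, so K classes need only K-1 classifiers instead of K for One-to-Many or K(K-1)/2 for One-to-One. The sketch below shows that general idea under assumed choices of a balanced class split and a logistic-regression base learner; the class and method names (TreeMulticlass, _build) are hypothetical and not taken from the thesis.

        # Hypothetical sketch (not the thesis's exact algorithm): a binary-tree
        # decomposition of a K-class problem into K-1 binary tasks.
        import numpy as np
        from sklearn.base import clone
        from sklearn.linear_model import LogisticRegression

        class TreeMulticlass:
            def __init__(self, base_estimator=None):
                # assumed base learner; the thesis may use a different classifier
                self.base = base_estimator or LogisticRegression(max_iter=1000)

            def _build(self, X, y, classes):
                if len(classes) == 1:
                    return classes[0]                      # leaf: a single class label
                half = len(classes) // 2                   # naive balanced split of the class set
                left, right = classes[:half], classes[half:]
                target = np.isin(y, right).astype(int)     # 1 = sample belongs to the right branch
                clf = clone(self.base).fit(X, target)
                lmask, rmask = np.isin(y, left), np.isin(y, right)
                return (clf,
                        self._build(X[lmask], y[lmask], left),
                        self._build(X[rmask], y[rmask], right))

            def fit(self, X, y):
                X, y = np.asarray(X), np.asarray(y)
                self.tree_ = self._build(X, y, sorted(set(y)))
                return self

            def _predict_one(self, node, x):
                while isinstance(node, tuple):             # walk down until a leaf label is reached
                    clf, left, right = node
                    node = right if clf.predict(x.reshape(1, -1))[0] == 1 else left
                return node

            def predict(self, X):
                return np.array([self._predict_one(self.tree_, x) for x in np.asarray(X)])

    Only K-1 classifiers are ever trained in such a scheme, which is where the complexity advantage over the One-to-One decomposition comes from.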

    Novel methods for multi-view learning with applications in cyber security

    Modern data is complex. It exists in many different forms, shapes and kinds. Vectors, graphs, histograms, sets, intervals, etc.: they each have distinct and varied structural properties. Tailoring models to the characteristics of various feature representations has been the subject of considerable research. In this thesis, we address the challenge of learning from data that is described by multiple heterogeneous feature representations. This situation arises often in cyber security contexts. Data from a computer network can be represented by a graph of user authentications, a time series of network traffic, a tree of process events, etc. Each representation provides a complementary view of the holistic state of the network, and so data of this type is referred to as multi-view data. Our motivating problem in cyber security is anomaly detection: identifying unusual observations in a joint feature space, which may not appear anomalous marginally. Our contributions include the development of novel supervised and unsupervised methods, which are applicable not only to cyber security but to multi-view data in general. We extend the generalised linear model to operate in a vector-valued reproducing kernel Hilbert space implied by an operator-valued kernel function, which can be tailored to the structural characteristics of multiple views of data. This is a highly flexible algorithm, able to predict a wide variety of response types. A distinguishing feature is the ability to simultaneously identify outlier observations with respect to the fitted model. Our proposed unsupervised learning model extends multidimensional scaling to directly map multi-view data into a shared latent space. This vector embedding captures both commonalities and disparities that exist between multiple views of the data. Throughout the thesis, we demonstrate our models using real-world cyber security datasets.
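
    A common way to realise an operator-valued kernel in practice is the separable (decomposable) construction K(x, x') = k(x, x') B, where k is a scalar kernel and B is a symmetric positive semi-definite matrix coupling the output components. The sketch below shows vector-valued kernel ridge regression with such a kernel; it is a standard illustration of the construction, not the thesis's generalised linear model, and the function names, RBF kernel and toy data are assumptions.

        # A minimal sketch of vector-valued kernel ridge regression with a separable
        # operator-valued kernel K(x, x') = k(x, x') * B, where B is a symmetric
        # positive semi-definite matrix coupling the output components. Function
        # names and the toy data are illustrative; this is not the thesis's model.
        import numpy as np

        def rbf(X1, X2, gamma=1.0):
            # scalar Gaussian kernel between two sets of feature vectors
            d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
            return np.exp(-gamma * d2)

        def fit_vector_valued(X, Y, B, gamma=1.0, lam=1e-2):
            # solve (K ⊗ B + lam I) vec(C) = vec(Y) for the coefficient matrix C
            n, p = Y.shape
            G = np.kron(rbf(X, X, gamma), B)               # (n*p) x (n*p) block Gram matrix
            c = np.linalg.solve(G + lam * np.eye(n * p), Y.reshape(-1))
            return c.reshape(n, p)

        def predict_vector_valued(X_train, C, X_new, B, gamma=1.0):
            # f(x) = sum_i k(x, x_i) * B @ c_i, written as matrix products (B symmetric)
            return rbf(X_new, X_train, gamma) @ C @ B

        # toy usage: two correlated outputs whose coupling is encoded in B
        rng = np.random.default_rng(0)
        X = rng.normal(size=(50, 3))
        Y = np.column_stack([X[:, 0], X[:, 0] + 0.1 * rng.normal(size=50)])
        B = np.array([[1.0, 0.9], [0.9, 1.0]])
        C = fit_vector_valued(X, Y, B)
        print(predict_vector_valued(X, C, X[:5], B))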

    Sequence data mining and characterisation of unclassified microbial diversity

    In the last two decades, sequencing has become increasingly affordable and a routine tool to study the microbial community of a given environment. Metagenomics has revolutionised the way microbes are identified and studied in this age of biological data science because it provides a relatively unbiased view of the composition of microbial communities we interact with every day, which are integral to our ecosystem. These technological advances have led to an exponential growth of raw data repositories that save, distribute and archive these metagenomic datasets. Since metagenomics presents the ultimate opportunity to capture, explore and identify uncultivated microbial genomic sequences, these metagenomic datasets harbour a large proportion of unknown sequences that do not bear any similarity to known sequences readily available in the standard sequence data repositories. The aim of this thesis was to systematically catalogue, quantify and potentially characterise the unknown sequences embedded within the metagenomic datasets. To this end, a comprehensive, portable, modular framework called UnXplore was developed to determine the proportion of unknown sequences included in human microbiome datasets. UnXplore was applied to a range of different human microbiomes and showed that on average 2% of assembled sequences were categorised as unknown, meaning that they did not bear any sequence similarity to known sequences. A third of the unknown sequences were shown to contain large open reading frames, indicating the coding potential and biological origin of the unknowns. Furthermore, a small proportion of these potentially coding sequences were shown to have functional similarities as they were deemed to contain known protein domain signatures. These results indicated that unknown sequences captured through the UnXplore framework were not artefacts and were indeed of biological origin. To test this formally, supervised k-mer-based machine learning models were devised, tested and validated. These models are currently distributed in a package called TetraPredX that can accurately predict whether a sequence originated from bacteria, archaea, virus or plasmid. TetraPredX models were applied to the unknown sequence dataset and revealed that the majority of unknown sequences are of biological origin. Furthermore, TetraPredX results demonstrated that >70% of all long unknown sequences (i.e. >1 kb) are likely to be of virus origin, indicating an unexplored diversity of viruses that is yet to be fully characterised and classified. In order to catalogue the diversity of virus sequences in the human microbiome samples analysed here, an extensive virus discovery analysis was carried out on the contigs assembled through UnXplore. This helped to characterise a vast diversity of prokaryotic, eukaryotic and unclassified virus sequences captured in a range of human microbiomes. The results obtained here demonstrate the need to systematically interrogate metagenomic datasets to fully comprehend and compile the presence of both known and unknown uncultivated microbes within them. A comprehensive survey of metagenomic datasets carried out in this manner would provide a more complete picture of the known and unknown organisms that surround us.
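
    The TetraPredX models described above classify sequences from short k-mer composition. The sketch below illustrates the general approach under stated assumptions: tetranucleotide (4-mer) frequency vectors are computed for each contig and fed to an off-the-shelf random forest. The feature encoding, classifier choice and toy sequences are illustrative and not taken from the actual package.

        # Illustrative sketch of tetranucleotide (4-mer) feature extraction followed by
        # a supervised classifier, in the spirit of the TetraPredX models described
        # above; the actual package may differ in features, model and preprocessing.
        from itertools import product
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        TETRAMERS = [''.join(p) for p in product('ACGT', repeat=4)]   # all 256 possible 4-mers
        INDEX = {kmer: i for i, kmer in enumerate(TETRAMERS)}

        def tetranucleotide_freqs(seq):
            # normalised 256-dimensional tetranucleotide frequency vector of one contig
            counts = np.zeros(len(TETRAMERS))
            seq = seq.upper()
            for i in range(len(seq) - 3):
                idx = INDEX.get(seq[i:i + 4])   # windows containing N or other ambiguity codes are skipped
                if idx is not None:
                    counts[idx] += 1
            total = counts.sum()
            return counts / total if total else counts

        # toy usage with made-up sequences and labels
        sequences = ["ATGCGTACGTTAGC" * 20, "TTTTACGATCGAAA" * 20]
        labels = ["virus", "bacteria"]
        X = np.vstack([tetranucleotide_freqs(s) for s in sequences])
        clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, labels)
        print(clf.predict(X))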

    Deep Learning-Enabled Cerebrovascular Reactivity Processing Software

    Magnetic Resonance Imaging (MRI) is a non-invasive medical imaging technique that is used to generate high-resolution images of the brain. Blood oxygenation level dependent (BOLD) imaging is a functional MRI technique that maps differences in cerebral blood flow (CBF). Cerebrovascular Reactivity (CVR) is a provocative test used with BOLD-MRI studies in which a vasoactive stimulus is applied and the corresponding changes in CBF are analyzed. This test is analogous to a cardiac stress test, where the patient exercises and the change in heart blood flow is observed. CVR is measured as the ratio of the change in the BOLD signal to the change in the vasoactive stimulus. The vasoactive stimulus used in this work is the arterial partial pressure of carbon dioxide. The gas control for applying the stimulus is accomplished using a computer-controlled device called RespirAct RA-MR™. CVR studies can highlight abnormalities in the brain vasculature and therefore indicate underlying pathological conditions. Studies have demonstrated a close correlation between an irregular CVR distribution and cerebrovascular diseases such as Alzheimer’s disease, steno-occlusive disease (SOD), stroke, and traumatic brain injury. SOD is the most common cause of ischemic stroke worldwide. Patients with symptomatic SOD are at a high risk of recurrent ischemic stroke. Therefore, information about the severity and spatial location of irregular CVR at the brain tissue level can help guide clinical treatment. The current generation of CVR analyses and assessments is conducted manually by a team of doctors and radiologists, using their subject expertise and years of experience. In this work, a next-generation CVR processing and visualization software application is presented that furthers the research capabilities of CVR analyses. The proposed software is capable of processing raw BOLD-MRI files and generating CVR maps. It is developed using open-source tools and deployed as a stand-alone application that runs on a virtual machine. Additionally, convolutional neural networks (CNNs) are designed to facilitate the screening of SOD patients by classifying CVR maps into healthy and unhealthy patients. Popular pre-trained networks, such as EfficientNetB0, InceptionV3, ResNet50, and VGG16, are fine-tuned to accomplish the target classification. For training the CNNs, the original dataset consisted of 68 healthy and 163 unhealthy images. To increase the number of trainable samples and address the data imbalance, image augmentation techniques were applied. An empirical evaluation-based design strategy is implemented for optimizing the network architecture. The performance of different CNN architectures, along with fine-tuning of hyperparameters, was analyzed, and the optimal network is presented. Results from transfer learning are compared as well. Experiments indicated that a customized CNN with two convolution layers of 32 filters each and a hidden fully connected layer with 32 neurons produced the best results. This model uses batch normalization and dropout regularization after the convolution layers and the fully connected layer. It achieves high training, validation, and prediction accuracy, consistent with expert clinical readings.
The proposed software integrates the complete CVR research workflow, including file management, MRI file processing, data visualization, and CVR map production, and serves as a clinical decision support system, automating the workflow by 75% and providing a one-stop software solution. It is suggested that this software be used as a research tool to produce CVR maps and make data-driven decisions on SOD screening, along with validation by experts.
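
    The customized CNN described above (two convolution layers with 32 filters each, a hidden fully connected layer with 32 neurons, and batch normalization and dropout after the convolutional and dense blocks) can be sketched as follows. This is a hedged reconstruction in Keras: the input shape, kernel size, pooling, dropout rates and optimizer are assumptions, since the abstract does not specify them.

        # Hedged Keras sketch of the customized CNN described above; hyperparameters
        # not stated in the abstract (input shape, kernel size, pooling, dropout
        # rates, optimizer) are assumptions.
        from tensorflow.keras import layers, models

        def build_cvr_classifier(input_shape=(128, 128, 3)):
            model = models.Sequential([
                layers.Input(shape=input_shape),
                layers.Conv2D(32, (3, 3), activation='relu'),
                layers.BatchNormalization(),
                layers.MaxPooling2D(),
                layers.Dropout(0.25),
                layers.Conv2D(32, (3, 3), activation='relu'),
                layers.BatchNormalization(),
                layers.MaxPooling2D(),
                layers.Dropout(0.25),
                layers.Flatten(),
                layers.Dense(32, activation='relu'),
                layers.BatchNormalization(),
                layers.Dropout(0.5),
                layers.Dense(1, activation='sigmoid'),     # healthy vs. unhealthy CVR map
            ])
            model.compile(optimizer='adam', loss='binary_crossentropy',
                          metrics=['accuracy'])
            return model

        model = build_cvr_classifier()
        model.summary()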

    Collected Papers (Neutrosophics and other topics), Volume XIV

    This fourteenth volume of Collected Papers is an eclectic tome of 87 papers in Neutrosophics and other fields, such as mathematics, fuzzy sets, intuitionistic fuzzy sets, picture fuzzy sets, information fusion, robotics, statistics, or extenics, comprising 936 pages, published between 2008-2022 in different scientific journals or currently in press, by the author alone or in collaboration with the following 99 co-authors (alphabetically ordered) from 26 countries: Ahmed B. Al-Nafee, Adesina Abdul Akeem Agboola, Akbar Rezaei, Shariful Alam, Marina Alonso, Fran Andujar, Toshinori Asai, Assia Bakali, Azmat Hussain, Daniela Baran, Bijan Davvaz, Bilal Hadjadji, Carlos DĂ­az Bohorquez, Robert N. Boyd, M. Caldas, Cenap Özel, Pankaj Chauhan, Victor Christianto, Salvador Coll, Shyamal Dalapati, Irfan Deli, Balasubramanian Elavarasan, Fahad Alsharari, Yonfei Feng, Daniela GĂźfu, Rafael Rojas GualdrĂłn, Haipeng Wang, Hemant Kumar Gianey, Noel Batista HernĂĄndez, Abdel-Nasser Hussein, Ibrahim M. Hezam, Ilanthenral Kandasamy, W.B. Vasantha Kandasamy, Muthusamy Karthika, Nour Eldeen M. Khalifa, Madad Khan, Kifayat Ullah, Valeri Kroumov, Tapan Kumar Roy, Deepesh Kunwar, Le Thi Nhung, Pedro LĂłpez, Mai Mohamed, Manh Van Vu, Miguel A. Quiroz-MartĂ­nez, Marcel Migdalovici, Kritika Mishra, Mohamed Abdel-Basset, Mohamed Talea, Mohammad Hamidi, Mohammed Alshumrani, Mohamed Loey, Muhammad Akram, Muhammad Shabir, Mumtaz Ali, Nassim Abbas, Munazza Naz, Ngan Thi Roan, Nguyen Xuan Thao, Rishwanth Mani Parimala, Ion Pătrașcu, Surapati Pramanik, Quek Shio Gai, Qiang Guo, Rajab Ali Borzooei, Nimitha Rajesh, JesĂșs Estupiñan Ricardo, Juan Miguel MartĂ­nez Rubio, Saeed Mirvakili, Arsham Borumand Saeid, Saeid Jafari, Said Broumi, Ahmed A. Salama, Nirmala Sawan, Gheorghe Săvoiu, Ganeshsree Selvachandran, Seok-Zun Song, Shahzaib Ashraf, Jayant Singh, Rajesh Singh, Son Hoang Le, Tahir Mahmood, Kenta Takaya, Mirela Teodorescu, Ramalingam Udhayakumar, Maikel Y. Leyva VĂĄzquez, V. Venkateswara Rao, Luige Vlădăreanu, Victor Vlădăreanu, Gabriela Vlădeanu, Michael Voskoglou, Yaser Saber, Yong Deng, You He, Youcef Chibani, Young Bae Jun, Wadei F. Al-Omeri, Hongbo Wang, Zayen Azzouz Omar
