34 research outputs found

    Annotate and retrieve in vivo images using hybrid self-organizing map

    Multimodal retrieval has gained much attention lately because it is more effective than uni-modal retrieval. For instance, visual features often under-constrain the description of an image in content-based retrieval; another modality, such as collateral text, can be introduced to bridge the semantic gap and make the retrieval process more efficient. This article applies cross-modal fusion and retrieval to real in vivo gastrointestinal images and linguistic cues, since visual features alone are insufficient to describe the images and to assist gastroenterologists. A cross-modal information retrieval approach is proposed that retrieves related images given text, and vice versa, while handling the heterogeneity gap between the modalities. The technique comprises two stages: (1) individual modality feature learning; and (2) fusion of the two trained networks. In the first stage, two self-organizing maps (SOMs) are trained separately on images and texts, which are clustered in the respective SOMs based on their similarity. In the second (fusion) stage, the trained SOMs are integrated using an associative network to enable cross-modal retrieval. The associative network is trained with Hebbian learning and Oja learning (an improved form of Hebbian learning). The framework can annotate images with keywords and illustrate keywords with images, and it can be extended to incorporate more diverse modalities. Extensive experiments were performed on real gastrointestinal images, each with collateral keywords, obtained from a known gastroenterologist. The results demonstrate the efficacy of the algorithm and its significance in helping gastroenterologists make quick and pertinent decisions.
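
    The difference between Hebbian learning and Oja learning mentioned in this abstract can be illustrated with a minimal sketch. It trains a single linear neuron on synthetic 2-D inputs and is not the paper's associative network; the function names, starting weights, learning rate and data are all assumptions made for illustration.

```python
import math
import random


def train(samples, eta, rule):
    """Train one linear neuron y = w . x with the Hebbian or the Oja update."""
    w = [0.5, 0.5]  # arbitrary starting weights (assumption)
    for x in samples:
        y = sum(wi * xi for wi, xi in zip(w, x))
        if rule == "hebb":
            # Plain Hebbian rule: dw = eta * y * x, so the weights keep growing.
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
        else:
            # Oja's rule: dw = eta * y * (x - y * w). The extra decay term
            # -eta * y^2 * w keeps the weight norm close to 1.
            w = [wi + eta * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w


def norm(w):
    """Euclidean length of the weight vector."""
    return math.sqrt(sum(wi * wi for wi in w))


random.seed(0)
# Synthetic inputs with most of their variance along the first axis
# (illustrative data only, not taken from the paper).
samples = [(random.gauss(0, 2.0), random.gauss(0, 0.5)) for _ in range(5000)]

w_hebb = train(samples, eta=0.005, rule="hebb")
w_oja = train(samples, eta=0.005, rule="oja")

print("Hebbian weight norm:", norm(w_hebb))
print("Oja weight norm:    ", norm(w_oja))
print("Oja weights:        ", w_oja)
```

    In this sketch the Hebbian weights grow without bound, while the Oja weights stay near unit length and align with the dominant input direction, which is the usual reason for describing Oja's rule as a stabilized form of Hebbian learning.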

    A Correlation Based Recommendation System for Large Data Sets

    Correlation analysis brings out relationships in data that had not been seen before, and using the power of correlations successfully is imperative for data mining. In this paper, we use correlations to cluster data and merge the clustering with recommendation algorithms. We propose two correlation clustering algorithms, RBACC and LGBACC, which produce high-quality hierarchical clusters: RBACC is based on Spearman’s rank correlation coefficient among data points, and LGBACC uses a dimensionality reduction approach (PCA) together with graph theory. Both algorithms have been tested on real-life data (the New York yellow cabs dataset, taken from http://www.nyc.gov) using distributed and parallel computing (Spark and R). They are found to be scalable and to perform better than existing hierarchical clustering algorithms. The two approaches have been used to replace the similarity measures in recommendation algorithms, giving a recommendation system model based on correlation clustering. We have combined the power of correlation analysis with that of prediction analysis to propose a better recommendation system. This model is found to make better quality recommendations than the random recommendation model. It has been validated on a real-time, large data set (the MovieLens dataset, taken from http://grouplens.org/datasets/movielens/latest). The results show that combining correlated points with the predictive power of recommendation algorithms produces better quality recommendations that are faster to compute. LGBACC has approximately 25% better prediction capability than RBACC but takes significantly more prediction time.
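
    As an illustration of the Spearman rank correlation coefficient that RBACC builds on, the following minimal sketch computes it as the Pearson correlation of the ranks, with tied values sharing their mean rank. It is not the RBACC algorithm itself; the function names and the sample values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(a), average_ranks(b))


# Hypothetical values: a monotonic but non-linear relationship.
distance = [1, 2, 3, 4, 5]
fare = [1, 4, 9, 16, 25]
print(spearman(distance, fare))
print(spearman(distance, list(reversed(fare))))
```

    Because the coefficient depends only on ranks, any strictly increasing relationship scores the same as a linear one, which is one reason rank correlation is attractive as a similarity measure for clustering.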

    Explainable Information Retrieval using Deep Learning for Medical images

    Image segmentation is useful for extracting valuable information for efficient analysis of a region of interest. In real-life situations such as streaming video, the number of images generated is usually large and not suited to traditional segmentation with machine learning algorithms. This is due to the following factors: (a) numerous image features; (b) complex distributions of shapes, colors and textures; (c) an imbalanced ratio of the underlying classes; (d) movements of the camera and objects; and (e) variations in luminance at the capture site. We therefore propose an efficient deep learning model for image classification, with bleeding detection in gastrointestinal images as the proof-of-concept case study. An Explainable Artificial Intelligence (XAI) module is used to reverse engineer the test results and show the impact of features on a given test dataset. The architecture is generally applicable to other areas of image classification. The proposed method has been compared with state-of-the-art methods including Logistic Regression, Support Vector Machine, Artificial Neural Network and Random Forest. It achieves an F1 score of 0.76 on the real-world streaming dataset, which is better than the traditional methods.

    Automated IoT device identification based on full packet information using real-time network traffic

    In an Internet of Things (IoT) environment, a large volume of potentially confidential data might be leaked from sensors installed everywhere. To ensure the authenticity of such sensitive data, it is important first to verify the source of the data and its identity. In practice, IoT device identification is the primary step toward a secure IoT system. An appropriate device identification approach can counteract malicious activities such as sending false data that trigger irreparable security issues in vital or emergency situations. Recent research indicates that primary identity metrics such as Internet Protocol (IP) or Media Access Control (MAC) addresses are insufficient due to their instability or easy accessibility. Thus, to identify an IoT device, it is imperative to analyze the header information of the packets sent by the sensors. This paper proposes combining sensor measurement and statistical feature sets with a header feature set in a classification-based device identification framework. Various machine learning algorithms have been applied to different combinations of these feature sets to provide enhanced security in IoT devices. The proposed method has been evaluated under normal and under-attack conditions, using real-time data collected from IoT devices connected in a lab setting, to show the robustness of the system.

    Machine learning applications for smart building energy utilization: a survey

    In 2015 the United Nations launched its sustainable development goals, which include goals for sustainable energy. Households account for 20–30% of energy consumption in Europe, North America and Asia; furthermore, overall global energy consumption has increased steadily in recent decades. Consequently, to meet the increased energy demand and to promote efficient energy consumption, there is a persistent need to develop applications that enhance the utilization of energy in buildings. However, despite the potential significance of AI in this area, few surveys have systematically categorized these applications. This paper therefore presents a systematic review of the literature and creates a novel taxonomy for applications of smart building energy utilization. The contributions of this paper are (a) a systematic review of applications and machine learning methods for smart building energy utilization, (b) a novel taxonomy for the applications, (c) a detailed analysis of the solutions and techniques used for the applications (electric grid, smart building energy management and control, maintenance and security, and personalization), and, finally, (d) a discussion of open issues and developments in the field.

    N-semble-based method for identifying Parkinson's disease genes

    Identifying Parkinson’s disease (PD) genes plays an important role in improving the diagnosis and treatment of the disease. A number of machine learning methods have been proposed to identify disease-related genes, but only a few of them have been adopted for PD. This work puts forth a novel neural network-based ensemble (n-semble) method to identify Parkinson’s disease genes. The artificial neural network is trained in a unique way to ensemble the predictions of multiple models. The proposed n-semble method is composed of four parts: (1) protein sequences are used to construct feature vectors from the physicochemical properties of amino acids; (2) dimensionality reduction is achieved using the t-Distributed Stochastic Neighbor Embedding (t-SNE) method; (3) the Jaccard method is applied to find likely negative samples among unknown (candidate) genes; and (4) gene prediction is performed with the n-semble method. The proposed n-semble method has been compared with Smalter’s, ProDiGe, PUDI and EPU methods using various evaluation metrics. It outperforms these existing gene identification methods and achieves significantly higher precision, recall and F-score of 88.9%, 90.9% and 89.8%, respectively. The obtained results confirm the effectiveness and validity of the proposed framework.

    A Comprehensive Security Architecture for Information Management throughout the Lifecycle of IoT Products

    The Internet of Things (IoT) is expected to have an impact on business and the world at large comparable to that of the Internet itself. An IoT product is a physical product with an associated virtual counterpart that is connected to the internet and has computational as well as communication capabilities. The ability to collect information from internet-connected products and sensors gives unprecedented possibilities to improve and optimize product use and maintenance. Virtual counterpart and digital twin (DT) concepts have been proposed as a solution for providing the necessary information management throughout the whole product lifecycle, which we here call product lifecycle information management (PLIM). Security in these systems is imperative because of the multiple ways in which opponents can attack the system during the whole lifecycle of an IoT product. To address this need, the current study proposes a security architecture for the IoT, taking into particular consideration the requirements of PLIM. The security architecture has been designed for the Open Messaging Interface (O-MI) and Open Data Format (O-DF) standards for the IoT and product lifecycle management (PLM), but it is also applicable to other IoT and PLIM architectures. The proposed security architecture is capable of hindering unauthorized access to information and restricts access levels based on user roles and permissions. Based on our findings, the proposed security architecture is the first security model for PLIM to integrate and coordinate the IoT ecosystem, by dividing the security approaches into two domains: the user client domain and the product domain. The security architecture has been deployed in smart city use cases in three European cities (Helsinki, Lyon, and Brussels) to validate the security metrics in the proposed approach. Our analysis shows that the proposed security architecture can easily integrate the security requirements of both clients and products, providing solutions for them as demonstrated in the implemented use cases.

    Citizen participation: crowd-sensed sustainable indoor location services

    In the present era of sustainable innovation, the circular economy paradigm dictates the optimal use and exploitation of existing finite resources. At the same time, the transition to smart infrastructures requires considerable investment in capital, resources and people. In this work, we present a general machine learning approach for offering indoor location awareness without the need to invest in additional, specialised hardware. We explore use cases where visitors equipped with their smartphones interact with the available WiFi infrastructure to estimate their location, since the indoor setting limits standard GPS solutions. Results show that the proposed approach achieves an accuracy of better than 2 m and that the model is resilient even when a substantial number of BSSIDs are dropped. Comment: Preprint submitted to Elsevier.

    Comparing seven methods for state-of-health time series prediction for the lithium-ion battery packs of forklifts

    Assessing the state of health (SoH) of the battery is a key aspect of forklift operation, as it ensures the safety and reliability of an uninterrupted power source. Forecasting the battery SoH well is imperative to enable preventive maintenance and hence to reduce costs. This paper demonstrates the capabilities of gradient boosting regression for predicting the SoH time series when little prior information about the batteries is available. We compared the gradient boosting method with light gradient boosting, extra trees, extreme gradient boosting, random forests, long short-term memory networks, and combined convolutional neural network and long short-term memory network methods. We used multiple predictors, with lagged target signal decomposition results as additional predictors, and compared the prediction results obtained with different sets of predictors for each method. For this work, we are in possession of a unique data set of 45 lithium-ion battery packs with large variation in the data. The best model that we derived was validated by a novel walk-forward algorithm that also calculates point-wise confidence intervals for the predictions; we obtained reasonable predictions and confidence intervals. Furthermore, we verified this model against five other lithium-ion battery packs; the best model generalised to a greater extent to this set of battery packs. The results for the final model suggest an improvement over previously developed models. Moreover, using data from new forklifts, we further validated the model for extracting cycle counts presented in our previous work; their battery packs completed around 3000 cycles in a 10-year service period, which corresponds to the cycle life of commercial Nickel–Cobalt–Manganese (NMC) cells.
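
    The general idea of walk-forward validation named in this abstract can be sketched as follows. This is a generic expanding-window scheme and not the paper's novel algorithm, which additionally computes point-wise confidence intervals. The function names, the placeholder forecaster and the SoH values are made up for illustration.

```python
def walk_forward_splits(n_samples, initial_train, horizon):
    """Yield (train_indices, test_indices) with an expanding training window."""
    start = initial_train
    while start + horizon <= n_samples:
        # Train on everything before `start`, test on the next `horizon` points.
        yield list(range(0, start)), list(range(start, start + horizon))
        start += horizon


def naive_forecast(train_values, horizon):
    """Placeholder model: repeat the last observed value (not the paper's model)."""
    return [train_values[-1]] * horizon


# Made-up SoH percentages, ordered in time (illustrative only).
soh = [100.0, 99.5, 99.1, 98.8, 98.2, 97.9, 97.5, 97.0, 96.4, 96.1]

errors = []
for train_idx, test_idx in walk_forward_splits(len(soh), initial_train=6, horizon=2):
    history = [soh[i] for i in train_idx]
    predicted = naive_forecast(history, len(test_idx))
    actual = [soh[i] for i in test_idx]
    errors.extend(abs(p - a) for p, a in zip(predicted, actual))

mean_absolute_error = sum(errors) / len(errors)
print("Mean absolute error over all folds:", mean_absolute_error)
```

    Each fold tests only on observations that come after its training window, so no future value leaks into training, which is the property that makes this kind of validation suitable for time series.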