19 research outputs found

    A Novel Multidimensional Reference Model For Heterogeneous Textual Datasets Using Context, Semantic And Syntactic Clues

    Get PDF
    With the advent of technology and the use of the latest devices, voluminous data are produced. Of these data, roughly 80% are unstructured and the remaining 20% are structured or semi-structured. The produced data are heterogeneous in format and follow no common standard. Among heterogeneous (structured, semi-structured and unstructured) data, textual data are now widely used by industry for the prediction and visualization of future challenges. Extracting useful information from such data is genuinely challenging for stakeholders because of lexical and semantic matching. A few studies have addressed this issue using ontologies and semantic tools, but the main limitation of the proposed approaches was their limited coverage of multidimensional terms. To solve this problem, this study proposes a novel multidimensional reference model (MRM) based on linguistic categories for heterogeneous textual datasets. The categories of context, semantic and syntactic clues are considered along with their scores. The main contribution of the MRM is that it checks each token against each term based on an indexing of linguistic categories such as synonym, antonym, formal, lexical word order and co-occurrence. The experiments show that the MRM outperforms the state-of-the-art single-dimension reference model in terms of coverage, linguistic categories and heterogeneous datasets.
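The token-versus-term matching the abstract describes can be sketched as a weighted lookup across linguistic categories. The lexicons, example term, and category weights below are illustrative assumptions, not values from the paper:

```python
# Hypothetical sketch of multidimensional term matching: each token is scored
# against a reference term across several linguistic categories, and the
# per-category weights are summed into one similarity score. The reference
# lexicon and weights are invented for illustration.

REFERENCE = {
    "purchase": {
        "synonym": {"buy", "acquire", "procure"},
        "antonym": {"sell"},
        "co_occurrence": {"order", "invoice", "price"},
    },
}

WEIGHTS = {"synonym": 1.0, "antonym": 0.5, "co_occurrence": 0.25}

def match_score(token: str, term: str) -> float:
    """Sum the weights of every linguistic category in which `token`
    appears for `term`; a higher score means a stronger match."""
    categories = REFERENCE.get(term, {})
    return sum(
        weight
        for category, weight in WEIGHTS.items()
        if token in categories.get(category, set())
    )
```

In a full model, each category score would itself be graded rather than binary, but the aggregation principle is the same.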

    HABCSm: A Hamming Based t-way Strategy based on Hybrid Artificial Bee Colony for Variable Strength Test Sets Generation

    Get PDF
    Search-based software engineering, which involves the deployment of meta-heuristics in applicable software processes, has been gaining wide attention. Recently, researchers have been advocating the adoption of meta-heuristic algorithms for t-way testing strategies (where t denotes the interaction strength among parameters). Although helpful, no single meta-heuristic based t-way strategy can claim dominance over its counterparts. For this reason, hybridizing meta-heuristic algorithms can strengthen the search capabilities of each by compensating for the limitations of one algorithm with the strengths of another. Consequently, this paper proposes a new meta-heuristic based t-way strategy called the Hybrid Artificial Bee Colony (HABCSm) strategy, which merges the advantages of the Artificial Bee Colony (ABC) algorithm with those of the Particle Swarm Optimization (PSO) algorithm. HABCSm is the first t-way strategy to adopt a Hybrid Artificial Bee Colony (HABC) algorithm with Hamming distance as its core method for generating a final test set, and the first to adopt the Hamming distance as the final selection criterion for enhancing the exploration of new solutions. The experimental results demonstrate that HABCSm provides superior competitive performance over its counterparts. This finding therefore contributes to the field of software testing by minimizing the number of test cases required for test execution.
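The Hamming-distance selection criterion mentioned above can be sketched independently of the ABC/PSO search itself: among candidate test cases produced by the meta-heuristic, prefer the one farthest (in total Hamming distance) from the test cases already selected. This is a minimal illustration under that assumption; tie-breaking and the bee-colony search loop are omitted:

```python
def hamming(a, b):
    """Number of parameter positions at which two test cases differ."""
    return sum(x != y for x, y in zip(a, b))

def select_most_distant(candidates, test_set):
    """Pick the candidate test case whose total Hamming distance to the
    already-selected test set is largest, steering the search toward
    unexplored parameter combinations."""
    return max(candidates, key=lambda c: sum(hamming(c, t) for t in test_set))
```

A candidate identical to existing test cases scores zero and is never chosen while a more distant alternative exists, which is what "enhancing the exploration of new solutions" amounts to here.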

    Performance Analysis of Feature Selection Methods in Software Defect Prediction: A Search Method Approach

    No full text
    Software Defect Prediction (SDP) models are built using software metrics derived from software systems. The quality of SDP models depends largely on the quality of the software metrics (dataset) used to build them. High dimensionality is one of the data quality problems that affect the performance of SDP models. Feature selection (FS) is a proven method for addressing the dimensionality problem. However, the choice of FS method for SDP remains an open problem, as most empirical studies on FS methods for SDP report contradictory and inconsistent outcomes. FS methods behave differently because of their different underlying computational characteristics, and in particular because the impact of FS depends on the choice of search method. It is hence imperative to comparatively analyze the performance of FS methods under different search methods in SDP. In this paper, four filter feature ranking (FFR) and fourteen filter feature subset selection (FSS) methods were evaluated using four different classifiers over five software defect datasets obtained from the National Aeronautics and Space Administration (NASA) repository. The experimental analysis showed that applying FS improves the predictive performance of classifiers, and that the performance of FS methods can vary across datasets and classifiers. Among the FFR methods, Information Gain yielded the greatest improvements in the performance of the prediction models. Among the FSS methods, Consistency Feature Subset Selection based on Best First Search had the best influence on the prediction models. However, prediction models based on FFR proved to be more stable than those based on FSS methods. Hence, we conclude that FS methods improve the performance of SDP models and that there is no single best FS method, as performance varies with the dataset and the choice of prediction model. We therefore recommend FFR methods, as the prediction models based on FFR are more stable in terms of predictive performance.
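Information Gain, the best-performing filter feature ranking method in this study, is the standard entropy-based criterion and can be computed directly; the tiny dataset below is illustrative, not from the NASA repository:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a class-label sequence, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the class labels minus the weighted entropy of the
    labels after splitting on each distinct feature value. Features are
    then ranked by this score, highest first."""
    n = len(labels)
    remainder = 0.0
    for v in set(feature_values):
        subset = [l for f, l in zip(feature_values, labels) if f == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder
```

A feature that perfectly separates defective from clean modules scores 1 bit on a balanced binary dataset; an uninformative feature scores 0.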

    Latency-Sensitive Function Placement among Heterogeneous Nodes in Serverless Computing

    No full text
    Function as a Service (FaaS) is highly beneficial to smart city infrastructure due to its flexibility, efficiency, and adaptability, specifically for integration into the digital landscape. FaaS has a serverless setup, which means that an organization no longer has to worry about specific infrastructure management tasks; developers can focus on how to create and deploy code efficiently. Since FaaS aligns well with the IoT, it integrates easily with IoT devices, making it possible to perform event-based actions and real-time computations. In our research, we propose a likelihood-based adaptive machine learning model for identifying the right placement for a function. We employ an XGBoost regressor to estimate the execution time of each function and a decision tree regressor to predict network latency. By accounting for factors such as network delay, arrival of computations, and resource demand, the machine learning model eases the selection of a placement. In our simulated deployment, we use Docker containers, considering serverless node type, serverless node variety, function location, deadlines, and edge-cloud topology. The primary objectives are thus to meet deadlines and enhance resource utilization, and our results show that effective utilization of resources leads to better deadline compliance.
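The placement decision the abstract describes reduces to combining two per-node predictions, execution time and network latency, and choosing a feasible node. The sketch below assumes the trained regressors (XGBoost and decision tree in the paper) are represented by their per-node predictions; the node names and numbers in the usage are invented:

```python
def choose_placement(nodes, exec_pred, latency_pred, deadline):
    """Latency-sensitive placement sketch: for each candidate node, add the
    predicted execution time to the predicted network latency, keep the
    nodes whose total meets the deadline, and pick the fastest of those.
    `exec_pred` and `latency_pred` stand in for the trained models'
    per-node predictions."""
    feasible = {
        node: exec_pred[node] + latency_pred[node]
        for node in nodes
        if exec_pred[node] + latency_pred[node] <= deadline
    }
    if not feasible:
        return None  # no node can meet the deadline
    return min(feasible, key=feasible.get)
```

For example, an edge node with slow execution but low latency can beat a fast cloud node once network delay is counted against the deadline.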

    Heterogeneous Ensemble with Combined Dimensionality Reduction for Social Spam Detection

    No full text
    This study presents a novel framework based on a heterogeneous ensemble method and a hybrid dimensionality reduction technique for spam detection in micro-blogging social networks. A hybrid of Information Gain (IG) and Principal Component Analysis (PCA) was implemented for dimensionality reduction and the selection of important features, and a heterogeneous ensemble consisting of Naïve Bayes (NB), K Nearest Neighbor (KNN), Logistic Regression (LR) and Repeated Incremental Pruning to Produce Error Reduction (RIPPER) classifiers, fused by Average of Probabilities (AOP), was used for spam detection. The proposed framework was applied to the MPI_SWS and SAC’13 Tip spam datasets, and the developed models were evaluated based on accuracy, precision, recall, F-measure, and area under the curve (AUC). From the experimental results, the proposed framework (that is, Ensemble + IG + PCA) outperformed the other experimented methods on the studied spam datasets. Specifically, the proposed method achieved an average accuracy of 87.5%, an average precision of 0.877, an average recall of 0.845, an average F-measure of 0.872 and an average AUC of 0.943. The proposed method also performed better than some existing methods. Consequently, this study has shown that addressing high dimensionality in spam datasets, in this case with a hybrid of IG and PCA combined with a heterogeneous ensemble method, can produce a more effective method for detecting spam content.
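The Average of Probabilities fusion rule used by the ensemble is simple to state: each base classifier emits a spam probability for a post, the probabilities are averaged, and the post is flagged when the mean crosses a threshold. The sketch below represents the base models (NB, KNN, LR, RIPPER in the paper) by their output probabilities, and the 0.5 threshold is an assumed default:

```python
def average_of_probabilities(classifier_probs, threshold=0.5):
    """AOP fusion sketch: average the spam probabilities emitted by the
    base classifiers for one post and flag it as spam when the mean
    reaches the threshold. Returns (mean probability, is_spam)."""
    mean = sum(classifier_probs) / len(classifier_probs)
    return mean, mean >= threshold
```

Averaging smooths out a single over-confident base classifier, which is the usual motivation for AOP over majority voting.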

    Blockchain-Enabled Framework for Transparent Land Lease and Mortgage Management

    No full text
    A land administration system (LAS) is a structured framework designed to govern the management of land resources in a specific region or country. However, LAS faces challenges such as inefficiency, a lack of transparency, and susceptibility to fraud. The digitization of land records improved efficiency but failed to address manipulation, centralized databases, and double-spending issues. Traditional lease and mortgage management systems also suffer from complexity, errors, and a lack of real-time validation. At present, a significant influx of land transactions produces substantial data, classifiable as big data due to constant minute-to-minute occurrences such as land transfers, acquisitions, document verification, and leasing/mortgaging transactions. In this context, we present a Blockchain-driven system that not only tackles the alteration and double-spending issues of traditional systems but also implements distributed data management. Current state-of-the-art solutions do not fully incorporate crucial features of Blockchain, such as transparency, prevention of double-spending, auditability, immutability, and user participation. To tackle this problem, this research introduces a comprehensive Blockchain-powered framework for lease and mortgage management that addresses transparency, user involvement, and double-spending prevention. Unlike existing solutions, our framework integrates key Blockchain characteristics for a holistic approach. Through practical use cases involving property owners, banks, and financial institutions, we establish a secure, distributed, and transparent method for property financing. We verify the system by employing smart contracts and assess its cost and security parameters while validating the Blockchain-based mortgage and lease functions.
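The double-spending prevention described above can be illustrated with a toy hash-chained ledger: a parcel with an active mortgage cannot be mortgaged a second time. This is a minimal single-writer sketch, not the smart-contract system of the paper; the parcel and bank names are invented:

```python
import hashlib
import json

class LandLedger:
    """Toy append-only ledger: each block records one transaction and the
    hash of its predecessor, and a parcel under an active mortgage is
    rejected on a second mortgage attempt (the double-spend case)."""

    def __init__(self):
        self.blocks = []

    def _active_mortgages(self):
        """Replay the chain to find parcels whose mortgage is still open."""
        active = set()
        for block in self.blocks:
            tx = block["tx"]
            if tx["type"] == "mortgage":
                active.add(tx["parcel"])
            elif tx["type"] == "release":
                active.discard(tx["parcel"])
        return active

    def add_mortgage(self, parcel, bank):
        """Append a mortgage block, or refuse if the parcel is mortgaged."""
        if parcel in self._active_mortgages():
            return False  # double-spend attempt: parcel already mortgaged
        prev = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        tx = {"type": "mortgage", "parcel": parcel, "bank": bank}
        payload = json.dumps({"prev": prev, "tx": tx}, sort_keys=True)
        self.blocks.append(
            {"tx": tx, "prev": prev,
             "hash": hashlib.sha256(payload.encode()).hexdigest()}
        )
        return True
```

In the actual framework this check would live in a smart contract validated by the network rather than in one process, but the invariant being enforced is the same.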

    Visual Signifier for Large Multi-Touch Display to Support Interaction in a Virtual Museum Interface

    No full text
    The signifier is regarded as a crucial part of interface design, since it ensures that the user can operate the device appropriately and understand the interaction taking place. Useful signifiers keep users’ attention on learning, but poorly designed signifiers can disrupt learning by slowing progress and making the interface harder to use. The problem is that prior research identified the qualities of signifiers, but their attributes in terms of being visually apparent across broad interaction areas were not well understood. Implementing a signifier without sufficient visual features, such as a picture, figure or gesture, may interfere with the user’s ability to navigate the surface, particularly in domains that demand “leisure exploration,” such as culture and heritage, and notably museum applications. As technology advances and improves, employing a multi-touch tabletop as a public viewing medium should be advantageous in preserving cultural heritage. Some visual elements should be incorporated into the signifier to produce a conspicuous presentation and make it easier for users to identify. In this study, a preliminary study, a card sorting survey, and a high-fidelity experiment were used to investigate users’ experience, perspective, and interpretation of the visual signifier of a museum interface for large displays. This work offers a set of integrated visual signifiers on a large multi-touch display that makes a substantial contribution to supporting navigation and interaction on large displays, thereby aiding comprehension of the exhibited information visualization.