96,046 research outputs found

    Feature selection to enhance android malware detection using modified term frequency-inverse document frequency (MTF-IDF)

    This research evaluates a feature selection algorithm that uses Term Frequency-Inverse Document Frequency (TF-IDF) as the main algorithm in Android malware detection. The TF-IDF algorithm is used to filter Android features before the detection process. However, IDF is unaware of the training class labels and gives an incorrect weight value to some features. Therefore, the proposed approach, the Modified Term Frequency-Inverse Document Frequency (MTF-IDF) algorithm, focuses on both samples and features to give correct weight values to those features. The proposed algorithm considers features based on their level of importance, where the weight is given based on the number of features involved in the sample. The best related features in the sample are selected using a weight and priority ranking process based on K-means. This ensures that only important malware features are selected in the Android application sample. The experiments are conducted on a sample collected from DREBIN. The existing TF-IDF algorithm and the MTF-IDF algorithm are compared under various conditions: different sample sizes, different numbers of features, and the integration of different types of features. The results show that feature selection using MTF-IDF can improve Android malware detection analysis. MTF-IDF was shown to be an effective Android malware detection algorithm regardless of the kinds of features or sample sizes used. The MTF-IDF algorithm was also shown to give appropriate scaling for all features when analyzing Android malware detection.
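
    As a rough illustration of the baseline this abstract starts from, the sketch below computes classical TF-IDF weights for app features. It is not the MTF-IDF algorithm, whose formula the abstract does not give. The function name `tf_idf` and the permission-style feature names are assumptions made for the example. It also shows the limitation the abstract mentions: the weights depend only on feature counts, never on malware/benign labels.

```python
import math
from collections import Counter


def tf_idf(samples):
    """Classical TF-IDF over a list of samples.

    Each sample is a list of feature strings (hypothetical app features).
    Returns one dict per sample mapping feature -> weight.
    """
    n_samples = len(samples)

    # Document frequency: in how many samples each feature appears.
    df = Counter()
    for features in samples:
        df.update(set(features))

    weights = []
    for features in samples:
        counts = Counter(features)
        total = len(features)
        weights.append({
            # tf = relative frequency in the sample, idf = log(N / df)
            f: (c / total) * math.log(n_samples / df[f])
            for f, c in counts.items()
        })
    return weights


# Hypothetical samples; class labels are never consulted.
samples = [
    ["INTERNET", "SEND_SMS"],
    ["INTERNET", "READ_CONTACTS"],
    ["INTERNET"],
]
weights = tf_idf(samples)
# INTERNET occurs in every sample, so its IDF (and weight) is 0.0.
print(weights[0])
```

    A feature present in every sample receives zero weight whether or not it separates malware from benign apps, which is the kind of label-blind weighting a modified scheme would need to address.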

    Blog Analysis with Fuzzy TFIDF

    These days blogs are becoming increasingly popular because they allow anyone to share personal diaries, opinions, and comments on the World Wide Web. Many blogs contain valuable information, but it is difficult to extract this information from a large number of blog comments. The goal is to analyze a large number of blog comments by clustering them into smaller groups according to their similarity based on keyword relevance. The TF-IDF weight has been used to classify documents by measuring how frequently each keyword appears in a document, but it is not effective at differentiating semantic similarities between words. By applying fuzzy semantics to TF-IDF, TF-IDF becomes fuzzy TF-IDF and gains the ability to rank semantic relevancy. Extending the Vector Space Model (VSM) with fuzzy TF-IDF and fuzzy semantics yields a fuzzy VSM that can be effective in exploring hidden relationships between blog comments. The fuzzy VSM can therefore cluster a large number of blog comments into a small number of groups based on document similarity and semantic relevancy.
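
    The limitation described in this abstract can be seen in a minimal sketch of the plain (non-fuzzy) Vector Space Model, where documents are compared by the cosine of their term-weight vectors. The function name `cosine_similarity` and the example weights are assumptions made for illustration; the fuzzy extension itself is not specified in the abstract and is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical term weights for two blog comments.
comment_1 = {"car": 0.8, "price": 0.3}
comment_2 = {"automobile": 0.8, "price": 0.3}

# Only the shared term "price" contributes; "car" and "automobile"
# are treated as unrelated because the model matches exact terms.
print(cosine_similarity(comment_1, comment_2))
```

    Because only exact term matches contribute to the dot product, synonyms add nothing to the score, which is the gap a semantic weighting is meant to close.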

    Bug or Not? Bug Report Classification Using N-Gram IDF

    Previous studies have found that a significant number of bug reports are misclassified between bugs and non-bugs, and that manually classifying bug reports is a time-consuming task. To address this problem, we propose a bug report classification model with N-gram IDF, a theoretical extension of Inverse Document Frequency (IDF) for handling words and phrases of any length. N-gram IDF enables us to extract key terms of any length from texts; these key terms can be used as features to classify bug reports. We build classification models with logistic regression and random forest using features from N-gram IDF and from topic modeling, which is widely used in various software engineering tasks. With a publicly available dataset, our results show that our N-gram IDF-based models outperform the topic-based models in all of the evaluated cases. Our models show promising results and have the potential to be extended to other software engineering tasks. Comment: 5 pages, ICSME 201

    From Surviving to Thriving: Evaluation of the International Diabetes Federation Life for a Child Program

    IDF-LFAC aims to provide: (1) insulin and syringes; (2) blood glucose monitoring (BGM) equipment; (3) appropriate clinical care; (4) HbA1c testing; (5) diabetes education; and (6) technical support and training for health professionals, as well as (7) facilitating relevant clinical research and, where possible, (8) assisting with capacity building. IDF-LFAC receives financial and in-kind support from private foundations, individuals, and corporations. Insulin and blood glucose monitoring equipment distribution is made possible by donations of insulin and the purchase of blood glucose monitors and strips at a reduced price from large pharmaceutical companies. The goal of this evaluation is to assess IDF-LFAC's organizational structure, strategic framework, processes, program impact, and potential to catalyze long-term sustainable improvements to T1D care delivery systems in its partner countries. LSHTM were commissioned to undertake the evaluation in 2014, when IDF-LFAC had active programs in 45 countries.

    The accessibility dimension for structured document retrieval

    Structured document retrieval aims at retrieving the document components that best satisfy a query, instead of merely retrieving pre-defined document units. This paper reports on an investigation of a tf-idf-acc approach, where tf and idf are the classical term frequency and inverse document frequency, and acc is a new parameter, called accessibility, that captures the structure of documents. The tf-idf-acc approach is defined using a probabilistic relational algebra. To investigate the retrieval quality and estimate the acc values, we developed a method that automatically constructs diverse test collections of structured documents from a standard test collection, with which experiments were carried out. The analysis of the experiments provides estimates of the acc values.

    Silencing CHALCONE SYNTHASE in maize impedes the incorporation of tricin into lignin and increases lignin content

    Lignin is a phenolic heteropolymer that is deposited in secondary-thickened cell walls, where it provides mechanical strength. A recent structural characterization of cell walls from monocot species showed that the flavone tricin is part of the native lignin polymer, where it is hypothesized to initiate lignin chains. In this study, we investigated the consequences of altered tricin levels on lignin structure and cell wall recalcitrance by phenolic profiling, nuclear magnetic resonance, and saccharification assays of the naturally silenced maize (Zea mays) C2-Idf (inhibitor diffuse) mutant, defective in the CHALCONE SYNTHASE Colorless2 (C2) gene. We show that the C2-Idf mutant produces highly reduced levels of apigenin- and tricin-related flavonoids, resulting in a strongly reduced incorporation of tricin into the lignin polymer. Moreover, the lignin was enriched in beta-beta and beta-5 units, lending support to the contention that tricin acts to initiate lignin chains and that, in the absence of tricin, more monolignol dimerization reactions occur. In addition, the C2-Idf mutation resulted in strikingly higher Klason lignin levels in the leaves. As a consequence, the leaves of C2-Idf mutants had significantly reduced saccharification efficiencies compared with those of control plants. These findings are instructive for lignin engineering strategies to improve biomass processing and biochemical production.

    Are glucose profiles well-controlled within the targets recommended by the International Diabetes Federation in type 2 diabetes? A meta-analysis of results from continuous glucose monitoring based studies

    AIMS: To assess continuous glucose monitoring (CGM) derived intra-day glucose profiles against the global guideline for type 2 diabetes recommended by the International Diabetes Federation (IDF). METHODS: The Cochrane Library, MEDLINE, PubMed, CINAHL and Science Direct were searched to identify observational studies reporting intra-day glucose profiles using CGM in people with type 2 diabetes on any anti-diabetes agents. Overall and subgroup analyses were conducted to summarise mean differences between reported glucose profiles (fasting glucose, pre-meal glucose, postprandial glucose and post-meal glucose spike/excursion) and the IDF targets. RESULTS: Twelve observational studies totalling 731 people were included. Pooled fasting glucose (0.81 mmol/L, 95% CI, 0.53-1.09 mmol/L), postprandial glucose after breakfast (1.63 mmol/L, 95% CI, 0.79-2.48 mmol/L) and post-breakfast glucose spike (1.05 mmol/L, 95% CI, 0.13-1.96 mmol/L) were significantly higher than the IDF targets. Pre-lunch glucose, pre-dinner glucose and postprandial glucose after lunch and dinner were above the IDF targets, but not significantly so. Subgroup analysis showed significantly higher fasting glucose and postprandial glucose after breakfast in all groups: HbA1c <7% and ≥7% (53 mmol/mol) and duration of diabetes <10 years and ≥10 years. CONCLUSIONS: Independent of HbA1c, fasting glucose and postprandial glucose after breakfast are not well-controlled in type 2 diabetes.

    Classification of metamorphic virus using n-grams signatures

    A metamorphic virus can change, translate, and rewrite its own code once it has infected a system in order to bypass detection. The computer system can then be seriously damaged by the undetected metamorphic virus. It is therefore vital to design a classification model that can detect such viruses. This paper focuses on the detection of metamorphic viruses using the Term Frequency-Inverse Document Frequency (TF-IDF) technique. The research was conducted using the Second Generation virus dataset. First, the classification model clusters the metamorphic viruses using the TF-IDF technique. Then, the virus clusters are evaluated with the Naïve Bayes algorithm in terms of accuracy. The virus classes and features are extracted from bi-grams of assembly language. The results show that the proposed model was able to classify metamorphic viruses using TF-IDF with an optimal number of virus classes and an average accuracy of 94.2%.
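
    The bi-gram feature extraction mentioned in this abstract might look roughly like the sketch below, which counts adjacent opcode pairs in a disassembled instruction sequence. The function name `opcode_bigrams` and the opcode sequence are assumptions made for illustration; the paper's actual preprocessing, dataset format, and weighting are not described in the abstract.

```python
from collections import Counter


def opcode_bigrams(opcodes):
    """Count adjacent opcode pairs (bi-grams) in an instruction sequence."""
    # zip pairs each opcode with its successor; a sequence of n opcodes
    # yields n - 1 bi-grams (none if n < 2).
    return Counter(zip(opcodes, opcodes[1:]))


# Hypothetical opcode sequence from a disassembled sample.
sequence = ["mov", "push", "mov", "push", "call"]
for bigram, count in opcode_bigrams(sequence).items():
    print(bigram, count)
```

    Counts like these could then serve as the term frequencies fed into a TF-IDF weighting before clustering or classification.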