100 research outputs found

    Assessing malware detection using hardware performance counters

    Despite the use of modern anti-virus (AV) software, malware is a prevailing threat to today's computing systems. AV software cannot cope with the growing volume of evasive malware, calling for more robust malware detection techniques. Out of the many proposed methods for malware detection, researchers have suggested microarchitecture-based mechanisms for detection of malicious software in a system. For example, Intel embeds a shadow stack in their modern architectures that maintains the integrity between function calls and their returns by tracking the function's return address. Any malicious program that exploits an application to overflow the return addresses can be restrained using the shadow stack. Researchers also propose the use of Hardware Performance Counters (HPCs). HPCs are counters embedded in modern computing architectures that count the occurrence of architectural events, such as cache hits, clock cycles, and integer instructions. Malware detectors that leverage HPCs create a profile of an application by reading the counter values periodically. Subsequently, researchers use supervised machine learning-based (ML) classification techniques to differentiate malicious profiles from benign ones. It is important to note that HPCs count the occurrence of microarchitectural events during execution of the program. However, whether a program is malicious or benign is a property of its high-level behavior. Since HPCs do not surveil the high-level behavior of an application, we hypothesize that the counters may fail to capture the difference in the behavioral semantics of malicious and benign software. To investigate whether HPCs capture the behavioral semantics of the program, we recreate the experimental setup from the previously proposed systems. To this end, we leverage HPCs to profile applications such as MS-Office and Chrome as benign applications and known malware binaries as malicious applications.
Standard ML classifiers demand a normally distributed dataset, where the variance is independent of the mean of the data points. To transform the profiles into a more normal-like distribution and to avoid over-fitting the machine learning models, we apply a power transform to the profiles of the applications. Moreover, HPCs can monitor a broad range of hardware-based events. We use Principal Component Analysis (PCA) to select the top performance events that show maximum variation in the least number of features amongst all the applications profiled. Finally, we train twelve supervised machine learning classifiers, such as Support Vector Machine (SVM) and Multilayer Perceptron (MLP), on the profiles from the applications. We model each classifier as a binary classifier, where the two classes are 'Benignware' and 'Malware.' Our results show that for the 'Malware' class, the average recall and F2-score across the twelve classifiers are 0.22 and 0.70 respectively. The low recall score shows that the ML classifiers tag malware as benignware. Even though we exercise a statistical approach for selecting our features, the classifiers are not able to distinguish between malware and benignware based on the hardware-based events monitored by the HPCs. The inability of the HPC profiles to capture the behavioral characteristics of an application forces us to question the use of HPCs as malware detectors.
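For readers unfamiliar with the metrics reported above, the following is a minimal sketch of how recall and the F-beta score (F2 when beta = 2) are computed for the 'Malware' class from confusion counts. The function names and the example counts are hypothetical and are not taken from the study.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of samples flagged as malware that really are malware."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of actual malware samples that the classifier flagged."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta score; beta = 2 weights recall more heavily than precision."""
    p = precision(tp, fp)
    r = recall(tp, fn)
    if p == 0.0 and r == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)


# Hypothetical counts for illustration only: 30 malware samples detected,
# 70 malware samples missed, 10 benign samples wrongly flagged.
example_recall = recall(tp=30, fn=70)
example_f2 = f_beta(tp=30, fp=10, fn=70, beta=2.0)
```

A low recall here means that most malware samples are labelled as benignware, which is the failure mode the abstract describes.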

    Automatic Generation of Training Corpus for Natural Language Processing Tasks

    Machine learning models that perform grammar error correction (GEC) suffer from insufficient training data. This disclosure describes techniques that automatically generate a large corpus of training data for GEC and other natural language processing tasks. With specific user permission, the techniques leverage the edit histories of documents by identifying changes to documents attributable to grammatical corrections by users. The training set for the GEC machine learning model is automatically augmented by sentences known to be ungrammatical (e.g., original text, before revision by user) or grammatical (e.g., text after revision by user), and labeled as such. The techniques enable the provision of a very large corpus of training data for grammar error-correcting or other natural language processing ML models

    Situational analysis of ‘Virtual evaluation,’ amidst the COVID-19 pandemic for future exploration! - An experience from a single-center study

    It was 30th January 2020 when India reported its first COVID-19 case in Kerala. Soon, the pandemic of SARS-CoV-2 was inevitably knocking at the doors of the small hilly state of Himachal Pradesh (HP). On 20th March 2020, HP reported its first two cases of SARS-CoV-2 in Kangra district. Since then, although the COVID-19 pandemic was ensuing in the state, the epidemic was well contained due to the extensive collective efforts of the health department and other stakeholders. The COVID-19 pandemic has emerged as a significant barrier, hampering all the regular activities and impacting all spheres of life. Particularly in HP, health care services were predominantly delivered through government services across the state. As program managers, post-graduates of Community Medicine are the critical stakeholders of health care delivery at the peripheral level and integral implementers of Flu-Clinic, Contact Tracing, Surveillance and field survey teams. The ensuing pandemic underscores the need for the early placement of a trained workforce in the periphery for apt delivery of specialized services amongst the community.

    A study to determine socio demographic corelates of reproductive tract infection amongst women of reproductive age group

    Background: Reproductive tract infection (RTI) is a public health problem, especially in developing countries like India. The associated odium with this reproductive morbidity is often a stumbling block in seeking health care. The aim was to study the prevalence of RTI symptoms and its socio-demographic correlates. Methods: A cross-sectional study was undertaken in the rural field practice area of the department of community medicine, Indira Gandhi Medical College, Shimla, Himachal Pradesh, India, from July 2018 to September 2018. The total sample size calculated was 410. Random sampling was used to select eligible couples, to whom a predesigned, pretested, semi-structured and anonymous interview schedule was administered after taking consent. Results: The prevalence of self-reported reproductive tract infections was found to be 41.2%. The prevalence was higher in lower socio-economic classes, and the difference was statistically significant. Other socio-demographic correlates (age, education, occupation) did not show any significant association. Conclusions: The prevalence of reproductive tract infections is found to be considerably high in women of the reproductive age group. The frequency was higher among multigravida women and those using cloth during menstrual periods. RTIs are usually spurned by women and even the health care providers, so there is a need to give due consideration to this aspect of reproductive health.

    Seroprevalence trends of transfusion transmitted infections among blood donors in a tertiary care hospital of Himachal Pradesh, India

    Background: Transfusion transmitted infections (TTIs) are a major concern for patients and physicians worldwide. Blood banks in all health care institutions worldwide screen blood for TTIs and ensure that only non-reactive blood is released for clinical use. The present study aimed to study the seroprevalence and trends of transfusion transmitted infections in blood donors in Shimla district of Himachal Pradesh, India. Methods: A retrospective review of blood donors' hospital records (replacement donors as well as voluntary donors) covering the period January 2008 to December 2014 was conducted. The serological results for Hepatitis B, Hepatitis C, HIV, syphilis and malaria were retrieved. Results: A total of 39,083 blood donors of both sexes attended the blood bank during this period. Overall, the HBV, HCV, HIV, syphilis and malaria rates for blood donors were found to be 0.45%, 0.16%, 0.08%, 0.07% and 0.003% respectively. There is a downward trend in seroprevalence of all screened TTIs, namely HBV, HCV, HIV, syphilis and malaria, from 2008-2011. Conclusions: The study shows that over a period of years there is a rise in voluntary blood donations, which is heartening and encouraging. Trend analysis for the prevalence of TTIs among blood donors has shown a decreasing trend. It is recommended that continual quality-assured screening of donated blood should be carried out as per the prescribed norms to deal with acquired TTIs.

    Hirayama disease : neutral and flexion magnetic resonance imaging and utility of inter-segmental angle of flexion

    Purpose: Hirayama disease (HD) is a rare disease that was commonly misdiagnosed in the past. The importance of neutral and flexion magnetic resonance imaging (MRI) in its accurate diagnosis has been emphasized, along with the utility of the inter-segmental angle of flexion. The aim of the study was to observe MRI findings of HD in neutral and flexion position and measure the inter-segmental angle of flexion. Material and methods: Cervical MR images of 17 patients with suspected HD were evaluated retrospectively for loss of attachment (LOA) of posterior dura, lower cervical cord atrophy, T2 hyperintensity, loss of cervical lordosis, enhancement of posterior epidural venous plexus, and inter-segmental angle of flexion on neutral and flexion MRIs. Results: Flexion MRI showed LOA of posterior dura (most commonly and maximum at C6 vertebral level) and intense enhancement in posterior epidural space in almost all patients. The mean inter-segmental angle of flexion at C5-C6 was 9.2°, and at C6-C7 it was 6°. Neutral MRI revealed LOA in 64.7%, lower cervical cord atrophy in all patients, T2 hyperintensity in the lower cervical cord in 35.2% of patients, and loss of cervical lordosis in 58.8% of patients. Conclusions: Flexion MRI is the gold standard for diagnosis of HD; however, certain imaging attributes, i.e. loss of attachment of posterior dura, asymmetrical lower cervical cord atrophy, T2 hyperintensity, and loss of cervical lordosis, can be seen on neutral MRI as well, which subsequently prompts the radiologist to include flexion MRI for confirmation. The inter-segmental angle of flexion is increased in patients with HD, which plays a role in planning timely surgical intervention.

    Learning to Automate Follow-up Question Generation using Process Knowledge for Depression Triage on Reddit Posts

    Conversational Agents (CAs) powered with deep language models (DLMs) have shown tremendous promise in the domain of mental health. Prominently, the CAs have been used to provide informational or therapeutic services (e.g., cognitive behavioral therapy) to patients. However, the utility of CAs to assist in mental health triaging has not been explored in the existing work as it requires a controlled generation of follow-up questions (FQs), which are often initiated and guided by the mental health professionals (MHPs) in clinical settings. In the context of 'depression', our experiments show that DLMs coupled with process knowledge in a mental health questionnaire generate 12.54% and 9.37% better FQs based on similarity and longest common subsequence matches to questions in the PHQ-9 dataset respectively, when compared with DLMs without process knowledge support. Despite coupling with process knowledge, we find that DLMs are still prone to hallucination, i.e., generating redundant, irrelevant, and unsafe FQs. We demonstrate the challenge of using existing datasets to train a DLM for generating FQs that adhere to clinical process knowledge. To address this limitation, we prepared an extended PHQ-9 based dataset, PRIMATE, in collaboration with MHPs. PRIMATE contains annotations regarding whether a particular question in the PHQ-9 dataset has already been answered in the user's initial description of the mental health condition. We used PRIMATE to train a DLM in a supervised setting to identify which of the PHQ-9 questions can be answered directly from the user's post and which ones would require more information from the user. Using performance analysis based on MCC scores, we show that PRIMATE is appropriate for identifying questions in PHQ-9 that could guide generative DLMs towards controlled FQ generation (with minimal hallucination) suitable for aiding triaging.
The dataset created as a part of this research can be obtained from: https://github.com/primate-mh/Primate2022
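The longest common subsequence (LCS) match mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the match is tokenized or normalized, so the word-level tokenization, the normalization by reference length, and the function names below are assumptions made for illustration only.

```python
from typing import Sequence


def lcs_length(a: Sequence, b: Sequence) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_ratio(generated: str, reference: str) -> float:
    """Word-level LCS length divided by the reference length (assumed normalization)."""
    gen_tokens = generated.lower().split()
    ref_tokens = reference.lower().split()
    if not ref_tokens:
        return 0.0
    return lcs_length(gen_tokens, ref_tokens) / len(ref_tokens)


# Hypothetical generated follow-up question compared with a hypothetical reference.
score = lcs_ratio("do you feel tired", "do you often feel tired")
```

A higher ratio means the generated question preserves more of the reference question's words in the same order.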