
    2023 SPARC Book Of Abstracts


    Ensuring Reliability in Deduplicated Data by Erasure Coded Replication

    As computer systems take on more and more responsibilities in critical processes, the demand for storage is increasing across widespread applications. Saving digital information on a single large disk is expensive and unreliable: if the disk fails, all the data is lost. Therefore, the need for a better understanding of system reliability is ever increasing. In most storage environments, deduplication is applied as an effective technique to optimize storage space utilization. However, data deduplication usually degrades the reliability of the storage system because of information sharing. In this paper, a reliability-guaranteed deduplication algorithm is proposed that considers reliability during the deduplication process. The deduplicated data are distributed to the storage pool by applying a consistent hash ring as the replica placement strategy. The proposed mechanism is evaluated, and the results are compared with pure replication and erasure-coded replication. Compared with the existing systems, the proposed mechanism provides better storage utilization and fully guarantees the demanded reliability level.
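    The replica placement idea in the abstract above can be sketched with a minimal consistent hash ring. This is an illustrative sketch only: the node names, virtual-node count, and MD5-based hashing are assumptions, not the paper's exact design.

```python
import hashlib
from bisect import bisect_right

class HashRing:
    """Minimal consistent-hash ring for placing deduplicated chunks on nodes."""

    def __init__(self, nodes, vnodes=100):
        # Each physical node gets `vnodes` positions on the ring to
        # smooth out the load distribution.
        self.ring = []
        for node in nodes:
            for v in range(vnodes):
                self.ring.append((self._hash(f"{node}#{v}"), node))
        self.ring.sort()
        self.keys = [pos for pos, _ in self.ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def place(self, chunk_id, replicas=3):
        """Walk clockwise from the chunk's ring position, collecting the
        first `replicas` distinct nodes as replica targets."""
        start = bisect_right(self.keys, self._hash(chunk_id)) % len(self.ring)
        chosen = []
        i = start
        while len(chosen) < replicas:
            node = self.ring[i % len(self.ring)][1]
            if node not in chosen:
                chosen.append(node)
            i += 1
        return chosen

ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
print(ring.place("chunk-42"))  # three distinct nodes for the chunk's replicas
```

    Because placement depends only on the hash of the chunk ID and the ring layout, the same chunk always maps to the same nodes, and adding or removing a node moves only the chunks adjacent to it on the ring.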

    Myanmar Compound Word Errors Detection and Suggestion Generation

    In the Myanmar language, pronunciation and orthography differ because spelling is often not an accurate reflection of pronunciation, so the pronunciation of a word may lead to misspelling. In this paper, we present a Myanmar compound misused-word error detection and suggestion generation system. We propose a Myanmar compound misused-word error detection algorithm for use in a Myanmar spell checker. After detecting Myanmar compound misused-word errors, we provide a suggestion list, implemented with two methods: cosine similarity and the Levenshtein distance algorithm. To evaluate the efficiency of the system, we tested it with various types of Myanmar sentences containing various types of spelling errors. According to the evaluation results, our proposed system achieves promising accuracy (over 90%) for Myanmar compound misused-word error detection. By analyzing the two sets of results, we found that the Levenshtein distance algorithm provides a more relevant suggestion list.
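    The Levenshtein distance used above to rank suggestions can be sketched as follows. The tiny English lexicon is a stand-in for a Myanmar dictionary, an assumption made purely for illustration; the distance function itself works on any Unicode strings.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def suggest(word, lexicon, k=3):
    """Rank lexicon entries by edit distance to the misused word."""
    return sorted(lexicon, key=lambda w: levenshtein(word, w))[:k]

# Toy stand-in lexicon (assumption; the paper uses Myanmar compound words).
print(suggest("recieve", ["receive", "recipe", "deceive"]))
```

    Candidates with the smallest edit distance appear first, which is the ranking behaviour the paper compares against the cosine-similarity method.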

    Investigation of Android Device for Discovering Hadoop Cloud Storage Artifacts

    Hadoop Cloud Storage has been embraced by both individuals and organizations, as it offers cost-effective, large-capacity storage and multifunctional services on a wide range of devices. Accessing Hadoop Cloud services via Android devices is rapidly growing in popularity. The widespread usage of Hadoop Cloud Storage could create an environment that is potentially conducive to malicious activities and illegal operations. Thus, the investigation of the Hadoop Cloud presents an emerging challenge for the digital forensic community. Extracting residual artifacts from the cloud server is potentially difficult due to the privacy policies followed by cloud providers, but the attached Android device may store useful artifacts for investigating illegal usage of Hadoop Cloud Storage. This paper utilizes Cloudera Distribution Hadoop (CDH), a popular Hadoop Cloud Storage platform, and conducts a preliminary investigation to locate and extract the residual artifacts from an Android device that has accessed the CDH Cloud. The extracted artifacts can assist forensic examiners in real-world Hadoop Cloud forensics. A crime scenario, which extends the Forensic Copra's crime case, is examined under the guidance of the CDH Forensic Investigation Framework.
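    The artifact-location step described above could, in spirit, look like the following sketch: scanning a copied Android app-data directory for text that references a Hadoop cluster. The search patterns, file layout, and host name are illustrative assumptions, not the paper's actual findings or procedure.

```python
import os
import re
import tempfile

# Hypothetical byte patterns that would indicate Hadoop/CDH usage in logs
# or preference files (e.g. HDFS URIs, WebHDFS REST paths).
ARTIFACT_PATTERNS = [re.compile(rb"hdfs://[^\s\"']+"),
                     re.compile(rb"webhdfs/v1[^\s\"']*")]

def find_hadoop_artifacts(root):
    """Return {file_path: [matched byte strings]} for files under `root`."""
    hits = {}
    for dirpath, _, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                data = f.read()
            matches = [m for p in ARTIFACT_PATTERNS for m in p.findall(data)]
            if matches:
                hits[path] = matches
    return hits

# Demo on a synthetic extraction directory (stand-in for a real device image).
with tempfile.TemporaryDirectory() as root:
    log = os.path.join(root, "app.log")
    with open(log, "w") as f:
        f.write("uploaded to hdfs://cdh-master:8020/user/alice/report.pdf\n")
    for path, found in find_hadoop_artifacts(root).items():
        print(os.path.basename(path), [m.decode() for m in found])
```

    In a real examination the input would be a forensically sound copy of the app-data partition rather than a live directory, and the matched strings would be documented alongside their file offsets.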

    Forensic Investigation on Hadoop Hortonworks Data Platform

    Nowadays, Hadoop has become a popular business paradigm for managing and storing Big Data. It is possible for malicious users to abuse Big Data storage systems, and the number of illegal usages of them has increased rapidly. The Hadoop Big Data storage system poses an emerging challenge to forensic investigators; therefore, procedures for the forensic investigation of Hadoop are necessary. A forensic investigation may take a long time if it is not known where the data remnants can reside. This paper proposes a forensic investigation process model for the Hadoop storage of the Hortonworks Data Platform (HDP) and discovers the important data remnants. The investigation scope covers not only the Hadoop server but also the attached client machines. The data remnants resulting from this forensic investigation research on Hadoop HDP assist forensic examiners and practitioners in generating evidence. We also present the investigation of HDP with a crime scenario.

    Cloud Infrastructure Resource Demand Prediction Model Using Parameter Optimization and Feature Selection

    Cloud computing offers highly scalable and economical infrastructure for heterogeneous platforms and various applications. Given the growing demand for cloud infrastructure resources, cloud providers face the challenge of performing effective resource management. This paper presents the development of a CPU resource demand prediction model for cloud infrastructure, addressing the critical problems of workload forecasting and optimal resource management faced by cloud providers. The model is developed with a powerful machine learning technique, the Random Forests (RF) algorithm, using real data-center workload traces. To obtain the best RF prediction model, parameter optimization is performed. Moreover, some features of the workload traces do not influence the prediction and only add overhead to model development time, so feature selection is applied to extract the important features. A performance evaluation of the proposed model against four workload traces is also presented.
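    The RF-with-parameter-optimization-and-feature-selection pipeline can be sketched with scikit-learn. The synthetic "trace" (CPU demand driven by one informative column plus two noise columns), the grid values, and the 0.1 importance threshold are all assumptions for illustration, not the paper's setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for real data-center traces: CPU demand depends on
# the first column; the other two are uninformative noise features.
rng = np.random.default_rng(0)
X = rng.uniform(size=(400, 3))                       # [req_rate, noise1, noise2]
y = 100 * X[:, 0] + rng.normal(scale=2, size=400)    # CPU demand (%)

# Parameter optimization: grid search over a small RF hyperparameter space.
grid = GridSearchCV(RandomForestRegressor(random_state=0),
                    {"n_estimators": [50, 100], "max_depth": [4, 8]},
                    cv=3)
grid.fit(X, y)

# Feature selection: keep features whose importance clears a threshold
# (0.1 here is an arbitrary illustrative cutoff).
importances = grid.best_estimator_.feature_importances_
selected = [i for i, imp in enumerate(importances) if imp > 0.1]
print("best params:", grid.best_params_)
print("importances:", importances.round(3), "selected columns:", selected)
```

    Dropping the low-importance columns and refitting on the selected features is what reduces the model development overhead the abstract mentions.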


    A Study on a Joint Deep Learning Model for Myanmar Text Classification

    Text classification is one of the most critical areas of research in the field of natural language processing (NLP). Recently, most NLP tasks have achieved remarkable performance by using deep learning models. Generally, deep learning models require a huge amount of data. This paper uses pre-trained word vectors to handle this resource-demanding problem and studies the effectiveness of a joint Convolutional Neural Network and Long Short-Term Memory (CNN-LSTM) model for Myanmar text classification. A comparative analysis is performed against the baseline Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and their combined model, CNN-RNN.
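    A shape-level sketch of the joint CNN-LSTM forward pass in plain NumPy: a 1-D convolution over pre-trained word vectors extracts local n-gram features, an LSTM consumes the resulting feature sequence, and the final hidden state feeds a softmax classifier. All dimensions, the random initialization, and the single-sentence input are illustrative assumptions, not the studied architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, w, b):
    """x: (seq_len, emb_dim); w: (k, emb_dim, filters). Valid conv + ReLU."""
    k = w.shape[0]
    out = np.stack([np.tensordot(x[t:t + k], w, axes=([0, 1], [0, 1])) + b
                    for t in range(x.shape[0] - k + 1)])
    return np.maximum(out, 0)                        # (seq_len-k+1, filters)

def lstm_last_hidden(seq, wx, wh, b, hidden):
    """Run a single-layer LSTM over `seq` and return the final hidden state."""
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    sigmoid = lambda z: 1 / (1 + np.exp(-z))
    for x in seq:
        z = x @ wx + h @ wh + b                      # gates, stacked: (4*hidden,)
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)                   # cell-state update
        h = o * np.tanh(c)
    return h

seq_len, emb, filters, hidden, classes = 20, 50, 16, 32, 3
x = rng.normal(size=(seq_len, emb))                  # one sentence of word vectors
feats = conv1d(x, rng.normal(size=(3, emb, filters)) * 0.1, np.zeros(filters))
h = lstm_last_hidden(feats, rng.normal(size=(filters, 4 * hidden)) * 0.1,
                     rng.normal(size=(hidden, 4 * hidden)) * 0.1,
                     np.zeros(4 * hidden), hidden)
logits = h @ (rng.normal(size=(hidden, classes)) * 0.1)
probs = np.exp(logits) / np.exp(logits).sum()        # softmax over classes
print(probs.shape)                                   # class-probability vector
```

    A trained model would learn all of these weight matrices jointly; the sketch only shows how the convolutional feature sequence and the recurrent layer fit together.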