
    Scalable Techniques for Similarity Search

    Document similarity is closely related to the nearest-neighbour problem and has applications in various domains. To determine the similarity or dissimilarity of documents, they first need to be converted into sets of shingles. Each document is converted into k-shingles, k being the length of each shingle. Similarity is calculated using the Jaccard distance between sets, and the sets are recorded in a characteristic matrix; processing this matrix is expensive, especially when the sets are large. In this project we explore approaches such as Min-Hashing, Locality Sensitive Hashing (LSH) and Bloom filters to decrease the matrix size and improve the time complexity. Min-Hashing creates a signature matrix which is significantly smaller than the characteristic matrix. We look into the implementation of Min-Hashing and its pros and cons, and we also explore Locality Sensitive Hashing and Bloom filters and their advantages.
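The shingling, Jaccard and Min-Hashing steps named in this abstract can be sketched roughly as follows. This is a minimal illustration and not the project's own implementation: the function names, the choice of character shingles, the default `k=5`, the 128 hash functions and the use of seeded BLAKE2b hashes are all assumptions made for the example.

```python
import hashlib


def shingles(text, k=5):
    """Return the set of character k-shingles of text (whitespace normalised)."""
    text = " ".join(text.lower().split())
    return {text[i:i + k] for i in range(max(len(text) - k + 1, 1))}


def jaccard(a, b):
    """Exact Jaccard similarity |a & b| / |a | b| of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def _hash(shingle, seed):
    """One member of a family of hash functions, selected by seed (an assumption)."""
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(shingle_set, num_hashes=128):
    """Signature = minimum hash value of the set under each hash function."""
    return [min(_hash(s, seed) for s in shingle_set) for seed in range(num_hashes)]


def estimate_similarity(sig_a, sig_b):
    """Fraction of signature positions that agree; estimates Jaccard similarity."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


if __name__ == "__main__":
    doc_a = shingles("the quick brown fox jumps over the lazy dog")
    doc_b = shingles("the quick brown fox jumped over a lazy dog")
    print("exact Jaccard similarity:", jaccard(doc_a, doc_b))
    print("MinHash estimate:", estimate_similarity(
        minhash_signature(doc_a), minhash_signature(doc_b)))
```

Each signature has a fixed length (here 128 integers) regardless of how many shingles the document contains, which is why a signature matrix can be much smaller than a characteristic matrix. The Jaccard distance is one minus the similarity computed above.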

    Visualizing Bags of Vectors

    The motivation of this work is two-fold: (a) to compare two different modes of visualizing data that exists in a bag-of-vectors format, and (b) to propose a theoretical model that supports a new mode of visualizing data. Visualizing high-dimensional data can be achieved using Minimum Volume Embedding, but the data has to exist in a format suitable for computing similarities while preserving local distances. This paper compares the visualizations produced by two methods of representing data and also proposes a new method, providing sample visualizations for that method.

    Application of Adversarial Attacks on Malware Detection Models

    Malware detection is vital because it protects computers from malicious software that puts users at risk. New variants of malicious software are introduced every day at an increasing rate. To secure computer systems, major advances have therefore been made in the field of malware detection, and one such approach is to use machine learning. Even though machine learning is very powerful, it is prone to adversarial attacks. In this project, we apply adversarial attacks to malware detection models. To perform these attacks, fake samples are generated using a Generative Adversarial Network (GAN), and this fake malware data is given, along with the actual data, to a machine learning model for malware detection. We also experiment with the percentage of fake malware samples included and observe how the model behaves for the given input. The novelty of this project lies in the use of adversarial samples that are generated from word embeddings produced by our generative algorithms.

    DEVELOPMENT OF SITE - SPECIFIC DRUG DELIVERY SYSTEMS USING HOT MELT EXTRUSION AND FUSED DEPOSITION MODELING 3D PRINTING

    Because drugs are absorbed preferentially at particular sites and higher drug concentrations are needed at the target tissues, dosage forms should be designed or formulated so that medication is released at a specific site or after a specific time in the gastrointestinal tract (GIT). With knowledge of the transit times of dosage forms in each part of the GIT, medication can be delivered to specific sites by using particular polymers or by employing specific delivery systems such as floating systems. Hot-melt extrusion (HME) coupled with fused deposition modeling (FDM) 3D printing can create customized dosage forms for personalized pharmacotherapy, since it can produce dosage forms with complex structures and customized shapes and sizes. Chronotherapy deals with synchronizing drug delivery with the body's circadian rhythm to optimize therapeutic efficacy and minimize side effects. Taking advantage of the pH-dependent solubility of Eudragit S100 (ES100), an enteric polymer that solubilizes and releases the drug above pH 7, a chronotherapeutic drug delivery system for KTP and IBU was successfully developed for the treatment of arthritis conditions in the early morning hours. Drug release studies conducted in different media showed the desired lag time and release characteristics. Maintaining a constant plasma drug concentration is not beneficial in all disease conditions. Some diseases may require pulsed delivery of drugs to avoid unwanted adverse effects and drug exposure. Various biological factors influence the transit time of drugs in the upper gastrointestinal tract and pose a challenge for drugs that are locally active in the stomach, unstable at high pH, or poorly soluble in the lower parts of the gastrointestinal tract. To overcome these issues, a floating pulsatile system was developed, which showed high potential for delivering drugs that need a long residence time in the stomach and for the pulsatile release of theophylline.
Quality by design (QbD) is defined as a systematic approach to development that begins with predefined objectives and emphasizes product and process understanding based on sound science and quality risk management. QbD is combined with FDM 3D printing to develop personalized dosage forms for patient-centric pharmacotherapy.

    VOICE OVER LONG TERM EVOLUTION SERVICE QUALITY MEASUREMENTS AND DIAGNOSTICS USING MACHINE LEARNING TECHNIQUES

    Techniques are described for a machine-learning-based Voice over Long Term Evolution (VoLTE) troubleshooting and diagnostic approach that examines various data sources in the mobile packet core and identifies key issues through observations and correlations across data fields. The approach helps mobile operators quickly identify fault domains in VoLTE calls and take corrective actions to enhance customer Quality of Experience (QoE).

    R20. Development of sustained release gastroretentive floating tablets using HME coupled 3D printing: A QbD approach

    Corresponding author (Pharmaceutics and Drug delivery): Nagireddy Dumpa

    StoryNet: A 5W1H-based knowledge graph to connect stories

    Title from PDF of title page viewed January 19, 2022. Thesis advisor: Yugyung Lee. Vita. Includes bibliographical references (pages 149-164). Thesis (M.S.)--School of Computing and Engineering. University of Missouri--Kansas City, 2021.
    Stories are a powerful medium through which the human community has exchanged information since the dawn of the information age. They have taken multiple forms, such as articles, movies, books, plays, short films, magazines and mythologies. With the ever-growing complexity of information representation, exchange, and interaction, it has become highly important to find ways to convey stories more effectively. In a world that is diverging more and more, it is harder to draw parallels and connect information from around the globe. Even though there have been efforts to consolidate information on a large scale, such as Wikipedia and Wikidata, they are devoid of any real-time happenings. Building on recent advances in Natural Language Processing (NLP), we propose a framework that connects these stories, making it easier to find, understand and explore the links between them and the possibilities that revolve around them. Our framework is based on the 5W + 1H (What, Who, Where, When, Why, and How) format, which represents stories in a form that is both easily understandable by humans and accurately generated by deep learning models. We used 311 calls and cyber security datasets as case studies, for which NLP techniques such as classification, topic modelling, question answering, and question generation were used along with the 5W1H framework to segregate the stories into clusters. The framework is generic and can be applied to any field. We evaluated two approaches for generating results: training-based and rule-based.
For the rule-based approach, we used Stanford NLP parsers to identify patterns for the 5W + 1H terms; for the training-based approach, BERT embeddings were used. Both were compared using an ensemble score (the average of CoLA, SST-2, MRPC, QQP, STS-B, MNLI, QNLI, and RTE) along with BLEU and ROUGE scores. Several models were studied for the training-based analysis (BERT, RoBERTa, XLNet, ALBERT, ELECTRA, and AllenNLP Transformer QA) with the datasets CVE, NVD, SQuAD v1.1, and SQuAD v2.0, and were compared with custom annotations for identifying 5W + 1H. We present the performance and accuracy of both approaches in the results section. Our method raised the score from 30% (baseline) to 91% when trained on the 5W+1H annotations.
Contents: Introduction -- Related work -- The 5W1H Framework and the models included -- StoryNet Application: Evaluation and Results -- Conclusion and Future Work

    R15. Development and characterization of hot-melt extruded ocular inserts of moxifloxacin for bacterial keratitis

    Corresponding author (Pharmaceutics and Drug delivery): Ruchi Thakkar

    Qualitative ultrastructural changes and morphometry of deccani sheep spermatozoa preserved with egg yolk citrate extender

    The present investigation aimed to study the sequential changes in sperm cell deterioration during liquid storage of Deccani sheep semen, from dilution to 48 h of storage, along with seminal characteristics and sperm morphometric measurements. Two adult Deccani rams (aged 2 years) were selected (six ejaculates per ram), and the collected semen was diluted with egg yolk citrate (EYC) extender (final concentration: 400 million spermatozoa/0.2 ml semen). Seminal characteristics were assessed, along with sperm morphological changes by electron microscopy, immediately after dilution and at 24 and 48 h of storage. Sperm morphometry was analysed by image analysis. The percentages of individual motility, live spermatozoa, acrosomal integrity and HOS-test reactive sperm decreased significantly (P<0.05) from 80.41 to 49.16%, 82.75 to 51.25%, 94.16 to 83% and 76 to 48.58%, respectively, during liquid storage of semen from initial dilution to 48 h of storage. The sperm head length (µm), head width, sperm head area (µm²), sperm head perimeter (µm), mid piece length (µm), proximal mid piece width (µm), distal mid piece width (µm), volume of mid piece (µm³) and acrosomal cap length (µm) were 7.80, 4.33, 26.84, 20.63, 14.03, 0.74, 0.51, 4.54 and 5.24, respectively. Electron microscopic qualitative evaluation revealed that the main site of injury is the apical ridge of ram spermatozoa when stored at 5 °C. The electron density of the mitochondria was reduced, indicating concomitant depletion of ATP and loss of motility, resulting in reduced fertility.

    A Sandbox Tool to Bias(Stress)-Test Fairness Algorithms

    Motivated by the growing importance of reducing unfairness in ML predictions, Fair-ML researchers have presented an extensive suite of algorithmic "fairness-enhancing" remedies. Most existing algorithms, however, are agnostic to the sources of the observed unfairness. As a result, the literature currently lacks guiding frameworks to specify conditions under which each algorithmic intervention can potentially alleviate the underpinning cause of unfairness. To close this gap, we scrutinize the underlying biases (e.g., in the training data or design choices) that cause observational unfairness. We present a bias-injection sandbox tool to investigate fairness consequences of various biases and assess the effectiveness of algorithmic remedies in the presence of specific types of bias. We call this process the bias(stress)-testing of algorithmic interventions. Unlike existing toolkits, ours provides a controlled environment to counterfactually inject biases in the ML pipeline. This stylized setup offers the distinct capability of testing fairness interventions beyond observational data and against an unbiased benchmark. In particular, we can test whether a given remedy can alleviate the injected bias by comparing the predictions resulting after the intervention in the biased setting with true labels in the unbiased regime -- that is, before any bias injection. We illustrate the utility of our toolkit via a proof-of-concept case study on synthetic data. Our empirical analysis showcases the type of insights that can be obtained through our simulations
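The idea of counterfactually injecting a bias and comparing against the unbiased regime can be illustrated with a small sketch on synthetic data. This is not the authors' toolkit: the data generator, the specific bias (flipping positive labels to negative for one group), the group names `"A"` and `"B"`, and all function names are hypothetical choices made for the example.

```python
import random


def make_population(n=2000, seed=0):
    """Synthetic, unbiased data: the label depends only on the score, not the group."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        group = rng.choice(["A", "B"])
        score = rng.random()
        rows.append({"group": group, "score": score, "label": 1 if score > 0.5 else 0})
    return rows


def inject_label_bias(rows, group="B", flip_prob=0.5, seed=1):
    """Return a copy in which positive labels of one group are flipped to 0
    with probability flip_prob (one assumed form of label bias)."""
    rng = random.Random(seed)
    biased = []
    for row in rows:
        new_row = dict(row)  # copy so the unbiased data stays intact
        if row["group"] == group and row["label"] == 1 and rng.random() < flip_prob:
            new_row["label"] = 0
        biased.append(new_row)
    return biased


def positive_rate(rows, group):
    """Share of positive labels within one group."""
    members = [r for r in rows if r["group"] == group]
    return sum(r["label"] for r in members) / len(members)


def parity_gap(rows):
    """Difference in positive rates between groups A and B."""
    return positive_rate(rows, "A") - positive_rate(rows, "B")


if __name__ == "__main__":
    clean = make_population()
    biased = inject_label_bias(clean)
    print("parity gap before injection:", parity_gap(clean))
    print("parity gap after injection:", parity_gap(biased))
```

Because the unbiased labels are kept alongside the biased copy, the output of any remedy trained on the biased data could be scored against the original labels, which is the kind of comparison the abstract describes.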