1,617 research outputs found

    A confidence level algorithm for the determination of absolute configuration using vibrational circular dichroism or Raman optical activity

    Spectral comparison is an important part of the assignment of the absolute configuration (AC) by vibrational circular dichroism (VCD), or equally by Raman optical activity (ROA). In order to avoid bias caused by personal interpretation, numerical methods have been developed to compare measured and calculated spectra. Using a neighbourhood similarity measure, the agreement between a computed and measured VCD or ROA spectrum is expressed numerically to introduce a novel confidence level measure. This allows users of vibrational optical activity (VOA) techniques (VCD and ROA) to assess the reliability of their assignment of the AC of a compound. To that end, a database of successful AC determinations is compiled along with neighbourhood similarity values between the experimental spectrum and computed spectra for both enantiomers. For any new AC determination, the neighbourhood similarities between the experimental spectrum and the computed spectra for both enantiomers are projected on the database allowing an interpretation of the reliability of their assignment

    Graph clustering and anomaly detection of access control log for forensic purposes

    Attacks on operating system access control have become a significant and increasingly common problem. This type of security threat is recorded in forensic artifacts such as authentication logs, which forensic investigators generally examine to analyze such incidents. An anomaly is highly correlated to an attacker's attempts to compromise the system. In this paper, we propose a novel method to automatically detect anomalies in the access control log of an operating system. The logs are first preprocessed and then clustered using an improved MajorClust algorithm to obtain better clusters. This technique provides parameter-free clustering, so it can automatically produce an analysis report for forensic investigators. The clustering results are checked for anomalies based on a score that considers factors such as the number of members in a cluster, the frequency of events in the log file, and the inter-arrival time of a specific activity. We also provide a graph-based visualization of the logs to assist investigators with analysis. Experimental results on an open dataset of Linux authentication logs show that the proposed method achieved an accuracy of 83.14%

    Auto Halal Detection Products Based on Euclidian Distance and Cosine Similarity

    Although Indonesia is the world's most populous Muslim-majority country, only 20% of the products on the Indonesian market are halal-certified. Halal certification is voluntary; as such, many food products are halal but are not certified as halal. In principle, these food products may have halal ingredients similar to those of halal-certified products. In this study, we build a system that compares products that have not been certified halal with halal-certified products based on their ingredients. The food products are collected from Open Food Facts, the Institute for Foods, Drugs, and Cosmetics, Indonesian Council of Ulama (LPPOM MUI), and our halal system. As of this writing, the halal-certified products are obtained from LPPOM MUI. The system uses Euclidean distance and cosine similarity to generate the top-5 similar products. Both similarity calculations are based on a Term Frequency-Inverse Entity Frequency weighting function, which calculates the frequency of a term in a product name and ingredients. If the similarity between a product with no halal certification and a halal-certified product is higher than 75%, then the former could be indicated as a halal product. In the end, the system can give a recommendation for unknown products from a related pool of halal-certified products based on similarity of product composition. Cosine similarity accuracy is higher than Euclidean distance and MoreLikeThis accuracy. Cosine similarity gets the highest precision because it is based on the vector angle of the terms in a product
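The comparison this abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: the exact Term Frequency-Inverse Entity Frequency formula is not given in the abstract, so the weighting below (term count times `1 + log(N / entity frequency)`) is an assumption modelled on TF-IDF, and the product names, ingredient lists, and function names are hypothetical.

```python
import math
from collections import Counter


def tf_ief_vectors(products):
    """Map each product to a sparse {term: weight} vector.

    Assumed weighting (not from the abstract): count * (1 + log(N / ef)),
    where ef is the number of products containing the term.
    """
    n = len(products)
    entity_freq = Counter()
    for terms in products.values():
        entity_freq.update(set(terms))
    return {
        name: {t: c * (1.0 + math.log(n / entity_freq[t]))
               for t, c in Counter(terms).items()}
        for name, terms in products.items()
    }


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two sparse vectors."""
    terms = set(a) | set(b)
    return math.sqrt(sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in terms))


def top_k_similar(query, certified, vectors, k=5):
    """Rank certified products by cosine similarity to the query product."""
    scored = [(name, cosine_similarity(vectors[query], vectors[name]))
              for name in certified]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical ingredient lists, for illustration only.
products = {
    "certified_biscuit": ["wheat flour", "sugar", "palm oil", "salt"],
    "certified_noodle": ["wheat flour", "salt", "palm oil", "garlic"],
    "uncertified_cracker": ["wheat flour", "sugar", "palm oil", "salt"],
}
vectors = tf_ief_vectors(products)
ranking = top_k_similar(
    "uncertified_cracker", ["certified_biscuit", "certified_noodle"], vectors
)
best_name, best_score = ranking[0]
likely_halal = best_score > 0.75  # the 75% threshold mentioned in the abstract
```

Cosine similarity depends only on the angle between vectors, so it ignores how long an ingredient list is, whereas Euclidean distance grows with differences in vector magnitude. That is one plausible reading of why the abstract reports better results for cosine similarity.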

    Quantitative Analysis of Evaluation Criteria for Generative Models

    Machine Learning (ML) is rapidly becoming integrated into critical aspects of cybersecurity today, particularly in the area of network intrusion/anomaly detection. However, ML techniques require large volumes of data to be effective. The available data is a critical aspect of the ML process for training, classification, and testing purposes. One solution to the problem is to generate synthetic data that is realistic. With the application of ML to this area, one promising application is the use of ML to perform the data generation. With the ability to generate synthetic data comes the need to evaluate the “realness” of the generated data. This research focuses specifically on the problem of evaluating the evaluation criteria. Quantitative analysis of evaluation criteria is important so that future research can have quantitative evidence for the evaluation criteria it utilizes. The goal of this research is to provide a framework that can be used to inform and improve the process of generating synthetic semi-structured sequential data. A series of experiments evaluating a chosen set of metrics on discriminative ability and efficiency is performed. This research shows that the choice of feature space in which distances are calculated is critical: the ability to discriminate between real and generated data hinges on that space. Additionally, the choice of metric significantly affects the sample distance distributions in a suitable feature space. There are three main contributions from this work. First, this work provides the first known framework for evaluating metrics for semi-structured sequential synthetic data generation. Second, this work provides a “black box” evaluation framework which is generator agnostic. Third, this research provides the first known evaluation of metrics for semi-structured sequential data

    Analytical and experimental study of mean flow and turbulence characteristics inside the passages of an axial flow inducer

    The effort conducted to gather additional understanding of the complex inviscid and viscid effects existing within the passages of a three-bladed axial flow inducer operating at a flow coefficient of 0.065 is summarized. The experimental investigations included determination of the blade static pressure and blade limiting streamline angle distributions, and measurement of the three components of mean velocity, turbulence intensities and turbulence stresses at locations inside the inducer blade passage utilizing a rotating three-sensor hotwire probe. Applicable equations were derived for the hotwire data reduction analysis and solved numerically to obtain the appropriate flow parameters. Analytical investigations were conducted to predict the three-dimensional inviscid flow in the inducer by numerically solving the exact equations of motion, and to approximately predict the three-dimensional viscid flow by incorporating the dominant viscous terms into the exact equations. The analytical results are compared with the experimental measurements and design values where appropriate

    The role of social tags in web resource discovery:  an evaluation of user-generated keywords

    Social tags are user-generated metadata and play a vital role in Information Retrieval (IR) of web resources. This study attempts to determine the similarity between social tags extracted from LibraryThing and Library of Congress Subject Headings (LCSH) for the titles chosen for study, using the Cosine similarity method. The results show that social tags and controlled vocabularies are not very similar, because social tags are assigned freely, mostly by users, whereas controlled vocabularies are assigned by subject experts. In the context of information retrieval and text mining, Cosine similarity is the most commonly adopted method for evaluating the similarity of vectors, as it measures the degree to which two documents are likely to be similar in their subject matter. The LibraryThing tags and LCSH are represented as vectors to measure the Cosine similarity between them
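As a rough illustration of representing tags and subject headings as vectors, the sketch below builds term-count vectors and compares them. The abstract does not say how terms were tokenized or weighted, so the lowercasing, raw counts, and sample terms here are assumptions, and the function names are hypothetical.

```python
import math
from collections import Counter


def term_vector(terms):
    """Build a term-count vector; lowercasing is an assumed normalization."""
    return Counter(t.strip().lower() for t in terms)


def cosine(u, v):
    """Cosine similarity of two count vectors (0.0 if either is empty)."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


# Hypothetical terms for a single title, for illustration only.
social_tags = term_vector(["fantasy", "magic", "wizards"])
subject_headings = term_vector(["Wizards", "Magic", "Schools"])
similarity = cosine(social_tags, subject_headings)
```

A value near 1 would mean the two vocabularies describe the title with largely the same terms, and a value near 0 would mean they share almost none.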