    The Google Similarity Distance

    Words and phrases acquire meaning from the way they are used in society, from their relative semantics to other words and phrases. For computers the equivalent of 'society' is 'database,' and the equivalent of 'use' is 'way to search the database.' We present a new theory of similarity between words and phrases based on information distance and Kolmogorov complexity. To fix thoughts we use the world-wide-web as database, and Google as search engine. The method is also applicable to other search engines and databases. This theory is then applied to construct a method to automatically extract similarity, the Google similarity distance, of words and phrases from the world-wide-web using Google page counts. The world-wide-web is the largest database on earth, and the context information entered by millions of independent users averages out to provide automatic semantics of useful quality. We give applications in hierarchical clustering, classification, and language translation. We give examples that distinguish between colors and numbers, cluster names of paintings by 17th-century Dutch masters and names of books by English novelists, and show the ability to understand emergencies and primes, and we demonstrate the ability to do a simple automatic English-Spanish translation. Finally, we use the WordNet database as an objective baseline against which to judge the performance of our method. We conduct a massive randomized trial in binary classification using support vector machines to learn categories based on our Google distance, resulting in a mean agreement of 87% with the expert-crafted WordNet categories.
    Comment: 15 pages, 10 figures; changed some text/figures/notation/part of theorem. Incorporated referees' comments. This is the final published version up to some minor changes in the galley proof.
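
    The abstract above does not spell out the distance itself. As a minimal sketch, the commonly cited form of the normalized Google distance computes, from page counts f(x), f(y), f(x, y) and an index size N, the value (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))). The function name, the parameter names, and the page counts in the usage note are illustrative assumptions, not values from the paper.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Sketch of the normalized Google distance from raw page counts.

    fx, fy : number of pages containing term x, term y (hypothetical inputs)
    fxy    : number of pages containing both terms
    n      : total number of indexed pages (a normalizing constant)
    """
    # Terms that never occur, or never co-occur, are treated as infinitely far apart.
    if min(fx, fy, fxy) <= 0:
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))
```

    For example, with made-up counts `normalized_google_distance(1000, 100, 10, 10**6)`, the numerator is log 1000 - log 10 and the denominator is log 10^6 - log 100, so the distance is one half. Two terms that always appear together get distance 0.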

    Quantum-inspired algorithm for direct multi-class classification

    Over the last few decades, quantum machine learning has emerged as a groundbreaking discipline. Harnessing the peculiarities of quantum computation for machine learning tasks offers promising advantages. Quantum-inspired machine learning has revealed how relevant benefits for machine learning problems can be obtained using quantum information theory even without employing quantum computers. In the recent past, experiments have demonstrated how to design an algorithm for binary classification inspired by the method of quantum state discrimination, which exhibits high performance with respect to several standard classifiers. However, a generalization of this quantum-inspired binary classifier to a multi-class scenario remains nontrivial. Typically, a simple solution in machine learning decomposes multi-class classification into a combinatorial number of binary classifications, with a concomitant increase in computational resources. In this study, we introduce a quantum-inspired classifier that avoids this problem. Inspired by quantum state discrimination, our classifier performs multi-class classification directly without using binary classifiers. We first compared the performance of the quantum-inspired multi-class classifier with eleven standard classifiers. The comparison revealed an excellent performance of the quantum-inspired classifier. Comparing these results with those obtained using the decomposition into binary classifiers shows that our method improves the accuracy and reduces the time complexity. Therefore, the quantum-inspired machine learning algorithm proposed in this work is an effective and efficient framework for multi-class classification. Finally, although these advantages can be attained without employing any quantum component in the hardware, we discuss how it is possible to implement the model in quantum hardware.
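
    To illustrate the "combinatorial number of binary classifications" mentioned above, the sketch below enumerates the binary problems of a one-vs-one decomposition, which needs k(k-1)/2 classifiers for k classes. The abstract does not say which decomposition the authors compared against, so one-vs-one is an assumption here, and the function name and labels are hypothetical.

```python
from itertools import combinations


def one_vs_one_pairs(labels):
    """Return every unordered pair of distinct class labels.

    Each pair corresponds to one binary classifier in a one-vs-one
    decomposition (an assumed scheme, chosen only for illustration).
    """
    classes = sorted(set(labels))
    return list(combinations(classes, 2))


# Hypothetical four-class problem: six binary classifiers would be needed.
pairs = one_vs_one_pairs(["a", "b", "c", "d"])
print(len(pairs))
```

    The pair count grows quadratically with the number of classes, which is the resource cost that a direct multi-class classifier avoids.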

    Arts, Computers and Artificial Intelligence

    Science and art seem to belong to different cultures. Science and technology, mainly the products of the intellect, use terminology and vocabulary that are concise and well defined. In contrast, in artistic expression, ambiguity is a powerful component. Still, the relationship between these two different categories of human activity is interesting and fascinating. In this paper, a general comparison of these two disciplines will be introduced. Then the possibility of mechanical creation of art using computers and artificial intelligence will be discussed. This will be followed by two techniques which are used to create poetry and music. First, a statistical approach for mechanical composition of music will be presented. This method uses parameters of existing music to create similar music. Second, a method of mechanical composition of poetry will be presented which combines linguistic models, a classification dictionary and semantic information.

    A NEW MODEL ON BENTHIC FORAMINIFER IMAGE CLASSIFICATION AND DEFINITIONS BASED ON CONVENTIONAL NEURAL NETWORK (CNN)

    Fossil studies are of great importance for observing how living species change over the years, for drawing inferences from the information the observed species provide, and for understanding the developing and changing structure of the world we live in. However, the examination and interpretation of fossil specimens is a complex and long process. Artificial intelligence studies have begun to be applied to this field in order to facilitate the working methods of paleontologists. Computer-aided detection and classification of fossil specimens simplifies this process compared with manual classification and reduces reliance on outside experts for fossil assemblages that lie beyond a paleontologist's own specialty. To achieve this, photographs of 9 benthic foraminiferal species and of non-foraminiferal samples from a selected dataset were used. This study presents a new method for the classification of benthic foraminifera using deep convolutional neural networks, which reaches higher accuracy than the results reported in the literature. With this method, an accuracy of at least 70% was achieved in the test results of the trained system. By reaching high accuracy with a new method, this study advances the use of artificial intelligence for microfossil identification in paleontology.

    ANALISIS POSTUR KERJA KARYAWAN KANTOR PADA PT XZ (Analysis of the Work Posture of Office Employees at PT XZ)

    Ergonomics is a systematic branch of science that uses information about human nature, capabilities and limitations to design effective, safe and comfortable work systems. Ergonomics covers many aspects of employee work, one of which is office ergonomics, which includes the entire work environment and the work tools related to computers, chairs and other equipment. High demands on office employees at PT. XZ require them to work for long periods; existing surveys show that office workers spend more than 75% of their working time sitting in front of a computer. Jobs like this are associated with several ergonomic risks felt by employees, so it is necessary to measure the level of ergonomic risk among office employees at PT. XZ. Rapid Office Strain Assessment (ROSA) is a rapid analysis for measuring work risks associated with computer use; the assessment is designed to measure the risk of worker injury and to determine the level of corrective action based on reports of worker discomfort. Complaints collected from 5 employees with the CMDQ questionnaire were most frequent in the lower back (28.5%), followed by the neck (21%), the upper back (18%) and the hips/buttocks (12.8%). The analysis of work posture using the ROSA method gave all five employees the same final score of 5, which falls in the warning level classification. Work posture therefore needs to be improved according to the setup procedure for computer workstations, namely by paying attention to chair height, elbow position, monitor distance, monitor height, computer surface position, forward and backward backrest adjustment, telephone distance, wrist angle, and mouse position.

    Recognition of prokaryotic promoters based on a novel variable-window Z-curve method

    Transcription is the first step in gene expression, and it is the step at which most of the regulation of expression occurs. Although sequenced prokaryotic genomes provide a wealth of information, transcriptional regulatory networks are still poorly understood using the available genomic information, largely because accurate prediction of promoters is difficult. To improve promoter recognition performance, a novel variable-window Z-curve method is developed to extract general features of prokaryotic promoters. The features are used for further classification by the partial least squares technique. To verify the prediction performance, the proposed method is applied to predict promoter fragments of two representative prokaryotic model organisms (Escherichia coli and Bacillus subtilis). Owing to the feature extraction and selection power of the proposed method, the promoter prediction accuracies are improved markedly over most existing approaches: for E. coli, the accuracies are 96.05% (σ70 promoters, coding negative samples), 90.44% (σ70 promoters, non-coding negative samples), 92.13% (known sigma-factor promoters, coding negative samples), and 92.50% (known sigma-factor promoters, non-coding negative samples); for B. subtilis, the accuracies are 95.83% (known sigma-factor promoters, coding negative samples) and 99.09% (known sigma-factor promoters, non-coding negative samples). Additionally, because the proposed method is a linear technique, its computational simplicity makes it easy to run in a matter of minutes on ordinary personal computers or even laptops. More importantly, there is no need to optimize parameters, so it is very practical for predicting promoters of other species without any prior knowledge of the statistical properties of the samples.
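
    For readers unfamiliar with the Z-curve, the sketch below computes the standard cumulative Z-curve of a DNA sequence, in which each base moves a point along three axes: purine versus pyrimidine, amino versus keto, and weak versus strong hydrogen bonding. The abstract does not describe how the variable-window variant selects or summarizes windows, so this shows only the basic transform. The function name is hypothetical.

```python
def z_curve(sequence):
    """Return the cumulative Z-curve coordinates (x, y, z) after each base.

    x: purines (A, G) minus pyrimidines (C, T)
    y: amino bases (A, C) minus keto bases (G, T)
    z: weak H-bond bases (A, T) minus strong H-bond bases (G, C)
    This is the standard transform, not the paper's variable-window variant.
    """
    x = y = z = 0
    points = []
    for base in sequence.upper():
        if base not in "ACGT":
            raise ValueError(f"unexpected base: {base!r}")
        x += 1 if base in "AG" else -1
        y += 1 if base in "AC" else -1
        z += 1 if base in "AT" else -1
        points.append((x, y, z))
    return points


print(z_curve("ACGT"))
```

    A sequence with one of each base returns to the origin, since every axis gains and loses the same amount. Features for a classifier could then be derived from such coordinates over chosen windows, though the window scheme itself is left unspecified here.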

    A descriptive review and classification of organizational information security awareness research

    Information security awareness (ISA) is a vital component of information security in organizations. The purpose of this research is to descriptively review and classify the current body of knowledge on ISA. A sample of 59 peer-reviewed academic journal articles published over the decade from 2008 to 2018 was analyzed. Articles were classified using coding techniques from the grounded theory literature-review method. The results show that ISA research is evolving, with behavioral research studies still being explored. Quantitative empirical research is the dominant methodology, and the top three theories used are general deterrence theory, theory of planned behavior, and protection motivation theory. Future research could focus on qualitative approaches to provide greater depth of ISA understanding.