320 research outputs found

    The Structural Information Filtered Features Potential for Machine Learning calculations of energies and forces of atomic systems.

    Over the last ten years, machine learning potentials have been successfully applied to the study of crystals and molecules. However, more complex materials such as clusters, macromolecules, and glasses remain out of reach of current methods. The input of any machine learning system is a tensor of features (most commonly a rank-1 tensor, i.e. a feature vector), and the quality of the system is directly related to how well the feature space describes the original physical system. So far, the feature engineering process for machine learning potentials cannot describe complex materials. Current methods are highly inefficient at transforming the information in the physical structure into the feature vector, and this loss of information constrains the accuracy of machine learning potentials. This work introduces the Structural Information Filtered Features (SIFF), a feature engineering method based on maximizing the transfer of information from the physical structure to the feature space. The SIFF is intended as a universal feature, in two senses. First, it is able to describe complex systems as well as molecules and crystals. Second, it can easily be used as input for any machine learning algorithm. When applied to crystals, the SIFF performs as well as the best feature engineering methods for these materials (SOAP, CGNN). When applied to molecules, the SIFF performs better than the Bag of Bonds method, especially when the number of structures is reduced to fewer than 10000; under these conditions the SIFF shows superior performance owing to its superior information transfer. With respect to complex systems, the SIFF is compared to the Behler and Parrinello approach: the SIFF method reaches an error of 0.083 eV/structure in 18110 seconds, whereas the Behler and Parrinello method achieves an error of 0.109 eV/structure in 61969 seconds. The main disadvantage of the SIFF method is that the dimensionality of the feature space grows exponentially with the number of chemical species in the system.

    Towards Computational Assessment of Idea Novelty

    On crowdsourcing ideation websites, companies can easily collect large numbers of ideas. Screening such a volume of ideas is costly and challenging, which calls for automatic approaches. It would be particularly useful to evaluate idea novelty automatically, since companies commonly seek novel ideas. Three computational approaches were tested, based on Latent Semantic Analysis (LSA), Latent Dirichlet Allocation (LDA), and term frequency–inverse document frequency (TF-IDF), respectively. The three approaches were applied to three sets of ideas, and the computed idea novelty was compared with human expert evaluation. The TF-IDF-based measure correlated better with expert evaluation than the other two measures. However, our results show that these approaches do not match human judgement well enough to replace it.
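The abstract does not say how the TF-IDF-based novelty measure is defined. As a rough illustration only, one plausible reading is sketched below: each idea is scored by one minus its highest cosine similarity to any other idea in the pool. The tokenizer, the smoothed IDF formula, and the function names (`tfidf_vectors`, `novelty_scores`) are assumptions made for this sketch, not the authors' method.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase, split on whitespace, keep alphabetic tokens.
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_vectors(docs):
    # Build sparse TF-IDF vectors (dicts) with a smoothed IDF term.
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({
            t: (c / len(toks)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, c in tf.items()
        })
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def novelty_scores(ideas):
    # Hypothetical novelty measure: 1 - max similarity to any other idea.
    vecs = tfidf_vectors(ideas)
    scores = []
    for i, v in enumerate(vecs):
        sims = [cosine(v, u) for j, u in enumerate(vecs) if j != i]
        scores.append(1.0 - max(sims, default=0.0))
    return scores
```

Scores produced this way could then be rank-correlated with expert ratings, which is the kind of comparison the abstract describes; the choice of correlation statistic is likewise not stated in the source.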

    Comparing Pineapples with Lilikois: An Experimental Analysis of the Effects of Idea Similarity on Evaluation Performance in Innovation Contests

    Identifying promising ideas from large innovation contests is challenging. Evaluators do not perform well when selecting the best ideas from large idea pools because their information processing capabilities are limited. It therefore seems reasonable to let crowds evaluate subsets of ideas, distributing the effort among many evaluators. One meaningful approach to subset creation is to group ideas into subsets according to their similarity. Whether evaluation based on subsets of similar ideas is better than evaluation based on subsets of random ideas is unclear. We employ experimental methods with 66 crowd workers to explore the effects of idea similarity on evaluation performance and cognitive demand. Our study contributes to the understanding of idea selection by providing empirical evidence that crowd workers presented with subsets of similar ideas experience lower cognitive effort and achieve higher elimination accuracy than crowd workers presented with subsets of random ideas. Implications for research and practice are discussed.