Advances of Machine Learning in Materials Science: Ideas and Techniques
In this big data era, the use of large datasets in conjunction with machine
learning (ML) has become increasingly popular in both industry and academia. In
recent times, the field of materials science has also been undergoing a big data
revolution, with large databases and repositories appearing everywhere.
Traditionally, materials science has been a trial-and-error field, in both its
computational and experimental branches. With the advent of machine
learning-based techniques, there has been a paradigm shift: materials can now
be screened quickly using ML models and even generated based on materials with
similar properties; ML has also quietly infiltrated many sub-disciplines of
materials science. However, ML remains relatively new to the field and is
expanding its wings quickly. There is a plethora of readily available big data
architectures and an abundance of ML models and software; the call to integrate
all these elements into a comprehensive research procedure is becoming an
important direction of materials science research. In this review, we attempt to
provide an introduction and reference on ML for materials scientists, covering
as many of the commonly used methods and applications as possible, and
discussing future possibilities.
Comment: 80 pages; 22 figures. To be published in Frontiers of Physics, 18, xxxxx (2023)
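The ML-based screening the abstract describes could look like the following minimal sketch: a surrogate model is trained on descriptors of known materials, then used to rank a pool of unlabeled candidates. All feature values, the target property, and model choice here are invented for illustration; this is not the review's own pipeline.

```python
# Illustrative sketch only: ML screening of hypothetical material candidates
# with a random-forest surrogate model (synthetic data, assumed workflow).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic training set: 5 composition-derived descriptors per material,
# mapped to a scalar target property (e.g. a formation-energy-like quantity).
X_train = rng.normal(size=(200, 5))
y_train = X_train[:, 0] - 0.5 * X_train[:, 1] + rng.normal(scale=0.1, size=200)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Screen a pool of 1000 unlabeled candidates: keep the top 10 by prediction.
X_pool = rng.normal(size=(1000, 5))
pred = model.predict(X_pool)
top10 = np.argsort(pred)[-10:]
```

In a real screening campaign the descriptors would come from a featurization library and the labels from simulation or experiment; the ranking step itself is unchanged.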
Learning Persistent Community Structures in Dynamic Networks via Topological Data Analysis
Dynamic community detection methods often lack effective mechanisms to ensure
temporal consistency, hindering the analysis of network evolution. In this
paper, we propose a novel deep graph clustering framework with temporal
consistency regularization on inter-community structures, inspired by the
concept of minimal network topological changes within short intervals.
Specifically, to address the representation collapse problem, we first
introduce MFC, a matrix factorization-based deep graph clustering algorithm
that preserves node embeddings. Based on static clustering results, we construct
probabilistic community networks and compute their persistent homology, a
robust topological measure, to assess structural similarity between them.
Moreover, a novel neural network regularization, TopoReg, is introduced to
ensure the preservation of topological similarity between inter-community
structures over time intervals. Our approach enhances temporal consistency and
clustering accuracy on real-world datasets with both fixed and varying numbers
of communities. It is also a pioneering application of TDA in temporally
persistent community detection, offering an insightful contribution to the
field of network analysis. Code and data are available at the public git
repository: https://github.com/kundtx/MFC_TopoReg
Comment: AAAI 202
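To give a concrete sense of the persistent homology the abstract invokes, here is a minimal, self-contained sketch of 0-dimensional persistence on a weighted graph: components are born at filtration value 0 and die when edges, added in order of weight, merge them. This is a generic textbook construction, not the paper's implementation, and the toy edge weights are invented.

```python
# Minimal sketch of 0-dimensional persistent homology on a weighted graph,
# computed via union-find over the edge filtration (illustrative only).
def h0_persistence(n_nodes, weighted_edges):
    """Return (birth, death) pairs for connected components.

    weighted_edges: iterable of (u, v, weight). Components that never merge
    get death = float('inf').
    """
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = []
    for u, v, w in sorted(weighted_edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv              # merge: one component dies at w
            pairs.append((0.0, w))
    # every surviving connected component persists forever
    roots = {find(x) for x in range(n_nodes)}
    pairs.extend((0.0, float("inf")) for _ in roots)
    return pairs

# Toy community network: two tight clusters joined by one heavy edge.
edges = [(0, 1, 0.1), (1, 2, 0.2), (3, 4, 0.1), (2, 3, 0.9)]
diagram = h0_persistence(5, edges)
```

Comparing such persistence diagrams between snapshots (e.g. via bottleneck or Wasserstein distance) is the standard way to quantify the structural similarity the abstract refers to.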
Creativity and Machine Learning: a Survey
There is a growing interest in the area of machine learning and creativity.
This survey presents an overview of the history and the state of the art of
computational creativity theories, machine learning techniques, including
generative deep learning, and corresponding automatic evaluation methods. After
presenting a critical discussion of the key contributions in this area, we
outline the current research challenges and emerging opportunities in this
field.
Comment: 25 pages, 3 figures, 2 tables
Break it, Imitate it, Fix it: Robustness by Generating Human-Like Attacks
Real-world natural language processing systems need to be robust to human
adversaries. Collecting examples of human adversaries for training is an
effective but expensive solution. On the other hand, training on synthetic
attacks with small perturbations - such as word-substitution - does not
actually improve robustness to human adversaries. In this paper, we propose an
adversarial training framework that uses limited human adversarial examples to
generate more useful adversarial examples at scale. We demonstrate the
advantages of this system on the ANLI and hate speech detection benchmark
datasets - both collected via an iterative, adversarial
human-and-model-in-the-loop procedure. Compared to training only on observed
human attacks, also training on our synthetic adversarial examples improves
model robustness to future rounds. In ANLI, we see accuracy gains on the
current set of attacks (44.1% → 50.1%) and on two future unseen rounds of
human-generated attacks (32.5% → 43.4%, and 29.4% → 40.2%). In hate
speech detection, we see AUC gains on current attacks (0.76 → 0.84) and a
future round (0.77 → 0.79). Attacks from methods that do not learn the
distribution of existing human adversaries, meanwhile, degrade robustness
…
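The small word-substitution perturbations that the abstract contrasts with human adversarial examples can be illustrated with a few lines of code. The synonym table and function below are invented for demonstration and are not the paper's attack generator.

```python
# Hedged illustration of a synthetic word-substitution attack of the kind the
# abstract argues is insufficient for robustness to human adversaries.
import random

SYNONYMS = {  # toy substitution table (assumed, not from the paper)
    "good": ["great", "fine"],
    "movie": ["film", "picture"],
    "boring": ["dull", "tedious"],
}

def word_substitution_attack(sentence, max_swaps=2, seed=0):
    """Perturb a sentence by swapping up to max_swaps words with synonyms."""
    rng = random.Random(seed)
    words = sentence.split()
    candidates = [i for i, w in enumerate(words) if w in SYNONYMS]
    for i in rng.sample(candidates, min(max_swaps, len(candidates))):
        words[i] = rng.choice(SYNONYMS[words[i]])
    return " ".join(words)

perturbed = word_substitution_attack("the movie was good but boring")
```

Such perturbations preserve sentence structure and meaning almost exactly, which is precisely why, per the abstract, they fail to cover the distribution of attacks that human adversaries actually produce.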