Advances of Machine Learning in Materials Science: Ideas and Techniques
In this big data era, the use of large datasets in conjunction with machine
learning (ML) has become increasingly popular in both industry and academia. In
recent times, the field of materials science has also been undergoing a big data
revolution, with large databases and repositories appearing everywhere.
Traditionally, materials science has been a trial-and-error field, in both its
computational and experimental branches. With the advent of machine
learning-based techniques, there has been a paradigm shift: materials can now
be screened quickly using ML models and even generated based on materials with
similar properties; ML has also quietly infiltrated many sub-disciplines of
materials science. However, ML remains relatively new to the field and is
expanding its wings quickly. There is a plethora of readily available big data
architectures and an abundance of ML models and software; the call to integrate
all these elements into a comprehensive research procedure is becoming an
important direction of materials science research. In this review, we attempt to
provide an introduction and reference on ML for materials scientists, covering
as many of the commonly used methods and applications as possible, and
discussing future possibilities.
Comment: 80 pages; 22 figures. To be published in Frontiers of Physics, 18, xxxxx (2023)
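The ML-based screening the abstract describes could look like the following minimal sketch: a surrogate model is trained on descriptors of known materials, then used to rank a pool of unlabeled candidates. All feature values, the target property, and model choice here are invented for illustration; this is not the review's own pipeline.

```python
# Illustrative sketch only: ML screening of hypothetical material candidates
# with a random-forest surrogate model (synthetic data, assumed workflow).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic training set: 5 composition-derived descriptors per material,
# mapped to a scalar target property (e.g. a formation-energy-like quantity).
X_train = rng.normal(size=(200, 5))
y_train = X_train[:, 0] - 0.5 * X_train[:, 1] + rng.normal(scale=0.1, size=200)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Screen a pool of 1000 unlabeled candidates: keep the top 10 by prediction.
X_pool = rng.normal(size=(1000, 5))
pred = model.predict(X_pool)
top10 = np.argsort(pred)[-10:]
```

In a real screening campaign the descriptors would come from a featurization library and the labels from simulation or experiment; the ranking step itself is unchanged.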
Learning Persistent Community Structures in Dynamic Networks via Topological Data Analysis
Dynamic community detection methods often lack effective mechanisms to ensure
temporal consistency, hindering the analysis of network evolution. In this
paper, we propose a novel deep graph clustering framework with temporal
consistency regularization on inter-community structures, inspired by the
concept of minimal network topological changes within short intervals.
Specifically, to address the representation collapse problem, we first
introduce MFC, a matrix factorization-based deep graph clustering algorithm
that preserves node embeddings. Based on static clustering results, we construct
probabilistic community networks and compute their persistent homology, a
robust topological measure, to assess structural similarity between them.
Moreover, a novel neural network regularization, TopoReg, is introduced to
ensure the preservation of topological similarity between inter-community
structures over time intervals. Our approach enhances temporal consistency and
clustering accuracy on real-world datasets with both fixed and varying numbers
of communities. It is also a pioneering application of TDA in temporally
persistent community detection, offering an insightful contribution to the
field of network analysis. Code and data are available at the public git
repository: https://github.com/kundtx/MFC_TopoReg
Comment: AAAI 202
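To give a concrete sense of the persistent homology the abstract invokes, here is a minimal, self-contained sketch of 0-dimensional persistence on a weighted graph: components are born at filtration value 0 and die when edges, added in order of weight, merge them. This is a generic textbook construction, not the paper's implementation, and the toy edge weights are invented.

```python
# Minimal sketch of 0-dimensional persistent homology on a weighted graph,
# computed via union-find over the edge filtration (illustrative only).
def h0_persistence(n_nodes, weighted_edges):
    """Return (birth, death) pairs for connected components.

    weighted_edges: iterable of (u, v, weight). Components that never merge
    get death = float('inf').
    """
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = []
    for u, v, w in sorted(weighted_edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv              # merge: one component dies at w
            pairs.append((0.0, w))
    # every surviving connected component persists forever
    roots = {find(x) for x in range(n_nodes)}
    pairs.extend((0.0, float("inf")) for _ in roots)
    return pairs

# Toy community network: two tight clusters joined by one heavy edge.
edges = [(0, 1, 0.1), (1, 2, 0.2), (3, 4, 0.1), (2, 3, 0.9)]
diagram = h0_persistence(5, edges)
```

Comparing such persistence diagrams between snapshots (e.g. via bottleneck or Wasserstein distance) is the standard way to quantify the structural similarity the abstract refers to.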
Creativity and Machine Learning: a Survey
There is a growing interest in the area of machine learning and creativity.
This survey presents an overview of the history and the state of the art of
computational creativity theories, machine learning techniques, including
generative deep learning, and corresponding automatic evaluation methods. After
presenting a critical discussion of the key contributions in this area, we
outline the current research challenges and emerging opportunities in this
field.
Comment: 25 pages, 3 figures, 2 tables
Break it, Imitate it, Fix it: Robustness by Generating Human-Like Attacks
Real-world natural language processing systems need to be robust to human
adversaries. Collecting examples of human adversaries for training is an
effective but expensive solution. On the other hand, training on synthetic
attacks with small perturbations - such as word-substitution - does not
actually improve robustness to human adversaries. In this paper, we propose an
adversarial training framework that uses limited human adversarial examples to
generate more useful adversarial examples at scale. We demonstrate the
advantages of this system on the ANLI and hate speech detection benchmark
datasets - both collected via an iterative, adversarial
human-and-model-in-the-loop procedure. Compared to training only on observed
human attacks, also training on our synthetic adversarial examples improves
model robustness to future rounds. In ANLI, we see accuracy gains on the
current set of attacks (44.1% → 50.1%) and on two future unseen rounds of
human-generated attacks (32.5% → 43.4%, and 29.4% → 40.2%). In hate
speech detection, we see AUC gains on current attacks (0.76 → 0.84) and a
future round (0.77 → 0.79). Attacks from methods that do not learn the
distribution of existing human adversaries, meanwhile, degrade robustness
…
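The small word-substitution perturbations that the abstract contrasts with human adversarial examples can be illustrated with a few lines of code. The synonym table and function below are invented for demonstration and are not the paper's attack generator.

```python
# Hedged illustration of a synthetic word-substitution attack of the kind the
# abstract argues is insufficient for robustness to human adversaries.
import random

SYNONYMS = {  # toy substitution table (assumed, not from the paper)
    "good": ["great", "fine"],
    "movie": ["film", "picture"],
    "boring": ["dull", "tedious"],
}

def word_substitution_attack(sentence, max_swaps=2, seed=0):
    """Perturb a sentence by swapping up to max_swaps words with synonyms."""
    rng = random.Random(seed)
    words = sentence.split()
    candidates = [i for i, w in enumerate(words) if w in SYNONYMS]
    for i in rng.sample(candidates, min(max_swaps, len(candidates))):
        words[i] = rng.choice(SYNONYMS[words[i]])
    return " ".join(words)

perturbed = word_substitution_attack("the movie was good but boring")
```

Such perturbations preserve sentence structure and meaning almost exactly, which is precisely why, per the abstract, they fail to cover the distribution of attacks that human adversaries actually produce.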