Interfacial Interaction Enhanced Rheological Behavior in PAM/CTAC/Salt Aqueous Solution—A Coarse-Grained Molecular Dynamics Study
Interfacial interactions within a multi-phase polymer solution play critical roles in processing control and mass transportation in chemical engineering. However, these roles remain poorly understood because of the complexity of the system. In this study, we used an efficient analytical method, a nonequilibrium molecular dynamics (NEMD) simulation, to unveil the molecular interactions and rheology of a multiphase solution containing cetyltrimethyl ammonium chloride (CTAC), polyacrylamide (PAM), and sodium salicylate (NaSal). The associated macroscopic rheological characteristics and shear viscosity of the polymer/surfactant solution were investigated, and the computational results agreed well with the experimental data. The relation between the characteristic time and shear rate was consistent with the power law. By simulating the shear viscosity of the polymer/surfactant solution, we found that the phase transition of micelles within the mixture led to a non-monotonic increase in the viscosity of the mixed solution with increasing concentration of CTAC or PAM. We expect this optimized molecular dynamics approach to advance the current understanding of chemical–physical interactions within polymer/surfactant mixtures at the molecular level and enable emerging engineering solutions.
Feature Interaction Aware Automated Data Representation Transformation
Creating an effective representation space is crucial for mitigating the
curse of dimensionality, enhancing model generalization, addressing data
sparsity, and leveraging classical models more effectively. Recent advancements
in automated feature engineering (AutoFE) have made significant progress in
addressing various challenges associated with representation learning, issues
such as heavy reliance on intensive labor and empirical experiences, lack of
explainable explicitness, and inflexible feature space reconstruction embedded
into downstream tasks. However, these approaches are constrained by: 1)
generation of potentially unintelligible and illogical reconstructed feature
spaces, stemming from the neglect of expert-level cognitive processes; 2) lack
of systematic exploration, which subsequently results in slower model
convergence for identification of optimal feature space. To address these, we
introduce an interaction-aware reinforced generation perspective. We redefine
feature space reconstruction as a nested process of creating meaningful
features and controlling feature set size through selection. We develop a
hierarchical reinforcement learning structure with cascading Markov Decision
Processes to automate feature and operation selection, as well as feature
crossing. By incorporating statistical measures, we reward agents based on the
interaction strength between selected features, resulting in intelligent and
efficient exploration of the feature space that emulates human decision-making.
Extensive experiments are conducted to validate our proposed approach. Comment: Accepted to SIAM Conference on Data Mining (SDM) 202
Disentangled Causal Graph Learning for Online Unsupervised Root Cause Analysis
The task of root cause analysis (RCA) is to identify the root causes of
system faults/failures by analyzing system monitoring data. Efficient RCA can
greatly accelerate system failure recovery and mitigate system damages or
financial losses. However, previous research has mostly focused on developing
offline RCA algorithms, which often require manually initiating the RCA
process, a significant amount of time and data to train a robust model, and
retraining from scratch for each new system fault.
In this paper, we propose CORAL, a novel online RCA framework that can
automatically trigger the RCA process and incrementally update the RCA model.
CORAL consists of Trigger Point Detection, Incremental Disentangled Causal
Graph Learning, and Network Propagation-based Root Cause Localization. The
Trigger Point Detection component aims to detect system state transitions
automatically and in near-real-time. To achieve this, we develop an online
trigger point detection approach based on multivariate singular spectrum
analysis and cumulative sum statistics. To efficiently update the RCA model, we
propose an incremental disentangled causal graph learning approach to decouple
the state-invariant and state-dependent information. After that, CORAL applies
a random walk with restarts to the updated causal graph to accurately identify
root causes. The online RCA process terminates when the causal graph and the
generated root cause list converge. Extensive experiments on three real-world
datasets with case studies demonstrate the effectiveness and superiority of the
proposed framework.
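
The random walk with restarts mentioned in the CORAL abstract can be illustrated with a minimal sketch. This is not CORAL's implementation: the edge weights, the restart vector, the restart probability of 0.3, and the handling of nodes without outgoing edges are all assumptions made for illustration, and the function name `random_walk_with_restart` is hypothetical.

```python
def random_walk_with_restart(adj, restart, restart_prob=0.3, tol=1e-10, max_iter=1000):
    """Return stationary visiting scores for each node.

    adj[i][j] is the (assumed) weight of edge i -> j.
    restart is a probability vector the walker jumps back to.
    """
    n = len(adj)
    # Row-normalize the weights into transition probabilities.
    # Assumption: a node with no outgoing edges sends its mass to the restart vector.
    trans = []
    for row in adj:
        total = sum(row)
        trans.append([w / total for w in row] if total > 0 else list(restart))

    p = list(restart)
    for _ in range(max_iter):
        nxt = [
            (1 - restart_prob) * sum(p[i] * trans[i][j] for i in range(n))
            + restart_prob * restart[j]
            for j in range(n)
        ]
        # Stop once the score vector no longer changes (L1 distance).
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy chain 0 -> 1 -> 2, with the walker restarting at node 0.
adj = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
]
scores = random_walk_with_restart(adj, restart=[1.0, 0.0, 0.0])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

Nodes are ranked by their stationary score; in an RCA setting the highest-ranked nodes would be the candidate root causes, though how the abstract's framework orients and weights the causal graph is not stated here.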
Self-optimizing Feature Generation via Categorical Hashing Representation and Hierarchical Reinforcement Crossing
Feature generation aims to generate new and meaningful features to create a
discriminative representation space. A generated feature is meaningful when the
generated feature is from a feature pair with inherent feature interaction. In
the real world, experienced data scientists can identify potentially useful
feature-feature interactions, and generate meaningful dimensions from an
exponentially large search space, in an optimal crossing form over an optimal
generation path. Machines, however, have limited human-like abilities. We generalize
such learning tasks as self-optimizing feature generation. Self-optimizing
feature generation imposes several under-addressed challenges on existing
systems: meaningful, robust, and efficient generation. To tackle these
challenges, we propose a principled and generic representation-crossing
framework to solve self-optimizing feature generation. To achieve hashing
representation, we propose a three-step approach: feature discretization,
feature hashing, and descriptive summarization. To achieve reinforcement
crossing, we develop a hierarchical reinforcement feature crossing approach. We
present extensive experimental results to demonstrate the effectiveness and
efficiency of the proposed method. The code is available at
https://github.com/yingwangyang/HRC_feature_cross.git
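
The three-step hashing representation (feature discretization, feature hashing, descriptive summarization) can be pictured with the following sketch. It is only an illustration of the general idea, not the code from the linked repository: equal-width binning, SHA-256 bucket hashing, the bucket count of 8, and the frequency-vector summary are all assumptions, and the helper names are hypothetical.

```python
import hashlib
from collections import Counter


def discretize(values, n_bins=4):
    """Step 1 (assumed form): map numeric values to equal-width bin indices."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant columns
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def hash_bucket(token, n_buckets=8):
    """Step 2 (assumed form): hash a categorical token into a fixed bucket range."""
    digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


def summarize(buckets, n_buckets=8):
    """Step 3 (assumed form): describe the column by its bucket frequencies."""
    counts = Counter(buckets)
    total = len(buckets)
    return [counts.get(b, 0) / total for b in range(n_buckets)]


# Hypothetical feature column.
values = [0.1, 0.4, 0.35, 0.8, 0.95, 0.5]
bins = discretize(values)
buckets = [hash_bucket(f"feature_a={b}") for b in bins]
summary = summarize(buckets)
```

The resulting `summary` is a fixed-length vector regardless of how many rows or distinct values the column has, which is the property that makes such a representation usable as a state for a learning agent.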
Reinforced Imitative Graph Learning for Mobile User Profiling
Mobile user profiling refers to the efforts of extracting users’ characteristics from mobile activities. To capture how user characteristics vary over time and generate effective user profiles, we propose an imitation-based mobile user profiling framework. Considering the objective of teaching an autonomous agent to imitate user mobility based on the user’s profile, the user profile is the most accurate when the agent can perfectly mimic the user behavior patterns. The profiling framework is formulated as a reinforcement learning task, where an agent is a next-visit planner, an action is a POI that a user will visit next, and the state of the environment is a fused representation of a user and spatial entities. An event in which a user visits a POI will construct a new state, which helps the agent predict users’ mobility more accurately. In the framework, we introduce a spatial Knowledge Graph (KG) to characterize the semantics of user visits over connected spatial entities. Additionally, we develop a mutual-updating strategy to quantify the state that evolves over time. Along these lines, we develop a reinforcement imitative graph learning framework for mobile user profiling. Finally, we conduct extensive experiments to demonstrate the superiority of our approach.
Reinforced Imitative Graph Representation Learning for Mobile User Profiling: An Adversarial Training Perspective
In this paper, we study the problem of mobile user profiling, which is a
critical component for quantifying users' characteristics in the human mobility
modeling pipeline. Human mobility is a sequential decision-making process
dependent on the users' dynamic interests. With accurate user profiles, the
predictive model can perfectly reproduce users' mobility trajectories. In the
reverse direction, once the predictive model can imitate users' mobility
patterns, the learned user profiles are also optimal. Such intuition motivates
us to propose an imitation-based mobile user profiling framework by exploiting
reinforcement learning, in which the agent is trained to precisely imitate
users' mobility patterns for optimal user profiles. Specifically, the proposed
framework includes two modules: (1) representation module, which produces state
combining user profiles and spatio-temporal context in real-time; (2) imitation
module, where Deep Q-network (DQN) imitates the user behavior (action) based on
the state that is produced by the representation module. However, there are two
challenges in running the framework effectively. First, the epsilon-greedy strategy
in DQN balances the exploration-exploitation trade-off by picking random
actions with probability epsilon. Such randomness feeds back to the
representation module, making the learned user profiles unstable. To solve the
problem, we propose an adversarial training strategy to guarantee the
robustness of the representation module. Second, the representation module
updates users' profiles incrementally, which requires integrating the
temporal effects of user profiles. Inspired by Long Short-Term Memory (LSTM),
we introduce a gated mechanism to incorporate new and old user characteristics
into the user profile. Comment: AAAI 202
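
The epsilon-greedy rule referred to in this abstract is the standard one: with probability epsilon the agent picks a uniformly random action, otherwise it picks the action with the highest estimated Q-value. A minimal sketch follows; the function name `epsilon_greedy` and the toy Q-values are illustrative assumptions and are not taken from the paper.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from a list of Q-value estimates."""
    # Explore: with probability epsilon, choose uniformly at random.
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Exploit: otherwise choose the action with the highest estimate.
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Toy Q-values for three candidate next POIs (hypothetical numbers).
q_values = [0.1, 0.9, 0.3]
greedy_action = epsilon_greedy(q_values, epsilon=0.0)
explore_action = epsilon_greedy(q_values, epsilon=1.0, rng=random.Random(0))
```

Because the exploratory branch ignores the Q-values entirely, the chosen action (and therefore the next state) can be arbitrary, which is the source of the instability the abstract describes.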
Traceable Group-Wise Self-Optimizing Feature Transformation Learning: A Dual Optimization Perspective
Feature transformation aims to reconstruct an effective representation space
by mathematically refining the existing features. It serves as a pivotal
approach to combat the curse of dimensionality, enhance model generalization,
mitigate data sparsity, and extend the applicability of classical models.
Existing research predominantly focuses on domain knowledge-based feature
engineering or learning latent representations. However, these methods, while
insightful, lack full automation and fail to yield a traceable and optimal
representation space. An indispensable question arises: Can we concurrently
address these limitations when reconstructing a feature space for a
machine-learning task? Our initial work took a pioneering step towards this
challenge by introducing a novel self-optimizing framework. This framework
leverages the power of three cascading reinforced agents to automatically
select candidate features and operations for generating improved feature
transformation combinations. Despite the impressive strides made, there was
room for enhancing its effectiveness and generalization capability. In this
extended journal version, we advance our initial work from two distinct yet
interconnected perspectives: 1) We propose a refinement of the original
framework, which integrates a graph-based state representation method to
capture the feature interactions more effectively and develops different
Q-learning strategies to further alleviate Q-value overestimation. 2) We
utilize a new optimization technique (actor-critic) to train the entire
self-optimizing framework in order to accelerate the model convergence and
improve the feature transformation performance. Finally, to validate the
improved effectiveness and generalization capability of our framework, we
perform extensive experiments and conduct comprehensive analyses. Comment: 21 pages, submitted to TKDD. arXiv admin note: text overlap with
arXiv:2209.08044, arXiv:2205.1452
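
The abstract does not say which Q-learning strategies are used against Q-value overestimation. One widely used option is double Q-learning, sketched below purely as an assumption for illustration: the next action is selected with one set of estimates and evaluated with another, so a single noisy estimate cannot both choose and score the action. The function names and numbers are hypothetical.

```python
def q_learning_target(reward, next_q_target, gamma=0.99):
    """Standard target: the same estimates select and evaluate the next action."""
    return reward + gamma * max(next_q_target)


def double_q_target(reward, next_q_online, next_q_target, gamma=0.99):
    """Double Q-learning target: select with online estimates, evaluate with target estimates."""
    best = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best]


# Hypothetical estimates for two actions in the next state.
next_q_online = [1.0, 2.0]
next_q_target = [0.5, 0.2]
standard = q_learning_target(1.0, next_q_target, gamma=0.9)
double = double_q_target(1.0, next_q_online, next_q_target, gamma=0.9)
```

By construction the double Q-learning target is never larger than the standard target computed from the same target estimates, which is why it tends to reduce upward bias.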
Self-Optimizing Feature Transformation
Feature transformation aims to extract a good representation (feature) space
by mathematically transforming existing features. It is crucial to address the
curse of dimensionality, enhance model generalization, overcome data sparsity,
and expand the availability of classic models. Current research focuses on
domain knowledge-based feature engineering or learning latent representations;
nevertheless, these methods are not entirely automated and cannot produce a
traceable and optimal representation space. When rebuilding a feature space for
a machine learning task, can these limitations be addressed concurrently? In
this extension study, we present a self-optimizing framework for feature
transformation. To achieve a better performance, we improved the preliminary
work by (1) obtaining an advanced state representation for enabling reinforced
agents to comprehend the current feature set better; and (2) resolving Q-value
overestimation in reinforced agents for learning unbiased and effective
policies. Finally, to make experiments more convincing than the preliminary
work, we conclude by adding the outlier detection task with five datasets,
evaluating various state representation approaches, and comparing different
training strategies. Extensive experiments and case studies show that our work
is more effective and superior. Comment: Under review at TKDE. arXiv admin note: substantial text overlap with
arXiv:2205.1452