Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs
Laplacian mixture models identify overlapping regions of influence in
unlabeled graph and network data in a scalable and computationally efficient
way, yielding useful low-dimensional representations. By combining Laplacian
eigenspace and finite mixture modeling methods, they provide probabilistic or
fuzzy dimensionality reductions or domain decompositions for a variety of input
data types, including mixture distributions, feature vectors, and graphs or
networks. Optimal recovery by the algorithm is shown analytically
for a nontrivial class of cluster graphs. Heuristic approximations for scalable
high-performance implementations are described and empirically tested.
Connections to PageRank and community detection in network analysis demonstrate
the wide applicability of this approach. The origins of fuzzy spectral methods,
beginning with generalized heat or diffusion equations in physics, are reviewed
and summarized. Comparisons to other dimensionality reduction and clustering
methods for challenging unsupervised machine learning problems are also
discussed. Comment: 13 figures, 35 references
Large Language and Text-to-3D Models for Engineering Design Optimization
The current advances in generative AI for learning large neural network
models with the capability to produce essays, images, music and even 3D assets
from text prompts create opportunities for a manifold of disciplines. In the
present paper, we study the potential of deep text-to-3D models in the
engineering domain, with a focus on the chances and challenges when integrating
and interacting with 3D assets in computational simulation-based design
optimization. In contrast to traditional design optimization of 3D geometries
that often searches for the optimum designs using numerical representations,
such as B-Spline surface or deformation parameters in vehicle aerodynamic
optimization, natural language challenges the optimization framework by
requiring a different interpretation of variation operators, while at the same
time it may ease and motivate human user interaction. Here, we propose and
realize a fully automated evolutionary design optimization framework using
Shap-E, a recently published text-to-3D asset network by OpenAI, in the context
of aerodynamic vehicle optimization. For representing text prompts in the
evolutionary optimization, we evaluate (a) a bag-of-words approach based on
prompt templates and WordNet samples, and (b) a tokenisation approach based on
prompt templates and the byte pair encoding method from GPT-4. Our main findings
from the optimizations indicate that, first, it is important to ensure that the
designs generated from prompts are within the object class of application, i.e.
diverse and novel designs need to be realistic, and, second, that more research
is required to develop methods where the strength of text prompt variations and
the resulting variations of the 3D designs share causal relations to some
degree to improve the optimization. Comment: 9 pages, 13 figures, IEEE conference template
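The bag-of-words representation (a) described above can be pictured with a minimal sketch. The template, the vocabulary, and the function names below are hypothetical and chosen only for illustration; the abstract does not give the actual templates, and the paper samples words from WordNet rather than from a fixed list. A genome holds one word index per template slot, and mutation swaps a single word.

```python
import random

# Hypothetical template and vocabulary, for illustration only.
# The paper draws words from WordNet; a fixed list stands in for it here.
TEMPLATE = "A {adjective} car in the shape of a {noun}"
VOCAB = {
    "adjective": ["fast", "sleek", "compact", "wide"],
    "noun": ["wedge", "teardrop", "box", "arrow"],
}
SLOTS = list(VOCAB)  # slot order: adjective, noun


def random_genome(rng):
    """Draw one word index per template slot."""
    return [rng.randrange(len(VOCAB[slot])) for slot in SLOTS]


def decode(genome):
    """Turn a list of word indices into a text prompt."""
    words = {slot: VOCAB[slot][gene] for slot, gene in zip(SLOTS, genome)}
    return TEMPLATE.format(**words)


def mutate(genome, rng):
    """Replace the word in exactly one randomly chosen slot."""
    child = list(genome)
    k = rng.randrange(len(child))
    choices = [i for i in range(len(VOCAB[SLOTS[k]])) if i != child[k]]
    child[k] = rng.choice(choices)
    return child


rng = random.Random(42)
parent = random_genome(rng)
child = mutate(parent, rng)
print(decode(parent))
print(decode(child))
```

In a full optimization loop, each decoded prompt would be passed to the text-to-3D model and the resulting geometry scored by the simulation; both steps are omitted here.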
Behavioral Pattern Mining and Modeling in Programming Problem Solving
Online learning platforms such as massive open online courses (MOOCs) and intelligent tutoring systems (ITSs) have made learning more accessible and personalized. These systems generate unprecedented amounts of behavioral data and open the way for predicting students’ future performance based on their behavior, and for assessing their strengths and weaknesses in learning.
This thesis attempts to mine students’ working patterns using a programming problem-solving system, and to build predictive models to estimate students’ learning. QuizIT, a programming problem-solving system, was used to collect students’ problem-solving activities from a lower-division computer science programming course in the Fall 2016 semester. Each activity recorded the question’s correctness, complexity, topic, and time. Differential mining techniques were used to extract frequent patterns from these activities to represent students’ behavior. These patterns were further used to build classifiers to predict students’ performance.
Seven main learning behaviors were discovered based on these patterns, which provided insight into students’ metacognitive skills and thought processes. Besides predicting students’ performance group, the classification models also helped in finding important behaviors that were crucial in determining a student’s positive or negative performance throughout the semester. Dissertation/Thesis, Masters Thesis, Computer Science 201
Optimized bi-dimensional data projection for clustering visualization
We propose a new method to project n-dimensional data onto two dimensions for visualization purposes. Our goal is to produce a bi-dimensional representation that better separates existing clusters. Accordingly, to generate this projection we apply Differential Evolution as a meta-heuristic to optimize a divergence measure of the projected data. This divergence measure is based on the Cauchy–Schwarz divergence, extended for multiple classes. It accounts for the separability of the clusters in the projected space using the Rényi entropy and Information Theoretical Clustering analysis. We test the proposed method on two synthetic and five real-world data sets, obtaining well-separated projected clusters in two dimensions. These results were compared with results generated by PCA and a recent likelihood-based visualization method.
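The idea in this abstract can be illustrated with a rough sketch; it is not the authors' implementation. It assumes only two clusters (the multi-class extension is not specified in the abstract), a linear projection with unit-norm columns, a Parzen-window estimate of the Cauchy–Schwarz divergence with a fixed Gaussian kernel width, and a basic rand/1/bin Differential Evolution loop. All function names, parameter values, and the synthetic data are illustrative assumptions.

```python
import math
import random


def gauss_sum(a, b, sigma):
    """Mean Gaussian kernel over all point pairs: a Parzen estimate of the
    integral of p*q, up to a constant that cancels in the divergence."""
    total = 0.0
    for ax, ay in a:
        for bx, by in b:
            d2 = (ax - bx) ** 2 + (ay - by) ** 2
            total += math.exp(-d2 / (4.0 * sigma ** 2))
    return total / (len(a) * len(b))


def cs_divergence(a, b, sigma=1.0):
    """Two-class Cauchy-Schwarz divergence between 2-D point sets."""
    cross = gauss_sum(a, b, sigma) + 1e-300  # guard against log(0)
    norm = math.sqrt(gauss_sum(a, a, sigma) * gauss_sum(b, b, sigma))
    return -math.log(cross / norm)


def project(points, w):
    """Project n-D points to 2-D; w holds two columns, each rescaled to unit norm."""
    dim = len(points[0])
    units = []
    for col in (w[:dim], w[dim:]):
        norm = math.sqrt(sum(v * v for v in col)) or 1.0
        units.append([v / norm for v in col])
    return [tuple(sum(p[k] * u[k] for k in range(dim)) for u in units)
            for p in points]


def differential_evolution(fitness, n_params, rng,
                           pop_size=12, gens=30, f=0.7, cr=0.9):
    """Basic rand/1/bin Differential Evolution that maximizes `fitness`."""
    pop = [[rng.uniform(-1, 1) for _ in range(n_params)]
           for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(n_params)  # at least one mutated gene
            trial = [pop[a][k] + f * (pop[b][k] - pop[c][k])
                     if (rng.random() < cr or k == forced) else pop[i][k]
                     for k in range(n_params)]
            s = fitness(trial)
            if s >= scores[i]:  # greedy selection keeps the better vector
                pop[i], scores[i] = trial, s
    best = max(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


# Synthetic 3-D data: the two clusters differ only along the third axis.
rng = random.Random(0)
cluster_a = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(15)]
cluster_b = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(4, 1)) for _ in range(15)]


def fitness(w):
    return cs_divergence(project(cluster_a, w), project(cluster_b, w))


best_w, best_score = differential_evolution(fitness, n_params=6, rng=rng)
naive_score = fitness([1, 0, 0, 0, 1, 0])  # keep only the first two axes
print("naive projection divergence:", round(naive_score, 3))
print("optimized projection divergence:", round(best_score, 3))
```

Normalizing the projection columns is a design choice of this sketch: with a fixed kernel width, an unconstrained projection could inflate the divergence simply by scaling the data up.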
A Data-Driven Approach for Modeling Agents
Agents are commonly created from a set of simple rules driven by theories, hypotheses, and assumptions. This modeling premise makes limited use of real-world data and, lacking empirical grounding, is challenged when modeling real-world systems. Simultaneously, the last decade has witnessed the production and availability of large-scale data from various sensors that carry behavioral signals. These data sources have the potential to change the way we create agent-based models, from rule-driven to data-driven. Despite this opportunity, the literature has not offered a modeling approach for generating granular agent behaviors from data.
This dissertation proposes a novel data-driven approach for modeling agents to bridge the research gap. The approach is composed of four detailed steps including data preparation, attribute model creation, behavior model creation, and integration. The connection between and within each step is established using data flow diagrams.
The practicality of the approach is demonstrated with a human mobility model that uses millions of location footprints collected from social media. In this model, the generation of movement behavior is tested with five machine learning/statistical modeling techniques covering a large number of model/data configurations. Results show that Random Forest-based learning is the most effective for the mobility use case. Furthermore, agent attribute values are obtained/generated with machine learning and translational assignment techniques.
The proposed approach is evaluated in two ways. First, the use case model is compared to another model developed using a state-of-the-art data-driven approach. The model’s prediction performance is comparable to that of the state-of-the-art model. The plausibility of behaviors and model structure in the use case model is found to be closer to the real world than that of the state-of-the-art model. This outcome indicates that the proposed approach produces realistic results. Second, a standard mobility dataset is used for driving the mobility model in place of social media data. Despite the dataset's small size, the results resembled those gathered from the primary use case, indicating the possibility of using different datasets with the proposed approach.