Classification of software components based on clustering
This thesis demonstrates how, in different phases of the software life cycle, software components with similar software metrics can be grouped into homogeneous clusters. We use multivariate analysis techniques to group similar software components. The approach was applied to several real case studies from NASA and open-source software. We obtained process- and product-related metrics during requirements specification, product-related metrics at the architectural level, and code metrics from the operational stage for several case studies. We performed clustering analysis using these metrics and validated the results. This analysis makes it possible to rank the clusters and assign similar development and validation tasks to all the components in a cluster, since the components in a cluster have similar metrics and hence tend to behave alike.
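The grouping step described above can be sketched with a plain k-means pass over per-component metric vectors. This is a minimal illustration, not the thesis's actual pipeline; the component metrics (lines of code, cyclomatic complexity) and their values are hypothetical.

```python
# Minimal k-means sketch: cluster software components by metric vectors.
# Metrics here are hypothetical (lines of code, cyclomatic complexity).
import math
import random

def kmeans(points, k, iters=50, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assign each component to its nearest cluster center.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[i].append(p)
        # Recompute each center as the mean of its assigned components.
        centers = [
            tuple(sum(x) / len(c) for x in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical metric vectors for five components.
metrics = [(120, 4), (130, 5), (900, 30), (950, 28), (125, 6)]
centers, clusters = kmeans(metrics, k=2)
```

Components with similar metrics land in the same cluster, which is the property the thesis uses to assign shared development and validation tasks per cluster.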
Expressing linear equality constraints in feedforward neural networks
We seek to impose linear, equality constraints in feedforward neural
networks. As top layer predictors are usually nonlinear, this is a difficult
task if we seek to deploy standard convex optimization methods and strong
duality. To overcome this, we introduce a new saddle-point Lagrangian with
auxiliary predictor variables on which constraints are imposed. Elimination of
the auxiliary variables leads to a dual minimization problem on the Lagrange
multipliers introduced to satisfy the linear constraints. This minimization
problem is combined with the standard learning problem on the weight matrices.
From this theoretical line of development, we obtain the surprising
interpretation of Lagrange parameters as additional, penultimate layer hidden
units with fixed weights stemming from the constraints. Consequently, standard
minimization approaches can be used despite the inclusion of Lagrange
parameters -- a very satisfying, albeit unexpected, discovery. Examples ranging
from multi-label classification to constrained autoencoders are envisaged in
the future.
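The dual construction above has a simple closed form in a toy setting: with a quadratic loss and a single linear equality constraint a·y = b on the prediction y, eliminating the auxiliary predictor yields the Lagrange multiplier directly, and the correction term acts like one extra penultimate unit with fixed weights a. This numeric sketch is an assumption-laden illustration, not the paper's implementation.

```python
# Toy sketch: enforce a single linear equality constraint a.y = b on an
# unconstrained prediction t, assuming a quadratic loss ||y - t||^2.
# The dual solution for the multiplier lam is closed-form here.

def constrain(t, a, b):
    """Return y minimizing ||y - t||^2 subject to a.y = b."""
    lam = (sum(ai * ti for ai, ti in zip(a, t)) - b) / sum(ai * ai for ai in a)
    # lam plays the role of a fixed-weight penultimate unit: y = t - lam * a.
    return [ti - lam * ai for ti, ai in zip(t, a)]

t = [2.0, 1.0, -1.0]          # unconstrained network output
a, b = [1.0, 1.0, 1.0], 0.0   # hypothetical constraint: outputs sum to zero
y = constrain(t, a, b)        # a.y = 0 now holds exactly
```

With general nonlinear top-layer predictors no such closed form exists, which is why the paper instead folds the multipliers into the network and trains them with standard minimization.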
Stable and Causal Inference for Discriminative Self-supervised Deep Visual Representations
In recent years, discriminative self-supervised methods have made significant
strides in advancing various visual tasks. The central idea of learning a data
encoder that is robust to data distortions/augmentations is straightforward yet
highly effective. Although many studies have demonstrated the empirical success
of various learning methods, the resulting learned representations can exhibit
instability and hinder downstream performance. In this study, we analyze
discriminative self-supervised methods from a causal perspective to explain
these unstable behaviors and propose solutions to overcome them. Our approach
draws inspiration from prior works that empirically demonstrate the ability of
discriminative self-supervised methods to demix ground truth causal sources to
some extent. Unlike previous work on causality-empowered representation
learning, we do not apply our solutions during the training process but rather
during the inference process to improve time efficiency. Through experiments on
both controlled image datasets and realistic image datasets, we show that our
proposed solutions, which involve tempering a linear transformation with
controlled synthetic data, are effective in addressing these issues.Comment: ICCV 2023 accepted pape
Self-supervised Likelihood Estimation with Energy Guidance for Anomaly Segmentation in Urban Scenes
Robust autonomous driving requires agents to accurately identify unexpected
areas in urban scenes. To this end, some critical issues remain open: how to
design an advisable metric to measure anomalies, and how to properly generate
training samples of anomaly data? Previous efforts usually resort to
uncertainty estimation and sample synthesis from classification tasks, which
ignore context information and sometimes require auxiliary datasets with
fine-grained annotations. In contrast, in this paper we exploit the strongly
context-dependent nature of the segmentation task and design an energy-guided
self-supervised framework for anomaly segmentation, which optimizes an anomaly
head by maximizing the likelihood of self-generated anomaly pixels. To this
end, we design two estimators for anomaly likelihood estimation: one is a
simple task-agnostic binary estimator, and the other depicts anomaly likelihood
as the residual of a task-oriented energy model. Based on the proposed
estimators, we further incorporate into our framework a likelihood-guided mask
refinement process to extract informative anomaly pixels for model training. We
conduct extensive experiments on the challenging Fishyscapes and Road Anomaly
benchmarks, demonstrating that without any auxiliary data or synthetic models,
our method still achieves performance competitive with other SOTA schemes.
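The energy-model estimator mentioned above is commonly built on the free energy of a pixel's class logits, E = -log Σ_c exp(logit_c): confident in-distribution pixels get low energy, while flat, uncertain logits get high energy. This sketch illustrates that scoring idea only; it is not the paper's released code, and the logit values are hypothetical.

```python
# Sketch of an energy-based per-pixel anomaly score over class logits.
import math

def free_energy(logits):
    """E = -logsumexp(logits); higher energy suggests a more anomalous pixel."""
    m = max(logits)  # shift by the max for numerical stability
    return -(m + math.log(sum(math.exp(l - m) for l in logits)))

inlier = [8.0, 0.5, -1.0]   # one confident class -> low energy
anomaly = [0.1, 0.0, -0.2]  # flat logits -> higher energy
assert free_energy(anomaly) > free_energy(inlier)
```

Thresholding such a score per pixel yields an anomaly mask, which a refinement step like the paper's can then clean up for training.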
Online Deception Detection Refueled by Real World Data Collection
The lack of large realistic datasets presents a bottleneck in online
deception detection studies. In this paper, we apply a data collection method
based on social network analysis to quickly identify high-quality deceptive and
truthful online reviews from Amazon. The dataset contains more than 10,000
deceptive reviews and is diverse in product domains and reviewers. Using this
dataset, we explore effective general features for online deception detection
that perform well across domains. We demonstrate that with generalized features
- advertising speak and writing complexity scores - deception detection
performance can be further improved by adding additional deceptive reviews from
assorted domains in training. Finally, reviewer-level evaluation gives an
interesting insight into different deceptive reviewers' writing styles.
Comment: 10 pages, Accepted to Recent Advances in Natural Language Processing
(RANLP) 201
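One of the generalized feature families named above, writing-complexity scores, can be approximated with simple surface statistics such as average sentence length and average word length. This is a hypothetical illustration of the feature family; the paper's exact feature definitions may differ.

```python
# Sketch of crude writing-complexity features for a review text.
def complexity_features(text):
    # Treat '.', '!', '?' as sentence boundaries.
    sentences = [s for s in text.replace('!', '.').replace('?', '.').split('.')
                 if s.strip()]
    words = text.split()
    avg_sent_len = len(words) / max(len(sentences), 1)
    avg_word_len = sum(len(w.strip('.,!?')) for w in words) / max(len(words), 1)
    return avg_sent_len, avg_word_len

review = "Amazing product. Works perfectly every single time I use it."
feats = complexity_features(review)
```

Domain-independent features like these are what allow a deception classifier to transfer across product domains, which is the paper's central point about generalized features.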
Complexity Science Models of Financing Health and Social Security Fiscal Gaps
Many think health and Social Security markets and social insurance programs are broken because they are increasingly unaffordable for too many Americans. "Bending the cost curve down" has become a standard reference term for the main objective of reform proposals that aim to slow cost increases or even reduce them. This paper presents an alternative model, with preliminary results of statistical analyses of complexity science simulation models run against historical data, that quickly bends the GDP curve up to increase affordability. This paper looks beyond popular reform models to self-organizing complexity science models based on chemistry, physics, and biology theories to suggest sustainable, long-term financial reform proposals. The foundation of these proposals is not based on orthodox market-failure economic models but rather on thermodynamics in general and the time evolution of Shannon information entropy in particular.
Keywords: complexity science, financing fiscal gaps, health and Social Security, macroeconomics