Centered Partition Process: Informative Priors for Clustering
There is a very rich literature proposing Bayesian approaches for clustering
starting with a prior probability distribution on partitions. Most approaches
assume exchangeability, leading to simple representations in terms of
Exchangeable Partition Probability Functions (EPPF). Gibbs-type priors
encompass a broad class of such cases, including Dirichlet and Pitman-Yor
processes. Even though there have been some proposals to relax the
exchangeability assumption, allowing covariate-dependence and partial
exchangeability, limited consideration has been given to how to include
concrete prior knowledge on the partition. For example, we are motivated by an
epidemiological application, in which we wish to cluster birth defects into
groups and we have prior knowledge of an initial clustering provided by
experts. As a general approach for including such prior knowledge, we propose a
Centered Partition (CP) process that modifies the EPPF to favor partitions
close to an initial one. Some properties of the CP prior are described, a
general algorithm for posterior computation is developed, and we illustrate the
methodology through simulation examples and an application to the motivating
epidemiology study of birth defects.
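As a concrete illustration of the tilting idea, the sketch below (an assumption-laden stand-in, not the paper's implementation) multiplies a Chinese-restaurant-process EPPF by a penalty that decays with a pairwise-disagreement distance from the initial partition c0; the paper's actual distance and computational choices may differ.

```python
from collections import Counter
from itertools import combinations
from math import exp, gamma as gamma_fn

def crp_eppf(partition, alpha=1.0):
    """EPPF of a Chinese restaurant process (a Gibbs-type prior).
    `partition` is a list of cluster labels, one per item."""
    num = alpha ** len(set(partition))
    for s in Counter(partition).values():
        num *= gamma_fn(s)          # (s - 1)! per cluster
    den = 1.0
    for i in range(len(partition)):
        den *= alpha + i            # rising factorial alpha^(n)
    return num / den

def pair_distance(c, c0):
    """Number of item pairs on which two clusterings disagree."""
    return sum((c[i] == c[j]) != (c0[i] == c0[j])
               for i, j in combinations(range(len(c)), 2))

def cp_prior(partitions, c0, psi=1.0, alpha=1.0):
    """Centered Partition prior over an enumerated set of partitions:
    the EPPF is exponentially tilted toward the initial clustering c0.
    Enumeration is only feasible for small n; the paper develops a
    general posterior algorithm that avoids it."""
    w = [crp_eppf(c, alpha) * exp(-psi * pair_distance(c, c0))
         for c in partitions]
    z = sum(w)
    return [wi / z for wi in w]
```

With a moderate penalty weight psi, the prior mass concentrates on partitions close to c0 while still spreading some probability over its neighbours.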
Segmentation of articular cartilage and early osteoarthritis based on the fuzzy soft thresholding approach driven by modified evolutionary ABC optimization and local statistical aggregation
Articular cartilage assessment, aimed at identifying cartilage loss, is a crucial task in the clinical practice of orthopedics. Conventional software (SW) instruments allow only visualization of the knee structure, without the post-processing needed for objective cartilage modeling. In this paper, we propose a multiregional segmentation method that aims to provide a mathematical model reflecting the physiological morphological structure of the cartilage, including spots corresponding to early cartilage loss that are poorly recognizable by the naked eye in magnetic resonance imaging (MRI). The proposed segmentation model is composed of two pixel-classification stages. First, the image histogram is decomposed using a sequence of triangular fuzzy membership functions, whose localization is driven by a modified artificial bee colony (ABC) optimization algorithm utilizing a random sequence of candidate solutions based on real cartilage features. In the second stage, a pixel's original membership in a given segmentation class may be modified using local statistical aggregation, which takes into account spatial relationships with adjacent pixels. In this way, image noise and artefacts, which are commonly present in MR images, can be identified and eliminated, making the model robust and insensitive to distorting signals. We analyzed the proposed model on 2D spatial MR image records, showing different clinical cases of articular cartilage segmentation with identification of cartilage loss. In the final part of the analysis, we compared the model's performance against selected conventional methods on MR image records corrupted by additive image noise.
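To make the first stage concrete, here is a minimal pure-Python sketch of classifying pixel intensities by triangular fuzzy membership functions; the membership centers, which the paper places via ABC optimization, are simply supplied by hand here as a stand-in.

```python
def triangular(x, left, center, right):
    """Triangular fuzzy membership: 0 outside (left, right), 1 at center."""
    if x <= left or x >= right:
        return 1.0 if x == center else 0.0
    if x <= center:
        return (x - left) / (center - left)
    return (right - x) / (right - center)

def fuzzy_classes(pixels, centers):
    """Assign each pixel intensity to the class with highest membership.
    Each class's triangle peaks at its center and falls to zero at the
    neighbouring centers (the end triangles are extended symmetrically)."""
    cs = sorted(centers)
    lo = [cs[0] - (cs[1] - cs[0])] + cs[:-1]   # left feet
    hi = cs[1:] + [cs[-1] + (cs[-1] - cs[-2])] # right feet
    out = []
    for x in pixels:
        memb = [triangular(x, l, c, r) for l, c, r in zip(lo, cs, hi)]
        out.append(max(range(len(cs)), key=lambda k: memb[k]))
    return out
```

In the paper, the second stage would then revise these crisp per-pixel memberships using local statistical aggregation over neighbouring pixels, which is omitted from this sketch.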
Multilevel Clustering Fault Model for IC Manufacture
A hierarchical approach to the construction of compound distributions for
process-induced faults in IC manufacture is proposed. Within this framework,
the negative binomial distribution is treated as the level-1 model. The
hierarchical approach to fault distribution offers an integrated picture of how
fault density varies from region to region within a wafer, from wafer to wafer
within a batch, and so on. A theory of compound-distribution hierarchies is
developed by means of generating functions. A study of the correlations that
naturally appear in microelectronics due to the batch character of IC
manufacture is also proposed. Taking these correlations into account is of
significant importance for developing procedures for statistical quality
control in IC manufacture. With respect to applications, hierarchies of yield
means and yield probability-density functions are considered.

Comment: 10 pages, the International Conference "Micro- and Nanoelectronics-2003" (ICMNE-2003), Zvenigorod, Moscow district, Russia, October 6-10, 200
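A compound hierarchy of this kind can be sketched by stacking multiplicative gamma perturbations of the fault density at batch, wafer and region level, with Poisson counts given the density; mixing a Poisson over a gamma density yields the negative binomial level-1 model named in the abstract. Note the paper develops the theory analytically via generating functions, whereas this is only a simulation-based illustration.

```python
import math
import random

def poisson_sample(rng, lam):
    """Inverse-CDF Poisson draw (stdlib random has no Poisson sampler)."""
    k, term = 0, math.exp(-lam)
    acc, u = term, rng.random()
    while u > acc:
        k += 1
        term *= lam / k
        acc += term
    return k

def simulate_fault_counts(n_batches, wafers_per_batch, regions_per_wafer,
                          mean_density=0.5, shape=2.0, seed=1):
    """Three-level hierarchy: the fault density is perturbed
    multiplicatively per batch, per wafer and per region (each factor
    gamma-distributed with mean 1), then region fault counts are Poisson
    given the density.  Each gamma-Poisson mixture is negative binomial;
    stacking them gives the compound-distribution hierarchy."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_batches):
        b = rng.gammavariate(shape, 1.0 / shape)
        for _ in range(wafers_per_batch):
            w = rng.gammavariate(shape, 1.0 / shape)
            for _ in range(regions_per_wafer):
                r = rng.gammavariate(shape, 1.0 / shape)
                counts.append(poisson_sample(rng, mean_density * b * w * r))
    return counts

def region_yield(counts):
    """Empirical yield: fraction of regions with zero faults."""
    return sum(c == 0 for c in counts) / len(counts)
```

The batch-level gamma factor is shared by every wafer in a batch, which induces exactly the positive correlations between regions of the same batch that the abstract highlights.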
Bayesian Hierarchical Modelling for Tailoring Metric Thresholds
Software is highly contextual. While there are cross-cutting `global'
lessons, individual software projects exhibit many `local' properties. This
data heterogeneity makes drawing local conclusions from global data dangerous.
A key research challenge is to construct locally accurate prediction models
that are informed by global characteristics and data volumes. Previous work has
tackled this problem using clustering and transfer learning approaches, which
identify locally similar characteristics. This paper applies a simpler approach
known as Bayesian hierarchical modeling. We show that hierarchical modeling
supports cross-project comparisons, while preserving local context. To
demonstrate the approach, we conduct a conceptual replication of an existing
study on setting software metrics thresholds. Our emerging results show our
hierarchical model reduces model prediction error compared to a global approach
by up to 50%.

Comment: Short paper, published at MSR '18: 15th International Conference on Mining Software Repositories, May 28--29, 2018, Gothenburg, Swede
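The pooling behavior of such a hierarchical model can be sketched with a simple empirical-Bayes shrinkage estimator (a crude stand-in for a full Bayesian fit, with method-of-moments plug-ins): each project's mean metric value is pulled toward the global mean, and data-poor projects are shrunk hardest.

```python
from statistics import mean, pvariance

def partial_pool(project_samples):
    """Shrink each project's mean metric toward the grand mean.
    `project_samples` maps project name -> list of metric observations.
    The shrinkage weight depends on the between-project variance (tau^2)
    and the within-project variance (sigma^2), estimated crudely here."""
    grand = mean(v for vs in project_samples.values() for v in vs)
    raw_means = {p: mean(vs) for p, vs in project_samples.items()}
    tau2 = pvariance(list(raw_means.values()))       # between-project spread
    sigma2 = mean(pvariance(vs)                       # pooled within-project
                  for vs in project_samples.values() if len(vs) > 1)
    pooled = {}
    for p, vs in project_samples.items():
        n = len(vs)
        # More data (larger n) -> weight closer to 1 -> less shrinkage.
        weight = tau2 / (tau2 + sigma2 / n) if tau2 > 0 else 0.0
        pooled[p] = weight * raw_means[p] + (1 - weight) * grand
    return pooled
```

A project-specific threshold could then be set from the pooled mean rather than the noisy raw mean, which is the "local context informed by global data" idea the abstract describes.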
Are Delayed Issues Harder to Resolve? Revisiting Cost-to-Fix of Defects throughout the Lifecycle
Many practitioners and academics believe in a delayed issue effect (DIE);
i.e. the longer an issue lingers in the system, the more effort it requires to
resolve. This belief is often used to justify major investments in new
development processes that promise to retire more issues sooner.
This paper tests for the delayed issue effect in 171 software projects
conducted around the world in the period 2006--2014. To the best of our
knowledge, this is the largest study yet published on this effect. We found no
evidence for the delayed issue effect; i.e. the effort to resolve issues in a
later phase was not consistently or substantially greater than when issues were
resolved soon after their introduction.
This paper documents the above study and explores reasons for this mismatch
between this common rule of thumb and empirical data. In summary, DIE is not
some constant across all projects. Rather, DIE might be an historical relic
that occurs intermittently only in certain kinds of projects. This is a
significant result since it predicts that new development processes that
promise to retire more issues faster will not have a guaranteed return on
investment (depending on the context in which they are applied), and that a long-held truth
in software engineering should not be considered a global truism.

Comment: 31 pages. Accepted with minor revisions to Journal of Empirical Software Engineering. Keywords: software economics, phase delay, cost to fi