
    EEF-CAS: An Effort Estimation Framework with Customizable Attribute Selection

    Existing estimation frameworks generally provide one-size-fits-all solutions that fail to produce accurate estimates in most environments. Research has shown that achieving accurate effort estimates is a long-term process that, above all, requires the extensive collection of effort estimation data by each organization. Collected data is generally characterized by a set of attributes believed to affect development effort. The attributes that most affect development effort vary widely depending on the type of product being developed and the environment in which it is developed. Thus, any new estimation framework must offer the flexibility of customizable attribute selection. Moreover, such attributes could provide the ability to incorporate empirical evidence and expert judgment into the effort estimation framework. Finally, because software is virtual and therefore intangible, the most important software metrics are notoriously subjective, varying with the experience of the estimator. Consequently, a measurement and inference system that is robust to subjectivity and uncertainty must be in place. The Effort Estimation Framework with Customizable Attribute Selection (EEF-CAS) presented in this paper has been designed with the above requirements in mind. It is accompanied by four preparation process steps that allow any organization implementing it to establish an estimation process. This estimation process facilitates data collection, customization of the framework to the organization's needs, calibration with the organization's data, and continual improvement. The proposed framework was validated in a real software development organization.

    DPpack: An R Package for Differentially Private Statistical Analysis and Machine Learning

    Differential privacy (DP) is the state-of-the-art framework for guaranteeing privacy for individuals when releasing aggregated statistics or building statistical/machine learning models from data. We develop the open-source R package DPpack, which provides a large toolkit for differentially private analysis. The current version of DPpack implements three popular mechanisms for ensuring DP: Laplace, Gaussian, and exponential. Beyond that, DPpack provides easily accessible privacy-preserving descriptive statistics functions, including mean, variance, covariance, and quantiles, as well as histograms and contingency tables. Finally, DPpack provides user-friendly implementations of privacy-preserving logistic regression, SVM, and linear regression, as well as differentially private hyperparameter tuning for each of these models. This extensive collection of differentially private statistics and models permits hassle-free application of differential privacy principles in commonly performed statistical analyses. We plan to continue developing DPpack and make it more comprehensive by including more differentially private machine learning techniques and tools for statistical modeling and inference.
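The Laplace mechanism named in the abstract has a simple closed form: add noise with scale equal to the query's sensitivity divided by the privacy budget ε. The sketch below illustrates that calibration in Python for a bounded mean; it is not DPpack's API (DPpack is an R package), and all function names here are illustrative.

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, rng=None):
    """Release `value` with epsilon-DP by adding Laplace noise
    of scale sensitivity / epsilon (the standard calibration)."""
    if rng is None:
        rng = np.random.default_rng()
    return value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

def dp_mean(data, lower, upper, epsilon, rng=None):
    """DP mean of values clipped to [lower, upper]."""
    clipped = np.clip(data, lower, upper)
    # The mean of n values bounded in [lower, upper] changes by at most
    # (upper - lower) / n when one record changes: that is its sensitivity.
    sensitivity = (upper - lower) / len(clipped)
    return laplace_mechanism(clipped.mean(), sensitivity, epsilon, rng)

noisy_mean = dp_mean(np.array([0.2, 0.5, 0.9, 0.4]),
                     lower=0.0, upper=1.0, epsilon=1.0)
```

Clipping to a known range before computing the statistic is what makes the sensitivity bound, and hence the privacy guarantee, valid.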

    3-dimensional median-based algorithms in image sequence processing

    Ankara: Department of Electrical and Electronics Engineering and the Institute of Engineering and Sciences of Bilkent University, 1990. Thesis (Master's), Bilkent University, 1990. Includes bibliographical references (leaves 75-78). This thesis introduces new 3-dimensional median-based algorithms for two of the main research areas in image sequence processing: image sequence enhancement and image sequence coding. Two new nonlinear filters are developed in the field of image sequence enhancement, and their motion performance and output statistics are evaluated. Simulations show that the filters improve image quality to a large extent compared with other examples from the literature. The second field addressed is image sequence coding, for which a new 3-dimensional median-based coding and decoding method is developed for stationary images with the aim of good slow-motion performance. All the algorithms developed are simulated on real image sequences using a video sequencer. Alp, Münire Bilge. M.S.
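A 3-dimensional median filter operates on a spatio-temporal neighborhood: each output pixel is the median of a small block spanning adjacent frames as well as adjacent pixels. The sketch below is a generic illustration of that idea, not the thesis's specific nonlinear filters.

```python
import numpy as np

def median3d(frames, size=3):
    """Spatio-temporal median filter over an image sequence.

    `frames` has shape (T, H, W). Each output pixel is the median of a
    size x size x size block spanning neighboring frames, which suppresses
    impulsive noise while preserving edges better than linear smoothing.
    """
    pad = size // 2
    # Replicate border values so the output keeps the input shape.
    padded = np.pad(frames, pad, mode="edge")
    out = np.empty_like(frames)
    T, H, W = frames.shape
    for t in range(T):
        for i in range(H):
            for j in range(W):
                block = padded[t:t + size, i:i + size, j:j + size]
                out[t, i, j] = np.median(block)
    return out
```

In practice the same operation is available as `scipy.ndimage.median_filter(frames, size=3)`; the explicit loop above only makes the neighborhood structure visible.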

    Outlier Detection for Mixed Model with Application to RNA-Seq Data

    Extracting messenger RNA (mRNA) molecules using oligo-dT probes targeting the Poly(A) tail is common in RNA-sequencing (RNA-seq) experiments. This approach, however, is limited when the specimen is profoundly degraded or formalin-fixed, such that either the majority of mRNAs have lost their Poly(A) tails or the oligo-dT probes do not anneal with the formalin-altered adenines. For this problem, a new protocol called capture RNA sequencing was developed using probes for target sequences, which gives unbiased estimates of RNA abundance even when the specimens are degraded. However, despite the effectiveness of capture sequencing, mRNA purification by the traditional Poly(A) protocol still underlies most reference libraries. A bridging mechanism that makes the two types of measurements comparable is needed for data integration and efficient use of information. In the first project, we developed an optimization algorithm that was later applied to outlier detection in a linear mixed model for data integration. In particular, we minimized the sum of truncated convex functions, which is often encountered in models with an L0 penalty. The solution is exact in one-dimensional and two-dimensional spaces. For higher-dimensional problems, we applied the algorithm in a coordinate descent fashion. Although global optimality is no longer guaranteed, this approach generates local solutions with much higher efficiency. In the second project, we investigated the differences between Poly(A) libraries and capture sequencing libraries. We showed that without conversion, directly merging the two types of measurements leads to biases in subsequent analyses. A practical solution was to use a linear mixed model to predict one type of measurement based on the other. The predicted values based on this approach have high correlations, low errors, and high efficiency compared with those based on the fixed-effects model. Moreover, the procedure eliminates false positive findings and biases introduced by the technology differences between the two measurements. In the third project, we noted outlying observations and outlying random effects when fitting the mixed model. As they lead to the discovery of dysfunctional probes and batch effects, we developed an algorithm that screened for the outliers and provided a robust estimation. Specifically, we modified the mean-shift model with variable selection using L0 penalties, which was first introduced by Gannaz (2007), McCann and Welsch (2007), and She and Owen (2012). By incorporating the optimization method proposed in the first project, the algorithm became scalable and yielded exact solutions for low-dimensional problems. In particular, under the assumption of normality, there existed analytic expressions for the penalty parameters. In simulation studies, we showed that the proposed algorithm attained reliable outlier detection, delivered robust estimation, and achieved efficient computation. PHD. Biostatistics. University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/147613/1/ltzuying_1.pd
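For the cited mean-shift model with an L0 penalty, the well-known special case of plain linear regression reduces to an iterative hard-thresholding scheme in the style of She and Owen (2012): residuals larger than a threshold are absorbed into the shift term and flagged as outliers. The sketch below shows that simplified case only; it is not the dissertation's mixed-model algorithm, and the threshold is treated as a user-supplied assumption rather than derived analytically.

```python
import numpy as np

def mean_shift_outliers(X, y, threshold, max_iter=50):
    """Outlier detection via the mean-shift model y = X @ beta + gamma + noise.

    With an L0 penalty on gamma, each update hard-thresholds the residuals:
    entries exceeding `threshold` in absolute value are absorbed into gamma
    (flagged as outliers), then the regression is refit on the shifted
    responses. Simplified hard-thresholding iteration, not the mixed-model
    algorithm from the dissertation.
    """
    gamma = np.zeros(len(y))
    for _ in range(max_iter):
        beta, *_ = np.linalg.lstsq(X, y - gamma, rcond=None)
        resid = y - X @ beta
        new_gamma = np.where(np.abs(resid) > threshold, resid, 0.0)
        if np.allclose(new_gamma, gamma):
            break
        gamma = new_gamma
    return beta, gamma

# Example: a noisy line with two gross outliers.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 40)
X = np.column_stack([np.ones_like(x), x])
y = 2.0 + 3.0 * x + 0.05 * rng.standard_normal(40)
y[5] += 4.0
y[20] -= 5.0
beta, gamma = mean_shift_outliers(X, y, threshold=1.0)
outliers = np.nonzero(gamma)[0]
```

The nonzero entries of `gamma` identify the contaminated observations, and `beta` is the fit with those observations effectively downweighted to zero.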

    Comparing the efficiency of normal form systems to represent Boolean functions

    In this paper we compare various normal form representations of Boolean functions. We extend the study of [4], pertaining to the comparison of the asymptotic efficiency of representations produced by normal form systems (NFSs) that are factorizations of the clone Ω of all Boolean functions. We identify properties, such as associativity, linearity, quasi-linearity, and symmetry, that allow the efficiency of the corresponding NFSs to be compared in terms of the non-trivial connectives used. We illustrate these results by comparing well-known NFSs such as the DNF, CNF, Zhegalkin (Reed-Muller) polynomial (PNF), and Median (MNF) representations, thereby confirming the results of [4]. In particular, we show that the MNF is of equivalent complexity to, e.g., the Sheffer Normal Form (SNF), UNF, and WNF (associated with 1-separating and 0-separating functions, respectively), and thus that the latter are polynomially as efficient as any other NFS and strictly more efficient than the DNF, CNF, and Zhegalkin polynomial representations.
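The Zhegalkin (Reed-Muller) polynomial mentioned above represents a Boolean function as an XOR of AND monomials, and its coefficients can be computed from the truth table by the fast XOR Möbius transform. The sketch below illustrates that standard construction; it is background for the representation compared in the paper, not the paper's own algorithm.

```python
def zhegalkin_coeffs(truth_table):
    """Coefficients of the Zhegalkin (algebraic normal form) polynomial.

    `truth_table[i]` is f evaluated at the input whose variable bits are
    the bits of i. Returns c where f(x) is the XOR, over all masks m with
    c[m] = 1, of the AND of the variables x_j with bit j set in m.
    Uses the in-place fast XOR Moebius transform.
    """
    n_bits = len(truth_table).bit_length() - 1
    c = list(truth_table)
    for j in range(n_bits):
        bit = 1 << j
        for i in range(len(c)):
            if i & bit:
                c[i] ^= c[i ^ bit]
    return c

# XOR of two variables has truth table [0, 1, 1, 0] and ANF x0 + x1,
# i.e. coefficients set exactly at masks 0b01 and 0b10.
coeffs = zhegalkin_coeffs([0, 1, 1, 0])
```

Because the transform is its own inverse over GF(2), applying it twice recovers the original truth table, which makes it easy to check round-trip correctness.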

    A fuzzy hierarchical multiple criteria group decision support system - Decider - and its applications

    Decider is a Fuzzy Hierarchical Multiple Criteria Group Decision Support System (FHMC-GDSS) designed to handle subjective (in particular linguistic) information and objective information simultaneously, supporting group decision making with a focus on evaluation. In this chapter, the fuzzy aggregation decision model, functions, and structure of Decider are introduced. We present the ideas used to resolve the decision and evaluation problems we have faced in the development and application of Decider, and briefly illustrate two real applications of the Decider system. Finally, we discuss our further research in this area. © 2011 Springer-Verlag Berlin Heidelberg
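Systems of this kind typically map linguistic ratings to fuzzy numbers, aggregate them with expert weights, and defuzzify to a crisp score. The sketch below illustrates that generic pattern with triangular fuzzy numbers; the linguistic scale and the weighted-average aggregation are assumptions for illustration, not Decider's actual model.

```python
# Illustrative linguistic scale mapped to triangular fuzzy numbers
# (low, mode, high); this scale is assumed, not taken from Decider.
TERMS = {
    "poor":      (0.00, 0.00, 0.25),
    "fair":      (0.00, 0.25, 0.50),
    "good":      (0.25, 0.50, 0.75),
    "very good": (0.50, 0.75, 1.00),
    "excellent": (0.75, 1.00, 1.00),
}

def weighted_fuzzy_mean(ratings, weights):
    """Weighted average of triangular fuzzy numbers, component-wise."""
    total = sum(weights)
    return tuple(
        sum(w * r[k] for r, w in zip(ratings, weights)) / total
        for k in range(3)
    )

def defuzzify(tfn):
    """Crisp score as the centroid of a triangular fuzzy number."""
    return sum(tfn) / 3.0

# Three experts with unequal weights rate one criterion:
ratings = [TERMS["good"], TERMS["very good"], TERMS["excellent"]]
score = defuzzify(weighted_fuzzy_mean(ratings, weights=[1.0, 2.0, 1.0]))
```

In a hierarchical setting the same aggregation would be applied criterion by criterion up the tree, with criterion weights in place of expert weights.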