1,532 research outputs found
An Experimental Comparison of Three Machine Learning Techniques for Web Cost Estimation
Many comparative studies on the performance of machine learning (ML) techniques for web cost estimation (WCE) have been reported in the literature. However, not much attention has been given to understanding the conceptual differences and similarities that exist in the application of these ML techniques for WCE, which could provide a credible guide for upcoming practitioners and researchers in predicting the cost of new web projects. This paper presents a comparative analysis of three prominent machine learning techniques – Case-Based Reasoning (CBR), Support Vector Regression (SVR) and Artificial Neural Network (ANN) – in terms of performance, applicability, and their conceptual differences and similarities for WCE, using data obtained from a public dataset (www.tukutuku.com). Results from experiments show that SVR and ANN provide more accurate predictions of effort, although SVR requires fewer parameters to generate good predictions than ANN. CBR was not as accurate, but its good explanation attribute gives it a higher descriptive value. The study also outlines specific characteristics of the three ML techniques that could foster or inhibit their adoption for WCE.
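To make the "explanation attribute" of CBR concrete, the following is a minimal sketch of analogy-based estimation, not the paper's implementation. It assumes min-max scaled features, Euclidean distance, and the mean effort of the k nearest past projects as the estimate. The function names and the feature layout are hypothetical.

```python
import math


def make_scaler(cases):
    """Build a min-max scaler from past cases (one feature row per project)."""
    columns = list(zip(*cases))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]

    def scale(row):
        # Constant columns carry no information, so map them to 0.0.
        return [(v - lo) / (hi - lo) if hi > lo else 0.0
                for v, lo, hi in zip(row, lows, highs)]

    return scale


def cbr_estimate(features, efforts, new_project, k=2):
    """Estimate effort as the mean effort of the k most similar past projects.

    Returns the estimate together with the indices of the retrieved cases,
    so the analogies behind the estimate can be inspected.
    """
    scale = make_scaler(features)
    target = scale(new_project)
    ranked = sorted(range(len(features)),
                    key=lambda i: math.dist(scale(features[i]), target))
    nearest = ranked[:k]
    estimate = sum(efforts[i] for i in nearest) / len(nearest)
    return estimate, nearest
```

Returning the retrieved cases alongside the estimate reflects why CBR is easy to explain: each prediction can be traced back to specific earlier projects.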
Pragmatic cost estimation for web applications
Cost estimation for web applications is an interesting and difficult challenge for researchers and industrial practitioners. It is a particularly valuable area of ongoing commercial research. Attaining an accurate cost estimate for web applications is essential to providing competitive bids and remaining successful in the market. The development of prediction techniques, begun over thirty years ago, has contributed several different strategies. Unfortunately, there is no collective evidence to give substantial advice or guidance to industrial practitioners. To address this problem, this thesis investigates the characteristics of the dataset by combining the literature review and industrial survey findings. The results of the systematic literature review, the industrial survey and an initial investigation have led to an understanding that dataset characteristics may influence the cost estimation prediction techniques. From this, an investigation was carried out on dataset characteristics. However, in the attempt to structure the characteristics of the dataset, it was found not to be practical or easy to obtain a defined structure of dataset characteristics to use as a basis for prediction model selection. Therefore the thesis develops a pragmatic cost estimation strategy based on collected advice and general sound practice in cost estimation. The strategy is composed of the following five steps: test whether the predictions are better than the means of the dataset; test the predictions using accuracy measures such as MMRE, Pred and MAE, knowing their strengths and weaknesses; investigate the prediction models formed to see if they are sensible and reasonable; perform significance testing on the predictions; and get the effect size to establish preference relations of prediction models.
The results from this pragmatic cost estimation strategy give not only advice on several techniques to choose from, but also reliable results. Practitioners can be more confident about the estimate obtained by following this pragmatic cost estimation strategy. It can be concluded that practitioners should focus on the best strategy to apply in cost estimation rather than on the best techniques. Therefore, this pragmatic cost estimation strategy could help researchers and practitioners to get reliable results. The improvement and replication of this strategy over time will produce much more useful and trusted results.
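The first two steps of the strategy above can be sketched with the standard definitions of the accuracy measures it names. This is an illustrative sketch rather than the thesis's code. It assumes MMRE is the mean of |actual - predicted| / actual, Pred(l) is the fraction of projects whose relative error is at most l (0.25 is a common choice), and that the "mean of the dataset" baseline predicts the mean actual effort for every project. The function names are hypothetical.

```python
from statistics import mean


def mre(actual, predicted):
    """Magnitude of relative error for one project (actual must be non-zero)."""
    return abs(actual - predicted) / actual


def mmre(actuals, predictions):
    """Mean magnitude of relative error over all projects."""
    return mean(mre(a, p) for a, p in zip(actuals, predictions))


def pred(actuals, predictions, level=0.25):
    """Fraction of projects whose relative error is at most `level`."""
    hits = sum(1 for a, p in zip(actuals, predictions) if mre(a, p) <= level)
    return hits / len(actuals)


def mae(actuals, predictions):
    """Mean absolute error, in the same unit as the effort values."""
    return mean(abs(a - p) for a, p in zip(actuals, predictions))


def beats_mean_baseline(actuals, predictions):
    """Step 1: do the predictions have a lower MAE than always guessing the mean?

    Assumption: the baseline uses the mean of the same actuals; a stricter
    check would take the mean from a separate training set.
    """
    baseline = [mean(actuals)] * len(actuals)
    return mae(actuals, predictions) < mae(actuals, baseline)
```

MMRE and Pred are relative measures and penalise over- and under-estimates unevenly, while MAE is expressed in effort units, which is one reason to report them together rather than rely on a single measure.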
Method of Estimating Costs of a Software Web Product
The costing of a product is a key factor in the marketing process. Its proper calculation can attract customers, which will ensure a company’s life and its business expansion. These considerations have driven the Centre for the Study of Software Engineering at Universidad de la Frontera (CEIS-UFRO) to develop a method to define the cost of a web software product, based on use cases and productivity. This method is adaptable to the particular characteristics of any development process, any development team, any product and any company. This article describes the method and performs an initial validation by describing a quasi-experiment designed for Web applications developed by groups of three to five people. We have shown that: (a) the method may be reproduced, (b) effort estimation is sensitive to the definition of productivity, and (c) the subjectivity introduced by the estimators does not invalidate the method. For a complete validation of this method, different web products and a larger number of estimators with different levels of experience should be incorporated in a future replication. Sociedad Argentina de Informática e Investigación Operativ
New Statistical Paradigms Leading to Web-Based Tools for Clinical/Translational Science
As the field of functional genetics and genomics is beginning to mature, we become confronted with new challenges. The constant drop in price for sequencing and gene expression profiling as well as the increasing number of genetic and genomic variables that can be measured makes it feasible to address more complex questions. The success with rare diseases caused by single loci or genes has provided us with a proof-of-concept that new therapies can be developed based on functional genomics and genetics.
Common diseases, however, typically involve genetic epistasis, genomic pathways, and proteomic patterns. Moreover, to better understand the underlying biological systems, we often need to integrate information from several of these sources. Thus, as the field of clinical research moves toward complex diseases, the demand for modern database systems and advanced statistical methods increases.
The traditional statistical methods implemented in most of the bioinformatics tools currently used in the novel field of genetics and functional genomics are based on the linear model and, thus, have shortcomings when applied to nonlinear biological systems. The previous work on partially ordered data (Wittkowski 1988; 1992), when combined with theoretical results (Hoeffding 1948) and computational strategies (Deuchler 1914) has opened a new field of nonparametric statistics. With grid technology, new tools are now feasible when screening for interactions between genetics (Wittkowski, Liu 2002) and functional genomics (Wittkowski, Lee 2004).
Having more complex study designs and more specific methods available increases the demand for decision support when selecting appropriate bioinformatics tools. With the advent of rapid prototyping systems for Web-based database applications, we have recently begun to complement previous work on knowledge-based systems with graphical Web-based tools for acquisition of DESIGN and MODEL knowledge.
ARCHITECTURE-BASED RELIABILITY ANALYSIS OF WEB SERVICES
In a Service Oriented Architecture (SOA), the hierarchical complexity of Web Services (WS) and their interactions with the underlying Application Server (AS) create new challenges in providing a realistic estimate of WS performance and reliability. Current approaches often treat the entire WS environment as a black box. Thus, the sensitivity of the overall reliability and performance to the behavior of the underlying WS architectures and AS components is not well understood. In other words, current research on the architecture-based analysis of WSs is limited.
This dissertation presents a novel methodology for modeling the reliability and performance of web services. WSs are treated as atomic entities but the AS is broken down into layers. More specifically, interactions of WSs with the underlying layers of an AS are investigated. One important feature of the research is investigating the impact of dynamic parameters that exist at the layers, such as configuration parameters. These parameters may have a negative impact on WS performance if they are not configured properly. WSs are developed in house and the AS considered is JBoss AS. An experimental environment is set up so that controlled service requests can be generated and important performance metrics can be recorded under various configurations of the AS. In addition, a simulation model is developed from the source code and run-time behavior of the existing WS and AS implementations. The model mimics the logical behavior of the WSs based on their communication with the AS layers. The simulation results are compared to the experimental results to ensure the correctness of the model. The architecture of the simulation model, which is based on Stochastic Petri Nets (SPN), is modularized in accordance with the layers and their interactions. As web services are often executed in a complex and distributed environment, the modularized approach enables a user or a designer to observe and investigate the performance of the entire system under various conditions. In contrast, most approaches to WS analysis are monolithic in that the entire system is treated as a closed box.
The results show that 1) the simulation model can be a viable tool for measuring the performance and reliability of WSs under different loads and conditions that may be of great interest to WS designers and the professionals involved; 2) configuration parameters have a large impact on the overall performance; 3) the simulation model can be tuned to account for various speeds in terms of communication, hardware, and software; 4) as the simulation model is modularized, it may be used as a foundation for aggregating the modules (layers) or nullifying modules, or it can be enhanced to include other aspects of the WS architecture such as network characteristics and the hardware/operating system on which the AS and WSs execute; and 5) the simulation model is useful for predicting the performance of web services in cases that are difficult to replicate in a field study.
Improving System Reliability for Cyber-Physical Systems
Cyber-physical systems (CPS) are systems featuring a tight combination of, and coordination between, the system's computational and physical elements. Cyber-physical systems include systems ranging from critical infrastructure such as a power grid and transportation system to health and biomedical devices. System reliability, i.e., the ability of a system to perform its intended function under a given set of environmental and operational conditions for a given period of time, is a fundamental requirement of cyber-physical systems. An unreliable system often leads to disruption of service, financial cost and even loss of human life. An important and prevalent type of cyber-physical system meets the following criteria: processing large amounts of data; employing software as a system component; running online continuously; having operator-in-the-loop because of human judgment and an accountability requirement for safety critical systems. This thesis aims to improve system reliability for this type of cyber-physical system. To improve system reliability for this type of cyber-physical system, I present a system evaluation approach entitled automated online evaluation (AOE), which is a data-centric runtime monitoring and reliability evaluation approach that works in parallel with the cyber-physical system to conduct automated evaluation along the workflow of the system continuously using computational intelligence and self-tuning techniques and provide operator-in-the-loop feedback on reliability improvement. For example, abnormal input and output data at or between the multiple stages of the system can be detected and flagged through data quality analysis. As a result, alerts can be sent to the operator-in-the-loop. The operator can then take actions and make changes to the system based on the alerts in order to achieve minimal system downtime and increased system reliability. 
One technique used by the approach is data quality analysis using computational intelligence, which applies computational intelligence to evaluate data quality in an automated and efficient way in order to make sure the running system performs reliably as expected. Another technique used by the approach is self-tuning, which automatically self-manages and self-configures the evaluation system to ensure that it adapts itself based on the changes in the system and feedback from the operator. To implement the proposed approach, I further present a system architecture called autonomic reliability improvement system (ARIS). This thesis investigates three hypotheses. First, I claim that the automated online evaluation empowered by data quality analysis using computational intelligence can effectively improve system reliability for cyber-physical systems in the domain of interest as indicated above. In order to prove this hypothesis, a prototype system needs to be developed and deployed in various cyber-physical systems while certain reliability metrics are required to measure the system reliability improvement quantitatively. Second, I claim that the self-tuning can effectively self-manage and self-configure the evaluation system based on the changes in the system and feedback from the operator-in-the-loop to improve system reliability. Third, I claim that the approach is efficient. It should not have a large impact on the overall system performance and should introduce only minimal extra overhead to the cyber-physical system. Some performance metrics should be used to measure the efficiency and added overhead quantitatively. Additionally, in order to conduct efficient and cost-effective automated online evaluation for data-intensive CPS, which requires large volumes of data and devotes much of its processing time to I/O and data manipulation, this thesis presents COBRA, a cloud-based reliability assurance framework.
COBRA provides automated multi-stage runtime reliability evaluation along the CPS workflow using data relocation services, a cloud data store, data quality analysis and process scheduling with self-tuning to achieve scalability, elasticity and efficiency. Finally, in order to provide a generic way to compare and benchmark system reliability for CPS and to extend the approach described above, this thesis presents FARE, a reliability benchmark framework that employs a CPS reliability model and a set of methods and metrics for evaluation environment selection, failure analysis, and reliability estimation. The main contributions of this thesis include validation of the above hypotheses and empirical studies of the ARIS automated online evaluation system, the COBRA cloud-based reliability assurance framework for data-intensive CPS, and the FARE framework for benchmarking reliability of cyber-physical systems. This work has advanced the state of the art in CPS reliability research, expanded the body of knowledge in this field, and provided some useful studies for further research.
Combining SOA and BPM Technologies for Cross-System Process Automation
This paper summarizes the results of an industry case study that introduced a cross-system business process automation solution based on a combination of SOA and BPM standard technologies (i.e., BPMN, BPEL, WSDL). Besides discussing major weaknesses of the existing, custom-built solution and comparing them against experiences with the developed prototype, the paper presents a course of action for transforming the current solution into the proposed solution. This includes a general approach, consisting of four distinct steps, as well as specific action items that are to be performed for every step. The discussion also covers language and tool support and challenges arising from the transformation.
Effort Estimation of Agile and Web-Based Software Using Artificial Neural Networks
The agile methodology of software development is accepted as a superior alternative to conventional methods of software development because of its inherent benefits, such as iterative development, rapid delivery and reduced risk. Hence, software developers are required to estimate efficiently the effort necessary to develop projects by agile methodology, because the requirements keep changing. The Web has become part and parcel of our lives. People depend on the Internet for almost everything these days. Many business units depend on the Internet for communication with clients and for outsourcing load to other branches. In such a scenario, efficient development of web-based software is necessary. To improve the efficiency of software development, resource utilization must be optimal. To achieve this, we need to be able to ascertain effectively what kind of people and materials are required, and in what quantity, for development. This research aims at developing efficient effort estimation models for agile and web-based software by using various neural networks, such as the Feed-Forward Neural Network (FFNN), Radial Basis Function Neural Network (RBFN), Functional Link Artificial Neural Network (FLANN) and Probabilistic Neural Network (PNN), and at providing a comparative assessment of their performance. The approach used for agile software effort estimation is the Story Point Approach and that for web-based software effort estimation is the IFPUG Function Point Approach.
Semantic discovery and reuse of business process patterns
Patterns currently play an important role in modern information systems (IS) development, but their use has mainly been restricted to the design and implementation phases of the development lifecycle. Given the increasing significance of business modelling in IS development, patterns have the potential to provide a viable solution for promoting reusability of recurrent generalized models in the very early stages of development. As a statement of research-in-progress, this paper focuses on business process patterns and proposes an initial methodological framework for the discovery and reuse of business process patterns within the IS development lifecycle. The framework borrows ideas from the domain engineering literature and proposes the use of semantics to drive both the discovery of patterns and their reuse.