28 research outputs found
Technical Reports (2004 - 2009)
Authors of Technical Reports (2005-2009): Choueiry, Berthe Cohen, Myra Deogun, Jitender Dwyer, Matthew Elbaum, Sebastian Goddard, Steve Henninger, Scott Jiang, Hong Lu, Ying Ramamurthy, Byrav Rothermel, Gregg Scott, Stephen Seth, Sharad Soh, Leen-Kiat Srisa-an, Witty Swanson, David Variyam, Vinodchandran Wang, Jun Xu, Lison
Recommended from our members
ReoptSMART: A Learning Query Plan Cache
The task of query optimization in modern relational database systems is important but can be computationally expensive. Parametric query optimization(PQO) has as its goal the prediction of optimal query execution plans based on historical results, without consulting the query optimizer. We develop machine learning techniques that can accurately model the output of a query optimizer. Our algorithms handle non-linear boundaries in plan space and achieve high prediction accuracy even when a limited amount of data is available for training. We use both predicted and actual query execution times for learning, and are the first to demonstrate a total net win of a PQO method over a state-of-the-art query optimizer for some workloads. ReoptSMART realizes savings not only in optimization time, but also in query execution time, for an over-all improvement by more than an order of magnitude in some cases
Multiobjective optimization of classifiers by means of 3-D convex Hull based evolutionary algorithms
The receiver operating characteristic (ROC) and detection error tradeoff (DET) curves are frequently used in the machine learning community to analyze the performance of binary classifiers. Recently, the convex-hull-based multiobjective genetic programming algorithm was proposed and successfully applied to maximize the convex hull area for binary classification problems by minimizing false positive rate and maximizing true positive rate at the same time using indicator-based evolutionary algorithms. The area under the ROC curve was used for the performance assessment and to guide the search. Here we extend this research and propose two major advancements: Firstly we formulate the algorithm in detection error tradeoff space, minimizing false positives and false negatives, with the advantage that misclassification cost tradeoff can be assessed directly. Secondly, we add complexity as an objective function, which gives rise to a 3D objective space (as opposed to a 2D previous ROC space). A domain specific performance indicator for 3D Pareto front approximations, the volume above DET surface, is introduced, and used to guide the indicator -based evolutionary algorithm to find optimal approximation sets. We assess the performance of the new algorithm on designed theoretical problems with different geometries of Pareto fronts and DET surfaces, and two application-oriented benchmarks: (1) Designing spam filters with low numbers of false rejects, false accepts, and low computational cost using rule ensembles, and (2) finding sparse neural networks for binary classification of test data from the UCI machine learning benchmark. The results show a high performance of the new algorithm as compared to conventional methods for multicriteria optimization.info:eu-repo/semantics/submittedVersio
Multiobjective optimization of classifiers by means of 3-D convex hull based evolutionary algorithms
The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.The receiver operating characteristic (ROC) and detection error tradeoff(DET) curves are frequently used in the machine learning community to analyze the performance of binary classifiers. Recently, the convex-hull-based multiobjective genetic programming algorithm was proposed and successfully applied to maximize the convex hull area for binary classifi- cation problems by minimizing false positive rate and maximizing true positive rate at the same time using indicator-based evolutionary algorithms. The area under the ROC curve was used for the performance assessment and to guide the search. Here we extend this re- search and propose two major advancements: Firstly we formulate the algorithm in detec- tion error tradeoffspace, minimizing false positives and false negatives, with the advantage that misclassification cost tradeoffcan be assessed directly. Secondly, we add complexity as an objective function, which gives rise to a 3D objective space (as opposed to a 2D pre- vious ROC space). A domain specific performance indicator for 3D Pareto front approxima- tions, the volume above DET surface, is introduced, and used to guide the indicator-based evolutionary algorithm to find optimal approximation sets. We assess the performance of the new algorithm on designed theoretical problems with different geometries of Pareto fronts and DET surfaces, and two application-oriented benchmarks: (1) Designing spam filters with low numbers of false rejects, false accepts, and low computational cost us- ing rule ensembles, and (2) finding sparse neural networks for binary classification of test data from the UCI machine learning benchmark. The results show a high performance of the new algorithm as compared to conventional methods for multicriteria optimization