1,165 research outputs found

    Binarized support vector machines

    The widely used Support Vector Machine (SVM) method has been shown to yield very good results in Supervised Classification problems. Other methods, such as Classification Trees, have nevertheless become more popular among practitioners than SVM thanks to their interpretability, which is an important issue in Data Mining. In this work, we propose an SVM-based method that automatically detects the most important predictor variables and the role they play in the classifier. In particular, the proposed method is able to detect those values and intervals which are critical for the classification. The method involves the optimization of a Linear Programming problem with a large number of decision variables. The numerical experience reported shows that a rather direct use of the standard Column-Generation strategy leads to a classification method which, in terms of classification ability, is competitive with the standard linear SVM and Classification Trees. Moreover, the proposed method is robust, i.e., it is stable in the presence of outliers and invariant to changes of scale or measurement units of the predictor variables. When the complexity of the classifier is an important issue, a wrapper feature selection method is applied, yielding simpler, still competitive, classifiers.
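The abstract does not spell out how predictors are binarized, so the following is only a minimal sketch of one plausible reading: each predictor is replaced by 0/1 indicators of the form "value >= cutoff", with the observed values used as candidate cutoffs. The function names `candidate_cutoffs` and `binarize` are hypothetical, and the Linear Programming and Column-Generation steps are not shown. The sketch does illustrate why such indicators are unaffected by a positive rescaling of a predictor.

```python
def candidate_cutoffs(rows):
    """Assumption: use the distinct observed values of each variable as cutoffs."""
    n_vars = len(rows[0])
    return [sorted({row[j] for row in rows}) for j in range(n_vars)]


def binarize(rows, cutoffs):
    """Map each numeric row to 0/1 indicators, one per (variable, cutoff) pair."""
    features = []
    for row in rows:
        bits = []
        for j, value in enumerate(row):
            for b in cutoffs[j]:
                # Indicator is 1 when the variable reaches the cutoff.
                bits.append(1 if value >= b else 0)
        features.append(bits)
    return features


rows = [[1.0, 10.0], [2.0, 5.0], [3.0, 7.5]]
binary = binarize(rows, candidate_cutoffs(rows))

# Changing the measurement unit (here, multiplying by 4) rescales the
# cutoffs as well, so the binary features stay the same.
scaled = [[4.0 * v for v in row] for row in rows]
binary_scaled = binarize(scaled, candidate_cutoffs(scaled))
```

A linear classifier trained on such indicators assigns one weight per (variable, cutoff) pair, which is one way the critical values and intervals mentioned in the abstract could be read off.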

    Benefits of smart breathers

    Moisture is always present in the insulation system and has a great impact on the life of the insulation, yet it is not usually considered in loss-of-life calculations.

    Molecular epidemiology studies on risk factors for breast cancer and disease aggressiveness

    Breast cancer is a heterogeneous disease. Aggressive subtypes are characterized by faster growth rates and an increased capability to invade and metastasize, leading to poorer clinical outcomes. In this thesis, we use a molecular epidemiology approach to investigate the association between risk factors and aggressive breast cancer defined by tumor characteristics, intrinsic subtypes, mode of detection, and survival. Using a variety of methods, we analyzed data from well-characterized breast cancer cohorts in Sweden, genome-wide association studies, and gene expression profiling of tumors. In Paper I, we found that breast cancer genetic load, defined by rare deleterious variants in 31 breast cancer genes, is, unlike common variants, positively associated with unfavorable tumor characteristics, patient survival, and mode of detection. In Paper II, we observed that women with low breast cancer risk, as defined by the Tyrer-Cuzick risk score, were more likely to develop aggressive tumors. We computed a low-risk gene expression profile that was consistently associated with worse prognosis. In addition, our analysis showed that increased proliferation, rather than estrogen status, underlies this association. In Paper III, we examined gene expression profiles in a subset of aggressive breast cancer tumors known as interval cancers. By taking mammographic density and intrinsic PAM50 subtypes into account, we found an interval cancer gene expression profile to be associated with immune subtypes in breast cancer, particularly those involving interferon response. In Paper IV, we show that breast cancer has a shared immune-related genetic component with celiac disease, an autoimmune disorder. Consistent with previous epidemiological findings, we found that a higher genetic load for celiac disease was associated with lower breast cancer risk.
    Overall, this thesis aims to provide scientific evidence towards a better understanding of the factors underlying the development of aggressive breast cancers, which could shed light on the design of better preventative strategies aimed at lowering disease mortality.

    Binarized support vector machines

    The widely used Support Vector Machine (SVM) method has been shown to yield very good results in Supervised Classification problems. Other methods, such as Classification Trees, have nevertheless become more popular among practitioners than SVM thanks to their interpretability, which is an important issue in Data Mining. In this work, we propose an SVM-based method that automatically detects the most important predictor variables and the role they play in the classifier. In particular, the proposed method is able to detect those values and intervals which are critical for the classification. The method involves the optimization of a Linear Programming problem with a large number of decision variables. The numerical experience reported shows that a rather direct use of the standard Column-Generation strategy leads to a classification method which, in terms of classification ability, is competitive with the standard linear SVM and Classification Trees. Moreover, the proposed method is robust, i.e., it is stable in the presence of outliers and invariant to changes of scale or measurement units of the predictor variables. When the complexity of the classifier is an important issue, a wrapper feature selection method is applied, yielding simpler, still competitive, classifiers.
    Keywords: Supervised classification, Binarization, Column generation, Support vector machines

    A dissimilarity-based approach for Classification

    The Nearest Neighbor classifier has been shown to be a powerful tool for multiclass classification. In this note we explore both the theoretical properties and the empirical behavior of a variant of this method, in which the Nearest Neighbor rule is applied after selecting a set of so-called prototypes, whose cardinality is fixed in advance, by minimizing the empirical misclassification cost. With this we alleviate the two serious drawbacks of the Nearest Neighbor method: high storage requirements and time-consuming queries. The problem is shown to be NP-Hard. Mixed Integer Programming (MIP) programs are formulated, theoretically compared, and solved by a standard MIP solver for problem instances of small size. Large problem instances are solved by a metaheuristic yielding good classification rules in reasonable time.
    Subject: operations research and management science
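To make the prototype idea concrete, here is a minimal sketch under stated assumptions: Euclidean distance stands in for the dissimilarity, every misclassification has unit cost, and the prototypes are chosen by brute-force enumeration of subsets. The enumeration is a stand-in for illustration only; it is neither the MIP formulations nor the metaheuristic from the note, and it is practical only for tiny instances. All function names are hypothetical.

```python
import math
from itertools import combinations


def nearest_prototype_label(x, prototypes):
    """Return the label of the prototype closest to x (Euclidean distance)."""
    point, label = min(prototypes, key=lambda p: math.dist(x, p[0]))
    return label


def misclassification_rate(samples, prototypes):
    """Empirical misclassification cost, assuming unit cost per error."""
    errors = sum(
        1 for x, y in samples if nearest_prototype_label(x, prototypes) != y
    )
    return errors / len(samples)


def best_prototypes(samples, k):
    """Brute force: try every subset of k samples as the prototype set."""
    return min(
        combinations(samples, k),
        key=lambda protos: misclassification_rate(samples, protos),
    )


samples = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.0), "a"),
    ((1.0, 1.0), "b"),
    ((0.9, 1.0), "b"),
]
protos = best_prototypes(samples, 2)
```

Once the prototypes are fixed, a query compares the new point against only `k` stored points instead of the whole training set, which is the storage and query-time saving the abstract refers to.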

    Supervised Classification and Mathematical Optimization

    Data Mining techniques often require solving optimization problems. Supervised Classification, and, in particular, Support Vector Machines, can be seen as a paradigmatic instance. In this paper, some links between Mathematical Optimization methods and Supervised Classification are emphasized. It is shown that many different areas of Mathematical Optimization play a central role in off-the-shelf Supervised Classification methods. Moreover, Mathematical Optimization turns out to be extremely useful for addressing important issues in Classification, such as identifying relevant variables, improving the interpretability of classifiers, or dealing with vagueness or noise in the data.

    Settlement of a Light Rail Pier Supported on Large Diameter Bored Piles Remediated by Jet Grouting

    A support pier (Pier 161) for a Light Rail line being constructed for the Metro Manila Light Rail Project experienced large settlements after the installation of the precast deck girders. This pier supports a bridge crossing over the San Juan River with a total span of sixty (60) meters. The pier is supported on six (6) 1500 mm diameter bored piles designed to extend down to 17 meters or to be socketed at least 2.0 meters into bedrock, based on design requirements. The structure started to settle during the erection of the superstructure, when the dead load reached about 700 metric tons. Total settlement was about 42 mm when the erection was halted at a dead load of about 1600 metric tons. The pier was designed to carry a maximum total load of about 2100 metric tons (DL + LL). A subsequent subsurface investigation conducted by our office indicated that the bored piles were terminated prematurely and were not socketed into bedrock as originally specified. The bored pile tips were resting on approximately 150 mm of soft to very soft clay and highly weathered bedrock, which is partly natural soil and partly drill cuttings. Several remediation procedures were considered, but finally jet grouting was selected. This paper discusses the problems associated with the settlement and the ensuing solution using jet grouted piles.

    The Ciao clp(FD) library. A modular CLP extension for Prolog

    We present a new free library for Constraint Logic Programming over Finite Domains, included with the Ciao Prolog system. The library is entirely written in Prolog, leveraging Ciao's module system and code transformation capabilities in order to achieve a highly modular design without compromising performance. We describe the interface, implementation, and design rationale of each modular component. The library meets several design goals: a high level of modularity, allowing the individual components to be replaced by different versions; high efficiency, being competitive with other FD implementations; a glass-box approach, so the user can specify new constraints at different levels; and a Prolog implementation, in order to ease the integration with Ciao's code analysis components. The core is built upon two small libraries which implement integer ranges and closures. On top of that, a finite domain variable datatype is defined, taking care of constraint reexecution depending on range changes. These three libraries form what we call the FD kernel of the library. This FD kernel is used in turn to implement several higher-level finite domain constraints, specified using indexicals. Together with a labeling module, this layer forms what we name the FD solver. A final level integrates the CLP(FD) paradigm with our FD solver. This is achieved using attributed variables and a compiler from the CLP(FD) language to the set of constraints provided by the solver. It should be noted that the user of the library is encouraged to work at any of those levels as convenient: from writing a new range module to enriching the set of FD constraints by writing new indexicals.

    The regularity of a toric variety

    We give a method for computing the degrees of the minimal syzygies of a toric variety by means of combinatorial techniques. Indeed, we complete the explicit description of the minimal free resolution of the associated semigroup algebra, using the simplicial representation of Koszul homology which appeared in A. Campillo and C. Marijuán (1991, Sém. Théor. Nombres Bordeaux 3, 249–260). As an application, we obtain an algorithm for computing the Castelnuovo–Mumford regularity of a projective toric variety. This regularity is explicitly bounded by means of the semigroup generators which parametrize the variety.