1,239 research outputs found

    Cost-Sensitive Classification: Empirical Evaluation of a Hybrid Genetic Decision Tree Induction Algorithm

    This paper introduces ICET, a new algorithm for cost-sensitive classification. ICET uses a genetic algorithm to evolve a population of biases for a decision tree induction algorithm. The fitness function of the genetic algorithm is the average cost of classification when using the decision tree, including both the costs of tests (features, measurements) and the costs of classification errors. ICET is compared here with three other algorithms for cost-sensitive classification - EG2, CS-ID3, and IDX - and also with C4.5, which classifies without regard to cost. The five algorithms are evaluated empirically on five real-world medical datasets. Three sets of experiments are performed. The first set examines the baseline performance of the five algorithms on the five datasets and establishes that ICET performs significantly better than its competitors. The second set tests the robustness of ICET under a variety of conditions and shows that ICET maintains its advantage. The third set looks at ICET's search in bias space and discovers a way to improve the search. Comment: See http://www.jair.org/ for any accompanying file.
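
    The fitness measure described in this abstract, the average cost of classification including both test costs and error costs, can be sketched as follows. This is only an illustration under stated assumptions, not ICET itself: the function name, the toy tree, the feature names, and all cost values are hypothetical, and the genetic search over biases is not shown.

```python
from typing import Callable, Dict, List, Tuple

Case = Dict[str, float]


def average_classification_cost(
    cases: List[Tuple[Case, str]],
    classify: Callable[[Case], Tuple[str, List[str]]],
    test_costs: Dict[str, float],
    error_costs: Dict[Tuple[str, str], float],
) -> float:
    """Mean cost per case: cost of the tests used plus cost of any error.

    `classify` returns (predicted_label, tests_used). `error_costs` is keyed
    by (actual_label, predicted_label); missing keys cost nothing.
    """
    total = 0.0
    for features, actual in cases:
        predicted, tests_used = classify(features)
        # Each distinct test is paid for once per case.
        total += sum(test_costs[t] for t in set(tests_used))
        total += error_costs.get((actual, predicted), 0.0)
    return total / len(cases)


def toy_tree(features: Case) -> Tuple[str, List[str]]:
    """Hypothetical two-level tree: the cheap test first, the costly one if needed."""
    if features["cheap_test"] > 0.5:
        return "sick", ["cheap_test"]
    if features["costly_test"] > 0.5:
        return "sick", ["cheap_test", "costly_test"]
    return "healthy", ["cheap_test", "costly_test"]


# Hypothetical data and costs, chosen only to exercise the function.
cases = [
    ({"cheap_test": 0.9, "costly_test": 0.1}, "sick"),
    ({"cheap_test": 0.2, "costly_test": 0.8}, "sick"),
    ({"cheap_test": 0.1, "costly_test": 0.2}, "sick"),
    ({"cheap_test": 0.3, "costly_test": 0.1}, "healthy"),
]
test_costs = {"cheap_test": 1.0, "costly_test": 10.0}
error_costs = {("sick", "healthy"): 50.0, ("healthy", "sick"): 20.0}

print(average_classification_cost(cases, toy_tree, test_costs, error_costs))
```

    A tree that reaches the same accuracy while ordering fewer or cheaper tests scores a lower average cost, which is the quantity a cost-sensitive search would try to minimise.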

    The management of context-sensitive features: A review of strategies

    In this paper, we review five heuristic strategies for handling context-sensitive features in supervised machine learning from examples. We discuss two methods for recovering lost (implicit) contextual information. We mention some evidence that hybrid strategies can have a synergetic effect. We then show how the work of several machine learning researchers fits into this framework. While we do not claim that these strategies exhaust the possibilities, it appears that the framework includes all of the techniques that can be found in the published literature on context-sensitive learning.

    How to Shift Bias: Lessons from the Baldwin Effect

    Longmeyer Exposes or Creates Uncertainty about the Duty to Inform Remainder Beneficiaries of a Revocable Trust

    This article discusses the surprising Longmeyer decision, handed down by the Supreme Court of Kentucky earlier this year, in which a predecessor trustee was held to have a duty to give certain notifications to former remainder beneficiaries of a revocable trust. The authors then examine how Longmeyer might have been decided in other states and under other statutory schemes. The article concludes with observations concerning when certain notices to trust beneficiaries may be conducive to effective trust administration and suggestions to those who administer trusts on how best to comply with beneficiary notice requirements.

    SentiCircles for contextual and conceptual semantic sentiment analysis of Twitter

    Lexicon-based approaches to Twitter sentiment analysis are gaining much popularity due to their simplicity, domain independence, and relatively good performance. These approaches rely on sentiment lexicons, where a collection of words is marked with fixed sentiment polarities. However, words’ sentiment orientation (positive, neutral, negative) and/or sentiment strengths could change depending on context and targeted entities. In this paper we present SentiCircle, a novel lexicon-based approach that takes into account the contextual and conceptual semantics of words when calculating their sentiment orientation and strength in Twitter. We evaluate our approach on three Twitter datasets using three different sentiment lexicons. Results show that our approach significantly outperforms two lexicon baselines. Results are competitive but inconclusive when compared with the state-of-the-art SentiStrength, and vary from one dataset to another. SentiCircle outperforms SentiStrength in accuracy on average, but falls marginally behind in F-measure.

    Climate change response: a report to establish the knowledge required for a TIANZ response and policy formulation with the Government post Kyoto Protocol ratification

    The Tourism Industry Association of New Zealand commissioned this report ‘as a definitive reference point for the Tourism sector with regard to its greenhouse gas emissions (CO₂) and the potential impacts on the sector, in order to establish the underpinning knowledge required for a subsequent TIANZ response and policy formulation with the Government post the Kyoto Protocol ratification’. The value of the tourism sector, in terms of GDP and employment, is self-evident, but there is also growing awareness of the New Zealand environment by the international market, which is critical to New Zealand’s future prosperity. Both the tourism sector and the Government recognise the importance of the ‘state of New Zealand’s environment’ and the need to genuinely sustain the image of ‘100% Pure New Zealand’, as it is implicitly linked to maintaining credibility and growth in a highly competitive market. Prepared for the Tourism Industry Association New Zealand (TIANZ), Landcare Research Contract Report, LC0102/107.

    A Cross-Lingual Similarity Measure for Detecting Biomedical Term Translations

    Bilingual dictionaries for technical terms such as biomedical terms are an important resource for machine translation systems as well as for humans who would like to understand a concept described in a foreign language. Often a biomedical term is first proposed in English and later it is manually translated to other languages. Despite the fact that there are large monolingual lexicons of biomedical terms, only a fraction of those term lexicons are translated to other languages. Manually compiling large-scale bilingual dictionaries for technical domains is a challenging task because it is difficult to find a sufficiently large number of bilingual experts. We propose a cross-lingual similarity measure for detecting the most similar translation candidates for a biomedical term specified in one language (source) from another language (target). Specifically, a biomedical term in a language is represented using two types of features: (a) intrinsic features that consist of character n-grams extracted from the term under consideration, and (b) extrinsic features that consist of unigrams and bigrams extracted from the contextual windows surrounding the term under consideration. We propose a cross-lingual similarity measure using each of those feature types. First, to reduce the dimensionality of the feature space in each language, we propose prototype vector projection (PVP), a non-negative lower-dimensional vector projection method. Second, we propose a method to learn a mapping between the feature spaces in the source and target language using partial least squares regression (PLSR). The proposed method requires only a small number of training instances to learn a cross-lingual similarity measure. The proposed PVP method outperforms popular dimensionality reduction methods such as the singular value decomposition (SVD) and non-negative matrix factorization (NMF) in a nearest neighbor prediction task. Moreover, our experimental results covering several language pairs such as English–French, English–Spanish, English–Greek, and English–Japanese show that the proposed method outperforms several other feature projection methods in biomedical term translation prediction tasks.
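
    The intrinsic features mentioned in this abstract, character n-grams extracted from a term, can be illustrated with a small sketch. It is not the authors' pipeline: PVP and PLSR are not implemented, the n-gram range of 2 to 3 is an assumption, and plain cosine similarity over raw n-gram counts stands in for the learned cross-lingual measure. The function names and example terms are hypothetical.

```python
from collections import Counter
from math import sqrt


def char_ngrams(term: str, n_min: int = 2, n_max: int = 3) -> Counter:
    """Count all character n-grams of the lower-cased term for n in [n_min, n_max]."""
    text = term.lower()
    counts: Counter = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical English source term and French candidate translations.
source = char_ngrams("hepatitis")
for candidate in ["hépatite", "rein"]:
    print(candidate, round(cosine(source, char_ngrams(candidate)), 3))
```

    Surface overlap of this kind only helps for language pairs that share a script and cognates, which is one reason a learned mapping between the source and target feature spaces is needed for pairs such as English–Japanese.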