21 research outputs found

    Data Mining with Skewed Data

    SLCO1B1 rs4149056 polymorphism associated with statin-induced myopathy is differently distributed according to ethnicity in the Brazilian general population: Amerindians as a high risk ethnic group

    Background. Recent studies reported an association between SLCO1B1 polymorphisms and the development of statin-induced myopathy. Because the Brazilian population is one of the most heterogeneous in the world, the main aim here was to evaluate SLCO1B1 polymorphisms according to ethnic group as an initial step for future pharmacogenetic studies. Methods. One hundred and eighty-two Amerindians plus 1,032 subjects from the general urban population were included. Genotypes for the SLCO1B1 rs4149056 (c.T521C, p.V174A, exon 5) and SLCO1B1 rs4363657 (g.T89595C, intron 11) polymorphisms were detected by polymerase chain reaction followed by high resolution melting analysis with the Rotor Gene 6000® instrument. Results. The frequencies of the SLCO1B1 rs4149056 and rs4363657 C variant alleles were higher in Amerindians (28.3% and 26.1%) and lower in African descent subjects (5.7% and 10.8%) compared with the Mulatto (14.9% and 18.2%) and Caucasian descent (14.8% and 15.4%) ethnic groups (p < 0.001 and p < 0.001, respectively). Linkage disequilibrium analysis shows that these variant alleles are in different linkage disequilibrium patterns depending on ethnic origin. Conclusion. Our findings indicate interethnic differences in the SLCO1B1 rs4149056 C risk allele frequency among Brazilians. These data will be useful in the development of effective programs for stratifying individuals regarding adherence, efficacy and choice of statin type. Acknowledgments. PCJLS is the recipient of a fellowship from FAPESP, Proc. 2010-17465-8, Brazil. The technical assistance of the Laboratory of Genetics and Molecular Cardiology group, Heart Institute, is gratefully acknowledged.
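
As a rough illustration of how a variant allele frequency such as those reported above is derived from genotype counts, the sketch below counts two variant alleles per homozygous carrier and one per heterozygote. The function name and the genotype counts are hypothetical and are not taken from the study, which reports only the resulting percentages.

```python
def variant_allele_frequency(n_ref_hom, n_het, n_var_hom):
    """Share of variant alleles among all chromosomes genotyped."""
    n_subjects = n_ref_hom + n_het + n_var_hom
    if n_subjects == 0:
        raise ValueError("no genotyped subjects")
    # Each subject carries two alleles; heterozygotes carry one variant copy.
    variant_alleles = 2 * n_var_hom + n_het
    return variant_alleles / (2 * n_subjects)


# Hypothetical counts for illustration only (TT, TC, CC); not study data.
freq_c = variant_allele_frequency(n_ref_hom=50, n_het=40, n_var_hom=10)
print(f"C allele frequency: {freq_c:.1%}")
```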

    End-to-end neural network architecture for fraud scoring in card payments

    Millions of euros are lost every year due to fraudulent card transactions. The design and implementation of efficient fraud detection methods is mandatory to minimize such losses. In this paper, we present a neural network based system for fraud detection in banking systems. We use a real-world dataset and describe an end-to-end solution from the practitioner's perspective, focusing on the following crucial aspects: class imbalance, data processing and cost metric evaluation. Our analysis shows that the proposed solution achieves performance comparable to state-of-the-art proprietary and costly solutions. (c) 2017 Elsevier B.V. All rights reserved. Citation: Gomez, J.; Arévalo, J.; Paredes Palacios, R.; Nin, J. (2018). End-to-end neural network architecture for fraud scoring in card payments. Pattern Recognition Letters. 105:175-181. https://doi.org/10.1016/j.patrec.2017.08.024

    A comparison of classification methods applied to credit card fraud detection

    In recent years, many bio-inspired algorithms have emerged for solving classification problems. Confirming this, in 2002 the journal Nature published an article that already anticipated, for 2003, the commercial use of Artificial Immune Systems for fraud detection in financial institutions by a British company. Despite this, to the best of our knowledge, no scientific publication with promising results has appeared since then. Our work applied Artificial Immune Systems (AIS) to credit card fraud detection. We compared AIS with Decision Trees (DT), Neural Networks (NN), Bayesian Networks (BN) and Naive Bayes (NB). For a fairer comparison between the methods, exhaustive search and a genetic algorithm (GA) were used to select an optimized parameter set, in the sense of minimizing the cost of fraud on a credit card database provided by a Brazilian credit card issuer. In addition to this optimization, we also carried out a multi-resolution analysis and search for more robust parameters, which are presented in this work. Characteristics specific to fraud datasets, such as data imbalance and the different costs of false positives and false negatives, were taken into account. All runs were carried out in Weka, a public, open-source software package, and test sets were always used to validate the classifiers. The results obtained are consistent with Maes et al., who show that BN are better than NN; although NN is one of the most widely used methods today, for our database and our implementations it is among the worst methods. Despite its poor result with default parameters, AIS obtained the best result with the GA-optimized parameters, which led DT and AIS to present the best and most robust results among all the methods tested.

    On 31 January 2002, the journal Nature published a news item about immune-based systems. Among the applications considered was the detection of fraudulent financial transactions, with the possibility of commercial use of such a system by a British company as early as 2003. In spite of that, we do not know of any scientific publication that uses Artificial Immune Systems in financial fraud detection. This work reports very satisfactory results on the application of Artificial Immune Systems (AIS) to credit card fraud detection. In fact, scientific publications on financial fraud detection are quite rare, as Phua et al. [PLSG05] point out, in particular for credit card transactions. Phua et al. identify the lack of a public database of financial fraud transactions as the main cause of such a small number of publications. Two of the most important publications on this subject that report results from their implementations are the award-winning Maes (2000), which compares Neural Networks and Bayesian Networks in credit card fraud detection, with results favoring Bayesian Networks, and Stolfo et al. (1997), which proposed the AdaCost method. This thesis joins both these works in publishing results in credit card fraud detection. Moreover, despite the unavailability of Maes's data and implementations, we reproduce their results and extend the set of comparisons to cover Neural Networks, Bayesian Networks, Artificial Immune Systems, Decision Trees, and even the simple Naive Bayes. We reproduce, in a certain way, the results of Stolfo et al. (1997) when we verify that the use of a cost-sensitive meta-heuristic, in fact a generalization of the step from AdaBoost to AdaCost, applied to the tested methods substantially improves performance for all methods except Naive Bayes. Our analysis took into account the skewed nature of the dataset, as well as the need for parameter tuning, sometimes through genetic algorithms, in order to obtain the best results from each compared method.
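
The abstract stresses that false positives and false negatives carry different costs on skewed fraud data. The sketch below shows one simple way such a cost can be scored and used to pick a decision threshold. It is not the thesis's Weka setup or AdaCost; the function names, the cost values (1 per false alarm, 100 per missed fraud) and the toy scores are all assumptions made for illustration.

```python
def total_cost(labels, scores, threshold, cost_fp=1.0, cost_fn=100.0):
    """Sum of misclassification costs when flagging scores >= threshold."""
    cost = 0.0
    for is_fraud, score in zip(labels, scores):
        flagged = score >= threshold
        if flagged and not is_fraud:
            cost += cost_fp   # legitimate transaction flagged: false positive
        elif not flagged and is_fraud:
            cost += cost_fn   # fraud let through: false negative
    return cost


def best_threshold(labels, scores, cost_fp=1.0, cost_fn=100.0):
    """Try each observed score (and 'flag nothing') and keep the cheapest."""
    candidates = sorted(set(scores)) + [float("inf")]
    return min(candidates,
               key=lambda t: total_cost(labels, scores, t, cost_fp, cost_fn))


# Toy, made-up data: 1 marks fraud, 0 marks a legitimate transaction.
labels = [0, 0, 0, 0, 1, 0, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.35, 0.6, 0.9]
chosen = best_threshold(labels, scores)
print(chosen, total_cost(labels, scores, chosen))
```

With a missed fraud costing far more than a false alarm, the cheapest threshold sits low enough to catch both frauds even though it flags two legitimate transactions.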

    Analyzing Safe Haven, Hedging and Diversifier Characteristics of Heterogeneous Cryptocurrencies against G7 and BRICS Market Indexes

    Cryptocurrency markets have experienced large growth in recent years, with an increase in the number and diversity of traded assets. Previous work has addressed the economic properties of Bitcoin with regard to its hedging or diversification properties. However, the surge of many alternatives, applications, and decentralized finance services on a variety of blockchain networks requires a re-examination of those properties, including indexes from outside the big economies and a wider variety of cryptocurrencies. In this paper, we report the results of studying the most representative cryptocurrency of each consensus mechanism by trading volume, forming a list of twenty-four cryptocurrencies observed from the 1st of January 2018 to the 30th of September 2022. Using the Baur and McDermott model, we examine the hedge, safe haven, and diversifier properties of all assets against the major indexes of all G7 countries and all BRICS countries, breaking the results down by two attributes: kind of blockchain technology and pre/during the COVID health crisis. Results show that both attributes play an important role in the hedge, safe haven, and diversifier properties associated with the asset. Concretely: stablecoins appear to be the only assets that maintain the hedge property in most analyzed markets both pre- and during COVID; Bitcoin's investment properties shifted after the COVID crisis started; and China and Russia stopped being correlated with the cryptocurrency after the COVID crisis hit.
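
The abstract names the Baur and McDermott model without spelling it out. As commonly stated, that model regresses an asset's return on the market return plus the market return interacted with dummies for the market's worst 10%, 5% and 1% of days. The sketch below only builds those regressors; the coefficient estimation and the GARCH error structure are omitted, and the function name, the quantile convention and the toy returns are assumptions rather than the paper's implementation.

```python
def stress_regressors(market_returns, percents=(10, 5, 1)):
    """Build rows [r, r*D10, r*D5, r*D1] for each market return r.

    D_p is 1 when r falls in the lowest p percent of observed market
    returns and 0 otherwise (empirical quantiles; an assumed convention).
    """
    n = len(market_returns)
    ordered = sorted(market_returns)
    # Threshold for each tail: the k-th smallest return, k = p percent of n.
    thresholds = [ordered[max(1, n * p // 100) - 1] for p in percents]
    rows = []
    for r in market_returns:
        rows.append([r] + [r if r <= t else 0.0 for t in thresholds])
    return rows


# Toy, made-up market returns: 100 evenly spaced values from -5% to +4.9%.
market = [(i - 50) / 1000 for i in range(100)]
rows = stress_regressors(market)
tail_sizes = [sum(1 for row in rows if row[col] != 0.0) for col in (1, 2, 3)]
print(tail_sizes)
```

In a fitted regression, the coefficient on the first column describes the average relationship (hedge or diversifier), while the coefficients on the tail columns describe how that relationship changes under market stress (safe haven).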

    An empirical approach and practical framework for a decentralized Ethereum Ecosystem Index (EEI)

    Stock market indices are pivotal tools for establishing market benchmarks, enabling investors to navigate risk and volatility while capitalizing on the stock market’s prospects through index funds. For participants in decentralized finance (DeFi), the formulation of a token index emerges as a vital resource. Nevertheless, this endeavor is complex, encompassing challenges such as transaction fees and the variable availability of tokens, attributed to their brief history or limited liquidity. This research introduces an index tailored for the Ethereum ecosystem, the leading smart contract platform, and conducts a comparative analysis of capitalization-weighted (CW) and equal-weighted (EW) index performances. The article delineates exhaustive criteria for token eligibility, intending to serve as a comprehensive guide for fellow researchers. The results indicate a consistent superior performance of CW indices over EW indices in terms of return and risk metrics, with a 30-constituent CW index outshining its counterparts with varied constituent numbers. The recommended CW30 index demonstrates substantial advantages in comparison to established benchmarks, including prominent indices like DeFi Pulse Index (DPI) and CRypto IndeX (CRIX). Additionally, the article explores the practicality of implementing the CW30 in Layer 2 networks of the Ethereum Ecosystem, advocating for the Arbitrum infrastructure as the optimal choice for the decentralized crypto index protocol herein referred to as the Ethereum Ecosystem Index (EEI). The study’s insights aspire to enrich the DeFi ecosystem, offering a nuanced understanding of network selection and a strategic framework for implementation. This research significantly enhances the existing literature on index construction and performance within the Ethereum ecosystem. To our knowledge, it represents a pioneering comprehensive analysis of an index that accurately mirrors the Ethereum market, advancing our comprehension of its intricacies and wider ramifications. Moreover, this study stands as one of the initial thorough examinations of index construction methodologies within the nascent asset class of crypto. The insights gleaned provide a pragmatic approach to index construction and introduce an index poised to serve as a benchmark for index products. In illuminating the unique facets of the Ethereum ecosystem, this research makes a substantial contribution to the current discourse on crypto, offering valuable perspectives for investors, market stakeholders, and the ongoing exploration of digital assets.
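
To make the capitalization-weighted (CW) versus equal-weighted (EW) comparison concrete, the sketch below computes both weightings and the resulting one-period index return. The token names, market caps and returns are hypothetical, and the paper's eligibility criteria, rebalancing schedule and fee handling are not modeled.

```python
def cap_weights(market_caps):
    """Weight each token by its share of total market capitalization."""
    total = sum(market_caps.values())
    return {token: cap / total for token, cap in market_caps.items()}


def equal_weights(market_caps):
    """Give every constituent the same weight."""
    n = len(market_caps)
    return {token: 1.0 / n for token in market_caps}


def index_return(weights, returns):
    """One-period index return as the weighted sum of constituent returns."""
    return sum(w * returns[token] for token, w in weights.items())


# Hypothetical constituents, caps and period returns (not from the paper).
caps = {"AAA": 600.0, "BBB": 300.0, "CCC": 100.0}
rets = {"AAA": 0.10, "BBB": -0.05, "CCC": 0.20}

cw_ret = index_return(cap_weights(caps), rets)
ew_ret = index_return(equal_weights(caps), rets)
print(f"CW: {cw_ret:.4f}  EW: {ew_ret:.4f}")
```

The two schemes diverge whenever small-cap and large-cap constituents perform differently, which is why the choice of weighting affects both return and risk metrics.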

    Investigation of Genetic Disturbances in Oxygen Sensing and Erythropoietin Signaling Pathways in Cases of Idiopathic Erythrocytosis

    Background. Idiopathic erythrocytosis is the term reserved for cases in which the origin of abnormally increased hemoglobin remains unexplained after initial investigation. Extensive molecular investigation of genes associated with the oxygen sensing and erythropoietin signaling pathways in those cases usually involves sequencing all of their exons, which may be time consuming. Aim. To apply a strategy for the molecular investigation of patients with idiopathic erythrocytosis regarding the oxygen sensing and erythropoietin signaling pathways. Methods. Samples from patients with idiopathic erythrocytosis were evaluated for the EPOR, VHL, PHD2, and HIF-2α genes using bidirectional sequencing of their hotspots. Results. One case was associated with a HIF-2α mutation. Sequencing did not identify any pathogenic mutation in any of the studied genes in 4 of the 5 cases studied. Three known nonpathogenic polymorphisms were found (VHL p.P25L, rs35460768; HIF-2α p.N636N, rs35606117; HIF-2α p.P579P, rs184760160). Conclusion. Extensive molecular investigation of cases considered idiopathic erythrocytosis does not frequently change the treatment of the patient. However, we propose a complementary molecular investigation of those cases, comprising genes associated with the erythrocytosis phenotype, to meet both academic and genetic counseling purposes.