168 research outputs found

    Predicting Multi-class Customer Profiles Based on Transactions: a Case Study in Food Sales

    Predicting the class of a customer profile is a key task in marketing, which enables businesses to approach the right customer with the right product at the right time through the right channel to satisfy the customer's evolving needs. However, due to costs, privacy and/or data protection, only the business' owned transactional data is typically available for constructing customer profiles. Predicting the class of customer profiles based on such data is challenging, as the data tends to be very large, heavily sparse and highly skewed. We present a new approach that is designed to efficiently and accurately handle the multi-class classification of customer profiles built using sparse and skewed transactional data. Our approach first bins the customer profiles on the basis of the number of items transacted. The discovered bins are then partitioned and prototypes within each of the discovered bins selected to build the multi-class classifier models. The results obtained from using four multi-class classifiers on real-world transactional data from the food sales domain consistently show the critical numbers of items at which the predictive performance of customer profiles can be substantially improved
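
    The binning and prototype steps described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not say how the bins are discovered or how prototypes are chosen, so the sketch uses quantile bins on the number of distinct items per profile and takes the mean profile of each (bin, class) pair as its prototype. The function names `bin_by_item_count` and `class_prototypes` and the toy data are hypothetical.

```python
import numpy as np

def bin_by_item_count(profiles, n_bins=2):
    """Assign each profile to a quantile bin of its distinct-item count."""
    counts = (profiles > 0).sum(axis=1)
    # Interior quantile edges; np.digitize maps counts to bin ids 0..n_bins-1.
    edges = np.quantile(counts, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(counts, edges)

def class_prototypes(profiles, labels, bins):
    """Mean profile per (bin, class) pair: an assumed stand-in for prototype selection."""
    protos = {}
    for b in np.unique(bins):
        for c in np.unique(labels[bins == b]):
            mask = (bins == b) & (labels == c)
            protos[(int(b), int(c))] = profiles[mask].mean(axis=0)
    return protos

# Toy customers-by-products purchase matrix (rows are customer profiles).
profiles = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 0],
])
labels = np.array([0, 1, 0, 1, 0, 1])

bins = bin_by_item_count(profiles)
prototypes = class_prototypes(profiles, labels, bins)
```

    In a fuller pipeline the prototypes of each bin would be passed to a multi-class classifier trained for that bin; the sketch stops at the prototype step.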

    Reducing Spatial Data Complexity for Classification Models

    Intelligent data analytics is gradually becoming a day-to-day reality of today's businesses. However, despite rapidly increasing storage and computational power, current state-of-the-art predictive models still cannot handle massive and noisy corporate data warehouses. What is more, adaptive and real-time operational environments require multiple models to be frequently retrained, which further hinders their use. Various data reduction techniques, ranging from data sampling to density retention models, attempt to address this challenge by capturing a summarised data structure, yet they either do not account for labelled data or degrade the classification performance of the model trained on the condensed dataset. Our response is a new general framework for reducing the complexity of labelled data by means of controlled spatial redistribution of class densities in the input space. Using the example of the Parzen Labelled Data Compressor (PLDC), we demonstrate a simulation-based data condensation process directly inspired by electrostatic field interaction, in which the data are moved and merged following attracting and repelling interactions with the other labelled data. The process is controlled by the class density function built on the original data, which acts as a class-sensitive potential field ensuring preservation of the original class density distributions, yet allowing data to rearrange and merge, joining together their soft class partitions. As a result we achieved a model that reduces labelled datasets much further than competing approaches, yet with maximum retention of the original class densities and hence of classification performance. PLDC leaves the reduced dataset with soft accumulative class weights, allowing for efficient online updates, and, as shown in a series of experiments, when coupled with the Parzen Density Classifier (PDC) it significantly outperforms competing data condensation methods in terms of classification performance at comparable compression levels.
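
    To make the idea of classifying from a condensed set with soft class weights concrete, here is a minimal sketch of a Parzen-window classifier over weighted prototypes. It is not the PLDC or PDC implementation from the paper: the Gaussian kernel, the shared `bandwidth`, the function names and the toy prototypes are all assumptions made for illustration.

```python
import numpy as np

def parzen_scores(x, prototypes, class_weights, bandwidth=1.0):
    """Unnormalised class densities at x from weighted Gaussian kernels.

    prototypes:    (n_points, n_features) condensed data points
    class_weights: (n_points, n_classes) soft accumulated class weights
    """
    sq_dist = ((prototypes - x) ** 2).sum(axis=1)
    kernel = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    # Each prototype contributes its kernel value, split by its class weights.
    return kernel @ class_weights

def parzen_predict(x, prototypes, class_weights, bandwidth=1.0):
    """Predict the class with the largest weighted kernel density at x."""
    return int(np.argmax(parzen_scores(x, prototypes, class_weights, bandwidth)))

# Three condensed points; the third carries weight from both classes.
prototypes = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 0.0]])
class_weights = np.array([[10.0, 0.0], [0.0, 8.0], [1.0, 1.0]])

near_first = parzen_predict(np.array([0.5, 0.0]), prototypes, class_weights)
near_second = parzen_predict(np.array([3.5, 0.0]), prototypes, class_weights)
```

    Because all classes share one bandwidth, the kernel normalisation constant is the same for every class and can be dropped when only the arg-max is needed. Keeping accumulated weights rather than hard labels is what would allow a new labelled point to be folded in by adding to a row of `class_weights`.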

    Unsupervised Ensembles Techniques for Visualization

    In this paper we introduce two unsupervised techniques for visualization purposes based on the use of ensemble methods. The unsupervised techniques which are often quite sensitive to the presence of outliers are combined with the ensemble approaches in order to overcome the influence of outliers. The first technique is based on the use of Principal Component Analysis and the second one is known for its topology preserving characteristics and is based on the combination of the Scale Invariant Map and Maximum Likelihood Hebbian learning. In order to show the advantage of these novel ensemble-based techniques the results of some experiments carried out on artificial and real data sets are included

    Area-level and individual correlates of active transportation among adults in Germany: A population-based multilevel study

    This study aimed at estimating the prevalence in adults of complying with the aerobic physical activity (PA) recommendation through transportation-related walking and cycling. Furthermore, potential determinants of transportation-related PA recommendation compliance were investigated. 10,872 men and 13,144 women aged 18 years or older participated in the cross-sectional 'German Health Update 2014/15 - EHIS' in Germany. Transportation-related walking and cycling were assessed using the European Health Interview Survey-Physical Activity Questionnaire. Three outcome indicators were constructed: walking, cycling, and total active transportation (>= 600 metabolic equivalent, MET-min/week). Associations were analyzed using multilevel regression analysis. Forty-two percent of men and 39% of women achieved >= 600 MET-min/week with total active transportation. The corresponding percentages for walking were 27% and 28% and for cycling 17% and 13%, respectively. Higher population density, older age, lower income, higher work-related and leisure-time PA, not being obese, and better self-perceived health were positively associated with transportation-related walking and cycling and total active transportation among both men and women. The promotion of walking and cycling among inactive people has great potential to increase PA in the general adult population and to comply with PA recommendations. Several correlates of active transportation were identified which should be considered when planning public health policies and interventions

    Epitaxy: Programmable Atom Equivalents

    The programmability of DNA makes it an attractive structure-directing ligand for the assembly of nanoparticle (NP) superlattices in a manner that mimics many aspects of atomic crystallization. However, the synthesis of multilayer single crystals of defined size remains a challenge. Though previous studies considered lattice mismatch as the major limiting factor for multilayer assembly, thin film growth depends on many interlinked variables. Here, a more comprehensive approach is taken to study fundamental elements, such as the growth temperature and the thermodynamics of interfacial energetics, to achieve epitaxial growth of NP thin films. Both surface morphology and internal thin film structure are examined to provide an understanding of particle attachment and reorganization during growth. Under equilibrium conditions, single crystalline, multilayer thin films can be synthesized over 500 × 500 μm² areas on lithographically patterned templates, whereas deposition under kinetic conditions leads to the rapid growth of glassy films. Importantly, these superlattices follow the same patterns of crystal growth demonstrated in atomic thin film deposition, allowing these processes to be understood in the context of well-studied atomic epitaxy and enabling a nanoscale model to study fundamental crystallization processes. Through understanding the role of epitaxy as a driving force for NP assembly, we are able to realize 3D architectures of arbitrary domain geometry and size. United States. Air Force Office of Scientific Research (AFOSR FA9550-11-1-0275); United States. Air Force Office of Scientific Research (FA9550-12-1-0280); United States. Department of Defense (N00014-15-1-0043); United States. Department of Energy (Grant DE-SC0000989-0002); National Science Foundation (U.S.) (Award DMR-1121262)

    A linguistic analysis of lying in negative evaluations: The speech act performance of Chinese learners of Korean

    This paper is a linguistic analysis of how the speech act of 'lying' is performed by Chinese learners of Korean and by Korean speakers. 'Lying' here is a kind of speech act, alongside requests, apologies and refusals; it belongs to 'negative evaluation' and can be understood as referring to so-called 'white lies' that take the conversation participants or the situation into account. We analyzed the speech acts of 15 Chinese learners of Korean and 15 Korean speakers using a discourse completion test (DCT) and a questionnaire for further explanation (QFE). 'Lying' speech acts were identified by combining the participants' own explanations with the judgments of five experts in Korean language education, and statistical processing of the results led to the following conclusions. Korean speakers were found to tell (white) lies more often than Chinese learners of Korean. Both groups use the 'lying' speech act more when the negative evaluation concerns a person than when it concerns an object. However, neither the social distance (distance) nor the hierarchical relationship (power) between speaker and hearer showed a direct correlation with the use of lies. The study is significant in that it attempts an analysis of negative evaluation and the 'lying' speech act, which have so far been relatively under-researched within speech act studies. It also opens up the possibility of interpreting the differences in speech act performance between Korean speakers and Chinese learners of Korean from the perspective of cultural awareness.

    Outlier Resistant PCA Ensembles

    Statistical re-sampling techniques have been used extensively and successfully in the machine learning approaches for generation of classifier and predictor ensembles. It has been frequently shown that combining so called unstable predictors has a stabilizing effect on and improves the performance of the prediction system generated in this way. In this paper we use the re-sampling techniques in the context of Principal Component Analysis (PCA). We show that the proposed PCA ensembles exhibit a much more robust behaviour in the presence of outliers which can seriously affect the performance of an individual PCA algorithm. The performance and characteristics of the proposed approaches are illustrated on a number of experimental studies where an individual PCA is compared to the introduced PCA ensemble
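
    The resampling idea in this abstract can be sketched as follows. This is one possible construction under stated assumptions, not the authors' algorithm: the abstract does not say how members are drawn or combined, so the sketch fits PCA on small random subsets (drawn without replacement) and combines the members by taking the element-wise median of their rank-one projectors, which avoids the sign ambiguity of principal directions. The function names, subset size and toy data are hypothetical.

```python
import numpy as np

def leading_direction(X):
    """Unit vector of the first principal component of X (rows are samples)."""
    Xc = X - X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[0]

def pca_ensemble_direction(X, n_members=101, subset_size=20, seed=0):
    """Leading direction from an ensemble of PCAs fitted on random subsets."""
    rng = np.random.default_rng(seed)
    projectors = []
    for _ in range(n_members):
        idx = rng.choice(len(X), size=subset_size, replace=False)
        v = leading_direction(X[idx])
        projectors.append(np.outer(v, v))  # v v^T is unchanged by flipping v
    median_projector = np.median(np.stack(projectors), axis=0)
    # eigh returns eigenvalues in ascending order; take the last eigenvector.
    _, eigvecs = np.linalg.eigh(median_projector)
    return eigvecs[:, -1]

# Toy data: variance mostly along the first axis, plus two gross outliers
# along the second axis.
rng = np.random.default_rng(42)
inliers = rng.normal(0.0, [3.0, 0.3], size=(198, 2))
outliers = np.array([[0.0, 100.0], [0.0, -100.0]])
X = np.vstack([inliers, outliers])

single = leading_direction(X)          # pulled towards the outlier axis
ensemble = pca_ensemble_direction(X)   # stays close to the inlier axis
```

    Small subsets matter in this sketch: with 2 outliers among 200 points, most subsets of 20 contain no outlier at all, so the median is dominated by uncontaminated members.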

    Epitaxy: Programmable Atom Equivalents Versus Atoms

    The programmability of DNA makes it an attractive structure-directing ligand for the assembly of nanoparticle (NP) superlattices in a manner that mimics many aspects of atomic crystallization. However, the synthesis of multilayer single crystals of defined size remains a challenge. Though previous studies considered lattice mismatch as the major limiting factor for multilayer assembly, thin film growth depends on many interlinked variables. Here, a more comprehensive approach is taken to study fundamental elements, such as the growth temperature and the thermodynamics of interfacial energetics, to achieve epitaxial growth of NP thin films. Both surface morphology and internal thin film structure are examined to provide an understanding of particle attachment and reorganization during growth. Under equilibrium conditions, single crystalline, multilayer thin films can be synthesized over 500 × 500 μm² areas on lithographically patterned templates, whereas deposition under kinetic conditions leads to the rapid growth of glassy films. Importantly, these superlattices follow the same patterns of crystal growth demonstrated in atomic thin film deposition, allowing these processes to be understood in the context of well-studied atomic epitaxy and enabling a nanoscale model to study fundamental crystallization processes. Through understanding the role of epitaxy as a driving force for NP assembly, we are able to realize 3D architectures of arbitrary domain geometry and size.
