52 research outputs found

    Relationship between product based loyalty and clustering based on supermarket visit and spending patterns

    Loyalty of customers to a supermarket can be measured in a variety of ways. If a customer tends to buy from certain categories of products, it is likely that the customer is loyal to the supermarket. Another indication of loyalty is the tendency of customers to visit the supermarket over a number of weeks: regular visitors and spenders are more likely to be loyal. Neither of these two criteria alone provides a complete picture of customers' loyalty; a decision about a customer's loyalty has to take into account the visiting pattern as well as the categories of products purchased. This paper describes the results of experiments that attempted to identify customer loyalty using these two sets of criteria separately. The experiments were based on transactional data obtained from a supermarket data collection program. Comparing the results of these parallel sets of experiments was useful in fine-tuning both schemes for estimating the degree of loyalty of a customer. The project also provides useful insights for the development of more sophisticated measures for studying customer loyalty. It is hoped that understanding loyal customers will help identify better marketing strategies.

    Temporal mining of the web and supermarket data using fuzzy and rough set clustering

    xviii, 117 leaves : ill. (some col.) ; 28 cm. Includes abstract. Includes bibliographical references (leaves 114-117). Clustering is an important aspect of data mining. Many data mining applications tend to be more amenable to non-conventional clustering techniques. In this research, three clustering methods are employed to analyze the web usage and supermarket data sets: conventional, rough set and fuzzy methods. Interval clusters based on fuzzy memberships are also created. The web usage data were collected from three educational web sites. The supermarket data covered twenty-six weeks of transactions from twelve stores in three regions. Cluster sizes obtained using the three methods are compared, and cluster characteristics are analyzed. Web users and supermarket customers tend to change their characteristics over a period of time. These changes may be temporary or permanent. This thesis also studies the changes in cluster characteristics over time. Both experiments demonstrate that the rough and fuzzy methods are more subtle and accurate in capturing the slight differences among clusters.

    Modeling and evaluation of knowledge discovery in wholesale and retail industry

    x, 168 leaves : ill. ; 29 cm. Includes abstract. Includes bibliographical references (leaves 163-168). This thesis demonstrates an enterprise-wide Knowledge Discovery in Databases (KDD) process, CRISP, for the wholesale and retail industry, which can facilitate business decision-making and improve corporate profits. While part of the KDD process described here is well documented, the modeling and evaluation used in commercial products are not reported in the literature. Hence, the focus of this thesis is on the development and evaluation of the models used in knowledge discovery. A description of the underlying models will help decision makers better understand the quality and limitations of the KDD process. The usefulness of the CRISP KDD process is illustrated for two companies: a multinational retailer and a small chain of specialty grocery stores. The detailed steps highlight business understanding, data exploration, data preparation, data modeling, results evaluation, and interpretation. The methodologies applied in this thesis include prediction, clustering and association to discover knowledge about products/suppliers, consumers, and business units.

    Comparative Analysis To Determine Predictive Model Accuracy : A dynamic currency exchange rate predictive model development using SAP HANA Predictive Analytic Library (PAL) algorithm

    The present thesis describes the development and implementation of a dynamic currency exchange rate predictive model. The aim of the thesis was to measure the accuracy of the model by analysing different historical data samples. The theoretical framework focused on research into different disciplines related to predictive analytics and the different data mining algorithms. The study was carried out using quantitative data samples and the double exponential smoothing time series algorithm of the SAP HANA (high-performance analytic appliance) Predictive Analysis Library (PAL). Accuracy was measured by comparing the forecasted exchange rates against the actual exchange rates, using standard statistical methods. The results showed that the most recent data sample (the last three months) gives better predictive results for short-term forecasting, while the full data set gives better results for longer-term forecasting. Based on the study, it is recommended that fundamental analysis of currency exchange, which takes account of the driving forces behind exchange rates such as the political and economic situation, the rise and fall of interest rates and other economic indicators, be incorporated alongside technical analysis, which relies on historical data, to give better accuracy.
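The comparison described in this abstract can be sketched in plain Python. This is an illustrative sketch only: it uses a textbook formulation of double exponential (Holt) smoothing rather than the PAL implementation, and the rates, the smoothing parameters `alpha` and `beta`, and the function names are all hypothetical. Mean absolute percentage error (MAPE) stands in for the unspecified "standard statistical methods".

```python
def holt_forecast(series, alpha, beta, horizon):
    """Forecast `horizon` steps ahead with double exponential (Holt) smoothing."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Simple initialisation: first value as level, first difference as trend.
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + step * trend for step in range(1, horizon + 1)]


def mape(actual, forecast):
    """Mean absolute percentage error between actual and forecast values."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100 * sum(errors) / len(errors)


# Hypothetical exchange rates, not data from the thesis.
history = [1.10, 1.12, 1.11, 1.13, 1.15, 1.14, 1.16, 1.18]
actual_next = [1.19, 1.20]

# Fit on the most recent observations only, then on the full sample.
recent_only = holt_forecast(history[-4:], alpha=0.5, beta=0.3, horizon=2)
full_sample = holt_forecast(history, alpha=0.5, beta=0.3, horizon=2)

print("MAPE, recent sample:", mape(actual_next, recent_only))
print("MAPE, full sample:  ", mape(actual_next, full_sample))
```

Running the same forecast on different sample windows and comparing the error figures mirrors the kind of recent-sample versus full-sample comparison the abstract reports.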

    WLAN-paikannuksen elinkaaren tukeminen (Supporting the Lifecycle of WLAN Positioning)

    The advent of GPS positioning at the turn of the millennium provided consumers with worldwide access to outdoor location information. For the purposes of indoor positioning, however, the GPS signal rarely penetrates buildings well enough to maintain the same level of positioning granularity as outdoors. Arriving around the same time, wireless local area networks (WLAN) have gained widespread support both in terms of infrastructure deployments and client proliferation. A promising approach to bridge the location context then has been positioning based on WLAN signals. In addition to being readily available in most environments needing support for location information, the adoption of a WLAN positioning system is financially low-cost compared to dedicated infrastructure approaches, partly due to operating on an unlicensed frequency band. Furthermore, the accuracy provided by this approach is enough for a wide range of location-based services, such as navigation and location-aware advertisements. In spite of this attractive proposition and extensive research in both academia and industry, WLAN positioning has yet to become the de facto choice for indoor positioning. This is despite over 20 000 publications and the foundation of several companies. The main reasons for this include: (i) the cost of deployment, and re-deployment, which is often significant, if not prohibitive, in terms of work hours; (ii) the complex propagation of the wireless signal, which -- through interaction with the environment -- renders it inherently stochastic; (iii) the use of an unlicensed frequency band, which means the wireless medium faces fierce competition by other technologies, and even unintentional radiators, that can impair traffic in unforeseen ways and impact positioning accuracy. 
This thesis addresses these issues by developing novel solutions for reducing the effort of deployment, including optimizing the indoor location topology for the use of WLAN positioning, as well as automatically detecting sources of cross-technology interference. These contributions pave the way for WLAN positioning to become as ubiquitous as the underlying technology.

    Data mining using neural networks

    Data mining is the search for relationships and global patterns in large and growing databases. It is beneficial for anyone who has a huge amount of data, for example customer and business data, or transaction, marketing, financial, manufacturing and web data. The results of data mining are also referred to as knowledge, in the form of rules, regularities and constraints. Rule mining is one of the popular data mining methods, since rules provide concise statements of potentially important information that are easily understood by end users and yield actionable patterns. Rule mining has received a good deal of attention from data mining researchers because it can solve many data mining problems such as classification, association, customer profiling, summarization, segmentation and many others. This thesis makes several contributions by proposing rule mining methods using genetic algorithms and neural networks. The thesis first proposes rule mining methods using a genetic algorithm. These methods are based on an integrated framework but are capable of mining three major classes of rules. Moreover, the rule mining processes in these methods are controlled by tuning two data mining measures, support and confidence. The thesis shows how to build predictive data mining models from the rules produced by the proposed methods. Another key contribution of the thesis is the proposal of rule mining methods using supervised neural networks. The thesis mathematically analyses the Widrow-Hoff learning algorithm of a single-layered neural network, which results in a foundation for rule mining algorithms using single-layered neural networks. On the basis of the proposed theorems, three rule mining algorithms using single-layered neural networks are proposed for the three major classes of rules. The thesis also looks at the problem of rule mining where user guidance is absent. 
The thesis proposes a guided rule mining system to overcome this problem. The thesis extends this work further by comparing the performance of the algorithm used in the proposed guided rule mining system with the Apriori data mining algorithm. Finally, the thesis studies the Kohonen self-organizing map as an unsupervised neural network for rule mining algorithms. Two approaches are adopted, depending on how self-organizing maps are applied in the rule mining models. In the first approach, the self-organizing map is used for clustering, which provides class information to the rule mining process. In the second approach, automated rule mining takes the place of trained neurons as the map grows in a hierarchical structure.
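The two measures named in this abstract, support and confidence, have standard definitions that can be sketched briefly. The sketch below is not the thesis's genetic-algorithm or neural-network method; it only shows how the two thresholds accept or reject a candidate rule. The basket data, the thresholds and the function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


def keep_rule(transactions, antecedent, consequent, min_support, min_confidence):
    """Accept a rule only if it clears both user-tuned thresholds."""
    joint = support(transactions, set(antecedent) | set(consequent))
    return (joint >= min_support
            and confidence(transactions, antecedent, consequent) >= min_confidence)


# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(baskets, {"bread", "milk"}))       # 2 of 4 baskets
print(confidence(baskets, {"bread"}, {"milk"}))  # 2 of the 3 bread baskets
print(keep_rule(baskets, {"bread"}, {"milk"}, 0.4, 0.6))
```

Raising `min_support` or `min_confidence` prunes more candidate rules, which is the sense in which tuning the two measures controls a rule mining process.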

    Visualising Business Data: A Survey

    A rapidly increasing number of businesses rely on visualisation solutions for their data management challenges. This demand stems from an industry-wide shift towards data-driven approaches to decision making and problem-solving. However, there is an overwhelming mass of heterogeneous data collected as a result. The analysis of these data becomes a critical and challenging part of the business process. Employing visual analysis increases data comprehension, thus enabling a wider range of users to interpret the underlying behaviour, as opposed to skilled but expensive data analysts. Widening the reach to an audience with a broader range of backgrounds creates new opportunities for decision making, problem-solving, trend identification, and creative thinking. In this survey, we identify trends in the business visualisation and visual analytics literature where visualisation is used to address data challenges, and identify areas in which industries use visual design to develop their understanding of the business environment. Our novel classification of the literature includes the topics of business intelligence, business ecosystem, and customer-centric. This survey provides a valuable overview of and insight into the business visualisation literature, with a novel classification that highlights both mature and less developed research directions.

    Dynamic segmentation techniques applied to load profiles of electric energy consumption from domestic users

    [EN] The electricity sector is currently undergoing a process of liberalization and separation of roles, which is being implemented under the regulatory auspices of each Member State of the European Union and, therefore, at different speeds and with different perspectives and objectives. These must converge on a common horizon in which Europe benefits from an interconnected energy market where producers and consumers can participate in free competition. This process has one main consequence, from which a second follows as a necessity. The main consequence is the increased complexity of managing and supervising an electrical system that is increasingly interconnected and participatory, with distributed energy sources, many of them renewable, connected at different voltage levels and with different generation capacities at any point in the network. The second consequence is the need to communicate information between agents reliably, safely and quickly, and to analyze this information as effectively as possible, so that it feeds the decision-making processes that improve the observability and controllability of a system that is growing in complexity and in the number of agents involved. With the evolution of Information and Communication Technologies (ICT), and with investments both in improving the existing measurement and communications infrastructure and in extending measurement and actuation capacity to a greater number of points in medium- and low-voltage networks, the data available on the state of the network are increasingly abundant and complete. All these systems are part of the so-called Smart Grids, the intelligent networks of a future that is not so far away. 
One such source of information is the energy consumption of customers, measured on a regular basis (every hour, half hour or quarter-hour) and sent to the Distribution System Operators from the Smart Meters by means of Advanced Metering Infrastructure (AMI). As a result, an increasing amount of information on the energy consumption of customers is being stored in Big Data systems. This growing source of information demands specialized techniques that can take advantage of it, extracting useful and summarized knowledge. This thesis deals with the use of this energy consumption information from Smart Meters, in particular with the application of data mining techniques to obtain temporal patterns that characterize the users of electrical energy. Users are grouped according to these patterns into a small number of groups or clusters, which make it possible to evaluate how users consume energy, both during the day and over a sequence of days, to assess trends and to predict future scenarios. To this end, the current techniques are studied and, having shown that existing work does not cover this objective, clustering or dynamic segmentation techniques applied to load profiles of electric energy consumption from domestic users are developed. These techniques are tested and validated on a database of hourly energy consumption values for a sample of residential customers in Spain during the years 2008 and 2009. 
The results make it possible to observe both the consumption patterns that characterize the different types of residential energy consumers and their evolution over time, and to assess, for example, how the regulatory changes that occurred in the Spanish electricity sector during those years influenced the temporal patterns of energy consumption.
Benítez Sánchez, IJ. (2015). Dynamic segmentation techniques applied to load profiles of electric energy consumption from domestic users [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/59236
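Grouping users by the shape of their load profiles can be illustrated with a small sketch. It uses plain k-means on normalised profiles, which is a generic stand-in and not the dynamic segmentation technique developed in the thesis. The four-value profiles, the deterministic initialisation from the first `k` points, and the function names are assumptions made for illustration.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def normalise(profile):
    """Scale a load profile so its values sum to 1, keeping only its shape."""
    total = sum(profile)
    return [value / total for value in profile]


def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic initialisation from the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical consumption per block of the day (kWh): morning, midday, evening, night.
profiles = [
    [3.0, 1.0, 1.0, 1.0],  # morning-heavy user
    [1.0, 1.0, 1.0, 3.0],  # night-heavy user
    [6.0, 2.0, 2.0, 2.0],  # morning-heavy, larger household
    [2.0, 2.0, 2.0, 6.0],  # night-heavy, larger household
]

labels, centroids = kmeans([normalise(p) for p in profiles], k=2)
print(labels)
```

Normalising first means users are grouped by when they consume rather than by how much. Re-running the clustering on successive time windows and comparing the resulting centroids would be one simple way to look at how patterns evolve over time.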

    Data analytics

    This study guide explains the nature, role and importance of data, information and analytical work, sets out the basic principles of analytical work in the modern information environment, and considers the main approaches and basic tools used by specialists in political analytics and social work when performing analytical tasks.

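    Data analytics e intelligenza artificiale per l’analisi di bilancio. Performance e profili di business degli spin-off accademici (Data analytics and artificial intelligence for financial statement analysis: performance and business profiles of academic spin-offs)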

    This research applies neural networks, namely Self-Organising Maps (SOMs), to analyse a set of financial indicators drawn from the balance sheets of academic spin-offs. The goal of the work is twofold: first, to process financial data to extract knowledge about the still uncertain role and strategic profile of academic spin-offs; and second, to understand whether SOMs can support investigations of firms’ performance and inform decisions on strategic orientation through the processing of financial indicators. After an in-depth literature review covering both the application of SOMs to financial reporting data and the business profile of academic spin-offs, the paper carries out an empirical investigation of 810 Italian academic spin-offs, using their financial reporting data. The results show that SOMs are able to extract the main features of different academic spin-off archetypes, which can then be explained via traditional financial analysis instruments.