12 research outputs found

    Data mining in medical records for the enhancement of strategic decisions: a case study

    The impact and popularity of the concept of competition have been increasing over the last decades, which has escalated the importance of making the right decisions in organizations. Decision makers are increasingly expected to rely on proper scientific methods rather than intuitive and emotional choices in the decision-making process. In this context, many decision support models and related systems are still being developed to assist strategic management mechanisms. There is also a critical need for automated approaches that use massive amounts of data effectively and efficiently to support corporations and individuals in strategic planning and decision-making. Data mining techniques have been used to uncover hidden patterns and relations, to summarize data in novel ways that are both understandable and useful to executives, and to predict future trends and behaviors in business. A large body of research and practice has focused on different data mining techniques and methodologies. In this study, a large set of records extracted from an outpatient clinic’s medical database is used to apply data mining techniques. In the first phase of the study, the raw data in the record set are collected, preprocessed, cleaned up, and transformed into a format suitable for data mining. In the second phase, several association rule algorithms are applied to the data set to uncover rules that quantify the relationships between attributes in the medical records. The results are then compared across the different association rule algorithms. The results show that critical and reasonable relations exist in the hospital's outpatient clinic operations, which could help hospital management change and improve their managerial strategies regarding the quality of services provided to outpatients.
    Keywords: Decision Making, Medical Records, Data Mining, Association Rules, Outpatient Clinic.
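
    The abstract does not name the specific association rule algorithms used. Purely as an illustrative sketch, the snippet below shows how rules relating hypothetical outpatient attributes (department, age group, repeat-visit flag) could be mined with the Apriori implementation in the mlxtend library; the column values and thresholds are assumptions, not details from the study.

    import pandas as pd
    from mlxtend.preprocessing import TransactionEncoder
    from mlxtend.frequent_patterns import apriori, association_rules

    # Hypothetical, simplified outpatient visits: each visit is a set of
    # categorical attributes (department, age group, repeat-visit flag).
    visits = [
        ["cardiology", "age_60_plus", "repeat_visit"],
        ["cardiology", "age_60_plus", "repeat_visit"],
        ["dermatology", "age_18_39", "first_visit"],
        ["cardiology", "age_40_59", "repeat_visit"],
        ["dermatology", "age_18_39", "first_visit"],
    ]

    # One-hot encode the transactions for frequent-itemset mining.
    encoder = TransactionEncoder()
    onehot = pd.DataFrame(encoder.fit(visits).transform(visits),
                          columns=encoder.columns_)

    # Mine frequent itemsets and derive association rules from them.
    frequent = apriori(onehot, min_support=0.3, use_colnames=True)
    rules = association_rules(frequent, metric="confidence", min_threshold=0.7)
    print(rules[["antecedents", "consequents", "support", "confidence", "lift"]])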

    Analysis and Prediction of Bangalore Traffic South Road Accidents

    Data mining is the process of analyzing data from different perspectives and summarizing it into useful information - information that can be used to increase revenue, cut costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. Traffic accidents cause enormous losses for a country, and plenty of national assets drain away every year because of them. The rapid proliferation of Global Positioning System (GPS) devices and the mounting number of traffic monitoring systems employed by municipalities have opened the door for advanced traffic control and personalized route planning. However, the complexity of traffic accident analysis has brought many difficulties to traffic management and decision-making. Most state-of-the-art traffic management and information systems focus on data analysis, and very little has been done in terms of prediction. This paper provides details about how road accident and traffic data can be analysed and used to predict the probability of an accident occurring. To start with, the analysis has been done on Bangalore city traffic, considering five traffic stations in the south region of the city: Basavanagudi, Kumaraswamy Layout, Banashankari, Jayanagar, and Chamarajapet.
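
    The abstract does not say which prediction model the paper uses. As a hedged sketch only, accident probability could be estimated from station, time-of-day, and weather features with a logistic-regression classifier; every feature name and value below is invented for the example, not taken from the paper's data.

    import pandas as pd
    from sklearn.compose import ColumnTransformer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import OneHotEncoder

    # Hypothetical records: traffic station, hour of day, rain flag, accident label.
    data = pd.DataFrame({
        "station": ["Basavanagudi", "Jayanagar", "Banashankari", "Jayanagar", "Chamarajapet"],
        "hour": [9, 18, 23, 8, 19],
        "rain": [0, 1, 0, 0, 1],
        "accident": [0, 1, 1, 0, 1],
    })
    features = data[["station", "hour", "rain"]]
    labels = data["accident"]

    # One-hot encode the station name, pass numeric columns through, then fit.
    model = Pipeline([
        ("encode", ColumnTransformer(
            [("station", OneHotEncoder(handle_unknown="ignore"), ["station"])],
            remainder="passthrough")),
        ("clf", LogisticRegression()),
    ])
    model.fit(features, labels)

    # Estimated accident probability for a new observation.
    new_obs = pd.DataFrame({"station": ["Jayanagar"], "hour": [18], "rain": [1]})
    print(model.predict_proba(new_obs)[:, 1])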

    Study report on Indian agriculture with IoT

    Most of the population of our country depends on agriculture for survival, and agriculture plays an important role in our country's economy. However, over the past few years the agriculture sector has seen a drastic downfall in productivity, and there are many reasons for this decline. In this paper we discuss the past, present and future of agriculture in our country, the agricultural policies provided by the government to improve the growth of agriculture, and the reasons why that growth has not been achieved. We also examine how automation can be adopted in agriculture using emerging technologies such as IoT (Internet of Things), data mining, cloud computing and machine learning, and we review previous quality work done by various authors on this topic that can be useful for increasing the productivity of the agriculture sector.

    Data Masking, Encryption, and their Effect on Classification Performance: Trade-offs Between Data Security and Utility

    As data mining increasingly shapes organizational decision-making, the quality of its results must be questioned to ensure trust in the technology. Inaccuracies can mislead decision-makers and cause costly mistakes. With more data collected for analytical purposes, privacy is also a major concern. Data security policies and regulations are increasingly put in place to manage risks, but these policies and regulations often employ technologies that substitute and/or suppress sensitive details contained in the data sets being mined. Data masking and substitution and/or data encryption and suppression of sensitive attributes from data sets can limit access to important details. It is believed that the use of data masking and encryption can impact the quality of data mining results. This dissertation investigated and compared the causal effects of data masking and encryption on classification performance as a measure of the quality of knowledge discovery. A review of the literature found a gap in the body of knowledge, indicating that this problem had not been studied before in an experimental setting. The objective of this dissertation was to gain an understanding of the trade-offs between data security and utility in the field of analytics and data mining. The research used a nationally recognized cancer incidence database to show how masking and encryption of potentially sensitive demographic attributes, such as patients’ marital status, race/ethnicity, origin, and year of birth, could have a statistically significant impact on the patients’ predicted survival. Performance parameters measured by four different classifiers showed sizable variations, in the range of 9% to 10%, between a control group, where the selected attributes were untouched, and two experimental groups, where the attributes were substituted or suppressed to simulate the effects of the data protection techniques. In practice, this corroborates the potential risk involved when medical treatment decisions are based on data mining applications in which attributes in the data sets are masked or encrypted for patient privacy and security concerns.
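
    As a toy illustration only (synthetic data, an arbitrarily chosen classifier, not the dissertation's experiment or its cancer registry), the sketch below compares cross-validated accuracy when hypothetical demographic columns are left intact versus suppressed, mimicking the control and experimental groups described above.

    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n = 500

    # Synthetic stand-in data: two "demographic" columns plus one clinical column.
    df = pd.DataFrame({
        "marital_status": rng.integers(0, 3, n),
        "year_of_birth": rng.integers(1930, 1990, n),
        "tumor_stage": rng.integers(1, 5, n),
    })
    # Synthetic survival label loosely tied to the features (for the toy only).
    survived = ((df["tumor_stage"] < 3) & (df["year_of_birth"] > 1950)).astype(int)

    def mean_accuracy(features):
        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        return cross_val_score(clf, features, survived, cv=5).mean()

    # Control group: all attributes available.
    acc_control = mean_accuracy(df)
    # Experimental group: demographic attributes suppressed (simulated masking).
    acc_masked = mean_accuracy(df.drop(columns=["marital_status", "year_of_birth"]))

    print(f"control accuracy: {acc_control:.3f}")
    print(f"masked accuracy:  {acc_masked:.3f}")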

    Opinion mining with the SentiWordNet lexical resource

    Sentiment classification concerns the application of automatic methods for predicting the orientation of sentiment present in text documents. It is an important subject in opinion mining research, with applications in a number of areas including recommender and advertising systems, customer intelligence and information retrieval. SentiWordNet is a lexical resource of sentiment information for terms in the English language designed to assist in opinion mining tasks, where each term is associated with numerical scores for positive and negative sentiment. A resource that makes term-level sentiment information readily available could be of use in building more effective sentiment classification methods. This research presents the results of an experiment that applied the SentiWordNet lexical resource to the problem of automatic sentiment classification of film reviews. First, a data set of relevant features extracted from text documents using SentiWordNet was designed and implemented. The resulting feature set was then used as input for training a support vector machine classifier to predict the sentiment orientation of the underlying film review. Several scenarios exploring variations on the parameters that generate the data set, outlier removal and feature selection were executed. The results obtained are compared to other methods documented in the literature. They were found to be in line with other experiments that propose similar approaches and use the same data set of film reviews, indicating that SentiWordNet could become an important resource for the task of sentiment classification. Considerations on future improvements are also presented, based on a detailed analysis of the classification results.
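
    The thesis's actual feature design is not reproduced here. As a minimal sketch under stated assumptions (NLTK's SentiWordNet interface is available, the first word sense is used as a crude approximation, and the two tiny reviews are invented), each document is scored by its average positive and negative SentiWordNet scores and an SVM is trained on those two features.

    import nltk
    from nltk.corpus import sentiwordnet as swn
    from sklearn.svm import SVC

    # Corpora needed for SentiWordNet lookups.
    nltk.download("wordnet", quiet=True)
    nltk.download("omw-1.4", quiet=True)
    nltk.download("sentiwordnet", quiet=True)

    def sentiment_features(text):
        # Average positive and negative scores of the first sense of each word.
        pos, neg, count = 0.0, 0.0, 0
        for word in text.lower().split():
            synsets = list(swn.senti_synsets(word))
            if synsets:
                pos += synsets[0].pos_score()
                neg += synsets[0].neg_score()
                count += 1
        return [pos / count, neg / count] if count else [0.0, 0.0]

    reviews = ["a wonderful touching film with superb acting",
               "a dull boring mess with terrible dialogue"]
    labels = [1, 0]  # 1 = positive review, 0 = negative review

    features = [sentiment_features(r) for r in reviews]
    classifier = SVC(kernel="linear").fit(features, labels)
    print(classifier.predict([sentiment_features("a superb and touching story")]))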

    Feature Selection and Classification Methods for Decision Making: A Comparative Analysis

    The use of data mining methods in corporate decision making has been increasing in the past decades. Their popularity can be attributed to better data mining algorithms, increased computing performance, and results that can be measured and applied to decision making. The effective use of data mining methods to analyze various types of data has shown great advantages in various application domains. Some data sets need little preparation to be mined, whereas others, in particular high-dimensional data sets, need to be preprocessed before mining because processing high-dimensional data is complex and inefficient. Feature selection, or attribute selection, is one of the techniques used for dimensionality reduction. Previous research has shown that data mining results can be improved in terms of accuracy and efficacy by selecting the most significant attributes. This study analyzes vehicle service and sales data from multiple car dealerships. The purpose of this study is to find a model that better classifies existing customers as new car buyers based on their vehicle service histories. Six feature selection methods, including Information Gain, Correlation-Based Feature Selection, Relief-F, Wrapper, and Hybrid methods, were used to reduce the number of attributes in the data sets and were compared. The data sets with the selected attributes were run through three popular classification algorithms (Decision Trees, k-Nearest Neighbor, and Support Vector Machines), and the results were compared and analyzed. This study concludes with a comparative analysis of the feature selection methods and their effects on the different classification algorithms within the domain. As a basis of comparison, the same procedures were run on a standard data set from the financial institution domain.
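
    As an illustrative sketch only (synthetic data rather than the dealership data, and mutual information used as a stand-in for Information Gain), the snippet below keeps the top-ranked attributes and compares the three classifier families named above on the reduced feature set.

    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, mutual_info_classif
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    # Synthetic high-dimensional data standing in for the real attribute set.
    X, y = make_classification(n_samples=300, n_features=30, n_informative=5,
                               random_state=0)

    # Keep the 5 attributes with the highest mutual information with the label.
    X_reduced = SelectKBest(mutual_info_classif, k=5).fit_transform(X, y)

    classifiers = {
        "Decision Tree": DecisionTreeClassifier(random_state=0),
        "k-Nearest Neighbor": KNeighborsClassifier(n_neighbors=5),
        "SVM": SVC(kernel="rbf"),
    }
    for name, clf in classifiers.items():
        score = cross_val_score(clf, X_reduced, y, cv=5).mean()
        print(f"{name}: {score:.3f}")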

    New Approach for Market Intelligence Using Artificial and Computational Intelligence

    Small and medium-sized retailers are central to the private sector and a vital contributor to economic growth, but they often face enormous challenges in unleashing their full potential. Financial pitfalls, lack of adequate access to markets, and difficulties in exploiting technology have prevented them from achieving optimal productivity. Market Intelligence (MI) is the knowledge extracted from numerous internal and external data sources, aimed at providing a holistic view of the state of the market and influencing marketing-related decision-making processes in real time. A related, burgeoning phenomenon and crucial topic in the field of marketing is Artificial Intelligence (AI), which entails fundamental changes to the skill sets marketers require. A vast amount of knowledge is stored in retailers’ point-of-sale databases, but the format of this data often makes the knowledge it stores hard to access and identify. As a powerful AI technique, Association Rules Mining helps to identify frequently associated patterns stored in large databases in order to predict customers’ shopping journeys. Consequently, the method has emerged as a key driver of cross-selling and upselling in the retail industry. At the core of this approach is Market Basket Analysis, which captures knowledge from heterogeneous customer shopping patterns and examines the effects of marketing initiatives. Apriori, which enumerates the frequent itemsets purchased together (as market baskets), is the central algorithm in the analysis process. Problems occur because Apriori lacks computational speed and has weaknesses in providing intelligent decision support. As the number of database scans grows, the computation cost increases, resulting in dramatically decreasing performance. Moreover, there are shortcomings in decision support, especially in methods for finding rarely occurring events and for identifying trending brand popularity before it peaks. As the objective of this research is to find intelligent ways to help small and medium-sized retailers grow with an MI strategy, we demonstrate the effects of AI with algorithms for data preprocessing, market segmentation, and finding market trends. Using the sales database of a small, local retailer, we show how our Åbo algorithm increases mining performance and intelligence, as well as how it helps to extract valuable marketing insights to assess demand dynamics and product popularity trends. We also show how this results in commercial advantage and a tangible return on investment. Additionally, an enhanced normal distribution method assists data pre-processing and helps to explore different types of potential anomalies.
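
    Neither the Åbo algorithm nor the thesis's enhanced normal distribution method is specified in this abstract, so the sketch below only illustrates the standard normal-distribution outlier idea it builds on: flag days in an invented point-of-sale series whose z-score exceeds an assumed threshold.

    import numpy as np

    # Hypothetical daily sales counts for one product; day 7 contains a spike.
    daily_sales = np.array([120, 115, 130, 125, 118, 122, 310, 119, 117, 128])

    # Standardize against the series mean and standard deviation.
    mean, std = daily_sales.mean(), daily_sales.std()
    z_scores = (daily_sales - mean) / std

    threshold = 2.0  # assumed cut-off for this sketch
    anomalies = np.where(np.abs(z_scores) > threshold)[0]
    print("anomalous days:", anomalies, "values:", daily_sales[anomalies])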

    Time of chemical treatments prediction in agricultural production based on Data mining techniques using wireless communication systems

    Successfully determining the period in which the conditions for disease occurrence are met, and the moment at which chemical treatments need to be carried out, is a complex problem because of the difficulty of building prediction models whose task is to define the relationships between disease occurrence and current meteorological conditions. The accuracy of predicting the time of chemical treatments directly affects the economics of agricultural production and the quantity of pesticides required, and thus contributes to environmental protection and to healthier agricultural products. The conducted research produced a software solution, based on data mining techniques and wireless communication systems, whose task is to collect meteorological and spatio-temporal parameters and, on that basis, to predict whether the conditions for the occurrence of plant diseases have been met and, consequently, the time of chemical treatments. The developed solution is a closed system: within it, the required data are collected, the data are processed to build prediction models, and those models are applied to predict the time of chemical treatments. To determine the prediction accuracy, the resulting prediction models were tested on the created data sets. The prediction accuracy was found to be 93.71%, which justifies the use of a system based on data mining techniques and wireless communication systems for predicting the time of chemical treatments.
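
    The dissertation's actual models and its 93.71% accuracy come from its own data sets, which are not reproduced here. The sketch below only illustrates the general idea with a decision tree trained on invented temperature, humidity, and leaf-wetness readings to flag whether conditions favor disease and hence a chemical treatment; all values and the choice of classifier are assumptions.

    from sklearn.tree import DecisionTreeClassifier

    # Invented readings: [temperature in C, relative humidity in %, leaf wetness in hours].
    readings = [
        [22, 85, 10], [25, 90, 12], [18, 60, 2], [30, 40, 0],
        [21, 88, 9],  [27, 45, 1],  [19, 80, 8], [24, 55, 3],
    ]
    conditions_met = [1, 1, 0, 0, 1, 0, 1, 0]  # 1 = conditions for disease met

    model = DecisionTreeClassifier(max_depth=3, random_state=0)
    model.fit(readings, conditions_met)

    # Prediction for a new reading collected over the wireless sensor network;
    # for this toy data a high-humidity, long-wetness reading maps to class 1.
    print(model.predict([[23, 87, 11]]))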

    Essentials of Business Analytics
