973 research outputs found

    Analytical study and computational modeling of statistical methods for data mining

    Today, there is a tremendous increase in the information available in electronic form, and it grows massively day by day. This offers ample opportunities for research on retrieving knowledge from the available data. Data mining and app

    A survey of temporal knowledge discovery paradigms and methods

    With the increase in the size of data sets, data mining has recently become an important research topic and is receiving substantial interest from both academia and industry. At the same time, interest in temporal databases has been increasing and a growing number of both prototype and implemented systems are using an enhanced temporal understanding to explain aspects of behavior associated with the implicit time-varying nature of the universe. This paper investigates the confluence of these two areas, surveys the work to date, and explores the issues involved and the outstanding problems in temporal data mining

    ENHANCEMENT OF CHURN PREDICTION ALGORITHMS

    Customer churn can be described as the process by which consumers of goods and services discontinue the consumption of a product or service and switch over to a competitor. It is of great concern to many companies. Thus, decision support systems are needed to overcome this pressing issue and ensure a good return on investment for organizations. Decision support systems use analytical models to provide the intelligence needed to analyze an integrated customer record database, predict which customers will churn, and offer recommendations that will prevent them from churning. Customer churn prediction, unlike most conventional business intelligence techniques, deals with customer demographics, net worth-value, and market opportunities. It is used to determine which customers are likely to churn and which are likely to remain loyal to the organization, and to predict future churn rates. Customer defection is naturally a slow-rate event, and it is not easily detected by most business intelligence solutions available on the market, especially when data is skewed, large, and distinct. Thus, accurate and precise prediction methods are needed to detect the churning trend. In this study, a churn model is proposed that applies business intelligence techniques to detect the possibility that a customer will churn, using churn trend analysis of customer records. The model applies clustering algorithms and enhanced SPRINT decision tree algorithms to explore the customer record database and identify customer profiles and behavior patterns. The model then predicts the possibility that a customer will churn. Additionally, it offers solutions for retaining customers and making them loyal to a business entity by recommending customer-relationship management measures
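
    The abstract does not say how the enhanced SPRINT trees choose their splits. As a rough illustration only, the sketch below scores candidate thresholds on one numeric attribute with the Gini index, the split criterion used by the original SPRINT algorithm. The function names, the example attribute, and the "stay"/"churn" labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure set)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_numeric_split(values, labels):
    """Return (threshold, weighted_gini) of the best binary split.

    Candidate thresholds are midpoints between consecutive distinct
    sorted values. Returns (None, inf) if no split is possible.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for k in range(1, n):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold can separate equal values
        left = [label for _, label in pairs[:k]]
        right = [label for _, label in pairs[k:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
            best = (threshold, score)
    return best


# Hypothetical attribute: months since a customer's last purchase.
months_inactive = [1, 2, 3, 10, 11, 12]
outcome = ["stay", "stay", "stay", "churn", "churn", "churn"]
best = best_numeric_split(months_inactive, outcome)
```

    A full tree builder would apply this search to every attribute and recurse on the two partitions; that part is omitted here.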

    Improving efficiency and reducing waste for sustainable beef supply chain

    In this thesis, novel methodologies were developed to improve the sustainability of the beef supply chain by reducing its environmental and physical waste. These methodologies would assist stakeholders of the beef supply chain, viz. farmers, abattoirs, processors, logistics firms and retailers, in identifying the root causes of waste and the hotspots of greenhouse gas emissions, and in mitigating them. Numerous quantitative and qualitative research methods were used to develop these methodologies, such as the Current Reality Tree method, big data analytics, interpretive structural modelling, TOPSIS and cloud computing technology. Real data sets from social media and interviews with stakeholders of the Indian beef supply chain were used. Numerous issues associated with minimising waste and reducing the carbon footprint of the beef supply chain are addressed, including: (a) identification of root causes of waste generated in the beef supply chain using the Current Reality Tree method and their consequent mitigation; (b) application of social media data for waste minimisation in the beef supply chain; (c) developing a consumer-centric beef supply chain by combining big data techniques and interpretive structural modelling; (d) reducing the carbon footprint of the beef supply chain using Information and Communication Technology (ICT); (e) developing a cloud computing framework for sustainable supplier selection in the beef supply chain; (f) updating the existing literature on improving the sustainability of the beef supply chain. The efficacy of the proposed methodologies was demonstrated using case studies. These frameworks may play a crucial role in assisting the decision makers of all stakeholders of the beef supply chain in minimising waste and reducing the carbon footprint, thereby improving the sustainability of the beef supply chain. The proposed methodologies are generic in nature and can be applied to other domains of the red meat industry or to any other food supply chain

    Computer vision based classification of fruits and vegetables for self-checkout at supermarkets

    The field of machine learning, and, in particular, methods to improve the capability of machines to perform a wider variety of generalised tasks are among the most rapidly growing research areas in today’s world. The current applications of machine learning and artificial intelligence can be divided into many significant fields namely computer vision, data sciences, real time analytics and Natural Language Processing (NLP). All these applications are being used to help computer based systems to operate more usefully in everyday contexts. Computer vision research is currently active in a wide range of areas such as the development of autonomous vehicles, object recognition, Content Based Image Retrieval (CBIR), image segmentation and terrestrial analysis from space (i.e. crop estimation). Despite significant prior research, the area of object recognition still has many topics to be explored. This PhD thesis focuses on using advanced machine learning approaches to enable the automated recognition of fresh produce (i.e. fruits and vegetables) at supermarket self-checkouts. This type of complex classification task is one of the most recently emerging applications of advanced computer vision approaches and is a productive research topic in this field due to the limited means of representing the features and machine learning techniques for classification. Fruits and vegetables offer significant inter and intra class variance in weight, shape, size, colour and texture which makes the classification challenging. The applications of effective fruit and vegetable classification have significant importance in daily life e.g. crop estimation, fruit classification, robotic harvesting, fruit quality assessment, etc. One potential application for this fruit and vegetable classification capability is for supermarket self-checkouts. Increasingly, supermarkets are introducing self-checkouts in stores to make the checkout process easier and faster. 
However, there are a number of challenges with this, as not all goods can readily be sold with packaging and barcodes, for instance loose fresh items (e.g. fruits and vegetables). Adding barcodes to these types of items individually is impractical, and pre-packaging limits the freedom of choice when selecting fruits and vegetables and creates additional waste, hence reducing customer satisfaction. The current situation, which relies on customers correctly identifying produce themselves, leaves open the potential for incorrect billing, either due to inadvertent error or due to intentional fraudulent misclassification, resulting in financial losses for the store. To address this identified problem, the main goals of this PhD work are: (a) exploring the types of visual and non-visual sensors that could be incorporated into a self-checkout system for classification of fruits and vegetables, (b) determining a suitable feature representation method for fresh produce items available at supermarkets, (c) identifying optimal machine learning techniques for classification within this context and (d) evaluating our work relative to the state-of-the-art object classification results presented in the literature. An in-depth analysis of related computer vision literature and techniques is performed to identify and implement the possible solutions. A progressive process distribution approach is used for this project, where the task of computer vision based fruit and vegetable classification is divided into pre-processing and classification techniques. Different classification techniques have been implemented and evaluated as possible solutions for this problem. Both visual and non-visual features of fruit and vegetables are exploited to perform the classification. Novel classification techniques have been carefully developed to deal with the complex and highly variant physical features of fruit and vegetables while taking advantage of both visual and non-visual features. 
The capability of the classification techniques is tested individually and in ensembles to achieve higher effectiveness. Significant results have been obtained, from which it can be concluded that fruit and vegetable classification is a complex task with many challenges involved. It is also observed that a larger dataset can better capture the complex variant features of fruit and vegetables. Complex multidimensional features can be extracted from larger datasets to generalise to a higher number of classes. However, development of a larger multiclass dataset is an expensive and time-consuming process. The effectiveness of classification techniques can be significantly improved by subtracting the background occlusions and complexities. It is also worth mentioning that an ensemble of simple and less complicated classification techniques can achieve effective results even if applied to fewer features for a smaller number of classes. The combination of visual and non-visual features can reduce the struggle of a classification technique to deal with a higher number of classes with similar physical features. Classification of fruit and vegetables with similar physical features (i.e. colour and texture) needs careful estimation and hyper-dimensional embedding of visual features. Implementing rigorous classification penalties as a loss function can achieve this goal at the cost of time and computational requirements. There is a significant need to develop larger datasets for different fruit and vegetable related computer vision applications. Considering more sophisticated loss function penalties and discriminative hyper-dimensional feature embedding techniques can significantly improve the effectiveness of the classification techniques for fruit and vegetable applications

    An Adaptive Framework for Improving the Effectiveness of Virtual Enterprises in the Supply Chain

    This thesis describes a research project that develops an adaptive framework for improving the effectiveness of virtual enterprises in the supply chains in Mongolia. The research takes an empirical and quantitative approach to study the phenomenon of virtual enterprises. Based on a literature review, the factors that influence organisations to join virtual enterprises are studied by a higher-order factor analysis. As a result, agility is identified as one of the main benefits organisations can gain by joining a virtual enterprise temporarily, and changes in business performance are conceived as the measures of effectiveness. Next, a taxonomy of enterprises is developed with five distinguishing clusters that achieve differing levels of agility and business performance. This study suggests that enterprises that monitor changes in their business environment take most advantage of agility and achieve the best levels of performance. These findings then allow an adaptive framework based on common reference architectures to be developed as a main contribution of this study. The framework includes a breeding environment as a ‘pool’ of prepared enterprises with the ability to form temporary collaborations to react responsively, rapidly and effectively to fast-changing opportunities. A structural equation model was used to examine the model fit with the supporting hypotheses, based on the observed data. Then, a powerful clustered expectation maximisation algorithm was applied to the analysis of the grouped enterprises. Finally, a simulation-based case study was conducted to validate the developed framework. The research provides rich empirical evidence of the beneficial impact of virtual enterprises on agile supply chains. 
It also provides theoretical and managerial insights that can be used to strengthen the drivers, enablers and capabilities that enhance the effectiveness of virtual enterprise collaboration in agile supply chains, and that can be translated to a global context. These are major contributions to the ‘body of knowledge’ in themselves, but the research also adds usefully to the study of applied research methodologies in the area

    Discovery of Transport Operations from Geolocation Data

    Geolocation data identifies the geographic location of people or objects, and is fundamental for businesses relying on vehicles, such as logistics and transportation companies. With the advance of technology, collecting geolocation data has become increasingly accessible and affordable, raising new opportunities for business intelligence. This type of data has been used mainly for characterizing the vehicle in terms of positioning and navigation, but it can also showcase its performance regarding the executed activities and operations. The proposed approach consists of a multi-step methodology that receives geolocation data as input and allows the analysis of the business process at the end. Firstly, data preparation is applied to handle a number of issues related to outliers, data noise, and missing or erroneous information. Then, stationary events are identified based on the motionless states of the vehicles. 
Next, operations are inferred through a spatial analysis, which allows the discovery of the locations where stationary events occur frequently. Finally, the identified operations are classified based on their characteristics, and the sequence of events can be structured into an event log. Process mining techniques can then be applied to extract process knowledge. The steps of the methodology can also be used separately to tackle specific challenges, giving more flexibility to its application. Three distinct case studies are presented to demonstrate the effectiveness and transversality of the solution. Real-time geolocation data streams of buses from two distinct public transport networks are used to demonstrate the detection of vehicle-based operations and compare the distinct approaches proposed by this work. The buses' operations produce a structured sequence of events that describes the behaviour of the buses. This behaviour is mapped through the application of process mining techniques, uncovering analysis opportunities and discovering bottlenecks in the process. Geolocation data from an international logistics company is exploited for monitoring logistics processes, namely for detecting vehicle-based operations in real time, showing the effectiveness of the proposed solution for solving specific industry problems. The results of this work reveal new possibilities for geolocation data and its potential to generate process knowledge. The exploitation of geolocation data in the public transport and logistics contexts presents an opportunity for improving the monitoring and management of vehicle-based operations. This can lead to improvements in process efficiency and, consequently, higher profit and better service quality
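
    The abstract names the stationary-event step but gives no algorithm for it. The sketch below shows one simple way such a step could work, assuming time-ordered GPS fixes: a run of consecutive fixes that stays within a small radius of its first fix for long enough is reported as one stationary event. The `Fix` type, the function names, and the 30 m / 60 s thresholds are illustrative assumptions, not the author's method.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class Fix:
    t: float    # timestamp in seconds
    lat: float  # latitude in degrees
    lon: float  # longitude in degrees


def haversine_m(a, b):
    """Great-circle distance between two fixes in metres."""
    earth_radius_m = 6371000.0
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = (sin(dlat / 2) ** 2
         + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_m * asin(sqrt(h))


def stationary_events(fixes, max_radius_m=30.0, min_duration_s=60.0):
    """Return (start_t, end_t, lat, lon) tuples for stationary runs.

    Assumes `fixes` is sorted by time. Thresholds are illustrative.
    """
    events = []
    i, n = 0, len(fixes)
    while i < n:
        j = i + 1
        # Extend the run while fixes stay close to the anchor fix i.
        while j < n and haversine_m(fixes[i], fixes[j]) <= max_radius_m:
            j += 1
        if fixes[j - 1].t - fixes[i].t >= min_duration_s:
            events.append((fixes[i].t, fixes[j - 1].t,
                           fixes[i].lat, fixes[i].lon))
            i = j       # continue after the stationary run
        else:
            i += 1      # too short: try the next fix as anchor
    return events


# A vehicle idle for 90 s, then moving roughly 1.1 km per sample.
track = [Fix(0, 0.0, 0.0), Fix(30, 0.0, 0.0), Fix(60, 0.0, 0.0),
         Fix(90, 0.0, 0.0), Fix(120, 0.01, 0.0), Fix(150, 0.02, 0.0)]
events = stationary_events(track)
```

    In a pipeline like the one described, the locations of such events would next be grouped spatially to find frequently visited places, and the classified events written to an event log for process mining.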

    Ontologies learn by searching

    Dissertation to obtain the Master degree in Electrical Engineering and Computer Science. Due to the worldwide diversity of communities, a large number of ontologies have appeared that represent the same segment of reality but are not semantically coincident. A possible solution to this problem is to use a reference ontology as the intermediary in communications among the community's enterprises and with the outside. Once semantic mappings between the enterprises' ontologies are established, this solution allows each enterprise to keep its own ontology and semantics unchanged internally. However, information systems are not static, so established mappings become obsolete with time. This dissertation's objective is to identify a suitable method that combines semantic mappings with user feedback, providing automatic learning for ontologies and enabling auto-adaptability and dynamism in the information system