8,689 research outputs found

    Data Mining Techniques to Understand Textual Data

    More than ever, online information delivery and storage rely heavily on text. Billions of texts are produced every day in the form of documents, news, logs, search queries, ad keywords, tags, tweets, messenger conversations, social network posts, etc. Text understanding is a fundamental task that spans broad research topics and contributes to many applications, such as text summarization, search engines, recommendation systems, online advertising, and conversational bots. However, understanding text is never a trivial task for computers, especially for noisy and ambiguous text such as logs and search queries. This dissertation focuses on text understanding tasks derived from two domains, disaster management and IT service management, both of which rely mainly on textual data as an information carrier. Improving situation awareness in disaster management and reducing the human effort involved in IT service management call for more intelligent and efficient ways to understand the textual data that serves as the main information carrier in the two domains. From the perspective of data mining, four directions are identified: (1) intelligently generating a storyline that summarizes the evolution of a hurricane from a relevant online corpus; (2) automatically recommending resolutions according to the textual symptom description in a ticket; (3) gradually adapting the resolution recommendation system to time-correlated features derived from text; (4) efficiently learning distributed representations for short, poor-quality ticket symptom descriptions and resolutions. Given different types of textual data, the data mining techniques proposed in these four research directions successfully address the tasks of understanding and extracting valuable knowledge from that data. My dissertation will address the research topics outlined above.
Concretely, I will focus on designing and developing data mining methodologies to better understand textual information, including (1) a storyline generation method for efficient summarization of hurricanes based on a crawled online corpus; (2) a recommendation framework for automated ticket resolution in IT service management; (3) an adaptive recommendation system for time-varying, temporally correlated features derived from text; (4) a deep neural ranking model that not only recommends resolutions successfully but also efficiently outputs distributed representations for ticket descriptions and resolutions.

    Machine learning in incident categorization automation

    The IT incident management process requires correct categorization to assign incident tickets to the right resolution group and restore an operational system as quickly as possible, with the lowest possible impact on the business and its customers. In this work, we introduce a module that automatically categorizes incident tickets, making the teams responsible for incident management more productive. This module can be integrated as an extension into an incident ticket system (ITS), which helps reduce the time wasted on incident ticket routing and the number of errors in incident categorization. To automate the classification, we use a support vector machine (SVM), obtaining an accuracy of approximately 89% on a dataset of real-world incident tickets.
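
    As a rough illustration of the kind of SVM-based ticket categorization this abstract describes, the sketch below trains a linear SVM on TF-IDF features with scikit-learn. The abstract does not state the features, kernel, or library used, so the TF-IDF representation, the `LinearSVC` choice, the toy tickets, the category names, and the helper `train_categorizer` are all assumptions made for illustration; it is not the authors' module.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy tickets and categories (not from the paper's dataset).
tickets = [
    "cannot connect to vpn from home",
    "vpn connection drops every few minutes",
    "wifi network is unreachable in the office",
    "printer is out of toner",
    "printer jams when printing double sided",
    "laptop screen is cracked",
    "password expired and account is locked",
    "need password reset for email account",
    "account locked after failed login attempts",
]
labels = [
    "network", "network", "network",
    "hardware", "hardware", "hardware",
    "access", "access", "access",
]


def train_categorizer(texts, categories):
    """Fit a TF-IDF + linear SVM pipeline (assumed setup, for illustration)."""
    model = make_pipeline(TfidfVectorizer(), LinearSVC())
    model.fit(texts, categories)
    return model


model = train_categorizer(tickets, labels)

# Route two unseen tickets to a predicted resolution group.
predicted = model.predict(["vpn connection is down", "printer will not print"])
```

    On real data, the reported accuracy would be estimated on held-out tickets rather than on the training set.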

    Market basket analysis in retail

    In collaboration with the Universitat de Barcelona (UB) and the Universitat Rovira i Virgili (URV). This Master's thesis report describes a full end-to-end data science project carried out at CleverData, a successful start-up specialized in machine learning techniques and analytics tools. Among its capabilities, it offers a wide variety of solutions to today's business needs across different domains. The project was carried out for one of its clients, an important retail company from Spain. It consists of analysing the customers' market basket; the main goal is to find which items are purchased together in the client's stores. Throughout the report, the reader will see how the project grows step by step, from the first step of defining objectives to the last one of delivering results. Moreover, the reader will see one of the most promising tools currently used for machine learning as a service, BigML. By the end, the reader will have a general idea of how data science projects are structured and how machine learning can be used to solve real problems in today's companies.
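
    The core question of market basket analysis, which items are purchased together, can be sketched by counting how often each item pair co-occurs across transactions. The example below is a minimal, generic pair-support count in plain Python; the thesis itself uses BigML, so the function `frequent_pairs`, the `min_support` threshold, and the toy baskets are assumptions for illustration only.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return {item pair: support} for pairs bought together often enough.

    Support is the fraction of baskets that contain both items.
    """
    n = len(baskets)
    pair_counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return {
        pair: count / n
        for pair, count in pair_counts.items()
        if count / n >= min_support
    }


# Hypothetical transactions (not from the thesis data).
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["beer", "bread"],
    ["butter", "milk"],
]

pairs = frequent_pairs(baskets, min_support=0.5)
```

    Counting pairs directly is only practical for small catalogues; association-rule algorithms such as Apriori prune the candidate item sets before counting.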

    Using Text Analytics to Derive Customer Service Management Benefits from Unstructured Data

    Deriving value from structured data is now commonplace. The value of unstructured textual data, however, remains mostly untapped and often unrecognized. This article describes the text analytics journeys of three organizations in the customer service management area. Based on their experiences, we provide four lessons that can guide other organizations as they embark on their text analytics journeys.

    Intelligent Data Mining Techniques for Automatic Service Management

    Today, as more and more industries enter the artificial intelligence era, business enterprises constantly explore innovative ways to expand their outreach and meet the high requirements of customers, with the purpose of gaining a competitive advantage in the marketplace. However, the success of a business relies heavily on its IT service. The value-creating activities of a business cannot be accomplished without solid and continuous delivery of IT services, especially in an increasingly intricate and specialized world. Driven by both the growing complexity of IT environments and rapidly changing business needs, service providers are urgently seeking intelligent data mining and machine learning techniques to build a cognitive "brain" in IT service management, capable of automatically understanding, reasoning about, and learning from operational data collected from human engineers and virtual engineers during IT service maintenance. The ultimate goal of IT service management optimization is to maximize the automation of routine IT procedures such as problem detection, determination, and resolution. However, fully automating the entire routine IT procedure without any human intervention is still a challenging task. In a real IT system, both step-wise resolution descriptions and scripted resolutions are often logged with their corresponding problematic incidents, and these typically contain abundant, valuable human domain knowledge. Hence, modeling, gathering, and utilizing the domain knowledge in IT system maintenance logs plays a crucial role in IT service management optimization.
To optimize IT service management from the perspective of intelligent data mining techniques, three research directions are identified as particularly helpful for automatic service management: (1) efficiently extracting and organizing the domain knowledge from IT system maintenance logs; (2) collecting and updating the existing domain knowledge online by interactively recommending possible resolutions; (3) automatically discovering the latent relations among scripted resolutions and intelligently suggesting proper scripted resolutions for IT problems. My dissertation addresses these challenges by designing and implementing a set of intelligent data-driven solutions, including (1) constructing a domain knowledge base for problem resolution inference; (2) recommending resolutions online in light of the explicit hierarchical resolution categories provided by domain experts; and (3) interactively recommending resolutions using the latent resolution relations learned through a collaborative filtering model.

    Building a Strong Undergraduate Research Culture in African Universities

    Africa had a late start in the race to set up universities with solid research fundamentals. According to Mamdani [5], the first colonial universities were few and far between: Makerere in East Africa, Ibadan and Legon in West Africa. This last place in the race, compared to other continents, has had tremendous implications for the continent's development plans. For Africa, the race has been difficult, from a late start to a seemingly insurmountable litany of problems that include difficulty in acquiring equipment, lack of capacity, limited research and development resources, and lack of investment in local universities. In fact, most of these universities are very recent; apart from a few, many have been in operation for less than 50 years. To reduce the labor costs incurred by shipping Europeans to Africa to do mere clerical jobs, the colonial masters started training "workshops", calling them technical or business colleges. According to Mamdani, colonial needs were to be met while avoiding the "Indian disease" in Africa, that is, the development of an educated middle class, a group most likely to carry the virus of nationalism. Upon independence, most of these "workshops" were turned into national "universities", but with no clear role in national development. These national "universities" catered for the children of the new African political elites. Through the seventies and eighties, most African universities were still without development agendas and were still doing business as usual. Meanwhile, governments strapped for money saw no need to put more scarce resources into big white elephants. By the mid-eighties, even the UN and IMF were calling for a limit on funding African universities. In today's African university, the traditional curiosity-driven research model has been replaced by a market-driven model dominated by a consultancy culture, according to Mamdani (Mamdani, Mail and Guardian Online).
The prevailing research culture, as intellectual life in universities, has been reduced to bare-bones classroom activity; seminars and workshops have migrated to hotels, and workshop attendance now comes with transport allowances and per diems (Mamdani, Mail and Guardian Online). There is a need to remedy this situation, and that is the focus of this paper.

    Predictive analysis of incidents based on software deployments

    Many information technology organizations face problems during and after deploying their services. Combined with the high number of services they provide daily, this makes the Incident Management (IM) process quite demanding. An effective IM system needs to enable decision-makers to detect problems easily; otherwise, organizations can face unscheduled system downtime and/or unplanned costs. This study demonstrates that it is possible to introduce a predictive process that may improve the response time to incidents and reduce the number of incidents created by deployments. By predicting these problems, decision-makers can better allocate resources and mitigate costs. This research therefore investigates whether machine learning algorithms, applied to the deployments made in recent years, can help predict the number of incidents of a given deployment. The results showed, with some confidence, that it is possible to predict whether or not a given deployment will have an incident in the future.

    Automatization of incident categorization

    To keep up with the growing number of incidents created in an organization, more resources have been needed to ensure that all incidents are managed. Incident Management comprises several activities, one of which is Incident Categorization. By combining Natural Language Processing and Text Mining techniques with Machine Learning algorithms, we propose to improve this activity, and with it the Incident Management Process. To that end, we propose replacing the manual Categorization sub-process inherent to the Incident Management Process with an automatic sub-process that requires no human interaction. The goal of this dissertation is to propose a solution that categorizes incidents correctly and automatically. We use real data provided by a company, which for privacy reasons will not be named in the dissertation. The datasets consist of correctly categorized incidents, which allows us to apply supervised learning algorithms. The expected output is a method that combines the Natural Language Processing techniques and classification algorithms that perform best on the data. Finally, the proposed method is compared with the current categorization to determine whether our proposal really improves the Incident Management Process and what advantages the automation brings.