80 research outputs found

    Airbnb customer satisfaction through online reviews

    With the development of, and improved access to, the Internet, mobile devices and social media, people began to post their opinions and reviews of products and services online. These comments influence new customers' buying decisions and enable companies to gain superior insight into their customers' experience and satisfaction. It has therefore become essential for companies to adopt methods capable of analyzing this information and extracting its value in order to better serve their customers' unmet needs. Tourism and hospitality was one of the areas most affected by this trend. For this reason, this study focuses on the reviews of an online platform, Airbnb, thereby also examining the technological disruption in that industry. This new method of home-sharing has gained more and more followers for its advantages and differences compared with conventional hotels, which has triggered increasing research interest. Airbnb's guest reviews describe each guest's experiences (the positive and negative aspects of their stay) and are studied through Text Mining: a set of methods capable of analyzing large amounts of unstructured information, such as Big Data, in order to better understand overall customer satisfaction, including the factors that influence it. Results show that guests value distinct dimensions, and that these differ across the studied areas of Sintra.
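A first step in the kind of review mining the abstract describes is comparing which terms dominate the reviews of each area. The sketch below is illustrative only: the reviews, areas, and stopword list are invented stand-ins for the Sintra data, not material from the study.

```python
from collections import Counter
import re

def top_terms(reviews, k=3, stopwords={"the", "was", "and", "a", "is", "of", "very"}):
    """Return the k most frequent content words across a set of reviews."""
    tokens = []
    for text in reviews:
        tokens += [w for w in re.findall(r"[a-z]+", text.lower()) if w not in stopwords]
    return [term for term, _ in Counter(tokens).most_common(k)]

# Hypothetical reviews grouped by area, standing in for the Airbnb data.
reviews_by_area = {
    "historic centre": ["The location was perfect", "Great location near the palace"],
    "coast": ["The beach view was stunning", "Lovely view of the beach"],
}
for area, reviews in reviews_by_area.items():
    print(area, top_terms(reviews))
```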

    Document Clustering as an approach to template extraction

    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. A great part of customer support is done via the exchange of emails. As the number of emails exchanged daily is constantly increasing, companies need to find approaches that ensure its efficiency. One common strategy is the usage of template emails as answers. These answer templates are usually found by a human agent through the repetitive usage of the same answer. In this work, we use a clustering approach to find these answer templates. Several clustering algorithms are researched, with a focus on the k-means methodology, as well as other clustering components such as similarity measures and pre-processing steps. As we are dealing with text data, several text representation methods are also compared. Due to the peculiarity of the provided data, we are able to design methodologies to ensure the feasibility of this task and develop strategies to extract the answer templates from the clustering results.
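The pipeline the abstract outlines, TF-IDF text representation, cosine similarity, and k-means, can be sketched compactly. This is a minimal illustration with invented toy emails and deterministic farthest-point seeding; the thesis's actual pre-processing and algorithm variants are not reproduced here.

```python
import math
import re
from collections import Counter

def tfidf(docs):
    """Bag-of-words TF-IDF vectors, one common text representation for clustering."""
    bags = [Counter(re.findall(r"[a-z]+", d.lower())) for d in docs]
    df = Counter(t for bag in bags for t in bag)  # document frequency per term
    n = len(docs)
    return [{t: tf * math.log(n / df[t]) for t, tf in bag.items()} for bag in bags]

def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def kmeans(vectors, k, iters=10):
    """Spherical k-means with farthest-point seeding (deterministic for this sketch)."""
    centroids = [vectors[0]]
    while len(centroids) < k:  # next seed: the vector least similar to current seeds
        centroids.append(min(vectors, key=lambda v: max(cosine(v, c) for c in centroids)))
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[max(range(k), key=lambda i: cosine(v, centroids[i]))].append(v)
        for i, members in enumerate(clusters):
            if members:  # new centroid: component-wise mean of the cluster's vectors
                total = Counter()
                for m in members:
                    total.update(m)
                centroids[i] = {t: w / len(members) for t, w in total.items()}
    return [max(range(k), key=lambda i: cosine(v, centroids[i])) for v in vectors]

# Hypothetical support emails; each cluster's emails would share an answer template.
emails = [
    "please reset my password",
    "password reset link expired",
    "invoice for last month billing",
    "question about my billing invoice",
]
labels = kmeans(tfidf(emails), k=2)
```

A representative email near each centroid could then serve as the candidate answer template for that cluster.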

    The Application of Deep Learning and Cloud Technologies to Data Science

    Machine Learning and Cloud Computing have become a staple for businesses and educational institutions in recent years. These two forefronts of big data solutions have technology giants racing for superior implementations of both. The objective of this thesis is to test and utilize AWS SageMaker in three different applications: time-series forecasting with sentiment analysis, automated Machine Learning (AutoML), and anomaly detection. The first study is a sentiment-based LSTM for stock price prediction. The LSTM was created with two methods, the first being SQL Server Data Tools and the second an implementation of LSTM using the Keras library. The results were evaluated using accuracy, precision, recall, F1 score, mean absolute error (MAE), root mean squared error (RMSE), and symmetric mean absolute percentage error (SMAPE). The sentiment models all outperformed the control LSTM; the public model for Facebook on SQL Server Data Tools performed the best overall, with 0.9743 accuracy and 0.9940 precision. The second study is an application of AWS SageMaker Autopilot, an AutoML platform designed to make Machine Learning more accessible to those without programming backgrounds. The methodology of this study follows the application of AWS Data Wrangler and Autopilot from the beginning of the process to completion. The results were evaluated using accuracy, precision, recall, and F1 score. The best accuracy was achieved by the LightGBM model on the AI4I Maintenance dataset, at 0.983; this model also scored best on precision, recall, and F1 score. The final study is an anomaly detection system for cyber security intrusion detection system data. Rule-based Intrusion Detection Systems are able to catch most of the cyber threats prevalent in network traffic; however, the copious amounts of alerts are nearly impossible for humans to keep up with. The methodology of this study follows a typical taxonomy of data collection, data processing, model creation, and model evaluation. Both Random Cut Forest and XGBoost are implemented using AWS SageMaker. The supervised XGBoost algorithm achieved the highest accuracy of all models, with Model 2 giving an accuracy of 0.6183, a precision of 0.5902, a recall of 0.9649, and an F1 score of 0.7324.
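The evaluation metrics reported throughout the abstract follow from the binary confusion matrix. A minimal reference implementation, with invented toy labels rather than the thesis's data:

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 score for binary labels."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, prec, rec, f1

# Toy example: 5 connections, 1 = intrusion, 0 = benign.
acc, prec, rec, f1 = classification_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```

The high recall / moderate precision pattern reported for XGBoost (0.9649 vs. 0.5902) means few intrusions are missed at the cost of many false alarms, which these formulas make explicit.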

    Improving intrusion detection systems using data mining techniques

    Recent surveys and studies have shown that cyber-attacks have caused a lot of damage to organisations, governments, and individuals around the world. Although developments are constantly occurring in the computer security field, cyber-attacks still cause damage as they are developed and evolved by hackers. This research looked at some industrial challenges in the intrusion detection area and identified two main ones. The first is that signature-based intrusion detection systems such as SNORT lack the capability of detecting attacks with new signatures without human intervention. The second concerns multi-stage attack detection, where signature-based detection has been found to be inefficient. The novelty of this research lies in methodologies developed to tackle these challenges. The first challenge was handled by developing a multi-layer classification methodology: the first layer is based on a decision tree, while the second is a hybrid module that uses two data mining techniques, neural networks and fuzzy logic, and tries to detect new attacks when the first layer fails. This system detects attacks with new signatures and then updates the SNORT signature holder automatically, without any human intervention. The results show a high detection rate for attacks with new signatures, although the false positive rate needs to be lowered. The second challenge was approached by evaluating IP information using fuzzy logic. This approach looks at the identity of participants in the traffic, rather than at its sequence and contents. The results show that this approach can help predict attacks at very early stages in some scenarios. However, combining it with an approach that does look at the sequence and contents of the traffic, such as event correlation, achieves better performance than either approach individually.
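Fuzzy evaluation of IP information, as described above, typically maps raw traffic features to membership degrees and combines them with fuzzy rules. The sketch below is purely illustrative; the features, thresholds, and rule are invented, not taken from the thesis.

```python
def ramp(x, low, high):
    """Membership degree for 'x is high': 0 below low, 1 above high, linear between."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)

def threat_degree(failed_logins, blacklist_score):
    # Illustrative fuzzy rule (thresholds invented for this sketch):
    # IF failed logins are "many" AND the source IP's blacklist score is "high"
    # THEN the connection is "likely hostile".
    many = ramp(failed_logins, 3, 10)
    high = ramp(blacklist_score, 0.4, 0.8)
    return min(many, high)  # fuzzy AND as the minimum t-norm
```

Because the inputs concern who is talking rather than what is said, such a score can fire before any attack payload appears in the traffic, which is the early-prediction property the abstract reports.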

    Player agency in interactive narrative: audience, actor & author

    The question motivating this review paper is, how can computer-based interactive narrative be used as a constructivist learning activity? The paper proposes that player agency can be used to link interactive narrative to learner agency in constructivist theory, and to classify approaches to interactive narrative. The traditional question driving research in interactive narrative is, 'how can an interactive narrative deal with a high degree of player agency, while maintaining a coherent and well-formed narrative?' This question derives from an Aristotelian approach to interactive narrative that, as the question shows, is inherently antagonistic to player agency. Within this approach, player agency must be restricted and manipulated to maintain the narrative. Two alternative approaches based on Brecht's Epic Theatre and Boal's Theatre of the Oppressed are reviewed. If a Boalian approach to interactive narrative is taken, the conflict between narrative and player agency dissolves. The question that emerges from this approach is quite different from the traditional question above, and presents a more useful approach to applying interactive narrative as a constructivist learning activity.

    An examination of the Asus WL-HDD 2.5 as a nepenthes malware collector

    The Linksys WRT54g has been used as a host for network forensics tools, such as Snort, for a long period of time. Whilst large corporations are already utilising network forensic tools, this paper demonstrates that it is quite feasible for a non-security specialist to track and capture malicious network traffic. This paper introduces the Asus Wireless Hard disk as a replacement for the popular Linksys WRT54g. Firstly, the Linksys router is introduced, detailing some of the research undertaken on the device over the years amongst the security community. The paper then briefly discusses malicious software and the impact this may have on a home user. It then outlines the trivial steps in setting up Nepenthes 0.1.7 (a malware collector) on the Asus WL-HDD 2.5 and tests the feasibility of running the malware collector on the selected device. The paper concludes by discussing the limitations of the device when attempting to execute Nepenthes.

    Everything on the Table: Tabular, Graphic, and Interactive Approaches for Interpreting and Presenting Monte Carlo Simulation Data

    Monte Carlo simulation studies (MCSS) form a cornerstone of quantitative methods research. They are frequently used to evaluate and compare the properties of statistical methods and to inform both future research and current best practices. However, the presentation of results from MCSS often leaves much to be desired, with findings typically conveyed via a series of elaborate tables from which readers are expected to derive meaning. The goal of this dissertation is to explore, summarize, and describe a framework for the presentation of MCSS, and to show how modern computing and visualization techniques improve their interpretability. Chapter One describes this problem by introducing the logic of MCSS, how they are conducted, what findings typically look like, and current practices for their presentation. Chapter Two demonstrates methods for improving the display of static tabular data, specifically via formatting, effects ordering, and rotation. Chapter Three delves into semi-graphic and graphical approaches for aiding the presentation of tabular data via shaded tables, and extensions to the tableplot and hypothesis-error plot frameworks. Chapter Four describes the use of interactive computing applets to aid the exploration of complex tabular data, and why this is an ideal approach. Throughout this work, emphasis is placed on how such techniques improve our understanding of a particular dataset or model, and claims are supported with applied demonstrations. The ideas from each chapter have been implemented in the R language for statistical computing and are available for adoption by other researchers in a dedicated package (SimDisplay). It is hoped that these ideas might enhance our understanding of how to best present MCSS findings and be drawn upon in both applied and academic environments.
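To make the MCSS workflow concrete, here is a tiny simulation study in the spirit the abstract describes (the dissertation itself works in R; this Python sketch, its contaminated-normal design, and the bias/RMSE table layout are illustrative assumptions, not the author's study):

```python
import random
import statistics

def run_mcss(n_reps=2000, n=30, seed=1):
    """Tiny illustrative MCSS: compare the mean and the median as estimators of
    the centre (0) of a 10%-contaminated normal distribution."""
    random.seed(seed)
    results = {"mean": [], "median": []}
    for _ in range(n_reps):
        # 10% of observations come from a heavier-tailed contaminating component.
        sample = [random.gauss(0, 5) if random.random() < 0.1 else random.gauss(0, 1)
                  for _ in range(n)]
        results["mean"].append(statistics.fmean(sample))
        results["median"].append(statistics.median(sample))
    return results

def summarise(results, true_value=0.0):
    """Condense raw replications into the bias / RMSE cells of a results table."""
    table = {}
    for estimator, estimates in results.items():
        bias = statistics.fmean(estimates) - true_value
        rmse = statistics.fmean((e - true_value) ** 2 for e in estimates) ** 0.5
        table[estimator] = {"bias": round(bias, 4), "rmse": round(rmse, 4)}
    return table

for estimator, cells in summarise(run_mcss()).items():
    print(f"{estimator:>6}  bias={cells['bias']:+.4f}  rmse={cells['rmse']:.4f}")
```

Even this two-row table illustrates the dissertation's point: once designs grow to dozens of conditions and estimators, formatting, ordering, and graphical display of such cells become essential.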

    Search beyond traditional probabilistic information retrieval

    This thesis focuses on search beyond probabilistic information retrieval. Three approaches are proposed beyond traditional probabilistic modelling. First, term association is deeply examined. Term association considers term dependency using a factor analysis based model, instead of treating each term independently. Latent factors, considered the same as the hidden variables of "eliteness" introduced by Robertson et al. to gain understanding of the relation among term occurrences and relevance, are measured by the dependencies and occurrences of term sequences and subsequences. Second, an entity-based ranking approach is proposed in an entity system named "EntityCube", which has been released by Microsoft for public use. A summarization page is given to summarize the entity information over multiple documents, so that the truly relevant entities can very possibly be found across multiple documents by integrating the local relevance contributed by proximity and the global enhancement from a topic model. Third, multi-source fusion sets up a meta-search engine to combine the "knowledge" from different sources. Meta-features, distilled as high-level categories, are deployed to diversify the baselines. Three modified fusion methods are employed: reciprocal, CombMNZ, and CombSUM, with three expanded versions. Through extensive experiments on the standard large-scale TREC Genomics data sets, the TREC HARD data sets and the Microsoft EntityCube Web collections, the proposed extended models beyond probabilistic information retrieval show their effectiveness and superiority.
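The classic fusion methods named above have simple definitions, shown here in base form (the thesis's modified and expanded versions are not reproduced; the score dictionaries and the k=60 constant in the reciprocal variant are illustrative conventions):

```python
def comb_sum(ranked_lists):
    """CombSUM: a document's fused score is the sum of its scores across systems."""
    fused = {}
    for scores in ranked_lists:
        for doc, s in scores.items():
            fused[doc] = fused.get(doc, 0.0) + s
    return fused

def comb_mnz(ranked_lists):
    """CombMNZ: CombSUM multiplied by the number of systems retrieving the doc."""
    hits = {}
    for scores in ranked_lists:
        for doc in scores:
            hits[doc] = hits.get(doc, 0) + 1
    return {doc: s * hits[doc] for doc, s in comb_sum(ranked_lists).items()}

def reciprocal_rank_fusion(rankings, k=60):
    """Reciprocal fusion: sum 1/(k + rank) over each system's ranked list."""
    fused = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            fused[doc] = fused.get(doc, 0.0) + 1.0 / (k + rank)
    return fused

# Two hypothetical retrieval systems scoring the same query.
system_a = {"d1": 0.9, "d2": 0.4}
system_b = {"d2": 0.8, "d3": 0.5}
print(comb_mnz([system_a, system_b]))  # d2 is boosted for appearing in both
```

Note the design contrast: CombSUM and CombMNZ fuse raw scores (and so assume comparable score scales across systems), while reciprocal fusion uses only ranks, sidestepping normalisation.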

    Recommender systems in industrial contexts

    This thesis consists of four parts: - An analysis of the core functions of, and the prerequisites for, recommender systems in an industrial context: we identify four core functions for recommendation systems: Help to Decide, Help to Compare, Help to Explore, Help to Discover. The implementation of these functions has implications for the choices at the heart of algorithmic recommender systems. - A state of the art, which deals with the main techniques used in automated recommendation systems: the two most commonly used algorithmic methods, the K-Nearest-Neighbor (KNN) methods and the fast factorization methods, are detailed. The state of the art also presents purely content-based methods, hybridization techniques, and the classical performance metrics used to evaluate recommender systems. It then gives an overview of several systems, both from academia and industry (Amazon, Google ...). - An analysis of the performance and implications of a recommendation system developed during this thesis: this system, Reperio, is a hybrid recommender engine using KNN methods. We study the performance of the KNN methods, including the impact of the similarity functions used, and then the performance of the KNN method in critical use cases in cold-start situations. - A methodology for analyzing the performance of recommender systems in an industrial context: this methodology assesses the added value of algorithmic strategies and recommendation systems according to their core functions.
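The KNN recommendation approach at the core of systems like the one described can be sketched as follows; this is a generic user-based KNN with cosine similarity on invented ratings, not Reperio's actual hybrid implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse rating vectors (item -> rating dicts)."""
    shared = set(u) & set(v)
    dot = sum(u[i] * v[i] for i in shared)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_recommend(target, others, k=2, top_n=2):
    """User-based KNN: score unseen items by similarity-weighted neighbour ratings."""
    neighbours = sorted(others.items(),
                        key=lambda kv: cosine(target, kv[1]), reverse=True)[:k]
    scores = {}
    for _, ratings in neighbours:
        w = cosine(target, ratings)
        for item, r in ratings.items():
            if item not in target:  # only recommend items the user hasn't rated
                scores[item] = scores.get(item, 0.0) + w * r
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical ratings on a 1-5 scale.
target = {"matrix": 5, "alien": 4}
others = {
    "u1": {"matrix": 5, "alien": 5, "blade_runner": 5},
    "u2": {"titanic": 5, "notebook": 4},
}
print(knn_recommend(target, others, k=1))
```

The choice of similarity function studied in the thesis slots in where `cosine` is used; swapping in Pearson correlation or another measure changes only that one component. The cold-start problem is also visible here: a target user with no ratings has zero similarity to everyone, so no neighbours can be found.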

    Empirical studies on word representations

