Recommender systems in industrial contexts
This thesis consists of four parts:
- An analysis of the core functions of, and the prerequisites for, recommender
systems in an industrial context: we identify four core functions for
recommender systems: Help to Decide, Help to Compare, Help to Explore, Help to
Discover. The implementation of these functions has implications for the
choices at the heart of algorithmic recommender systems.
- A state of the art, which deals with the main techniques used in automated
recommendation systems: the two most commonly used algorithmic methods, the
K-Nearest-Neighbor (KNN) methods and the fast factorization methods, are
detailed. The state of the art also presents purely content-based methods,
hybridization techniques, and the classical performance metrics used to
evaluate recommender systems. It then gives an overview of several systems,
both from academia and industry (Amazon, Google ...).
- An analysis of the performance and implications of a recommendation system
developed during this thesis: this system, Reperio, is a hybrid recommender
engine using KNN methods. We study the performance of the KNN methods,
including the impact of the similarity functions used. Then we study the
performance of the KNN method in critical use cases in cold-start situations.
- A methodology for analyzing the performance of recommender systems in an
industrial context: this methodology assesses the added value of algorithmic
strategies and recommendation systems according to their core functions.
Comment: version 3.30, May 201
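As a rough illustration of the KNN approach and the cold-start case mentioned in this abstract, the following is a minimal item-based sketch. It is not Reperio's implementation: the function names, the toy `ratings` data, the choice of cosine similarity, and the similarity-weighted average are all assumptions made for the example.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse rating vectors (dicts user -> rating)."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[k] * v[k] for k in common)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict(ratings, user, item, k=2, similarity=cosine):
    """Predict a rating as the similarity-weighted mean over the k nearest items.

    `ratings` maps item -> {user: rating}. The similarity function is a
    parameter so that different functions can be swapped in and compared.
    Returns None when no neighbour is available (a cold-start situation).
    """
    target = ratings[item]
    neighbours = []
    for other, vector in ratings.items():
        if other == item or user not in vector:
            continue
        score = similarity(target, vector)
        if score > 0:
            neighbours.append((score, vector[user]))
    neighbours.sort(reverse=True)
    top = neighbours[:k]
    if not top:
        return None
    return sum(s * r for s, r in top) / sum(s for s, _ in top)


# Hypothetical toy data: item "D" is new and has no ratings yet.
ratings = {
    "A": {"u1": 5, "u2": 4, "u3": 1},
    "B": {"u1": 4, "u2": 5, "u3": 2, "u4": 5},
    "C": {"u1": 1, "u3": 5, "u4": 2},
    "D": {},
}

print(predict(ratings, "u4", "A"))  # weighted towards u4's rating of item B
print(predict(ratings, "u4", "D"))  # None: no neighbours for a brand-new item
```

In this sketch a cold-start item yields no prediction at all, which is one reason a hybrid engine might fall back on content-based information in that situation.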
Matching Possible Mitigations to Cyber Threats: A Document-Driven Decision Support Systems Approach
Cyber systems are ubiquitous in all aspects of society. At the same time, breaches to cyber systems continue to be front-page news (Calfas, 2018; Equifax, 2017) and, despite more than a decade of heightened focus on cybersecurity, the threat continues to evolve and grow, costing globally up to $575 billion annually (Center for Strategic and International Studies, 2014; Gosler & Von Thaer, 2013; Microsoft, 2016; Verizon, 2017). To address possible impacts due to cyber threats, information system (IS) stakeholders must assess the risks they face. Following a risk assessment, the next step is to determine mitigations to counter the threats that pose unacceptably high risks. The literature contains a robust collection of studies on optimizing mitigation selections, but they universally assume that the starting list of appropriate mitigations for specific threats exists from which to down-select. In current practice, producing this starting list is largely a manual process and it is challenging because it requires detailed cybersecurity knowledge from highly decentralized sources, is often deeply technical in nature, and is primarily described in textual form, leading to dependence on human experts to interpret the knowledge for each specific context. At the same time cybersecurity experts remain in short supply relative to the demand, while the delta between supply and demand continues to grow (Center for Cyber Safety and Education, 2017; Kauflin, 2017; Libicki, Senty, & Pollak, 2014). Thus, an approach is needed to help cybersecurity experts (CSE) cut through the volume of available mitigations to select those which are potentially viable to offset specific threats.
This dissertation explores the application of machine learning and text retrieval techniques to automate matching of relevant mitigations to cyber threats, where both are expressed as unstructured or semi-structured English language text. Using the Design Science Research Methodology (Hevner & March, 2004; Peffers, Tuunanen, Rothenberger, & Chatterjee, 2007), we consider a number of possible designs for the matcher, ultimately selecting a supervised machine learning approach that combines two techniques: support vector machine classification and latent semantic analysis. The selected approach demonstrates high recall for mitigation documents in the relevant class, bolstering confidence that potentially viable mitigations will not be overlooked. It also has a strong ability to discern documents in the non-relevant class, allowing approximately 97% of non-relevant mitigations to be excluded automatically, greatly reducing the CSE’s workload over purely manual matching. A false positive rate of up to 3% prevents totally automated mitigation selection and requires the CSE to reject a few false positives.
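To make the reported quantities concrete, the sketch below computes recall, false positive rate, and the share of non-relevant documents excluded from a confusion matrix. The function name and the counts are hypothetical, chosen only so that the false positive rate comes out at 3%; they are not results from the dissertation.

```python
def screening_metrics(tp, fp, tn, fn):
    """Summarise a relevant/non-relevant classifier from confusion-matrix counts.

    tp: relevant mitigations correctly kept
    fn: relevant mitigations wrongly discarded
    fp: non-relevant mitigations wrongly kept (the expert must reject these)
    tn: non-relevant mitigations correctly excluded
    """
    relevant = tp + fn
    non_relevant = fp + tn
    return {
        "recall": tp / relevant if relevant else 0.0,
        "false_positive_rate": fp / non_relevant if non_relevant else 0.0,
        "excluded_share": tn / non_relevant if non_relevant else 0.0,
    }


# Hypothetical counts: 1,000 non-relevant documents, 30 of them kept by mistake.
metrics = screening_metrics(tp=48, fp=30, tn=970, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

The false positive rate and the excluded share always sum to one, which is why excluding roughly 97% of non-relevant mitigations corresponds to a false positive rate of about 3%.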
This research contributes to theory a method for automatically mapping mitigations to threats when both are expressed as English language text documents. This artifact represents a novel machine learning approach to threat-mitigation mapping. The research also contributes an instantiation of the artifact for demonstration and evaluation. From a practical perspective, the artifact benefits all threat-informed cyber risk assessment approaches, whether formal or ad hoc, by aiding decision-making for cybersecurity experts whose job it is to mitigate the identified cyber threats. In addition, an automated approach makes mitigation selection more repeatable, facilitates knowledge reuse, extends the reach of cybersecurity experts, and is extensible to accommodate the continued evolution of both cyber threats and mitigations. Moreover, the selection of mitigations applicable to each threat can serve as an input into multifactor analyses of alternatives, both automated and manual, thereby bridging the gap between cyber risk assessment and final mitigation selection.
A usability approach to improving the user experience in web directories
PhD
Web directories are hierarchically organised website collections that offer users subject-based
access to the Web. They played a significant part in navigating the Web in the past
but their role has been weakened in recent years due to their cumbersome expanding
collections. This thesis presents a unified framework combining the advantages of
personalisation and redefined directory search for improving the usability of Web
directories.
The thesis begins with an examination of classification schemes that identifies the
rigidity of hierarchical classifications and their suitability for Web directories in contrast
to faceted classifications. This leads on to an Ontological Sketch Modelling (OSM) case
study which identifies the misfits affecting user navigation in Web directories from
known rigidity issues. The thesis continues with a review of personalisation techniques
and a discussion of the user search model of Web directories following the suggested
directions of improvement from the case study. A proposed user-centred framework to
improve the usability of Web directories which consists of an individual content-based
personalisation model and a redefined search model is then implemented as D-Persona
and D-Search respectively. The remainder of the thesis is concerned with a usability test
of D-Persona and D-Search aimed at discovering the efficiency, effectiveness and user
satisfaction of the solution. This involves an experimental design, test results and
discussions for the comparative user study.
First, this thesis extracts a formal definition of the rigidity of hierarchies from
their characteristics and justifies why hierarchies are still better suited than facets
in organising Web directories. Second, it identifies misfits causing poor usability in
Web directories based on the discovered rigidity of hierarchies. Third, it proposes a
solution to tackle the misfits and improve the usability of Web directories, which has
been experimentally shown to be successful.
A Usability Approach to Improving the User Experience in Web Directories
Submitted for the degree of Doctor of Philosophy, Queen Mary, University of London
Intelligent Information Systems for Web Product Search
Over the last few years, we have experienced an increase in online shopping. Consequently, there is a need for efficient and effective product search engines. The rapid growth of e-commerce, however, has also introduced some challenges. Studies show that users can get overwhelmed by the information and offerings presented online while searching for products. In an attempt to lighten this information overload burden on consumers, several product search engines aggregate product descriptions and price information from the Web and allow the user to easily query this information. Most of these search engines expect to receive the data from the participating Web shops in a specific format, which means Web shops need to transform their data more than once, as each product search engine requires a different format. Because most product information aggregation services currently rely on Web shops to send them their data, there is a big opportunity for solutions that aim to tackle this problem using a more automated approach. This dissertation addresses key aspects of implementing such a system, including hierarchical product classification, entity resolution, ontology population and schema mapping, and lastly, the optimization of faceted user interfaces. The findings of this work show how one can design Web product search engines that automatically aggregate product information while allowing users to perform effective and efficient queries.
Semantic discovery and reuse of business process patterns
Patterns currently play an important role in modern information systems (IS) development, but their use has mainly been restricted to the design and implementation phases of the development lifecycle. Given the increasing significance of business modelling in IS development, patterns have the potential to provide a viable solution for promoting reusability of recurrent generalized models in the very early stages of development. As a statement of research-in-progress, this paper focuses on business process patterns and proposes an initial methodological framework for the discovery and reuse of business process patterns within the IS development lifecycle. The framework borrows ideas from the domain engineering literature and proposes the use of semantics to drive both the discovery of patterns and their reuse.