
    Rank-based linkage I: triplet comparisons and oriented simplicial complexes

    Rank-based linkage is a new tool for summarizing a collection $S$ of objects according to their relationships. These objects are not mapped to vectors, and ``similarity'' between objects need be neither numerical nor symmetrical. All an object needs to do is rank nearby objects by similarity to itself, using a Comparator which is transitive, but need not be consistent with any metric on the whole set. Call this a ranking system on $S$. Rank-based linkage is applied to the $K$-nearest neighbor digraph derived from a ranking system. Computations occur on a 2-dimensional abstract oriented simplicial complex whose faces are among the points, edges, and triangles of the line graph of the undirected $K$-nearest neighbor graph on $S$. In $|S| K^2$ steps it builds an edge-weighted linkage graph $(S, \mathcal{L}, \sigma)$ where $\sigma(\{x, y\})$ is called the in-sway between objects $x$ and $y$. Take $\mathcal{L}_t$ to be the links whose in-sway is at least $t$, and partition $S$ into components of the graph $(S, \mathcal{L}_t)$, for varying $t$. Rank-based linkage is a functor from a category of out-ordered digraphs to a category of partitioned sets, with the practical consequence that augmenting the set of objects in a rank-respectful way gives a fresh clustering which does not ``rip apart'' the previous one. The same holds for single linkage clustering in the metric space context, but not for typical optimization-based methods. Open combinatorial problems are presented in the last section. Comment: 37 pages, 12 figures
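The thresholding step in the abstract above (keep the links whose in-sway is at least t, then take connected components) can be illustrated with a small sketch. This is only an illustration under stated assumptions: the function name `components_at_threshold`, the dictionary layout, and the toy in-sway values are hypothetical, and the computation of the in-sway itself is not shown because the abstract does not define it.

```python
from collections import defaultdict


def components_at_threshold(objects, in_sway, t):
    """Partition `objects` into connected components of the graph that
    keeps only the links whose in-sway is at least `t`.

    `in_sway` maps an unordered pair, stored as a tuple (x, y), to a weight.
    This layout is an assumption made for the sketch.
    """
    parent = {x: x for x in objects}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (x, y), sway in in_sway.items():
        if sway >= t:  # the link survives the threshold
            root_x, root_y = find(x), find(y)
            if root_x != root_y:
                parent[root_x] = root_y

    groups = defaultdict(set)
    for x in objects:
        groups[find(x)].add(x)
    # Sort for a deterministic, easy-to-compare result.
    return sorted(sorted(group) for group in groups.values())


# Hypothetical toy data: four objects and three weighted links.
objects = ["a", "b", "c", "d"]
in_sway = {("a", "b"): 5, ("b", "c"): 2, ("c", "d"): 4}

for t in (1, 3, 6):
    print(t, components_at_threshold(objects, in_sway, t))
```

Raising t can only remove links, so the partition at a higher threshold refines the partition at a lower one. That nesting is why varying t yields a hierarchy of clusterings.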

    ENABLING EFFICIENT FLEET COMPOSITION SELECTION THROUGH THE DEVELOPMENT OF A RANK HEURISTIC FOR A BRANCH AND BOUND METHOD

    In the foreseeable future, autonomous mobile robots (AMRs) will become a key enabler for increasing productivity and flexibility in material handling in warehousing facilities, distribution centers and manufacturing systems. The objective of this research is to develop and validate parametric models of AMRs, develop a ranking heuristic using a physics-based algorithm within the framework of the Branch and Bound method, integrate the ranking algorithm into a Fleet Composition Optimization (FCO) tool, and finally conduct simulations under various scenarios to verify the suitability and robustness of the developed tool in a factory equipped with AMRs. Kinematic-based equations are used for computing both energy and time consumption. Multivariate linear regression, a data-driven method, is used for designing the ranking heuristic. The results indicate that the unique physical structures and parameters of each robot are the main factors contributing to differences in energy and time consumption. Comparing heuristic-based search with non-heuristic-based search showed a reduction in computation time. This research is expected to significantly improve the current nested fleet composition optimization tool by reducing computation time without sacrificing optimality. From a practical perspective, greater efficiency in reducing energy and time costs can be achieved. Ford Motor Company. No embargo. Academic Major: Aerospace Engineering

    A Benchmark Framework for Data Compression Techniques

    Lightweight data compression is frequently applied in main memory database systems to improve query performance. The data processed by such systems is highly diverse. Moreover, there are many existing lightweight compression techniques. Therefore, choosing the optimal technique for a given dataset is non-trivial. Existing approaches are based on simple rules, which do not suffice for such a complex decision. In contrast, our vision is a cost-based approach. However, this requires a detailed cost model, which can only be obtained from a systematic benchmarking of many compression algorithms on many different datasets. A naïve benchmark evaluates every algorithm under consideration separately. This yields many redundant steps and is thus inefficient. We propose an efficient and extensible benchmark framework for compression techniques. Given an ensemble of algorithms, it minimizes the overall run time of the evaluation. We experimentally show that our approach outperforms the naïve approach.

    Discovering the hidden structure of financial markets through Bayesian modelling

    Understanding what is driving the price of a financial asset is a question that is currently mostly unanswered. In this work we go beyond the classic one-step-ahead prediction and instead construct models that create new information on the behaviour of these time series. Our aim is to get a better understanding of the hidden structures that drive the moves of each financial time series and thus the market as a whole. We propose a tool to decompose multiple time series into economically meaningful variables to explain the endogenous and exogenous factors driving their underlying variability. The methodology we introduce goes beyond the direct model forecast. Indeed, since our model continuously adapts its variables and coefficients, we can study the time series of coefficients and selected variables. We also present a model to construct the causal graph of relations between these time series and include them in the exogenous factors. Hence, we obtain a model able to explain what is driving the move of both each specific time series and the market as a whole. In addition, the obtained graph of the time series provides new information on the underlying risk structure of this environment. With this deeper understanding of the hidden structure we propose novel ways to detect and forecast risks in the market. We investigate our results with inferences up to one month into the future using stocks, FX futures and ETF futures, demonstrating the model's superior performance in terms of accuracy on large moves, longer-term prediction and consistency over time. We also go into more detail on the economic interpretation of the new variables and discuss the created graph structure of the market. Open Access

    An Improved eXplainable Point Cloud Classifier (XPCC)

    Classification of objects from 3D point clouds has become an increasingly relevant task across many computer vision applications. However, few studies have investigated explainable methods. In this paper, a new prototype-based and explainable classification method called eXplainable Point Cloud Classifier (XPCC) is proposed. The XPCC method offers several advantages over previous explainable and non-explainable methods. First, the XPCC method uses local densities and global multivariate generative distributions. Therefore, the XPCC provides comprehensive and interpretable object-based classification. Furthermore, the proposed method is built on recursive calculations and is thus computationally very efficient. Second, the model learns continuously without the need for complete re-training and is domain transferable. Third, the proposed XPCC expands on the underlying learning method, xDNN, and is specific to 3D. As such, three new layers are added to the original xDNN architecture: i) the 3D point cloud feature extraction, ii) the global compound prototype weighting, and iii) the SoftMax function. Experiments were performed with the ModelNet40 benchmark, which demonstrated that XPCC is the only explainable point cloud classifier to increase classification accuracy relative to the base algorithm when applied to the same problem. Additionally, this paper proposes a novel prototype-based visual representation that provides model- and object-based explanations. The prototype objects are superimposed to create a prototypical class representation of their data density within the feature space, called the Compound Prototype Cloud. They allow a user to visualize the explainable aspects of the model and identify object regions that contribute to the classification in a human-understandable way.

    Targeting Fusion Proteins of HIV-1 and SARS-CoV-2

    Viruses are disease-causing pathogenic agents that require host cells to replicate. Fusion of host and viral membranes is critical for the lifecycle of enveloped viruses. Studying viral fusion proteins can allow us to better understand how they shape immune responses and inform the design of therapeutics such as drugs, monoclonal antibodies, and vaccines. This thesis discusses two approaches to targeting two fusion proteins: Env from HIV-1 and S from SARS-CoV-2. The first chapter of this thesis is an introduction to viruses with a specific focus on HIV-1 CD4 mimetic drugs and antibodies against SARS-CoV-2. It discusses the architecture of these viruses and fusion proteins and how small molecules, peptides, and antibodies can target these proteins successfully to treat and prevent disease. In addition, a brief overview is included of the techniques involved in structural biology and how it has informed the study of viruses. For the interested reader, chapter 2 contains a review article that serves as a more in-depth introduction for both viruses as well as how the use of structural biology has informed the study of viral surface proteins and neutralizing antibody responses to them. The subsequent chapters provide a body of work divided into two parts. The first part in chapter 3 involves a study on conformational changes induced in the HIV-1 Env protein by CD4-mimetic drugs using single particle cryo-EM. The second part encompassing chapters 4 and 5 includes two studies on antibodies isolated from convalescent COVID-19 donors. The former involves classification of antibody responses to the SARS-CoV-2 S receptor-binding domain (RBD). The latter discusses an anti-RBD antibody class that binds to a conserved epitope on the RBD and shows cross-binding and cross-neutralization to other coronaviruses in the sarbecovirus subgenus.

    Data-to-text generation with neural planning

    In this thesis, we consider the task of data-to-text generation, which takes non-linguistic structures as input and produces textual output. The inputs can take the form of database tables, spreadsheets, charts, and so on. The main application of data-to-text generation is to present information in a textual format which makes it accessible to a layperson who may otherwise find it problematic to understand numerical figures. The task can also automate routine document generation jobs, thus improving human efficiency. We focus on generating long-form text, i.e., documents with multiple paragraphs. Recent approaches to data-to-text generation have adopted the very successful encoder-decoder architecture or its variants. These models generate fluent (but often imprecise) text and perform quite poorly at selecting appropriate content and ordering it coherently. This thesis focuses on overcoming these issues by integrating content planning with neural models. We hypothesize data-to-text generation will benefit from explicit planning, which manifests itself in (a) micro planning, (b) latent entity planning, and (c) macro planning. Throughout this thesis, we assume the inputs to our generator are tables (with records) in the sports domain, and the outputs are summaries describing what happened in the game (e.g., who won/lost, ..., scored, etc.). We first describe our work on integrating fine-grained or micro plans with data-to-text generation. As part of this, we generate a micro plan highlighting which records should be mentioned and in which order, and then generate the document while taking the micro plan into account. We then show how data-to-text generation can benefit from higher level latent entity planning. Here, we make use of entity-specific representations which are dynamically updated. The text is generated conditioned on entity representations and the records corresponding to the entities by using hierarchical attention at each time step. We then combine planning with the high level organization of entities, events, and their interactions. Such coarse-grained macro plans are learnt from data and given as input to the generator. Finally, we present work on making macro plans latent while incrementally generating a document paragraph by paragraph. We infer latent plans sequentially with a structured variational model while interleaving the steps of planning and generation. Text is generated by conditioning on previous variational decisions and previously generated text. Overall our results show that planning makes data-to-text generation more interpretable, improves the factuality and coherence of the generated documents and reduces redundancy in the output document.

    Innovative Hybrid Approaches for Vehicle Routing Problems

    Get PDF
    This thesis deals with the efficient resolution of Vehicle Routing Problems (VRPs). The first chapter addresses the archetype of all VRPs: the Capacitated Vehicle Routing Problem (CVRP). Despite having been introduced more than 60 years ago, it remains an extremely challenging problem. In this chapter I design a Fast Iterated-Local-Search Localized Optimization algorithm for the CVRP, shortened to FILO. The simplicity of the CVRP definition allowed me to experiment with advanced local search acceleration and pruning techniques that eventually became the core optimization engine of FILO. FILO is experimentally shown to be extremely scalable and able to solve very large scale instances of the CVRP in a fraction of the computing time required by existing state-of-the-art methods, while still obtaining solutions of competitive quality. The second chapter deals with an extension of the CVRP called the Extended Single Truck and Trailer Vehicle Routing Problem, or simply XSTTRP. The XSTTRP models a broad class of VRPs in which a single vehicle, composed of a truck and a detachable trailer, has to serve a set of customers with accessibility constraints that make some of them unreachable by the entire vehicle. This problem moves towards VRPs including more realistic constraints and it models scenarios such as parcel deliveries in crowded city centers or rural areas, where maneuvering a large vehicle is forbidden or dangerous. The XSTTRP generalizes several well known VRPs such as the Multiple Depot VRP and the Location Routing Problem. For its solution I developed a hybrid metaheuristic which combines a fast heuristic optimization with a polishing phase based on the resolution of a limited set partitioning problem. Finally, the thesis includes a final chapter aimed at guiding the computational evaluation of new approaches to VRPs proposed by the machine learning community.

    Investigating and mitigating the role of neutralisation techniques on information security policies violation in healthcare organisations

    Get PDF
    Healthcare organisations today rely heavily on Electronic Medical Records systems (EMRs), which have become highly crucial IT assets that require significant security efforts to safeguard patients’ information. Individuals who have legitimate access to an organisation’s assets to perform their day-to-day duties but intentionally or unintentionally violate information security policies can jeopardise their organisation’s information security efforts and cause significant legal and financial losses. In the information security (InfoSec) literature, several studies emphasised the necessity to understand why employees behave in ways that contradict information security requirements but have offered widely different solutions. In an effort to respond to this situation, this thesis addressed the gap in the information security academic research by providing a deep understanding of the problem of medical practitioners’ behavioural justifications to violate information security policies and then determining proper solutions to reduce this undesirable behaviour. Neutralisation theory was used as the theoretical basis for the research. This thesis adopted a mixed-method research approach that comprises four consecutive phases, and each phase represents a research study that was conducted in light of the results from the preceding phase. The first phase of the thesis started by investigating the relationship between medical practitioners’ neutralisation techniques and their intention to violate information security policies that protect a patient’s privacy. A quantitative study was conducted to extend the work of Siponen and Vance [1] through a study of the Saudi Arabia healthcare industry. The data was collected via an online questionnaire from 66 Medical Interns (MIs) working in four academic hospitals. 
The study found that six neutralisation techniques, namely (1) appeal to higher loyalties, (2) defence of necessity, (3) the metaphor of ledger, (4) denial of responsibility, (5) denial of injury, and (6) condemnation of condemners, significantly contribute to the justifications of the MIs in hypothetically violating information security policies. The second phase of this research used a series of semi-structured interviews with IT security professionals in one of the largest academic hospitals in Saudi Arabia to explore the environmental factors that motivated the medical practitioners to evoke various neutralisation techniques. The results revealed that social, organisational, and emotional factors all stimulated the behavioural justifications to breach information security policies. During these interviews, it became clear that the IT department needed to ensure that security policies fit the daily tasks of the medical practitioners by providing alternative solutions to ensure the effectiveness of those policies. Based on these interviews, the objective of the following two phases was to improve the effectiveness of InfoSec policies against the use of behavioural justification by engaging the end users in the modification of existing policies via a collaborative writing process. Those two phases were conducted in the UK and Saudi Arabia to determine whether the collaborative writing process could produce a more effective security policy that balanced the security requirements with daily business needs, thus leading to a reduction in the use of neutralisation techniques to violate security policies. The overall result confirmed that the involvement of the end users via a collaborative writing process improved the effectiveness of the security policy in mitigating individual behavioural justifications, showing that the process is a promising one to enhance security compliance.