
    Bridge Networking Research and Internet Standardization: Case Study on Mobile Traffic Offloading and IPv6 Transition Technologies

    The gap between networking research communities and Internet standards development organizations (SDOs) has been widening over the years, drawing attention from both academia and industry because of its detrimental impact. The reasons behind this widening gap are complex and typically extend beyond purely technical grounds. In this position paper we share our perspective on the challenge, based on hands-on experience from joint projects with universities and companies. We highlight lessons learned, covering both successful and under-performing cases, and suggest viable ways to bridge the gap between networking research and Internet standardization, aiming to promote and maximize the outcome of such collaborative endeavours.

    Knowledge-based design support and inductive learning

    Designing and learning are closely related activities in that design, as an ill-structured problem, involves identifying the design problem as well as finding its solutions. A knowledge-based design support system should support learning by capturing and reusing design knowledge. This thesis addresses two fundamental problems in computational support of design activities: the development of an intelligent design support system architecture and the integration of inductive learning techniques into this architecture.

    This research is motivated by the belief that (1) the early stage of the design process can be modelled as an incremental learning process in which the structure of a design problem, or the product data model of an artefact, is developed using inductive learning techniques, and (2) the capability of a knowledge-based design support system can be enhanced by accumulating and storing reusable design product and process information.

    In order to incorporate inductive learning techniques into a knowledge-based design model and an integrated knowledge-based design support system architecture, the computational techniques for developing such an architecture and the role of inductive learning in AI-based design are investigated. This investigation provides the background for an incremental learning model for design, suited to a class of design tasks whose structures are not well known initially.

    This incremental learning model is used as the basis for a knowledge-based design support system architecture that can serve as a kernel for knowledge-based design applications. The architecture integrates a number of computational techniques to support the representation of and reasoning about design knowledge. In particular, it combines a blackboard control system with an assumption-based truth maintenance system in an object-oriented environment, supporting the exploration of multiple design solutions through the exploration and management of design contexts.

    As an integral part of this architecture, a design concept learning system utilising a number of unsupervised inductive learning techniques is developed. It combines concept formation techniques with design heuristics as background knowledge to build a design concept tree from raw data or past design examples. The design concept tree is then used as a conceptual structure for the exploration of new designs.

    The effectiveness of the architecture and the design concept learning system is demonstrated in a realistic design domain: the design of small-molecule drugs, one of whose key tasks is to identify a pharmacophore description (the structure of a design problem) from known molecule examples.

    The thesis first reviews knowledge-based design and inductive learning techniques. Based on this review, an incremental learning model and an integrated architecture for intelligent design support are presented. The implementation of this architecture and of a design concept learning system is then described, followed by their application to small-molecule drug design. Finally, the evaluation of the architecture and the learning system within and beyond this particular domain, and directions for future research, are discussed.
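    The abstract leaves the learning procedure itself to the thesis body; a minimal sketch of incremental, unsupervised concept formation over design examples might look like the following. All names, the matching score, and the threshold are illustrative assumptions, not the thesis's actual implementation.

```python
# Hypothetical sketch of incremental concept formation: each incoming design
# example either descends into the best-matching child concept or founds a
# new one, gradually growing a design concept tree.

class ConceptNode:
    def __init__(self):
        self.count = 0
        self.counts = {}            # attribute -> {value: frequency}
        self.children = []

    def update(self, example):
        self.count += 1
        for attr, val in example.items():
            self.counts.setdefault(attr, {})
            self.counts[attr][val] = self.counts[attr].get(val, 0) + 1

    def match(self, example):
        # Fraction of the example's attribute values already seen here.
        hits = sum(self.counts.get(a, {}).get(v, 0) > 0 for a, v in example.items())
        return hits / len(example)

def insert(node, example, threshold=0.5):
    """Incrementally grow the concept tree rooted at `node`."""
    node.update(example)
    if node.children:
        best = max(node.children, key=lambda c: c.match(example))
        if best.match(example) >= threshold:
            insert(best, example, threshold)
            return
    leaf = ConceptNode()
    leaf.update(example)
    node.children.append(leaf)

root = ConceptNode()
for ex in [{"ring": "aromatic", "donor": "yes"},
           {"ring": "aromatic", "donor": "no"},
           {"ring": "none", "donor": "yes"}]:
    insert(root, ex)
```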

    Motion Primitives and Planning for Robots with Closed Chain Systems and Changing Topologies

    When operating in human environments, a robot should use predictable motions that allow humans to trust and anticipate its behavior. Heuristic search-based planning offers predictable motions and guarantees on completeness and sub-optimality of solutions. While search-based planning on motion-primitive-based (lattice-based) graphs has been used extensively in navigation, its application to high-dimensional state spaces has, until recently, been thought impractical. This dissertation presents methods we have developed for applying these graphs to mobile manipulation, specifically for systems that contain closed chains. The formation of closed chains in tasks that involve contact with the environment may reduce the number of available degrees of freedom but adds complexity in the form of constraints in the high-dimensional state space. We exploit the dimensionality reduction inherent in closed kinematic chains to obtain efficient search-based planning. Our planner handles changing topologies (switching between open and closed chains) within a single plan, including which transitions to include and when to include them. We can thus leverage existing results for search-based planning for open chains, combining open- and closed-chain manipulation planning in one framework. Proofs are given for the framework's application to graph search and its theoretical guarantees of optimality. The dimensionality reduction is done in a manner that enables finding optimal solutions to low-dimensional problems that map to correspondingly optimal full-dimensional solutions. We apply this framework to planning for opening and navigating through non-spring and spring-loaded doors with a Willow Garage PR2. The framework also motivates our approaches for the Atlas humanoid robot from Boston Dynamics, for both stationary manipulation and quasi-static walking, as a closed chain is formed whenever both feet are on the ground.
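    As a rough illustration of heuristic search over a motion-primitive (lattice) graph, here is a toy A* sketch in a 2-D state space. The primitive set, cost function, and state representation are simplified assumptions; the dissertation's planner, including the closed-chain dimensionality reduction, is far richer.

```python
# Toy A* over states reachable by applying motion primitives (lattice graph).
import heapq

PRIMITIVES = [(1, 0), (0, 1), (-1, 0), (0, -1), (1, 1)]  # illustrative primitive set

def heuristic(s, goal):
    # Admissible Manhattan-distance heuristic for this toy cost model.
    return abs(s[0] - goal[0]) + abs(s[1] - goal[1])

def astar(start, goal, valid):
    """Best-first expansion of lattice states until the goal is reached."""
    open_set = [(heuristic(start, goal), 0, start, [start])]
    seen = set()
    while open_set:
        _, g, s, path = heapq.heappop(open_set)
        if s == goal:
            return path
        if s in seen:
            continue
        seen.add(s)
        for dx, dy in PRIMITIVES:
            nxt = (s[0] + dx, s[1] + dy)
            if valid(nxt) and nxt not in seen:
                ng = g + abs(dx) + abs(dy)
                heapq.heappush(open_set, (ng + heuristic(nxt, goal), ng, nxt, path + [nxt]))
    return None

path = astar((0, 0), (3, 2), valid=lambda s: 0 <= s[0] <= 5 and 0 <= s[1] <= 5)
print(path)
```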

    Clustering Approaches for Multi-source Entity Resolution

    Entity Resolution (ER), or deduplication, aims at identifying entities, such as specific customer or product descriptions, in one or several data sources that refer to the same real-world entity. ER is of key importance for improving data quality and has a crucial role in data integration and querying. The previous generation of ER approaches focused on integrating records from two relational databases or performing deduplication within a single database. Nevertheless, in the era of Big Data the number of available data sources is increasing rapidly, so large-scale data mining and querying systems need to integrate data obtained from numerous sources. For example, online digital libraries or e-shops incorporate publications or products from a large number of archives or suppliers across the world, or within a specified region or country, to provide a unified view for the user. This process requires data consolidation from numerous heterogeneous and mostly evolving data sources. As the number of sources grows, data heterogeneity and velocity as well as the variance in data quality increase. Multi-source ER, i.e. finding matching entities in an arbitrary number of sources, is therefore a challenging task. Previous efforts for matching and clustering entities across multiple sources (> 2) mostly treated all sources as a single source. This excludes the use of metadata or provenance information for enhancing integration quality and leads to poor results because it ignores quality differences between sources. The conventional ER pipeline consists of blocking, pair-wise matching of entities, and classification. To meet the new requirements, holistic clustering approaches are needed that scale to many data sources and overcome the restriction of pairwise linking by grouping entities from multiple sources into clusters. The clustering step aims at removing false links while adding missing true links across sources. Additionally, incremental clustering and repairing approaches are needed to cope with the ever-increasing number of sources and newly arriving entities. To this end, we developed novel clustering and repairing schemes for multi-source entity resolution. The approaches can group entities from multiple clean (duplicate-free) sources as well as handle data from an arbitrary combination of clean and dirty sources. The clustering schemes developed specifically for multi-source ER obtain superior results compared to general-purpose clustering algorithms. We further developed incremental clustering and repairing methods to handle evolving sources; they can incorporate new sources as well as new entities from existing sources. The more sophisticated approach can repair previously determined clusters, yielding improved quality and a reduced dependency on the insertion order of new entities. To ensure scalability, parallel variants of all approaches are implemented on top of Apache Flink, a distributed processing engine. The proposed methods have been integrated in a new end-to-end ER tool named FAMER (FAst Multi-source Entity Resolution system), comprising Linking and Clustering components that support both batch and incremental ER. The output of the Linking component is a similarity graph in which each vertex represents an entity and each edge carries the similarity between two entities; this graph is the input of the Clustering component. Comprehensive comparative evaluations show that the proposed clustering and repairing approaches for both batch and incremental ER achieve high quality while maintaining scalability.
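    To make the clean-source idea concrete, here is a hedged sketch of clustering over a similarity graph that never groups two entities from the same duplicate-free source. The function names and data layout are illustrative assumptions; FAMER's actual schemes are more elaborate and run distributed on Apache Flink.

```python
# Source-aware greedy clustering over a similarity graph: merge clusters in
# descending similarity order, but only if their source sets are disjoint
# (each clean source contributes at most one entity per cluster).

def cluster(entities, edges):
    """entities: {id: source}; edges: [(id1, id2, similarity)]."""
    cluster_of = {e: {e} for e in entities}
    sources_of = {e: {entities[e]} for e in entities}
    for a, b, sim in sorted(edges, key=lambda t: -t[2]):
        ca, cb = cluster_of[a], cluster_of[b]
        if ca is cb:
            continue
        # Clean-source constraint: never merge entities sharing a source.
        if sources_of[a] & sources_of[b]:
            continue
        merged, msrc = ca | cb, sources_of[a] | sources_of[b]
        for e in merged:
            cluster_of[e], sources_of[e] = merged, msrc
    return {frozenset(c) for c in cluster_of.values()}

ents = {"a1": "S1", "b1": "S2", "b2": "S2", "c1": "S3"}
sim_graph = [("a1", "b1", 0.9), ("a1", "b2", 0.8), ("b1", "c1", 0.7)]
print(cluster(ents, sim_graph))   # b1 and b2 (same source) stay apart
```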

    User data discovery and aggregation: the CS-UDD algorithm

    In the social web, people use social systems for sharing content and opinions, for communicating with friends, for tagging, and so on. People usually have different accounts and different profiles across these systems. Several tools for user data aggregation and people search have been developed, and protocols and standards for data portability have been defined. This paper presents an approach and an algorithm, named Cross-System User Data Discovery (CS-UDD), to retrieve and aggregate user data distributed over social websites. It is designed to crawl websites, retrieve profiles that may belong to the searched user, correlate them, aggregate the discovered data, and return them to the searcher, which may, for example, be an adaptive system. The retrieved user attributes, namely attribute-value pairs, are associated with a certainty factor that expresses the confidence that they are true for the searched user. To test the algorithm, we ran it on two popular social networks, MySpace and Flickr. The evaluation demonstrated the ability of CS-UDD to discover unknown user attributes and revealed high precision for the discovered attributes.
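    The abstract does not state how certainty factors are computed; one plausible sketch uses the classic certainty-factor combination rule over candidate profiles. The rule, the weights, and all names are assumptions, not the published CS-UDD formulas.

```python
# Illustrative aggregation of attribute-value pairs with certainty factors:
# each candidate profile contributes its attributes weighted by how likely
# the profile belongs to the searched user.

def combine(cf1, cf2):
    """Classic certainty-factor combination for two positive evidences."""
    return cf1 + cf2 * (1 - cf1)

def aggregate(profiles):
    """profiles: list of (match_confidence, {attribute: value}) tuples."""
    certainty = {}   # (attribute, value) -> combined certainty factor
    for match_cf, attrs in profiles:
        for attr, val in attrs.items():
            key = (attr, val)
            certainty[key] = combine(certainty.get(key, 0.0), match_cf)
    return certainty

candidates = [(0.8, {"city": "Turin", "age": "34"}),
              (0.6, {"city": "Turin"}),
              (0.3, {"city": "Milan"})]
for (attr, val), cf in sorted(aggregate(candidates).items(), key=lambda x: -x[1]):
    print(f"{attr}={val}: {cf:.2f}")   # repeated evidence raises certainty
```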

    Working Notes from the 1992 AAAI Spring Symposium on Practical Approaches to Scheduling and Planning

    The symposium presented issues involved in the development of scheduling systems that can deal with resource and time limitations. To qualify, a system must be implemented and tested to some degree on non-trivial problems (ideally, on real-world problems); however, a system need not be fully deployed to qualify. Systems that schedule actions under metric time constraints typically represent and reason about an external numeric clock or calendar, and can be contrasted with systems that represent time purely symbolically. The following topics are discussed: integrating planning and scheduling; integrating symbolic goals and numerical utilities; managing uncertainty; incremental rescheduling; managing limited computation time; anytime scheduling and planning algorithms and systems; dependency analysis and schedule reuse; management of schedule and plan execution; and incorporation of discrete-event techniques.

    Clustering of Time Series Data: Measures, Methods, and Applications

    Clustering is an essential branch of data mining and statistical analysis that helps us explore the distribution of data and extract knowledge. With the broad accumulation and application of time series data, the study of its clustering is a natural extension of existing unsupervised learning heuristics. We discuss the components that make up the clustering of time series data: the similarity measure, the clustering heuristic, the evaluation of cluster quality, and the applications of these methods. As groundwork for the data analysis task, we propose a scalable and efficient time series similarity measure, segmented-Dynamic Time Warping. For time series clustering, we formulate Distance Density Clustering, a deterministic clustering algorithm that adopts concepts from both density- and distance-based separation. In addition, we explore the characteristics and discuss the limitations of existing cluster evaluation methods. Finally, all components are brought to bear on real-world applications.
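    For reference, the plain dynamic-programming DTW on which segmented-DTW builds can be sketched as follows; this O(nm) version is only a baseline, and the thesis's segmented variant aligns segments rather than individual points for scalability.

```python
# Baseline dynamic time warping: D[i][j] is the minimal cumulative cost of
# aligning the first i points of x with the first j points of y.

def dtw(x, y):
    n, m = len(x), len(y)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            # Warping step: diagonal match, or stretch either series.
            D[i][j] = cost + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    return D[n][m]

print(dtw([0, 1, 2, 3], [0, 1, 1, 2, 3]))   # 0.0: same shape, stretched in time
```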

    Memory Clustering Using Persistent Homology for Multimodality- and Discontinuity-Sensitive Learning of Optimal Control Warm-Starts

    Shooting methods are an efficient approach to solving nonlinear optimal control problems. Because they use local optimization, they exhibit favorable convergence when initialized with a good warm-start, but may not converge at all if provided with a poor initial guess. Recent work has focused on providing an initial guess from a learned model trained on samples generated during an offline exploration of the problem space. In practice, however, the solutions contain discontinuities introduced by the system dynamics or the environment, and in many cases multiple equally suitable, i.e. multi-modal, solutions exist. Classic learning approaches smooth across the boundaries of these discontinuities and thus generalize poorly. In this work, we apply tools from algebraic topology to extract information on the underlying structure of the solution space. In particular, we introduce a method based on persistent homology to automatically cluster the dataset of precomputed solutions into different candidate initial guesses. We then train a Mixture-of-Experts within each cluster to predict state and control trajectories that warm-start the optimal control solver, and provide a comparison with modality-agnostic learning. We demonstrate our method on a cart-pole toy problem and a quadrotor avoiding obstacles, and show that clustering samples based on inherent structure improves warm-start quality.

    Comment: 12 pages, 10 figures; accepted as a regular paper in IEEE Transactions on Robotics (T-RO). Supplementary video: https://youtu.be/lUULTWCFxY8 Code: https://github.com/wxmerkt/topological_memory_clustering The first two authors contributed equally.
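    The 0-dimensional persistence view of clustering can be pictured as growing balls around the precomputed solutions and tracking when connected components merge; components that survive past a scale threshold become candidate clusters. The following single-linkage toy is a stand-in under that assumption, not the paper's pipeline.

```python
# Toy 0-dimensional persistence clustering: merge components in order of
# increasing pairwise distance; a component's persistence is the scale at
# which it dies (merges), and components alive at `threshold` are returned.
import math

def persistence_clusters(points, threshold):
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i in range(len(points)) for j in range(i + 1, len(points)))
    for d, i, j in edges:
        if d > threshold:
            break                           # remaining components persist
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri                 # one component dies at scale d
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

solutions = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
print(persistence_clusters(solutions, threshold=1.0))   # two solution modes
```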