    Automatic Classification of Text Databases through Query Probing

    Many text databases on the web are "hidden" behind search interfaces, and their documents are only accessible through querying. Search engines typically ignore the contents of such search-only databases. Recently, Yahoo-like directories have started to manually organize these databases into categories that users can browse to find these valuable resources. We propose a novel strategy to automate the classification of search-only text databases. Our technique starts by training a rule-based document classifier, and then uses the classifier's rules to generate probing queries. The queries are sent to the text databases, which are then classified based on the number of matches that they produce for each query. We report some initial exploratory experiments which show that our approach is promising for automatically characterizing the contents of text databases accessible on the web. Comment: 7 pages, 1 figure
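    The probing procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the rules, the category names, the `threshold` parameter, and the in-memory stand-in for a search interface (`make_local_database`) are all assumptions introduced for the example. The only thing the sketch requires from a database is a match count per query.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Hypothetical classifier rules: a conjunction of keywords implies a category.
Rule = Tuple[Tuple[str, ...], str]
RULES: List[Rule] = [
    (("ibm", "computer"), "Computers"),
    (("lung", "cancer"), "Health"),
    (("hepatitis",), "Health"),
]


def classify_database(
    count_matches: Callable[[str], int],
    rules: Sequence[Rule],
    threshold: float = 0.5,
) -> List[str]:
    """Turn each rule into a probing query and sum the match counts per category.

    A category is assigned when its share of all matches reaches `threshold`
    (an assumed decision criterion, chosen here for simplicity).
    """
    totals: Counter = Counter()
    for terms, category in rules:
        query = " AND ".join(terms)  # the rule's antecedent becomes the query
        totals[category] += count_matches(query)
    total = sum(totals.values())
    if total == 0:
        return []
    return sorted(c for c, n in totals.items() if n / total >= threshold)


def make_local_database(documents: Sequence[str]) -> Callable[[str], int]:
    """Stand-in for a search-only interface: it reports match counts only."""
    def count_matches(query: str) -> int:
        terms = query.lower().split(" and ")
        return sum(
            all(t in doc.lower().split() for t in terms) for doc in documents
        )
    return count_matches


docs = ["lung cancer screening", "hepatitis vaccine trial", "ibm computer sales"]
print(classify_database(make_local_database(docs), RULES))  # ['Health']
```

    In a real setting `count_matches` would wrap an HTTP request to the database's search form and parse the reported number of hits; the documents themselves are never retrieved.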

    A constraint specification approach to building flexible workflows

    Process support systems, such as workflows, are being used in a variety of domains. However, most areas of application have focused on traditional production-style processes, which are characterised by predictability and repetitiveness. Application in non-traditional domains with highly flexible processes is still largely unexplored. Such flexible processes are characterised by the inability to be completely predefined and/or by an explosive number of alternatives. Accordingly, we define flexibility as the ability of the process to execute on the basis of a partially defined model, where the full specification is made at runtime and may be unique to each instance. In this paper, we present an approach to building workflow models for such processes. We present our approach in the context of a non-traditional domain for workflow deployment, namely degree programs in tertiary institutes. The primary motivation behind our approach is to provide the ability to model flexible processes without introducing non-standard modelling constructs. This ensures that the correctness and verification of the language is preserved. We propose to build workflow schemas from a standard set of modelling constructs and given process constraints. We identify the fundamental requirements for constraint specification and classify them into selection, termination and build constraints. We detail the specification of these constraints in a relational model. Finally, we demonstrate the dynamic building of instance-specific workflow models on the basis of these constraints.
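    To make the three constraint classes concrete, here is a small sketch in the degree-program setting. The abstract does not define the constraints formally, so the interpretation below is an assumption: a selection constraint is modelled as a prerequisite check, a termination constraint as a required number of completed activities, and a build constraint as plain sequential composition. All names (`Constraints`, `can_select`, `is_terminated`, `build_instance`) and the course codes are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Constraints:
    pool: Set[str]  # activities (courses) available for selection
    # Selection constraint (assumed form): prerequisites per activity.
    prerequisites: Dict[str, Set[str]] = field(default_factory=dict)
    # Termination constraint (assumed form): activities needed to finish.
    required_count: int = 1


def can_select(activity: str, completed: List[str], c: Constraints) -> bool:
    """An activity is selectable if it is in the pool, not yet taken,
    and all of its prerequisites have been completed."""
    return (
        activity in c.pool
        and activity not in completed
        and c.prerequisites.get(activity, set()) <= set(completed)
    )


def is_terminated(completed: List[str], c: Constraints) -> bool:
    return len(completed) >= c.required_count


def build_instance(choices: List[str], c: Constraints) -> List[str]:
    """Build constraint (assumed form): chosen activities are composed in
    sequence. The instance-specific model grows at runtime, one choice at
    a time, until the termination constraint holds."""
    completed: List[str] = []
    for activity in choices:
        if is_terminated(completed, c):
            break
        if not can_select(activity, completed, c):
            raise ValueError(f"selection constraint violated by {activity!r}")
        completed.append(activity)
    return completed


program = Constraints(
    pool={"CS101", "CS201", "MA101"},
    prerequisites={"CS201": {"CS101"}},
    required_count=2,
)
print(build_instance(["CS101", "CS201", "MA101"], program))  # ['CS101', 'CS201']
```

    Each student's list of choices yields a different instance, while the set of modelling constructs (here, only sequence) stays fixed.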

    Mobile Ad-Hoc Networks

    Ad-hoc networks are key to the evolution of wireless networks. They are typically composed of equal nodes, which communicate over wireless links without any central control. Ad-hoc wireless networks inherit the traditional problems of wireless and mobile communications, such as bandwidth optimisation, power control and transmission quality enhancement. In addition, the multi-hop nature and the lack of fixed infrastructure bring new research problems such as configuration advertising, discovery and maintenance, as well as ad-hoc addressing and self-routing. Many different approaches and protocols have been proposed, and there are even multiple standardization efforts within the Internet Engineering Task Force, as well as academic and industrial projects. This chapter focuses on the state of the art in mobile ad-hoc networks. It highlights some of the emerging technologies, protocols, and approaches (at different layers) for realizing network services for users on the move in areas with possibly no pre-existing communications infrastructure.

    Executing Collaborative Business Processes on Blockchain: A System

    Nowadays, organizations are pressed to collaborate in order to take advantage of their complementary capabilities and to provide best-of-breed products and services to their customers. To do so, organizations need to manage business processes that span beyond their organizational boundaries. Such processes are called collaborative business processes. One of the main roadblocks to implementing collaborative business processes is the lack of trust between the participants. Blockchain provides a decentralized ledger that cannot be tampered with and that supports the execution of programs called smart contracts. These features allow collaborative processes to be executed between untrusted parties without relying on a central authority. However, implementing collaborative business processes directly on blockchain can be cumbersome and error-prone, and it requires specialized skills. In contrast, established Business Process Management Systems (BPMSs) provide convenient abstractions for rapid development of process-oriented applications. This thesis addresses the problem of automating the execution of collaborative business processes on top of blockchain technology in a way that takes advantage of the trust-enhancing capabilities of this technology while offering the development convenience of traditional BPMSs. The thesis also addresses the question of how to support scenarios in which new parties may be onboarded at runtime, and in which parties need the flexibility to change the default routing logic of the business process. We explore architectural approaches and modelling concepts, formulating design principles and requirements that are implemented in a novel blockchain-based BPMS named CATERPILLAR. The CATERPILLAR system supports two methods to implement, execute and monitor blockchain-based processes: compiled and interpreted. It also supports two mechanisms for controlled flexibility: participants can collectively decide on updating the process during its execution, and on granting and revoking access to parties. https://www.ester.ee/record=b536494

    The SphereSearch Search Engine for Graph-Based Search on Heterogeneous, Semi-Structured Data

    This thesis presents the novel SphereSearch engine, which provides unified ranked retrieval on heterogeneous XML and Web data. Its search capabilities include vague structure and text content conditions, and relevance ranking based on IR statistics and a graph-based data model. Web pages in HTML or PDF are automatically converted into an intermediate XML format, with the option of generating semantic tags by means of linguistic annotation tools. For semi-structured data, the graph-based query engine is leveraged to provide very rich search options that cannot be expressed in traditional Web or XML search engines: concept-aware and link-aware querying that takes into account the implicit structure and context of Web pages. The benefits of the SphereSearch engine are demonstrated by experiments with a large and richly tagged but non-schematic open encyclopedia extended with external documents, and with a standard XML benchmark.

    Quality of Service and Optimization in Data Integration Systems

    This work presents techniques for the construction of a global data integration system. Similar to distributed databases, this system allows declarative queries in order to express user-specific information needs. Scalability towards global data integration systems and openness were major design goals for the architecture and techniques developed in this work. It is shown how service composition, extensibility and quality of service can be supported in an open system of providers for data, functionality for query processing operations, and computing power.

    Enterprise modelling framework for dynamic and complex business environment: socio-technical systems perspective

    The modern business environment is characterised by dynamism and ambiguity. The causes include global economic change, rapidly changing requirements, shortened development life cycles and the increasing complexity of information technology and information systems (IT/IS). However, enterprises have been seen as socio-technical systems. The dynamic, complex business environment cannot be understood without intensive modelling and simulation. Nevertheless, there is no single description of reality, which has been seen as relative to its context and point of view. Human perception is considered an important determinant for the subjectivist view of reality. Many scholars working in the socio-technical systems and enterprise modelling domains have considered holistic socio-technical systems analysis and design to be possible using a limited number of procedural and modelling approaches. For instance, the ETHICS and Human-centred design approaches to socio-technical analysis and design, the goal-oriented and process-oriented perspectives of enterprise modelling, and the Zachman and DoDAF enterprise architecture frameworks all have limitations that can be improved upon, which are explained in detail in this thesis. [Continues.