3,061 research outputs found

    Seventh Biennial Report : June 2003 - March 2005

    No full text

    Web Data Extraction, Applications and Techniques: A Survey

    Full text link
    Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims at providing a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provided a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool to perform data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques allow to gather a large amount of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users and this offers unprecedented opportunities to analyze human behavior at a very large scale. We discuss also the potential of cross-fertilization, i.e., on the possibility of re-using Web Data Extraction techniques originally designed to work in a given domain, in other domains.Comment: Knowledge-based System

    State-of-the-art on evolution and reactivity

    Get PDF
    This report starts by, in Chapter 1, outlining aspects of querying and updating resources on the Web and on the Semantic Web, including the development of query and update languages to be carried out within the Rewerse project. From this outline, it becomes clear that several existing research areas and topics are of interest for this work in Rewerse. In the remainder of this report we further present state of the art surveys in a selection of such areas and topics. More precisely: in Chapter 2 we give an overview of logics for reasoning about state change and updates; Chapter 3 is devoted to briefly describing existing update languages for the Web, and also for updating logic programs; in Chapter 4 event-condition-action rules, both in the context of active database systems and in the context of semistructured data, are surveyed; in Chapter 5 we give an overview of some relevant rule-based agents frameworks

    A Logic-based Approach for Recognizing Textual Entailment Supported by Ontological Background Knowledge

    Full text link
    We present the architecture and the evaluation of a new system for recognizing textual entailment (RTE). In RTE we want to identify automatically the type of a logical relation between two input texts. In particular, we are interested in proving the existence of an entailment between them. We conceive our system as a modular environment allowing for a high-coverage syntactic and semantic text analysis combined with logical inference. For the syntactic and semantic analysis we combine a deep semantic analysis with a shallow one supported by statistical models in order to increase the quality and the accuracy of results. For RTE we use logical inference of first-order employing model-theoretic techniques and automated reasoning tools. The inference is supported with problem-relevant background knowledge extracted automatically and on demand from external sources like, e.g., WordNet, YAGO, and OpenCyc, or other, more experimental sources with, e.g., manually defined presupposition resolutions, or with axiomatized general and common sense knowledge. The results show that fine-grained and consistent knowledge coming from diverse sources is a necessary condition determining the correctness and traceability of results.Comment: 25 pages, 10 figure

    A Survey on IT-Techniques for a Dynamic Emergency Management in Large Infrastructures

    Get PDF
    This deliverable is a survey on the IT techniques that are relevant to the three use cases of the project EMILI. It describes the state-of-the-art in four complementary IT areas: Data cleansing, supervisory control and data acquisition, wireless sensor networks and complex event processing. Even though the deliverableā€™s authors have tried to avoid a too technical language and have tried to explain every concept referred to, the deliverable might seem rather technical to readers so far little familiar with the techniques it describes

    Approximate model composition for explanation generation

    Get PDF
    This thesis presents a framework for the formulation of knowledge models to supĀ¬ port the generation of explanations for engineering systems that are represented by the resulting models. Such models are automatically assembled from instantiated generic component descriptions, known as modelfragments. The model fragments are of suffiĀ¬ cient detail that generally satisfies the requirements of information content as identified by the user asking for explanations. Through a combination of fuzzy logic based evidence preparation, which exploits the history of prior user preferences, and an approximate reasoning inference engine, with a Bayesian evidence propagation mechanism, different uncertainty sources can be hanĀ¬ dled. Model fragments, each representing structural or behavioural aspects of a comĀ¬ ponent of the domain system of interest, are organised in a library. Those fragments that represent the same domain system component, albeit with different representation detail, form parts of the same assumption class in the library. Selected fragments are assembled to form an overall system model, prior to extraction of any textual inforĀ¬ mation upon which to base the explanations. The thesis proposes and examines the techniques that support the fragment selection mechanism and the assembly of these fragments into models. In particular, a Bayesian network-based model fragment selection mechanism is deĀ¬ scribed that forms the core of the work. The network structure is manually determined prior to any inference, based on schematic information regarding the connectivity of the components present in the domain system under consideration. The elicitation of network probabilities, on the other hand is completely automated using probability elicitation heuristics. These heuristics aim to provide the information required to select fragments which are maximally compatible with the given evidence of the fragments preferred by the user. Given such initial evidence, an existing evidence propagation algorithm is employed. The preparation of the evidence for the selection of certain fragments, based on user preference, is performed by a fuzzy reasoning evidence fabĀ¬ rication engine. This engine uses a set of fuzzy rules and standard fuzzy reasoning mechanisms, attempting to guess the information needs of the user and suggesting the selection of fragments of sufficient detail to satisfy such needs. Once the evidence is propagated, a single fragment is selected for each of the domain system compoĀ¬ nents and hence, the final model of the entire system is constructed. Finally, a highly configurable XML-based mechanism is employed to extract explanation content from the newly formulated model and to structure the explanatory sentences for the final explanation that will be communicated to the user. The framework is illustratively applied to a number of domain systems and is compared qualitatively to existing compositional modelling methodologies. A further empirical assessment of the performance of the evidence propagation algorithm is carried out to determine its performance limits. Performance is measured against the number of fragĀ¬ ments that represent each of the components of a large domain system, and the amount of connectivity permitted in the Bayesian network between the nodes that stand for the selection or rejection of these fragments. Based on this assessment recommendaĀ¬ tions are made as to how the framework may be optimised to cope with real world applications
    • ā€¦
    corecore