23,713 research outputs found

    Intelligent Self-Repairable Web Wrappers

    The amount of information available on the Web grows at an incredibly high rate. Systems and procedures devised to extract these data from Web sources already exist, and different approaches and techniques have been investigated in recent years. On the one hand, reliable solutions should provide robust Web data mining algorithms that can automatically cope with possible malfunctions or failures. On the other hand, the literature lacks solutions for the maintenance of these systems. Procedures that extract Web data may be tightly coupled to the structure of the data source itself; thus, malfunctions or the acquisition of corrupted data can be caused, for example, by structural modifications that owners make to their data sources. Nowadays, verification of data integrity and maintenance are mostly handled manually in order to ensure that these systems work correctly and reliably. In this paper we propose a novel approach to create procedures able to extract data from Web sources -- the so-called Web wrappers -- which can cope with malfunctions caused by modifications of the structure of the data source, and can automatically repair themselves.

    Toward a dynamic perspective on explorative and exploitative innovation activities: a longitudinal study of innovation in the wind blade industry

    Innovation requires a combination of explorative and exploitative innovation activities. Previous studies have provided valuable insights into the antecedents of investing in explorative and exploitative activities, the structural governance of exploration and exploitation, and the performance implications of engaging in exploration and exploitation. These studies are dominated by cross-sectional research, largely ignoring the evolution of exploration and exploitation over time. Several scholars, however, provide first indications that the allocation of time and resources across exploration and exploitation might change over time. In order to examine the dynamics of explorative and exploitative innovation activities, we conducted an in-depth case study in one particular company in the wind blade industry, applying a novel approach to measure the evolution of the amount of R&D resources allocated to explorative and exploitative activities over a 5-year period. Our results show that the relative amount of resources and time invested in exploration versus exploitation is not static, but changes over time. The pattern of the evolution of exploration and exploitation at our case company shows phases in which exploration and exploitation activities are well balanced, and phases where one type of innovation dominates innovation activities. Based on additional qualitative data, we found first indications of antecedents of the dynamics of exploration and exploitation. Together, our findings provide an interesting starting point for future research on the antecedents, structural governance and performance implications of the evolution of exploration and exploitation over time.

    Web Data Extraction, Applications and Techniques: A Survey

    Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims at providing a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool to perform data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques make it possible to gather large amounts of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users, which offers unprecedented opportunities to analyze human behavior at a very large scale. We also discuss the potential of cross-fertilization, i.e., the possibility of re-using Web Data Extraction techniques originally designed to work in a given domain, in other domains. Comment: Knowledge-based System

    Design of Automatically Adaptable Web Wrappers

    Nowadays, the huge amount of information distributed through the Web motivates the study of techniques for extracting relevant data in an efficient and reliable way. Both academia and enterprises have developed several approaches to Web data extraction, for example using techniques from artificial intelligence or machine learning. Some commonly adopted procedures, namely wrappers, ensure a high degree of precision of the information extracted from Web pages and, at the same time, have to prove robust so as not to compromise the quality and reliability of the data themselves. In this paper we focus on some experimental aspects related to the robustness of the data extraction process and the possibility of automatically adapting wrappers. We discuss the implementation of algorithms for finding similarities between two different versions of a Web page, in order to handle modifications, avoiding the failure of data extraction tasks and ensuring the reliability of the information extracted. Our purpose is to evaluate the performance, advantages and drawbacks of our novel system of automatic wrapper adaptation.
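
    To make the idea of "finding similarities between two different versions of a Web page" concrete, here is a minimal sketch. It is not the authors' system: it uses the generic Simple Tree Matching dynamic program as a stand-in, compares tag structure only, and all names (`Node`, `TreeBuilder`, `similarity`) are hypothetical. A wrapper could, under these assumptions, use such a score to decide whether a changed page is still close enough to the version it was built for.

```python
from html.parser import HTMLParser

# HTML elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class Node:
    """A node of the tag tree; text content is ignored (assumption)."""
    def __init__(self, tag):
        self.tag = tag
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a tag tree under a synthetic '#root' node."""
    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def size(node):
    """Number of nodes in the subtree rooted at node."""
    return 1 + sum(size(child) for child in node.children)


def simple_tree_matching(a, b):
    """Size of the largest order-preserving top-down match between a and b."""
    if a.tag != b.tag:
        return 0
    m, n = len(a.children), len(b.children)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            matched = simple_tree_matching(a.children[i - 1], b.children[j - 1])
            table[i][j] = max(table[i - 1][j], table[i][j - 1],
                              table[i - 1][j - 1] + matched)
    return table[m][n] + 1


def similarity(html_a, html_b):
    """Normalized structural similarity in [0, 1]; 1.0 means identical tag trees."""
    a, b = parse(html_a), parse(html_b)
    return 2 * simple_tree_matching(a, b) / (size(a) + size(b))


old_page = "<div><ul><li>a</li><li>b</li></ul></div>"
new_page = "<div><p>x</p><ul><li>a</li><li>b</li></ul></div>"
print(similarity(old_page, old_page))
print(similarity(old_page, new_page))
```

    The score is normalized by the combined size of both trees, so inserting a single element into a small page lowers it only slightly; the threshold at which a wrapper should attempt adaptation is a design choice this sketch leaves open.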

    Theory of reliable systems

    An attempt was made to refine the current notion of system reliability by identifying and investigating attributes of a system which are important to reliability considerations. Techniques which facilitate analysis of system reliability are included. Special attention was given to fault tolerance, diagnosability, and reconfigurability characteristics of systems.

    Leveraging Decision Making in Cyber Security Analysis through Data Cleaning

    Security Operations Centers (SOCs) have been built in many institutions for intrusion detection and incident response. A SOC employs various cyber defense technologies to continually monitor and control network traffic. Given the voluminous monitoring data, cyber security analysts need to identify suspicious network activities to detect potential attacks. As the network monitoring data are generated at a rapid speed and contain a lot of noise, analysts are so bound by tedious and repetitive data triage tasks that they can hardly concentrate on in-depth analysis for further decision making. Therefore, it is critical to employ data cleaning methods in cyber situational awareness. In this paper, we investigate the main characteristics and categories of cyber security data with a special emphasis on its heterogeneous features. We also discuss how cyber analysts attempt to understand the incoming data through the data analytical process. Based on this understanding, this paper discusses five categories of data cleaning methods for heterogeneous data and addresses the main challenges for applying data cleaning in cyber situational awareness. The goal is to create a dataset that contains accurate information for cyber analysts to work with and thus achieve higher levels of data-driven decision making in cyber defense.

    Data-Driven Shape Analysis and Processing

    Data-driven methods play an increasingly important role in discovering geometric, structural, and semantic relationships between 3D shapes in collections, and applying this analysis to support intelligent modeling, editing, and visualization of geometric data. In contrast to traditional approaches, a key feature of data-driven approaches is that they aggregate information from a collection of shapes to improve the analysis and processing of individual shapes. In addition, they are able to learn models that reason about properties and relationships of shapes without relying on hard-coded rules or explicitly programmed instructions. We provide an overview of the main concepts and components of these techniques, and discuss their application to shape classification, segmentation, matching, reconstruction, modeling and exploration, as well as scene analysis and synthesis, through reviewing the literature and relating the existing works with both qualitative and numerical comparisons. We conclude our report with ideas that can inspire future research in data-driven shape analysis and processing. Comment: 10 pages, 19 figures

    How significant is durability in vernacular house construction?

    For centuries, dynamic vernacular societies have repaired or demolished old houses and constructed new ones. The time interval between those actions probably became longer as modern and more durable technologies and materials offered a wider range of options in vernacular construction through globalization. The first objective of this paper was to compare the durability of distinctively old and new construction materials and technologies used in vernacular houses in the context of northern Iran. The second objective was to search for the implicit values behind decisions regarding durability. The ‘old’ construction technologies were ‘Kali’, Mud Houses, and ‘Lar deh ee’, while Load-bearing wall, Concrete, and Steel structure were the ‘new’ categories. A questionnaire-based survey was conducted among 167 residents of different vernacular houses, and 18 of them were selected for interview. In addition, 20 experts participated in a mail survey to validate the data. Users' perception of the durability of the structure was assessed and compared across 5 elements, namely foundation, floor, wall, roof, and attachment, through the structured questionnaire, while the implicit values were revealed from data collected through the open-ended interviews. Results showed that residents tend to rate the old houses higher, rather controversially. Commercialization might be gradually inclining users’ choices towards new houses, but responses also showed that a house is likely to become redundant after only a couple of generations, even though it still has a long durable lifetime to spare, thus making the durability issue less significant.

    Helping people to help themselves : policy lessons from a study of deprived urban neighbourhoods in Southampton

    The aim of this paper is to draw out some policy lessons from a study of self-help activity amongst 200 households in deprived urban neighbourhoods of Southampton. Commencing with a critique of the popular prejudice that promoting self-help should be opposed in case it leads to a demise of formal welfare provision, the paper then interrogates the empirical evidence to understand and explain the nature and extent of such work in deprived neighbourhoods. Finding that self-help is a crucial component of household coping practices, but that no-earner households are unable to benefit from this work to the same extent as employed households, the paper proposes both bottom-up and top-down solutions to tackle the barriers to participation in self-help amongst unemployed households. In particular, it calls for a modification to Working Families Tax Credit and the creation of Community Enterprise so as to recognise and value much of the self-help activity that currently takes place but remains unrecognised and unvalued.