Web Data Extraction, Applications and Techniques: A Survey

Author: Abel
Amalfitano
Balduzzi
Baumgartner
Berger
Berthold
Bettencourt
Califf
Catanese
Chang
Chen
Collins
Conover
Crandall
Crescenzi
Dalvi
De Meo
Doan
Emilio Ferrara
Ferrara
Flesca
Freitag
Furche
Gatterbauer
Giacomo Fiumara
Gjoka
Gkotsis
Gottlob
Hammersley
Han
Hecht
Hsu
Irmak
Khare
Kim
Kinsella
Kleinberg
Kohlschütter
Kokkoras
Krüpl
Kushmerick
Kwak
Laender
Liu
Manning
Masanès
Mathes
Meng
Mislove
Monge
Muslea
Oro
Pan
Pasquale De Meo
Perito
Phan
Plake
Rahm
Reis
Robert Baumgartner
Sahuguet
Sarawagi
Schifanella
Selkow
Shi
Soderland
Szomszor
Turmo
Vosecky
Wang
Weikum
Wilson
Winograd
Yang
Ye
Zafarani
Zanasi
Zhai
Zhang
Publication venue: Elsevier BV
Publication date: 09/06/2014

Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims at providing a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool to perform data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques make it possible to gather a large amount of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users, and this offers unprecedented opportunities to analyze human behavior at a very large scale. We also discuss the potential of cross-fertilization, i.e., the possibility of re-using Web Data Extraction techniques originally designed to work in a given domain, in other domains.
Comment: Knowledge-based System

arXiv.org e-Print Archive

Privacy-Preserving Reengineering of Model-View-Controller Application Architectures Using Linked Data

Author: Dodero Beardo Juan Manuel
Palomo Duarte Manuel
Rodríguez García María Mercedes
Ruiz Rube Iván
Publication venue: River Publishers
Publication date: 01/01/2019

When a legacy system’s software architecture cannot be redesigned, implementing additional privacy requirements is often complex, unreliable and costly to maintain. This paper presents a privacy-by-design approach to reengineer web applications to be linked data-enabled and to implement access control and privacy preservation properties. The method is based on knowledge of the application architecture, which for the Web of data is commonly designed on the basis of a model-view-controller pattern. Whereas wrapping techniques commonly used to link data of web applications duplicate the security source code, the new approach allows for the controlled disclosure of an application’s data, while preserving non-functional properties such as privacy preservation. The solution has been implemented and compared with existing linked data frameworks in terms of reliability, maintainability and complexity.

Repositorio de Objetos de Docencia e Investigación de la Universidad de Cádiz

Database integrated analytics using R : initial experiences with SQL-Server + R

Author: Berral Josep Ll.
Poggi Nicolas
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2016

© 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Most data scientists nowadays use functional or semi-functional languages like SQL, Scala or R to process data obtained directly from databases. This process requires fetching the data, processing it, and then storing it again, and it tends to be done outside the DB, often in complex data-flows. Recently, database service providers have decided to integrate “R-as-a-Service” in their DB solutions. The analytics engine is called directly from the SQL query tree, and results are returned as part of the same query. Here we show a first taste of such technology by testing the portability of our ALOJA-ML analytics framework, coded in R, to Microsoft SQL-Server 2016, one of the SQL+R solutions released recently. In this work we discuss some data-flow schemes for porting a local DB + analytics engine architecture towards Big Data, focusing especially on the new DB Integrated Analytics approach, and commenting on the first experiences in usability and performance obtained from such new services and capabilities.
Peer Reviewed
Postprint (author's final draft)

Resolving Architectural Mismatches of COTS Through Architectural Reconciliation

Author: C. Hofmeister
C. Szyperski
J. Bosch
N. Medvidovic
P. Avgeriou
P. Clements
Publication date: 01/01/2005

The integration of COTS components into a system under development entails architectural mismatches. These have been tackled, so far, at the component level, through component adaptation techniques, but they also must be tackled at an architectural level of abstraction. In this paper we propose an approach for resolving architectural mismatches, with the aid of architectural reconciliation. The approach consists of designing and subsequently reconciling two architectural models, one that is forward-engineered from the requirements and another that is reverse-engineered from the COTS-based implementation. The final reconciled model is optimally adapted both to the requirements and to the actual COTS-based implementation. The contribution of this paper lies in the application of architectural reconciliation in the context of COTS-based software development. Architectural modeling is based upon the UML 2.0 standard, while the reconciliation is performed by transforming the two models, with the help of architectural design decisions.

Proceedings - University of Groningen

University of Groningen Digital Archive

Dissertations of the University of Groningen

Evaluations of annular Khovanov--Rozansky homology

Author: Gorsky Eugene
Wedrich Paul
Publication date: 08/04/2019

We describe the universal target of annular Khovanov-Rozansky link homology functors as the homotopy category of a free symmetric monoidal category generated by one object and one endomorphism. This categorifies the ring of symmetric functions and admits categorical analogues of plethystic transformations, which we use to characterize the annular invariants of Coxeter braids. Further, we prove the existence of symmetric group actions on the Khovanov-Rozansky invariants of cabled tangles and we introduce spectral sequences that aid in computing the homologies of generalized Hopf links. Finally, we conjecture a characterization of the horizontal traces of Rouquier complexes of Coxeter braids in other types.
Comment: 41 pages

arXiv.org e-Print Archive

eScholarship - University of California

Runtime Adaptation of Scientific Service Workflows

Author: Juhnke Ernst
Publication venue: Philipps-Universität Marburg
Publication date: 01/01/2014

Software landscapes remain subject to change rather than being complete once they have been built. Changes may be caused by modified customer behavior, the shift to new hardware resources, or otherwise changed requirements. In such situations, several challenges arise. New architectural models have to be designed and implemented, existing software has to be integrated, and, finally, the new software has to be deployed, monitored, and, where appropriate, optimized during runtime under realistic usage scenarios. All of these situations often demand manual intervention, which makes them error-prone. This thesis addresses these types of runtime adaptation. Based on service-oriented architectures, an environment is developed that enables the integration of existing software (i.e., the wrapping of legacy software as web services). A workflow modeling tool is presented that aims at ease of use by separating the role of the workflow expert from the role of the domain expert. After the development of workflows, tools that observe the executing infrastructure and perform automatic scale-in and scale-out operations are presented. Infrastructure-as-a-Service providers are used to scale the infrastructure in a transparent and cost-efficient way. The deployment of necessary middleware tools is done automatically. The use of a distributed infrastructure can lead to communication problems. In order to keep workflows robust, these exceptional cases need to be treated. But, in this way, the process logic of a workflow gets mixed up and bloated with infrastructural details, which increases its complexity. In this work, a module is presented that can deal automatically with infrastructural faults and thereby makes it possible to keep these two layers separate. When services or their components are hosted in a distributed environment, some requirements need to be addressed at each service separately.
Techniques such as object-oriented programming or the use of design patterns like the interceptor pattern ease the adaptation of service behavior or structures. Still, these methods require modifying the configuration or the implementation of each individual service. On the other hand, aspect-oriented programming allows functionality to be woven into existing code even without having its source. Since the functionality needs to be woven into the code, it depends on the specific implementation. In a service-oriented architecture, where the implementation of a service is unknown, this approach clearly has its limitations. The request/response aspects presented in this thesis overcome this obstacle and provide a new, SOA-compliant method to weave functionality into the communication layer of web services. The main contributions of this thesis are the following: Shifting towards a service-oriented architecture: The generic and extensible Legacy Code Description Language and the corresponding framework make it possible to wrap existing software, e.g., as web services, which afterwards can be composed into a workflow by SimpleBPEL without overburdening the domain expert with technical details that are instead handled by a workflow expert. Runtime adaptation: Based on the standardized Business Process Execution Language, an automatic scheduling approach is presented that monitors all used resources and is able to automatically provision new machines in case a scale-out becomes necessary. If the resources' load drops, e.g., because of fewer workflow executions, a scale-in is also performed automatically. The scheduling algorithm takes the data transfer between the services into account in order to prevent scheduling allocations that eventually increase the workflow's makespan due to unnecessary or disadvantageous data transfers.
Furthermore, a multi-objective scheduling algorithm based on a genetic algorithm is able to additionally consider cost, so that a user can define her own preferences between optimized execution times of a workflow and minimized costs. Possible communication errors are automatically detected and, according to certain constraints, corrected. Adaptation of communication: The presented request/response aspects make it possible to weave functionality into the communication of web services. Because the pointcut language relies only on the exchanged documents, the implementation of services need be neither known nor available. The weaving process itself is modeled using web services. In this way, the concept of request/response aspects is naturally embedded into a service-oriented architecture.

Publikations- und Dokumentenserver der Universitätsbibliothek Marburg

MARKETING EVOLUTION: E-MARKETING - QUALITATIVE AND QUANTITATIVE RESEARCH TECHNIQUES

Author: Assoc. Prof. Ph.D Ciora Liviu Ion
Assoc. Prof. Ph.D Popa Sorin
Lect. Ph.D. Buligiu Ion

E-marketing is a generally accepted concept, due to its advantages compared to other marketing mechanisms: it is faster, more efficient, more intelligent and less expensive. The choice of e-marketing is also reinforced by the flexibility with which it addresses potential clients. Moreover, e-marketing is an environment that leads to quick results, allowing complex calculations in order to analyze demand and market evolution as accurately as possible. Access to new market segments and gaining the existing clients’ trust and loyalty through the products’ quality and price are mostly due to e-marketing campaigns.
Keywords: e-marketing, market research, Internet, e-marketing campaigns