4,298 research outputs found

    Information Aggregation using the Cameleon# Web Wrapper

    Cameleon# is a web data extraction and management tool that provides information aggregation with advanced capabilities useful for developing value-added applications and services for electronic business and electronic commerce. To illustrate its features, we use an airfare aggregation example that collects data from eight online sites, including Travelocity, Orbitz, and Expedia. This paper covers the integration of Cameleon# with commercial database management systems, such as MS SQL Server, and XML query languages, such as XQuery.

    Framework for a Hospitality Big Data Warehouse: The Implementation of an Efficient Hospitality Business Intelligence System

    In order to increase the hotel's competitiveness, maximize its revenue, improve its online reputation, and strengthen customer relationships, the information about the hotel's business has to be managed by adequate information systems (IS). Those IS should be capable of returning knowledge from a necessarily large quantity of information, anticipating and influencing the consumer's behaviour. One way to manage the information is to develop a Big Data Warehouse (BDW), which includes information from internal sources (e.g., Data Warehouse) and external sources (e.g., competitive set and customers' opinions). This paper presents a framework for a Hospitality Big Data Warehouse (HBDW). The framework includes (1) a Web crawler that periodically accesses targeted websites to automatically extract information from them, and (2) a data model to organize and consolidate the collected data into an HBDW. Additionally, the usefulness of this HBDW to the development of business analytical tools is discussed, keeping in mind the implementation of business intelligence (BI) concepts. Funding: SRM QREN IDT [38962]; FCT projects LARSyS [UID/EEA/50009/2013], CIAC [PEstOE/EAT/UI4019/2013], CEFAGE [PEst-C/EGE/UI4007/2013]; CEG-IST - Universidade de Lisboa.

    Using Semantic-Based User Profile Modeling for Context-Aware Personalised Place Recommendations

    Place Recommendation Systems (PRSs) are used to recommend places to visit to World Wide Web users. Existing PRSs are still limited by several problems, among them recommending a similar set of places to different users (lack of personalization) and offering no diversity in the set of recommended items (content overspecialization). One of the main objectives of PRSs, or contextual suggestion systems, is to close the semantic gap between queries and suggestions by going beyond keyword matching. To address these issues, in this study we attempt to build a personalized context-aware place recommender system that uses semantic-based user profile modeling to overcome the limitations of current user profile building techniques and to improve the retrieval performance of personalized place recommender systems. This approach consists of building a place ontology based on the Open Directory Project (ODP), a hierarchical ontology scheme for organizing websites. We model a semantic user profile from the place concepts extracted from the place ontology and weighted according to their semantic relatedness to user interests. The semantic user profile is then exploited to produce personalized recommendations by re-ranking the initial search results to improve retrieval performance. We evaluate this approach on a dataset obtained using the Google Places API. Results show that our proposed approach significantly improves retrieval performance compared to the classic keyword-based place recommendation model.

    A teachable semi-automatic web information extraction system based on evolved regular expression patterns

    This thesis explores Web Information Extraction (WIE) and how it has been used in decision making and to support businesses in their daily operations. The research focuses on a WIE system based on Genetic Programming (GP) with an extensible model to enhance the automatic extractor. The system uses a human as a teacher to identify and extract relevant information from semi-structured HTML webpages. Regular expressions, which have been chosen as the pattern matching tool, are automatically generated from the training data to provide an improved grammar and lexicon. This particularly benefits the GP system, which may need to extend its lexicon in the presence of new tokens in the web pages. These tokens allow the GP method to produce new extraction patterns for new requirements.

    An Ontology Based Approach Towards A Universal Description Framework for Home Networks

    Current home networks typically involve two or more machines sharing network resources. The vision for the home network has grown from a simple computer network to everyday appliances embedded with network capabilities. In this environment, devices and services within the home can interoperate, regardless of protocol or platform. Network clients can discover required resources by performing network discovery over component descriptions. Common approaches to this discovery process involve simple matching of keywords or attribute/value pairings. Interest emerging from the Semantic Web community has led to ontology languages being applied to network domains, providing a logical and semantically rich approach to both describing and discovering network components. In much of the existing work within this domain, developers have focused on defining new description frameworks in isolation from existing protocol frameworks and vocabularies. This work proposes an ontology-based description framework which takes the ontology approach to the next step, where existing description frameworks are incorporated into the ontology-based framework, allowing discovery mechanisms to cover multiple existing domains. In this manner, existing protocols and networking approaches can participate in semantically rich discovery processes. This framework also includes a system architecture developed for the purpose of reconciling existing home network solutions with the ontology-based discovery process. This work also describes an implementation of the approach that is deployed within a home-network environment. This implementation involves existing home networking frameworks, protocols and components, allowing the claims of this work to be examined and evaluated from a ‘real-world’ perspective.

    News Analytics for Financial Decision Support

    This PhD thesis contributes to the newly emerged, growing body of scientific work on the use of News Analytics in Finance. Regarded as the next significant development in Automated Trading, News Analytics extends trading algorithms to incorporate information extracted from textual messages, by translating it into actionable, valuable knowledge. The thesis addresses one main theme: the incorporation of news into trading algorithms. This relates to three main tasks: i) the extraction of the information contained in news, ii) the representation of the information contained in news, and iii) the aggregation of this information into actionable knowledge. We validate our approach by designing and implementing three semantic systems: a system for the computational content analysis of European Central Bank statements, a system for incorporating news in stock trading strategies, and a time-aware system for trading based on analyst recommendations. The approach we choose for addressing these tasks is an interdisciplinary one. For the extraction of information from news we rely on approaches borrowed from Computer Science and Linguistics. The representation of the information contained in news is realized by using, and extending, the state of the art in Semantic Web technology. We do this by bringing together insights from Logics, Metaphysics, and Computational Semantics. The aggregation of information is done by using techniques and results from Computational Intelligence and Finance.

    Creating ontology-based metadata by annotation for the semantic web
