    Deployment of RDFa, Microdata, and Microformats on the Web – A Quantitative Analysis

    More and more websites embed structured data describing, for instance, products, reviews, blog posts, people, organizations, events, and cooking recipes into their HTML pages using markup standards such as Microformats, Microdata, and RDFa. This development has accelerated in the last two years as major Web companies, such as Google, Facebook, Yahoo!, and Microsoft, have started to use the embedded data within their applications. In this paper, we analyze the adoption of RDFa, Microdata, and Microformats across the Web. Our study is based on a large public Web crawl dating from early 2012 and consisting of 3 billion HTML pages which originate from over 40 million websites. The analysis reveals the deployment of the different markup standards, the main topical areas of the published data, and the different vocabularies that are used within each topical area to represent data. What distinguishes our work from earlier studies, published by the large Web companies, is that the analyzed crawl as well as the extracted data are publicly available. This allows our findings to be verified and to be used as starting points for further domain-specific investigations as well as for focused information extraction endeavors.

    Complete Semantics to empower Touristic Service Providers

    The tourism industry has a significant impact on the world's economy, contributing 10.2% of the world's gross domestic product in 2016. It has become a very competitive industry, where a strong online presence is essential for business success. To achieve this goal, the proper use of the latest Web technologies, particularly schema.org annotations, is crucial. In this paper, we present our effort to improve the online visibility of touristic service providers in the region of Tyrol, Austria, by creating and deploying a substantial amount of semantic annotations according to schema.org, a widely used vocabulary for structured data on the Web. We started our work with Tourismusverband (TVB) Mayrhofen-Hippach and all touristic service providers in the Mayrhofen-Hippach region, and applied the same approach to other TVBs and regions, as well as to other use cases. The rationale for doing this is straightforward. Having schema.org annotations enables search engines to understand the content better and provide better results for end users, and enables various intelligent applications to use them. As a direct consequence, the region of Tyrol and its touristic services increase their online visibility and decrease their dependency on intermediaries, i.e., Online Travel Agencies (OTAs). Comment: 18 pages, 6 figures

    A Study of Metadata Completeness on Product Catalog Pages of Online Marketplaces in Indonesia (KAJIAN KELENGKAPAN METADATA HALAMAN KATALOG PRODUK PADA PASAR ONLINE DI INDONESIA)

    This research focuses on the metadata of some of the most visited online marketplaces in Indonesia. These marketplaces use a wide range of metadata, and each may use metadata that differs from the others. The research discusses the metadata used by online marketplaces in Indonesia and the attributes associated with each metadata item. It is divided into several parts. The first part studies the metadata of the online marketplaces against schema.org. The next part studies how much of each marketplace's metadata is found in schema.org. The last part studies the metadata most frequently used by online marketplaces in Indonesia. By studying metadata from the most visited online marketplaces in Indonesia, the research is expected to produce a picture of the metadata most used by these marketplaces, as well as metadata suggestions that could help complete the metadata of each marketplace.

    End-User Development of Voice User Interfaces based on Web content

    Voice assistants, and particularly the latest gadgets called smart speakers, allow end users to interact with applications by means of voice commands. As usual, end users are able to install applications (also called skills) that are available in repositories and fulfill multiple purposes. In this work we present an end-user environment for defining skills for voice assistants, based on the extraction of Web content and its organization into different voice navigation patterns. We describe the approach and the end-user development environment, and finally present some case studies based on Alexa and Amazon Echo. Published in the Lecture Notes in Computer Science book series (LNCS, volume 11553). Laboratorio de Investigación y Formación en Informática Avanzada

    Abstracting and Structuring Web Contents for Supporting Personal Web Experiences

    This paper presents a novel approach for supporting abstraction and structuring mechanisms for Web contents. The goal of this approach is to enable users to create or extract Web contents in the form of objects that they can manipulate to create Personal Web experiences. We present an architecture that not only allows the user to interact with individual objects but also supports the integration of many objects found in diverse Web sites. We claim that once Web contents have been organized as objects, it is possible to create many types of Personal Web interactions. The approach involves end users and developers and is fully supported by dedicated tools. We show how end users can use our tools to identify contents and transform them into objects stored in our platform. We show how developers can use these objects to create Personal Web applications. Published in the Lecture Notes in Computer Science book series (LNCS, vol. 9671). Laboratorio de Investigación y Formación en Informática Avanzada

    A Quantitative Analysis of the Use of Microdata for Semantic Annotations on Educational Resources

    A current trend in the semantic web is the use of embedded markup formats that semantically enrich web content by making it more understandable to search engines and other applications. The deployment of Microdata as a markup format has increased thanks to the widespread adoption of the controlled vocabulary provided by Schema.org. Recently, a set of properties from the Learning Resource Metadata Initiative (LRMI) specification, which describes educational resources, was adopted by Schema.org. These properties, in addition to the Schema.org properties related to accessibility and the license of resources, would enable search engines to provide more relevant results in searches for educational resources for all users, including users with disabilities. In order to obtain a reliable evaluation of the use of Microdata properties related to the LRMI specification, accessibility, and the license of resources, this research conducted a quantitative analysis of the deployment of these properties in large-scale web corpora covering two consecutive years. The corpora contain hundreds of millions of web pages. The results further our understanding of this deployment and highlight the pending issues and challenges concerning the use of such properties.