155 research outputs found

    Google Search and the Law on Dominance in the EU: An Assessment of the Compatibility of Current Methodology with Multi-Sided Platforms in Online Search

    Business platforms that utilise, or are based upon, internet technology are omnipresent in consumers' daily lives. Since the dawn of the World Wide Web, the amount of web content has grown enormously. Simultaneously, business interest has surged to meet the emerging demand for particular online services. As a consequence, economists have defined a novel market in these sectors, namely that of multi-sided platform markets. To an important extent, these markets experience network effects, which can strengthen a platform operator's position in relation to competitors. In turn, competition authorities have witnessed various dominant undertakings emerge. The focus of this article is on one particular internet sector, to wit, that of World Wide Web search, and on one firm in particular, Google Incorporated. It critically analyses how the Google Search algorithms are shaped from a technological perspective, how these are or can be categorised in accordance with the economic theory of multi-sided platform markets, and how these perform under current dominance analysis in the European Union, more specifically under Art. 102 TFEU. To that end, it also takes into account the recent Google Commitments procedure before the European Commission.

    Opal: In Vivo Based Preservation Framework for Locating Lost Web Pages

    We present Opal, a framework for interactively locating missing web pages (HTTP status code 404). Opal is an example of in vivo preservation: harnessing the collective behavior of web archives, commercial search engines, and research projects for the purpose of preservation. Opal servers learn from their experiences and are able to share their knowledge with other Opal servers using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Using cached copies that can be found on the web, Opal creates lexical signatures, which are then used to search for similar versions of the web page. Using OAI-PMH to facilitate inter-Opal learning extends the utilization of OAI-PMH in a novel manner. We present the architecture of the Opal framework, discuss a reference implementation of the framework, and present a quantitative analysis indicating that Opal could be effectively deployed.
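
    The abstract does not spell out how the lexical signatures are computed; a common construction in this line of work is to take the top-k TF-IDF terms of the cached copy. Below is a minimal Python sketch of that idea, in which the document-frequency table doc_freq, the corpus size num_docs and k = 5 are illustrative assumptions rather than values from the paper.

        import math
        import re
        from collections import Counter

        def lexical_signature(page_text, doc_freq, num_docs, k=5):
            """Return the top-k TF-IDF terms of a cached page; joining the
            terms into a search engine query is one way to look for copies
            of the missing page."""
            terms = re.findall(r"[a-z]+", page_text.lower())
            tf = Counter(terms)

            def tfidf(term):
                # Smoothed IDF avoids division by zero for unseen terms.
                return tf[term] * math.log(num_docs / (1 + doc_freq.get(term, 0)))

            return sorted(tf, key=tfidf, reverse=True)[:k]

        # Toy corpus statistics, invented purely for the example.
        signature = lexical_signature(
            "Opal locates missing web pages using lexical signatures",
            doc_freq={"pages": 4000, "web": 6000}, num_docs=10000)
        print(" ".join(signature))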

    A Deep Search Architecture for Capturing Product Ontologies

    This thesis describes a method to populate very large product ontologies quickly. We discuss a deep search architecture to text-mine online e-commerce marketplaces and build a taxonomy of products and their corresponding descriptions and parent categories. The goal is to automatically construct an open database of products aggregated from different online retailers. The database contains extensive metadata on each object, which can be queried and analyzed. Such a public database does not currently exist; instead, the information resides siloed within various organizations. In this thesis, we describe the tools, data structures and software architectures that allowed aggregating, structuring, storing and searching through several gigabytes of product ontologies and their associated metadata. We also describe solutions to some computational puzzles encountered in mining data at large scale. We implemented the product capture architecture and, using this implementation, built product ontologies corresponding to two major retailers: Wal-Mart and Target. The ontology data is analyzed to explore structural complexity as well as similarities and differences between the retailers. A broad product ontology has several uses, from the comparison-shopping applications that already exist to the situation-aware computing of tomorrow, where computers are aware of the objects in their surroundings and these objects interact to help humans in everyday tasks.
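
    To picture the kind of taxonomy such an architecture builds, here is a minimal sketch of a category tree populated from crawled category paths. The node layout and the two example paths are invented for illustration and are not the thesis's actual schema.

        from dataclasses import dataclass, field

        @dataclass
        class CategoryNode:
            """A taxonomy node: a category, its child categories, and
            the products filed directly under it."""
            name: str
            children: dict = field(default_factory=dict)
            products: list = field(default_factory=list)

            def insert(self, path, product):
                # Walk the category path, creating nodes as needed,
                # and attach the product metadata at the leaf.
                node = self
                for part in path:
                    node = node.children.setdefault(part, CategoryNode(part))
                node.products.append(product)

        root = CategoryNode("root")
        root.insert(["Electronics", "TVs"], {"title": "40-inch LED TV", "price": 299.99})
        root.insert(["Electronics", "Audio"], {"title": "Bluetooth speaker"})
        print(root.children["Electronics"].children["TVs"].products)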

    A Multidisciplinary Approach to the Reuse of Open Learning Resources

    Educational standards are having a significant impact on e-Learning. They allow for better exchange of information among different organizations and institutions. They simplify reusing and repurposing learning materials. They give teachers the possibility of personalizing them according to the student's background and learning speed. Thanks to these standards, off-the-shelf content can be adapted to a particular student cohort's context and learning needs. The same course content can be presented in different languages. Overall, all the parties involved in the learning-teaching process (students, teachers and institutions) can benefit from these standards, and so online education can be improved. To materialize the benefits of standards, learning resources should be structured according to them. Unfortunately, a large number of existing e-Learning materials lack the intrinsic logical structure required, and further, when they do have the structure, they are not encoded as required. These problems make it virtually impossible to share these materials. This thesis addresses the following research question: How can we make the best use of existing open learning resources available on the Internet by taking advantage of educational standards and specifications, thus improving content reusability? In order to answer this question, I combine different technologies, techniques and standards that make the sharing of publicly available learning resources possible in innovative ways. I developed and implemented a three-stage tool to tackle the above problem. By applying information extraction techniques and open e-Learning standards to legacy learning resources, the tool has proven to improve content reusability. In so doing, it contributes to the understanding of how these technologies can be used in real scenarios and shows how online education can benefit from them. In particular, three main components were created which enable the conversion of unstructured educational content into a standard-compliant form in a systematic and automatic way. An increasing number of repositories with educational resources are available, including Wikiversity and the Massachusetts Institute of Technology OpenCourseWare. Wikiversity is an open repository containing over 6,000 learning resources in several disciplines and for all age groups [1]. I used the OpenCourseWare repository to evaluate the effectiveness of my software components and ideas. The results show that it is possible to create standard-compliant learning objects from publicly available web pages, improving their searchability, interoperability and reusability.
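
    The abstract does not name the exact target encoding; as one illustration of what a standard-compliant output can look like, the sketch below wraps a single extracted web page in a minimal IMS Content Packaging manifest. The identifiers and the file name lecture1.html are placeholders, and IMS CP is my assumed choice of standard, not necessarily the one the thesis uses.

        import xml.etree.ElementTree as ET

        IMSCP = "http://www.imsglobal.org/xsd/imscp_v1p1"

        def minimal_manifest(resource_href, resource_id="RES-1"):
            """Build a bare-bones imsmanifest.xml that packages one
            extracted web page as a 'webcontent' resource."""
            ET.register_namespace("", IMSCP)
            manifest = ET.Element(f"{{{IMSCP}}}manifest", {"identifier": "MANIFEST-1"})
            ET.SubElement(manifest, f"{{{IMSCP}}}organizations")
            resources = ET.SubElement(manifest, f"{{{IMSCP}}}resources")
            ET.SubElement(resources, f"{{{IMSCP}}}resource",
                          {"identifier": resource_id, "type": "webcontent",
                           "href": resource_href})
            return ET.tostring(manifest, encoding="unicode")

        # 'lecture1.html' stands in for a page produced by the extraction stage.
        print(minimal_manifest("lecture1.html"))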

    An evaluation of non-relational database management systems as suitable storage for user-generated text-based content in a distributed environment

    Non-relational database management systems address some of the limitations relational database management systems have when storing large volumes of unstructured, user-generated text-based data in distributed environments. They follow different approaches through the data model they use, their ability to scale data storage over distributed servers, and the programming interface they provide. An experimental approach was followed to measure how well these alternative database management systems address the limitations of relational databases in terms of their capability to store unstructured text-based data, data warehousing capabilities, ability to scale data storage across distributed servers, and the level of programming abstraction they provide. The results of the research highlighted the limitations of relational database management systems. The different database management systems do address certain limitations, but not all. Document-oriented databases provide the best results and successfully address the need to store large volumes of user-generated text-based data in a distributed environment.
    School of Computing, M.Sc. (Computer Science)
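
    For a concrete feel of the document-oriented approach the dissertation favours, here is a minimal sketch that stores and searches user-generated text with MongoDB via pymongo. MongoDB is chosen here as a representative document store; the abstract does not name a specific product, and the database, collection and field names are invented.

        from pymongo import MongoClient

        # Connect to a local MongoDB instance (assumed to be running).
        client = MongoClient("mongodb://localhost:27017")
        posts = client.ugc_demo.posts

        # Schemaless inserts: documents in one collection may carry
        # different fields, which suits unstructured user content.
        posts.insert_one({"user": "alice", "body": "Loved the new search UI!",
                          "tags": ["ux", "search"]})
        posts.insert_one({"user": "bob", "body": "Storage scales horizontally."})

        # A text index makes the free-text bodies directly searchable.
        posts.create_index([("body", "text")])
        for doc in posts.find({"$text": {"$search": "search"}}):
            print(doc["user"], "->", doc["body"])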

    Contexts and Contributions: Building the Distributed Library

    This report updates and expands on A Survey of Digital Library Aggregation Services, originally commissioned by the DLF as an internal report in summer 2003 and released to the public later that year. It first highlights major developments affecting the ecosystem of scholarly communications and digital libraries since the last survey and provides an analysis of OAI implementation demographics, based on a comparative review of repository registries and cross-archive search services. Secondly, it reviews the state of practice for a cohort of digital library aggregation services, grouping them by the problem space to which they most closely adhere. Based in part on responses collected in fall 2005 from an online survey distributed to the original core services, the report investigates the purpose, function and challenges of next-generation aggregation services. On a case-by-case basis, the advances in each service are of interest in isolation, but the report also attempts to situate these services in a larger context and to understand how they fit into a multi-dimensional and interdependent ecosystem supporting the worldwide community of scholars. Finally, the report summarizes the contributions of these services thus far and identifies obstacles requiring further attention to realize the goal of an open, distributed digital library system.
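
    The cross-archive search services the report surveys are built on OAI-PMH harvesting. As a minimal sketch of one harvesting loop, the following follows resumptionToken pages until a repository's record list is exhausted; the base URL https://example.org/oai is a placeholder.

        import urllib.request
        import xml.etree.ElementTree as ET

        OAI = "{http://www.openarchives.org/OAI/2.0/}"
        DC = "{http://purl.org/dc/elements/1.1/}"

        def harvest_titles(base_url):
            """Yield Dublin Core titles from an OAI-PMH repository,
            following resumptionToken pages until none remain."""
            url = base_url + "?verb=ListRecords&metadataPrefix=oai_dc"
            while url:
                with urllib.request.urlopen(url) as resp:
                    root = ET.fromstring(resp.read())
                for title in root.iter(DC + "title"):
                    yield title.text
                token = root.find(".//" + OAI + "resumptionToken")
                url = (base_url + "?verb=ListRecords&resumptionToken=" + token.text
                       if token is not None and token.text else None)

        for t in harvest_titles("https://example.org/oai"):
            print(t)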

    Using the Web Infrastructure for Real Time Recovery of Missing Web Pages

    Given the dynamic nature of the World Wide Web, missing web pages, or "404 Page not Found" responses, are part of our web browsing experience. It is our intuition that information on the web is rarely completely lost; it is just missing. In whole or in part, content often moves from one URI to another and hence just needs to be (re-)discovered. We evaluate several methods for a "just-in-time" approach to web page preservation. We investigate the suitability of lexical signatures and web page titles to rediscover missing content. It is understood that web pages change over time, which implies that the performance of these two methods depends on the age of the content. We therefore conduct a temporal study of the decay of lexical signatures and titles and estimate their half-life. We further propose the use of tags that users have created to annotate pages, as well as the most salient terms derived from a page's link neighborhood. We utilize the Memento framework to discover previous versions of web pages and to execute the above methods. We provide a workflow, including a set of parameters, that is most promising for the (re-)discovery of missing web pages. We introduce Synchronicity, a web browser add-on that implements this workflow. It works while the user is browsing and detects the occurrence of 404 errors automatically. When activated by the user, Synchronicity offers a total of six methods to either rediscover the missing page at its new URI or discover an alternative page that satisfies the user's information need. Synchronicity depends on user interaction, which enables it to provide results in real time.
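
    Memento, named in the abstract, is the mechanism for discovering previous versions of a page. Below is a minimal sketch of asking a Memento TimeGate for the archived copy of a URI closest to a given date, using the public Time Travel aggregator; the target URI and datetime are examples, not values from the dissertation.

        import urllib.request

        def find_memento(uri, accept_datetime):
            """Ask a Memento TimeGate for the archived copy of `uri`
            closest to the RFC 1123 datetime `accept_datetime`."""
            timegate = "http://timetravel.mementoweb.org/timegate/" + uri
            req = urllib.request.Request(timegate,
                                         headers={"Accept-Datetime": accept_datetime})
            with urllib.request.urlopen(req) as resp:
                # The TimeGate redirects to the chosen memento and reports
                # its capture time in the Memento-Datetime header.
                return resp.url, resp.headers.get("Memento-Datetime")

        memento_url, captured = find_memento(
            "http://example.com/", "Thu, 31 May 2007 20:35:00 GMT")
        print(captured, memento_url)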
    • …