
    Linking Data Across Universities: An Integrated Video Lectures Dataset

    This paper presents our work and experience interlinking educational information across universities using Linked Data principles and technologies. More specifically, the paper focuses on selecting, extracting, structuring, and interlinking information about video lectures produced by 27 different educational institutions. For this purpose, selected information from several websites and YouTube channels has been scraped and structured according to well-known vocabularies such as FOAF or the W3C Ontology for Media Resources. To integrate this information, the extracted videos have been categorized under a common classification space: the taxonomy defined by the Open Directory Project. An evaluation of this categorization process yielded 98% coverage and 89% correctness. As a result, a new Linked Data dataset has been released containing more than 14,000 video lectures from 27 different institutions, categorized under a common classification scheme.
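    The structuring step described above can be sketched as follows: a scraped video record is serialized as RDF (Turtle) using the `ma:` and `foaf:` prefixes the paper mentions. The field names, base URI, and sample record are illustrative assumptions, not the dataset's actual schema.

```python
# Hypothetical sketch: serialize a scraped video-lecture record as Turtle
# using the W3C Ontology for Media Resources (ma:) and FOAF vocabularies.
# Field names and the base URI are illustrative, not the dataset's schema.

def record_to_turtle(rec: dict) -> str:
    """Render one scraped record as a Turtle snippet."""
    prefixes = (
        "@prefix ma:   <http://www.w3.org/ns/ma-ont#> .\n"
        "@prefix foaf: <http://xmlns.com/foaf/0.1/> .\n\n"
    )
    subject = f"<http://example.org/lecture/{rec['id']}>"
    triples = [
        f"{subject} a ma:MediaResource ;",
        f'    ma:title "{rec["title"]}" ;',
        f"    ma:locator <{rec['url']}> ;",
        f'    foaf:maker [ foaf:name "{rec["institution"]}" ] .',
    ]
    return prefixes + "\n".join(triples)

# Illustrative scraped record.
lecture = {
    "id": "42",
    "title": "Introduction to Linked Data",
    "url": "https://www.youtube.com/watch?v=example",
    "institution": "Example University",
}
print(record_to_turtle(lecture))
```

    A real pipeline would likely use an RDF library such as rdflib rather than string templating, but the mapping from scraped fields to vocabulary terms is the same.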

    FAIRness and Usability for Open-access Omics Data Systems

    Omics data sharing is crucial to the biological research community, and the last decade or two has seen a huge rise in collaborative analysis systems, databases, and knowledge bases for omics and other systems biology data. We assessed the FAIRness of NASA's GeneLab Data Systems (GLDS) along with four similar systems in the research omics data domain, using 14 FAIRness metrics. The overall FAIRness scores ranged from 6 to 12 (out of 14), with an average of 10.1 and a standard deviation of 2.4. Pass ratings for the metrics ranged from 29% to 79%, Partial Pass from 0% to 21%, and Fail from 7% to 50%. The systems we evaluated performed best in data findability and accessibility, and worst in data interoperability; reusability of metadata, in particular, was frequently not well supported. We relate our experiences implementing semantic integration of omics data from some of the assessed systems for federated querying and retrieval functions, given their shortcomings in data interoperability. Finally, we propose two new principles that Big Data system developers, in particular, should consider to maximize data accessibility.
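    A tally over 14 metrics like the one reported could be computed as below, assuming the common weighting Pass = 1, Partial Pass = 0.5, Fail = 0. The weighting scheme and the example ratings are assumptions for illustration; they are not the paper's actual per-system results.

```python
# Hypothetical FAIRness tally over 14 metrics, assuming the weighting
# Pass = 1, Partial Pass = 0.5, Fail = 0. The ratings below are made up
# for illustration and are not the paper's actual results.

WEIGHTS = {"pass": 1.0, "partial": 0.5, "fail": 0.0}

def fairness_score(ratings: list[str]) -> float:
    """Sum the weighted ratings; one rating per metric."""
    return sum(WEIGHTS[r] for r in ratings)

# Illustrative system rated on 14 metrics: 9 Pass, 2 Partial Pass, 3 Fail.
ratings = ["pass"] * 9 + ["partial"] * 2 + ["fail"] * 3
print(f"{fairness_score(ratings)} out of {len(ratings)}")  # 10.0 out of 14
```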

    Improving information retrieval-based concept location using contextual relationships

    Existing techniques based on information retrieval (IR) fall short of helping software engineers find all the relevant program elements that implement a business concept. Such techniques usually consider only the conceptual relations based on lexical similarities during concept mapping. However, it is also fundamental to consider the contextual relationships within an application's business domain to aid concept location. As an example, this paper proposes using domain-specific ontological relations during concept mapping and location activities when implementing business requirements.
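    The idea can be sketched as follows: rank program elements by lexical similarity to the query, then boost elements that mention concepts the domain ontology relates to the query. The similarity measure (Jaccard), the boost factor, and the tiny ontology are illustrative assumptions, not the paper's actual technique.

```python
# Minimal sketch of IR-based concept location augmented with contextual
# (ontological) relationships. The similarity measure, boost factor, and
# the toy ontology are illustrative assumptions.

def lexical_similarity(query: set[str], element_terms: set[str]) -> float:
    """Jaccard overlap between query terms and an element's identifier terms."""
    if not query or not element_terms:
        return 0.0
    return len(query & element_terms) / len(query | element_terms)

# Hypothetical domain ontology: concept -> related domain concepts.
ONTOLOGY = {"payment": {"invoice", "refund"}, "invoice": {"payment"}}

def rank_elements(query: set[str], elements: dict, boost: float = 0.2):
    """Score each element lexically, then boost those mentioning a related concept."""
    related = set().union(*(ONTOLOGY.get(t, set()) for t in query))
    scores = {}
    for name, terms in elements.items():
        score = lexical_similarity(query, terms)
        if terms & related:  # contextual relationship found in the ontology
            score += boost
        scores[name] = round(score, 3)
    return sorted(scores.items(), key=lambda kv: -kv[1])

elements = {
    "PaymentService.process": {"payment", "process", "card"},
    "InvoicePrinter.render":  {"invoice", "render", "pdf"},
    "Logger.write":           {"log", "write", "file"},
}
print(rank_elements({"payment"}, elements))
```

    Note how `InvoicePrinter.render` shares no term with the query yet still outranks the unrelated `Logger.write`, because the ontology links invoices to payments — exactly the kind of element pure lexical matching would miss.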

    Don't Confuse Metatags with Initial Interest Confusion

    This Comment focuses on whether the legal doctrine of initial interest confusion should be applied in metatag-related trademark infringement cases. The Comment argues that because initial interest confusion does not improve or clarify the existing process of legal inquiry in trademark infringement litigation, the doctrine is a superfluous legal tool and may even be harmful from a public policy perspective.

    Legal, Factual and Other Internet Sites for Attorneys and Others

    This listing of Internet sites for legal, factual, and other research presents a variety of sources for attorneys, law students, librarians, and others who use the Web. Initially developed for an Advanced Legal Research course and a continuing education session for legal assistants, the listing includes sites for primary authorities, both federal and state, as well as URLs for other types of information, such as names of possible expert witnesses and biographical and background information (including social security numbers in some instances) about individuals.

    Search Interfaces on the Web: Querying and Characterizing

    Current-day web search engines (e.g., Google) do not crawl and index a significant portion of the Web; hence, web users who rely on search engines alone are unable to discover and access a large amount of information in the non-indexable part of the Web. Specifically, dynamic pages generated from parameters a user provides via web search forms (or search interfaces) are not indexed by search engines and cannot be found in search results. Such search interfaces give web users online access to myriads of databases on the Web. To obtain information from a web database of interest, a user issues a query by specifying query terms in a search form and receives the query results: a set of dynamic pages that embed the required information from the database. At the same time, issuing a query via an arbitrary search interface is an extremely complex task for any kind of automatic agent, including web crawlers, which, at least up to the present day, do not even attempt to pass through web forms on a large scale. In this thesis, our primary object of study is the huge portion of the Web (hereafter referred to as the deep Web) hidden behind web search interfaces. We concentrate on three classes of problems around the deep Web: characterizing the deep Web, finding and classifying deep web resources, and querying web databases.
    Characterizing the deep Web: Though the term deep Web was coined in 2000, long ago by the standards of web-related concepts and technologies, we still do not know many of its important characteristics. Another concern is that existing surveys of the deep Web are predominantly based on studies of deep web sites in English, so their findings may be biased, especially given the steady growth of non-English web content. Surveying national segments of the deep Web is therefore of interest not only to national communities but to the whole web community. In this thesis, we propose two new methods for estimating the main parameters of the deep Web. We use the suggested methods to estimate the scale of one specific national segment of the Web and report our findings. We also build and make publicly available a dataset describing more than 200 web databases from that national segment.
    Finding deep web resources: The deep Web has been growing at a very fast pace; it has been estimated that there are hundreds of thousands of deep web sites. Because of the huge volume of information in the deep Web, there has been significant interest in approaches that allow users and computer applications to leverage this information. Most approaches assume that the search interfaces to the web databases of interest have already been discovered and are known to query systems. This assumption rarely holds, mostly because of the large scale of the deep Web: for any given domain of interest there are simply too many web databases with relevant content. Thus, the ability to locate search interfaces to web databases becomes a key requirement for any application accessing the deep Web. In this thesis, we describe the architecture of the I-Crawler, a system for finding and classifying search interfaces. The I-Crawler is intentionally designed to be used in deep web characterization studies and for constructing directories of deep web resources. Unlike almost all other approaches to the deep Web so far, the I-Crawler is able to recognize and analyze JavaScript-rich and non-HTML searchable forms.
    Querying web databases: Retrieving information by filling out web search forms is a typical task for a web user, all the more so because the interfaces of conventional search engines are themselves web forms. At present, a user must manually provide input values to search interfaces and then extract the required data from the result pages. Manually filling out forms is cumbersome and infeasible for complex queries, yet such queries are essential for many web searches, especially in e-commerce. Automating the querying and retrieval of data behind search interfaces is therefore desirable and essential for tasks such as building domain-independent deep web crawlers and automated web agents, searching for domain-specific information (vertical search engines), and extracting and integrating information from various deep web resources. We present a data model for representing search interfaces and discuss techniques for extracting field labels, client-side scripts, and structured data from HTML pages. We also describe a representation of result pages and discuss how to extract and store the results of form queries. In addition, we present a user-friendly and expressive form query language that allows one to retrieve information behind search interfaces and extract useful data from the result pages based on specified conditions. We implement a prototype system for querying web databases and describe its architecture and component design.
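    The core of automated form querying can be sketched as follows: a search interface is modeled as an action URL plus labeled fields, and a query is issued by encoding values into a GET request URL. The form definition, field names, and URL below are illustrative assumptions, not the I-Crawler's or the thesis's actual data model.

```python
# Hypothetical sketch of a search-interface data model and automated form
# querying: a form is an action URL plus labeled fields; a query is issued
# by encoding user-supplied values into a GET URL. The example form and
# field names are illustrative, not taken from the thesis.

from urllib.parse import urlencode

class SearchForm:
    def __init__(self, action: str, fields: dict[str, str]):
        self.action = action  # the form's submission URL
        self.fields = fields  # human-readable label -> HTML field name

    def build_query_url(self, values: dict[str, str]) -> str:
        """Map labeled values onto HTML field names and build the request URL."""
        params = {self.fields[label]: v for label, v in values.items()}
        return f"{self.action}?{urlencode(params)}"

# Illustrative search interface for a hypothetical book database.
form = SearchForm(
    action="https://books.example.org/search",
    fields={"Title": "q", "Max price": "price_max"},
)
print(form.build_query_url({"Title": "deep web", "Max price": "20"}))
# https://books.example.org/search?q=deep+web&price_max=20
```

    A real deep web crawler would additionally handle POST submissions, client-side scripts, and extraction of results from the returned pages, as the thesis discusses.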

    Legal, Factual and Other Internet Sites for Attorneys and Legal Professionals

    This listing of Internet sites for legal, factual, and other research presents a variety of sources for attorneys, law students, law librarians, and others who use the Web. Initially developed for an Advanced Legal Research course and a continuing education session for legal assistants and paralegals, the listing includes sites for primary authorities, both federal and state, as well as URLs for other types of information such as names of possible expert witnesses and biographical and background information about individuals.

    Finding Legal, Factual, and Other Information in a Digital World

    This updated listing of Internet sites for legal, factual, and other research offers a combination of more established sites and newer sites developed since the publication of the previous listing. The article began as a comprehensive bibliography of research and other sites for an Advanced Legal Research course and a series of continuing education sessions for legal assistants and paralegals. The current version includes sites for primary authorities, both federal and state, as well as URLs for other types of information, such as sites that assist in finding expert witnesses and biographical and background information about individuals.