6,175 research outputs found

    A Look Back on the XML Benchmark Project

    The XML Benchmark Project was started to provide a framework for evaluating the interplay of XML technologies and Database Management Systems. The benchmark places emphasis on engineering aspects as well as on the performance of the query processor. In this chapter the authors present a quick overview of the benchmark and point to some of the experience they gathered during the design of the benchmark and while running it on a variety of platforms. Since the benchmark was designed early in the evolution of XML, their experiences also reflect how the perception of XML changed during the three years that passed since they started working on the subject. The chapter comprises an overview of the benchmark as well as discussions of some lessons learned.
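
As a rough illustration of the kind of query-processor workload such a benchmark exercises (not taken from the chapter itself), the Python sketch below times a simple path query over an XML document; the file name "auction.xml" and the query are hypothetical placeholders.

```python
# Illustrative sketch of the kind of query-processor workload an XML benchmark
# measures: load a document and time a simple path query over it. The file
# name "auction.xml" and the query are hypothetical placeholders.
import time
from xml.etree import ElementTree as ET

tree = ET.parse("auction.xml")            # the generated benchmark document
start = time.perf_counter()
matches = tree.findall(".//item/name")    # a simple path query over the data
elapsed = time.perf_counter() - start
print(f"matched {len(matches)} elements in {elapsed:.4f} s")
```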

    Automated functional testing of online search services

    Search services are the main interface through which people discover information on the Internet. A fundamental challenge in testing search services is the lack of oracles: the sheer volume of data on the Internet prohibits testers from verifying the results. Furthermore, it is difficult to objectively assess ranking quality, because different assessors can have very different opinions on the relevance of a Web page to a query. This paper presents a novel method for automatically testing search services without the need for a human oracle. The experimental findings reveal that some commonly used search engines, including Google, Yahoo!, and Live Search, are not as reliable as most users would expect. For example, they may fail to find pages that exist in their own repositories, or rank pages in a way that is logically inconsistent. Suggestions are made for search service providers to improve their service quality. Copyright © 2010 John Wiley & Sons, Ltd.
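
The kind of oracle-free, logical-consistency check alluded to above can be sketched as follows. This is an illustrative example rather than the paper's actual method, and the search function it takes is a hypothetical stand-in for a real search-engine API.

```python
# Illustrative metamorphic-style consistency check, not the paper's exact
# method. `search` is a hypothetical stand-in for a real search-engine API
# call that returns the set of result URLs for a query string.
from typing import Callable, Set

def check_subset_consistency(search: Callable[[str], Set[str]],
                             term_a: str, term_b: str) -> bool:
    """The stricter query "A B" should logically return a subset of the
    results for the broader query "A" alone; a violation is detectable
    without any human judging relevance."""
    broad = search(term_a)
    narrow = search(f"{term_a} {term_b}")
    return narrow <= broad

# Hypothetical usage with a canned, in-memory "search engine".
fake_corpus = {"page1": "xml database benchmark", "page2": "xml parser"}
fake_search = lambda q: {u for u, text in fake_corpus.items()
                         if all(w in text for w in q.split())}
print(check_subset_consistency(fake_search, "xml", "benchmark"))  # True
```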

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS think-tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and socio-economic perspective. The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we take stock of the impact and legal consequences of these technical advances and point out future directions of research.
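
As a rough, hedged illustration of what content-based retrieval means here (not drawn from the deliverable), the sketch below ranks indexed items by cosine similarity between feature vectors; feature extraction is assumed to happen elsewhere, and the item identifiers are placeholders.

```python
# Illustrative content-based retrieval sketch: rank indexed items by cosine
# similarity between a query feature vector and pre-computed item feature
# vectors. Feature extraction (colour histograms, audio fingerprints, neural
# embeddings, ...) is assumed to have happened elsewhere.
import numpy as np

def rank_by_similarity(query_vec, index, top_k=5):
    """index maps item identifiers to 1-D NumPy feature vectors."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    scored = [(item_id, cosine(query_vec, vec)) for item_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

# Hypothetical usage with random feature vectors standing in for real content.
rng = np.random.default_rng(0)
index = {f"video-{i}": rng.random(128) for i in range(10)}
print(rank_by_similarity(rng.random(128), index, top_k=3))
```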

    Linked Research on the Decentralised Web

    This thesis is about research communication in the context of the Web. I analyse literature which reveals how researchers are making use of Web technologies for knowledge dissemination, as well as how individuals are disempowered by the centralisation of certain systems, such as academic publishing platforms and social media. I share my findings on the feasibility of a decentralised and interoperable information space where researchers can control their identifiers whilst fulfilling the core functions of scientific communication: registration, awareness, certification, and archiving. The contemporary research communication paradigm operates under a diverse set of sociotechnical constraints, which influence how units of research information and personal data are created and exchanged. Economic forces and non-interoperable system designs mean that researcher identifiers and research contributions are largely shaped and controlled by third-party entities; participation requires the use of proprietary systems. From a technical standpoint, this thesis takes a deep look at the semantic structure of research artifacts, and how they can be stored, linked and shared in a way that is controlled by individual researchers, or delegated to trusted parties. Further, I find that the ecosystem was lacking a technical Web standard able to fulfill the awareness function of research communication. Thus, I contribute a new communication protocol, Linked Data Notifications (published as a W3C Recommendation), which enables decentralised notifications on the Web, and provide implementations pertinent to the academic publishing use case. So far we have seen decentralised notifications applied in research dissemination or collaboration scenarios, as well as for archival activities and scientific experiments. Another core contribution of this work is a Web standards-based implementation of a client-side tool, dokieli, for decentralised article publishing, annotations and social interactions. dokieli can be used to fulfill the scholarly functions of registration, awareness, certification, and archiving, all in a decentralised manner, returning control of research contributions and discourse to individual researchers. The overarching conclusion of the thesis is that Web technologies can be used to create a fully functioning ecosystem for research communication. Using the framework of Web architecture, and loosely coupling the four functions, an accessible and inclusive ecosystem can be realised whereby users are able to use and switch between interoperable applications without interfering with existing data. Technical solutions alone do not suffice, of course, so this thesis also takes into account the need for a change in the traditional mode of thinking amongst scholars, and presents the Linked Research initiative as an ongoing effort toward researcher autonomy in a social system, and universal access to human- and machine-readable information. Outcomes of this outreach work so far include an increase in the number of individuals self-hosting their research artifacts, workshops publishing accessible proceedings on the Web, in-the-wild experiments with open and public peer review, and semantic graphs of contributions to conference proceedings and journals (the Linked Open Research Cloud).
    Some of the future challenges include: addressing the social implications of decentralised Web publishing, as well as the design of ethically grounded interoperable mechanisms; cultivating privacy-aware information spaces; personal or community-controlled on-demand archiving services; and further design of decentralised applications that are aware of the core functions of scientific communication.
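
Because Linked Data Notifications is a published W3C Recommendation, a minimal sender can be sketched. The sketch below assumes the requests library and a target resource that advertises its inbox via an HTTP Link header; conforming receivers may instead expose the inbox in the resource's RDF body, which is not handled here, and all URLs and the notification payload are placeholders.

```python
# Minimal Linked Data Notifications (LDN) sender sketch. Assumes the target
# advertises its inbox via an HTTP Link header with
# rel="http://www.w3.org/ns/ldp#inbox"; receivers may also (or instead)
# expose the inbox in the resource's RDF body, which this sketch skips.
import requests

LDP_INBOX_REL = "http://www.w3.org/ns/ldp#inbox"

def discover_inbox(target_url: str) -> str:
    resp = requests.head(target_url)
    link = resp.links.get(LDP_INBOX_REL)   # requests parses the Link header
    if link is None:
        raise ValueError("no inbox advertised in the Link header")
    return link["url"]

def send_notification(target_url: str, payload: dict) -> requests.Response:
    inbox = discover_inbox(target_url)
    return requests.post(inbox, json=payload,
                         headers={"Content-Type": "application/ld+json"})

# Hypothetical usage: announce an annotation of an article.
notification = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Announce",
    "object": "https://example.org/annotations/1",
    "target": "https://example.org/articles/my-article",
}
# send_notification("https://example.org/articles/my-article", notification)
```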

    The Commoditization of IT: Evidence from a Longitudinal Text Mining Study

    While Information Technology (IT) has been identified by researchers as a source of strategic advantage for businesses, commentators have argued that this reality may not endure. These commentators argue that the growing ubiquity of IT makes it a commodity input rather than a scarce and valuable resource. We examine CEOs’ Letters to Shareholders, one of the primary statements of corporate strategy, using both content analysis and latent semantic analysis, a text mining technique. Examining these letters allows us to investigate whether IT may be declining in strategic importance over time. We examine 160 annual reports from firms in the healthcare industry, covering a ten-year span from 1997 through 2006. Our results indicate that the strategic emphasis placed on IT may be increasing, but that its association with firm performance is declining. Our findings imply that as markets become more competitive, IT management capabilities and the strategic use of IT take on increasing importance. Our findings also imply that CEOs’ perception of the importance of IT is a necessary but not a sufficient condition for improved firm performance. This article makes two primary contributions. First, we present an empirical examination of the issue of IT commoditization as a complement to existing anecdotal discussions. Second, we demonstrate the use of latent semantic analysis (LSA), a relatively new methodology for analyzing textual data that is evolving into an alternative to the well-known content analysis technique.
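
For readers unfamiliar with latent semantic analysis, a minimal sketch of the technique using scikit-learn follows; it is not the authors' actual pipeline, and the letter texts are hypothetical placeholders.

```python
# Minimal latent semantic analysis (LSA) sketch: a TF-IDF term-document matrix
# reduced with truncated SVD. This is not the study's actual pipeline; the
# "letters" below are hypothetical placeholder strings.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

letters = [
    "Our investments in information technology improved patient care this year.",
    "We focused on cost control and expanding our provider network.",
    "New clinical information systems supported better decision making.",
]

tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(letters)        # term-document matrix (documents x terms)

lsa = TruncatedSVD(n_components=2)      # project onto 2 latent semantic dimensions
doc_vectors = lsa.fit_transform(X)      # each letter as a point in concept space
print(doc_vectors)
```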

    Beyond NED: Fast and Effective Search Space Reduction for Complex Question Answering over Knowledge Bases


    Effecting Data Quality Through Data Governance: a Case Study in the Financial Services Industry

    One of the most significant challenges faced by senior management today is implementing a data governance program to ensure that data is an asset to an organization's mission. New legislation, continual regulatory oversight, increasing data volume growth, and the desire to improve data quality for decision making are driving forces behind data governance initiatives. Data governance involves reshaping existing processes and the way people view data, along with the information technology required to create consistent, secure, and well-defined processes for handling the quality of an organization's data. In examining attempts to make data an asset in organizations, the term data governance helps to conceptualize the break with existing ad hoc, siloed, and improper data management practices. This research considers a case study of a large financial services company to examine data governance policies and procedures. It seeks to bring some information to bear on the drivers of data governance, the processes used to ensure data quality, the technologies and people involved in those processes, and the use of data governance in decision making. This research also addresses some core questions surrounding data governance, such as the viability of a golden source record, ownership of and responsibility for data, and the optimum placement of a data governance department. The findings provide a model for financial services companies hoping to take the initial steps towards better data quality and, ultimately, a data governance capability.

    Detecting fake news and disinformation using artificial intelligence and machine learning to avoid supply chain disruptions

    Fake news and disinformation (FNaD) are increasingly being circulated through various online and social networking platforms, causing widespread disruptions and influencing decision-making perceptions. Despite the growing importance of detecting fake news in politics, relatively limited research effort has gone into developing artificial intelligence (AI) and machine learning (ML) oriented FNaD detection models suited to minimizing supply chain disruptions (SCDs). Using a combination of AI and ML, and case studies based on data collected from Indonesia, Malaysia, and Pakistan, we developed an FNaD detection model aimed at preventing SCDs. This model, based on multiple data sources, has shown evidence of its effectiveness in managerial decision-making. Our study further contributes to the supply chain and AI-ML literature, provides practical insights, and points to future research directions.
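
As an illustrative sketch of what an ML-oriented FNaD detector can look like (not the model developed in the study), the following trains a simple text classifier on toy labelled examples; the texts and labels are hypothetical placeholders.

```python
# Illustrative fake-news/disinformation text classifier, not the study's model.
# The labelled examples are hypothetical placeholders; a real model would be
# trained on a large corpus of supply-chain-related news items.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "Port closure confirmed by the harbour authority after scheduled maintenance.",
    "Secret memo proves all container ships will be grounded worldwide tomorrow.",
    "Supplier announces a two-week production delay due to component shortages.",
    "Leaked photo shows warehouses already empty across the entire country.",
]
labels = [0, 1, 0, 1]   # 0 = genuine, 1 = fake/disinformation (toy labels)

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["Factory shutdown rumoured in an anonymous viral post."]))
```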