
    The Archive Query Log: Mining Millions of Search Result Pages of Hundreds of Search Engines from 25 Years of Web Archives

    The Archive Query Log (AQL) is a previously unused, comprehensive query log collected at the Internet Archive over the last 25 years. Its first version includes 356 million queries, 166 million search result pages, and 1.7 billion search results across 550 search providers. Although many query logs have been studied in the literature, the search providers that own them generally do not publish their logs to protect user privacy and vital business data. Of the few query logs publicly available, none combines size, scope, and diversity. The AQL is the first to do so, enabling research on new retrieval models and (diachronic) search engine analyses. Provided in a privacy-preserving manner, it promotes open research as well as more transparency and accountability in the search industry. Comment: SIGIR 2023 resource paper, 13 pages.

    Users, Queries, and Bad Abandonment in Web Search

    After a user submits a query and receives a list of search results, the user may abandon their query without clicking on any of the search results. A bad query abandonment occurs when a searcher abandons the SERP because they were dissatisfied with the quality of the search results, which often leads the user to reformulate their query in the hope of receiving better search results. As we move closer to understanding when and why a user abandons their query under different qualities of search results, we move forward in an overall understanding of user behavior with search engines. In this thesis, we describe three user studies to investigate bad query abandonment. First, we report on a study to investigate the rate and time at which users abandon their queries at different levels of search quality. We had users search for answers to questions, but showed users manipulated SERPs that contain one relevant document placed at different ranks. We show that as the quality of search results decreases, the probability of abandonment increases, and that users quickly decide to abandon their queries. Users make their decisions fast, but not all users are the same. We show that there appear to be two types of users that behave differently, with one group more likely to abandon their query and quicker to find answers than the group less likely to abandon their query. Second, we describe an eye-tracking experiment that focuses on understanding possible causes of users' willingness to examine SERPs and what motivates users to continue or discontinue their examination. Using eye-tracking data, we found that a user's decision to abandon a query is best explained by the user's examination pattern not including a relevant search result. If a user sees a relevant result, they are very likely to click it. However, users' examination of results differs and may be influenced by other factors.
The key factors we found are the rank of search results, the user type, and the query quality. For example, we show that regardless of where the relevant document is placed in the SERP, the type of query submitted affects examination, and if a user enters an ambiguous query, they are likely to examine fewer results. Third, we show how the nature of non-relevant material affects users' willingness to further explore a ranked list of search results. We constructed and showed participants manipulated SERPs with different types of non-relevant documents. We found that user examination of search results and time to query abandonment are influenced by the coherence and type of non-relevant documents included in the SERP. For SERPs coherent on off-topic results, users spend the least amount of time before abandoning and are less likely to request to view more results. The time they spend increases as the SERP quality improves, and users are more likely to request to view more results when the SERP contains diversified non-relevant results on multiple subtopics.

    AspectJML: modular specification and runtime checking for crosscutting contracts

    Aspect-oriented programming (AOP) is a popular technique for modularizing crosscutting concerns. In this context, researchers have found that the realization of design by contract (DbC) is crosscutting and fares better when modularized by AOP. However, previous efforts aimed at supporting crosscutting contract modularity might actually compromise the main DbC principles. For example, in AspectJ-style, reasoning about the correctness of a method call may require a whole-program analysis to determine what advice applies and what that advice does relative to DbC implementation and checking. Also, when contracts are separated from classes, a programmer may not know about them and may break them inadvertently. In this paper we solve these problems with AspectJML, a new specification language that supports crosscutting contracts for Java code. We also show how AspectJML supports the main DbC principles of modular reasoning and contracts as documentation.
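For readers unfamiliar with design by contract, the following is a minimal sketch of runtime contract checking in Python. It is not AspectJML and does not reproduce its syntax or semantics; the names `contract`, `ContractViolation`, and `withdraw` are hypothetical and chosen only for illustration. The point it illustrates is the one the abstract stresses: the contract stays attached to the method it governs, so it doubles as documentation.

```python
import functools


class ContractViolation(AssertionError):
    """Raised when a precondition or postcondition fails at runtime."""


def contract(requires=None, ensures=None):
    """Attach a runtime-checked contract to a function (illustrative only).

    requires(*args, **kwargs) is evaluated before the call.
    ensures(result, *args, **kwargs) is evaluated after the call.
    """
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The caller's obligation: check the precondition first.
            if requires is not None and not requires(*args, **kwargs):
                raise ContractViolation(f"precondition of {func.__name__} violated")
            result = func(*args, **kwargs)
            # The implementer's obligation: check the postcondition afterwards.
            if ensures is not None and not ensures(result, *args, **kwargs):
                raise ContractViolation(f"postcondition of {func.__name__} violated")
            return result
        return wrapper
    return decorate


@contract(
    requires=lambda amount, balance: 0 < amount <= balance,
    ensures=lambda result, amount, balance: result == balance - amount,
)
def withdraw(amount, balance):
    """Return the balance remaining after withdrawing `amount`."""
    return balance - amount
```

Because the contract is declared next to `withdraw`, a reader can reason about a call by looking at that one declaration, without searching the rest of the program for checking code.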

    Using a Dynamic Domain-Specific Modeling Language for the Model-Driven Development of Cross-Platform Mobile Applications

    There has been a gradual but steady convergence of dynamic programming languages with modeling languages. One area that can benefit from this convergence is model-driven development (MDD) especially in the domain of mobile application development. By using a dynamic language to construct a domain-specific modeling language (DSML), it is possible to create models that are executable, exhibit flexible type checking, and provide a smaller cognitive gap between business users, modelers and developers than more traditional model-driven approaches. Dynamic languages have found strong adoption by practitioners of Agile development processes. These processes often rely on developers to rapidly produce working code that meets business needs and to do so in an iterative and incremental way. Such methodologies tend to eschew “throwaway” artifacts and models as being wasteful except as a communication vehicle to produce executable code. These approaches are not readily supported with traditional heavyweight approaches to model-driven development such as the Object Management Group’s Model-Driven Architecture approach. This research asks whether it is possible for a domain-specific modeling language written in a dynamic programming language to define a cross-platform model that can produce native code and do so in a way that developer productivity and code quality are at least as effective as hand-written code produced using native tools. Using a prototype modeling tool, AXIOM (Agile eXecutable and Incremental Object-oriented Modeling), we examine this question through small- and mid-scale experiments and find that the AXIOM approach improved developer productivity by almost 400%, albeit only after some up-front investment. We also find that the generated code can be of equal if not better quality than the equivalent hand-written code.
Finally, we find that there are significant challenges in the synthesis of a DSML that can be used to model applications across platforms as diverse as today’s mobile operating systems, which point to intriguing avenues of subsequent research.

    Enabling lock-free concurrent workers over temporal graphs composed of multiple time-series

    Time series are commonly used to store temporal data, e.g., sensor measurements. However, when it comes to complex analytics and learning tasks, these measurements have to be combined with structural context data. Temporal graphs, connecting multiple time series, have proven to be very suitable to organize such data and ultimately empower analytic algorithms. Computationally intensive tasks often need to be distributed and parallelized among different workers. For tasks that cannot be split into independent parts, several workers have to concurrently read and update these shared temporal graphs. This leads to inconsistency risks, especially in the case of frequent updates. Distributed locks can mitigate these risks but come with a very high performance cost. In this paper, we present a lock-free approach that allows workers to concurrently modify temporal graphs. Our approach is based on a composition operator able to do online reconciliation of concurrent modifications of temporal graphs. We evaluate the efficiency and scalability of our approach compared to lock-based approaches.
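The abstract does not define the composition operator, so the sketch below is only one way such a reconciliation step could look, under stated assumptions. It assumes each worker writes to its own private graph fragment (so no lock is needed while working) and that fragments are merged afterwards. The data layout, the `compose` function, and the conflict rule (the larger `(writer_id, value)` pair wins) are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple

# Assumed layout: a time series maps a timestamp to a (writer_id, value) pair,
# and a graph fragment maps a node name to that node's time series.
Series = Dict[int, Tuple[int, float]]
Graph = Dict[str, Series]


def compose(left: Graph, right: Graph) -> Graph:
    """Merge two fragments that were modified concurrently.

    Points that exist in only one fragment are kept as they are.
    When both fragments wrote the same (node, timestamp), the larger
    (writer_id, value) pair wins. Taking a maximum is commutative and
    associative, so the merge order does not change the result.
    """
    merged: Graph = {node: dict(series) for node, series in left.items()}
    for node, series in right.items():
        target = merged.setdefault(node, {})
        for timestamp, entry in series.items():
            if timestamp not in target or entry > target[timestamp]:
                target[timestamp] = entry
    return merged


# Two workers updated overlapping parts of the graph without coordinating.
worker_a: Graph = {"sensor": {1: (1, 20.0), 2: (1, 21.0)}}
worker_b: Graph = {"sensor": {2: (2, 19.5), 3: (2, 22.0)}, "pump": {1: (2, 0.0)}}
reconciled = compose(worker_a, worker_b)
```

A deterministic, order-independent conflict rule is the design choice that matters here: it lets any worker reconcile fragments in any order and still arrive at the same graph.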

    JML's Rich, Inherited Specifications for Behavioral Subtypes

    The Java Modeling Language (JML) is used to specify detailed designs for Java classes and interfaces. It has a particularly rich set of features for specifying methods. This paper describes those features, with particular emphasis on the features related to specification inheritance. It shows how specification inheritance in JML forces behavioral subtyping, through a discussion of semantics and examples. It also describes a notion of modular reasoning based on static type information, supertype abstraction, which is made valid in JML by methodological restrictions on invariants, history constraints, and initially clauses and by behavioral subtyping.
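The idea of specification inheritance can be sketched outside JML. The Python below is an illustration under assumptions, not JML syntax and not the paper's formal semantics: an overriding method is checked against every specification case it inherits plus its own, a call is admitted if any case's precondition holds, and each case whose precondition held must also have its postcondition hold. The names `SpecCase`, `check_call`, and the integer square root example are hypothetical.

```python
import math
from typing import Callable, List


class SpecCase:
    """One specification case: a precondition paired with a postcondition."""

    def __init__(self, pre: Callable[[int], bool], post: Callable[[int, int], bool]):
        self.pre = pre
        self.post = post


def check_call(cases: List[SpecCase], method: Callable[[int], int], arg: int) -> int:
    """Run `method(arg)` while checking all inherited and added cases."""
    # A call is legal if at least one case admits the argument.
    applicable = [case for case in cases if case.pre(arg)]
    if not applicable:
        raise AssertionError("no specification case admits this argument")
    result = method(arg)
    # Every case that admitted the argument must be satisfied by the result.
    for case in applicable:
        if not case.post(arg, result):
            raise AssertionError("postcondition of a specification case violated")
    return result


# Case declared on the supertype: defined for non-negative input.
inherited = SpecCase(lambda x: x >= 0, lambda x, r: r >= 0 and r * r <= x)
# Case added by the subtype: it also accepts negative input and returns 0.
added = SpecCase(lambda x: x < 0, lambda x, r: r == 0)
subtype_cases = [inherited, added]


def good_override(x: int) -> int:
    """Honours both the inherited case and the added case."""
    return math.isqrt(x) if x >= 0 else 0


def bad_override(x: int) -> int:
    """Breaks the inherited case: the result is too large for x >= 0."""
    return x + 1 if x >= 0 else 0
```

Since the inherited case is always checked, a client that only knows the supertype's specification can rely on it for any subtype object, which is the intuition behind supertype abstraction.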