    What Users Ask a Search Engine: Analyzing One Billion Russian Question Queries

    We analyze the question queries submitted to a large commercial web search engine to get insights about what people ask, and to better tailor the search results to the users’ needs. Based on a dataset of about one billion question queries submitted during the year 2012, we investigate askers’ querying behavior with the support of automatic query categorization. While the importance of question queries is likely to increase, at present they only make up 3–4% of the total search traffic. Since questions are such a small part of the query stream and are more likely to be unique than shorter queries, clickthrough information is typically rather sparse. Thus, query categorization methods based on the categories of clicked web documents do not work well for questions. As an alternative, we propose a robust question query classification method that uses the labeled questions from a large community question answering (CQA) platform as a training set. The resulting classifier is then transferred to the web search questions. Even though questions on CQA platforms tend to be different from web search questions, our categorization method proves competitive with strong baselines with respect to classification accuracy. To show the scalability of our proposed method, we apply the classifiers to about one billion question queries and discuss the trade-offs between performance and accuracy that different classification models offer. Our findings reveal what people ask a search engine and also how this contrasts with behavior on a CQA platform.

    Characterization of the seismic environment at the Sanford Underground Laboratory, South Dakota

    An array of seismometers is being developed at the Sanford Underground Laboratory, the former Homestake mine, in South Dakota to study the properties of underground seismic fields and Newtonian noise, and to investigate the possible advantages of constructing a third-generation gravitational-wave detector underground. Seismic data were analyzed to characterize seismic noise and disturbances. External databases were used to identify sources of seismic waves: ocean-wave data to identify sources of oceanic microseisms, and surface wind-speed data to investigate correlations with seismic motion as a function of depth. In addition, sources of events contributing to the spectrum at higher frequencies are characterized by studying the variation of event rates over the course of a day. Long-term observations of spectral variations provide further insight into the nature of seismic sources. Seismic spectra at three different depths are compared, establishing the 4100-ft level as a world-class low seismic-noise environment. Comment: 29 pages, 16 figures