58 research outputs found

    MapReduce for information retrieval evaluation: "Let's quickly test this on 12 TB of data"

    We propose using MapReduce to quickly test new retrieval approaches on a cluster of machines by sequentially scanning all documents. We present a small case study in which we use a cluster of 15 low-cost machines to search a web crawl of 0.5 billion pages, showing that sequential scanning is a viable approach to running large-scale information retrieval experiments with little effort. The code is available to other researchers at: http://mirex.sourceforge.net

    Searching strategies for the Bulgarian language

    This paper reports on the underlying IR problems encountered when indexing and searching with the Bulgarian language. For this language we propose a general light stemmer and demonstrate that it can be quite effective, producing significantly better MAP (around +34%) than an approach that applies no stemming. We implement the GL2 model derived from the Divergence from Randomness paradigm and find its retrieval effectiveness better than that of other probabilistic, vector-space and language models. The resulting MAP is about 50% better than that of the classical tf-idf approach. Moreover, increasing the query size enhances the MAP by around 10% (from T to TD). In order to compare the retrieval effectiveness of our suggested stopword list and the light stemmer developed for the Bulgarian language, we conduct a set of experiments with another stopword list and with a more complex and aggressive stemmer. Results tend to indicate that there is no statistically significant difference between these variants and our suggested approach. This paper also evaluates other indexing strategies, such as 4-gram indexing and indexing based on the automatic decompounding of compound words. Finally, we analyze certain queries to discover why we obtained poor results when indexing Bulgarian documents using the suggested word-based approach.

    How effective is stemming and decompounding for German text retrieval?

    Acquired under the Swiss National Licences (http://www.nationallizenzen.ch

    AMC: Attention guided Multi-modal Correlation Learning for Image Search

    Given a user's query, traditional image search systems rank images according to their relevance to a single modality (e.g., image content or surrounding text). Nowadays, an increasing number of images on the Internet are available with associated metadata in rich modalities (e.g., titles, keywords, tags, etc.), which can be exploited for a better similarity measure with queries. In this paper, we leverage visual and textual modalities for image search by learning their correlation with the input query. According to the intent of the query, an attention mechanism can be introduced to adaptively balance the importance of different modalities. We propose a novel Attention guided Multi-modal Correlation (AMC) learning method which consists of a jointly learned hierarchy of intra- and inter-attention networks. Conditioned on the query's intent, intra-attention networks (i.e., a visual intra-attention network and a language intra-attention network) attend to informative parts within each modality; a multi-modal inter-attention network promotes the importance of the most query-relevant modalities. In experiments, we evaluate AMC models on the search logs from two real-world image search engines and show a significant boost in the ranking of user-clicked images in search results. Additionally, we extend AMC models to the caption ranking task on the COCO dataset and achieve competitive results compared with recent state-of-the-art methods. Comment: CVPR 201

    App-based Data Collection to Characterize Latent Transportation Demand within Marginalized and Underserved Populations

    Our interdisciplinary team refined an app prototype, MyAmble, to gather data related to the quantity of transportation disadvantage and latent demand, and to identify psycho-social-economic corollaries. MyAmble utilizes a traditional travel diary format but expands the types of trips measured to include 1) completed trips, 2) missed trips, and 3) latent travel demand. The app also measures the real-time perceived impact of transportation behaviors (realized and latent) on participants’ physical health, mental health, social engagement, and employment/academics. Finally, the app has a text-messaging feature, Travel Buddy, that is used to increase participant engagement and retention over longitudinal data collection. The project had several phases, including focus groups to help inform app refinement. We deployed the MyAmble prototype through community-engaged research strategies in Dallas, TX, Tucson, AZ, and Knoxville, TN. Recruiting through community partners and snowball sampling resulted in a sample of 77 participants. The majority of participants were female (74.7%), and the average age of participants was 38.41 years (SD 13.61). In terms of race and ethnicity, the largest group of participants was white (45.5%), followed by Black/African American (28.6%), Hispanic or Latinx (10.4%), and American Indian or Alaska Native (5.2%). The prototype testing shows promise in capturing latent travel demand data among typically underserved populations. The study generated critical feedback for continued improvements to MyAmble. Participants expressed positive feedback about MyAmble in the usability survey and offered recommendations for improving the app during the follow-up focus group. Transportation professionals offered recommendations for implementation planning and future studies using MyAmble.

    Glimmerglass Volume 12 Number 10 (1953)

    Official student newspaper. Issue is 8 pages long.

    The BG News March 10, 1982

    The BGSU campus student newspaper, March 10, 1982. https://scholarworks.bgsu.edu/bg-news/4969/thumbnail.jp
