1,777 research outputs found

    Searching Data: A Review of Observational Data Retrieval Practices in Selected Disciplines

    Get PDF
    A cross-disciplinary examination of the user behaviours involved in seeking and evaluating data is surprisingly absent from the research data discussion. This review explores the data retrieval literature to identify commonalities in how users search for and evaluate observational research data. Two analytical frameworks rooted in information retrieval and science technology studies are used to identify key similarities in practices as a first step toward developing a model describing data retrieval

    Masked by Trust: Bias in Library Discovery

    Get PDF
    The rise of Google and its integration into nearly every aspect of our lives has pushed libraries to adopt similar Google-like search tools, called discovery systems. Because these tools are provided by libraries and search scholarly materials rather than the open web, we often assume they are more accurate or reliable than their general-purpose peers like Google or Bing. But discovery systems are still software written by people with prejudices and biases, library software vendors are subject to strong commercial pressures that are often hidden behind diffuse collection-development contracts and layers of administration, and they struggle to integrate content from thousands of different vendors and their collective disregard for consistent metadata. Library discovery systems struggle with accuracy, relevance, and human biases, and these shortcomings have the potential to shape the academic research and worldviews of the students and faculty who rely on them. While human bias, commercial interests, and problematic metadata have long affected researchers\u27 access to information, algorithms in library discovery systems increase the scale of the negative effects on users, while libraries continue to promote their objective and neutral search tools

    Supporting Internet Search by Search-Log Publishing

    Get PDF
    Antud väitekiri on osa jätkuvast kollektiivsest uurimistööst, laiema eesmärgiga eeskätt parandada Internetiotsingu tuge keeruliste ja aeganõudvate ning tihti uurimusliku loomuga otsinguülesannete kiiremaks ja efektiivsemaks läbiviimiseks. Töö peamine uurimisprobleem on uut tüüpi otsinguülesannete logimise ja Internetis jagamise raamistiku väljatöötamine, olles alternatiiviks brauseri pistikprogrammide põhistele olemasolevatele meetoditele. Tegu oli keerulise insenertehnilise ülesandega, mille käigus tuli autoril täita mitmesuguseid programmeerimise, planeerimise, süsteemi komponentide integreerimise ja konfigureerimisega seotud ülesandeid. Püstitatud eesmärk sai edukalt täidetud. Väitekirjas pakuti välja proksipõhine meetod kasutajate otsingukäitumise logimiseks, mis on ühtlasi lihtsasti kohaldatav erinevatele veebilehitsejatele ning operatsioonisüsteemidele. Lahendust võrreldi varasemate sarnaste süsteemidega. Meetod sündis reaalsest vajadusest leida kergemalt hallatav ning porditav asendus varem väljatöötatud tarkvarale, mis kujutas endast pistikprogrammi Mozilla Firefox veebilehitsejale, kuid mida tuli parandada pärast iga uue brauseri versiooni väljatulekut. Teostus koosneb kahest suuremast komponendist, millest esimene ja tehniliselt keerulisem, otsinguülesannete logide koostamise ja jagamise süsteem, paikneb VirtualBox'i virtuaalses masinas. Teine on WordPress'il põhinev otsingulogide repositoorium, võimaldades lisaks kasutaja poolt annoteeritud logide avaldamise ka neist lihtsamaid otsinguid teostada. Süsteeme on põhjalikult testitud, kuid neid pole veel rakendatud Internetiotsinguga seotud kasutajauurimustesse. Autorile on teada, et selline huvi on olemas nii Tartu Ülikooli sees kui ka ühe välismaise partnerülikooli poolt. Lokaalselt paiknev otsinguülesannete koostamise ja jagamise süsteem koosneb kolmest võrdselt tähtsast alamkomponendist. Nendeks on Python'i keeles realiseeritud otsinguülesande logija; peamiselt PHP'd ja HTML'i kasutav veebiliides, mis muuhulgas võimaldab kasutajal eelpoolmainitud logijat sisse ja välja lülitada, aga ka kõiki otsinguülesandega seotud andmeid käsitsi muuta ja täiendada; ja antud ülesandeks spetsiaalselt konfigureeritud Privoxy veebiproksi server. Töös antakse põhjalik ülevaade olemasolevast tarkvarast, teaduspublikatsioonidest ja teoreetilistest alustest seoses väitekirja uurimisprobleemiga. Võrreldes olemasolevate meetoditega eristub autori pakutud proksipõhine otsinguülesannete logimise ja jagamise raamistik peamiselt kahel põhjusel. Esiteks, meetod tagab platvormist ja brauserist sõltumatuse, olles ühtlasi väga stabiilne. Teiseks, kasutajatele antav vabadus oma otsinguülesannet vabalt defineerida ning annoteerida on oluliseks uueks tähiseks. Väitekirja viimases peatükis käsitletakse tööga seotud tulevikuväljavaateid ja avatud probleeme. Üks neist on väljapakutavaga võrreldes muudetud arhitektuur, mis võimaldaks korraldada väiksema vaeva ja ajakuluga laborieksperimente. Internetiotsingu logimise süsteemi saab edasi arendada, lisades tuge enamatele JavaScript'i sündmustele. Otsingulogide repositoorium, olles veel üsna algeline, pakub hulgaliselt võimalusi täiendusteks tulevikuks.The main research problem of my thesis was engineering a new type of search task logging and publishing framework which would provide a better alternative for existing browser plug-in based methods. Right from the start, the proxy-based search task reporting system has been a complex engineering challenge involving code written in multiple programming languages, interactions planned across many software modules (some of which have already been existing large projects themselves), and a Linux operating system configured to ease the set-up process for the user. This was the decision process to make sure that this solution is reliable, extendible and maintainable in the future. My research goal was completed successfully. In my thesis, I proposed a proxy-based method for logging user search behaviour across different browsers and operating systems. I also compared it with an existing plug-in based Search Logger for Mozilla Firefox and other similar solutions. The idea of developing a proxy-based search task logging and publishing solution came from out of necessity, because the existing logging solution had significant problems with maintainability. The logs created by my solution are subsequently annotated by the user and made publicly available on a dedicated Internet blog called the Search Task Repository. Users can search against the already annotated and published Internet search logs. Ideally this would mean reduced complexity of search tasks for the users which in turn saves time. User studies to confirm this are still pending but there is confirmed interest from Tartu researchers as well as from one foreign university to use my solution in their search experiments. The proposed solution is comprised of two large units, which are the search task repository and the search task logging and publishing unit. The search task repository is a remote component, essentially a fairly simple WordPress blog, which enables search stories to be published automatically over XML-RPC protocol, search queries to be served, and search task logs to be displayed to the searcher. My logging system is configured as a VirtualBox virtual machine. It is much more complex, consisting of three sub-components: the main Web interface, the search task logger, and the Privoxy Web proxy specially configured for my needs. Logging can be started and stopped at a user's will in the main Web interface. What is more, this sub-component also gives them absolute control over what gets published online by providing an editing and annotating functionality for all search task data, both implicitly and explicitly logged. A comprehensive theoretical overview was given in my thesis about the state of the art, explaining basic related concepts in Information Retrieval and recent developments in Exploratory Search and search task logging systems. In contrast with existing browser plug-in based search task logging methods, my proposed proxy-based approach ensures platform and browser independence while also being very stable. By giving searcher's the opportunity to freely define and annotate their own search tasks, my search support solution is setting a new standard. In the final chapter, I conducted a thorough analysis about future work and presented my own vision about the future opportunities for this search support methodology. A modified architecture for more convenient laboratory experiments was outlined as an important task for the future. In conclusion, my proxy-based search task logging, editing and publishing framework can be extended further to log more JavaScript events. The search task repository is a large open area with lots of opportunities for future extensions

    Opal: In Vivo Based Preservation Framework for Locating Lost Web Pages

    Get PDF
    We present Opal, a framework for interactively locating missing web pages (http status code 404). Opal is an example of in vivo preservation: harnessing the collective behavior of web archives, commercial search engines, and research projects for the purpose of preservation. Opal servers learn from their experiences and are able to share their knowledge with other Opal servers using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Using cached copies that can be found on the web, Opal creates lexical signatures which are then used to search for similar versions of the web page. Using the OAI-PMH to facilitate inter-Opal learning extends the utilization of OAI-PMH in a novel manner. We present the architecture of the Opal framework, discuss a reference implementation of the framework, and present a quantitative analysis of the framework that indicates that Opal could be effectively deployed

    Polar: proxies collaborating to achieve anonymous web browsing

    Get PDF
    User tracking and profiling is a growing threat to online privacy. Whilst Internet users can choose to withhold their personal information, their Internet usage can still be traced back to a unique IP address. This study considers anonymity as a strong and useful form of privacy protection. More specifically, we examine how current anonymity solutions suffer from a number of deficiencies: they are not commonly used, are vulnerable to a host of attacks or are impractical or too cumbersome for daily use. Most anonymity solutions are centralised or partially centralised and require trust in the operators. It is additionally noted how current solutions fail to promote anonymity for common Web activities such as performing online search queries and general day-to-day Web browsing. A primary objective of this research is to develop an anonymising Web browsing protocol which aims to be (1) fully distributed, (2) offer adequate levels of anonymity and (3) enable users to browse the Internet anonymously without overly complex mixing techniques. Our research has led to an anonymising protocol called Polar. Polar is a peer-to-peer network which relays Web requests amongst peers before forwarding it to a Web server, thus protecting the requester's identity. This dissertation presents the Polar model. Design choices and enhancements to the model are discussed. The author's implementation of Polar is also presented demonstrating that an implementation of Polar is feasible.Dissertation (MSc (Computer Science))--University of Pretoria, 2007.Computer Scienceunrestricte

    The Effect of internet searches on Afforestation: The case of a Green Search Engine

    Get PDF
    Ecosia is an Internet search engine that plants trees with the income obtained from advertising. This study explored the factors that affect the adoption of Ecosia.org from the perspective of technology adoption and trust. This was done by using the Unified Theory of Acceptance and Use of Technology (UTAUT2) and then analyzing the results with PLS-SEM (Partial Least Squares-Structural Equation Modeling). Subsequently, a survey was conducted with a structured questionnaire on search engines, which yielded the following results: (1) the idea of a company helping to mitigate the effects of climate change by planting trees is well received by Internet users. However, few people accept the idea of changing their habits from using traditional search engines; (2) Ecosia is a search engine believed to have higher compatibility rates, and needing less hardware resources, and (3) ecological marketing is an appropriate and future strategy that can increase the intentiontouseatechnologicalproduct. Basedontheresultsobtained,thisstudyshowsthatasearch engineorotherserviceprovidedbytheInternet,whichcanbeaudited(visits,searches,files,etc.),can also contribute to curb the effects of deforestation and climate change. In addition, companies, and especially technological start-ups, are advised to take into account that users feel better using these tools. Finally, this study urges foundations and non-governmental organizations to fight against the effects of deforestation by supporting these initiatives. The study also urges companies to support technological services, and follow the behavior of Ecosia.org in order to positively influence user satisfaction by using ecological marketing strategies
    corecore