181,942 research outputs found
Improving the evaluation of web search systems
Linkage analysis as an aid to web search has been assumed to be of significant benefit and we know that it is being implemented by many major Search Engines. Why then have few TREC participants been able to scientifically prove the benefits of linkage analysis over the past three years? In this paper we put forward reasons why disappointing results have been found and we identify the linkage density requirements of a dataset to faithfully support experiments into linkage analysis. We also report a series of linkage-based retrieval experiments on a more densely linked dataset culled from the TREC web documents
Evaluation of linkage-based web discovery systems
In recent years, the widespread use of the WWW has brought information retrieval systems into the homes of many millions of people. Today, we have access to many billions of documents (web pages) and have (free-of-charge) access to powerful, fast and highly efficient search facilities over these documents provided by search engines such as Google. The "first generation" of web search engines addressed the engineering problems of web spidering and efficient searching for large numbers of both users and documents, but they did not innovate much in the approaches taken to searching.
Recently, however, linkage analysis has been incorporated into search engine ranking strategies. Anecdotally, linkage analysis appears to have improved the retrieval effectiveness of web search, yet there is, surprisingly, little scientific evidence in support of the claims for better quality retrieval. Participants in the three most recent TREC conferences (1999, 2000 and 2001) have been invited to perform benchmarking of information retrieval systems on web data and have had the option of using linkage information as part of their retrieval strategies. The general consensus from the experiments of these participants is that linkage information has not yet been successfully incorporated into conventional retrieval strategies.
In this thesis, we present our research into the field of linkage-based retrieval of web documents. We illustrate that (moderate) improvements in retrieval performance are possible if the underlying test collection contains a higher link density than the test collections used in the three most recent TREC conferences. We examine the linkage structure of live data from the WWW and, coupled with our findings from crawling sections of the WWW, we present a list of five requirements for a test collection which is to faithfully support experiments into linkage-based retrieval of documents from the WWW. We also present some of our own new variants on linkage-based web retrieval and evaluate their performance in comparison to the approaches of others
Replicating web structure in small-scale test collections
Linkage analysis as an aid to web search has been assumed to be of significant benefit and we know that it is being implemented by many major Search Engines. Why then have few TREC participants been able to scientifically prove the benefits of linkage analysis in recent years? In this paper we put forward reasons why many disappointing results have been found in TREC experiments and we identify the linkage density requirements of a dataset to faithfully support experiments into linkage-based retrieval by examining the linkage structure of the WWW. Based on these requirements we report on methodologies for synthesising such a test collection
Predator and detritivore niche width helps to explain biocomplexity of experimental detritus-based food webs in four aquatic and terrestrial ecosystems
In the study of food webs, the existence and explanation of recurring patterns, such as the scale
invariance of linkage density, predator–prey ratios and mean chain length, constitute long-standing
issues. Our study focused on litter-associated food webs and explored the influence of detritivore and
predator niche width (as d13C range) on web topological structure. To compare patterns within and
between aquatic and terrestrial ecosystems and take account of intra-habitat variability, we constructed
42 macroinvertebrate patch-scale webs in four different habitats (lake, lagoon, beech forest and
cornfield), using an experimental approach with litterbags. The results suggest that although web
differences exist between ecosystems, patterns are more similar within than between aquatic and
terrestrial web types. In accordance with optimal foraging theory, we found that the niche width of
predators and prey increased with the number of predators and prey taxa as a proportion of total taxa in
the community. The tendency was more marked in terrestrial ecosystems and can be explained by a
lower per capita food level than in aquatic ecosystems, particularly evident for predators. In accordance
with these results, the number of links increased with the number of species but with a significantly
sharper regression slope for terrestrial ecosystems. As a consequence, linkage density, which was found
to be directly correlated to niche width, increased with the total number of species in terrestrial webs,
whereas it did not change significantly in aquatic ones, where connectance scaled negatively with the
total number of species. In both types of ecosystem, web robustness to rare species removal increased
with connectance and the niche width of predators. In conclusion, although limited to litter-associated
macroinvertebrate assemblages, this study highlights structural differences and similarities between
aquatic and terrestrial detrital webs, providing field evidence of the central role of niche width in
determining the structure of detritus-based food webs and posing foraging optimisation constraints as a
general mechanistic explanation of food web complexity differences within and between ecosystem
types
Internet-based CBT for depression with and without telephone tracking in a national helpline: randomised controlled trial
BACKGROUND Telephone helplines are frequently and repeatedly used by individuals with chronic mental health problems and web interventions may be an effective tool for reducing depression in this population. AIM To evaluate the effectiveness of a 6-week, web-based cognitive behaviour therapy (CBT) intervention with and without proactive weekly telephone tracking in the reduction of depression in callers to a helpline service. METHOD 155 callers to a national helpline service with moderate to high psychological distress were recruited and randomised to receive either Internet CBT plus weekly telephone follow-up; Internet CBT only; weekly telephone follow-up only; or treatment as usual. RESULTS Depression was lower in participants in the web intervention conditions both with and without telephone tracking compared to the treatment as usual condition both at post intervention and at 6-month follow-up. Telephone tracking provided by a lay telephone counsellor did not confer any additional advantage in terms of symptom reduction or adherence. CONCLUSIONS A web-based CBT program is effective both with and without telephone tracking for reducing depression in callers to a national helpline. TRIAL REGISTRATION Controlled-Trials.com ISRCTN93903959. Funding for the trial was provided by an Australian Research Council Linkage Project Grant (LP0667970) (http://www.arc.gov.au/). LF is supported by an
Australian Postgraduate Award Industry scholarship. KG is supported by a National Health and Medical Research Council Fellowship (No. 525413) and HC is
supported by a National Health and Medical Research Council Fellowship (No. 525411)
Towards a Cloud-Based Service for Maintaining and Analyzing Data About Scientific Events
We propose the new cloud-based service OpenResearch for managing and
analyzing data about scientific events such as conferences and workshops in a
persistent and reliable way. This includes data about scientific articles,
participants, acceptance rates, submission numbers, impact values as well as
organizational details such as program committees, chairs, fees and sponsors.
OpenResearch is a centralized repository for scientific events and supports
researchers in collecting, organizing, sharing and disseminating information
about scientific events in a structured way. An additional feature currently
under development is the possibility to archive web pages along with the
extracted semantic data in order to lift the burden of maintaining new and old
conference web sites from public research institutions. However, the main
advantage is that this cloud-based repository enables a comprehensive analysis
of conference data. Based on extracted semantic data, it is possible to
determine quality estimations, scientific communities, research trends as well
as the development of acceptance rates, fees, and number of participants in a
continuous way complemented by projections into the future. Furthermore, data
about research articles can be systematically explored using a content-based
analysis as well as citation linkage. All data maintained in this
crowd-sourcing platform is made freely available through an open SPARQL
endpoint, which allows for analytical queries in a flexible and user-defined
way.
Comment: A completed version of this paper had been accepted in SAVE-SD
workshop 2017 at WWW conference
netgwas: An R Package for Network-Based Genome-Wide Association Studies
Graphical models are powerful tools for modeling and making statistical
inferences regarding complex associations among variables in multivariate data.
In this paper we introduce the R package netgwas, which is designed based on
undirected graphical models to accomplish three important and interrelated
goals in genetics: constructing linkage maps, reconstructing linkage
disequilibrium (LD) networks from multi-loci genotype data, and detecting
high-dimensional genotype-phenotype networks. The netgwas package deals with
species with any chromosome copy number in a unified way, unlike other
software. It implements recent improvements in both linkage map construction
(Behrouzi and Wit, 2018), and reconstructing conditional independence network
for non-Gaussian continuous data, discrete data, and mixed
discrete-and-continuous data (Behrouzi and Wit, 2017). Such datasets routinely
occur in genetics and genomics such as genotype data, and genotype-phenotype
data. We demonstrate the value of our package functionality by applying it to
various multivariate example datasets taken from the literature. We show, in
particular, that our package allows a more realistic analysis of data, as it
adjusts for the effect of all other variables while performing pairwise
associations. This feature controls for spurious associations between variables
that can arise from the classical multiple testing approach. This paper includes a
brief overview of the statistical methods which have been implemented in the
package. The main body of the paper explains how to use the package. The
package uses a parallelization strategy on multi-core processors to speed-up
computations for large datasets. In addition, it contains several functions for
simulation and visualization. The netgwas package is freely available at
https://cran.r-project.org/web/packages/netgwas
Comment: 32 pages, 9 figures; due to the limitation "The abstract field cannot
be longer than 1,920 characters", the abstract appearing here is slightly
shorter than that in the PDF file