Indexing Proximity-based Dependencies for Information Retrieval
Research into term dependencies for information retrieval has demonstrated that dependency retrieval models are able to consistently improve retrieval effectiveness over bag-of-words models. However, the computation of term dependency statistics is a major efficiency bottleneck in the execution of these retrieval models. This thesis investigates the problem of improving the efficiency of dependency retrieval models without compromising the effectiveness benefits of the term dependency features.
Despite the large number of published comparisons between dependency models and bag-of-words approaches, there has been a lack of direct comparisons between alternate dependency models. We provide this comparison and investigate different types of proximity features. Several bi-term and many-term dependency models over a range of TREC collections, for both short (title) and long (description) queries, are compared to determine the strongest benchmark models. We observe that the weighted sequential dependence model is the most effective model studied. Additionally, we observe that there is some potential in many-term dependencies, but more selective methods are required to exploit these features.
We then investigate two novel index structures to directly index the proximity-based dependencies used in the sequential dependence model and weighted sequential dependence model. The frequent index and the sketch index data structures can both provide efficient access to collection and document level statistics for all indexed term dependencies, while minimizing space costs, relative to a full inverted index of term dependencies. We test whether these structures can improve retrieval efficiency without incurring large space requirements, or degrading retrieval effectiveness significantly. A secondary requirement is that each data structure must be able to be constructed for an input text collection in a scalable and distributed manner.
Based on the observation that the vast majority of term dependencies extracted from queries are relatively frequent in the collection, the “frequent” index of term dependencies omits data for infrequent term dependencies. The sketch index of term dependencies uses techniques from sketch data structures to store probabilistically bounded estimates of the required statistics. We present analyses of these data structures that include construction and space costs, retrieval efficiency and investigation of any degradation of retrieval effectiveness.
Finally, we investigate the application of these data structures to the execution of the strongest performing dependency models identified. We compare the retrieval efficiency of each of these structures across two query processing algorithms, and across both short and long queries, using two large web collections. We observe that these newly proposed data structures allow the execution of queries considerably faster than when using positional indexes, and as fast as a full index of term dependencies, but with lower storage overhead.
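To make the two index ideas in this abstract concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the thesis's implementation: the abstract does not say which sketch structure is used, so a Count-Min sketch stands in for it; term dependencies are simplified to adjacent ordered word pairs; and the names `CountMinSketch`, `ordered_bigrams`, `build_indexes`, and the `min_count` cutoff are all hypothetical.

```python
import hashlib
from collections import Counter


class CountMinSketch:
    """Count-Min sketch: fixed-size counters that may overestimate, never underestimate."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _cells(self, key):
        # One salted hash per row selects the counter touched in that row.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                key.encode("utf-8"),
                digest_size=8,
                salt=row.to_bytes(8, "little"),
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for row, col in self._cells(key):
            self.rows[row][col] += count

    def estimate(self, key):
        # The minimum over rows is the estimate least inflated by collisions.
        return min(self.rows[row][col] for row, col in self._cells(key))


def ordered_bigrams(tokens):
    """Adjacent ordered word pairs (a simplified stand-in for term dependencies)."""
    return [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]


def build_indexes(docs, min_count=2):
    """Return (frequent, sketch, exact) collection-level counts of dependencies.

    frequent: exact counts, kept only for dependencies seen at least min_count times
              (assumed cutoff; the abstract gives no threshold).
    sketch:   Count-Min estimates for every dependency, in fixed space.
    exact:    full counts, kept here only so the two structures can be compared.
    """
    exact = Counter()
    sketch = CountMinSketch()
    for tokens in docs:
        for dep in ordered_bigrams(tokens):
            exact[dep] += 1
            sketch.add(dep)
    frequent = {dep: n for dep, n in exact.items() if n >= min_count}
    return frequent, sketch, exact
```

The design trade-off mirrors the one the abstract describes: the frequent index answers exactly but has no data for rare dependencies, while the sketch answers for every dependency in bounded space at the cost of possible overestimates.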
Retrieving opinions from discussion forums
Understanding the landscape of opinions on a given topic or issue is important for policy makers, sociologists, and intelligence analysts. The first step in this process is to retrieve relevant opinions. Discussion forums are potentially a good source of this information, but they come with a unique set of retrieval challenges. In this short paper, we test a range of existing techniques for forum retrieval and develop new retrieval models to differentiate between opinionated and factual forum posts. We demonstrate some significant performance improvements over the baseline retrieval models, which suggests that this is a promising avenue for further study. Copyright is held by the owner/author(s).
Cultural value orientations, internalized homophobia, and accommodation in romantic relationships
In the present study, we examined the impact of cultural value orientations (i.e., the personally oriented value of individualism, and the socially oriented values of collectivism, familism, romanticism, and spiritualism) on accommodation (i.e., voice and loyalty, rather than exit and neglect, responses to partners' anger or criticism) in heterosexual and gay relationships; and we examined the impact of internalized homophobia (i.e., attitudes toward self, other, and disclosure) on accommodation specifically in gay relationships. A total of 262 heterosexuals (102 men and 162 women) and 857 gays (474 men and 383 women) participated in the present study. Consistent with hypotheses, among heterosexuals and gays, socially oriented values were significantly and positively related to accommodation (whereas the personally oriented value of individualism was unrelated to accommodation); and among gays in particular, internalized homophobia was significantly and negatively related to accommodation. Implications for the study of heterosexual and gay relationships are discussed. © 2005 by The Haworth Press, Inc. All rights reserved
Predicting Positive p53 Cancer Rescue Regions Using Most Informative Positive (MIP) Active Learning
Many protein engineering problems involve finding mutations that produce proteins
with a particular function. Computational active learning is an attractive
approach to discover desired biological activities. Traditional active learning
techniques have been optimized to iteratively improve classifier accuracy, not
to quickly discover biologically significant results. We report here a novel
active learning technique, Most Informative Positive (MIP), which is tailored to
biological problems because it seeks novel and informative positive results. MIP
active learning differs from traditional active learning methods in two ways:
(1) it preferentially seeks Positive (functionally active) examples; and (2) it
may be effectively extended to select gene regions suitable for high throughput
combinatorial mutagenesis. We applied MIP to discover mutations in the tumor
suppressor protein p53 that reactivate mutated p53 found in human cancers. This
is an important biomedical goal because p53 mutants have been
implicated in half of all human cancers, and restoring active p53 in tumors
leads to tumor regression. MIP found Positive (cancer rescue) p53 mutants
in silico using 33% fewer experiments than
traditional non-MIP active learning, with only a minor decrease in classifier
accuracy. Applying MIP to in vivo experimentation yielded
immediate Positive results. Ten different p53 mutations found in human cancers
were paired in silico with all possible single amino acid
rescue mutations, from which MIP was used to select a Positive Region predicted
to be enriched for p53 cancer rescue mutants. In vivo assays
showed that the predicted Positive Region: (1) had significantly more
(p<0.01) new strong cancer rescue mutants than control regions (Negative,
and non-MIP active learning); (2) had slightly more new strong cancer rescue
mutants than an Expert region selected for purely biological considerations; and
(3) rescued for the first time the previously unrescuable p53 cancer mutant
P152L.
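The contrast drawn in this abstract between accuracy-driven active learning and a strategy that preferentially seeks Positive examples can be sketched as follows. This is one assumed reading of "positive-seeking" selection, not the MIP algorithm itself, which the abstract does not specify. The function names, the 0.5 threshold, and the candidate labels in the usage note are hypothetical.

```python
def uncertainty_pick(probs):
    """Classic uncertainty sampling: choose the candidate nearest the 0.5 boundary.

    probs maps a candidate name to the classifier's predicted probability
    that the candidate is Positive (functionally active).
    """
    return min(probs, key=lambda name: abs(probs[name] - 0.5))


def positive_seeking_pick(probs, threshold=0.5):
    """Hypothetical positive-seeking variant (an assumption, not the paper's method).

    Restrict the pool to candidates predicted Positive, then choose the one
    nearest the decision boundary, i.e. the most informative of the predicted
    positives. Falls back to all candidates if none is predicted Positive.
    """
    positives = {name: p for name, p in probs.items() if p >= threshold}
    pool = positives or probs
    return min(pool, key=lambda name: abs(pool[name] - threshold))
```

For example, with made-up probabilities `{"mutA": 0.48, "mutB": 0.56, "mutC": 0.93}`, uncertainty sampling would select `mutA`, which is predicted negative, whereas the positive-seeking variant would select `mutB`, spending the experiment on a candidate that is both informative and predicted to be active.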
Intraspecies Variation in the Emergence of Hyperinfectious Bacterial Strains in Nature
Salmonella is a principal health concern because of its endemic prevalence in food and water supplies, the rise in incidence of multi-drug resistant strains, and the emergence of new strains associated with increased disease severity. Insights into pathogen emergence have come from animal-passage studies wherein virulence is often increased during infection. However, these studies did not address the prospect that a select subset of strains undergoes a pronounced increase in virulence during the infective process, a prospect that has significant implications for human and animal health. Our findings indicate that the capacity to become hypervirulent (100-fold decreased LD50) was much more evident in certain S. enterica strains than others. Hyperinfectious salmonellae were among the most virulent of this species; restricted to certain serotypes; and more capable of killing vaccinated animals. Such strains exhibited rapid (and rapidly reversible) switching to a less-virulent state accompanied by more competitive growth ex vivo that may contribute to maintenance in nature. The hypervirulent phenotype was associated with increased microbial pathogenicity (colonization; cytotoxin production; cytocidal activity), coupled with an altered innate immune cytokine response within infected cells (IFN-β; IL-1β; IL-6; IL-10). Gene expression analysis revealed that hyperinfectious strains display altered transcription of genes within the PhoP/PhoQ, PhoR/PhoB and ArgR regulons, conferring changes in the expression of classical virulence functions (e.g., SPI-1; SPI-2 effectors) and those involved in cellular physiology/metabolism (nutrient/acid stress). As hyperinfectious strains pose a potential risk to human and animal health, efforts toward mitigation of these potential food-borne contaminants may avert negative public health impacts and industry-associated losses.
Simultaneous evaluation of physical and social environmental correlates of physical activity in adults: A systematic review
Background:
Ecological models of physical activity posit that social and physical environmental features exert independent and interactive influences on physical activity, but previous research has focussed on independent influences. This systematic review aimed to synthesise the literature investigating how features of neighbourhood physical and social environments are associated with physical activity when both levels of influence are simultaneously considered, and to assess progress in the exploration of interactive effects of social and physical environmental correlates on physical activity.
Methods:
A systematic literature search was conducted in February 2016. Articles were included if they used an adult (≥15 years) sample, simultaneously considered at least one physical and one social environmental characteristic in a single statistical model, used self-reported or objectively-measured physical activity as a primary outcome, reported findings from quantitative, observational analyses and were published in a peer-reviewed journal. Combined measures including social and physical environment items were excluded as they did not permit investigation of independent and interactive social and physical effects. Forty-six studies were identified.
Results:
An inconsistent evidence base for independent environmental correlates of physical activity was revealed, with some support for specific physical and social environment correlates. Most studies found significant associations between physical activity and both physical and social environmental variables. There was preliminary evidence that physical and social environmental variables had interactive effects on activity, although only 4 studies examined interactive effects.
Conclusions:
Inconsistent evidence of independent associations between environmental variables and physical activity could be partly due to unmeasured effect modification (e.g. interactive effects) creating unaccounted variance in relationships between the environment and activity. Results supported multiple levels of environmental influence on physical activity. It is recommended that further research uses simultaneous or interaction analyses to gain insight into complex relationships between neighbourhood social and physical environments and physical activity, as there is currently limited research in this area.
Search for dark matter produced in association with bottom or top quarks in √s = 13 TeV pp collisions with the ATLAS detector
A search for weakly interacting massive particle dark matter produced in association with bottom or top quarks is presented. Final states containing third-generation quarks and missing transverse momentum are considered. The analysis uses 36.1 fb⁻¹ of proton–proton collision data recorded by the ATLAS experiment at √s = 13 TeV in 2015 and 2016. No significant excess of events above the estimated backgrounds is observed. The results are interpreted in the framework of simplified models of spin-0 dark-matter mediators. For colour-neutral spin-0 mediators produced in association with top quarks and decaying into a pair of dark-matter particles, mediator masses below 50 GeV are excluded assuming a dark-matter candidate mass of 1 GeV and unitary couplings. For scalar and pseudoscalar mediators produced in association with bottom quarks, the search sets limits on the production cross-section of 300 times the predicted rate for mediators with masses between 10 and 50 GeV and assuming a dark-matter mass of 1 GeV and unitary coupling. Constraints on colour-charged scalar simplified models are also presented. Assuming a dark-matter particle mass of 35 GeV, mediator particles with mass below 1.1 TeV are excluded for couplings yielding a dark-matter relic density consistent with measurements.
Measurement of the Ratio of b Quark Production Cross Sections in Antiproton-Proton Collisions at 630 GeV and 1800 GeV
We report a measurement of the ratio of the bottom quark production cross
section in antiproton-proton collisions at 630 GeV to 1800 GeV using bottom
quarks with transverse momenta greater than 10.75 GeV identified through their
semileptonic decays and long lifetimes. The measured ratio
σ(630)/σ(1800) = 0.171 ± 0.024 ± 0.012 is in good agreement with
next-to-leading order (NLO) quantum chromodynamics (QCD)
- …