71 research outputs found
Cross-language high similarity search using a conceptual thesaurus
This work addresses the issue of cross-language high similarity and
near-duplicates search, where, for the given document, a highly similar one is to
be identified from a large cross-language collection of documents. We propose
a concept-based similarity model for the problem which is very light in computation
and memory. We evaluate the model on three corpora of different nature
and two language pairs English-German and English-Spanish using the Eurovoc
conceptual thesaurus. Our model is compared with two state-of-the-art models,
and we find that, although the proposed model is very generic, it produces competitive
results and is significantly stable and consistent across the corpora.
This work was done in the framework of the VLC/CAMPUS Microcluster on Multimodal Interaction in Intelligent Systems and it has been partially
funded by the European Commission as part of the WIQ-EI IRSES project (grant no.
269180) within the FP 7 Marie Curie People Framework, and by the Text-Enterprise
2.0 research project (TIN2009-13391-C04-03). The research work of the second author
is supported by the CONACyT 192021/302009 grant.
Gupta, P.; Barrón Cedeño, LA.; Rosso, P. (2012). Cross-language high similarity search using a conceptual thesaurus. In Information Access Evaluation. Multilinguality, Multimodality, and Visual Analytics. Springer Verlag (Germany). 7488:67-75. https://doi.org/10.1007/978-3-642-33247-0_8
Simple estimators of the intensity of seasonal occurrence
Background: Edwards's method is a widely used approach for fitting a sine curve to a time-series of monthly frequencies. From this fitted curve, estimates of the seasonal intensity of occurrence (i.e., peak-to-low ratio of the fitted curve) can be generated. Methods: We discuss various approaches to the estimation of seasonal intensity assuming Edwards's periodic model, including maximum likelihood estimation (MLE), least squares, weighted least squares, and a new closed-form estimator based on a second-order moment statistic and non-transformed data. Through an extensive Monte Carlo simulation study, we compare the finite sample performance characteristics of the estimators discussed in this paper. Finally, all estimators and confidence interval procedures discussed are compared in a re-analysis of data on the seasonality of monocytic leukemia. Results: We find that Edwards's estimator is substantially biased, particularly for small numbers of events and very large or small amounts of seasonality. For the common setting of rare events and moderate seasonality, the new estimator proposed in this paper yields less finite sample bias and better mean squared error than either the MLE or weighted least squares. For large studies and strong seasonality, MLE or weighted least squares appears to be the optimal analytic method among those considered. Conclusion: Edwards's estimator of the seasonal relative risk can exhibit substantial finite sample bias. The alternative estimators considered in this paper should be preferred.
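As a rough illustration of the quantities named in this abstract, the sketch below fits a sine curve to twelve monthly counts by ordinary (unweighted) least squares and converts the fitted amplitude into a peak-to-low ratio. It assumes the periodic model takes the form rate ∝ 1 + α·cos(θ − θ*), with each month placed at its midpoint on the circle. The function names and the sample counts are hypothetical, and this is neither Edwards's original estimator nor the new closed-form estimator proposed in the paper.

```python
import math


def fit_seasonal_amplitude(monthly_counts):
    """Least-squares fit of mean * (1 + alpha * cos(theta - peak)) to counts.

    Assumes equally spaced periods (e.g. 12 months), each placed at its
    midpoint angle. Returns (alpha, peak_angle_in_radians).
    """
    n = len(monthly_counts)
    angles = [2 * math.pi * (i + 0.5) / n for i in range(n)]
    mean = sum(monthly_counts) / n
    # For equally spaced angles, cos and sin are orthogonal and each has
    # sum of squares n/2, so the regression coefficients are simple sums.
    c = 2 / n * sum(y * math.cos(t) for y, t in zip(monthly_counts, angles))
    s = 2 / n * sum(y * math.sin(t) for y, t in zip(monthly_counts, angles))
    alpha = math.hypot(c, s) / mean
    peak = math.atan2(s, c) % (2 * math.pi)
    return alpha, peak


def peak_to_low_ratio(alpha):
    """Seasonal intensity: ratio of the fitted curve's maximum to its minimum."""
    if not 0 <= alpha < 1:
        raise ValueError("alpha must lie in [0, 1) for a finite ratio")
    return (1 + alpha) / (1 - alpha)


# Hypothetical noise-free counts generated from the model itself
# (mean 100 per month, alpha = 0.3, peak at 1.0 radian).
_angles = [2 * math.pi * (i + 0.5) / 12 for i in range(12)]
example_counts = [100 * (1 + 0.3 * math.cos(t - 1.0)) for t in _angles]
alpha_hat, peak_hat = fit_seasonal_amplitude(example_counts)
ratio_hat = peak_to_low_ratio(alpha_hat)
```

On noise-free input the fit recovers the generating amplitude; with real, sparse counts the estimate is noisy, which is the finite-sample behaviour the abstract examines.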
Quantification of selection bias in studies of risk factors for birth defects among livebirths
Background: Risk factors for birth defects are frequently investigated using data limited to liveborn infants. By conditioning on survival, results of such studies may be distorted by selection bias, also described as “livebirth bias.” However, the implications of livebirth bias on risk estimation remain poorly understood. Objectives: We sought to quantify livebirth bias and to investigate the conditions under which it arose. Methods: We used data on 3994 birth defects cases and 11 829 controls enrolled in the National Birth Defects Prevention Study to compare odds ratio (OR) estimates of the relationship between three established risk factors (antiepileptic drug use, smoking, and multifetal pregnancy) and four birth defects (anencephaly, spina bifida, omphalocele, and cleft palate) when analyses were restricted to livebirths with estimates from analyses that included livebirths, stillbirths, and elective terminations. Exposures and birth defects represented varying strengths of association with livebirth; all controls were liveborn. We performed a quantitative bias analysis to evaluate the sensitivity of our results to excluding terminated and stillborn controls. Results: Cases ranged from 33% liveborn (anencephaly) to 99% (cleft palate). Smoking and multifetal pregnancy were associated with livebirth among anencephaly (crude OR [cOR] 0.61 and cOR 3.15, respectively) and omphalocele cases (cOR 2.22 and cOR 5.22, respectively). For analyses of the association between exposures and birth defects, restricting to livebirths produced negligible differences in estimates except for anencephaly and multifetal pregnancy, for which the estimate was twofold higher among livebirths (adjusted OR [aOR] 4.93) than among all pregnancy outcomes (aOR 2.44). Within tested scenarios, bias analyses suggested that results were not sensitive to the restriction to liveborn controls. Conclusions: Selection bias was generally limited except for high mortality defects in the context of exposures strongly associated with livebirth. Findings indicate that substantial livebirth bias is unlikely to affect studies of risk factors for most birth defects.
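For readers unfamiliar with the crude odds ratio (cOR) reported in this abstract, the sketch below shows the standard calculation from a 2×2 table, together with a Wald-type 95% confidence interval on the log scale. The function name and the counts are hypothetical, chosen only for illustration; they are not taken from the National Birth Defects Prevention Study, and the adjusted ORs in the abstract would require a regression model not shown here.

```python
import math


def crude_odds_ratio(exposed_cases, unexposed_cases,
                     exposed_controls, unexposed_controls, z=1.96):
    """Crude odds ratio from a 2x2 table with a Wald confidence interval.

    Returns (odds_ratio, lower, upper). All four cells must be positive.
    """
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) (Woolf's formula).
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts: 20 exposed and 80 unexposed cases,
# 10 exposed and 90 unexposed controls.
or_hat, ci_low, ci_high = crude_odds_ratio(20, 80, 10, 90)
```

Comparing such an estimate computed on livebirths only with one computed on all pregnancy outcomes is, in outline, the contrast the study uses to gauge livebirth bias.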
The national birth defects prevention study: A review of the methods
The National Birth Defects Prevention Study (NBDPS) is a large population-based multi-center case-control study of major birth defects in the United States
Maternal Exposure to Criteria Air Pollutants and Congenital Heart Defects in Offspring: Results from the National Birth Defects Prevention Study
Background: Epidemiologic literature suggests that exposure to air pollutants is associated with fetal development
- …