54 research outputs found

    Mathematical Philology: Entropy Information in Refining Classical Texts' Reconstruction, and Early Philologists' Anticipation of Information Theory

    Philologists reconstructing ancient texts from variously miscopied manuscripts anticipated information theorists by centuries in conceptualizing information in terms of probability. An example is the editorial principle difficilior lectio potior (DLP): in choosing between otherwise acceptable alternative wordings in different manuscripts, “the more difficult reading [is] preferable.” As philologists at least as early as Erasmus observed (and as information theory's version of the second law of thermodynamics would predict), scribal errors tend to replace less frequent and hence entropically more information-rich wordings with more frequent ones. Without measurements, it has been unclear how effectively DLP has been used in the reconstruction of texts, and how effectively it could be used. We analyze a case history of acknowledged editorial excellence that mimics an experiment: the reconstruction of Lucretius's De Rerum Natura, beginning with Lachmann's landmark 1850 edition based on the two oldest manuscripts then known. Treating words as characters in a code, and taking the occurrence frequencies of words from a current, more broadly based edition, we calculate the difference in entropy information between Lachmann's 756 pairs of grammatically acceptable alternatives. His choices average 0.26±0.20 bits higher in entropy information (95% confidence interval, P = 0.005), as against the single bit that determines the outcome of a coin toss, and the average 2.16±0.10 bits (95%) of (predominantly meaningless) entropy information if the rarer word had always been chosen. As a channel width, 0.26±0.20 bits/word corresponds to a 0.79 (+0.09/−0.15) likelihood of the rarer word being the one accepted in the reference edition, which is consistent with the observed 547/756 = 0.72±0.03 (95%).
Statistically informed application of DLP can recover substantial amounts of semantically meaningful entropy information from noise; hence the extension copiosior informatione lectio potior, “the reading richer in information [is] preferable.” New applications of information theory promise continued refinement in the reconstruction of culturally fundamental texts.
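
The arithmetic behind the abstract's figures can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and reading the "channel width" as the capacity of a binary symmetric channel (1 minus the binary entropy of the likelihood) is an assumption that happens to reproduce the quoted numbers; the abstract does not state the formula.

```python
import math


def information_gain_bits(rare_count: int, common_count: int) -> float:
    """Surprisal difference, in bits, between a rarer and a more common word.

    Each word's surprisal is -log2(count / total); the corpus total cancels
    in the difference, leaving log2(common_count / rare_count).
    """
    return math.log2(common_count / rare_count)


def binary_entropy(p: float) -> float:
    """Entropy in bits of a two-outcome event with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def channel_capacity_bits(p: float) -> float:
    """ASSUMPTION: capacity of a binary symmetric channel that is right with probability p."""
    return 1.0 - binary_entropy(p)


def likelihood_from_capacity(capacity: float, tol: float = 1e-9) -> float:
    """Invert channel_capacity_bits on [0.5, 1] by bisection (it is increasing there)."""
    lo, hi = 0.5, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if channel_capacity_bits(mid) < capacity:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Under that assumption, `likelihood_from_capacity(0.26)` comes out near 0.79, and the interval endpoints 0.06 and 0.46 bits map to roughly 0.64 and 0.88, in line with the abstract's central value of 0.79 with 0.15 below and 0.09 above. The observed fraction 547/756 rounds to 0.72.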

    Proof of concept of a method that assesses the spread of microbial infections with spatially explicit and non-spatially explicit data

    Background: A method that assesses bacterial spatial dissemination was explored. It measures microbial genotypes (defined by electrophoretic patterns or EP), host, location (farm), interfarm Euclidean distance, and time. Its proof of concept (construct and internal validity) was evaluated using a dataset that included 113 Staphylococcus aureus EPs from 1126 bovine milk isolates collected on 23 farms between 1988 and 2005. Results: Construct validity was assessed by comparing results based on the interfarm Euclidean distance (a spatially explicit measure) and those produced by the (non-spatial) interfarm number of isolates reporting the same EP. The distance associated with EP spread correlated with the interfarm number of isolates/EP (r = 0.59, P < 0.02). Internal validity was estimated by comparing results obtained with different versions of the same indices. Concordance was observed between: (a) EP distance (estimated microbial dispersal over space) and EP speed (distance/year, r = 0.72, P < 0.01), and (b) the interfarm number of isolates/EP (when measured on the basis of non-repeated cow testing) and the same measure as expressed by repeated testing of the same animals (r = 0.87, P < 0.01). Three EPs (2.6% of all EPs) appeared to be super-spreaders: they were found in 26.75% of all isolates. Various indices differentiated local from spatially disseminated infections and, within the local type, infections suspected to be farm-related were distinguished from cow-related ones. Conclusion: Findings supported both construct and internal validity.
Because 3 EPs explained 12 times more isolates than expected and at least twice as many isolates as other EPs did, false negative results associated with the remaining EPs (those erroneously identified as lacking spatial dispersal when, in fact, they disseminated spatially), if they occurred, seemed to have negligible effects. Spatial analysis of laboratory data may support disease surveillance systems by generating hypotheses on microbial dispersal ability.

    Risk-Based Consumption Advice for Farmed Atlantic and Wild Pacific Salmon Contaminated with Dioxins and Dioxin-like Compounds

    We reported recently that several organic contaminants occurred at elevated concentrations in farmed Atlantic salmon compared with concentrations of the same contaminants in wild Pacific salmon [Hites et al. Science 303:226–229 (2004)]. We also found that polychlorinated biphenyls (PCBs), toxaphene, dieldrin, dioxins, and polybrominated diphenyl ethers occurred at higher concentrations in European farm-raised salmon than in farmed salmon from North and South America. Health risks (based on a quantitative cancer risk assessment) associated with consumption of farmed salmon contaminated with PCBs, toxaphene, and dieldrin were higher than risks associated with exposure to the same contaminants in wild salmon. Here we present information on cancer and noncancer health risks of exposure to dioxins in farmed and wild salmon. The analysis is based on a tolerable intake level for dioxin-like compounds established by the World Health Organization and on risk estimates for human exposure to dioxins developed by the U.S. Environmental Protection Agency. Consumption of farmed salmon at relatively low frequencies results in elevated exposure to dioxins and dioxin-like compounds with commensurate elevation in estimates of health risk.

    Optimization of Epidemiologic Interventions: Evaluation of Spatial and Non-Spatial Methods That Identify Johne’s Disease-Infected Subpopulations Targeted for Intervention

    The potential costs and/or benefits associated with two epidemiological methods were compared. Using the same epidemiologic dataset (74 Israeli dairy herds tested for bovine paratuberculosis of which 57 farms were regarded to be infected, and 619 non-tested herds), the efficacy associated with the identification of the target population where control or preventive measures could be applied was evaluated by: 1) A method that applied geographical information systems (GIS), spatial statistics, network analysis (infective spatial links or ISL); and 2) A method that only partially applied spatial techniques. Based on the herd size of tested and non-tested farms, the geographical area of influence of each infected farm was estimated. Using the Euclidean distance between tested farms (distances between 2701 farm pairs), the ISL method calculated two measures of spatial connectivity: the number of links/farm and the ISL index. These measures are analogous to the number of roads connecting a city (links/farm) and the width of a road (index). The more links and/or the greater the average index (width), the greater the chances of an infected farm to disseminate an infection (especially to neighboring farms). While not reaching statistical significance, positive indices of Moran's I test for some spatial lags prompted the additional investigation of a subset of 547 farm pairs. This subset included 33 farm pairs (16 individual farms) which displayed > 2 links/farm, and ISL indices > 7.5 times greater than average (high ISL farms). Regarding as cost the number of infected cows selected to receive an intervention, and as benefit the number of susceptible cows within the area of influence of an infected farm, hypothetical interventions implemented on the 16 high ISL farms yielded 39% greater benefits and occupied a territory 9.5% smaller than decisions based on the 16 farms showing the highest prevalence.
The analysis on spatial infective connectivity may lead to earlier, farm-specific and more beneficial, decisions than methods based only on outcomes (later data), such as prevalence.
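
The figure of 2701 farm pairs is what one gets by pairing each of the 74 tested herds with every other once. The sketch below checks that count and shows how pairwise Euclidean distances could be tabulated. The function names, farm labels, and coordinates are hypothetical, and the ISL links and index themselves are not reproduced because the abstract does not define them.

```python
import math
from itertools import combinations


def farm_pair_count(n_farms: int) -> int:
    """Number of unordered pairs among n_farms farms: n * (n - 1) / 2."""
    return math.comb(n_farms, 2)


def pairwise_distances(coords: dict) -> dict:
    """Euclidean distance for every unordered pair of farms.

    coords maps a farm label to an (x, y) position in any planar unit.
    """
    return {
        (a, b): math.dist(coords[a], coords[b])
        for a, b in combinations(sorted(coords), 2)
    }


# Hypothetical coordinates, for illustration only.
example_farms = {
    "farm_a": (0.0, 0.0),
    "farm_b": (3.0, 4.0),
    "farm_c": (6.0, 8.0),
}
example_distances = pairwise_distances(example_farms)
```

Here `farm_pair_count(74)` returns 2701, matching the abstract, and `example_distances` holds one entry for each of the three pairs of example farms.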

    A Model for the Information Content of Earnings Announcements

    28 pages, 1 article. A Model for the Information Content of Earnings Announcements (Schwager, Steven J.; Richardson, Gordon D.)

    Stagewise Discrimination Algorithms for Selecting a Subset of Groups of Discriminant Variables

    18 pages, 1 article. Stagewise Discrimination Algorithms for Selecting a Subset of Groups of Discriminant Variables (Evans, John C.; Robson, Douglas S.; Schwager, Steven J.)

    Identifying A Reduced Set of Salient Attributes That Influence Consumers' Choice Among Whole, Low-Fat, and Skim Milk for Beverage Use

    R.B. 94-6. Fishbein's Theory of Reasoned Action models behavior as based on beliefs and evaluations on a small set of salient attributes. Two methods of reducing large sets of potentially salient attributes into a smaller set of salient attributes are proposed. The methods are based on expectancy valuation analysis and logistic regression analysis. When applied to consumer beliefs and evaluations on 59 attributes over three milk types (whole, low-fat, and skim milk), both methods identify reduced sets of attributes. The reduced attribute sets are then used to model whether or not respondents drink a particular milk type. Results indicate that the reduced models are statistically significant in explaining choice of milk type although there is some loss of information as compared to models with 59 attributes. Furthermore, the data indicate that statistically-imputed evaluation ratings differ from self-stated evaluation ratings.