2,985 research outputs found

    Foundational principles for large scale inference: Illustrations through correlation mining

    When can reliable inference be drawn in the "Big Data" context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics the dataset is often variable-rich but sample-starved: a regime where the number $n$ of acquired samples (statistical replicates) is far smaller than the number $p$ of observed variables (genes, neurons, voxels, or chemical constituents). Much recent work has focused on understanding the computational complexity of proposed methods for "Big Data." Sample complexity, however, has received relatively less attention, especially in the setting where the sample size $n$ is fixed and the dimension $p$ grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; 3) the purely high dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the last regime applies to exa-scale data dimension. We illustrate this high dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that is of interest. We demonstrate various regimes of correlation mining based on the unifying perspective of high dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.

    Large Scale Correlation Screening

    This paper treats the problem of screening for variables with high correlations in high dimensional data in which there can be many fewer samples than variables. We focus on threshold-based correlation screening methods for three related applications: screening for variables with large correlations within a single treatment (autocorrelation screening); screening for variables with large cross-correlations over two treatments (cross-correlation screening); screening for variables that have persistently large auto-correlations over two treatments (persistent-correlation screening). The novelty of correlation screening is that it identifies a smaller number of variables which are highly correlated with others, as compared to identifying a number of correlation parameters. Correlation screening suffers from a phase transition phenomenon: as the correlation threshold decreases the number of discoveries increases abruptly. We obtain asymptotic expressions for the mean number of discoveries and the phase transition thresholds as a function of the number of samples, the number of variables, and the joint sample distribution. We also show that under a weak dependency condition the number of discoveries is dominated by a Poisson random variable giving an asymptotic expression for the false positive rate. The correlation screening approach bears tremendous dividends in terms of the type and strength of the asymptotic results that can be obtained. It also overcomes some of the major hurdles faced by existing methods in the literature as correlation screening is naturally scalable to high dimension. Numerical results strongly validate the theory that is presented in this paper. 
We illustrate the application of the correlation screening methodology on a large scale gene-expression dataset, revealing a few influential variables that exhibit a significant amount of correlation over multiple treatments. Comment: 33 pages, 7 figures; Changes in version 2: There are no changes in the technical material in this revised version. The only changes are correcting typographical errors and referencing related work in the area. There is also material in the introduction where more context to the correlation screening problem is given (especially in terms of relationships to other testing methods).
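
The threshold-based screening idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `correlation_screen` and `discovery_counts`, the use of the largest absolute sample correlation as the screening statistic, and the synthetic Gaussian data are all assumptions made for illustration. It only shows how the number of discoveries grows as the correlation threshold is lowered when there are far fewer samples than variables.

```python
import numpy as np


def correlation_screen(X, rho):
    """Return indices of variables whose largest absolute sample correlation
    with any other variable is at least rho.

    X is an (n samples) x (p variables) array; rho is a threshold in [0, 1].
    """
    R = np.corrcoef(X, rowvar=False)   # p x p sample correlation matrix
    np.fill_diagonal(R, 0.0)           # ignore each variable's self-correlation
    max_abs = np.abs(R).max(axis=1)    # strongest partner for each variable
    return np.flatnonzero(max_abs >= rho)


def discovery_counts(X, thresholds):
    """Number of screened variables for each threshold (hypothetical helper)."""
    return [len(correlation_screen(X, rho)) for rho in thresholds]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 10, 500                     # sample-starved: n much smaller than p
    X = rng.standard_normal((n, p))    # independent variables, so every hit is a false positive
    for rho in (0.95, 0.9, 0.8, 0.7):
        print(rho, len(correlation_screen(X, rho)))
```

Running the script on independent variables gives a rough feel for the abrupt growth in discoveries as the threshold decreases; the asymptotic expressions for the mean number of discoveries and the phase transition threshold are given in the paper itself and are not reproduced here.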

    1-(4-Bromophenyl)ferrocene

    In the title compound, [Fe(C5H5)(C11H8Br)], the distances of the Fe atom from the centroids of the unsubstituted and substituted cyclopentadienyl (Cp) rings are 1.644 (1) and 1.643 (1) Å, respectively. The ferrocenyl moiety deviates from an eclipsed geometry, with marginally tilted Cp rings and an interplanar angle between the Cp and benzene rings of 13.0 (4)°. The crystal structure is stabilized by C—H⋯π interactions between a cyclopentadienyl H atom and the cyclopentadienyl ring of a neighbouring molecule.

    Lubricating Effects of Cocoa Butter and Coconut Oil in Conventional Paracetamol Tablets

    Background: Because the chemical instability of some Active Pharmaceutical Ingredients is often caused by magnesium stearate and its impurities, it is expedient to investigate other materials, especially of natural origin, which might exhibit better lubricating activity and be chemically inactive, less bioactive and less prohibitive. Objective: This work is designed to examine the lubricating properties of cocoa butter and coconut oil as alternative lubricants in comparison with the conventional lubricant, magnesium stearate, at different concentrations in paracetamol tablets. Materials and Methods: Cocoa butter was extracted from the seeds of Theobroma cacao and coconut oil from the meat of matured coconuts harvested from the coconut palm (Cocos nucifera). Physicochemical evaluation was carried out on the extracted oils. Thirteen different formulations were prepared using different lubricants (magnesium stearate, cocoa butter and coconut oil) at 0 – 4 %w/w concentrations. The prepared granules were evaluated for various pre-compression characteristics (bulk density, tapped density, angle of repose, Hausner's quotient and Carr's index) and post-compression characteristics (weight variation, friability, hardness, disintegration and dissolution times). Discussion: Most of the values obtained from the evaluation of pre- and post-compression characteristics fall within the pharmacopoeial limits. Disintegration time was observed to increase as the lubricant concentration increased, but showed no direct relationship with dissolution time. Tablet hardness decreased while friability increased as the lubricant concentration increased for all the batches.
From the study, cocoa butter and coconut oil at 2 – 4 % exhibited an effective lubricating effect in the formulation of paracetamol tablets with respect to their values of weight variation, friability, hardness, disintegration and dissolution times. Conclusion: Cocoa butter and coconut oil could be employed as good alternative lubricants to the conventional ones in pharmaceutical tablet formulation. Keywords: Lubricants, Cocoa butter, Coconut oil, Magnesium stearate

    Bromido-1κBr-tricarbonyl-2κ³C-(2η⁵-cyclopentadienyl)molybdenum(I)tungsten(I)(W—Mo)

    The title compound, [WMoBr(C5H5)(CO)3], is built up from a pseudo-square-pyramidal piano-stool coordination around the Mo atom, the important geometric parameters being Mo—W = 2.6872 (7) Å, W—Br = 2.5591 (9) Å and Mo—W—Br = 158.35 (3)°.

    Fine structure of $\mathrm{K}$-excitons in multilayers of transition metal dichalcogenides

    Reflectance and magneto-reflectance experiments together with theoretical modelling based on the $\mathbf{k\cdot p}$ approach have been employed to study the evolution of direct bandgap excitons in MoS$_2$ layers with a thickness ranging from mono- to trilayer. The extra excitonic resonances observed in MoS$_2$ multilayers emerge as a result of the hybridization of Bloch states of each sub-layer due to the interlayer coupling. The properties of such excitons in bi- and trilayers are classified by the symmetry of corresponding crystals. The inter- and intralayer character of the reported excitonic resonances is fingerprinted with the magneto-optical measurements: the excitonic $g$-factors of opposite sign and of different amplitude are revealed for these two types of resonances. The parameters describing the strength of the spin-orbit interaction are estimated for bi- and trilayer MoS$_2$. Comment: 14 pages, 10 figures

    Electrical Resistivity Survey on Two Waste Dumpsites at Nguru, Potiskum, Yobe State, Nigeria to Determine the Effect of Leachates on Ground Water Aquifer

    This research investigates the effect of leachate on groundwater at two dumpsites, in Nguru and Potiskum, both in Yobe State, Nigeria. A total of seven (7) and eight (8) vertical electrical soundings (VES) were acquired with the Schlumberger array, alongside measurements using the Wenner electrode configuration. The results were interpreted using WinRESIST for VES and IPWIN2INV for ERT. The study indicates that the area comprises four geoelectric layers, described as topsoil, clay, sand, sandy clay and sand. The first geoelectric layer has resistivity ranging from 6.16 Ωm to 332 Ωm, thickness from 2.77 m to 37.7 m and depth from 2.77 m to 37.7 m. The second layer has resistivity from 16.5 Ωm to 37.9 Ωm and thickness from 4.1 m to 10.7 m. The third layer, which is the first aquifer in the area, has resistivity from 101.2 Ωm to 288.2 Ωm and thickness from 38.9 m to 99.7 m. The fourth layer has resistivity from 100.7 Ωm to 214.3 Ωm and thickness from 28.5 m to 94 m. The fifth layer, which is the second aquifer, has resistivity from 254 Ωm to 350 Ωm and a very large thickness. The resistivity of the first aquifer thus ranges from 101.2 Ωm to 288.2 Ωm and that of the second aquifer from 253.8 Ωm to 350.1 Ωm. The 2D ERT profiles revealed low-resistivity zones, interpreted as zones penetrated by contaminants originating from the dumpsites, whereas high-resistivity zones represent areas of low-conductivity or non-conductive materials. Data obtained from four dumpsites indicated that leachate from the waste dumpsites penetrated the aquifers and polluted the groundwater. The presence of contaminants in the water was indicated by a decrease in the formation resistivity values.
The results of the geophysical survey show that the water in the area is polluted, which accounts for the prevalence of the water-related diseases that are common in the area.

    A comparative study of computational procedures for the resource constrained project scheduling problem

    The performance of two new integer programming based heuristics, together with some special purpose algorithms for project scheduling, is tested from a computational point of view. The objective of the study is to compare the quality of the solutions obtained with these algorithms and to reach conclusions about their relative merits on this specific problem. © 1994

    Isolation and characterisation of microorganisms contaminating herbal infusion sold in Minna, Nigeria

    Ten herbal infusion samples from ten different locations in Minna, Niger State were assessed microbiologically. The microbial contamination of the herbal products was assessed using standard methods. The pour plate method was used to cultivate serially diluted portions of the medicinal plant infusion samples. The results revealed that all the herbal preparations contained microbial contaminants. The total heterotrophic counts of the different herbal samples ranged from 0 cfu/mL to 25.0 × 10⁸ cfu/mL while the total fungal counts ranged from 3.0 × 10⁶ cfu/mL to 3.5 × 10⁸ cfu/mL. The total viable bacterial counts showed that the highest count of 25.0 × 10⁸ cfu/mL was recorded in the sample from Bosso and the lowest count of 0 cfu/mL in the sample from Kasuwan-Gwari, while the total fungal counts showed that the highest count of 3.5 × 10⁸ cfu/mL was found in the sample obtained from FUT campus and the lowest count of 3.0 × 10⁶ cfu/mL in the sample from Mai-Kunkele. One-way analysis of variance (ANOVA) showed that there was a significant difference (p<0.05) in the microbial load of the herbal infusions from each location. The microbial isolates identified were E. coli, Staphylococcus aureus, Shigella sp., Klebsiella sp., Pseudomonas sp., Micrococcus sp., Salmonella sp., Aspergillus sp., Penicillium sp. and Saccharomyces cerevisiae. Members of the genus Aspergillus were found to be predominant. This suggests that the herbal infusions harbor microorganisms that could be hazardous to human health, and hence producers should maintain the highest possible level of hygiene during the processing and packaging of the products in order to ensure their safety.