On the multiresolution structure of Internet traffic traces
Internet traffic on a network link can be modeled as a stochastic process. After the properties of this process are detected and quantified with statistical tools, a series of mathematical models is developed, culminating in one that generates "traffic" exhibiting, as a key feature, the same scale-dependent behavior observed in real traffic, and that is moreover indistinguishable from real traffic by other statistical tests as well. Tools inspired by the models are then used to determine and calibrate the type of activity taking place at each time scale. Surprisingly, the procedure does not require any detailed information about either the network dynamics or the decomposition of the total traffic into its constituent user connections, but only that these connections satisfy very weak conditions.
Comment: 57 pages, color figures. Figures are of low quality due to space considerations.
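The abstract does not spell out which multiresolution tools are applied, so the following is only a hedged sketch of a standard diagnostic for scale-dependent behavior: a variance-time analysis, which aggregates a traffic trace over increasing block sizes and tracks how the variance decays. The Poisson trace, the bin width, and the aggregation levels are invented stand-ins, not data or code from the paper.

```python
import numpy as np

def variance_time(counts, scales):
    """Variance of the trace after averaging over non-overlapping blocks of size m.

    For self-similar traffic the log-variance decays in m with slope 2H - 2
    (H = Hurst parameter); for independent traffic the slope is -1.
    """
    out = []
    for m in scales:
        n = len(counts) // m
        aggregated = counts[: n * m].reshape(n, m).mean(axis=1)
        out.append((m, aggregated.var()))
    return out

# Synthetic stand-in for a per-bin packet count series on a link (one entry per 10 ms bin).
rng = np.random.default_rng(0)
trace = rng.poisson(lam=50, size=2**16).astype(float)

for m, v in variance_time(trace, scales=[1, 4, 16, 64, 256]):
    print(f"aggregation level {m:4d}: variance {v:.3f}")
```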
How Good Are Standard Copulas Anyway?
First, we will raise a question: how good are standard copulas at capturing the dependency structure? To this end we will offer a series of simulated/numerical examples demonstrating that, more often than not, standard model copulas do not capture the underlying dependency structure. We believe that copula models, unlike other statistical tools, are too readily accepted by practitioners: rigorous goodness-of-fit tests are commonly replaced by off-hand statements like “it works well”. Accordingly, the second part of the talk offers a theoretical result, an umbrella-type theorem tailored for creating numerous goodness-of-fit tests for copulas.
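The talk's own examples are not reproduced in the abstract. As a minimal simulated illustration of the kind of mismatch described, the sketch below draws data from a Clayton copula (strong lower-tail dependence), fits a Gaussian copula by taking the correlation of the normal scores, and compares joint lower-tail probabilities under the data and under the fitted model. The parameter values and sample sizes are arbitrary choices for illustration.

```python
import numpy as np
from scipy.stats import norm, rankdata

rng = np.random.default_rng(1)
n, theta = 5000, 2.0  # Clayton parameter 2.0 gives pronounced lower-tail dependence

# Simulate from a Clayton copula (Marshall-Olkin algorithm).
v = rng.gamma(shape=1.0 / theta, scale=1.0, size=n)
e = rng.exponential(size=(n, 2))
u = (1.0 + e / v[:, None]) ** (-1.0 / theta)

# "Fit" a Gaussian copula: correlation of the normal scores of the ranks.
z = norm.ppf(rankdata(u, axis=0) / (n + 1))
rho = np.corrcoef(z.T)[0, 1]

# Compare joint lower-tail probabilities: data vs. fitted Gaussian copula (Monte Carlo).
q = 0.05
empirical = np.mean((u[:, 0] < q) & (u[:, 1] < q))
sim = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=200_000)
gaussian = np.mean((norm.cdf(sim[:, 0]) < q) & (norm.cdf(sim[:, 1]) < q))
print(f"P(both margins below {q}): data {empirical:.4f} vs. Gaussian copula {gaussian:.4f}")
```

The fitted Gaussian copula typically understates the joint tail probability here, which is the sort of dependency-structure failure the talk is concerned with.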
A direct bootstrapping technique and its application to a novel goodness of fit test
We prove general theorems that characterize situations in which we can have asymptotic closeness between the original statistic H_n and its bootstrap version H_n^*, without stipulating the existence of weak limits. As one possible application we introduce a novel goodness-of-fit test based on a modification of the total variation metric. This new statistic is more sensitive than the Kolmogorov–Smirnov statistic, it applies to higher dimensions, and it does not converge weakly; but we show that it can be bootstrapped.
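The modified total-variation statistic itself is not given in the abstract, so the sketch below only illustrates the general pattern of "original statistic H_n versus bootstrap version H_n^*", using the familiar Kolmogorov-Smirnov distance as a stand-in statistic and a fully specified N(0, 1) null. It is not the paper's construction.

```python
import numpy as np
from scipy.stats import norm

def sup_distance(sample, cdf):
    """Kolmogorov-Smirnov distance sup_x |F_sample(x) - cdf(x)|."""
    x = np.sort(sample)
    n = len(x)
    return max(np.max(np.arange(1, n + 1) / n - cdf(x)),
               np.max(cdf(x) - np.arange(0, n) / n))

def ecdf(sample):
    """Empirical CDF of a sample as a callable."""
    xs = np.sort(sample)
    return lambda t: np.searchsorted(xs, t, side="right") / len(xs)

rng = np.random.default_rng(2)
data = rng.normal(loc=0.15, size=300)            # mildly off the null N(0, 1)

h_n = sup_distance(data, norm.cdf)               # original statistic H_n
f_n = ecdf(data)

B = 999
boot = [sup_distance(rng.choice(data, size=len(data), replace=True), f_n)  # H_n^*
        for _ in range(B)]

p_value = (1 + sum(b >= h_n for b in boot)) / (B + 1)
print(f"H_n = {h_n:.4f}, bootstrap p-value = {p_value:.3f}")
```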
Comparison of a Label-Free Quantitative Proteomic Method Based on Peptide Ion Current Area to the Isotope Coded Affinity Tag Method
Recently, several research groups have published methods for the determination of proteomic expression profiles by mass spectrometry without the use of exogenously added stable isotopes or stable isotope dilution theory. These so-called label-free methods have the advantage of allowing data on each sample to be acquired independently from all other samples, to which they can later be compared in silico for the purpose of measuring changes in protein expression between various biological states. We developed label-free software based on direct measurement of peptide ion current area (PICA) and compared it to two other methods: a simpler label-free method known as spectral counting and the isotope coded affinity tag (ICAT) method. Analysis by these methods of a standard mixture containing proteins of known but varying concentrations showed that they performed similarly, with a mean squared error of 0.09. Additionally, complex bacterial protein mixtures spiked with known concentrations of standard proteins were analyzed using the PICA label-free method. These results indicated that the PICA method detected all levels of standard spiked proteins at the 90% confidence level in this complex biological sample. This finding confirms that label-free methods based on direct measurement of the area under a single ion current trace perform as well as the standard ICAT method. Given that label-free methods allow experimental designs well beyond pair-wise comparison, methods such as our PICA method are well suited for proteomic expression profiling of large numbers of samples, as is needed in clinical analysis.
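The PICA software is not described at the code level in the abstract. Purely as an illustration of the underlying idea (quantifying a peptide by the area under its extracted ion current trace and comparing that area across samples), here is a hedged sketch with a synthetic chromatographic peak; the retention times, intensities, and integration window are invented, not taken from the study.

```python
import numpy as np

def peptide_ion_current_area(rt, intensity, rt_window):
    """Area under an extracted ion current trace inside a retention-time window (trapezoid rule)."""
    lo, hi = rt_window
    mask = (rt >= lo) & (rt <= hi)
    r, y = rt[mask], intensity[mask]
    return np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(r))

# Hypothetical extracted ion current traces for one peptide in two samples (synthetic).
rt = np.linspace(24.0, 26.0, 201)                      # retention time, minutes
peak = np.exp(-((rt - 25.0) ** 2) / (2 * 0.05 ** 2))   # Gaussian elution profile
sample_a = 1.0e6 * peak                                # arbitrary intensity units
sample_b = 2.1e6 * peak                                # roughly two-fold more abundant

area_a = peptide_ion_current_area(rt, sample_a, (24.5, 25.5))
area_b = peptide_ion_current_area(rt, sample_b, (24.5, 25.5))
ratio = area_b / area_a
print(f"B/A expression ratio: {ratio:.2f} (log2 = {np.log2(ratio):.2f})")
```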
THE RISK MODEL OF DEVELOPING SCHIZOPHRENIA BASED ON TEMPERAMENT AND CHARACTER
Introduction: Cloninger's psychological model of temperament and character holds that personality development is influenced by biological and psychological processes. The aim of this study is to examine personality dimensions and to determine which variable best separates the healthy from the ill and could serve as a possible psychological marker for the presence of the illness.
Methods: This research included 152 subjects: 76 patients with schizophrenia and 76 healthy controls selected, on the basis of medical interviews, by random population sampling from a wider social community; group differences were assessed with independent t-tests. The Temperament and Character Inventory (TCI) was used to compare the personality traits of the patients with schizophrenia and the healthy control group. Dependence of variables in these categories was assessed using the Chi-square and Fisher's tests, and the impact of variables on schizophrenia was tested using univariate and multivariate binary logistic regression. The same method was used to build the mathematical model.
Results: Compared with the control group, patients with schizophrenia exhibited higher Harm Avoidance (HA) and Self-Transcendence (ST) scores as well as lower Self-Directedness (SD) and Cooperativeness (C) scores. Multivariate binary logistic regression showed that the Responsibility, Purposefulness, Resourcefulness, Cooperativeness and Compassion dimensions were significantly more present in the patients with schizophrenia. The new variable, Model, composed of five TCI parameters, proved to be a reliable marker for separating the healthy from the ill (area = 0.896, p < 0.0005), with good sensitivity (80%) and specificity (84%).
Conclusions: The research has identified the Temperament and Character Inventory variables that best distinguish the healthy from the ill, and used them to build the mathematical model.
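The study's data are of course not available in the abstract. To make the analytic pipeline concrete (a composite score from multivariate binary logistic regression, evaluated by area under the ROC curve, sensitivity, and specificity), here is a hedged sketch on synthetic stand-ins for five TCI subscale scores; the numbers it prints are not the study's results.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
n = 152                                         # 76 patients and 76 controls, as in the design

# Hypothetical stand-ins for five TCI subscale scores; NOT the study data.
y = np.repeat([1, 0], 76)                       # 1 = patient, 0 = healthy control
X = rng.normal(size=(n, 5)) + y[:, None] * np.array([0.8, -0.6, -0.5, -0.7, 0.4])

model = LogisticRegression().fit(X, y)          # multivariate binary logistic regression
score = model.predict_proba(X)[:, 1]            # the composite "Model"-style variable

auc = roc_auc_score(y, score)                   # area under the ROC curve
predicted = score >= 0.5
sensitivity = np.mean(predicted[y == 1])
specificity = np.mean(~predicted[y == 0])
print(f"AUC = {auc:.3f}, sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```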
Genetic Variation Shapes Protein Networks Mainly through Non-transcriptional Mechanisms
Variation in the levels of co-regulated proteins that function within networks in an outbred yeast population is not driven by variation in the corresponding transcripts.
A Novel Hands-On Way to Teach Introductory Statistics
We present a novel approach to teaching introductory statistics classes. This technique has been implemented at FAU for several years and, based on a few thousand students, has been very well received. We greatly simplified the material by eliminating much of the redundant and tedious hand computation, as well as formula memorization. Instead, via Excel, we introduce hands-on analysis of real data, and we do this in the very first week! As the semester progresses, we add the necessary computer skills as well as statistical tools, as needed. By the end of the semester, the students can effortlessly open and manipulate data files containing tens of thousands of data points and perform multiple regression as well as t-tests.
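The course itself works in Excel; purely as a hedged illustration of the workflow described (open a data file, run a two-sample t-test, fit a multiple regression), here is an equivalent sketch in Python on a synthetic class data set. The variable names and numbers are invented.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for the kind of class data file students open in week one.
rng = np.random.default_rng(5)
n = 200
df = pd.DataFrame({
    "section": rng.choice(["A", "B"], size=n),
    "hours": rng.uniform(0, 12, size=n),
    "attendance": rng.integers(20, 31, size=n),
})
df["grade"] = 55 + 2.5 * df["hours"] + 0.8 * df["attendance"] + rng.normal(0, 6, size=n)

# Two-sample t-test: do the two sections differ in mean grade?
a = df.loc[df["section"] == "A", "grade"]
b = df.loc[df["section"] == "B", "grade"]
print(stats.ttest_ind(a, b, equal_var=False))

# Multiple regression of grade on hours studied and attendance (ordinary least squares).
X = np.column_stack([np.ones(n), df["hours"], df["attendance"]])
coef, *_ = np.linalg.lstsq(X, df["grade"].to_numpy(), rcond=None)
print("intercept, hours, attendance coefficients:", np.round(coef, 2))
```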
The bootstrap for empirical processes under dependence
In this thesis we establish that the blockwise bootstrap works for a large class of statistics. The main results are as follows: (i) A strongly mixing sequence satisfying the central limit theorem for the mean also satisfies the moving blocks bootstrap central limit theorem, in probability, even with bootstrapped norming; this result is optimal. (ii) The blockwise bootstrap of the empirical process for a stationary sequence, indexed by VC-subgraph classes of functions, converges weakly to the appropriate Gaussian process, conditionally in probability. The conditions imposed are only marginally stronger than the best known sufficient conditions for the regular CLT for these processes. (iii) The blockwise bootstrap of the classical empirical process for stationary strong mixing sequences converges weakly to the appropriate Brownian bridge, conditionally in probability. The conditions imposed are weaker than the weakest known sufficient conditions for a regular CLT for this process.
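The thesis results are asymptotic theorems; as a concrete reminder of what the moving blocks bootstrap in result (i) actually does, here is a minimal sketch that resamples overlapping blocks of a dependent (AR(1), hence strongly mixing) series and uses the resamples to estimate the standard error of the mean. The block length and series are arbitrary illustrative choices.

```python
import numpy as np

def moving_blocks_bootstrap(x, block_length, rng):
    """One moving-blocks-bootstrap resample of a stationary series x."""
    n = len(x)
    n_blocks = int(np.ceil(n / block_length))
    starts = rng.integers(0, n - block_length + 1, size=n_blocks)  # overlapping blocks allowed
    return np.concatenate([x[s:s + block_length] for s in starts])[:n]

rng = np.random.default_rng(4)

# A strongly mixing toy series: AR(1) with coefficient 0.6.
n, phi = 2000, 0.6
eps = rng.normal(size=n)
x = np.empty(n)
x[0] = eps[0]
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t]

# Bootstrap distribution of the sample mean, with dependence preserved within blocks.
boot_means = np.array([moving_blocks_bootstrap(x, block_length=30, rng=rng).mean()
                       for _ in range(2000)])
print(f"sample mean {x.mean():.4f}, MBB standard error {boot_means.std(ddof=1):.4f}")
print(f"naive iid standard error {x.std(ddof=1) / np.sqrt(n):.4f}  (too small under dependence)")
```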