Hierarchically nested factor model from multivariate data
We show how to achieve a statistical description of the hierarchical
structure of a multivariate data set. Specifically we show that the similarity
matrix resulting from a hierarchical clustering procedure is the correlation
matrix of a factor model, the hierarchically nested factor model. In this
model, factors are mutually independent and hierarchically organized. Finally,
we use a bootstrap based procedure to reduce the number of factors in the model
with the aim of retaining only those factors significantly robust with respect
to the statistical uncertainty due to the finite length of data records.
Comment: 7 pages, 5 figures; accepted for publication in Europhys. Lett.; the
Appendix corresponds to the additional material of the accepted letter.
Spanning Trees and bootstrap reliability estimation in correlation based networks
We introduce a new technique to associate a spanning tree with the average
linkage cluster analysis. We term this tree the Average Linkage Minimum
Spanning Tree. We also introduce a technique to associate a reliability
value with the links of correlation based graphs by using bootstrap replicas of
data. Both techniques are applied to the portfolio of the 300 most capitalized
stocks traded at the New York Stock Exchange during the time period 2001-2003. We
show that the Average Linkage Minimum Spanning Tree recognizes economic sectors
and sub-sectors as communities in the network slightly better than the Minimum
Spanning Tree does. We also show that the average reliability of links in the
Minimum Spanning Tree is slightly greater than the average reliability of links
in the Average Linkage Minimum Spanning Tree.
Comment: 17 pages, 3 figures
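The two ingredients named in this abstract, a minimum spanning tree built from correlations and a bootstrap reliability value per link, can be illustrated with a minimal sketch. It is not the authors' code. It assumes the commonly used correlation distance d = sqrt(2(1 - rho)), Kruskal's algorithm for the tree, and a reliability defined as the fraction of bootstrap replicas in which a link of the original tree reappears. The function names `mst_links` and `link_reliability` are hypothetical, and the Average Linkage variant of the tree is not implemented here.

```python
import math
import random


def correlation(x, y):
    """Pearson correlation of two equally long series (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mst_links(series):
    """Links (i, j), i < j, of the minimum spanning tree (Kruskal).

    Assumption: distance d_ij = sqrt(2 * (1 - rho_ij)).
    """
    n = len(series)
    edges = sorted(
        (math.sqrt(max(0.0, 2.0 * (1.0 - correlation(series[i], series[j])))), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(a):
        # Union-find root lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    links = set()
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so it belongs to the tree
            parent[ri] = rj
            links.add((i, j))
    return links


def link_reliability(series, n_boot=200, seed=0):
    """Fraction of bootstrap replicas whose tree contains each original link.

    Assumption: a replica resamples time indices with replacement, using the
    same indices for every series so that cross-correlations are preserved.
    """
    rng = random.Random(seed)
    t = len(series[0])
    base = mst_links(series)
    counts = {link: 0 for link in base}
    for _ in range(n_boot):
        idx = [rng.randrange(t) for _ in range(t)]
        replica = [[s[k] for k in idx] for s in series]
        boot = mst_links(replica)
        for link in base:
            if link in boot:
                counts[link] += 1
    return {link: c / n_boot for link, c in counts.items()}
```

Calling `link_reliability(series)` on a list of equally long return series gives a dictionary from tree links to values in [0, 1]; links between strongly correlated series would be expected to score close to 1.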
What a difference a term makes: the effect of educational attainment on marital outcomes in the UK
In the past, students in England and Wales born within the first 5 months of the academic year could leave school one term earlier than those born later in the year. Among women, those who were required to stay on an extra term more frequently hold some academic qualification. Using the requirement to stay on as an exogenous factor affecting academic attainment, we find that holding a low-level academic qualification has no effect on the probability of being currently married for women aged 25 or above, but increases the probability of the husband holding some academic qualification and being economically active.
Cost-effectiveness of alternative methods of surgical repair of inguinal hernia
Objectives: To assess the relative cost-effectiveness of laparoscopic methods of inguinal hernia repair compared with open flat mesh and open non-mesh repair. Methods: Data on the effectiveness of these alternatives came from three systematic reviews comparing: (i) laparoscopic methods with open flat mesh or non-mesh methods; (ii) open flat mesh with open non-mesh repair; and (iii) methods that used synthetic mesh to repair the hernia defect with those that did not. Data on costs were obtained from the authors of economic evaluations previously conducted alongside trials included in the reviews. A Markov model was used to model cost-effectiveness for a five-year period after the initial operation. The outcomes of the model were presented using a balance sheet approach and as cost per hernia recurrence avoided and cost per extra day at usual activities. Results: Open flat mesh was the most cost-effective method of preventing recurrences. Laparoscopic repair provided a shorter period of convalescence and less long-term pain compared with open flat mesh but was more costly. The mean incremental cost per additional day back at usual activities compared with open flat mesh was €38 and €80 for totally extraperitoneal and transabdominal preperitoneal repair, respectively. Conclusions: Laparoscopic repair is not cost-effective compared with open flat mesh repair in terms of cost per recurrence avoided. Decisions about the use of laparoscopic repair depend on whether the benefits (reduced pain and earlier return to usual activities) outweigh the extra costs and intraoperative risks. On the evidence presented here, these extra costs are unlikely to be offset by the short-term benefits of laparoscopic repair.
Luke Vale, Adrian Grant, Kirsty McCormack, Neil W. Scott and the EU Hernia Trialists Collaboration
Autonomous clustering using rough set theory
This paper proposes a clustering technique that minimises the need for subjective
human intervention and is based on elements of rough set theory. The proposed algorithm is
unified in its approach to clustering and makes use of both local and global data properties to
obtain clustering solutions. It handles single-type and mixed attribute data sets with ease.
Results from three data sets of single and mixed attribute types are used to illustrate the
technique and establish its efficiency.
Flocking-based Document Clustering on the Graphics Processing Unit
Analyzing and grouping documents by content is a complex problem. One explored method of solving this problem borrows from nature, imitating the flocking behavior of birds. Each bird represents a single document and flies toward other documents that are similar to it. One limitation of this method of document clustering is its complexity, O(n^2). As the number of documents grows, it becomes increasingly difficult to obtain results in a reasonable amount of time. However, flocking behavior, like most naturally inspired algorithms such as ant colony optimization and particle swarm optimization, is highly parallel and has shown improved performance on expensive cluster computers. In the last few years, the graphics processing unit (GPU) has received attention for its ability to solve highly-parallel and semi-parallel problems much faster than the traditional sequential processor. Some applications see a huge increase in performance on this new platform. The cost of these high-performance devices is also marginal when compared with the price of cluster machines. In this paper, we have conducted research to exploit this architecture and apply its strengths to the document flocking problem. Our results highlight the potential benefit the GPU brings to all naturally inspired algorithms. Using the CUDA platform from NVIDIA®, we developed a document flocking implementation to be run on the NVIDIA® GEFORCE 8800. Additionally, we developed a similar but sequential implementation of the same algorithm to be run on a desktop CPU. We tested the performance of each on groups of news articles ranging in size from 200 to 3,000 documents. The results of these tests were very significant. Performance gains ranged from three to nearly five times improvement of the GPU over the CPU implementation. This dramatic improvement in runtime makes the GPU a potentially revolutionary platform for document clustering algorithms.
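The O(n^2) cost mentioned in this abstract comes from comparing every document with every other document in each update. The sketch below is a deliberately simplified, sequential illustration of that idea, not the paper's CUDA implementation: it assumes 2-D positions, cosine similarity between hypothetical feature vectors, and a single attract-or-repel rule governed by an assumed `threshold`. The name `flock_step` and all of its parameters are illustrative assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def flock_step(positions, features, threshold=0.5, step=0.1):
    """One synchronous update of all document positions (needs >= 2 documents).

    Each document moves toward documents it resembles and away from the rest.
    """
    n = len(positions)
    new_positions = []
    for i in range(n):  # each document's update is independent of the others
        dx = dy = 0.0
        for j in range(n):  # n - 1 comparisons per document: O(n^2) overall
            if i == j:
                continue
            similar = cosine_similarity(features[i], features[j]) >= threshold
            sign = 1.0 if similar else -1.0
            dx += sign * (positions[j][0] - positions[i][0])
            dy += sign * (positions[j][1] - positions[i][1])
        x, y = positions[i]
        new_positions.append((x + step * dx / (n - 1), y + step * dy / (n - 1)))
    return new_positions
```

Because the outer loop reads only the previous positions, each iteration could in principle be assigned to its own thread, which is the property that makes this kind of algorithm a natural fit for a GPU.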
Factors explaining variance in perceived pain in women with fibromyalgia
BACKGROUND: We hypothesized that a substantial proportion of the subjectively experienced variance in pain in fibromyalgia patients would be explained by psychological factors alone, but that a combined model, including neuroendocrine and autonomic factors, would give the most parsimonious explanation of variance in pain. METHODS: Psychometric assessment included McGill Pain Questionnaire, General Health Questionnaire, Hospital Anxiety and Depression Rating Scale, Eysenck Personality Inventory, Neuroticism and Lie subscales, Toronto Alexithymia Scale, and Multidimensional Health Locus of Control Scale and was performed in 42 female patients with fibromyalgia and 48 female age-matched random sample population controls. A subgroup of the original sample (22 fibromyalgia patients and 13 controls) underwent a pharmacological challenge test with buspirone to assess autonomic and adrenocortical reactivity to serotonergic challenge. RESULTS: Although fibromyalgia patients scored high on neuroticism, anxiety, depression and general distress, only a minor part of variance in pain was explained by psychological factors alone. High pain score was associated with high neuroticism, low baseline cortisol level and small drop in systolic blood pressure after buspirone challenge test. This model explained 41.5% of total pain in fibromyalgia patients. In population controls, psychological factors alone were significant predictors for variance in pain. CONCLUSION: Fibromyalgia patients may have reduced reactivity in the central sympathetic system or perturbations in the sympathetic-parasympathetic balance. This study shows that a biopsychosocial model, including psychological factors as well as factors related to perturbations of the autonomic nervous system and hypothalamic-pituitary-adrenal axis, is needed to explain perceived pain in fibromyalgia patients.
Education and Optimal Dynamic Taxation
We study optimal tax and educational policies in a dynamic private information economy, in which ex-ante heterogeneous individuals make an educational investment early in their life and face a stochastic wage distribution. We characterize labor and education wedges in this setting analytically and numerically, using a calibrated example. We present ways to implement the optimum. In one implementation there is a common labor income tax schedule, and a repayment schedule for government loans given out to agents during education. These repayment plans are contingent on loan size and income and capture the history dependence of the labor wedges. Applying the model to US data and a binary education decision (graduating from college or not), we characterize optimal labor wedges for individuals with and without a college degree. The labor wedge of college graduates as a function of income initially lies strictly above that of their high-school counterparts, but this reverses at higher incomes. The loan repayment schedule is hump-shaped in income for college graduates.
Characterization of complex networks: A survey of measurements
Each complex network (or class of networks) presents specific topological
features which characterize its connectivity and highly influence the dynamics
of processes executed on the network. The analysis, discrimination, and
synthesis of complex networks therefore rely on the use of measurements capable
of expressing the most relevant topological features. This article presents a
survey of such measurements. It includes general considerations about complex
network characterization, a brief review of the principal models, and the
presentation of the main existing measurements. Important related issues
covered in this work comprise the representation of the evolution of complex
networks in terms of trajectories in several measurement spaces, the analysis
of the correlations between some of the most traditional measurements,
perturbation analysis, as well as the use of multivariate statistics for
feature selection and network classification. Depending on the network and the
analysis task one has in mind, a specific set of features may be chosen. It is
hoped that the present survey will help the proper application and
interpretation of measurements.
Comment: A working manuscript with 78 pages, 32 figures. Suggestions of
measurements for inclusion are welcomed by the author.