UPMASK: unsupervised photometric membership assignment in stellar clusters
We develop a method for membership assignment in stellar clusters using only
photometry and positions. The method, UPMASK, is designed to be unsupervised,
data-driven, and model-free, and to rely on as few assumptions as possible. It
is based on an iterative process, principal component analysis, a clustering
algorithm, and kernel density estimation. Moreover, it can take into account
arbitrary error models. An implementation in R was tested on simulated clusters
that covered a broad range of ages, masses, distances, reddenings, and also on
real data of cluster fields. Running UPMASK on simulations showed that it
effectively separates cluster and field populations. The overall spatial
structure and distribution of cluster member stars in the colour-magnitude
diagram were recovered under a broad variety of conditions. For a set of 360
simulations, the resulting true positive rates (a measurement of purity) and
member recovery rates (a measurement of completeness) at the 90% membership
probability level reached high values for a range of open cluster ages
( yr), initial masses (M_{\sun}) and
heliocentric distances ( kpc). UPMASK was also tested on real data
from the fields of the open cluster Haffner~16 and of the closely projected
clusters Haffner~10 and Czernik~29. These tests showed that even for moderate
variable extinction and cluster superposition, the method yielded useful
cluster membership probabilities and provided some insight into their stellar
contents. The UPMASK implementation will be available at the CRAN archive.
Comment: 12 pages, 13 figures, accepted for publication in Astronomy and
Astrophysics
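The purity and completeness figures quoted at the 90% membership probability level can be illustrated with a minimal sketch. It assumes purity is the fraction of selected stars that are true members and completeness is the fraction of true members that are selected; the paper's exact definitions may differ. The function name and the toy data are hypothetical.

```python
def purity_and_completeness(probabilities, is_member, threshold=0.9):
    """Return (purity, completeness) for stars selected at `threshold`.

    probabilities: membership probability assigned to each star
    is_member:     ground truth, known only for simulated clusters
    """
    selected = [p >= threshold for p in probabilities]
    true_pos = sum(1 for s, m in zip(selected, is_member) if s and m)
    n_selected = sum(selected)
    n_members = sum(is_member)
    # Assumed definitions (not taken from the paper):
    purity = true_pos / n_selected if n_selected else 0.0
    completeness = true_pos / n_members if n_members else 0.0
    return purity, completeness


# Hypothetical toy data: six stars, four of them true cluster members.
probs = [0.95, 0.92, 0.40, 0.97, 0.10, 0.91]
truth = [True, True, True, True, False, False]
purity, completeness = purity_and_completeness(probs, truth)
print(purity, completeness)
```

In this toy case four stars pass the threshold, three of which are true members, and three of the four true members are recovered.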
Towards a Holistic Integration of Spreadsheets with Databases: A Scalable Storage Engine for Presentational Data Management
Spreadsheet software is the tool of choice for interactive ad-hoc data
management, with adoption by billions of users. However, spreadsheets are not
scalable, unlike database systems. On the other hand, database systems, while
highly scalable, do not support interactivity as a first-class primitive. We
are developing DataSpread, to holistically integrate spreadsheets as a
front-end interface with databases as a back-end datastore, providing
scalability to spreadsheets, and interactivity to databases, an integration we
term presentational data management (PDM). In this paper, we make a first step
towards this vision: developing a storage engine for PDM, studying how to
flexibly represent spreadsheet data within a database and how to support and
maintain access by position. We first conduct an extensive survey of
spreadsheet use to motivate our functional requirements for a storage engine
for PDM. We develop a natural set of mechanisms for flexibly representing
spreadsheet data and demonstrate that identifying the optimal representation is
NP-Hard; however, we develop an efficient approach to identify the optimal
representation from an important and intuitive subclass of representations. We
extend our mechanisms with positional access mechanisms that don't suffer from
cascading update issues, leading to constant time access and modification
performance. We evaluate these representations on a workload of typical
spreadsheets and spreadsheet operations, providing up to 20% reduction in
storage, and up to 50% reduction in formula evaluation time.
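The cascading-update problem mentioned above can be sketched as follows. This is not the DataSpread mechanism: it only illustrates the general idea of storing rows under stable identifiers and keeping position in a separate index, so that inserting a row never rewrites the keys of stored rows. The class and method names are hypothetical, and the Python list used as the index is a stand-in that does not reproduce the constant-time behaviour reported in the abstract.

```python
import itertools


class PositionalSheet:
    """Rows live under stable ids; a separate index maps position -> id."""

    def __init__(self):
        self._rows = {}    # stable id -> row data (stand-in for a table)
        self._order = []   # position -> stable id (stand-in for an index)
        self._ids = itertools.count(1)

    def insert_row(self, position, data):
        """Insert a row at `position`; stored rows keep their ids."""
        row_id = next(self._ids)
        self._rows[row_id] = data
        self._order.insert(position, row_id)  # only the index changes
        return row_id

    def get_row(self, position):
        return self._rows[self._order[position]]

    def id_at(self, position):
        return self._order[position]


sheet = PositionalSheet()
id_alpha = sheet.insert_row(0, ["alpha"])
id_gamma = sheet.insert_row(1, ["gamma"])
id_beta = sheet.insert_row(1, ["beta"])   # pushes "gamma" down one position
print(sheet.get_row(2), sheet.id_at(2) == id_gamma)
```

If the row number itself were the storage key, the insertion of "beta" would force every later row to be renumbered; here "gamma" moves to position 2 while its stored id is unchanged.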
The Use of Selected Methods of Linear Ordering to Assess the Innovation Performance of the European Union Member States
The growing interest in measuring economic and social phenomena that are difficult to observe directly increases the need for researchers to broaden the use of multivariate statistical analysis methods. The ease of interpreting results presented in the form of rankings makes it common practice to use different methods of linear ordering of objects. If the appropriate assumptions are met, the determined set of variables allows for the construction of a synthetic measure whose ordered values provide a ranking. Such a statistical approach is quite often used in assessing the level of innovativeness of economies, and the literature abounds in various innovation indices. The starting point of this paper is a set of 27 variables on the basis of which the Summary Innovation Index is developed. After verifying the statistical assumptions and reducing the database to 21 diagnostic factors, the authors construct a total of nine innovation rankings, using different methods of linear ordering and selected procedures for normalisation of variables. The aim of the paper is therefore to assess the impact of selected methods of linear ordering (Hellwig’s method, TOPSIS method, GDM method) and various procedures for normalising variables (classic standardisation, positional standardisation, quotient transformation) on the final ranking of the EU Member States by the level of their innovation performance. The obtained results confirm that the applied method of linear ordering and the selection of the normalisation procedure have an impact on the final ranking of the examined objects, in this case the final ranking of the EU Member States by the level of innovativeness analysed in the presented research.
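One of the nine method and normalisation combinations, TOPSIS with classic standardisation, can be sketched as below. This is a textbook-style version under stated assumptions, not the authors' implementation: every variable is treated as a stimulant (higher is better), variables are equally weighted, no column is constant, and the object names and values are invented for illustration.

```python
import math
from statistics import mean, pstdev


def standardise(column):
    """Classic standardisation: subtract the mean, divide by the std. dev."""
    mu, sigma = mean(column), pstdev(column)  # assumes sigma > 0
    return [(x - mu) / sigma for x in column]


def topsis_scores(data):
    """data: one row per object, one column per variable (all stimulants)."""
    columns = [standardise(list(col)) for col in zip(*data)]
    rows = list(zip(*columns))
    ideal = [max(col) for col in columns]       # best value per variable
    anti_ideal = [min(col) for col in columns]  # worst value per variable
    scores = []
    for row in rows:
        d_plus = math.dist(row, ideal)        # distance to the ideal object
        d_minus = math.dist(row, anti_ideal)  # distance to the anti-ideal
        scores.append(d_minus / (d_plus + d_minus))
    return scores


def ranking(names, scores):
    """Order object names from the highest synthetic measure to the lowest."""
    ordered = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ordered]


# Hypothetical objects and values, not data from the paper.
names = ["A", "B", "C"]
data = [[3.0, 10.0], [1.0, 2.0], [2.0, 6.0]]
scores = topsis_scores(data)
order = ranking(names, scores)
print(order)
```

Swapping `standardise` for another normalisation procedure, or replacing the distance-based score with a different synthetic measure, is how a different ranking would arise from the same variables.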
Thermal error modelling of machine tools based on ANFIS with fuzzy c-means clustering using a thermal imaging camera
Thermal errors are often quoted as being the largest contributor to CNC machine tool errors, but they can be effectively reduced using error compensation. The performance of a thermal error compensation system depends on the accuracy and robustness of the thermal error model and the quality of the inputs to the model. The location of temperature measurement must provide a representative measurement of the change in temperature that will affect the machine structure. The number of sensors and their locations are not always intuitive and the time required to identify the optimal locations is often prohibitive, resulting in compromise and poor results.
In this paper, a new intelligent compensation system for reducing thermal errors of machine tools using data obtained from a thermal imaging camera is introduced. Different groups of key temperature points were identified from thermal images using a novel schema based on a Grey model GM(0, N) and the Fuzzy c-means (FCM) clustering method. An Adaptive Neuro-Fuzzy Inference System with Fuzzy c-means clustering (FCM-ANFIS) was employed to design the thermal prediction model. In order to optimise the approach, a parametric study was carried out by changing the number of inputs and number of membership functions to the FCM-ANFIS model, and comparing the relative robustness of the designs. According to the results, the FCM-ANFIS model with four inputs and six membership functions achieves the best performance in terms of the accuracy of its predictive ability. The residual value of the model is smaller than ±2 μm, which represents a 95% reduction in the thermally-induced error on the machine. Finally, the proposed method is shown to compare favourably against an Artificial Neural Network (ANN) model.
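The fuzzy c-means step used to group temperature points can be sketched with the standard FCM update rules. This is a minimal one-dimensional illustration, not the authors' GM(0, N) selection scheme or the FCM-ANFIS model; the fuzzifier m = 2, the temperature values, and the initial centres are assumptions chosen for the example.

```python
def fcm_memberships(points, centres, m=2.0):
    """Membership of each 1-D point in each cluster for fixed centres."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [abs(x - c) for c in centres]
        hits = dists.count(0.0)
        if hits:
            # Point coincides with a centre: full membership there.
            row = [1.0 / hits if d == 0.0 else 0.0 for d in dists]
        else:
            row = [1.0 / sum((d_i / d_k) ** power for d_k in dists)
                   for d_i in dists]
        memberships.append(row)
    return memberships


def fcm_centres(points, memberships, m=2.0):
    """Centres as membership-weighted means of the points."""
    centres = []
    for j in range(len(memberships[0])):
        weights = [row[j] ** m for row in memberships]
        centres.append(sum(w * x for w, x in zip(weights, points)) / sum(weights))
    return centres


# Hypothetical temperature readings (degrees C) forming two groups.
points = [20.0, 21.0, 22.0, 40.0, 41.0, 42.0]
centres = [20.0, 42.0]  # assumed initial guess
for _ in range(20):     # alternate the two updates
    memberships = fcm_memberships(points, centres)
    centres = fcm_centres(points, memberships)
print(centres)
```

Unlike hard clustering, every point keeps a graded membership in every cluster, and each row of `memberships` sums to one.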
From structural to functional glycomics: core substitutions as molecular switches for shape and lectin affinity of N-glycans
Glycan epitopes of cellular glycoconjugates act as versatile biochemical signals (sugar coding). Here, we test the hypothesis that the common N-glycan modifications by core fucosylation and introduction of the bisecting N-acetylglucosamine moiety have long-range effects with functional consequences. Molecular dynamics simulations indicate a shift in conformational equilibria between linear extension or backfolding of the glycan antennae upon substitution. We also present a new fingerprint-like mode of presentation for this multi-parameter system. In order to delineate definite structure-function relationships, we strategically combined chemoenzymatic synthesis with bioassaying cell binding and the distribution of radioiodinated neoglycoproteins in vivo. Of clinical relevance, tailoring the core region affects serum clearance markedly, e.g., prolonging circulation time for the neoglycoprotein presenting the N-glycan with both substitutions. α2,3-Sialylation is another means toward this end, similarly seen for type II branching in triantennary N-glycans. This discovery signifies that rational glycoengineering along the given lines is an attractive perspective to optimize pharmacokinetic behavior of glycosylated pharmaproteins. Of general importance for the concept of the sugar code, the presented results teach the fundamental lesson that N-glycan core substitutions convey distinct characteristics to the concerned oligosaccharide relevant for cis and trans biorecognition processes. These modifications are thus molecular switches.
Consensus clustering and functional interpretation of gene-expression data
Microarray analysis using clustering algorithms can suffer from lack of inter-method consistency in assigning related gene-expression profiles to clusters. Obtaining a consensus set of clusters from a number of clustering methods should improve confidence in gene-expression analysis. Here we introduce consensus clustering, which provides such an advantage. When coupled with a statistically based gene functional analysis, our method allowed the identification of novel genes regulated by NFκB and the unfolded protein response in certain B-cell lymphomas.
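The idea of deriving a consensus from several clustering methods can be sketched with a co-association count: for each pair of genes, the fraction of methods that place them in the same cluster. This is one common way to build a consensus and is not necessarily the algorithm introduced in the paper; the method names, labels, and function names are hypothetical.

```python
from itertools import combinations


def co_association(labelings):
    """Fraction of clusterings in which each pair of items shares a cluster."""
    n_items = len(labelings[0])
    agreement = {}
    for i, j in combinations(range(n_items), 2):
        together = sum(1 for labels in labelings if labels[i] == labels[j])
        agreement[(i, j)] = together / len(labelings)
    return agreement


def consensus_pairs(labelings, min_agreement=1.0):
    """Pairs grouped together by at least `min_agreement` of the methods."""
    return [pair for pair, fraction in co_association(labelings).items()
            if fraction >= min_agreement]


# Hypothetical cluster labels for four genes from three methods.
# Label values are arbitrary per method; only co-membership matters.
method_a = [0, 0, 1, 1]
method_b = [1, 1, 0, 0]
method_c = [0, 0, 0, 1]
labelings = [method_a, method_b, method_c]

agreement = co_association(labelings)
unanimous = consensus_pairs(labelings)
print(unanimous)
```

Only the first two genes are grouped together by every method in this toy input, so they are the only pair retained at full agreement; lowering `min_agreement` trades confidence for coverage.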
Direct Phenotypic Screening in Mice: Identification of Individual, Novel Antinociceptive Compounds from a Library of 734 821 Pyrrolidine Bis-piperazines
The hypothesis in the current study is that the simultaneous direct in vivo testing of thousands to millions of systematically arranged mixture-based libraries will facilitate the identification of enhanced individual compounds. Individual compounds identified from such libraries may have increased specificity and decreased side effects early in the discovery phase. Testing began by screening ten diverse scaffolds as single mixtures (ranging from 17 340 to 4 879 681 compounds) for analgesia directly in the mouse tail withdrawal model. The “all X” mixture representing the library TPI-1954 was found to produce significant antinociception and lacked respiratory depression and hyperlocomotor effects using the Comprehensive Laboratory Animal Monitoring System (CLAMS). The TPI-1954 library is a pyrrolidine bis-piperazine and totals 738 192 compounds. This library has 26 functionalities at the first three positions of diversity made up of 28 392 compounds each (26 × 26 × 42) and 42 functionalities at the fourth made up of 19 915 compounds each (26 × 26 × 26). The 120 resulting mixtures representing each of the variable four positions were screened directly in vivo in the mouse 55 °C warm-water tail-withdrawal assay (ip administration). The 120 samples were then ranked in terms of their antinociceptive activity. The synthesis of 54 individual compounds was then carried out. Nine of the individual compounds produced dose-dependent antinociception equivalent to morphine. In practical terms what this means is that one would not expect multiexponential increases in activity as we move from the all-X mixture, to the positional scanning libraries, to the individual compounds. Actually because of the systematic formatting one would typically anticipate steady increases in activity as the complexity of the mixtures is reduced. This is in fact what we see in the current study. 
One of the final individual compounds identified, TPI 2213-17, lacked significant respiratory depression, locomotor impairment, or sedation. Our results represent an example of this unique approach for screening large mixture-based libraries directly in vivo to rapidly identify individual compounds.