83,062 research outputs found
Scanning electron microscopy image representativeness: morphological data on nanoparticles.
A sample of a nanomaterial contains a distribution of nanoparticles of various shapes and/or sizes. A scanning electron microscopy image of such a sample often captures only a fragment of the morphological variety present in the sample. In order to quantitatively analyse the sample using scanning electron microscope digital images, and, in particular, to derive numerical representations of the sample morphology, image content has to be assessed. In this work, we present a framework for extracting morphological information contained in scanning electron microscopy images using computer vision algorithms, and for converting them into numerical particle descriptors. We explore the concept of image representativeness and provide a set of protocols for selecting optimal scanning electron microscopy images as well as determining the smallest representative image set for each of the morphological features. We demonstrate the practical aspects of our methodology by investigating tricalcium phosphate, Ca3 (PO4 )2 , and calcium hydroxyphosphate, Ca5 (PO4 )3 (OH), both naturally occurring minerals with a wide range of biomedical applications
Advances in Feature Selection with Mutual Information
The selection of features that are relevant for a prediction or
classification problem is an important problem in many domains involving
high-dimensional data. Selecting features helps fighting the curse of
dimensionality, improving the performances of prediction or classification
methods, and interpreting the application. In a nonlinear context, the mutual
information is widely used as relevance criterion for features and sets of
features. Nevertheless, it suffers from at least three major limitations:
mutual information estimators depend on smoothing parameters, there is no
theoretically justified stopping criterion in the feature selection greedy
procedure, and the estimation itself suffers from the curse of dimensionality.
This chapter shows how to deal with these problems. The two first ones are
addressed by using resampling techniques that provide a statistical basis to
select the estimator parameters and to stop the search procedure. The third one
is addressed by modifying the mutual information criterion into a measure of
how features are complementary (and not only informative) for the problem at
hand
Fast Hierarchical Clustering and Other Applications of Dynamic Closest Pairs
We develop data structures for dynamic closest pair problems with arbitrary
distance functions, that do not necessarily come from any geometric structure
on the objects. Based on a technique previously used by the author for
Euclidean closest pairs, we show how to insert and delete objects from an
n-object set, maintaining the closest pair, in O(n log^2 n) time per update and
O(n) space. With quadratic space, we can instead use a quadtree-like structure
to achieve an optimal time bound, O(n) per update. We apply these data
structures to hierarchical clustering, greedy matching, and TSP heuristics, and
discuss other potential applications in machine learning, Groebner bases, and
local improvement algorithms for partition and placement problems. Experiments
show our new methods to be faster in practice than previously used heuristics.Comment: 20 pages, 9 figures. A preliminary version of this paper appeared at
the 9th ACM-SIAM Symp. on Discrete Algorithms, San Francisco, 1998, pp.
619-628. For source code and experimental results, see
http://www.ics.uci.edu/~eppstein/projects/pairs
Assessing forensic evidence by computing belief functions
We first discuss certain problems with the classical probabilistic approach
for assessing forensic evidence, in particular its inability to distinguish
between lack of belief and disbelief, and its inability to model complete
ignorance within a given population. We then discuss Shafer belief functions, a
generalization of probability distributions, which can deal with both these
objections. We use a calculus of belief functions which does not use the much
criticized Dempster rule of combination, but only the very natural
Dempster-Shafer conditioning. We then apply this calculus to some classical
forensic problems like the various island problems and the problem of parental
identification. If we impose no prior knowledge apart from assuming that the
culprit or parent belongs to a given population (something which is possible in
our setting), then our answers differ from the classical ones when uniform or
other priors are imposed. We can actually retrieve the classical answers by
imposing the relevant priors, so our setup can and should be interpreted as a
generalization of the classical methodology, allowing more flexibility. We show
how our calculus can be used to develop an analogue of Bayes' rule, with belief
functions instead of classical probabilities. We also discuss consequences of
our theory for legal practice.Comment: arXiv admin note: text overlap with arXiv:1512.01249. Accepted for
publication in Law, Probability and Ris
Identification of diverse database subsets using property-based and fragment-based molecular descriptions
This paper reports a comparison of calculated molecular properties and of 2D fragment bit-strings when used for the selection of structurally diverse subsets of a file of 44295 compounds. MaxMin dissimilarity-based selection and k-means cluster-based selection are used to select subsets containing between 1% and 20% of the file. Investigation of the numbers of bioactive molecules in the selected subsets suggest: that the MaxMin subsets are noticeably superior to the k-means subsets; that the property-based descriptors are marginally superior to the fragment-based descriptors; and that both approaches are noticeably superior to random selection
- …