Consistent Estimation of Functions of Data Missing Non-Monotonically and Not at Random
Missing records are a perennial problem in the analysis of complex data of all types when the target of inference is some function of the full data law. In simple cases, where data is missing at random or completely at rando
Twisted trees and inconsistency of tree estimation when gaps are treated as missing data -- the impact of model mis-specification in distance corrections
Statistically consistent estimation of phylogenetic trees or gene trees is
possible if pairwise sequence dissimilarities can be converted to a set of
distances that are proportional to the true evolutionary distances. Susko et
al. (2004) reported some strikingly broad results about the forms of
inconsistency in tree estimation that can arise if corrected distances are not
proportional to the true distances. They showed that if the corrected distance
is a concave function of the true distance, then inconsistency due to long
branch attraction will occur. If these functions are convex, then two "long
branch repulsion" trees will be preferred over the true tree -- though these
two incorrect trees are expected to be tied as the preferred tree. Here we
extend their results, and demonstrate the existence of a tree shape (which we
refer to as a "twisted Farris-zone" tree) for which a single incorrect tree
topology will be guaranteed to be preferred if the corrected distance function
is convex. We also report that the standard practice of treating gaps in
sequence alignments as missing data is sufficient to produce non-linear
corrected distance functions if the substitution process is not independent of
the insertion/deletion process. Taken together, these results imply
inconsistent tree inference under mild conditions. For example, if some
positions in a sequence are constrained to be free of substitutions and
insertion/deletion events while the remaining sites evolve with independent
substitutions and insertion/deletion events, then the distances obtained by
treating gaps as missing data can support an incorrect tree topology even given
an unlimited amount of data. Comment: 29 pages, 3 figures
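The concave and convex cases described in this abstract can be illustrated numerically on a four-taxon tree. The sketch below is not the authors' method: it assumes a simple four-point selection rule (prefer the pairing with the smallest sum of within-pair distances), an illustrative concave distortion 1 - exp(-d), an illustrative convex distortion d^2, and hypothetical branch lengths. All function names are invented for this example, and the "twisted Farris-zone" tree from the abstract is not modelled.

```python
import math

def quartet_distances(branches, internal):
    """Additive path distances on the true tree ((A,B),(C,D))."""
    a, b, c, d = (branches[k] for k in "ABCD")
    return {
        ("A", "B"): a + b,
        ("C", "D"): c + d,
        ("A", "C"): a + internal + c,
        ("A", "D"): a + internal + d,
        ("B", "C"): b + internal + c,
        ("B", "D"): b + internal + d,
    }

def transform(dist, f):
    """Apply a (possibly non-linear) correction f to every pairwise distance."""
    return {pair: f(x) for pair, x in dist.items()}

def preferred_topologies(dist, tol=1e-12):
    """Four-point rule: return the pairing(s) with the smallest distance sum."""
    sums = {
        "AB|CD": dist[("A", "B")] + dist[("C", "D")],
        "AC|BD": dist[("A", "C")] + dist[("B", "D")],
        "AD|BC": dist[("A", "D")] + dist[("B", "C")],
    }
    best = min(sums.values())
    return sorted(t for t, s in sums.items() if s - best <= tol)

# Illustrative distortions (assumptions, not taken from the paper).
identity = lambda d: d
concave = lambda d: 1.0 - math.exp(-d)
convex = lambda d: d * d

# Hypothetical branch lengths; the true tree is always AB|CD.
# Long branches on non-sister taxa A and C.
felsenstein = quartet_distances({"A": 3.0, "B": 0.1, "C": 3.0, "D": 0.1}, 0.1)
# Long branches on sister taxa A and B.
farris = quartet_distances({"A": 3.0, "B": 3.0, "C": 0.1, "D": 0.1}, 0.1)

for name, dist in (("felsenstein", felsenstein), ("farris", farris)):
    for label, f in (("identity", identity), ("concave", concave), ("convex", convex)):
        print(name, label, preferred_topologies(transform(dist, f)))
```

Under these assumed values, the undistorted distances recover AB|CD in both settings. The concave distortion selects AC|BD when the long branches are on non-sister taxa (long branch attraction), and the convex distortion leaves AC|BD and AD|BC tied ahead of the true tree when the long branches are sisters (long branch repulsion).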
Maintaining the Regular Ultra Passum Law in data envelopment analysis
The variable returns to scale data envelopment analysis (DEA) model is developed with a maintained hypothesis of convexity in input-output space. This hypothesis is not consistent with standard microeconomic production theory that posits an S-shape for the production frontier, i.e. for production technologies that obey the Regular Ultra Passum Law. Consequently, measures of technical efficiency assuming convexity are biased downward. In this paper, we provide a more general DEA model that allows the S-shape. Keywords: data envelopment analysis; homothetic production; S-shaped production function; non-convex production set