Ball: An R package for detecting distribution difference and association in metric spaces
The rapid development of modern technology facilitates the appearance of
numerous unprecedented complex data which do not satisfy the axioms of
Euclidean geometry, while most of the statistical hypothesis tests are
available in Euclidean or Hilbert spaces. To properly analyze the data of more
complicated structures, efforts have been made to solve the fundamental test
problems in more general spaces. In this paper, a publicly available R package
Ball is provided to implement Ball statistical test procedures for K-sample
distribution comparison and test of mutual independence in metric spaces, which
extend the test procedures for two sample distribution comparison and test of
independence. Tailor-made algorithms as well as engineering techniques are
employed in the Ball package to speed up computation to the best of our
ability. Two real data analyses and several numerical studies have been
performed, and the results confirm the power of the Ball package in analyzing
complex data, e.g., spherical data and symmetric positive definite matrix data.
Competitiveness of dairy farms in three countries: the role of CAP subsidies
This paper investigates the impact of CAP subsidies on the competitiveness of dairy farms in Germany, the Netherlands, and Sweden. Technical efficiency results show that coupled subsidies have negative impacts in Germany and the Netherlands, but no significant impacts in Sweden. Decoupled subsidies negatively affect technical efficiency in each country and to a larger extent than coupled subsidies. Relative productivity results indicate that Dutch technology leads to the highest output, followed by technologies in Germany and Sweden. Dutch farms can improve their competitiveness by exploring their current production potential. Besides improving efficiency, German and Swedish farms may have options to improve their production technology.
Keywords: technical efficiency, output distance function, dairy farm, subsidy, relative productivity, Agricultural and Food Policy, Livestock Production/Industries
A Splicing Approach to Best Subset of Groups Selection
Best subset of groups selection (BSGS) is the process of selecting a small
part of non-overlapping groups to achieve the best interpretability on the
response variable. It has attracted increasing attention and has far-reaching
applications in practice. However, due to the computational intractability of
BSGS in high-dimensional settings, developing efficient algorithms for solving
BSGS remains a research hotspot. In this paper, we propose a group-splicing
algorithm that iteratively detects the relevant groups and excludes the
irrelevant ones. Moreover, coupled with a novel group information criterion, we
develop an adaptive algorithm to determine the optimal model size. Under mild
conditions, it can be shown that our algorithm can identify the optimal
subset of groups in polynomial time with high probability. Finally, we
demonstrate the efficiency and accuracy of our methods by comparing them with
several state-of-the-art algorithms on both synthetic and real-world datasets.
Comment: 49 pages, 7 figures
Generalized synchronization-based partial topology identification of complex networks
In this paper, partial topology identification of complex networks is investigated based on the synchronization method. We construct response networks consisting of nodes with simpler dynamics than those in the drive networks. By constructing a Lyapunov function, sufficient conditions are derived to guarantee partial topology identification by designing suitable controllers and parameter update laws. Several numerical examples are provided to illustrate the effectiveness of the theoretical results.
Nonparametric statistical inference via metric distribution function in metric spaces
The distribution function is essential in statistical inference and connected with samples to form a directed closed loop by the correspondence theorem in measure theory and the Glivenko-Cantelli and Donsker properties. This connection creates a paradigm for statistical inference. However, existing distribution functions are defined in Euclidean spaces and are no longer convenient to use in rapidly evolving data objects of complex nature. It is imperative to develop the concept of the distribution function in a more general space to meet emerging needs. Note that the linearity allows us to use hypercubes to define the distribution function in a Euclidean space. Still, without the linearity in a metric space, we must work with the metric to investigate the probability measure. We introduce a class of metric distribution functions through the metric only. We overcome this challenging step by proving the correspondence theorem and the Glivenko-Cantelli theorem for metric distribution functions in metric spaces, laying the foundation for conducting rational statistical inference for metric space-valued data. Then, we develop a homogeneity test and a mutual independence test for non-Euclidean random objects and present comprehensive empirical evidence to support the performance of our proposed methods. Supplementary materials for this article are available online.
A SIMPLE Approach to Provably Reconstruct Ising Model with Global Optimality
Reconstructing the interaction network between random events is a critical
problem arising in fields from statistical physics and politics to sociology,
biology, psychology, and beyond. The Ising model lays the foundation for this
reconstruction process, but finding the underlying Ising model from the least
amount of observed samples in a computationally efficient manner has been
historically challenging for half a century. By using the idea of sparsity
learning, we present an approach named SIMPLE that has a dominant sample
complexity from theoretical limit. Furthermore, a tuning-free algorithm is
developed to give a statistically consistent solution of SIMPLE in polynomial
time with high probability. On extensive benchmarked cases, the SIMPLE approach
provably reconstructs underlying Ising models with global optimality. The
application to U.S. senators' voting in the last six congresses reveals that
both Republicans and Democrats noticeably assemble in each congress;
interestingly, the assembling of Democrats is particularly pronounced in the
latest congress.
The Impact of Agri-Environmental Policies and Production Intensification on the Environmental Performance of Dutch Dairy Farms
This study examines the impact of policies and intensification on the environmental performance of Dutch dairy farms in the period 2001-2010 using a hyperbolic distance function. The results indicate that the change from the Mineral Accounting System to the combination of the Application Standards Policy with decoupled payments has not significantly changed farms’ hyperbolic efficiency. Farms receiving agri-environmental and animal welfare payments are less hyperbolically efficient than those that do not, highlighting greater decreases in desirable outputs than decreases in undesirable outputs. Finally, intensification increases hyperbolic efficiency, suggesting that intensive practices may increase production without harming the environment.
Ball: An R Package for Detecting Distribution Difference and Association in Metric Spaces
The rapid development of modern technology has created many complex datasets in non-linear spaces, while most of the statistical hypothesis tests are only available in Euclidean or Hilbert spaces. To properly analyze the data with more complicated structures, efforts have been made to solve the fundamental test problems in more general spaces (Lyons 2013; Pan, Tian, Wang, and Zhang 2018; Pan, Wang, Zhang, Zhu, and Zhu 2020). In this paper, we introduce a publicly available R package Ball for the comparison of multiple distributions and the test of mutual independence in metric spaces, which extends the test procedures for the equality of two distributions (Pan et al. 2018) and the independence of two random objects (Pan et al. 2020). The Ball package is computationally efficient since several novel algorithms as well as engineering techniques are employed in speeding up the ball test procedures. Two real data analyses and diverse numerical studies have been performed, and the results certify that the Ball package can detect various distribution differences and complicated dependencies in complex datasets, e.g., directional data and symmetric positive definite matrix data.