On the Doubt about Margin Explanation of Boosting
Margin theory provides one of the most popular explanations for the success of
\texttt{AdaBoost}, where the central point lies in the recognition that
\textit{margin} is the key for characterizing the performance of
\texttt{AdaBoost}. This theory has been very influential, e.g., it has been
used to argue that \texttt{AdaBoost} usually does not overfit since it tends to
enlarge the margin even after the training error reaches zero. Previously the
\textit{minimum margin bound} was established for \texttt{AdaBoost}, however,
\cite{Breiman1999} pointed out that maximizing the minimum margin does not
necessarily lead to better generalization. Later, \cite{Reyzin:Schapire2006}
emphasized that the margin distribution rather than minimum margin is crucial
to the performance of \texttt{AdaBoost}. In this paper, we first present the
\textit{$k$th margin bound} and further study its relationship to previous
work such as the minimum margin bound and Emargin bound. Then, we improve the
previous empirical Bernstein bounds
\citep{Maurer:Pontil2009,Audibert:Munos:Szepesvari2009}, and based on such
findings, we defend the margin-based explanation against Breiman's doubts by
proving a new generalization error bound that considers exactly the same
factors as \cite{Schapire:Freund:Bartlett:Lee1998} but is sharper than
\cite{Breiman1999}'s minimum margin bound. By incorporating factors such as
average margin and variance, we present a generalization error bound that is
heavily related to the whole margin distribution. We also provide margin
distribution bounds for generalization error of voting classifiers in finite
VC-dimension space.
Comment: 35 pages
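The margin quantities discussed above are easy to state concretely. A minimal numerical sketch (illustrative only: the ensemble weights and weak-learner outputs below are made up, and this computes the margin distribution of a voting classifier, not the paper's bounds):

```python
import numpy as np

def margin_distribution(H, alpha, y):
    """H: (T, n) array of weak-learner outputs in {-1, +1};
    alpha: (T,) nonnegative weights; y: (n,) labels in {-1, +1}.
    Returns the n normalized margins, each in [-1, 1]."""
    f = alpha @ H                       # weighted vote per example
    return y * f / np.abs(alpha).sum()  # normalize to [-1, 1]

# Hypothetical 3-learner ensemble on 4 examples (made-up numbers).
H = np.array([[+1, +1, -1, -1],
              [+1, -1, -1, +1],
              [+1, +1, -1, -1]])
alpha = np.array([0.5, 0.2, 0.3])
y = np.array([+1, +1, -1, -1])

margins = margin_distribution(H, alpha, y)
# Minimum margin vs. statistics of the whole distribution:
print(margins.min(), margins.mean(), margins.var())
```

The point of the margin-distribution line of work is visible even here: the minimum margin is a single number, while the mean and variance summarize the whole distribution that the bounds above depend on.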
The minimal polynomial of sequence obtained from componentwise linear transformation of linear recurring sequence
Let S be a linear recurring sequence with terms in F_{q^m}
and T be a linear transformation of F_{q^m} over F_q. Denote
T(S) = (T(s_0), T(s_1), ...). In this paper, we first present
counterexamples to show that the main result in [A.M. Youssef and G. Gong, On
linear complexity of sequences over GF(2^n), Theoretical Computer Science,
352 (2006), 288-292] is not correct in general, since Lemma 3 in that paper is
incorrect. Then, we determine the minimal polynomial of T(S) if the canonical
factorization of the minimal polynomial of S without multiple roots is known,
and thus present the solution to the problem which was mainly considered in the
above paper but incorrectly solved. Additionally, as a special case, we
determine the minimal polynomial of T(S) if the minimal polynomial of S is
primitive. Finally, we give an upper bound on the linear complexity of T(S)
when T exhausts all possible linear transformations of F_{q^m} over F_q. This
bound is tight in some cases.
Comment: This paper was submitted to the journal Theoretical Computer Science
The Minimal Polynomial over F_q of Linear Recurring Sequence over F_{q^m}
Recently, motivated by the study of vectorized stream cipher systems, the
joint linear complexity and joint minimal polynomial of multisequences have
been investigated. Let S be a linear recurring sequence over finite field
F_{q^m} with minimal polynomial h(x) over F_{q^m}. Since F_{q^m} and F_{q}^m
are isomorphic vector spaces over the finite field F_q, S is identified with an
m-fold multisequence S^{(m)} over the finite field F_q. The joint minimal
polynomial and joint linear complexity of the m-fold multisequence S^{(m)} are
the minimal polynomial and linear complexity over F_q of S respectively. In
this paper, we study the minimal polynomial and linear complexity over F_q of a
linear recurring sequence S over F_{q^m} with minimal polynomial h(x) over
F_{q^m}. If the canonical factorization of h(x) in F_{q^m}[x] is known, we
determine the minimal polynomial and linear complexity over F_q of the linear
recurring sequence S over F_{q^m}.
Comment: Submitted to the journal Finite Fields and Their Applications
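The linear complexity in the two abstracts above is, concretely, the length of the shortest LFSR generating the sequence, computable by the Berlekamp-Massey algorithm. A minimal sketch over F_2 (the papers work over general F_{q^m}; the test sequence below is a made-up example):

```python
def berlekamp_massey_gf2(s):
    """Linear complexity L of a binary sequence s (list of 0/1 ints) and the
    connection polynomial C(x) = 1 + c_1 x + ... + c_L x^L over F_2,
    returned as its coefficient list [1, c_1, ..., c_L]."""
    n = len(s)
    C = [1] + [0] * n   # current connection polynomial
    B = [1] + [0] * n   # connection polynomial before the last length change
    L, m = 0, 1         # L: current complexity; m: steps since last change
    for i in range(n):
        # discrepancy d = s_i + sum_{j=1}^{L} c_j s_{i-j}  (mod 2)
        d = s[i]
        for j in range(1, L + 1):
            d ^= C[j] & s[i - j]
        if d == 0:
            m += 1
        elif 2 * L <= i:
            T = C[:]
            for j in range(n + 1 - m):
                C[j + m] ^= B[j]    # C(x) += x^m B(x)
            L, B, m = i + 1 - L, T, 1
        else:
            for j in range(n + 1 - m):
                C[j + m] ^= B[j]
            m += 1
    return L, C[: L + 1]

# Sequence satisfying s_k = s_{k-2} XOR s_{k-3} (period 7, complexity 3).
L, C = berlekamp_massey_gf2([0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1])
print(L, C)
```

The recovered connection polynomial `[1, 0, 1, 1]` encodes C(x) = 1 + x^2 + x^3, matching the recurrence that generated the test sequence.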
Black hole central engine for ultra-long gamma-ray burst 111209A and its associated supernova 2011kl
Recently, the first association between an ultra-long gamma-ray burst (GRB)
and a supernova was reported, i.e., GRB 111209A/SN 2011kl, which enables us to
investigate the physics of central engines or even progenitors for ultra-long
GRBs. In this paper, we inspect the broad-band data of GRB 111209A/SN 2011kl.
The late-time X-ray lightcurve exhibits a GRB 121027A-like fall-back bump,
suggesting a black hole central engine. We thus propose a collapsar model with
fall-back accretion for GRB 111209A/SN 2011kl. The required model parameters,
such as the total mass and radius of the progenitor star, suggest that the
progenitor of GRB 111209A is more likely a Wolf-Rayet star than a blue
supergiant, and the central engine of this ultra-long burst is a black hole.
The implications of our results are discussed.
Comment: Accepted for publication in Ap
On the Resistance of Nearest Neighbor to Random Noisy Labels
Nearest neighbor has always been one of the most appealing non-parametric
approaches in machine learning, pattern recognition, computer vision, etc.
Previous empirical studies have partly shown that nearest neighbor is resistant
to noise, yet deep analysis has been lacking. This work presents the
finite-sample and distribution-dependent bounds on the consistency of nearest
neighbor in the random noise setting. The theoretical results show that, for
asymmetric noises, k-nearest neighbor is robust enough to classify most data
correctly, except for a handful of examples, whose labels are totally misled by
random noises. For symmetric noises, however, k-nearest neighbor achieves the
same consistency rate as in the noise-free setting, which verifies the
resistance of k-nearest neighbor to random noisy labels. Motivated by the
theoretical analysis, we propose the Robust k-Nearest Neighbor (RkNN) approach
to deal with noisy labels. The basic idea is to make unilateral corrections to
examples, whose labels are totally misled by random noises, and classify the
others directly by utilizing the robustness of k-nearest neighbor. We verify
the effectiveness of the proposed algorithm both theoretically and empirically.
Comment: 35 pages
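The robustness claim can be illustrated concretely. A toy sketch (not the paper's RkNN algorithm; the one-dimensional data are made up): with one label flipped by noise, the k = 3 majority vote still recovers the clean prediction.

```python
import numpy as np

def knn_predict(X, y, x, k=3):
    """1-D k-nearest-neighbor majority vote with labels in {0, 1}."""
    idx = np.argsort(np.abs(X - x))[:k]   # indices of the k closest points
    return int(np.round(y[idx].mean()))   # majority label

X = np.array([0., 1., 2., 3., 10., 11., 12., 13.])
y_clean = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_noisy = y_clean.copy()
y_noisy[1] = 1  # a single label flipped by random noise

# The vote of the 3 nearest neighbors outvotes the single flipped label.
print(knn_predict(X, y_noisy, 1.0), knn_predict(X, y_noisy, 11.0))
```

Even the query at x = 1.0, whose own training point carries the flipped label, is classified correctly, which is the intuition behind the resistance results above.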
Effects of the optimisation of the margin distribution on generalisation in deep architectures
Despite being vital to the success of Support Vector Machines, the principle
of separating-margin maximisation is not used in deep learning. We show that
minimisation of the margin variance, rather than maximisation of the margin, is
more suitable for improving generalisation in deep architectures. We propose
the Halfway loss function, which minimises the Normalised Margin Variance (NMV)
at the output of a deep learning model, and evaluate its performance against
the Softmax Cross-Entropy loss on the MNIST, smallNORB and CIFAR-10 datasets.
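The abstract does not spell out the Halfway loss itself, so the following is only a hedged sketch of a normalised-margin-variance penalty: the per-example margin is the true-class logit minus the largest other logit, and normalising by the mean margin is an assumption made here, not the paper's definition.

```python
import numpy as np

def normalized_margin_variance(logits, labels):
    """logits: (n, c) float array; labels: (n,) integer class indices.
    Sketch of an NMV-style penalty (normalization scheme is assumed)."""
    n = logits.shape[0]
    correct = logits[np.arange(n), labels]
    masked = logits.copy()
    masked[np.arange(n), labels] = -np.inf   # exclude the true class
    margins = correct - masked.max(axis=1)   # per-example margins
    normalized = margins / margins.mean()    # scale-invariant (assumption)
    return normalized.var()                  # penalize spread of margins

# Made-up batch: margins are 2, 2 and 1.
logits = np.array([[2., 0.], [3., 1.], [0., 1.]])
labels = np.array([0, 0, 1])
print(normalized_margin_variance(logits, labels))
```

A uniform margin distribution gives zero penalty, while unequal margins are penalized, which matches the stated goal of minimising margin variance rather than maximising the margin.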
Dynamic tomography of the spin-orbit coupling in nonlinear optics
Spin-orbit coupled (SOC) light fields with spatially inhomogeneous
polarization have attracted increasing research interest within the optical
community. In particular, owing to their spin-dependent phase and spatial
structures, many nonlinear optical phenomena which we have been familiar with
up to now are being re-examined, prompting a revival of research in nonlinear
optics. To fully investigate this topic, knowledge of how the topological
structure of the light field evolves is necessary, but, as yet, there are few
studies that address the structural evolution of the light field. Here, for the
first time, we present a universal approach for theoretical tomographic
treatment of the structural evolution of SOC light in nonlinear optics
processes. Based on a Gedanken vector second-harmonic generation, a
fine-grained slice of the evolving SOC light in a nonlinear interaction and the
subsequent diffraction propagation are studied theoretically and verified
experimentally, which at the same time reveals several interesting
phenomena. The approach provides a useful tool for enhancing our capability to
obtain a more nuanced understanding of vector nonlinear optics, and sets a
foundation for further broad-based studies in nonlinear systems.
Comment: 10 pages, 7 figures
On the Consistency of AUC Pairwise Optimization
AUC (area under ROC curve) is an important evaluation criterion, which has
been popularly used in many learning tasks such as class-imbalance learning,
cost-sensitive learning, learning to rank, etc. Many learning approaches try to
optimize AUC, but owing to the non-convexity and discontinuity of AUC,
almost all approaches work with surrogate loss functions. Thus, the consistency
of AUC optimization is crucial; however, it has been almost untouched before. In
this paper,
we provide a sufficient condition for the asymptotic consistency of learning
approaches based on surrogate loss functions. Based on this result, we prove
that exponential loss and logistic loss are consistent with AUC, but hinge loss
is inconsistent. Then, we derive the q-norm hinge loss and general hinge loss
that are consistent with AUC. We also derive the consistent bounds for
exponential loss and logistic loss, and obtain the consistent bounds for many
surrogate loss functions under the non-noise setting. Further, we disclose an
equivalence between the exponential surrogate loss of AUC and exponential
surrogate loss of accuracy, and one straightforward consequence of such finding
is that AdaBoost and RankBoost are equivalent.
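The pairwise view underlying these results can be made concrete. A minimal sketch (illustrative only, not the paper's proofs; the scores are made up): AUC counts correctly ordered positive-negative score pairs, and the exponential surrogate replaces the 0/1 pair indicator with exp(-(s_+ - s_-)), the RankBoost-style loss whose accuracy counterpart is AdaBoost's exponential loss.

```python
import numpy as np

def pairwise_auc(pos, neg):
    """AUC as the fraction of correctly ordered positive-negative pairs."""
    diffs = pos[:, None] - neg[None, :]          # all pos-neg score gaps
    return ((diffs > 0) + 0.5 * (diffs == 0)).mean()

def exp_surrogate(pos, neg):
    """Pairwise exponential surrogate of AUC (RankBoost-style loss)."""
    diffs = pos[:, None] - neg[None, :]
    return np.exp(-diffs).mean()

pos = np.array([2.0, 0.0])   # scores of positive examples
neg = np.array([1.0, -1.0])  # scores of negative examples
print(pairwise_auc(pos, neg), exp_surrogate(pos, neg))
```

The surrogate is convex and differentiable in the score gaps, which is why learning algorithms optimize it instead of the discontinuous AUC itself; the consistency results above ask when doing so also optimizes AUC.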
Beam energy dependence of the relativistic retardation effects of electrical fields on the $\pi^-/\pi^+$ ratio in heavy-ion collisions
In this article we investigate the beam energy dependence of the relativistic
retardation effects of electrical fields on the single and double
$\pi^-/\pi^+$ ratios in three heavy-ion reactions with an isospin- and
momentum-dependent transport model, IBUU11. With the beam energy increasing
from 200 to 400 MeV/nucleon, effects of the relativistically retarded
electrical fields on the $\pi^-/\pi^+$ ratio are found to grow gradually from
negligible to considerable, as expected; interestingly, however, the
relativistic retardation effects of electrical fields on the $\pi^-/\pi^+$
ratio become gradually insignificant as the beam energy further increases from
400 to 800 MeV/nucleon. Moreover, we also investigate the isospin dependence
of the relativistic retardation effects of electrical fields on the
$\pi^-/\pi^+$ ratio in the two isobar reaction systems Ru+Ru and Zr+Zr at beam
energies from 200 to 800 MeV/nucleon. It is shown that the relativistic
retardation effects of electrical fields on the $\pi^-/\pi^+$ ratio are
independent of the isospin of the reaction. Furthermore, we also examine the
double $\pi^-/\pi^+$ ratio in reactions of Zr+Zr over Ru+Ru at beam energies
from 200 to 800 MeV/nucleon with the static field and the retarded field,
respectively. It is shown that the double $\pi^-/\pi^+$ ratio from the two
reactions remains an effective observable of the symmetry energy, free from
the interference of the electrical field, when the relativistic rather than
the nonrelativistic calculation is used.
Comment: 8 pages, 7 figures. Abbreviated abstract to meet the criterion of
arXiv platform. Accepted for publication in Physical Review C. An extended
but necessary complementary study to arXiv:1709.0912
Effective-range-expansion study of near threshold heavy-flavor resonances
In this work we study the resonances near the thresholds of the open
heavy-flavor hadrons using the effective-range-expansion method. The unitarity,
analyticity and compositeness coefficient are also taken into account in our
theoretical formalism. We consider several resonances lying near the open
heavy-flavor thresholds. The scattering lengths and
effective ranges from the relevant elastic $S$-wave scattering amplitudes are
determined. Tentative discussions on the inner structures of the aforementioned
resonances are given.
Comment: 10 pages. Comparisons with other ways to define the compositeness are
included. To match the published version
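For context, the effective range expansion referred to above takes the standard textbook form (a sketch, not the paper's specific parameterization):

```latex
% S-wave effective range expansion near threshold, with scattering
% length a and effective range r:
\[
  k \cot \delta_0(k) = -\frac{1}{a} + \frac{1}{2} r k^2 + O(k^4),
  \qquad
  f(k) = \frac{1}{k \cot \delta_0(k) - i k}.
\]
% A near-threshold bound or virtual state corresponds to a pole of f(k),
% i.e. k \cot \delta_0(k) = i k, so the fitted a and r constrain the pole
% position and, via compositeness relations, the inner structure of the
% resonance.
```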