2,692 research outputs found

    On the Doubt about Margin Explanation of Boosting

    Margin theory provides one of the most popular explanations for the success of AdaBoost, where the central point lies in the recognition that the margin is the key to characterizing the performance of AdaBoost. This theory has been very influential; e.g., it has been used to argue that AdaBoost usually does not overfit since it tends to enlarge the margin even after the training error reaches zero. Previously the minimum margin bound was established for AdaBoost; however, Breiman (1999) pointed out that maximizing the minimum margin does not necessarily lead to better generalization. Later, Reyzin and Schapire (2006) emphasized that the margin distribution, rather than the minimum margin, is crucial to the performance of AdaBoost. In this paper, we first present the $k$th margin bound and further study its relationship to previous work such as the minimum margin bound and the Emargin bound. Then, we improve the previous empirical Bernstein bounds (Maurer and Pontil, 2009; Audibert, Munos and Szepesvari, 2009), and based on such findings, we defend the margin-based explanation against Breiman's doubts by proving a new generalization error bound that considers exactly the same factors as Schapire, Freund, Bartlett and Lee (1998) but is sharper than Breiman's (1999) minimum margin bound. By incorporating factors such as the average margin and variance, we present a generalization error bound that is heavily related to the whole margin distribution. We also provide margin distribution bounds for the generalization error of voting classifiers in finite VC-dimension space. Comment: 35 pages
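
    As a rough illustration of the quantities this abstract refers to (minimum margin, average margin, margin variance), the sketch below computes the normalized margins of a small weighted voting classifier. The ensemble, weights and labels are hypothetical toy values, and the function name `margins` is an assumption made for this example; it is not taken from the paper and says nothing about the bounds themselves.

```python
from statistics import mean, pvariance


def margins(votes, weights, labels):
    """Normalized margin y * sum_t(a_t * h_t(x)) / sum_t(a_t) for each example.

    votes[t][i] is the +1/-1 prediction of base classifier t on example i.
    """
    total = sum(weights)
    result = []
    for i, y in enumerate(labels):
        score = sum(w * h[i] for w, h in zip(weights, votes))
        result.append(y * score / total)
    return result


# Hypothetical toy ensemble: 3 base classifiers, 4 examples.
votes = [
    [+1, +1, -1, -1],
    [+1, -1, -1, -1],
    [+1, +1, +1, -1],
]
weights = [0.5, 0.3, 0.2]
labels = [+1, +1, -1, -1]

m = margins(votes, weights, labels)
# Summary statistics of the whole margin distribution, not just its minimum.
summary = {"min": min(m), "mean": mean(m), "variance": pvariance(m)}
print(m)
print(summary)
```

    Every example here is classified correctly (all margins are positive), yet the margins differ, which is the situation in which the minimum margin and the margin distribution tell different stories.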

    The minimal polynomial of sequence obtained from componentwise linear transformation of linear recurring sequence

    Let $S=(s_1,s_2,\ldots,s_m,\ldots)$ be a linear recurring sequence with terms in $GF(q^n)$ and $T$ be a linear transformation of $GF(q^n)$ over $GF(q)$. Denote $T(S)=(T(s_1),T(s_2),\ldots,T(s_m),\ldots)$. In this paper, we first present counterexamples showing that the main result in [A.M. Youssef and G. Gong, On linear complexity of sequences over $GF(2^n)$, Theoretical Computer Science, 352(2006), 288-292] is not correct in general, since Lemma 3 in that paper is incorrect. Then, we determine the minimal polynomial of $T(S)$ if the canonical factorization of the minimal polynomial of $S$ without multiple roots is known, and thus present the solution to the problem which was mainly considered in the above paper but incorrectly solved. Additionally, as a special case, we determine the minimal polynomial of $T(S)$ if the minimal polynomial of $S$ is primitive. Finally, we give an upper bound on the linear complexity of $T(S)$ when $T$ exhausts all possible linear transformations of $GF(q^n)$ over $GF(q)$. This bound is tight in some cases. Comment: This paper was submitted to the journal Theoretical Computer Science

    The Minimal Polynomial over F_q of Linear Recurring Sequence over F_{q^m}

    Recently, motivated by the study of vectorized stream cipher systems, the joint linear complexity and joint minimal polynomial of multisequences have been investigated. Let S be a linear recurring sequence over the finite field F_{q^m} with minimal polynomial h(x) over F_{q^m}. Since F_{q^m} and F_{q}^m are isomorphic vector spaces over the finite field F_q, S is identified with an m-fold multisequence S^{(m)} over the finite field F_q. The joint minimal polynomial and joint linear complexity of the m-fold multisequence S^{(m)} are the minimal polynomial and linear complexity over F_q of S, respectively. In this paper, we study the minimal polynomial and linear complexity over F_q of a linear recurring sequence S over F_{q^m} with minimal polynomial h(x) over F_{q^m}. If the canonical factorization of h(x) in F_{q^m}[x] is known, we determine the minimal polynomial and linear complexity over F_q of the linear recurring sequence S over F_{q^m}. Comment: Submitted to the journal Finite Fields and Their Applications

    Black hole central engine for ultra-long gamma-ray burst 111209A and its associated supernova 2011kl

    The first association between an ultra-long gamma-ray burst (GRB) and a supernova, GRB 111209A/SN 2011kl, was recently reported, which enables us to investigate the physics of the central engines or even the progenitors of ultra-long GRBs. In this paper, we inspect the broad-band data of GRB 111209A/SN 2011kl. The late-time X-ray lightcurve exhibits a GRB 121027A-like fall-back bump, suggesting a black hole central engine. We thus propose a collapsar model with fall-back accretion for GRB 111209A/SN 2011kl. The required model parameters, such as the total mass and radius of the progenitor star, suggest that the progenitor of GRB 111209A is more likely a Wolf-Rayet star than a blue supergiant, and that the central engine of this ultra-long burst is a black hole. The implications of our results are discussed. Comment: Accepted for publication in Ap

    On the Resistance of Nearest Neighbor to Random Noisy Labels

    Nearest neighbor has always been one of the most appealing non-parametric approaches in machine learning, pattern recognition, computer vision, etc. Previous empirical studies partly show that nearest neighbor is resistant to noise, yet a deep analysis is lacking. This work presents finite-sample and distribution-dependent bounds on the consistency of nearest neighbor in the random noise setting. The theoretical results show that, for asymmetric noises, k-nearest neighbor is robust enough to classify most data correctly, except for a handful of examples whose labels are totally misled by random noises. For symmetric noises, however, k-nearest neighbor achieves the same consistency rate as in the noise-free setting, which verifies the resistance of k-nearest neighbor to random noisy labels. Motivated by the theoretical analysis, we propose the Robust k-Nearest Neighbor (RkNN) approach to deal with noisy labels. The basic idea is to make unilateral corrections to examples whose labels are totally misled by random noises, and to classify the others directly by utilizing the robustness of k-nearest neighbor. We verify the effectiveness of the proposed algorithm both theoretically and empirically. Comment: 35 pages

    Effects of the optimisation of the margin distribution on generalisation in deep architectures

    Despite being so vital to the success of Support Vector Machines, the principle of separating margin maximisation is not used in deep learning. We show that minimisation of margin variance, and not maximisation of the margin, is more suitable for improving generalisation in deep architectures. We propose the Halfway loss function, which minimises the Normalised Margin Variance (NMV) at the output of a deep learning model, and evaluate its performance against the Softmax Cross-Entropy loss on the MNIST, smallNORB and CIFAR-10 datasets.

    Dynamic tomography of the spin-orbit coupling in nonlinear optics

    Spin-orbit coupled (SOC) light fields with spatially inhomogeneous polarization have attracted increasing research interest within the optical community. In particular, owing to their spin-dependent phase and spatial structures, many familiar nonlinear optical phenomena are being re-examined, hence a revival of research in nonlinear optics. To fully investigate this topic, knowledge of how the topological structure of the light field evolves is necessary, but as yet few studies address the structural evolution of the light field. Here, for the first time, we present a universal approach for the theoretical tomographic treatment of the structural evolution of SOC light in nonlinear optical processes. Based on a Gedanken vector second harmonic generation, a fine-grained slice of evolving SOC light in a nonlinear interaction and the following diffraction propagation are studied theoretically and verified experimentally, revealing several interesting phenomena at the same time. The approach provides a useful tool for obtaining a more nuanced understanding of vector nonlinear optics, and sets a foundation for further broad-based studies in nonlinear systems. Comment: 10 pages, 7 figures

    On the Consistency of AUC Pairwise Optimization

    AUC (area under the ROC curve) is an important evaluation criterion, which has been popularly used in many learning tasks such as class-imbalance learning, cost-sensitive learning, learning to rank, etc. Many learning approaches try to optimize AUC, but owing to the non-convexity and discontinuity of AUC, almost all approaches work with surrogate loss functions. Thus, the consistency of AUC is crucial; however, it has been almost untouched before. In this paper, we provide a sufficient condition for the asymptotic consistency of learning approaches based on surrogate loss functions. Based on this result, we prove that exponential loss and logistic loss are consistent with AUC, but hinge loss is inconsistent. Then, we derive the $q$-norm hinge loss and general hinge loss that are consistent with AUC. We also derive the consistent bounds for exponential loss and logistic loss, and obtain the consistent bounds for many surrogate loss functions under the non-noise setting. Further, we disclose an equivalence between the exponential surrogate loss of AUC and the exponential surrogate loss of accuracy, and one straightforward consequence of this finding is that AdaBoost and RankBoost are equivalent.
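
    To make the objects in this abstract concrete, the sketch below computes the empirical AUC as the fraction of correctly ordered positive/negative pairs, together with the pairwise exponential, logistic and hinge surrogate losses on the same pairs. It is only an illustration under stated assumptions: the scores and labels are hypothetical, the helper names are invented for this example, and the code does not reproduce the paper's consistency analysis.

```python
import math


def split_scores(scores, labels):
    """Separate the scores of positive (label 1) and negative (label 0) examples."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return pos, neg


def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos, neg = split_scores(scores, labels)
    total = 0.0
    for p in pos:
        for n in neg:
            total += 1.0 if p > n else 0.5 if p == n else 0.0
    return total / (len(pos) * len(neg))


def pairwise_surrogate(scores, labels, phi):
    """Average of phi(f(x+) - f(x-)) over all positive/negative pairs."""
    pos, neg = split_scores(scores, labels)
    return sum(phi(p - n) for p in pos for n in neg) / (len(pos) * len(neg))


def exponential(t):
    return math.exp(-t)


def logistic(t):
    return math.log(1.0 + math.exp(-t))


def hinge(t):
    return max(0.0, 1.0 - t)


# Hypothetical scores: one positive (0.5) is ranked below one negative (1.0).
scores = [2.0, 0.5, 1.0, -1.0]
labels = [1, 1, 0, 0]

auc_value = auc(scores, labels)
losses = {
    name: pairwise_surrogate(scores, labels, phi)
    for name, phi in [("exponential", exponential),
                      ("logistic", logistic),
                      ("hinge", hinge)]
}
print(auc_value, losses)
```

    The surrogates are smooth or convex stand-ins for the discontinuous pairwise ranking error 1 - AUC; whether minimizing a given surrogate also maximizes AUC in the limit is the consistency question the abstract addresses.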

    Beam energy dependence of the relativistic retardation effects of electrical fields on the $\pi^{-}/\pi^{+}$ ratio in heavy-ion collisions

    In this article we investigate the beam energy dependence of the relativistic retardation effects of electrical fields on the single and double $\pi^{-}/\pi^{+}$ ratios in three heavy-ion reactions with the isospin- and momentum-dependent transport model IBUU11. As the beam energy increases from 200 to 400 MeV/nucleon, the effects of the relativistically retarded electrical fields on the $\pi^{-}/\pi^{+}$ ratio are found to grow gradually from negligible to considerable, as expected. Interestingly, however, these effects gradually become insignificant as the beam energy increases further from 400 to 800 MeV/nucleon. Moreover, we investigate the isospin dependence of the relativistic retardation effects of electrical fields on the $\pi^{-}/\pi^{+}$ ratio in the two isobar reaction systems $^{96}$Ru+$^{96}$Ru and $^{96}$Zr+$^{96}$Zr at beam energies from 200 to 800 MeV/nucleon. The relativistic retardation effects of electrical fields on the $\pi^{-}/\pi^{+}$ ratio are shown to be independent of the isospin of the reaction. Furthermore, we examine the double $\pi^{-}/\pi^{+}$ ratio in reactions of $^{96}$Zr+$^{96}$Zr over $^{96}$Ru+$^{96}$Ru at beam energies from 200 to 800 MeV/nucleon with the static field and the retarded field, respectively. It is shown that the double $\pi^{-}/\pi^{+}$ ratio from the two reactions is still an effective observable of the symmetry energy, free of interference from the differences in the electrical field between the relativistic and nonrelativistic calculations. Comment: 8 pages, 7 figures. Abbreviated abstract to meet the criterion of arXiv platform. Accepted for publication in Physical Review C. An extended but necessary complementary study to arXiv:1709.0912

    Effective-range-expansion study of near threshold heavy-flavor resonances

    In this work we study the resonances near the thresholds of the open heavy-flavor hadrons using the effective-range-expansion method. The unitarity, analyticity and compositeness coefficient are also taken into account in our theoretical formalism. We consider the $Z_c(3900)$, $X(4020)$, $\chi_{c1}(4140)$, $\psi(4260)$ and $\psi(4660)$. The scattering lengths and effective ranges from the relevant elastic $S$-wave scattering amplitudes are determined. Tentative discussions on the inner structures of the aforementioned resonances are given. Comment: 10 pages. Comparisons with other ways to define the compositeness are included. To match the published version