Resampling methods for document clustering

Authors: Stepanov M. G.; Volk D.
Publication date: 01/09/2001

We compare the performance of different clustering algorithms applied to the task of unsupervised text categorization. We consider agglomerative clustering algorithms, principal direction divisive partitioning and (for the first time) superparamagnetic clustering with several distance measures. The algorithms have been applied to test databases extracted from the Reuters-21578 text categorization test database. We find that simple application of the different clustering algorithms yields clustering solutions of comparable quality. To improve the clustering results considerably, it is crucial to reduce the dictionary of words considered in the representation of the documents. Significant improvements in clustering quality can be obtained by identifying discriminative words and filtering out indiscriminative words from the dictionary. We present two methods, each based on a resampling scheme, for selecting discriminative words in an unsupervised way. Comment: RevTeX, 9 pages, 2 figures

arXiv.org e-Print Archive

New Bisoliton Solutions in Dispersion Managed Systems

Authors: Shkarayev M.; Stepanov M. G.
Publication date: 26/11/2007

In this paper we propose a method which provides a full description of solitary wave solutions of the Schroedinger equation with periodically varying dispersion. This method is based on analysis and polynomial deformation of the spectrum of an iterative map. Using this method we discover a new family of antisymmetric bisoliton solutions. Besides being of interest for nonlinear fiber optics and the theory of nonlinear Schroedinger equations with periodic coefficients, these solutions have potential applications for increasing the bit rate in high-speed optical fiber communications.

arXiv.org e-Print Archive

Rain initiation time in turbulent warm clouds

Authors: Falkovich G.; Stepanov M. G.; Vucelja M.
Publication date: 21/11/2004

We present a mean-field model that describes droplet growth due to condensation and collisions and droplet loss due to fallout. The model allows for an effective numerical simulation. We study how the rain initiation time depends on different parameters. We also present a simple model that allows one to estimate the rain initiation time for turbulent clouds with an inhomogeneous concentration of cloud condensation nuclei. In particular, we show that by over-seeding even a part of a cloud with small hygroscopic nuclei one can substantially delay the onset of precipitation. Comment: submitted to Journal of Applied Meteorology

arXiv.org e-Print Archive

CERN Document Server

On higher order Codazzi tensors on complete Riemannian manifolds

Authors: Shandra I. G.; Stepanov S. E.
Publication date: 14/12/2018

We prove several Liouville-type non-existence theorems for higher order Codazzi tensors and classical Codazzi tensors on complete and, in particular, compact Riemannian manifolds. These results are obtained by using theorems on the connections between the geometry of a complete smooth manifold and the global behavior of its subharmonic functions. In conclusion, we show applications of this method to the global geometry of a complete locally conformally flat Riemannian manifold with constant scalar curvature, because its Ricci tensor is a Codazzi tensor, and to the global geometry of a complete hypersurface in a standard sphere, because its second fundamental form is also a Codazzi tensor.

arXiv.org e-Print Archive

Analysis of spatial correlations in a model 2D liquid through eigenvalues and eigenvectors of atomic level stress matrices

Authors: Levashov V. A.; Stepanov M. G.
Publication venue: 'American Physical Society (APS)'
Publication date: 27/07/2015

Considerations of local atomic level stresses associated with each atom represent a particular approach to address structures of disordered materials at the atomic level. We studied structural correlations in a two-dimensional model liquid using molecular dynamics simulations in the following way. We diagonalized the atomic level stress tensors of every atom and investigated correlations between the eigenvalues and orientations of the eigenvectors of different atoms as a function of distance between them. It is demonstrated that the suggested approach can be used to characterize structural correlations in disordered materials. In particular, we found that changes in the stress correlation functions on decrease of temperature are the most pronounced for the pairs of atoms with separation distance that corresponds to the first minimum in the pair density function. We also show that the angular dependencies of the stress correlation functions previously reported in [Phys. Rev. E v.91, 032301 (2015)] are related not to the alleged anisotropies of the Eshelby's stress fields, but to the rotational properties of the stress tensors. Comment: 14 pages, 9 figures
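
The diagonalization step described in this abstract can be illustrated for a single symmetric 2x2 tensor. The following is a minimal sketch, not the authors' code: the function names and the `orientation_alignment` measure (the cosine of twice the angle difference, chosen here because eigenvector directions have no head or tail) are assumptions made for illustration only.

```python
import math

def diagonalize_2d_stress(sxx, syy, sxy):
    """Diagonalize the symmetric tensor [[sxx, sxy], [sxy, syy]].

    Returns (lambda_max, lambda_min, angle), where angle is the orientation
    (radians, in (-pi/2, pi/2]) of the eigenvector belonging to lambda_max.
    """
    mean = 0.5 * (sxx + syy)            # isotropic (pressure-like) part
    half_diff = 0.5 * (sxx - syy)       # deviatoric part along the axes
    radius = math.hypot(half_diff, sxy) # half the eigenvalue splitting
    angle = 0.5 * math.atan2(sxy, half_diff)
    return mean + radius, mean - radius, angle

def orientation_alignment(angle_i, angle_j):
    """Hypothetical alignment measure between two principal axes.

    Gives +1 for parallel axes and -1 for perpendicular ones; it is
    unchanged when either axis is flipped by pi.
    """
    return math.cos(2.0 * (angle_i - angle_j))
```

For example, `diagonalize_2d_stress(1.0, 1.0, 1.0)` gives eigenvalues 2 and 0 with the principal axis at an angle of pi/4. Averaging a quantity such as `orientation_alignment` over pairs of atoms binned by separation would give one possible distance-dependent correlation function.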

arXiv.org e-Print Archive

Predicting Failures in Power Grids: The Case of Static Overloads

Authors: Chertkov Michael; Pan Feng; Stepanov Mikhail G.
Publication date: 15/09/2010

Here we develop an approach to predict power grid weak points, and specifically to efficiently identify the most probable failure modes in static load distribution for a given power network. This approach is applied to two examples: Guam's power system and the IEEE RTS-96 system, both modeled within the static Direct Current power flow model. Our algorithm is a power network adaptation of the worst configuration heuristics, originally developed to study low probability events in physics and failures in error-correction. One finding is that, if the normal operational mode of the grid is sufficiently healthy, the failure modes, also called instantons, are sufficiently sparse, i.e. the failures are caused by load fluctuations at only a few buses. The technique is useful for discovering weak links which are saturated at the instantons. It can also identify generators working at capacity and generators under capacity, thus providing predictive capability for improving the reliability of any power network. Comment: 11 pages, 10 figures
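
As a rough illustration of the static Direct Current power flow model mentioned in this abstract, the sketch below solves for bus phase angles given net injections and then flags lines whose flow exceeds a limit. It is a toy under stated assumptions, not the authors' instanton search: the function names, the three-bus network, and the line limits are all hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def dc_line_flows(n_buses, lines, injections, slack=0):
    """DC power flow: lines are (i, j, susceptance); returns flow i -> j per line."""
    keep = [k for k in range(n_buses) if k != slack]
    index = {bus: pos for pos, bus in enumerate(keep)}
    b_mat = [[0.0] * len(keep) for _ in keep]
    for i, j, b in lines:
        for p, q in ((i, j), (j, i)):
            if p != slack:
                b_mat[index[p]][index[p]] += b
                if q != slack:
                    b_mat[index[p]][index[q]] -= b
    theta_reduced = solve_linear(b_mat, [injections[k] for k in keep])
    theta = [0.0] * n_buses  # the slack bus angle is fixed at zero
    for bus, pos in index.items():
        theta[bus] = theta_reduced[pos]
    return [b * (theta[i] - theta[j]) for i, j, b in lines]

def overloaded_lines(flows, limits):
    """Indices of lines whose absolute flow exceeds the given limit."""
    return [k for k, (f, lim) in enumerate(zip(flows, limits)) if abs(f) > lim]

# Hypothetical 3-bus example: bus 0 generates, buses 1 and 2 consume.
lines = [(0, 1, 10.0), (0, 2, 10.0), (1, 2, 10.0)]
flows = dc_line_flows(3, lines, injections=[1.5, -1.0, -0.5])
weak = overloaded_lines(flows, limits=[0.8, 1.0, 1.0])
```

In this toy case only the line between buses 0 and 1 is over its limit. A search for the most probable failure mode would vary the load configuration and look for the likeliest one that produces such an overload.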

arXiv.org e-Print Archive

An Efficient Pseudo-Codeword Search Algorithm for Linear Programming Decoding of LDPC Codes

Authors: Chertkov Michael; Stepanov Mikhail G.
Publication date: 01/01/2006

In Linear Programming (LP) decoding of a Low-Density-Parity-Check (LDPC) code one minimizes a linear functional, with coefficients related to log-likelihood ratios, over a relaxation of the polytope spanned by the codewords \cite{03FWK}. In order to quantify LP decoding, and thus to describe performance of the error-correction scheme at moderate and large Signal-to-Noise-Ratios (SNR), it is important to study the relaxed polytope to understand better its vertices, so-called pseudo-codewords, especially those which are neighbors of the zero codeword. In this manuscript we propose a technique to heuristically create a list of these neighbors and their distances. Our pseudo-codeword-search algorithm starts by randomly choosing the initial configuration of the noise. The configuration is modified through a discrete number of steps. Each step consists of two sub-steps. Firstly, one applies an LP decoder to the noise configuration, deriving a pseudo-codeword. Secondly, one finds a configuration of the noise equidistant from the pseudo-codeword and the zero codeword. The resulting noise configuration is used as an entry for the next step. The iterations converge rapidly to a pseudo-codeword neighboring the zero codeword. Repeated many times, this procedure is characterized by the distribution function (frequency spectrum) of the pseudo-codeword effective distance. The effective distance of the coding scheme is approximated by the shortest distance pseudo-codeword in the spectrum. The efficiency of the procedure is demonstrated on examples of the Tanner [155,64,20] code and Margulis p=7 and p=11 codes (672 and 2640 bits long respectively) operating over an Additive-White-Gaussian-Noise (AWGN) channel. Comment: 5 pages, 6 figures
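
The abstract does not spell out how the effective distance of a pseudo-codeword is computed. A minimal sketch follows, assuming it refers to the standard AWGN pseudo-weight, (sum of components)^2 divided by the sum of squared components; the function names and the sample pseudo-codewords are hypothetical.

```python
def effective_distance(pseudo_codeword):
    """AWGN pseudo-weight of a nonzero pseudo-codeword.

    For an ordinary binary codeword this reduces to its Hamming weight.
    """
    total = sum(pseudo_codeword)
    energy = sum(s * s for s in pseudo_codeword)
    if energy == 0:
        raise ValueError("the zero codeword has no effective distance")
    return total * total / energy

def shortest_effective_distance(found):
    """Smallest effective distance in a list of pseudo-codewords,
    used here as an estimate for the coding scheme as a whole."""
    return min(effective_distance(p) for p in found)

# Hypothetical results of repeated searches on a toy 4-bit code.
found = [
    [1.0, 1.0, 1.0, 0.0],  # a true codeword of Hamming weight 3
    [1.0, 0.5, 0.5, 0.0],  # a fractional pseudo-codeword
]
estimate = shortest_effective_distance(found)
```

The fractional vertex here has a smaller effective distance than the codeword, which is why such vertices matter for LP decoding performance at moderate and large SNR.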

arXiv.org e-Print Archive

CiteSeerX

Improving convergence of Belief Propagation decoding

Authors: Chertkov M.; Stepanov M. G.
Publication date: 25/07/2006

The decoding of Low-Density Parity-Check codes by the Belief Propagation (BP) algorithm is revisited. We check the iterative algorithm for its convergence to a codeword (termination) and run Monte Carlo simulations to find the probability distribution function of the termination time, n_it. Tested on an example [155, 64, 20] code, this termination curve shows a maximum and an extended algebraic tail at the highest values of n_it. Aiming to reduce the tail of the termination curve, we consider a family of iterative algorithms modifying the standard BP by means of a simple relaxation. The relaxation parameter controls the convergence of the modified BP algorithm to a minimum of the Bethe free energy. The improvement is experimentally demonstrated for the Additive-White-Gaussian-Noise channel in some range of the signal-to-noise ratios. We also discuss the trade-off between the relaxation parameter of the improved iterative scheme and the number of iterations.

arXiv.org e-Print Archive

Instanton analysis of Low-Density-Parity-Check codes in the error-floor regime

Authors: Chertkov M.; Stepanov M. G.
Publication date: 16/01/2006

In this paper we develop the instanton method introduced in [1], [2], [3] to analyze quantitatively the performance of Low-Density-Parity-Check (LDPC) codes decoded iteratively in the so-called error-floor regime. We discuss statistical properties of the numerical instanton-amoeba scheme, focusing on detailed analysis and comparison of two regular LDPC codes: Tanner's (155, 64, 20) and Margulis' (672, 336, 16) codes. In the regime of moderate values of the signal-to-noise ratio we critically compare results of the instanton-amoeba evaluations against the standard Monte-Carlo calculations of the Frame-Error-Rate. Comment: 5 pages, 5 figures

arXiv.org e-Print Archive

Testing the magnetic field models of galaxies with the SKA

Authors: Arshakian T. G.; Beck R.; Frick P.; Krause M.; Stepanov R.
Publication date: 03/12/2007

The future new-generation radio telescope SKA (Square Kilometre Array) and its precursors will provide a rapidly growing number of polarized radio sources. Hundreds and thousands of polarized background sources can be measured towards nearby galaxies, thus allowing their detailed magnetic field mapping by means of Faraday rotation measures (RM). We aim to estimate the required density of the background polarized sources detected with the SKA for reliable recognition and reconstruction of the magnetic field structure in nearby spiral galaxies. We construct a galaxy model which includes the ionized gas and magnetic field patterns of different azimuthal symmetry (axisymmetric (ASS), bisymmetric (BSS) and quadrisymmetric spiral (QSS), and superpositions) plus a halo magnetic field. RM fluctuations with a Kolmogorov spectrum due to turbulent fields and/or fluctuations in ionized gas density are superimposed. Recognition of magnetic structures is possible from RM towards background sources behind galaxies or a continuous RM map obtained from the diffuse polarized emission from the galaxy itself. Under favourable conditions, a few dozen polarized sources are sufficient for a reliable recognition. Reconstruction of the field structure without precognition becomes possible for a large number of background sources. A reliable reconstruction of the field structure needs at least 20 RM values on a cut along the projected minor axis, which translates to approximately 1200 sources towards the galaxy. Radio telescopes operating at low frequencies (LOFAR, ASKAP and the low-frequency SKA array) may also be useful instruments for field recognition or reconstruction with the help of RM, if background sources are still significantly polarized at low frequencies (abridged). Comment: 5 pages, 2 figures; contribution to the proceedings of the meeting "From planets to dark energy: the modern radio universe", 1-5 October, Manchester, UK (minor changes are added in the replaced version)

arXiv.org e-Print Archive