Largest sparse subgraphs of random graphs
For the Erdős–Rényi random graph G(n,p), we give a precise asymptotic
formula for the size of a largest vertex subset in G(n,p) that induces a
subgraph with average degree at most t, provided that p = p(n) is not too small
and t = t(n) is not too large. In the case of fixed t and p, we find that this
value is asymptotically almost surely concentrated on at most two explicitly
given points. This generalises a result on the independence number of random
graphs. For both the upper and lower bounds, we rely on large deviations
inequalities for the binomial distribution.
Comment: 15 pages
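To make the quantity in this abstract concrete, the following is a minimal brute-force sketch, not the paper's method. It samples a small G(n,p) and searches for the largest vertex subset whose induced subgraph has average degree at most t, where the average degree of a subset S is taken as 2e(S)/|S|. The function names `random_graph` and `largest_sparse_subset` are hypothetical, and the exhaustive search is only feasible for very small n. With t = 0 the value is the independence number, the special case that the abstract says the result generalises.

```python
import itertools
import random


def random_graph(n, p, seed=0):
    """Sample the edge set of G(n, p): each of the C(n, 2) pairs is kept with probability p."""
    rng = random.Random(seed)
    return {(u, v) for u, v in itertools.combinations(range(n), 2) if rng.random() < p}


def largest_sparse_subset(n, edges, t):
    """Size of a largest vertex subset S whose induced subgraph has
    average degree 2 * e(S) / |S| at most t (exhaustive search, small n only)."""
    for k in range(n, 0, -1):
        for subset in itertools.combinations(range(n), k):
            s = set(subset)
            induced = sum(1 for u, v in edges if u in s and v in s)
            # Average degree <= t, written without division.
            if 2 * induced <= t * k:
                return k
    return 0


if __name__ == "__main__":
    edges = random_graph(12, 0.5, seed=1)
    for t in (0, 1, 2):
        print(t, largest_sparse_subset(12, edges, t))
```

The value can only grow with t, since any subset that satisfies the constraint for a given t also satisfies it for every larger t.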
How to improve robustness in Kohonen maps and display additional information in Factorial Analysis: application to text mining
This article is an extended version of a paper presented at the WSOM'2012
conference [1]. We present a combination of factorial projections, the SOM
algorithm and graph techniques applied to a text mining problem. The corpus
contains 8 medieval manuscripts which were used to teach arithmetic techniques
to merchants. Among the techniques for Data Analysis, those used for
Lexicometry (such as Factorial Analysis) highlight the discrepancies between
manuscripts. This is because they focus on the deviation from
independence between words and manuscripts. However, we also want to discover and
characterize the vocabulary common to the whole corpus. Using the properties
of stochastic Kohonen maps, which define neighborhood between inputs in a
non-deterministic way, we highlight the words which seem to play a special role
in the vocabulary. We call them fickle and use them to improve both the
robustness of the Kohonen map and the significance of the FCA visualization.
Finally, we use graph algorithms to exploit this fickleness for the
classification of words.
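
The abstract does not spell out how fickle words are detected, so the following is only one possible reading, sketched under stated assumptions. It assumes that the stochastic SOM has been run several times, that each run records the map unit (row, column) assigned to every word, and that a pair of words whose neighbor frequency across runs is neither close to 0 nor close to 1 counts as unstable. The function names, the thresholds `low` and `high`, and the neighborhood `radius` are all hypothetical choices, not values taken from the paper.

```python
from itertools import combinations


def neighbor_frequency(runs, radius=1):
    """For each pair of words, the fraction of runs in which the two words
    fall on map units within `radius` of each other (Chebyshev distance)."""
    words = sorted(runs[0])
    freq = {}
    for a, b in combinations(words, 2):
        hits = 0
        for positions in runs:
            (ra, ca), (rb, cb) = positions[a], positions[b]
            if max(abs(ra - rb), abs(ca - cb)) <= radius:
                hits += 1
        freq[(a, b)] = hits / len(runs)
    return freq


def fickleness(runs, low=0.2, high=0.8, radius=1):
    """Fraction of a word's pairs that are unstable, i.e. whose neighbor
    frequency lies strictly between the assumed thresholds `low` and `high`."""
    words = sorted(runs[0])
    unstable = {w: 0 for w in words}
    for (a, b), f in neighbor_frequency(runs, radius).items():
        if low < f < high:
            unstable[a] += 1
            unstable[b] += 1
    denom = max(len(words) - 1, 1)
    return {w: unstable[w] / denom for w in words}


if __name__ == "__main__":
    # Toy data: "c" jumps between two distant units, "a" and "b" stay put.
    runs = [
        {"a": (0, 0), "b": (0, 1), "c": (0, 0)},
        {"a": (0, 0), "b": (0, 1), "c": (5, 5)},
        {"a": (0, 0), "b": (0, 1), "c": (0, 0)},
        {"a": (0, 0), "b": (0, 1), "c": (5, 5)},
    ]
    print(fickleness(runs))
```

Under this reading, words with a high score are candidates for the fickle set; how such a score would feed into the map or the factorial display is left to the paper itself.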