The Latent Structure of Dictionaries
How many words (and which ones) are sufficient to define all other words? When dictionaries are analyzed as directed graphs with links from defining words to defined words, they reveal a latent structure. Recursively removing all words that are reachable by definition but that do not define any further words reduces the dictionary to a Kernel of about 10%. This is still not the smallest number of words that can define all the rest. About 75% of the Kernel turns out to be its Core, a Strongly Connected Subset of words with a definitional path to and from any pair of its words and no word’s definition depending on a word outside the set. But the Core cannot define all the rest of the dictionary. The 25% of the Kernel surrounding the Core consists of small strongly connected subsets of words: the Satellites. The size of the smallest set of words that can define all the rest (the graph’s Minimum Feedback Vertex Set or MinSet) is about 1% of the dictionary, 15% of the Kernel, and half-Core, half-Satellite. But every dictionary has a huge number of MinSets. The Core words are learned earlier, are more frequent, and are less concrete than the Satellites, which in turn are learned earlier and are more frequent but more concrete than the rest of the Dictionary. In principle, only one MinSet’s words would need to be grounded through the sensorimotor capacity to recognize and categorize their referents. In a dual-code sensorimotor-symbolic model of the mental lexicon, the symbolic code could do all the rest via re-combinatory definition.
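The Kernel reduction described in this abstract can be illustrated with a minimal sketch. It assumes the dictionary is given as a mapping from each word to the set of words used in its definition; the function name `reduce_to_kernel` and the toy dictionary are hypothetical and are not taken from the paper. The sketch repeatedly removes words that no remaining word uses in its definition, which is one straightforward reading of "recursively removing all words that ... do not define any further words".

```python
from collections import deque


def reduce_to_kernel(definitions):
    """Return the words left after recursively removing every word that
    is not used in the definition of any remaining word.

    `definitions` maps each word to the words used in its definition
    (an assumed input format, chosen for illustration only).
    """
    words = set(definitions)
    # Keep only defining words that are themselves dictionary entries.
    uses = {w: {d for d in definitions[w] if d in words} for w in words}

    # out_degree[d] = number of remaining words whose definition uses d,
    # i.e. the number of links d -> defined word in the dictionary graph.
    out_degree = {w: 0 for w in words}
    for w in words:
        for d in uses[w]:
            out_degree[d] += 1

    # Start from words that define nothing and peel them off layer by layer.
    queue = deque(w for w in words if out_degree[w] == 0)
    kernel = set(words)
    while queue:
        w = queue.popleft()
        kernel.discard(w)
        for d in uses[w]:
            out_degree[d] -= 1
            if out_degree[d] == 0:
                queue.append(d)
    return kernel


# Invented toy dictionary: "dreadful", "awful" and "very" are peeled away.
toy = {
    "good": {"not", "bad"},
    "bad": {"not", "good"},
    "not": {"no"},
    "no": {"not"},
    "awful": {"very", "bad"},
    "very": {"good"},
    "dreadful": {"awful"},
}
kernel = reduce_to_kernel(toy)
```

In this toy case the four remaining words all lie on definitional cycles. Finding the Core would additionally require a strongly-connected-components computation, and finding a MinSet requires solving a minimum feedback vertex set problem, which is NP-hard in general; neither is attempted here.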
Hierarchies in Dictionary De
A dictionary defines words in terms of other words. Definitions can tell you the meanings of words you don't know, but only if you know the meanings of the defining words. How many words do you need to know (and which ones) in order to be able to learn all the rest from definitions? We reduced dictionaries to their "grounding kernels" (GKs), about 10% of the dictionary, from which all the other words could be defined. The GK words turned out to have psycholinguistic correlates: they were learned at an earlier age and were more concrete than the rest of the dictionary. But one can compress still more: the GK turns out to have internal structure, with a strongly connected "kernel core" (KC) and a surrounding layer, from which a hierarchy of definitional distances can be derived, all the way out to the periphery of the full dictionary. These definitional distances, too, are correlated with psycholinguistic variables (age of acquisition, concreteness, imageability, oral and written frequency) and hence perhaps with the "mental lexicon" in each of our heads.
On the feedback vertex set polytope of a series-parallel graph
AMS codes: 90C, 52B, 05C
On the maximum orders of an induced forest, an induced tree, and a stable set
Let $G$ be a connected graph, $n$ the order of $G$, and $f$ (resp. $t$) the maximum
order of an induced forest (resp. tree) in $G$. We show that $f - t$ is at most
$n - \lceil 2\sqrt{n-1} \rceil$. In the special case where $n$ is of the form $a^2 + 1$ for some
even integer $a \geq 4$, $f - t$ is at most $n - \lceil 2\sqrt{n-1} \rceil - 1$. We also prove
that these bounds are tight. In addition, letting $\alpha$ denote the stability
number of $G$, we show that $\alpha - t$ is at most $n + 1 - \lceil 2\sqrt{2n} \rceil$; this bound is
also tight.
On the maximum orders of an induced forest, an induced tree, and a stable set
Let $G$ be a connected graph, $n$ the order of $G$, and $f$ (resp. $t$) the maximum order of an induced forest (resp. tree) in $G$. We show that $f - t$ is at most $n - \lceil 2\sqrt{n-1} \rceil$. In the special case where $n$ is of the form $a^2 + 1$ for some even integer $a \geq 4$, $f - t$ is at most $n - \lceil 2\sqrt{n-1} \rceil - 1$. We also prove that these bounds are tight. In addition, letting $\alpha$ denote the stability number of $G$, we show that $\alpha - t$ is at most $n + 1 - \lceil 2\sqrt{2n} \rceil$; this bound is also tight.
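To make the bounds in this abstract concrete, here is a small sketch that evaluates them for a given order n. It reads the bracketed terms as ceilings, that is, n - ceil(2*sqrt(n-1)) for f - t and n + 1 - ceil(2*sqrt(2n)) for alpha - t. The function names are hypothetical, and the code only evaluates the stated formulas; it does not compute f, t, or alpha for an actual graph.

```python
import math


def ceil_2_sqrt(m):
    """Return ceil(2 * sqrt(m)) for a non-negative integer m.

    Uses 2*sqrt(m) = sqrt(4m) and integer square roots, so there is
    no floating-point rounding to worry about.
    """
    r = math.isqrt(4 * m)
    return r if r * r == 4 * m else r + 1


def forest_tree_gap_bound(n):
    """Upper bound on f - t for a connected graph of order n, as stated."""
    bound = n - ceil_2_sqrt(n - 1)
    # Special case from the abstract: n = a^2 + 1 with a even and a >= 4.
    a = math.isqrt(n - 1)
    if a * a == n - 1 and a % 2 == 0 and a >= 4:
        bound -= 1
    return bound


def stable_tree_gap_bound(n):
    """Upper bound on alpha - t for a connected graph of order n, as stated."""
    return n + 1 - ceil_2_sqrt(2 * n)


# Example orders: n = 17 falls in the special case (a = 4), n = 10 does not.
gap_17 = forest_tree_gap_bound(17)
gap_10 = forest_tree_gap_bound(10)
stable_17 = stable_tree_gap_bound(17)
```

For n = 17, the general bound would give 17 - 8 = 9, and the special case lowers it by one to 8.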