
    Learning curve parameters.

    <p>The number of characters learned <i>N</i>, final learning efficiency Ξ›<sub><i>f</i></sub>, and integral learning efficiency βŒ©Ξ›βŒͺ for reference cumulative learning costs of <i>C</i><sub>0</sub> = 500 and <i>C</i><sub>0</sub> = 1500. The Yan et al. algorithm was optimized up to a cumulative learning cost of <i>C</i><sub>0</sub> = 4000.</p>

    Learning curves for characters and words.

    <p>The green curves correspond to HSK word lists for levels 1 to 4 (shorter curve) and 1 to 6 (longer curve). The yellow curves correspond to word lists generated from two levels of beginner readers. All curves were created using the OLS character decompositions.</p>

    Measures of learning efficiency.

    <p>The curves <i>A</i> and <i>B</i> represent two different learning curves. For each curve, the final learning efficiency Ξ›<sub><i>f</i></sub> is the cumulative usage frequency for a specific cumulative learning cost <i>C</i><sub>0</sub>, and the integral learning efficiency βŒ©Ξ›βŒͺ is the average cumulative usage frequency between the origin and <i>C</i><sub>0</sub>. Curve <i>A</i> has higher Ξ›<sub><i>f</i></sub> but lower βŒ©Ξ›βŒͺ. Illustrated values for βŒ©Ξ›βŒͺ are approximate.</p>
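    The two measures can be illustrated numerically. In this hypothetical sketch (not the authors' code), a learning curve is a list of cumulative usage frequencies sampled at each unit of cumulative learning cost; Ξ›<sub><i>f</i></sub> is the value at <i>C</i><sub>0</sub> and βŒ©Ξ›βŒͺ is the discrete mean from the origin to <i>C</i><sub>0</sub>:

    ```python
    def final_efficiency(curve, c0):
        # curve[c] = cumulative usage frequency after cumulative learning cost c
        return curve[c0]

    def integral_efficiency(curve, c0):
        # Average cumulative usage frequency between the origin and c0,
        # approximated as a discrete mean over the sampled costs.
        return sum(curve[:c0 + 1]) / (c0 + 1)

    # Two hypothetical curves over costs 0..4: A ends higher, B rises earlier,
    # mirroring the figure (A has higher final but lower integral efficiency).
    A = [0.0, 0.1, 0.2, 0.4, 0.8]
    B = [0.0, 0.3, 0.5, 0.6, 0.7]

    print(final_efficiency(A, 4), final_efficiency(B, 4))        # A is higher
    print(integral_efficiency(A, 4), integral_efficiency(B, 4))  # B is higher
    ```

    The curve values here are invented for illustration; only the qualitative ordering matches the figure.
    
    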

    The first 85 characters of our optimized learning order.

    <p>Taken together, these characters have a cumulative usage frequency of 0.42.</p>

    A network where our algorithm does not return the optimal character order.

    <p>A hypothetical network where the integral learning efficiency of the order generated by the algorithm is lower than that of another possible order. Letters represent Chinese characters (for example, E is a compound character formed from primitives A and B) and the numbers are centralities. Both orders have identical final learning efficiencies.</p>

    Usage frequencies for the first 85 characters.

    <p>The gray, green and blue bars correspond to the black, green and blue curves in Fig 8. Dark bars represent primitives and light bars represent compounds.</p>

    Usage frequency versus number of unique components for the 1000 most common Chinese characters.

    <p>This plot shows the weak relationship between character usage frequency and complexity, the latter represented by the number of unique components used to construct the character. Usage frequency is normalized to 1.0 over the whole usage frequency data set, which encompasses more characters than shown in this plot. The six characters illustrated are the most common in each column. Note that the number of unique components is not the same as visual complexity: the characters ζˆ‘ and θ―΄ have similar visual complexity (they are composed of similar numbers of strokes), but ζˆ‘ is conceptually simpler, being built in the OLS character decomposition from two relatively complex primitive components, 手 and 戈, whereas θ―΄ is built from four.</p>

    Illustration of the topological sort algorithm.

    <p>The ordered list is processed from low to high centrality (right to left in the figure). Once ηš„ is reached, its components are checked in turn. η™½ is found to lie to the right of ηš„ and so is repositioned to its left. Likewise ε‹Ί is found to the right of ηš„ and is similarly repositioned. ε‹Ί is positioned to the right of η™½ because it has lower centrality. The centralities used in this figure are for illustrative purposes only.</p>
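    The repositioning pass described in this caption can be sketched as follows. This is a hypothetical reconstruction, not the published implementation: characters are assumed sorted from high to low centrality, left to right, and each compound's components are assumed listed in decreasing centrality order, so that a lower-centrality component such as ε‹Ί ends up to the right of a higher-centrality one such as η™½.

    ```python
    def reposition(order, components):
        # order: characters sorted by decreasing centrality (left to right).
        # components: maps a compound character to its components, assumed
        # listed in decreasing centrality order.
        # The list is scanned from the low-centrality end; any component found
        # to the right of its compound is moved to immediately before it.
        order = list(order)
        i = len(order) - 1
        while i >= 0:
            for comp in components.get(order[i], []):
                j = order.index(comp)
                if j > i:                          # component lies to the right
                    order.insert(i, order.pop(j))  # move it left of the compound
                    i += 1                         # the compound shifted right
            i -= 1
        return order

    # Hypothetical example echoing the caption ('X' is a placeholder character):
    # both components of ηš„ start to its right and are moved before it.
    print(reposition(["ηš„", "X", "η™½", "ε‹Ί"], {"ηš„": ["η™½", "ε‹Ί"]}))
    ```

    With this input the result is η™½, ε‹Ί, ηš„, X: both components precede the compound, with the lower-centrality ε‹Ί to the right of η™½, as in the figure.
    
    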

    Structural decomposition of the character η…§.

    <p>Primitive characters appear as characters in their own right, whereas primitive components do not. The primitive component 灬 is an abbreviated form of the primitive character 火. The parameter <i>r</i> is the SUBTLEX-CH usage frequency rank of the character. Pronunciations are given in pinyin romanization. Note that each character is assigned only a single meaning even though most actually possess a range of broadly related meanings.</p>

    Measures of character clustering.

    <p>The top panel shows the average distance, in number of characters, to the closest preceding component. The bottom panel shows the average distance, in number of characters, to another character which shares a component. Curves were generated with a fixed cumulative learning cost of <i>C</i><sub>0</sub> = 4000. Averages below 250 characters are not shown because in this region the averages fluctuate wildly.</p>
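    The top-panel measure can be sketched as follows. This is a hypothetical reconstruction of "average distance to the closest preceding component", assuming the learning order is given as a list and a decomposition map gives each compound's components; it is not the authors' code.

    ```python
    def avg_distance_to_preceding_component(order, components):
        # For each compound in the learning order, find the distance (in list
        # positions) back to its closest preceding component, then average over
        # all compounds that have at least one preceding component.
        position = {c: i for i, c in enumerate(order)}
        distances = []
        for i, char in enumerate(order):
            preceding = [i - position[c]
                         for c in components.get(char, [])
                         if c in position and position[c] < i]
            if preceding:
                distances.append(min(preceding))
        return sum(distances) / len(distances)

    # Hypothetical example: ηš„'s closest preceding component is ε‹Ί, one slot back.
    print(avg_distance_to_preceding_component(
        ["η™½", "ε‹Ί", "ηš„"], {"ηš„": ["η™½", "ε‹Ί"]}))
    ```

    The characters and decomposition here are illustrative only; the paper's curves average this quantity over the full optimized orders.
    
    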