    Entropy of English text: Experiments with humans and a machine learning system based on rough sets

    The goal of this paper is to show the dependency of the entropy of English text on the subject of the experiment, the type of English text, and the methodology used to estimate the entropy. Claude Shannon first described the technique for estimating the entropy of English text by a human subject guessing the next letter after viewing a string of characters taken from actual text. We show how this result is affected by using different humans in the experiment (Shannon used only his wife) and by using different types of text material (Shannon used only a single book). We also show how the results are affected when we replace the human subjects with a machine learning system based on rough sets. Automating the play of the guessing game with this system, called LERS, gives rise to a lossless data compression scheme. (C) Elsevier Science Inc. 1998
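The guessing-game estimate described in this abstract can be illustrated with a minimal sketch. It assumes the bounds from Shannon's 1951 guessing experiment: if q_i is the fraction of letters identified on the i-th guess, the entropy per letter lies between sum_i i (q_i - q_{i+1}) log2 i and -sum_i q_i log2 q_i. The function names and the sample guess list are hypothetical, and nothing here reproduces the paper's LERS system.

```python
import math
from collections import Counter

def guess_distribution(guess_numbers):
    """Relative frequency q_i of needing exactly i guesses (i = 1 is a first-try hit)."""
    counts = Counter(guess_numbers)
    total = len(guess_numbers)
    return [counts.get(i, 0) / total for i in range(1, max(counts) + 1)]

def shannon_upper_bound(q):
    """Upper bound in bits per letter: entropy of the guess-number distribution."""
    return -sum(p * math.log2(p) for p in q if p > 0)

def shannon_lower_bound(q):
    """Lower bound in bits per letter: sum_i i * (q_i - q_{i+1}) * log2(i).

    The bound assumes q is non-increasing, so the values are sorted first.
    """
    ordered = sorted(q, reverse=True) + [0.0]
    return sum(
        i * (ordered[i - 1] - ordered[i]) * math.log2(i)
        for i in range(1, len(ordered))
    )

# Hypothetical record of one subject's guesses (illustrative data, not from the paper).
guesses = [1, 1, 1, 1, 1, 2, 2, 3]
q = guess_distribution(guesses)
print(shannon_lower_bound(q), shannon_upper_bound(q))
```

Different subjects or different texts change the guess counts, and therefore both bounds, which is the dependency the abstract refers to.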

    Dealing with Missing Data and Uncertainty in the Context of Data Mining

    Missing data is an issue in many real-world datasets, yet robust methods for dealing with missing data appropriately still need development. In this paper we investigate how some methods for handling missing data perform as the uncertainty increases. Using benchmark datasets from the UCI Machine Learning repository, we generate datasets for our experimentation with increasing amounts of data Missing Completely At Random (MCAR), both at the attribute level and at the record level. We then apply four classification algorithms: C4.5, Random Forest, Naïve Bayes and Support Vector Machines (SVMs). We measure the performance of each classifier under complete case analysis and simple imputation, and then study the performance of the algorithms that can handle missing data themselves. We find that complete case analysis has a detrimental effect because it renders many datasets infeasible when missing data increases, particularly for high dimensional data. We find that increasing missing data does have a negative effect on the performance of all the algorithms tested, but the algorithms, whether combined with simple imputation as preprocessing or handling the missing data themselves, do not differ significantly in performance
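The experimental setup in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' procedure: values are removed independently per cell (one simple reading of MCAR), "simple imputation" is taken to mean column-mean imputation, and all function names are hypothetical.

```python
import random

def inject_mcar(rows, rate, seed=0):
    """Replace each cell with None independently with probability `rate` (cell-wise MCAR)."""
    rng = random.Random(seed)
    return [[None if rng.random() < rate else v for v in row] for row in rows]

def complete_cases(rows):
    """Complete case analysis: keep only rows with no missing value."""
    return [row for row in rows if all(v is not None for v in row)]

def mean_impute(rows):
    """Simple imputation: fill each missing cell with its column's observed mean."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        # Falling back to 0.0 for a fully missing column is an arbitrary choice.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in rows]

# Hypothetical numeric dataset: 200 records with 50 attributes each.
data = [[float(i + j) for j in range(50)] for i in range(200)]
damaged = inject_mcar(data, rate=0.1)
print(len(complete_cases(damaged)), len(mean_impute(damaged)))
```

Under this cell-wise assumption, a record with d attributes stays complete with probability (1 - p)^d, so at p = 0.1 and d = 50 only about 0.5% of records are expected to survive complete case analysis. That is consistent with the abstract's remark that the approach becomes infeasible for high dimensional data.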

    Around the Hosszú-Gluskin theorem for $n$-ary groups

    We survey results related to the important Hosszú-Gluskin Theorem on $n$-ary groups, adding several new results and comments. The aim of this paper is to present all such results in a uniform and compact form. Therefore, some proofs of new results are only sketched, or omitted where completing them should not be too difficult for readers. In particular, we show how the Hosszú-Gluskin Theorem can be used to determine how many different $n$-ary groups (up to isomorphism) exist on some small sets. Moreover, we sketch how the theorem can also be used to investigate $\mathcal{Q}$-independent subsets of semiabelian $n$-ary groups for some special families $\mathcal{Q}$ of mappings

    Co-optation & Clientelism: Nested Distributive Politics in China’s Single-Party Dictatorship

    What explains the persistent growth of public employment in reform-era China despite repeated and forceful downsizing campaigns? Why do some provinces retain more public employees and experience higher rates of bureaucratic expansion than others? Among electoral regimes, the creation and distribution of public jobs is typically attributed to the politics of vote buying and multi-party competition. Electoral factors, however, cannot explain the patterns observed in China’s single-party dictatorship. This study highlights two nested factors that influence public employment in China: party co-optation and personal clientelism. As a collective body, the ruling party seeks to co-opt restive ethnic minorities by expanding cadre recruitment in hinterland provinces. Within the party, individual elites seek to expand their own networks of power by appointing clients to office. The central government’s professed objective of streamlining bureaucracy is in conflict with the party’s co-optation goal and individual elites’ clientelist interest. As a result, the size of public employment has inflated during the reform period despite top-down mandates to downsize bureaucracy. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/116599/1/Ang, Cooptation & Clientelism, posted 2016-01.pdf

    The historical origins of corruption in the developing world: a comparative analysis of East Asia

    A new approach has emerged in the literature on corruption in the developing world that breaks with the assumption that corruption is driven by individualistic self-interest and, instead, conceptualizes corruption as an informal system of norms and practices. While this emerging neo-institutionalist approach has done much to further our understanding of corruption in the developing world, one key question has received relatively little attention: how do we explain differences in the institutionalization of corruption between developing countries? This paper addresses the question through a systematic comparison of seven developing and newly industrialized countries in East Asia. The argument that emerges from this analysis is that historical sequencing mattered: countries in which the "political marketplace" had gone through a process of concentration before universal suffrage was introduced are now marked by less harmful types of corruption than countries where mass voting rights were rolled out in a context of fragmented political marketplaces. The paper concludes by demonstrating that this argument can be generalized to the developing world as a whole

    Citizenship Norms in Eastern Europe

    Research on Eastern Europe stresses the weakness of its civil society and the lack of political and social involvement, neglecting the question: What do people themselves think it means to be a good citizen? This study looks at citizens’ definitions of good citizenship in Poland, Slovenia, the Czech Republic and Hungary, using 2002 European Social Survey data. We investigate mean levels of civic mindedness in these countries and perform regression analyses to investigate whether factors traditionally associated with civic and political participation are also correlated with citizenship norms across Eastern Europe. We show that mean levels of civic mindedness differ significantly across the four Eastern European countries. We find some support for theories on civic and political participation when explaining norms of citizenship, but also demonstrate that individual-level characteristics are differently related to citizenship norms across the countries of our study. Hence, our findings show that Eastern Europe is not a monolithic and homogeneous bloc, underscoring the importance of taking the specificities of countries into account

    State owned enterprises as bribe payers: the role of institutional environment

    Our paper draws attention to a neglected channel of corruption—the bribe payments by state-owned enterprises (SOEs). This is an important phenomenon as bribe payments by SOEs fruitlessly waste national resources, compromising public welfare and national prosperity. Using a large dataset of 30,249 firms from 50 countries, we show that, in general, SOEs are less likely to pay bribes for achieving organizational objectives owing to their political connectivity. However, in deteriorated institutional environments, SOEs may be subjected to potential managerial rent-seeking behaviors, which disproportionately increase SOE bribe propensity relative to privately owned enterprises. Specifically, our findings highlight the importance of fostering democracy and rule of law, reducing prevalence of corruption and shortening power distance in reducing the incidence of SOE bribery