    Machine Learning in Automated Text Categorization

    The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (in which domain experts define a classifier manually) are very good effectiveness, considerable savings in terms of expert manpower, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We discuss in detail issues pertaining to three different problems, namely document representation, classifier construction, and classifier evaluation. Comment: Accepted for publication in ACM Computing Surveys
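The inductive process described in this abstract (learning category characteristics from preclassified documents) can be illustrated with a minimal sketch. It uses a multinomial Naive Bayes classifier with add-one smoothing, chosen here only as a simple, well-known instance of the machine learning approach; the abstract does not single out this method, and the function names `train` and `classify` and the toy training set are illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """Learn per-category word counts from (text, category) pairs."""
    word_counts = defaultdict(Counter)  # category -> word -> count
    doc_counts = Counter()              # category -> number of documents
    for text, category in labeled_docs:
        doc_counts[category] += 1
        word_counts[category].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the category with the highest log-posterior score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        total_words = sum(word_counts[category].values())
        # Log prior: share of training documents in this category.
        score = math.log(doc_counts[category] / total_docs)
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[category][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Toy preclassified documents (hypothetical data for illustration only).
training = [
    ("stocks fall as markets close", "finance"),
    ("bank raises interest rates", "finance"),
    ("team wins the final match", "sports"),
    ("striker scores in cup final", "sports"),
]
model = train(training)
print(classify("bank cuts interest rates", model))
```

The bag-of-words split on whitespace stands in for the document representation step, and `train` for classifier construction; a real evaluation would need held-out documents, which this sketch omits.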

    Brazilian Congress structural balance analysis

    In this work, we study the behavior of Brazilian politicians and political parties with the help of clustering algorithms for signed social networks. For this purpose, we extract and analyze a collection of signed networks representing voting sessions of the lower house of the Brazilian National Congress. We process all available voting data for the period between 2011 and 2016, using voting similarities between members of the Congress to define weighted signed links. The solutions obtained by solving Correlation Clustering (CC) problems are the basis for investigating deputies' voting networks as well as questions about loyalty, leadership, coalitions, political crisis, and social phenomena such as mediation and polarization. Comment: 27 pages, 15 tables, 6 figures; entire article was revised, new references added (including international press); correcting typing error
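As a rough illustration of how voting similarities can define weighted signed links, and of what a Correlation Clustering solution is scored on, here is a minimal sketch. The similarity formula (agreements minus disagreements, divided by shared votes) is an assumption, since the abstract does not state the authors' definition; `signed_links`, `imbalance`, and the toy vote data are hypothetical names and values.

```python
from itertools import combinations


def signed_links(votes):
    """Build weighted signed links from votes.

    votes maps member -> {session: "yes" | "no"}.
    Assumed weight: (agreements - disagreements) / shared sessions,
    so weights lie in [-1, 1].
    """
    links = {}
    for a, b in combinations(sorted(votes), 2):
        shared = votes[a].keys() & votes[b].keys()
        if not shared:
            continue  # no common session, no link
        agree = sum(votes[a][s] == votes[b][s] for s in shared)
        weight = (2 * agree - len(shared)) / len(shared)
        if weight != 0:
            links[(a, b)] = weight
    return links


def imbalance(links, cluster_of):
    """Correlation Clustering cost of a partition: total weight of
    negative links inside clusters plus positive links between clusters."""
    total = 0.0
    for (a, b), weight in links.items():
        same = cluster_of[a] == cluster_of[b]
        if (weight < 0 and same) or (weight > 0 and not same):
            total += abs(weight)
    return total


# Toy data (hypothetical): three members, four voting sessions.
votes = {
    "A": {1: "yes", 2: "yes", 3: "no", 4: "yes"},
    "B": {1: "yes", 2: "yes", 3: "no", 4: "no"},
    "C": {1: "no", 2: "no", 3: "yes", 4: "no"},
}
links = signed_links(votes)
print(links)
print(imbalance(links, {"A": 0, "B": 0, "C": 1}))
```

Solving the CC problem means searching for the partition that minimizes `imbalance`; the sketch only evaluates a given partition and does not implement that search.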

    Corporate venture capital, strategic alliances, and the governance of newly public firms

    We examine the effect of investments by corporate venture capitalists (CVCs) on the governance structures of venture-backed IPOs. One of the main differences between CVCs and traditional venture capitalists (TVCs) is that the former often invest for strategic reasons and enter into various types of strategic alliances with their portfolio firms that last well beyond the IPO. We argue that the presence of such strategic alliances will have a significant impact on the governance structure of CVC-backed firms when they go public and in the following years. Using a sample of venture-backed IPOs, we evaluate several hypotheses concerning the role of CVCs in the corporate governance of newly public firms. We find that strategic CVC-backed IPOs have weaker CEOs and more outsiders on the board and on the compensation committee than a carefully selected sample of matching firms. In addition, the probability of forced CEO turnover is higher for strategic CVC-backed IPOs, while at the same time these firms use staggered boards more frequently. In contrast, the governance structures of purely financial CVC-backed IPO firms and their matching firms do not exhibit any significant differences.

    Context and Keyword Extraction in Plain Text Using a Graph Representation

    Document indexing is an essential task performed by archivists or automatic indexing tools. To retrieve the documents relevant to a query, the keywords describing each document have to be carefully chosen. Archivists have to identify the right topic of a document before starting to extract the keywords. For an archivist indexing specialized documents, experience plays an important role, but indexing documents on different topics is much harder. This article proposes an innovative method for an indexing support system. This system takes as input an ontology and a plain text document and provides as output contextualized keywords of the document. The method has been evaluated by exploiting Wikipedia's category links as a termino-ontological resource.

    Relevance of Negative Links in Graph Partitioning: A Case Study Using Votes From the European Parliament

    In this paper, we study the informative value of negative links in signed complex networks. For this purpose, we extract and analyze a collection of signed networks representing voting sessions of the European Parliament (EP). We first process data collected by the VoteWatch Europe website for the whole 7th term (2009-2014), using voting similarities between Members of the EP to define weighted signed links. We then apply a selection of community detection algorithms, designed to process only positive links, to these data. We also apply Parallel Iterative Local Search (Parallel ILS), an algorithm recently proposed to identify balanced partitions in signed networks. Our results show that, contrary to the conclusions of a previous study focusing on other data, the partitions detected by ignoring or considering the negative links are indeed remarkably different for these networks. The relevance of negative links for graph partitioning therefore is an open question which should be further explored. Comment: in 2nd European Network Intelligence Conference (ENIC), Sep 2015, Karlskrona, Sweden