Machine Learning in Automated Text Categorization

Author: Fabrizio Sebastiani
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2001

The automated categorization (or classification) of texts into predefined categories has witnessed booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting of the manual definition of a classifier by domain experts) are very good effectiveness, considerable savings in terms of expert manpower, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely document representation, classifier construction, and classifier evaluation.

Comment: Accepted for publication in ACM Computing Surveys
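
The inductive process described in this abstract (learning a classifier from preclassified documents) can be illustrated with a minimal sketch. It assumes a bag-of-words representation and a multinomial Naive Bayes learner with add-one smoothing, which is only one of many learners such a survey might cover; the function names and the four toy training documents are hypothetical and do not come from the paper.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Bag-of-words representation: lowercase, split on whitespace.
    return text.lower().split()


def train(labeled_docs):
    """Learn per-category statistics from (text, category) pairs."""
    doc_counts = Counter()              # number of documents per category
    word_counts = defaultdict(Counter)  # term frequencies per category
    vocab = set()
    for text, category in labeled_docs:
        doc_counts[category] += 1
        for token in tokenize(text):
            word_counts[category][token] += 1
            vocab.add(token)
    return doc_counts, word_counts, vocab


def classify(model, text):
    """Return the category with the highest log-posterior score."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore terms never seen during training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[category][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical preclassified documents, for illustration only.
TRAINING = [
    ("wheat corn harvest exports", "grain"),
    ("corn prices rise after harvest", "grain"),
    ("bank raises interest rates", "finance"),
    ("interest rates and bank loans", "finance"),
]
model = train(TRAINING)
print(classify(model, "corn harvest"))
print(classify(model, "bank loans"))
```

Moving to a different domain only requires a different set of preclassified documents, not new hand-written rules, which is the portability advantage the abstract mentions.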

arXiv.org e-Print Archive

CiteSeerX

Crossref

Brazilian Congress structural balance analysis

Author: Frota Yuri
Levorato Mario
Publication date: 22/11/2016

In this work, we study the behavior of Brazilian politicians and political parties with the help of clustering algorithms for signed social networks. For this purpose, we extract and analyze a collection of signed networks representing voting sessions of the lower house of the Brazilian National Congress. We process all available voting data for the period between 2011 and 2016, considering voting similarities between members of the Congress to define weighted signed links. The solutions obtained by solving Correlation Clustering (CC) problems are the basis for investigating deputies' voting networks as well as questions about loyalty, leadership, coalitions, political crisis, and social phenomena such as mediation and polarization.

Comment: 27 pages, 15 tables, 6 figures; entire article was revised, new references added (including international press); correcting typing error
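
How voting similarities might be turned into weighted signed links can be sketched as follows. The vote encoding (+1 yes, -1 no, 0 absent or abstaining) and the weight (mean agreement over sessions in which both members voted) are assumptions made for illustration; the abstract does not give the authors' actual similarity measure, and the deputies and votes below are made up.

```python
from itertools import combinations


def signed_links(votes):
    """Build weighted signed links from per-member vote lists.

    votes maps a member name to a list of votes, one per session:
    +1 (yes), -1 (no), 0 (absent or abstaining). Assumed encoding.
    """
    links = {}
    for a, b in combinations(sorted(votes), 2):
        # Keep only the sessions in which both members cast a vote.
        shared = [(x, y) for x, y in zip(votes[a], votes[b]) if x != 0 and y != 0]
        if not shared:
            continue
        # Agreement contributes +1, disagreement -1; the mean lies in [-1, 1].
        weight = sum(x * y for x, y in shared) / len(shared)
        if weight != 0:
            links[(a, b)] = weight
    return links


# Hypothetical voting records over four sessions.
VOTES = {
    "A": [1, 1, -1, 1],
    "B": [1, 1, -1, 0],
    "C": [-1, -1, 1, 1],
}
links = signed_links(VOTES)
for pair, weight in links.items():
    print(pair, weight)
```

A positive weight marks a pair that mostly votes together and a negative weight a pair that mostly votes against each other, which is the kind of input a correlation clustering solver takes.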

arXiv.org e-Print Archive

Episciences.org

Corporate venture capital, strategic alliances, and the governance of newly public firms

Author: Ivanov Vladimir I.
Masulis Ronald W.

We examine the effect of investments by corporate venture capitalists (CVCs) on the governance structures of venture-backed IPOs. One of the main differences between CVCs and traditional venture capitalists (TVCs) is that the former often invest for strategic reasons and enter into various types of strategic alliances with their portfolio firms that last well beyond the IPO. We argue that the presence of such strategic alliances will have a significant impact on the governance structure of CVC-backed firms when they go public and in the following years. Using a sample of venture-backed IPOs, we evaluate several hypotheses concerning the role of CVCs in the corporate governance of newly public firms. We find that strategic CVC-backed IPOs have weaker CEOs and more outsiders on the board and on the compensation committee than a carefully selected sample of matching firms. In addition, the probability of forced CEO turnover is higher for strategic CVC-backed IPOs, while at the same time these firms use staggered boards more frequently. In contrast, the governance structures of purely financial CVC-backed IPO firms and their matching firms do not exhibit any significant differences.

Research Papers in Economics

Context and Keyword Extraction in Plain Text Using a Graph Representation

Author: Chahine Carlo Abi
Chaignaud Nathalie
Kotowicz Jean-Philippe
Pécuchet Jean-Pierre
Publication date: 30/11/2008

Document indexing is an essential task carried out by archivists or automatic indexing tools. To retrieve documents relevant to a query, the keywords describing each document have to be carefully chosen. Archivists have to identify the right topic of a document before starting to extract the keywords. For an archivist indexing specialized documents, experience plays an important role, but indexing documents on different topics is much harder. This article proposes an innovative method for an indexing support system. This system takes as input an ontology and a plain text document and provides as output contextualized keywords of the document. The method has been evaluated by exploiting Wikipedia's category links as a termino-ontological resource.

arXiv.org e-Print Archive

HAL - Normandie Université

Crossref

Developing the Intervention and Outcome Components of a Proposed Randomised Controlled Trial (RCT) of a National Screening Programme for Open Angle Glaucoma (OAG): Medical Research Council funded trial platform study (G0701759): Study protocol

Author: Azuara-Blanco Augusto
Burr Jennifer
Hernández Rodolfo Andrés
Ramsay Craig R
Vale Luke David
Publication date: 01/01/2008

Postprint

Aberdeen University Research

Relevance of Negative Links in Graph Partitioning: A Case Study Using Votes From the European Parliament

Author: Figueiredo Rosa
Labatut Vincent
Mendonça Israel
Michelon Philippe
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 21/07/2015

In this paper, we study the informative value of negative links in signed complex networks. For this purpose, we extract and analyze a collection of signed networks representing voting sessions of the European Parliament (EP). We first process data collected by the VoteWatch Europe website for the whole 7th term (2009-2014), considering voting similarities between Members of the EP to define weighted signed links. We then apply to these data a selection of community detection algorithms designed to process only positive links. We also apply Parallel Iterative Local Search (Parallel ILS), an algorithm recently proposed to identify balanced partitions in signed networks. Our results show that, contrary to the conclusions of a previous study focusing on other data, the partitions detected by ignoring or considering the negative links are indeed remarkably different for these networks. The relevance of negative links for graph partitioning is therefore an open question that should be further explored.

Comment: in 2nd European Network Intelligence Conference (ENIC), Sep 2015, Karlskrona, Sweden
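
What a "balanced" partition of a signed network means can be made concrete with a small sketch. It uses the usual correlation-clustering notion of imbalance (total weight of negative links inside clusters plus positive links between clusters). This is an assumption about the objective, not a description of Parallel ILS itself, and the graph and partitions below are hypothetical.

```python
def imbalance(links, cluster_of):
    """Total weight of links that contradict a partition.

    links maps a node pair (u, v) to a signed weight; cluster_of maps each
    node to a cluster label. A link is frustrated if it is negative inside
    a cluster or positive between two clusters.
    """
    total = 0.0
    for (u, v), weight in links.items():
        same_cluster = cluster_of[u] == cluster_of[v]
        if (same_cluster and weight < 0) or (not same_cluster and weight > 0):
            total += abs(weight)
    return total


# Hypothetical signed network with four nodes.
LINKS = {
    ("A", "B"): 1.0,
    ("A", "C"): -0.5,
    ("B", "C"): -1.0,
    ("C", "D"): 0.8,
}
split = {"A": 0, "B": 0, "C": 1, "D": 1}   # respects the negative links
merged = {"A": 0, "B": 0, "C": 0, "D": 0}  # one cluster, negatives ignored

print(imbalance(LINKS, split))
print(imbalance(LINKS, merged))
```

A method that sees only positive links cannot distinguish these two partitions by their negative links, which is why the comparison in the abstract hinges on whether those links are taken into account.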

arXiv.org e-Print Archive

Crossref