300,221 research outputs found

    Collaborative Systems – Finite State Machines

    Get PDF
    In this paper the finite state machines are defined and formalized. There are presented the collaborative banking systems and their correspondence is done with finite state machines. It highlights the role of finite state machines in the complexity analysis and performs operations on very large virtual databases as finite state machines. It builds the state diagram and presents the commands and documents transition between the collaborative systems states. The paper analyzes the data sets from Collaborative Multicash Servicedesk application and performs a combined analysis in order to determine certain statistics. Indicators are obtained, such as the number of requests by category and the load degree of an agent in the collaborative system.Collaborative System, Finite State Machine, Inputs, States, Outputs

    Automatic multi-label subject indexing in a multilingual environment

    Get PDF
    This paper presents an approach to automatically subject index fulltext documents with multiple labels based on binary support vector machines(SVM). The aim was to test the applicability of SVMs with a real world dataset. We have also explored the feasibility of incorporating multilingual background knowledge, as represented in thesauri or ontologies, into our text document representation for indexing purposes. The test set for our evaluations has been compiled from an extensive document base maintained by the Food and Agriculture Organization (FAO) of the United Nations (UN). Empirical results show that SVMs are a good method for automatic multi- label classification of documents in multiple languages

    Thumbs up? Sentiment Classification using Machine Learning Techniques

    Full text link
    We consider the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative. Using movie reviews as data, we find that standard machine learning techniques definitively outperform human-produced baselines. However, the three machine learning methods we employed (Naive Bayes, maximum entropy classification, and support vector machines) do not perform as well on sentiment classification as on traditional topic-based categorization. We conclude by examining factors that make the sentiment classification problem more challenging.Comment: To appear in EMNLP-200

    Adaptive content mapping for internet navigation

    Get PDF
    The Internet as the biggest human library ever assembled keeps on growing. Although all kinds of information carriers (e.g. audio/video/hybrid file formats) are available, text based documents dominate. It is estimated that about 80% of all information worldwide stored electronically exists in (or can be converted into) text form. More and more, all kinds of documents are generated by means of a text processing system and are therefore available electronically. Nowadays, many printed journals are also published online and may even discontinue to appear in print form tomorrow. This development has many convincing advantages: the documents are both available faster (cf. prepress services) and cheaper, they can be searched more easily, the physical storage only needs a fraction of the space previously necessary and the medium will not age. For most people, fast and easy access is the most interesting feature of the new age; computer-aided search for specific documents or Web pages becomes the basic tool for information-oriented work. But this tool has problems. The current keyword based search machines available on the Internet are not really appropriate for such a task; either there are (way) too many documents matching the specified keywords are presented or none at all. The problem lies in the fact that it is often very difficult to choose appropriate terms describing the desired topic in the first place. This contribution discusses the current state-of-the-art techniques in content-based searching (along with common visualization/browsing approaches) and proposes a particular adaptive solution for intuitive Internet document navigation, which not only enables the user to provide full texts instead of manually selected keywords (if available), but also allows him/her to explore the whole database

    Regulating social problems: the pokies, the Productivity Commission and an Aboriginal Community

    Get PDF
    Australia has 21 per cent of the world’s electronic gaming machines—more commonly known as poker machines. Deregulation of the industry has expanded the availability of gaming machines to an extent unprecedented in the western world. As a result there are estimated to be approximately 300,000 problem gamblers in Australia, an unknown number of whom are Indigenous Australians. This discussion paper documents the first successful Aboriginal use of regulation in order to prevent the installation of electronic gaming machines—a case that took place in South Australia in 1998. At around the same time, the Productivity Commission was conducting an inquiry into Australia’s gambling industries. This discussion paper, offered in part because of the dearth of published material on contemporary Indigenous gambling, discusses how the Productivity Commission dealt with Indigenous gambling and draws some conclusions from the South Australian case

    A preliminary approach to the multilabel classification problem of Portuguese juridical documents

    Get PDF
    Portuguese juridical documents from Supreme Courts and the Attorney General’s Office are manually classified by juridical experts into a set of classes belonging to a taxonomy of concepts. In this paper, a preliminary approach to develop techniques to automat- ically classify these juridical documents, is proposed. As basic strategy, the integration of natural language processing techniques with machine learning ones is used. Support Vector Machines (SVM) are used as learn- ing algorithm and the obtained results are presented and compared with other approaches, such as C4.5 and Naive Bayes

    Segmentation-free Word Spotting for Handwritten Arabic Documents

    Get PDF
    In this paper we present an unsupervised segmentation-free method for spotting and searching query, especially, for images documents in handwritten Arabic, for this, Histograms of Oriented Gradients (HOGs) are used as the feature vectors to represent the query and documents image. Then, we compress the descriptors with the product quantization method. Finally, a better representation of the query is obtained by using the Support Vector Machines (SVM)
    • 

    corecore