
    Supporting text mining for e-Science: the challenges for Grid-enabled natural language processing

    Over the last few years, language technology has moved rapidly from 'applied research' to 'engineering', and from small-scale to large-scale engineering. Applications such as advanced text mining systems are feasible, but very resource-intensive, while research seeking to address the underlying language processing questions faces very real practical and methodological limitations. The e-Science vision, and the creation of the e-Science Grid, promises the level of integrated large-scale technological support required to sustain this important and successful new technology area. In this paper, we discuss the foundations for the deployment of text mining and other language technology on the Grid - the protocols and tools required to build distributed large-scale language technology systems, meeting the needs of users, application builders and researchers.

    Using compression to identify acronyms in text

    Text mining is about looking for patterns in natural language text, and may be defined as the process of analyzing text to extract information from it for particular purposes. In previous work, we claimed that compression is a key technology for text mining, and backed this up with a study that showed how particular kinds of lexical tokens (names, dates, locations, etc.) can be identified and located in running text, using compression models to provide the leverage necessary to distinguish different token types (Witten et al., 1999). Comment: 10 pages. A short form published in DCC200
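The idea of using compression models to tell token types apart can be sketched as follows. This is a minimal illustration, not the authors' method: the abstract does not say which compression model is used, so `zlib` is swapped in as a stand-in, and the training samples and the names `TRAINING`, `extra_bytes`, and `classify_token` are hypothetical. A token is assigned to the type whose sample text compresses it most cheaply.

```python
import zlib

# Hypothetical training samples, one per token type (not taken from the paper).
TRAINING = {
    "date": "12 March 1998, 4 July 1976, 23 October 2001, 1 January 2000, 15 August 1947",
    "name": "Alice Johnson, Robert Smith, Maria Garcia, David Lee, Susan Clark",
}


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def extra_bytes(token: str, sample: str) -> int:
    """Bytes added to the compressed sample when the token is appended to it."""
    return compressed_size(sample + ", " + token) - compressed_size(sample)


def classify_token(token: str) -> str:
    """Return the token type whose sample makes the token cheapest to encode."""
    return min(TRAINING, key=lambda label: extra_bytes(token, TRAINING[label]))


if __name__ == "__main__":
    for token in ["4 July 1976", "Maria Garcia"]:
        print(token, "->", classify_token(token))
```

With samples this small, only tokens that closely resemble the training text are separated reliably; a realistic setup would need much larger samples and a model better suited to short strings.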

    Identification of Text Mining Use Cases In Manufacturing Companies

    Manufacturing companies face the challenge of managing vast amounts of unstructured data generated by various sources such as social media, customer feedback, product reviews, and supplier data. Text-mining technology, a branch of data mining and natural language processing, provides a way to extract valuable insights from unstructured data, enabling manufacturing companies to make informed decisions and improve their processes. Despite the potential benefits of text mining technology, many manufacturing companies struggle to implement use cases for various reasons. Therefore, the project VoBAKI (IGF-Project No.: 22009 N) aims to enable manufacturing companies to identify and implement text mining use cases in their production and decision-making processes. The paper presents an analysis of text mining use cases in manufacturing companies using Mayring's content analysis and case study research. The study aims to explore how text mining technology can be used effectively to improve production processes and decision-making in manufacturing companies.

    TEXT MINING TECHNOLOGY TO SUPPORT ENTERPRISE KNOWLEDGE MANAGEMENT

    A successful, flexible enterprise must have a knowledge-based organization. In an era characterized by change, globalization and competition, knowledge is without doubt the most important asset for a company to gain a competitive advantage. Nowadays, enterprises hold a huge amount of unstructured information, especially in textual documents. Text mining technology, within the knowledge management platform, is the most important tool for continually managing this information. Keywords: knowledge management, text mining, unstructured information, enterprise information system.

    Application of Biomedical Text Mining

    The enormous and rapidly growing volume of biological literature, driven by the high rate of new publications, is one of the most common motivations for biomedical text mining. Processing this massive literature makes it possible to extract more biological information and to mine biomedical knowledge. This information helps in understanding the mechanisms of disease generation and promotes the development of disease diagnosis technology and of new drugs in biomedical research. Against this background, this chapter introduces the rise of biomedical text mining. It then describes biomedical text-mining technology, namely natural language processing, and its several components. The chapter emphasizes two aspects of biomedical text mining, static biomedical information recognition and dynamic biomedical information extraction, using instance analysis from our previous works. The aim is to give researchers a quick way to understand biomedical text mining.

    Consumer-Oriented Tech Mining: Integrating the Consumer Perspective into Organizational Technology Intelligence - The Case of Autonomous Driving

    To avoid missing technological opportunities and to counteract risks, organizations have to scan and monitor developments in the external environment through a structured process of technology intelligence. Previous approaches in tech mining (the application of text mining for technology intelligence) have primarily focused on the elicitation of technical or legal information from web, patent, or research databases. However, knowledge of consumers’ needs, fears, and hopes is a prerequisite for the success of an emerging technology in the marketplace. Thus, we claim that technology intelligence needs to also consider consumers’ technology perceptions. Hence, we propose a novel and comprehensive approach to collect user-generated content from the web and apply text mining to derive consumer perceptions. In doing so, we align with an established tech-mining process. This paper illustrates our approach on the emerging technology of autonomous driving and provides an initial indication of concurrent validity.

    KACST Arabic Text Classification Project: Overview and Preliminary Results

    Electronically formatted Arabic free-texts can be found in abundance these days on the World Wide Web, often linked to commercial enterprises and/or government organizations. Vast tracts of knowledge and relations lie hidden within these texts, knowledge that can be exploited once the correct intelligent tools have been identified and applied. For example, text mining may help with text classification and categorization. Text classification aims to automatically assign text to a predefined category based on identifiable linguistic features. Such a process has various useful applications including, but not restricted to, email spam detection, web page content filtering, and automatic message routing. This paper gives an overview of the King Abdulaziz City for Science and Technology (KACST) Arabic Text Classification Project along with some preliminary results. This project will contribute to the better understanding and elaboration of Arabic text classification techniques.
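Assigning text to a predefined category from word-level features can be sketched with a small multinomial naive Bayes classifier. This is an illustration only: the abstract does not name the techniques used in the KACST project, so naive Bayes is swapped in as one common choice, and the English spam/ham examples and the class name `NaiveBayes` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenization; real systems use richer features."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, docs):
        for text, label in docs:
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus smoothed log likelihood of each word.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training set (not data from the project).
TRAIN = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]

if __name__ == "__main__":
    model = NaiveBayes()
    model.fit(TRAIN)
    print(model.predict("free prize money"))
    print(model.predict("project meeting on monday"))
```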

    Comparative Analysis of Indonesian Text Mining News Online Classification Using the K-Nearest Neighbor and Random Forest Algorithm

    The rapid development of internet technology has led to rapid growth in news media. Newspaper companies use internet technology to spread the latest news through online mass media. Hundreds of thousands of stories are written and published daily on online Indonesian news portals, making it difficult for readers to find the news topics they want to read. To make it easier for readers to find the news they are looking for, news needs to be classified into categories such as education, current news, finance, and sports. Classifying news into categories requires a text classification method, often referred to as text mining. Text mining is a data mining technique for processing text by computer to produce useful text analysis. In this study, two text classification methods, K-Nearest Neighbor and Random Forest, were compared with the aim of obtaining accuracy above 80%.
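The K-Nearest Neighbor side of such a comparison can be sketched as below. This is a minimal sketch under stated assumptions: the abstract does not describe the features, distance measure, or value of k, so bag-of-words vectors, cosine similarity, and k=3 are assumptions, and the English headlines in `TRAIN` are hypothetical stand-ins for Indonesian news data. Random Forest is omitted because it does not fit in a short standard-library example.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words vector as a word -> count mapping."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(train, text, k=3):
    """Majority label among the k training texts most similar to text."""
    query = vectorize(text)
    ranked = sorted(
        train,
        key=lambda item: cosine(query, vectorize(item[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy headlines (not data from the study).
TRAIN = [
    ("team wins football match", "sports"),
    ("striker scores in final match", "sports"),
    ("coach praises team after match", "sports"),
    ("bank raises interest rates", "finance"),
    ("stock market falls as rates rise", "finance"),
    ("central bank cuts interest rates", "finance"),
]

if __name__ == "__main__":
    print(knn_predict(TRAIN, "football team loses match"))
    print(knn_predict(TRAIN, "bank changes interest rates"))
```

Measuring accuracy for a comparison like the one described would additionally require a held-out test set, which is left out here.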