5,279 research outputs found

    Datamining for Web-Enabled Electronic Business Applications

    Get PDF
    Web-Enabled Electronic Business is generating massive amount of data on customer purchases, browsing patterns, usage times and preferences at an increasing rate. Data mining techniques can be applied to all the data being collected for obtaining useful information. This chapter attempts to present issues associated with data mining for web-enabled electronic-business

    A Frame Work for Text Mining using Learned Information Extraction System

    Get PDF
    Text mining is a very exciting research area as it tries to discover knowledge from unstructured texts These texts can be found on a computer desktop intranets and the internet The aim of this paper is to give an overview of text mining in the contexts of its techniques application domains and the most challenging issue The Learned Information Extraction LIE is about locating specific items in natural-language documents This paper presents a framework for text mining called DTEX Discovery Text Extraction using a learned information extraction system to transform text into more structured data which is then mined for interesting relationships The initial version of DTEX integrates an LIE module acquired by an LIE learning system and a standard rule induction module In addition rules mined from a database extracted from a corpus of texts are used to predict additional information to extract from future documents thereby improving the recall of the underlying extraction system Applying these techniques best results are presented to a corpus of computer job announcement postings from an Internet newsgrou

    Ask, and shall you receive?: Understanding Desire Fulfillment in Natural Language Text

    Full text link
    The ability to comprehend wishes or desires and their fulfillment is important to Natural Language Understanding. This paper introduces the task of identifying if a desire expressed by a subject in a given short piece of text was fulfilled. We propose various unstructured and structured models that capture fulfillment cues such as the subject's emotional state and actions. Our experiments with two different datasets demonstrate the importance of understanding the narrative and discourse structure to address this task

    Knowledge mining of unstructured information: application to cyber-domain

    Full text link
    Information on cyber-related crimes, incidents, and conflicts is abundantly available in numerous open online sources. However, processing the large volumes and streams of data is a challenging task for the analysts and experts, and entails the need for newer methods and techniques. In this article we present and implement a novel knowledge graph and knowledge mining framework for extracting the relevant information from free-form text about incidents in the cyberdomain. The framework includes a machine learning based pipeline for generating graphs of organizations, countries, industries, products and attackers with a non-technical cyber-ontology. The extracted knowledge graph is utilized to estimate the incidence of cyberattacks on a given graph configuration. We use publicly available collections of real cyber-incident reports to test the efficacy of our methods. The knowledge extraction is found to be sufficiently accurate, and the graph-based threat estimation demonstrates a level of correlation with the actual records of attacks. In practical use, an analyst utilizing the presented framework can infer additional information from the current cyber-landscape in terms of risk to various entities and propagation of the risk heuristic between industries and countries

    Automated retrieval and extraction of training course information from unstructured web pages

    Get PDF
    Web Information Extraction (WIE) is the discipline dealing with the discovery, processing and extraction of specific pieces of information from semi-structured or unstructured web pages. The World Wide Web comprises billions of web pages and there is much need for systems that will locate, extract and integrate the acquired knowledge into organisations practices. There are some commercial, automated web extraction software packages, however their success comes from heavily involving their users in the process of finding the relevant web pages, preparing the system to recognise items of interest on these pages and manually dealing with the evaluation and storage of the extracted results. This research has explored WIE, specifically with regard to the automation of the extraction and validation of online training information. The work also includes research and development in the area of automated Web Information Retrieval (WIR), more specifically in Web Searching (or Crawling) and Web Classification. Different technologies were considered, however after much consideration, Naïve Bayes Networks were chosen as the most suitable for the development of the classification system. The extraction part of the system used Genetic Programming (GP) for the generation of web extraction solutions. Specifically, GP was used to evolve Regular Expressions, which were then used to extract specific training course information from the web such as: course names, prices, dates and locations. The experimental results indicate that all three aspects of this research perform very well, with the Web Crawler outperforming existing crawling systems, the Web Classifier performing with an accuracy of over 95% and a precision of over 98%, and the Web Extractor achieving an accuracy of over 94% for the extraction of course titles and an accuracy of just under 67% for the extraction of other course attributes such as dates, prices and locations. Furthermore, the overall work is of great significance to the sponsoring company, as it simplifies and improves the existing time-consuming, labour-intensive and error-prone manual techniques, as will be discussed in this thesis. The prototype developed in this research works in the background and requires very little, often no, human assistance

    Powerful tool to expand business intelligence: Text mining

    Get PDF
    With the extensive inclusion of document, especially text, in the business systems, data mining does not cover the full scope of Business Intelligence. Data mining cannot deliver its impact on extracting useful details from the large collection of unstructured and semi-structured written materials based on natural languages. The most pressing issue is to draw the potential business intelligence from text. In order to gain competitive advantages for the business, it is necessary to develop the new powerful tool, text mining, to expand the scope of business intelligence. In this paper, we will work out the strong points of text mining in extracting business intelligence from huge amount of textual information sources within business systems. We will apply text mining to each stage of Business Intelligence systems to prove that text mining is the powerful tool to expand the scope of BI. After reviewing basic definitions and some related technologies, we will discuss the relationship and the benefits of these to text mining. Some examples and applications of text mining will also be given. The motivation behind is to develop new approach to effective and efficient textual information analysis. Thus we can expand the scope of Business Intelligence using the powerful tool, text mining
    • …
    corecore