986,361 research outputs found

    Mining Measured Information from Text

    Full text link
    We present an approach to extract measured information from text (e.g., a 1370 degrees C melting point, a BMI greater than 29.9 kg/m^2 ). Such extractions are critically important across a wide range of domains - especially those involving search and exploration of scientific and technical documents. We first propose a rule-based entity extractor to mine measured quantities (i.e., a numeric value paired with a measurement unit), which supports a vast and comprehensive set of both common and obscure measurement units. Our method is highly robust and can correctly recover valid measured quantities even when significant errors are introduced through the process of converting document formats like PDF to plain text. Next, we describe an approach to extracting the properties being measured (e.g., the property "pixel pitch" in the phrase "a pixel pitch as high as 352 {\mu}m"). Finally, we present MQSearch: the realization of a search engine with full support for measured information.Comment: 4 pages; 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '15

    Web Content Mining for Information on Information Scientists

    Get PDF
    This paper presents a search system for information on scientists which was implemented prototypically for the area of information science, employing Web Content Mining techniques. The sources that are used in the implemented approach are online publication services and personal homepages of scientists. The system contains wrappers for querying the publication services and information extraction from their result pages, as well as methods for information extraction from homepages, which are based on heuristics concerning structure and composition of the pages. Moreover a specialised search technique for searching for personal homepages of information scientists was developed

    The contribution of data mining to information science

    Get PDF
    The information explosion is a serious challenge for current information institutions. On the other hand, data mining, which is the search for valuable information in large volumes of data, is one of the solutions to face this challenge. In the past several years, data mining has made a significant contribution to the field of information science. This paper examines the impact of data mining by reviewing existing applications, including personalized environments, electronic commerce, and search engines. For these three types of application, how data mining can enhance their functions is discussed. The reader of this paper is expected to get an overview of the state of the art research associated with these applications. Furthermore, we identify the limitations of current work and raise several directions for future research

    Semantic process mining tools: core building blocks

    Get PDF
    Process mining aims at discovering new knowledge based on information hidden in event logs. Two important enablers for such analysis are powerful process mining techniques and the omnipresence of event logs in today's information systems. Most information systems supporting (structured) business processes (e.g. ERP, CRM, and workflow systems) record events in some form (e.g. transaction logs, audit trails, and database tables). Process mining techniques use event logs for all kinds of analysis, e.g., auditing, performance analysis, process discovery, etc. Although current process mining techniques/tools are quite mature, the analysis they support is somewhat limited because it is purely based on labels in logs. This means that these techniques cannot benefit from the actual semantics behind these labels which could cater for more accurate and robust analysis techniques. Existing analysis techniques are purely syntax oriented, i.e., much time is spent on filtering, translating, interpreting, and modifying event logs given a particular question. This paper presents the core building blocks necessary to enable semantic process mining techniques/tools. Although the approach is highly generic, we focus on a particular process mining technique and show how this technique can be extended and implemented in the ProM framework tool
    • …
    corecore