Instruct and Extract: Instruction Tuning for On-Demand Information Extraction
Large language models with instruction-following capabilities open the door
to a wider group of users. However, when it comes to information extraction - a
classic task in natural language processing - most task-specific systems cannot
align well with long-tail ad hoc extraction use cases for non-expert users. To
address this, we propose a novel paradigm, termed On-Demand Information
Extraction, to fulfill the personalized demands of real-world users. Our task
aims to follow the instructions to extract the desired content from the
associated text and present it in a structured tabular format. The table
headers can either be user-specified or inferred contextually by the model. To
facilitate research in this emerging area, we present a benchmark named
InstructIE, comprising both automatically generated training data and a
human-annotated test set. Building on InstructIE, we further develop an
On-Demand Information Extractor, ODIE. Comprehensive evaluations on our
benchmark reveal that ODIE substantially outperforms the existing open-source
models of similar size. Our code and dataset are released on
https://github.com/yzjiao/On-Demand-IE.
Comment: EMNLP 202
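The on-demand task described above maps an instruction plus source text to a structured table whose headers are either user-specified or inferred. A minimal sketch of that input/output shape is below; the instruction, example text, and extracted rows are invented for illustration and are not taken from InstructIE, and the rows are hand-written rather than produced by a model.

```python
# Hypothetical illustration of the on-demand IE task format: an instruction
# plus associated text yields a table. All data here is invented.

def to_table(headers, rows):
    """Render extracted rows as a simple pipe-delimited table."""
    lines = [" | ".join(headers)]
    for row in rows:
        lines.append(" | ".join(row.get(h, "") for h in headers))
    return "\n".join(lines)

# A user-specified instruction with explicit table headers (assumed example):
instruction = "Extract each product and its price from the text."
text = "The Widget costs $5 and the Gadget costs $8."

# Output a system like ODIE would be expected to produce (hand-written here):
headers = ["Product", "Price"]
rows = [{"Product": "Widget", "Price": "$5"},
        {"Product": "Gadget", "Price": "$8"}]

print(to_table(headers, rows))
```

The tabular rendering is the key point: unlike classic slot-filling systems with fixed schemas, the header set itself is part of the per-request specification.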
Remote Sensing of Snow Fields from Earth Satellites
Considerable effort has gone into snow line delineation using available satellite data. Furthermore, increasing emphasis is being put on automated extraction of such information and generation of a useable product for hydrologists. The implications are clear that the impact from future satellite and sensor systems will create an increased demand for computer processing before the data can be used by the hydrologist. If the coarse-resolution, broad spectral band data available from current satellites already create a demand by hydrologists for computer processing of the data, it is obvious there will be an even greater demand for computer analysis and evaluation when the future ERTS data become available.
Optimisation using Natural Language Processing: Personalized Tour Recommendation for Museums
This paper proposes a new method to provide personalized tour recommendation
for museum visits. It combines an optimization of preference criteria of
visitors with an automatic extraction of artwork importance from museum
information based on Natural Language Processing using textual energy. This
project includes researchers from computer and social sciences. Numerical
experiments show that our model clearly improves the satisfaction of visitors
who follow the proposed tour. This work foreshadows interesting outcomes and
applications for on-demand personalized museum visits in the very near future.
Comment: 8 pages, 4 figures; Proceedings of the 2014 Federated Conference on
Computer Science and Information Systems pp. 439-44
Regional price targets appropriate for advanced coal extraction
A methodology is presented for predicting coal prices in regional markets for the target time frames 1985 and 2000 that could subsequently be used to guide the development of an advanced coal extraction system. The model constructed is a supply and demand model that focuses on underground mining, since the advanced technology is expected to be developed for these reserves by the target years. Coal reserve data and the cost of operating a mine are used to obtain the minimum acceptable selling price that would induce the producer to bring the mine into production. Based on this information, market supply curves can be generated. Demand by region is calculated based on an EEA methodology that emphasizes demand by electric utilities and demand by industry. The demand and supply curves are then used to obtain the price targets. The results show a growth in the size of the markets for compliance and low-sulphur coal regions. A significant rise in the real price of coal is not expected even by the year 2000. The model predicts heavy reliance on mines with thick seams, larger block sizes and deep overburden.
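The pricing logic above can be sketched numerically: each mine enters the supply curve at its minimum acceptable selling price, and the price target is set by the marginal mine needed to meet regional demand. The sketch below uses invented toy numbers and a deliberately simplified price-inelastic demand; the paper's actual curves come from coal reserve data and mine operating costs.

```python
# Minimal sketch of the supply/demand price-target idea. All numbers are
# invented; demand is treated as price-inelastic for simplicity.

def clearing_price(supply_steps, demand):
    """supply_steps: list of (min_acceptable_price, capacity) pairs.
    demand: total quantity demanded.
    Returns the minimum acceptable price of the marginal mine that
    must be brought into production to satisfy demand."""
    supplied = 0.0
    for price, capacity in sorted(supply_steps):
        supplied += capacity
        if supplied >= demand:
            return price
    raise ValueError("demand exceeds total supply")

# Toy regional market: three mine blocks with rising minimum acceptable prices.
steps = [(20.0, 50.0), (28.0, 40.0), (35.0, 30.0)]  # ($/ton, capacity)
print(clearing_price(steps, demand=80.0))  # -> 28.0: the second block is marginal
```

Stacking mines in order of minimum acceptable price yields the step-shaped market supply curve described in the abstract; intersecting it with regional demand gives the price target.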
The iCrawl Wizard -- Supporting Interactive Focused Crawl Specification
Collections of Web documents about specific topics are needed for many areas
of current research. Focused crawling enables the creation of such collections
on demand. Current focused crawlers require the user to manually specify
starting points for the crawl (seed URLs). These are also used to describe the
expected topic of the collection. The choice of seed URLs influences the
quality of the resulting collection and requires a lot of expertise. In this
demonstration we present the iCrawl Wizard, a tool that assists users in
defining focused crawls efficiently and semi-automatically. Our tool uses major
search engines and Social Media APIs as well as information extraction
techniques to find seed URLs and a semantic description of the crawl intent.
Using the iCrawl Wizard even non-expert users can create semantic
specifications for focused crawlers interactively and efficiently.
Comment: Published in the Proceedings of the European Conference on
Information Retrieval (ECIR) 201
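One way to picture the seed-selection step described above is as ranking candidate URLs by how well their result snippets match the user's topic terms. The scoring function, candidate data, and URLs below are invented for illustration; the actual iCrawl Wizard combines search-engine and Social Media APIs with information extraction, not this toy overlap score.

```python
# Hypothetical sketch of ranking crawl seed candidates by topic relevance.
# Candidates and scoring are invented, not the tool's actual method.

def rank_seeds(candidates, topic_terms):
    """candidates: list of (url, snippet) pairs.
    Returns URLs sorted by how many topic terms appear in the snippet."""
    def score(snippet):
        words = set(snippet.lower().split())
        return sum(1 for t in topic_terms if t.lower() in words)
    return [url for url, snip in sorted(candidates, key=lambda c: -score(c[1]))]

candidates = [
    ("http://example.org/football", "football league results and fixtures"),
    ("http://example.org/cooking", "recipes and cooking tips"),
]
print(rank_seeds(candidates, ["football", "league"]))
```

The point of the Wizard is that this candidate-gathering and ranking happens behind an interactive interface, so non-expert users never hand-pick seed URLs themselves.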
Development of an ontology for aerospace engine components degradation in service
This paper presents the development of an ontology for component service degradation. In this paper, degradation mechanisms in gas turbine metallic components are used for a case study to explain how a taxonomy within an ontology can be validated. The validation method used in this paper uses an iterative process and sanity checks. Data extracted from on-demand textual information are filtered and grouped into classes of degradation mechanisms. Various concepts are systematically and hierarchically arranged for use in the service maintenance ontology. The allocation of the mechanisms to the AS-IS ontology presents a robust data collection hub. Data integrity is guaranteed when the TO-BE ontology is introduced to analyse processes relative to various failure events. The initial evaluation reveals improvement in the performance of the TO-BE domain ontology based on iterations and updates with recognised mechanisms. The information extracted and collected is required to improve service knowledge and performance feedback, which are important for service engineers. Existing research areas such as natural language processing, knowledge management, and information extraction were also examined.
Observation of strongly entangled photon pairs from a nanowire quantum dot
A bright photon source that combines high-fidelity entanglement, on-demand
generation, high extraction efficiency, directional and coherent emission, as
well as position control at the nanoscale is required for implementing
ambitious schemes in quantum information processing, such as that of a quantum
repeater. Still, all of these properties have not yet been achieved in a single
device. Semiconductor quantum dots embedded in nanowire waveguides potentially
satisfy all of these requirements; however, although theoretically predicted,
entanglement has not yet been demonstrated for a nanowire quantum dot. Here, we
demonstrate a bright and coherent source of strongly entangled photon pairs
from a position controlled nanowire quantum dot with a fidelity as high as
0.859 +/- 0.006 and concurrence of 0.80 +/- 0.02. The two-photon quantum state
is modified via the nanowire shape. Our new nanoscale entangled photon source
can be integrated at desired positions in a quantum photonic circuit, single
electron devices and light emitting diodes.
Comment: Article and Supplementary Information with open access published at:
http://www.nature.com/ncomms/2014/141031/ncomms6298/full/ncomms6298.htm
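The concurrence value quoted above (0.80 +/- 0.02) can be read against Wootters' standard definition for a two-qubit state, included here as background rather than as the paper's own derivation:

```latex
% Wootters' concurrence for a two-qubit density matrix \rho
% (standard background for the reported value C = 0.80 \pm 0.02):
C(\rho) = \max\{0,\ \lambda_1 - \lambda_2 - \lambda_3 - \lambda_4\},
\qquad
\tilde\rho = (\sigma_y \otimes \sigma_y)\,\rho^{*}\,(\sigma_y \otimes \sigma_y),
```

where the $\lambda_i$ are the square roots of the eigenvalues of $\rho\tilde\rho$ in decreasing order; $C = 1$ for a maximally entangled pair and $C = 0$ for a separable state, so 0.80 indicates strong entanglement.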