Instruct and Extract: Instruction Tuning for On-Demand Information Extraction
Large language models with instruction-following capabilities open the door
to a wider group of users. However, when it comes to information extraction - a
classic task in natural language processing - most task-specific systems cannot
align well with long-tail ad hoc extraction use cases for non-expert users. To
address this, we propose a novel paradigm, termed On-Demand Information
Extraction, to fulfill the personalized demands of real-world users. Our task
aims to follow the instructions to extract the desired content from the
associated text and present it in a structured tabular format. The table
headers can either be user-specified or inferred contextually by the model. To
facilitate research in this emerging area, we present a benchmark named
InstructIE, comprising both automatically generated training data and a
human-annotated test set. Building on InstructIE, we further develop an
On-Demand Information Extractor, ODIE. Comprehensive evaluations on our
benchmark reveal that ODIE substantially outperforms the existing open-source
models of similar size. Our code and dataset are released on
https://github.com/yzjiao/On-Demand-IE.
Comment: EMNLP 202
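The on-demand task described above maps an instruction plus source text to a structured table whose headers are either user-specified or inferred. A minimal sketch of that input/output shape is below; the instruction, example text, and extracted rows are invented for illustration and are not taken from InstructIE, and the rows are hand-written rather than produced by a model.

```python
# Hypothetical illustration of the on-demand IE task format: an instruction
# plus associated text yields a table. All data here is invented.

def to_table(headers, rows):
    """Render extracted rows as a simple pipe-delimited table."""
    lines = [" | ".join(headers)]
    for row in rows:
        lines.append(" | ".join(row.get(h, "") for h in headers))
    return "\n".join(lines)

# A user-specified instruction with explicit table headers (assumed example):
instruction = "Extract each product and its price from the text."
text = "The Widget costs $5 and the Gadget costs $8."

# Output a system like ODIE would be expected to produce (hand-written here):
headers = ["Product", "Price"]
rows = [{"Product": "Widget", "Price": "$5"},
        {"Product": "Gadget", "Price": "$8"}]

print(to_table(headers, rows))
```

The tabular rendering is the key point: unlike classic slot-filling systems with fixed schemas, the header set itself is part of the per-request specification.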
Remote Sensing of Snow Fields from Earth Satellites
Considerable effort has gone into snow line delineation using available satellite data. Furthermore, increasing emphasis is being put on automated extraction of such information and generation of a useable product for hydrologists. The implications are clear that the impact from future satellite and sensor systems will create an increased demand for computer processing before the data can be used by the hydrologist. If the coarse-resolution, broad spectral band data available from current satellites already create a demand by hydrologists for computer processing of the data, it is obvious there will be an even greater demand for computer analysis and evaluation when the future ERTS data become available.
Optimisation using Natural Language Processing: Personalized Tour Recommendation for Museums
This paper proposes a new method to provide personalized tour recommendation
for museum visits. It combines an optimization of preference criteria of
visitors with an automatic extraction of artwork importance from museum
information based on Natural Language Processing using textual energy. This
project includes researchers from computer and social sciences. Numerical
experiments show that our model clearly improves the satisfaction of visitors
who follow the proposed tour. This work foreshadows interesting outcomes and
applications for on-demand personalized museum visits in the very near future.
Comment: 8 pages, 4 figures; Proceedings of the 2014 Federated Conference on
Computer Science and Information Systems pp. 439-44
Regional price targets appropriate for advanced coal extraction
A methodology is presented for predicting coal prices in regional markets for the target time frames 1985 and 2000 that could subsequently be used to guide the development of an advanced coal extraction system. The model constructed is a supply and demand model that focuses on underground mining, since the advanced technology is expected to be developed for these reserves by the target years. Coal reserve data and the cost of operating a mine are used to obtain the minimum acceptable selling price that would induce the producer to bring the mine into production. Based on this information, market supply curves can be generated. Demand by region is calculated based on an EEA methodology that emphasizes demand by electric utilities and demand by industry. The demand and supply curves are then used to obtain the price targets. The results show a growth in the size of the markets for compliance and low-sulphur coal regions. A significant rise in the real price of coal is not expected even by the year 2000. The model predicts heavy reliance on mines with thick seams, larger block sizes and deep overburden.
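The pricing logic above can be sketched numerically: each mine enters the supply curve at its minimum acceptable selling price, and the price target is set by the marginal mine needed to meet regional demand. The sketch below uses invented toy numbers and a deliberately simplified price-inelastic demand; the paper's actual curves come from coal reserve data and mine operating costs.

```python
# Minimal sketch of the supply/demand price-target idea. All numbers are
# invented; demand is treated as price-inelastic for simplicity.

def clearing_price(supply_steps, demand):
    """supply_steps: list of (min_acceptable_price, capacity) pairs.
    demand: total quantity demanded.
    Returns the minimum acceptable price of the marginal mine that
    must be brought into production to satisfy demand."""
    supplied = 0.0
    for price, capacity in sorted(supply_steps):
        supplied += capacity
        if supplied >= demand:
            return price
    raise ValueError("demand exceeds total supply")

# Toy regional market: three mine blocks with rising minimum acceptable prices.
steps = [(20.0, 50.0), (28.0, 40.0), (35.0, 30.0)]  # ($/ton, capacity)
print(clearing_price(steps, demand=80.0))  # -> 28.0: the second block is marginal
```

Stacking mines in order of minimum acceptable price yields the step-shaped market supply curve described in the abstract; intersecting it with regional demand gives the price target.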
The iCrawl Wizard -- Supporting Interactive Focused Crawl Specification
Collections of Web documents about specific topics are needed for many areas
of current research. Focused crawling enables the creation of such collections
on demand. Current focused crawlers require the user to manually specify
starting points for the crawl (seed URLs). These are also used to describe the
expected topic of the collection. The choice of seed URLs influences the
quality of the resulting collection and requires a lot of expertise. In this
demonstration we present the iCrawl Wizard, a tool that assists users in
defining focused crawls efficiently and semi-automatically. Our tool uses major
search engines and Social Media APIs as well as information extraction
techniques to find seed URLs and a semantic description of the crawl intent.
Using the iCrawl Wizard even non-expert users can create semantic
specifications for focused crawlers interactively and efficiently.
Comment: Published in the Proceedings of the European Conference on
Information Retrieval (ECIR) 201
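One way to picture the seed-selection step described above is as ranking candidate URLs by how well their result snippets match the user's topic terms. The scoring function, candidate data, and URLs below are invented for illustration; the actual iCrawl Wizard combines search-engine and Social Media APIs with information extraction, not this toy overlap score.

```python
# Hypothetical sketch of ranking crawl seed candidates by topic relevance.
# Candidates and scoring are invented, not the tool's actual method.

def rank_seeds(candidates, topic_terms):
    """candidates: list of (url, snippet) pairs.
    Returns URLs sorted by how many topic terms appear in the snippet."""
    def score(snippet):
        words = set(snippet.lower().split())
        return sum(1 for t in topic_terms if t.lower() in words)
    return [url for url, snip in sorted(candidates, key=lambda c: -score(c[1]))]

candidates = [
    ("http://example.org/football", "football league results and fixtures"),
    ("http://example.org/cooking", "recipes and cooking tips"),
]
print(rank_seeds(candidates, ["football", "league"]))
```

The point of the Wizard is that this candidate-gathering and ranking happens behind an interactive interface, so non-expert users never hand-pick seed URLs themselves.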
Development of an ontology for aerospace engine components degradation in service
This paper presents the development of an ontology for component service degradation. In this paper, degradation mechanisms in gas turbine metallic components are used for a case study to explain how a taxonomy within an ontology can be validated. The validation method used in this paper uses an iterative process and sanity checks. Data extracted from on-demand textual information are filtered and grouped into classes of degradation mechanisms. Various concepts are systematically and hierarchically arranged for use in the service maintenance ontology. The allocation of the mechanisms to the AS-IS ontology presents a robust data collection hub. Data integrity is guaranteed when the TO-BE ontology is introduced to analyse processes relative to various failure events. The initial evaluation reveals improvement in the performance of the TO-BE domain ontology based on iterations and updates with recognised mechanisms. The information extracted and collected is required to improve service knowledge and performance feedback, which are important for service engineers. Existing research areas such as natural language processing, knowledge management, and information extraction were also examined.
Observation of strongly entangled photon pairs from a nanowire quantum dot
A bright photon source that combines high-fidelity entanglement, on-demand
generation, high extraction efficiency, directional and coherent emission, as
well as position control at the nanoscale is required for implementing
ambitious schemes in quantum information processing, such as that of a quantum
repeater. Still, all of these properties have not yet been achieved in a single
device. Semiconductor quantum dots embedded in nanowire waveguides potentially
satisfy all of these requirements; however, although theoretically predicted,
entanglement has not yet been demonstrated for a nanowire quantum dot. Here, we
demonstrate a bright and coherent source of strongly entangled photon pairs
from a position controlled nanowire quantum dot with a fidelity as high as
0.859 +/- 0.006 and concurrence of 0.80 +/- 0.02. The two-photon quantum state
is modified via the nanowire shape. Our new nanoscale entangled photon source
can be integrated at desired positions in a quantum photonic circuit, single
electron devices and light emitting diodes.
Comment: Article and Supplementary Information with open access published at:
http://www.nature.com/ncomms/2014/141031/ncomms6298/full/ncomms6298.htm
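The concurrence value quoted above (0.80 +/- 0.02) can be read against Wootters' standard definition for a two-qubit state, included here as background rather than as the paper's own derivation:

```latex
% Wootters' concurrence for a two-qubit density matrix \rho
% (standard background for the reported value C = 0.80 \pm 0.02):
C(\rho) = \max\{0,\ \lambda_1 - \lambda_2 - \lambda_3 - \lambda_4\},
\qquad
\tilde\rho = (\sigma_y \otimes \sigma_y)\,\rho^{*}\,(\sigma_y \otimes \sigma_y),
```

where the $\lambda_i$ are the square roots of the eigenvalues of $\rho\tilde\rho$ in decreasing order; $C = 1$ for a maximally entangled pair and $C = 0$ for a separable state, so 0.80 indicates strong entanglement.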