18,512 research outputs found
Embedding-based Scientific Literature Discovery in a Text Editor Application
Each claim in a research paper requires all relevant prior knowledge to be
discovered, assimilated, and appropriately cited. However, despite the
availability of powerful search engines and sophisticated text editing
software, discovering relevant papers and integrating the knowledge into a
manuscript remain complex tasks associated with high cognitive load. Defining
comprehensive search queries requires strong motivation from authors,
irrespective of their familiarity with the research field. Moreover, switching
between independent applications for literature discovery, bibliography
management, reading papers, and writing text burdens authors further and
interrupts their creative process. Here, we present a web application that
combines text editing and literature discovery in an interactive user
interface. The application is equipped with a search engine that couples
Boolean keyword filtering with nearest neighbor search over text embeddings,
providing a discovery experience tuned to an author's manuscript and their
interests. Our application aims to take a step towards more enjoyable and
effortless academic writing.
The demo of the application (https://SciEditorDemo2020.herokuapp.com/) and a
short video tutorial (https://youtu.be/pkdVU60IcRc) are available online.
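The hybrid retrieval described above (Boolean keyword filtering followed by nearest-neighbor search over text embeddings) can be sketched in a few lines. The function below is illustrative only, not the application's actual code, and plain Python lists stand in for real embedding vectors:

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def hybrid_search(query_vec, keywords, docs, doc_vecs, top_k=3):
    """Boolean keyword filter, then nearest-neighbor ranking by embedding."""
    # Stage 1: keep only documents that contain every keyword.
    candidates = [i for i, d in enumerate(docs)
                  if all(kw.lower() in d.lower() for kw in keywords)]
    # Stage 2: rank the survivors by similarity to the query embedding.
    ranked = sorted(candidates,
                    key=lambda i: cosine(doc_vecs[i], query_vec),
                    reverse=True)
    return [(docs[i], cosine(doc_vecs[i], query_vec)) for i in ranked[:top_k]]
```

In the application itself, `query_vec` would come from embedding the author's manuscript; the Boolean stage cheaply narrows the corpus so the vector comparison only runs over a small candidate set.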
SciLit: A Platform for Joint Scientific Literature Discovery, Summarization and Citation Generation
Scientific writing involves retrieving, summarizing, and citing relevant
papers, which can be time-consuming processes in large and rapidly evolving
fields. By making these processes interoperable, natural language processing
(NLP) provides opportunities for creating end-to-end assistive writing tools.
We propose SciLit, a pipeline that automatically recommends relevant papers,
extracts highlights, and suggests a reference sentence as a citation of a
paper, taking into consideration the user-provided context and keywords. SciLit
efficiently recommends papers from large databases of hundreds of millions of
papers using a two-stage pre-fetching and re-ranking literature search system
that flexibly deals with addition and removal of a paper database. We provide a
convenient user interface that displays the recommended papers as extractive
summaries and that offers abstractively-generated citing sentences which are
aligned with the provided context and which mention the chosen keyword(s). Our
assistive tool for literature discovery and scientific writing is available at
https://scilit.vercel.app
Comment: Accepted at ACL 2023 System Demonstrations
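The two-stage pre-fetching and re-ranking scheme can be illustrated as follows. The scoring functions here are simplified stand-ins (term overlap for pre-fetching, cosine similarity for re-ranking), not SciLit's actual components:

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def prefetch(query_terms, corpus, n=100):
    """Stage 1: a cheap term-overlap score over an entire database.
    Each database can run this step independently, which is why
    databases can be added or removed without touching the rest."""
    scored = [(sum(t in doc["text"].lower() for t in query_terms), i)
              for i, doc in enumerate(corpus)]
    scored.sort(reverse=True)
    return [i for score, i in scored[:n] if score > 0]

def rerank(query_vec, candidates, corpus, k=5):
    """Stage 2: a costlier scorer applied only to the small candidate set."""
    ranked = sorted(candidates,
                    key=lambda i: cosine(corpus[i]["vec"], query_vec),
                    reverse=True)
    return ranked[:k]
```

The point of the split is cost: the stage-1 scorer must be cheap enough to scan hundreds of millions of papers, while the stage-2 scorer only ever sees the top few hundred candidates.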
Unified Description for Network Information Hiding Methods
Until now hiding methods in network steganography have been described in
arbitrary ways, making them difficult to compare. For instance, some
publications describe classical channel characteristics, such as robustness and
bandwidth, while others describe the embedding of hidden information. We
introduce the first unified description of hiding methods in network
steganography. Our description method is based on a comprehensive analysis of
the existing publications in the domain. When our description method is applied
by the research community, future publications will be easier to categorize,
compare and extend. Our method can also serve as a basis to evaluate the
novelty of hiding methods proposed in the future.
Comment: 24 pages, 7 figures, 1 table; currently under review
Identifying e-Commerce in Enterprises by means of Text Mining and Classification Algorithms
Monitoring specific features of enterprises, for example the adoption of e-commerce, is an important and basic task for several economic activities. This type of information is usually obtained by means of surveys, which are costly due to the amount of personnel involved in the task. Automatic detection of this information would allow substantial savings. This can be performed by relying on computer engineering, since in general this information is publicly available online through corporate websites. This work describes how to convert the detection of e-commerce into a supervised classification problem, where each record is obtained from the automatic analysis of one corporate website, and the class is the presence or absence of e-commerce facilities. The automatic generation of such data records requires several Text Mining phases; in particular, we compare six strategies based on the selection of best words and best n-grams. We then classify the obtained dataset by means of four classification algorithms: Support Vector Machines; Random Forest; Statistical and Logical Analysis of Data; Logistic Classifier. This turns out to be a difficult classification problem; however, after careful design and set-up of the whole procedure, the results on a practical case of Italian enterprises are encouraging.
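The feature-construction step described above (selecting best words or n-grams and turning each website's text into a binary record) might look like the sketch below. The selection criterion here is a plain frequency difference between classes, chosen only for illustration; it is not one of the six strategies the paper compares:

```python
from collections import Counter

def ngrams(text, n):
    """Contiguous word n-grams of a lowercased text."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def best_ngrams(pos_texts, neg_texts, n=1, k=100):
    """Pick the k n-grams whose counts differ most between the two
    classes, breaking ties alphabetically for determinism."""
    pos = Counter(g for t in pos_texts for g in ngrams(t, n))
    neg = Counter(g for t in neg_texts for g in ngrams(t, n))
    vocab = pos.keys() | neg.keys()
    return sorted(vocab, key=lambda g: (abs(pos[g] - neg[g]), g),
                  reverse=True)[:k]

def featurize(text, selected, n=1):
    """One binary record per website: presence/absence of each selected n-gram."""
    grams = set(ngrams(text, n))
    return [int(g in grams) for g in selected]
```

The resulting binary records can then be fed to any of the four classifiers mentioned, e.g. a support vector machine or a random forest from an off-the-shelf library.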