Leveraging Large Language Models for Enhanced Product Descriptions in eCommerce
In the dynamic field of eCommerce, the quality and comprehensiveness of
product descriptions are pivotal for enhancing search visibility and customer
engagement. Effective product descriptions can address the 'cold start'
problem, align with market trends, and ultimately lead to increased
click-through rates. Traditional methods for crafting these descriptions often
involve significant human effort and may lack both consistency and scalability.
This paper introduces a novel methodology for automating product description
generation using the LLAMA 2.0 7B language model. We train the model on a
dataset of authentic product descriptions from Walmart, one of the largest
eCommerce platforms. The model is then fine-tuned for domain-specific language
features and eCommerce nuances to enhance its utility in sales and user
engagement. We employ multiple evaluation metrics, including NDCG, customer
click-through rates, and human assessments, to validate the effectiveness of
our approach. Our findings reveal that the system is not only scalable but also
significantly reduces the human workload involved in creating product
descriptions. This study underscores the considerable potential of large
language models like LLAMA 2.0 7B in automating and optimizing various facets
of eCommerce platforms, offering significant business impact, including
improved search functionality and increased sales.
Comment: 9 pages, 4 figures, EMNLP 2023 workshop, The 2023 Conference on
Empirical Methods in Natural Language Processing
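The paper does not include its training code, so the snippet below is only a plausible sketch of the recipe it describes: adapting a pretrained LLAMA 2.0 7B checkpoint to product-description data. It assumes Hugging Face transformers with LoRA adapters via peft; the checkpoint name, dataset file, prompt fields, and hyperparameters are illustrative placeholders, not the authors' configuration.

```python
# Hedged sketch: one plausible way to fine-tune a LLaMA-2-7B checkpoint on
# product-description records with LoRA. Dataset path, prompt template, and
# hyperparameters are assumptions, not the paper's actual setup.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Attach small trainable LoRA adapters instead of updating all 7B weights.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM"))

def to_features(example):
    # Hypothetical fields: the real training data format is not public.
    prompt = (f"Product: {example['title']}\n"
              f"Attributes: {example['attributes']}\n"
              f"Description: {example['description']}")
    ids = tokenizer(prompt, truncation=True, max_length=512,
                    padding="max_length")
    ids["labels"] = ids["input_ids"].copy()  # causal-LM objective
    return ids

train = load_dataset("json", data_files="product_descriptions.jsonl")["train"]
train = train.map(to_features, remove_columns=train.column_names)

Trainer(model=model,
        args=TrainingArguments("llama2-products",
                               per_device_train_batch_size=2,
                               num_train_epochs=1, learning_rate=2e-4),
        train_dataset=train).train()
```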
Knowledge Author: Facilitating user-driven, domain content development to support clinical information extraction
Background: Clinical Natural Language Processing (NLP) systems require a semantic schema comprising domain-specific concepts, their lexical variants, and associated modifiers to accurately extract information from clinical texts. An NLP system leverages this schema to structure concepts and extract meaning from free text. In the clinical domain, creating a semantic schema typically requires input from both a domain expert, such as a clinician, and an NLP expert who represents the clinical concepts derived from the clinician's domain expertise in a computable format usable by an NLP system. The goal of this work is to develop a web-based tool, Knowledge Author, that bridges the gap between the clinical domain expert and NLP system development by facilitating the creation of domain content, represented as a semantic schema, for extracting information from clinical free text.

Results: Knowledge Author is a web-based recommendation system that supports users in developing the domain content necessary for clinical NLP applications. Its schematic model leverages a set of semantic types derived from the Secondary Use Clinical Element Models and the Common Type System to allow the user to quickly create and modify domain-related concepts. Features such as collaborative development and domain content suggestions, generated by mapping concepts to the Unified Medical Language System Metathesaurus database, further support the content creation process. Two proof-of-concept studies evaluated the system's performance. The first assessed Knowledge Author's flexibility in creating a broad range of concepts: of a dataset of 115 concepts, 87 (76%) could be created using Knowledge Author. The second evaluated the effectiveness of Knowledge Author's output in an NLP system by extracting concepts and associated modifiers representing a clinical element, carotid stenosis, from 34 clinical free-text radiology reports using Knowledge Author and an NLP system, pyConText. Knowledge Author's domain content produced high recall for concepts (targeted findings: 86%) and varied recall for modifiers (certainty: 91%, sidedness: 80%, neurovascular anatomy: 46%).

Conclusion: Knowledge Author can support clinical domain content development for information extraction by enabling semantic schema creation by domain experts.
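To make the notion of a semantic schema concrete, the toy sketch below pairs a target concept and its lexical variants with two modifier lexicons and applies them to one report sentence. All lexicon entries are invented for illustration; this is neither Knowledge Author's output format nor the pyConText API.

```python
# Illustrative sketch of the kind of domain content a schema captures: a
# concept with lexical variants plus modifier lexicons, matched over free
# text. Lexicons are invented examples, not the study's actual content.
import re

schema = {
    "concept": {"carotid_stenosis": ["carotid stenosis",
                                     "stenosis of the carotid",
                                     "carotid artery narrowing"]},
    "modifiers": {
        "certainty": {"affirmed": ["is seen", "demonstrates"],
                      "negated": ["no evidence of", "without"]},
        "sidedness": {"left": ["left"], "right": ["right"],
                      "bilateral": ["bilateral"]},
    },
}

def extract(text):
    """Return (concept, {modifier_type: value}) hits found in one report."""
    hits = []
    for concept, variants in schema["concept"].items():
        for variant in variants:
            for m in re.finditer(re.escape(variant), text, re.IGNORECASE):
                # Look for modifier cues in a small window around the match.
                window = text[max(0, m.start() - 60):m.end() + 60].lower()
                mods = {}
                for mod_type, values in schema["modifiers"].items():
                    for value, cues in values.items():
                        if any(c in window for c in cues):
                            mods[mod_type] = value
                hits.append((concept, mods))
    return hits

print(extract("No evidence of stenosis of the carotid artery on the left."))
# [('carotid_stenosis', {'certainty': 'negated', 'sidedness': 'left'})]
```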
Abstraction as a basis for the computational interpretation of creative cross-modal metaphor
Various approaches to computational metaphor interpretation are based on pre-existing similarities between source and target domains and/or on metaphors already observed to be prevalent in the language. This paper addresses similarity-creating cross-modal metaphoric expressions. It is shown how the “abstract concept as object” (or reification) metaphor plays a central role in a large class of metaphoric extensions. The described approach depends on the imposition of abstract ontological components, which represent source concepts, onto target concepts. The challenge for such a system is to represent denotative and connotative components that are extensible, together with a framework of general domains between which such extensions can conceivably occur. An existing ontology of this kind, consistent with some mathematical concepts and widely held linguistic notions, is outlined. It is suggested that such an abstract representation system is well suited to the interpretation of both conventional and unconventional similarity-creating metaphor.
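As a rough illustration of the reification move (using an invented toy structure, not the paper's ontology), the sketch below imposes the abstract ontological components of a physical source concept onto an abstract target concept.

```python
# Toy illustration (assumed structure, not the paper's ontology) of the
# "abstract concept as object" move: ontological components of a source
# concept are imposed on a target concept to license a metaphoric reading.
from dataclasses import dataclass, field

@dataclass
class Concept:
    name: str
    components: dict = field(default_factory=dict)  # ontological components

# Source domain: a physical object has weight and can be carried.
burden = Concept("burden", {"weight": "heavy", "graspable": True})

# Target domain: an abstract concept such as grief has no such components...
grief = Concept("grief")

def reify(target: Concept, source: Concept) -> Concept:
    """Impose the source's components on the target (reification)."""
    return Concept(target.name, {**source.components, **target.components})

# ...until the metaphor supplies them: "the heavy burden of grief".
print(reify(grief, burden).components)  # {'weight': 'heavy', 'graspable': True}
```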
Interactive Behavior-driven Development: a Low-code Perspective
Within behavior-driven development (BDD), different types of stakeholders collaborate in creating scenarios that specify application behavior. The current workflow for BDD expects non-technical stakeholders to use an integrated development environment (IDE) to write textual scenarios in the Gherkin language and to verify application behavior using test passed/failed reports. Research to date shows that this approach leads non-technical stakeholders to perceive BDD as an overhead in addition to testing. In this vision paper, we propose an alternative approach to specify and verify application behavior visually, interactively, and collaboratively within an IDE. Instead of writing textual scenarios, non-technical stakeholders compose, edit, and save scenarios using tailored graphical interfaces that allow them to manipulate the domain objects involved. Upon executing such interactively composed scenarios, all stakeholders verify the application behavior by inspecting domain-specific representations of run-time domain objects instead of a test run report. Such a low-code approach to BDD has the potential to enable non-technical stakeholders to engage more harmoniously in behavior specification and validation together with technical stakeholders within an IDE. This work makes three main contributions: (i) we present an analysis of the features of 13 BDD tools, (ii) we describe a prototype implementation of our approach, and (iii) we outline our plan to conduct a large-scale developer survey to evaluate our approach and highlight its perceived benefits over the existing approach.
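To make the contrast with textual Gherkin concrete, the sketch below mimics a scenario composed from live domain objects. The Account class and its methods are invented stand-ins for whatever domain objects a real project would expose; the paper's actual prototype is an interactive IDE tool, not plain code.

```python
# Hedged sketch of the proposed idea: a scenario built from domain objects
# rather than written as Gherkin text. All names here are illustrative.
from dataclasses import dataclass

@dataclass
class Account:
    owner: str
    balance: int

    def withdraw(self, amount: int) -> None:
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount

# "Given" step: the stakeholder picks and configures a domain object in a
# tailored editor instead of typing "Given an account with balance 100".
given_account = Account(owner="sam", balance=100)

# "When" step: an action is chosen from the object's available operations.
given_account.withdraw(30)

# "Then" step: verification inspects the run-time domain object itself
# rather than a test passed/failed report.
assert given_account.balance == 70, f"expected 70, saw {given_account.balance}"
```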
Language design for a personal learning environment design language
Approaching technology-enhanced learning from the perspective of a learner, we foster the idea of learning environment design, learner interactions, and tool interoperability. In this paper, we briefly summarize the motivation for our personal learning environment approach and describe the development of a domain-specific language for this purpose as well as its realization in practice. We then examine our learning environment design language with respect to its lexis and syntax, the semantics behind it, and pragmatic aspects within a first prototypical implementation. Finally, we discuss the strengths, problematic aspects, and open issues of our approach.
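The paper's language itself is not reproduced here, but the general shape of such a design language can be sketched. The syntax, keywords, and URLs below are invented placeholders meant only to show how a small lexis (tool, connect) can be given simple semantics for wiring tools into a learning environment.

```python
# Minimal sketch (invented syntax, not the authors' actual language) of a
# personal-learning-environment design language and a tiny interpreter.
spec = """
tool wiki      url=https://wiki.example.org
tool calendar  url=https://cal.example.org
connect wiki -> calendar
"""

def interpret(source: str) -> dict:
    """'tool' declares a component; 'connect' links two for interoperability."""
    env = {"tools": {}, "links": []}
    for line in source.strip().splitlines():
        head, *rest = line.split()
        if head == "tool":                 # lexis: keyword + name + url
            name, url = rest[0], rest[1].split("=", 1)[1]
            env["tools"][name] = url
        elif head == "connect":            # semantics: a tool-to-tool link
            src, _, dst = rest
            env["links"].append((src, dst))
    return env

print(interpret(spec))
# {'tools': {'wiki': 'https://wiki.example.org',
#            'calendar': 'https://cal.example.org'},
#  'links': [('wiki', 'calendar')]}
```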
Combining multi-domain statistical machine translation models using automatic classifiers
This paper presents a set of experiments on Domain Adaptation of Statistical Machine Translation systems. The experiments focus on Chinese-English and two domain-specific corpora. The paper presents a novel approach for combining multiple domain-trained translation models to achieve improved translation quality for both domain-specific and combined sets of sentences. We train a statistical classifier to classify sentences according to the appropriate domain and use the corresponding domain-specific MT models to translate them. Experimental results show that the method achieves a statistically significant absolute improvement of 1.58 BLEU points (2.86% relative improvement) over a translation model trained on the combined data, and considerable improvements over a model using multiple decoding paths of the Moses decoder, for the combined-domain test set. Furthermore, even on domain-specific test sets, our approach performs almost as well as dedicated domain-specific models with perfect classification.
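The routing idea is straightforward to sketch. The snippet below is an assumed illustration only: it stands in scikit-learn's TfidfVectorizer with MultinomialNB (not necessarily the classifier used in the paper) for the sentence-level domain decision, and placeholder functions for the Moses-trained domain-specific MT models.

```python
# Sketch of classifier-based routing between domain-specific MT models.
# Training data, domains, and the "MT models" are toy placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy in-domain sentences; the real experiments use two large
# Chinese-English domain-specific corpora.
train_sentences = ["the patient was given medication",
                   "fever and cough persisted",
                   "the court dismissed the appeal",
                   "the contract was signed"]
train_domains = ["medical", "medical", "legal", "legal"]

classifier = make_pipeline(TfidfVectorizer(), MultinomialNB())
classifier.fit(train_sentences, train_domains)

# Stand-ins for the domain-trained translation systems.
mt_models = {"medical": lambda s: f"[medical-MT] {s}",
             "legal":   lambda s: f"[legal-MT] {s}"}

def translate(sentence: str) -> str:
    domain = classifier.predict([sentence])[0]  # pick the likely domain
    return mt_models[domain](sentence)          # route to that domain's model

print(translate("the judge signed the order"))
```

With oracle (perfect) classification, this routing reduces exactly to the dedicated domain-specific models, which is the upper bound the abstract compares against.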
A Classification of Scripting Systems for Entertainment and Serious Computer Games
The technology base for modern computer games is usually provided by a game engine. Many game engines have built-in dedicated scripting languages that allow the development of complete games built on those engines, as well as extensive modification of existing games through scripting alone. While some of these game engines implement proprietary languages, others use existing scripting systems that have been modified to fit the game engine's requirements. Scripting languages generally provide a very high level of abstraction for controlling the behaviour of their host applications, and different types of scripting system allow different types of modification of the underlying host application. In this paper we propose a simple classification for scripting systems used in computer games for entertainment and serious purposes.
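As a concrete illustration of the host/script relationship this classification rests on, the toy sketch below shows a host "engine" exposing a deliberately narrow API to a script. The Engine class, its spawn call, and the use of Python's exec are invented for illustration; real engines typically embed languages such as Lua and enforce far stronger isolation than this.

```python
# Illustrative sketch (invented game API) of the general pattern: a host
# engine exposes a restricted API surface, and a script controls behaviour
# through it without touching engine internals. Not a secure sandbox.
class Engine:
    def __init__(self):
        self.log = []

    def spawn(self, entity: str) -> None:
        self.log.append(f"spawned {entity}")

    def run_script(self, script: str) -> None:
        # Only names placed in this namespace are visible to the script,
        # which is how engines bound what scripts may modify.
        exec(script, {"spawn": self.spawn, "__builtins__": {}})

engine = Engine()
engine.run_script("spawn('goblin')\nspawn('chest')")
print(engine.log)  # ['spawned goblin', 'spawned chest']
```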
Grand Challenges of Traceability: The Next Ten Years
In 2007, the software and systems traceability community met at the first
Natural Bridge symposium on the Grand Challenges of Traceability to establish
and address research goals for achieving effective, trustworthy, and ubiquitous
traceability. Ten years later, in 2017, the community came together to evaluate
a decade of progress towards achieving these goals. These proceedings document
some of that progress. They include a series of short position papers,
representing current work in the community organized across four process axes
of traceability practice. The sessions covered topics spanning Trace Strategizing, Trace Link Creation and Evolution, Trace Link Usage, Real-World Applications of Traceability, and Traceability Datasets and Benchmarks. Two breakout groups
focused on the importance of creating and sharing traceability datasets within
the research community, and discussed challenges related to the adoption of
tracing techniques in industrial practice. Members of the research community
are engaged in many active, ongoing, and impactful research projects. Our hope
is that ten years from now we will be able to look back at a productive decade
of research and claim that we have achieved the overarching Grand Challenge of
Traceability, which seeks for traceability to be always present, built into the
engineering process, and for it to have "effectively disappeared without a
trace". We hope that others will see the potential that traceability has for
empowering software and systems engineers to develop higher-quality products at
increasing levels of complexity and scale, and that they will join the active
community of software and systems traceability researchers as we move forward
into the next decade of research.