Ontology learning for Semantic Web Services
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University, 18/10/2010.

The expansion of Semantic Web Services is restricted by traditional ontology engineering methods. Manual ontology development is a time-consuming, expensive and resource-exhaustive task. Consequently, it is important to support ontology engineers by automating the ontology acquisition process, to help deliver the Semantic Web vision. Existing Web Services offer a rich source of domain knowledge for ontology engineers. Ontology learning can be seen as a plug-in in the Web Service ontology development process, which ontology engineers can use to develop and maintain an ontology that evolves with current Web Services. Supporting the domain engineer with an automated tool while building an ontological domain model reduces the time and effort spent acquiring domain concepts and relations from Web Service artefacts, and speeds up the adoption of Semantic Web Services, allowing current Web Services to reach their full potential.

With that in mind, a Service Ontology Learning Framework (SOLF) is developed and applied to a real set of Web Services. The research contributes a rigorous method that extracts domain concepts, and the relations between them, from Web Services and automatically builds the domain ontology. The method applies pattern-based information extraction techniques to learn the domain concepts and relations automatically, and the framework is automated via a tool that implements these techniques. Applying SOLF and the tool to different sets of services produces an automatically built domain ontology model that represents semantic knowledge in the underlying domain.

The framework's effectiveness in extracting domain concepts and relations is evaluated by applying it to varying sets of commercial Web Services, including the financial domain. The standard evaluation metrics, precision and recall, are employed to determine both the accuracy and the coverage of the learned ontology models. Both the lexical and structural dimensions of the models are evaluated thoroughly. The evaluation results are encouraging, providing concrete outcomes in an area that is little researched.
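As a minimal illustration of the lexical evaluation just described (the concept sets below are invented examples, not data from the thesis), precision and recall of an extracted concept set can be computed against a gold-standard set:

```python
# Sketch of precision/recall evaluation for a learned ontology's concepts.
# The concept sets are hypothetical examples, not taken from the thesis.

def precision_recall(extracted, gold):
    """Precision: fraction of extracted concepts that are correct.
    Recall: fraction of gold-standard concepts that were extracted."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

extracted = {"Account", "Transaction", "Currency", "Ticker"}
gold = {"Account", "Transaction", "Currency", "Portfolio", "Quote"}
p, r = precision_recall(extracted, gold)
print(round(p, 2), round(r, 2))  # 0.75 0.6
```

Precision penalizes spurious extracted concepts; recall penalizes gold concepts the method missed, so together they capture the accuracy/coverage trade-off mentioned above.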
Semantic Representation and Inference for NLP
Semantic representation and inference is essential for Natural Language
Processing (NLP). The state of the art for semantic representation and
inference is deep learning, and particularly Recurrent Neural Networks (RNNs),
Convolutional Neural Networks (CNNs), and Transformer self-attention models.
This thesis investigates the use of deep learning for novel semantic
representation and inference, and makes contributions in the following three
areas: creating training data, improving semantic representations and extending
inference learning. In terms of creating training data, we contribute the
largest publicly available dataset of real-life factual claims for the purpose
of automatic claim verification (MultiFC), and we present a novel inference
model composed of multi-scale CNNs with different kernel sizes that learn from
external sources to infer fact checking labels. In terms of improving semantic
representations, we contribute a novel model that captures non-compositional
semantic indicators. By definition, the meaning of a non-compositional phrase
cannot be inferred from the individual meanings of its composing words (e.g.,
hot dog). Motivated by this, we operationalize the compositionality of a phrase
contextually by enriching the phrase representation with external word
embeddings and knowledge graphs. Finally, in terms of inference learning, we
propose a series of novel deep learning architectures that improve inference by
using syntactic dependencies, ensembling role-guided attention heads,
incorporating gating layers, and concatenating multiple heads in novel and
effective ways. This thesis consists of seven publications (five published and
two under review). Comment: PhD thesis, University of Copenhagen.
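The multi-scale CNN idea mentioned above can be sketched as follows. This is our illustrative reconstruction only; the kernel sizes, dimensions and pooling choices are assumptions, not the actual MultiFC inference architecture:

```python
import numpy as np

# Sketch of a multi-scale 1D CNN over a token-embedding sequence:
# several kernel sizes run in parallel, each max-pooled over time,
# and the pooled features are concatenated into one vector.

rng = np.random.default_rng(0)
seq_len, emb_dim, n_filters = 20, 16, 8
kernel_sizes = [2, 3, 5]          # the "different kernel sizes"

tokens = rng.standard_normal((seq_len, emb_dim))   # stand-in sentence

def conv1d_maxpool(x, kernel):
    """Valid 1D convolution over time followed by max-over-time pooling."""
    k, _, f = kernel.shape
    # Each row is one flattened window of k consecutive token embeddings.
    windows = np.stack([x[i:i + k].ravel() for i in range(len(x) - k + 1)])
    feature_map = windows @ kernel.reshape(k * x.shape[1], f)
    return feature_map.max(axis=0)                 # one value per filter

features = np.concatenate([
    conv1d_maxpool(tokens, rng.standard_normal((k, emb_dim, n_filters)))
    for k in kernel_sizes
])
print(features.shape)  # (24,) = 3 kernel sizes x 8 filters each
```

Small kernels capture short n-gram cues while larger ones capture longer spans, which is why mixing kernel sizes helps when evidence snippets vary in length.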
Data-Driven Design-by-Analogy: State of the Art and Future Directions
Design-by-Analogy (DbA) is a design methodology wherein new solutions,
opportunities or designs are generated in a target domain based on inspiration
drawn from a source domain; it can benefit designers in mitigating design
fixation and improving design ideation outcomes. Recently, the increasingly
available design databases and rapidly advancing data science and artificial
intelligence technologies have presented new opportunities for developing
data-driven methods and tools for DbA support. In this study, we survey
existing data-driven DbA studies and categorize individual studies according to
the data, methods, and applications in four categories, namely, analogy
encoding, retrieval, mapping, and evaluation. Based on both a nuanced qualitative
review and a structured analysis, this paper elucidates the state of the art of
data-driven DbA research to date and benchmarks it with the frontier of data
science and AI research to identify promising research opportunities and
directions for the field. Finally, we propose a future conceptual data-driven
DbA system that integrates all propositions. Comment: A Preprint Version.
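Analogy retrieval, one of the four categories above, is often implemented in data-driven DbA work as nearest-neighbour search over design-document embeddings. A minimal sketch follows; the design names and vectors are invented for illustration, and in practice the embeddings would come from a trained text encoder:

```python
import numpy as np

# Hypothetical embedding table: each row stands for one source-domain
# design document. Random vectors are placeholders for encoder output.
designs = ["gecko foot adhesive", "bird wing morphing", "honeycomb panel"]
rng = np.random.default_rng(1)
embeddings = rng.standard_normal((len(designs), 4))

def retrieve_analogy(query_vec, embeddings, names):
    """Return the design whose embedding has the highest cosine
    similarity to the target-domain query, plus all similarity scores."""
    norms = np.linalg.norm(embeddings, axis=1) * np.linalg.norm(query_vec)
    sims = embeddings @ query_vec / norms
    return names[int(np.argmax(sims))], sims

# Target-domain query, e.g. "reusable wall-climbing gripper"; here we
# perturb design 0's vector slightly to simulate a semantically close query.
query = embeddings[0] + 0.01 * rng.standard_normal(4)
best, sims = retrieve_analogy(query, embeddings, designs)
print(best)
```

Encoding produces the vectors, retrieval ranks them; mapping and evaluation (the other two categories) would then operate on the retrieved source design.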
Bridging the gap between textual and formal business process representations
Thesis presented as a compendium of publications.

In the era of digital transformation, an increasing number of organizations are starting to think in terms of business processes. Processes are at the very heart of each business, and must be understood and carried out by a wide range of actors, from both technical and non-technical backgrounds alike.
When embracing digital transformation practices, there is a need for all involved parties to be aware of the underlying business processes in an organization. However, the representational complexity and biases of the state-of-the-art modeling notations pose a challenge in understandability. On the other hand, plain language representations, accessible by nature and easily understood by everyone, are often frowned upon by technical specialists due to their ambiguity.
The aim of this thesis is precisely to bridge this gap between the world of technical, formal languages and the world of simpler, accessible natural languages. Structured as an article compendium, in this thesis we present four main contributions to address specific problems at the intersection between the fields of natural language processing and business process management. Postprint (published version)
Improving Editorial Workflow and Metadata Quality at Springer Nature
Identifying the research topics that best describe the scope of a scientific publication is a crucial task for editors, in particular because the quality of these annotations determines how effectively users are able to discover the right content in online libraries. For this reason, Springer Nature, the world's largest academic book publisher, has traditionally entrusted this task to its most expert editors. These editors manually analyse all new books, possibly including hundreds of chapters, and produce a list of the most relevant topics. This process has therefore traditionally been very expensive, time-consuming, and confined to a few senior editors. For these reasons, back in 2016 we developed Smart Topic Miner (STM), an ontology-driven application that assists the Springer Nature editorial team in annotating the volumes of all books covering conference proceedings in Computer Science. Since then, STM has been regularly used by editors in Germany, China, Brazil, India, and Japan, for a total of about 800 volumes per year. Over the past three years the initial prototype has iteratively evolved in response to user feedback and evolving requirements. In this paper we present the most recent version of the tool and describe the evolution of the system over the years, the key lessons learnt, and the impact on the Springer Nature workflow. In particular, our solution has drastically reduced the time needed to annotate proceedings and significantly improved their discoverability, resulting in 9.3 million additional downloads. We also present a user study involving 9 editors, which yielded excellent results in terms of usability, and report an evaluation of the new topic classifier used by STM, which outperforms previous versions in recall and F-measure.
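At its core, an ontology-driven annotator of this kind matches text against a topic ontology and propagates matches up the topic hierarchy. The mini ontology, matching rule and threshold below are our invented simplifications for illustration, not the actual STM algorithm:

```python
from collections import Counter

# Hypothetical mini topic ontology: topic -> broader topic (None at the root).
ontology = {
    "neural networks": "machine learning",
    "machine learning": "artificial intelligence",
    "ontologies": "knowledge representation",
}

def annotate(chapter_texts, min_count=2):
    """Count ontology topics mentioned across chapters, propagate each
    match to its broader topics, and keep topics above a support threshold."""
    counts = Counter()
    for text in chapter_texts:
        for topic in ontology:
            if topic in text.lower():
                t = topic
                while t:                      # walk up the hierarchy
                    counts[t] += 1
                    t = ontology.get(t)
    return [t for t, c in counts.most_common() if c >= min_count]

chapters = [
    "Deep neural networks for image classification",
    "Regularization in neural networks",
    "Building ontologies for digital libraries",
]
print(annotate(chapters))
```

The threshold filters out topics mentioned only in passing, so a volume-level annotation reflects recurring themes rather than single-chapter mentions.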
Semantic Technologies for Business Decision Support
2015 - 2016

In order to improve and remain competitive, enterprises should know how to exploit the opportunities offered by data coming from the Web. The strategic vision implies a high level of communication sharing and the integration of practices across every business level. This does not mean that enterprises need a disruptive change in their information systems, but rather a conversion of them, reusing existing business data and integrating new data. However, data is heterogeneous, so to maximise its value it is necessary to extract meaning from it, considering the context in which it evolves. The proliferation of new linguistic data, linked to the growth of textual resources on the Web, makes the analysis and integration phases of enterprise data inadequate. Thus, the use of Semantic Technologies based on Natural Language Processing (NLP) applications is required. This study is a first approach to the development of a document-driven Decision Support System (DSS) based on NLP technology, within the theoretical framework of Maurice Gross's Lexicon-Grammar. Our research project has two main objectives: the first is to recognize and codify the innovative language with which companies express and describe their business, in order to standardize it and make it machine-actionable. The second is to use the information resulting from text analysis to support strategic decisions, given that Text Mining analysis can capture the hidden meaning in business documents.

In the first chapter we examine the concept, characteristics and different types of DSS (with particular reference to document-driven analysis) and the changes that these systems have undergone with the development of the Web, and consequently of information systems within companies. In the second chapter, we proceed with a brief review of Computational Linguistics, paying particular attention to its goals, resources and applications. In the third chapter, we provide a state of the art of Semantic Technology Enterprises (STEs) and their process of integration into the innovation market, analysing their diffusion, the types of technologies and the main sectors in which they operate. In the fourth chapter, we propose a model of linguistic support and analysis, in accordance with the Lexicon-Grammar, in order to create an enriched solution for document-driven decision systems: we provide specific features of business language, resulting from experimental research work in the startup ecosystem.

Finally, we recognize that the formalization of all linguistic phenomena is extremely complex, but the results of the analysis encourage us to continue this line of research. Applying linguistic support to the business technological environment yields results that are more efficient and constantly updated, enabling innovation even under conditions of strong resistance to change. [edited by author]
A Systematic Review of Automated Query Reformulations in Source Code Search
Fixing software bugs and adding new features are two of the major maintenance
tasks. Software bugs and features are reported as change requests. Developers
consult these requests and often choose a few keywords from them as an ad hoc
query. Then they execute the query with a search engine to find the exact
locations within software code that need to be changed. Unfortunately, even
experienced developers often fail to choose appropriate queries, which leads to
costly trial and error during code search. Over the years, many studies have
attempted to reformulate developers' ad hoc queries to support them. In
this systematic literature review, we carefully select 70 primary studies on
query reformulations from 2,970 candidate studies, perform an in-depth
qualitative analysis (e.g., Grounded Theory), and then answer seven research
questions with major findings. First, to date, eight major methodologies (e.g.,
term weighting, term co-occurrence analysis, thesaurus lookup) have been
adopted to reformulate queries. Second, the existing studies suffer from
several major limitations (e.g., lack of generalizability, vocabulary mismatch
problem, subjective bias) that might prevent their wide adoption. Finally, we
discuss the best practices and future opportunities to advance the state of
research in search query reformulations. Comment: 81 pages, accepted at TOSE
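Term weighting, the first of the eight methodologies listed above, is commonly implemented by scoring a change request's words and keeping the top-ranked ones as the reformulated query. A rough TF-IDF sketch with invented data (not code from any surveyed study; the stopword list is a simplification):

```python
import math
from collections import Counter

# Score the words of a change request by TF-IDF against a small corpus
# of past requests, then keep the top-k terms as the search query.
corpus = [
    "fix crash when saving file with unicode name",
    "add dark mode toggle to settings dialog",
    "crash on startup when config file is missing",
]
request = "app crash while saving large file to network drive"
STOPWORDS = {"while", "to", "the", "on", "when", "with", "is"}

def top_terms(request, corpus, k=3):
    docs = [set(doc.split()) for doc in corpus]
    tf = Counter(t for t in request.split() if t not in STOPWORDS)
    def idf(term):
        # Smoothed inverse document frequency: rarer terms score higher.
        df = sum(term in doc for doc in docs)
        return math.log((1 + len(docs)) / (1 + df)) + 1.0
    scores = {t: tf[t] * idf(t) for t in tf}
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(top_terms(request, corpus))
```

Note how common terms like "crash" and "file" are demoted because they appear in many past requests, while request-specific terms rise to the top; this is exactly the discriminative behaviour term weighting is meant to provide.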