9 research outputs found
Enhancing API Documentation through BERTopic Modeling and Summarization
As the amount of textual data in various fields, including software
development, continues to grow, there is a pressing demand for efficient and
effective extraction and presentation of meaningful insights. This paper
presents a unique approach to address this need, focusing on the complexities
of interpreting Application Programming Interface (API) documentation. While
official API documentation serves as a primary source of information for
developers, it can often be extensive and lacks user-friendliness. In light of
this, developers frequently resort to unofficial sources like Stack Overflow
and GitHub. Our novel approach employs the strengths of BERTopic for topic
modeling and Natural Language Processing (NLP) to automatically generate
summaries of API documentation, thereby creating a more efficient method for
developers to extract the information they need. The produced summaries and
topics are evaluated based on their performance, coherence, and
interoperability.
The findings of this research contribute to the field of API documentation
analysis by providing insights into recurring topics, identifying common
issues, and generating potential solutions. By improving the accessibility and
efficiency of API documentation comprehension, our work aims to enhance the
software development process and empower developers with practical tools for
navigating complex APIs
Recommended from our members
Computational Argumentation Approaches to Improve Sensemaking and Evidence-based Reasoning in Online Deliberation Systems
Deliberation is the process through which communities identify potential solutions for a problem and select the solution that most effectively meets their diverse requirements through dialogic communication. Online deliberation is implemented nowadays with means of social media and online discussion platforms; however, these media present significant challenges and issues that can be traced to inadequate support for Sensemaking processes and poor endorsement of the quality characteristics of deliberation.
This thesis investigates integrating computational argumentation methods in online deliberation platforms as an effective way to improve participants' perception of the quality of the deliberation process, their way of making sense of the overall process and producing healthier social dynamics.
For that, two computational artefacts are proposed: (i) a Synoptical summariser of long discussions and (ii) a Scientific Argument Recommender System (SciArgRecSys).
The two artefacts are designed and developed with state-of-the-art methods (with the use of Large Language Models - LLMs) and evaluated intrinsically and extrinsically when deployed in a real live platform (BCause).
Through extensive evaluation, the positive effect of both artefacts is illustrated in human Sensemaking and essential quality characteristics of deliberation such as reciprocal Engagement, Mutual Understanding, and Social dynamics. In addition, it has been demonstrated that these interventions effectively reduce polarisation, the formation of sub-communities while significantly enhancing the quality of the discussion by making it more coherent and diverse
Proceedings of the Eighth Italian Conference on Computational Linguistics CliC-it 2021
The eighth edition of the Italian Conference on Computational Linguistics (CLiC-it 2021) was held at UniversitĂ degli Studi di Milano-Bicocca from 26th to 28th January 2022. After the edition of 2020, which was held in fully virtual mode due to the health emergency related to Covid-19, CLiC-it 2021 represented the first moment for the Italian research community of Computational Linguistics to meet in person after more than one year of full/partial lockdown
Automatic information search for countering covid-19 misinformation through semantic similarity
Trabajo Fin de MĂĄster en BioinformĂĄtica y BiologĂa ComputacionalInformation quality in social media is an increasingly important issue and misinformation problem has become even more critical in the current COVID-19 pandemic, leading people exposed
to false and potentially harmful claims and rumours. Civil society organizations, such as the
World Health Organization, have demanded a global call for action to promote access to health
information and mitigate harm from health misinformation. Consequently, this project pursues
countering the spread of COVID-19 infodemic and its potential health hazards.
In this work, we give an overall view of models and methods that have been employed in the
NLP field from its foundations to the latest state-of-the-art approaches. Focusing on deep learning methods, we propose applying multilingual Transformer models based on siamese networks,
also called bi-encoders, combined with ensemble and PCA dimensionality reduction techniques.
The goal is to counter COVID-19 misinformation by analyzing the semantic similarity between
a claim and tweets from a collection gathered from official fact-checkers verified by the International Fact-Checking Network of the Poynter Institute.
It is factual that the number of Internet users increases every year and the language spoken
determines access to information online. For this reason, we give a special effort in the application of multilingual models to tackle misinformation across the globe. Regarding semantic
similarity, we firstly evaluate these multilingual ensemble models and improve the result in the
STS-Benchmark compared to monolingual and single models. Secondly, we enhance the interpretability of the modelsâ performance through the SentEval toolkit. Lastly, we compare these
modelsâ performance against biomedical models in TREC-COVID task round 1 using the BM25
Okapi ranking method as the baseline. Moreover, we are interested in understanding the ins
and outs of misinformation. For that purpose, we extend interpretability using machine learning
and deep learning approaches for sentiment analysis and topic modelling. Finally, we developed
a dashboard to ease visualization of the results.
In our view, the results obtained in this project constitute an excellent initial step toward
incorporating multilingualism and will assist researchers and people in countering COVID-19
misinformation
Study on open science: The general state of the play in Open Science principles and practices at European life sciences institutes
Nowadays, open science is a hot topic on all levels and also is one of the priorities of the European Research Area. Components that are commonly associated with open science are open access, open data, open methodology, open source, open peer review, open science policies and citizen science. Open science may a great potential to connect and influence the practices of researchers, funding institutions and the public. In this paper, we evaluate the level of openness based on public surveys at four European life sciences institute