Online visibility of software-related web sites: The case of biomedical text mining tools

Abstract

The Internet in general, and the WWW in particular, have become an immediate, practical means of introducing software tools and resources and, most importantly, a key vehicle for attracting the attention of potential users. In this scenario, content organization and development practices may affect the online visibility of the target resource. The careful selection, organization and presentation of contents are therefore critical to guarantee that the main features of the target tool can be easily discovered by potential visitors, while ensuring proper indexing by automatic online systems and resource recognizers. Understanding how software is depicted in scientific manuscripts and comparing these texts with the corresponding online descriptions can help to improve the visibility of the target website. It is particularly relevant to align online descriptions with those found in the literature, and to use the resulting knowledge to improve software indexing and grouping. This paper presents a novel method for formally defining and mining software-related websites and related literature, with the ultimate aim of improving the global online visibility of the software. As a proof of concept, the method was used to evaluate the online visibility of biomedical text mining tools. These tools have evolved considerably in the last decades, and have gathered a heterogeneous development community as well as various user groups. For the most part, these tools are not easily discovered via general search engines. The proposed method enabled the identification of specific issues regarding the visibility of these online contents and the discussion of some possible improvements.

Supplementary material associated with this article can be found, in the online version, at doi: 10.1016/j.ipm.2018.11.011.

SING group thanks CITI (Centro de Investigación, Transferencia e Innovación) from the University of Vigo for hosting its IT infrastructure.
This work was partially supported by the Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of the UID/BIO/04469/2013 unit and COMPETE 2020 (POCI-01-0145-FEDER-006684). The authors also acknowledge the Ph.D. grants of Martín Pérez-Pérez and Gael Pérez-Rodríguez, funded by the Xunta de Galicia.
