116,155 research outputs found
Ciclo de vida de hiperlinks: um estudo sobre a persistência e perda de referências e conteúdos da Web [Hyperlink life cycle: a study of the persistence and loss of Web references and content]
This study investigates the characteristics and degree of persistence of web links. Researchers such as Król and Zdonek (2020) have shown that web links persist to different degrees. This work examines phenomena such as link rot and reference rot, which make the content a link points to inaccessible and cause it to effectively disappear. Koehler (2004) carried out one of the most extensive studies on the subject, comparing the persistence of web resources. We identified the life cycle of web links used as bibliographic references in Lume theses by mapping the theses, identifying their bibliographic reference pages, and extracting the web links from the selected pages. The stages of the research follow guidelines found in related work. We created analysis categories following the guidelines cited by Dimitrova and Bugeja (2007) and used them to show the persistence of links in the selected theses. In the Brazilian scientific literature, we found little information about the persistence of web links used in bibliographic references. After defining Lume as our research corpus, we extracted web links from the bibliographic references of a sample of 368 theses from 2012 to 2021. The resulting set contained 5582 links, each of which was tested for availability and classified as accessible or failing. The results indicate that only 48% of the links used as bibliographic references in 2012 were still accessible, and a half-life of 8.02 years was estimated for the studied set. Persistent links and web preservation policies can contribute to greater link persistence.
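The abstract above reports a half-life for the link set but does not say how it was estimated. As a rough illustration only, the sketch below assumes simple exponential decay, where the surviving fraction after t years is 0.5 ** (t / half_life). The function names are hypothetical, and this is not claimed to be the authors' method.

```python
import math


def half_life_years(surviving_fraction: float, elapsed_years: float) -> float:
    """Estimate a half-life from one survival observation.

    Assumes exponential decay (an assumption, not stated in the abstract):
    surviving_fraction = 0.5 ** (elapsed_years / half_life).
    """
    if not 0.0 < surviving_fraction < 1.0:
        raise ValueError("surviving_fraction must be strictly between 0 and 1")
    if elapsed_years <= 0.0:
        raise ValueError("elapsed_years must be positive")
    # Solve the decay equation for half_life.
    return elapsed_years * math.log(2.0) / -math.log(surviving_fraction)


def expected_surviving_fraction(half_life: float, elapsed_years: float) -> float:
    """Fraction of links expected to remain reachable under the same model."""
    return 0.5 ** (elapsed_years / half_life)
```

Under this model, a set in which half of the links survive after ten years has a half-life of ten years, and a set with a quarter surviving after ten years has a half-life of five.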
COEL: A Web-based Chemistry Simulation Framework
The chemical reaction network (CRN) is a widely used formalism to describe
macroscopic behavior of chemical systems. Available tools for CRN modelling and
simulation require local access, installation, and often involve local file
storage, which is susceptible to loss, lacks searchable structure, and does not
support concurrency. Furthermore, simulations are often single-threaded, and
user interfaces are non-trivial to use. Therefore there are significant hurdles
to conducting efficient and collaborative chemical research. In this paper, we
introduce a new enterprise chemistry simulation framework, COEL, which
addresses these issues. COEL is the first web-based framework of its kind. A
visually pleasing and intuitive user interface, simulations that run on a large
computational grid, reliable database storage, and transactional services make
COEL ideal for collaborative research and education. COEL's most prominent
features include ODE-based simulations of chemical reaction networks and
multicompartment reaction networks, with rich options for user interactions
with those networks. COEL provides DNA-strand displacement transformations and
visualization (and is to our knowledge the first CRN framework to do so), GA
optimization of rate constants, expression validation, an application-wide
plotting engine, and SBML/Octave/Matlab export. We also present an overview of
the underlying software and technologies employed and describe the main
architectural decisions driving our development. COEL is available at
http://coel-sim.org for selected research teams only. We plan to provide a part
of COEL's functionality to the general public in the near future.
Comment: 23 pages, 12 figures, 1 table
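To make the idea of ODE-based simulation of a chemical reaction network concrete, here is a minimal sketch using mass-action kinetics and a fixed-step forward Euler integrator. It is not COEL's code or API. The data layout, function names, and the example reaction A + B -> C are assumptions made for illustration.

```python
def mass_action_rhs(conc, reactions):
    """Return d[species]/dt under mass-action kinetics.

    conc:      dict mapping species name -> concentration
    reactions: list of (reactants, products, rate_constant), where reactants
               and products map species name -> stoichiometric coefficient
    """
    derivs = {species: 0.0 for species in conc}
    for reactants, products, k in reactions:
        # Reaction rate = k * product of reactant concentrations^coefficient.
        rate = k
        for species, coeff in reactants.items():
            rate *= conc[species] ** coeff
        for species, coeff in reactants.items():
            derivs[species] -= coeff * rate
        for species, coeff in products.items():
            derivs[species] += coeff * rate
    return derivs


def simulate(conc0, reactions, dt, steps):
    """Integrate the network with forward Euler (simple, but only first-order accurate)."""
    conc = dict(conc0)
    for _ in range(steps):
        derivs = mass_action_rhs(conc, reactions)
        conc = {s: conc[s] + dt * derivs[s] for s in conc}
    return conc


# Hypothetical example network: A + B -> C with rate constant 1.0.
EXAMPLE_REACTIONS = [({"A": 1, "B": 1}, {"C": 1}, 1.0)]
```

Calling `simulate({"A": 1.0, "B": 1.0, "C": 0.0}, EXAMPLE_REACTIONS, dt=0.01, steps=1000)` integrates the example to t = 10. A production tool would more likely use an adaptive or stiff solver than forward Euler.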
Analyzing the Persistence of Referenced Web Resources with Memento
In this paper we present the results of a study into the persistence and
availability of web resources referenced from papers in scholarly repositories.
Two repositories with different characteristics, arXiv and the UNT digital
library, are studied to determine if the nature of the repository, or of its
content, has a bearing on the availability of the web resources cited by that
content. Memento makes it possible to automate discovery of archived resources
and to consider the time between the publication of the research and the
archiving of the referenced URLs. This automation allows us to process more
than 160000 URLs, the largest known such study, and the repository metadata
allows consideration of the results by discipline. The results are startling:
45% (66096) of the URLs referenced from arXiv still exist, but are not
preserved for future generations, and 28% of resources referenced by UNT papers
have been lost. Moving forwards, we provide some initial recommendations,
including that repositories should publish URL lists extracted from papers that
could be used as seeds for web archiving systems.
Comment: 4 pages, 5 figures. Accepted to Open Repositories 2011 Conference
Citation and peer review of data: moving towards formal data publication
This paper discusses many of the issues associated with formally publishing data in academia, focusing primarily on the structures that need to be put in place for peer review and formal citation of datasets. Data publication is becoming increasingly important to the scientific community, as it will provide a mechanism for those who create data to receive academic credit for their work and will allow the conclusions arising from an analysis to be more readily verifiable, thus promoting transparency in the scientific process. Peer review of data will also provide a mechanism for ensuring the quality of datasets, and we provide suggestions on the types of activities one expects to see in the peer review of data. A simple taxonomy of data publication methodologies is presented and evaluated, and the paper concludes with a discussion of dataset granularity, transience and semantics, along with a recommended human-readable citation syntax.
Eprints and the Open Archives Initiative
The Open Archives Initiative (OAI) was created as a practical way to promote
interoperability between eprint repositories. Although the scope of the OAI has
been broadened, eprint repositories still represent a significant fraction of
OAI data providers. In this article I present a brief survey of OAI eprint
repositories, and of services using metadata harvested from eprint repositories
using the OAI protocol for metadata harvesting (OAI-PMH). I then discuss
several situations where metadata harvesting may be used to further improve the
utility of eprint archives as a component of the scholarly communication
infrastructure.
Comment: 13 pages
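As a small illustration of the metadata harvesting described in the abstract above, the sketch below builds an OAI-PMH `ListRecords` request URL and extracts record identifiers and the `resumptionToken` from a response. The base URL in the usage note is a placeholder and the helper names are hypothetical. The verb, the `oai_dc` metadata prefix, and the XML namespace come from the OAI-PMH 2.0 protocol.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace used by OAI-PMH 2.0 response documents.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL.

    In OAI-PMH, resumptionToken is an exclusive argument: when it is present,
    metadataPrefix must not be sent again.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from a response body."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An empty resumptionToken element marks the last page of results.
    token = token_el.text.strip() if token_el is not None and token_el.text else None
    return identifiers, token
```

A harvester would call `build_list_records_url("https://repository.example.org/oai")`, fetch that URL, and repeat with each returned token until `parse_list_records` yields `None` for the token.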
Biodiversity informatics: the challenge of linking data and the role of shared identifiers
A major challenge facing biodiversity informatics is integrating data stored in widely distributed databases. Initial efforts have relied on taxonomic names as the shared identifier linking records in different databases. However, taxonomic names have limitations as identifiers, being neither stable nor globally unique, and the pace of molecular taxonomic and phylogenetic research means that a lot of information in public sequence databases is not linked to formal taxonomic names. This review explores the use of other identifiers, such as specimen codes and GenBank accession numbers, to link otherwise disconnected facts in different databases. The structure of these links can also be exploited using the PageRank algorithm to rank the results of searches on biodiversity databases. The key to rich integration is a commitment to deploy and reuse globally unique, shared identifiers (such as DOIs and LSIDs), and the implementation of services that link those identifiers.
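The abstract above mentions ranking search results with PageRank over links between identifiers. A minimal power-iteration sketch of PageRank follows. The graph representation, the damping value of 0.85, the fixed iteration count, and the example identifiers are illustrative assumptions rather than details taken from the review.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank scores by power iteration.

    links: dict mapping a node to the list of nodes it links to.
    Nodes with no outgoing links spread their rank evenly over all nodes.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the same baseline "teleport" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical identifier graph: a paper cites a sequence record and a
# specimen, and the specimen record also points at the sequence record.
EXAMPLE_LINKS = {
    "paper:doi": ["genbank:X", "specimen:1"],
    "specimen:1": ["genbank:X"],
    "genbank:X": [],
}
```

In this toy graph the sequence record is linked from both other records, so `pagerank(EXAMPLE_LINKS)` ranks it highest.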
Emotional persistence in online chatting communities
How do users behave in online chatrooms, where they instantaneously read and
write posts? We analyzed about 2.5 million posts covering various topics in
Internet relay channels, and found that user activity patterns follow known
power-law and stretched exponential distributions, indicating that online chat
activity is not different from other forms of communication. Analysing the
emotional expressions (positive, negative, neutral) of users, we revealed a
remarkable persistence both for individual users and channels. That is, despite
their anonymity, users tend to follow social norms in repeated interactions in
online chats, which results in a specific emotional "tone" of the channels. We
provide an agent-based model of emotional interaction, which recovers
qualitatively both the activity patterns in chatrooms and the emotional
persistence of users and channels. While our assumptions about agents'
emotional expressions are rooted in psychology, the model allows testing
different hypotheses regarding their emotional impact in online communication.
Comment: 34 pages, 4 main and 12 supplementary figures
Computational Controversy
Climate change, vaccination, abortion, Trump: Many topics are surrounded by
fierce controversies. The nature of such heated debates and their elements have
been studied extensively in the social science literature. More recently,
various computational approaches to controversy analysis have appeared, using
new data sources such as Wikipedia, which help us now better understand these
phenomena. However, compared to what social sciences have discovered about such
debates, the existing computational approaches mostly focus on just a few of
the many important aspects around the concept of controversies. In order to
link the two strands, we provide and evaluate here a controversy model that is
both rooted in the findings of the social science literature and at the same
time strongly linked to computational methods. We show how this model can lead
to computational controversy analytics that have full coverage over all the
crucial aspects that make up a controversy.
Comment: In Proceedings of the 9th International Conference on Social
Informatics (SocInfo) 201