510 research outputs found

    Bringing named entity recognition on Drupal content management system

    Published in "8th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB 2014)". Content management systems and frameworks (CMS/F) play a key role in Web development. They support common Web operations and provide a number of optional modules to implement customized functionalities. Given the increasing demand for text mining (TM) applications, it seems logical that CMS/F extend their offer of TM modules. In this regard, this work contributes to Drupal CMS/F with modules that support customized named entity recognition and enable the construction of domain-specific document search engines. Implementation relies on well-recognized Apache Information Retrieval and TM initiatives, namely Apache Lucene, Apache Solr and Apache Unstructured Information Management Architecture (UIMA). As proof of concept, we present here the development of a Drupal CMS/F that retrieves biomedical articles and performs automatic recognition of organism names to enable further organism-driven document screening

    Evaluating NLP toxicity tools: Towards the ethical limits

    In recent years we have seen a big evolution in the field of neural networks and in the field of natural language processing (NLP). Solutions such as voice assistants, writing assistants, or chatbots are increasingly present in our daily work. In addition, these techniques are used for more sophisticated analysis such as sentiment classification or hate-speech detection. At the same time, the detection of gender or racial biases in these solutions has created problems and has opened a debate around the limitations and potential of these solutions. The goal of this work is to evaluate the sentiment analysis tools that are available at the time of writing. To achieve this, we have selected a set of tools and compared their usability over a specific dataset focused on bias detection. In addition, we have developed a tool to evaluate these models in a real-world application by integrating them into content management systems (CMS). The developed tool aims to help in the moderation of content in the CMS and is built on a popular CMS distribution (Drupal). Finally, we present a debate around ethics and fairness in sentiment analysis using NLP

    The building and application of a semantic platform for an e-research society

    No full text
    This thesis reviews the area of e-Research (the use of electronic infrastructure to support research) and considers how the insight gained from the development of social networking sites in the early 21st century might assist researchers in using this infrastructure. In particular it examines the myExperiment project, a website for e-Research that allows users to upload, share and annotate workflows and associated files, using a social networking framework. This Virtual Organisation (VO) supports many of the attributes required to allow a community of users to come together to build an e-Research society. The main focus of the thesis is how the emerging society that is developing out of myExperiment could use Semantic Web technologies to provide users with a significantly richer representation of their research and research processes to better support reproducible research. One of the initial major contributions was building an ontology for myExperiment. Through this it became possible to build an API for generating and delivering this richer representation and an interface for querying it. With this richer representation, it has been possible to follow Linked Data principles to link up with other projects that have this type of representation. Doing this has allowed additional data to be provided to the user and has begun to set in context the data produced by myExperiment. The way that the myExperiment project has gone about this task, and its consideration of how changes may affect existing users, is another major contribution of this thesis. Adding a semantic representation to an emergent e-Research society like myExperiment has given it the potential to provide additional applications, in particular the capability to support Research Objects, an encapsulation of a scientist's research or research process to support reproducibility. 
The insight gained by adding a semantic representation to myExperiment has allowed this thesis to contribute towards the design of the architecture for these Research Objects that use similar Semantic Web technologies. The myExperiment ontology has been designed such that it can be aligned with other ontologies. Scientific Discourse, the collaborative argumentation of different claims and hypotheses, with the support of evidence from experiments, to construct, confirm or disprove theories, requires the capability to represent experiments carried out in silico. This thesis discusses how, as part of the HCLS Scientific Discourse subtask group, the myExperiment ontology has begun to be aligned with other scientific discourse ontologies to provide this capability. It also compares this alignment of ontologies with the architecture for Research Objects. This thesis also examines how myExperiment's Linked Data and that of other projects can be used in the design of novel interfaces. As a theoretical exercise, it considers how this Linked Data might be used to support a Question-Answering system that would allow users to query myExperiment's data in a more efficient and user-friendly way. It concludes by reviewing all the steps undertaken to provide a semantic platform for an emergent e-Research society to facilitate the sharing of research and its processes to support reproducible research. It assesses their contribution to enhancing the features provided by myExperiment, as well as e-Research as a whole. It considers how the contributions provided by this thesis could be extended to produce additional tools that will allow researchers to make greater use of the rich data that is now available, in a way that enhances their research process rather than significantly changing it or adding extra workload

    Volume 34, Number 3, September 2014 OLAC Newsletter

    Digitized September 2014 issue of the OLAC Newsletter

    Enabling Scalable Multi-channel Communication through Semantic Technologies

    With the advance of the Web in the direction of social media, the number of communication possibilities has increased exponentially, bringing new challenges and opportunities for companies to build and shape their reputation online as well as to engage and maintain relationships with their customers. In this paper we describe how semantic technologies enable scalable, effective and efficient online communication. We illustrate four different ways in which semantics can be used for this purpose. First, we discuss semantic analysis of communication items based on 'classical' semantics, such as natural language processing. Second, we look at semantics as a channel, viewing Linked Open Data vocabularies not only as terminological assets but as communication channels. Third, semantics provide the methodologies and tools for content modeling by means of ontologies. Finally, semantics through semantic matchmaking enable semi-automatic assignment and distribution of content to channels and vice versa

    DARIAH and the Benelux

    An exploration of online information spaces that support instructional design and teacher professional development

    Members in online communities of practice (CoPs) take advantage of information and communication technologies (ICTs) to exchange practical or work-related knowledge in asynchronous online environments. Practical knowledge represents individuals' mental models allowing them to interact with the environment and perform tasks. With ICTs, practical knowledge accumulates over time and becomes an integral part of online CoPs. Due to ease of implementation, content management systems (CMSs) and social media platforms, primarily Facebook, have enabled the emergence of large online CoPs. However, research has shown that online CoPs are not conducive information spaces for seeking solutions independently, and hashtags used for topic organization are not representative of the wealth of practical knowledge. This three-article dissertation describes design recommendations for supporting the information needs of community members by analyzing the practical knowledge in instructional design and technology (IDT) CoPs that rely on a CMS and the Facebook platform and by conducting usability testing to improve an existing teacher professional development CoP. By applying natural language processing (NLP) and usability testing, quantitative and qualitative approaches were implemented to examine the practical knowledge and help guide the design of information spaces that enable members to search for solutions through better topic representations or categories. The results of the first study showed that the e-learning development CoP emphasized producing online articles related to educational technology and the lack of transparency in evaluating such materials. The results of the second study showed that the four IDT CoPs on the Facebook platform were characterized by the lack of effective topic structures representative of the accumulated knowledge and the lack of community protocols for curating knowledge and taking corrective actions toward misinformation. 
The third study relied on usability testing to design an information space to support educators' ability to align materials with Missouri teacher standards. This three-article dissertation suggests five design features that online CoPs can implement in addressing the shortcomings of asynchronous online environments, including (1) improving topic organization, (2) establishing community protocols, (3) increasing transparency, (4) improving search functions, and (5) leveraging NLP in future web technologies. Lastly, the dissertation discussed the results of the three published studies, offered recommendations for improving online CoPs as conducive information spaces, and provided future directions. Includes bibliographical references

    Provenance XXX
