3 research outputs found

    Generating a Malay sentiment lexicon based on wordnet

    A sentiment lexicon is a vocabulary list of positive and negative words. In opinion mining, the sentiment lexicon is an important resource for the text polarity classification task in a sentiment analysis model. Studies of Malay sentiment analysis are increasing as the volume of sentiment data on social media grows, so demand for a Malay sentiment lexicon is high. However, developing a Malay sentiment lexicon is difficult because Malay language resources are scarce, and various approaches and techniques have been used to generate sentiment lexicons. The objective of this paper is to develop a Malay sentiment lexicon generation algorithm based on WordNet. The method maps WordNet Bahasa to the English WordNet to obtain the offset values of a seed set of sentiment words. The seed set is then expanded through the synonym and antonym semantic relations in the English WordNet. The best result achieves 86.58% agreement with human annotators and a 91.31% F1-measure in word polarity classification. The results show that the proposed algorithm is effective for generating a Malay sentiment lexicon based on WordNet.
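
    The seed-expansion idea in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm: the tiny synonym and antonym tables stand in for English WordNet relations, plain words stand in for synset offsets, and the Malay-to-English mapping is a hypothetical three-entry substitute for WordNet Bahasa. Synonyms inherit the polarity of the word they were reached from, and antonyms receive the opposite polarity.

```python
from collections import deque

# Hypothetical stand-in for the WordNet Bahasa -> English WordNet mapping.
# The paper maps via synset offsets; plain words are used here for brevity.
MALAY_TO_ENGLISH = {"baik": "good", "bagus": "nice", "buruk": "bad"}

# Toy stand-ins for English WordNet synonym and antonym relations.
SYNONYMS = {
    "good": {"fine", "nice"},
    "fine": {"good"},
    "nice": {"good", "pleasant"},
    "pleasant": {"nice"},
    "bad": {"poor"},
    "poor": {"bad"},
}
ANTONYMS = {"good": {"bad"}, "bad": {"good"}}


def expand_lexicon(seeds, synonyms, antonyms):
    """Breadth-first expansion: synonyms keep polarity, antonyms flip it."""
    polarity = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        label = polarity[word]
        for syn in synonyms.get(word, ()):
            if syn not in polarity:
                polarity[syn] = label
                queue.append(syn)
        for ant in antonyms.get(word, ()):
            if ant not in polarity:
                polarity[ant] = -label
                queue.append(ant)
    return polarity


# Map the Malay seed set to English, expand it, then map labels back.
malay_seeds = {"baik": 1}  # +1 = positive, -1 = negative
english_seeds = {MALAY_TO_ENGLISH[w]: p for w, p in malay_seeds.items()}
english_lexicon = expand_lexicon(english_seeds, SYNONYMS, ANTONYMS)
malay_lexicon = {
    ms: english_lexicon[en]
    for ms, en in MALAY_TO_ENGLISH.items()
    if en in english_lexicon
}
print(malay_lexicon)
```

    A word reached by more than one path keeps the first label assigned to it; a fuller implementation would need a rule for resolving conflicting labels, which the abstract does not describe.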

    A proposal for improving the Essence standard through terminology unification (Una propuesta de mejoramiento del estándar Essence mediante el uso de unificación terminológica)

    Context: SEMAT (Software Engineering Method and Theory) promotes a software engineering theory with adequate terminology to improve the transfer of methods and practices between teams. Terminology should be uniform in order to eliminate ambiguity, improve communication among teams, and support new concepts. Method: The process of reaching uniformity is called terminology unification. In this paper we propose a method for improving the Essence standard based on terminology unification. The method comprises four stages: selection of base models and definitions for structuring terms, identification of terminology problems by comparing the base models and definitions, unification of terms among the base models and definitions, and measurement of the gap between the current standard terms and the proposed changes. Results: We propose a set of modifications to the Essence standard in constructs such as alpha state cards, relationships among alphas, and names of activity spaces. Conclusions: By resolving these conflicts, we can define a common, unambiguous terminology for software engineering teams.

    Automating the Human Factors Analysis and Classification System (HFACS): An Initial Investigation Based on Error Reports in Radiation Oncology

    This study evaluates two datasets of error reports from the University of North Carolina School of Medicine's Department of Radiation Oncology. The errors were reported in accordance with the Human Factors Analysis and Classification System (HFACS), using HFACS Level 1 and HFACS Level 2 codes. Keywords from an initial dataset of 58 reports, a list of HFACS theory keywords, and a list of related thesaurus words were used to develop a dictionary of signal words. These words relate to the HFACS Level 1 code for "Condition of Operator" and the HFACS Level 2 code for "Inattention-Distraction." The words were evaluated for relevance to the "Inattention-Distraction" category, both at face value and in the context of "Inattention-Distraction" reports from the initial dataset of 58 reports. The signal words were then evaluated a second time in the context of a second dataset of 3459 reports, to confirm their contextual relevance. The findings suggest that, while more data is needed for future research, this approach could lead to the development of an automated and more user-friendly HFACS reporting system at UNC School of Medicine's Department of Radiation Oncology.
    Master of Science in Information Science
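
    The signal-word dictionary described in this abstract suggests a simple keyword-matching step, sketched below. Everything here is an assumption for illustration: the signal words, the two example report sentences, and the function name `find_signal_words` are made up and do not come from the study's dictionary or datasets.

```python
import re

# Hypothetical signal words for the "Inattention-Distraction" category.
SIGNAL_WORDS = {"distracted", "interrupted", "forgot", "rushed"}


def find_signal_words(report, signal_words):
    """Return the signal words that occur in a report, in sorted order."""
    tokens = re.findall(r"[a-z]+", report.lower())
    return sorted(set(tokens) & signal_words)


# Made-up example reports, not taken from the study's data.
reports = [
    "Therapist was interrupted by a phone call and forgot to verify the setup.",
    "Plan was approved after the physics check.",
]
flags = [find_signal_words(text, SIGNAL_WORDS) for text in reports]
for text, hits in zip(reports, flags):
    print(hits, "<-", text)
```

    Exact token matching like this misses inflected forms (for example "distraction" versus "distracted"), which is one reason the study checks each signal word in the context of actual reports rather than at face value alone.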