110 research outputs found

    A Review of the Analytics Techniques for an Efficient Management of Online Forums: An Architecture Proposal

    E-learning is a response to the new educational needs of society and an important development in information and communication technologies, as it represents the future of teaching and learning processes. However, this trend presents many challenges, such as the processing of online forums, which generate a huge number of messages with an unordered structure and a great variety of topics. These forums provide an excellent platform for learning and for connecting students of a subject, but the difficulty of following and searching the vast volume of information they generate may be counterproductive. The main goal of this paper is to review the approaches and techniques related to online courses and to present a set of learning analytics techniques and a general architecture that address the main challenges found in the state of the art by managing forums more efficiently: 1) efficient tracking and monitoring of the forums generated; 2) design of effective search mechanisms for questions and answers in the forums; and 3) extraction of relevant key performance indicators for efficient management of online forums. In our proposal, natural language processing, clustering, information retrieval, question answering, and data mining techniques will be used. This work was supported in part by the Spanish Ministry of Economy and Competitiveness through the Project SEQUOIA-UA under Grant TIN2015-63502-C3-3-R, the Project RESCATA under Grant TIN2015-65100-R, and the Project PROMETEO/2018/089, and in part by the Spanish Research Agency (AEI) and the European Regional Development Fund (FEDER) through the Project CloudDriver4Industry under Grant TIN2017-89266-R.
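    As a hedged illustration of how two of the technique families named above (natural language processing features and clustering) might be combined to make forum tracking easier - the library choices, parameters, and sample messages are assumptions, not the proposed architecture's actual implementation - forum posts could be grouped by topic like this:

```python
# Hypothetical sketch: grouping forum messages by topic with TF-IDF
# features and k-means, so that tracking and search become easier.
# Library choices, parameters, and data are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

messages = [
    "How do I submit assignment 2?",
    "Is the deadline for assignment 2 on Friday?",
    "Error installing the Java SDK on Windows",
    "JDK installation fails with a path error",
]

# Represent each message as a TF-IDF vector over its words.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(messages)

# Cluster the messages into topic groups (k chosen arbitrarily here).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

for msg, label in zip(messages, labels):
    print(label, msg)
```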

    The Influence of Nursing Home Administrator Turnover on Resident Quality of Life

    By 2040, 79.7 million older adults will live in the US, and nearly 40% will need nursing home services, which are primarily funded by Medicare and Medicaid. Researchers have underscored the importance of leadership in quality healthcare delivery, suggesting that nursing home administrator turnover could influence resident quality of life, causing ill health for residents and preventable medical costs for taxpayers. In spite of the suggested association, little research has specifically examined the role of administrator turnover in resident quality of life. As such, the purpose and central research questions of this case study were designed specifically to address the relationship between nursing home administrator turnover and resident quality of life. The Donabedian health services quality model was the framework for the study. Data were collected from 14 nursing homes and included semistructured interviews with 7 nursing home administrators and a review of other documents related to quality of care, including site visit reports and surveys. An iterative process of coding and constant comparison was used to identify themes and categories in the data. The findings indicate that turnover likely had an adverse impact on the nursing home overall, which was expected. The study also determined, however, that high turnover itself was not perceived to be associated with low resident quality of life. The implication for social change is that nursing home stakeholders may develop processes to retain competent administrators, which in turn could reduce gaps in leadership presence in nursing homes. Consistent leadership presence may lead to improvement in quality-of-life regulatory compliance and a reduction in unnecessary Medicare and Medicaid spending by nursing home residents.

    SOCIALQ&A: A NOVEL APPROACH TO NOTIFYING THE CORRECT USERS IN QUESTION AND ANSWERING SYSTEMS

    Question and Answering (Q&A) systems are currently in use by a large number of Internet users. Q&A systems play a vital role in our daily life as an important platform for information and knowledge sharing. Hence, much research has been devoted to improving the performance of Q&A systems, with a focus on improving the quality of answers provided by users, reducing the wait time for users who ask questions, using a knowledge base to provide answers via text mining, and directing questions to appropriate users. Due to the growing popularity of Q&A systems, the number of questions in the system can become very large; thus, it is unlikely that an answer provider will simply stumble upon a question that he/she can answer properly. The primary objective of this research is to improve the quality of answers and to decrease wait times by forwarding questions to users who exhibit an interest or expertise in the area to which the question belongs. To that end, this research studies how to leverage social networks to enhance the performance of Q&A systems. We have proposed SocialQ&A, a social network based Q&A system that identifies and notifies the users who are most likely to answer a question. SocialQ&A incorporates three major components: User Interest Analyzer, Question Categorizer, and Question-User Mapper. The User Interest Analyzer associates each user with a vector of interest categories. The Question Categorizer algorithm associates a vector of interest categories with each question. Then, based on user interest and user social connectedness, the Question-User Mapper identifies a list of potential answer providers for each question. We have also implemented a real-world prototype of SocialQ&A and analyzed the data from questions and answers obtained from the prototype. Results suggest that social networks can be leveraged to improve the quality of answers and reduce the wait time for answers. Thus, this research provides a promising direction for improving the performance of Q&A systems.
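    The abstract names the three components but not their internals; the following minimal sketch shows one way a Question-User Mapper could rank potential answer providers, assuming interest vectors are already available and using an invented weighted score:

```python
# Hypothetical sketch of a Question-User Mapper: rank users by how well
# their interest vector matches the question's category vector, weighted
# by social connectedness to the asker. The scoring formula is an assumption.
import numpy as np

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

# Interest-category vectors (e.g., [sports, tech, travel]).
user_interests = {
    "alice": np.array([0.1, 0.9, 0.0]),
    "bob":   np.array([0.7, 0.1, 0.2]),
}
# Social connectedness of each user to the asker, in [0, 1].
connectedness = {"alice": 0.4, "bob": 0.9}

question_vec = np.array([0.0, 1.0, 0.0])  # a tech question

def rank_answerers(q_vec, alpha=0.7):
    # alpha balances interest match against social connectedness.
    scores = {
        u: alpha * cosine(q_vec, v) + (1 - alpha) * connectedness[u]
        for u, v in user_interests.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(rank_answerers(question_vec))  # alice ranks first for a tech question
```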

    Justification for Class 3 Permit Modification, Corrective Action Complete with Controls, Solid Waste Management Unit 76, Mixed Waste Landfill, Sandia National Laboratories/New Mexico, EPA ID Number NM5890110518 Volumes I through VIII

    The Department of Energy/National Nuclear Security Administration (DOE) and Sandia Corporation (Sandia) are submitting a request for a Class 3 Modification to Module IV of Hazardous Waste Permit NM5890110518-1 (the Permit). DOE and Sandia are requesting that the New Mexico Environment Department (NMED) designate solid waste management unit (SWMU) 76 as approved for Corrective Action Complete status. NMED made a preliminary determination in October 2014 that corrective action is complete at this SWMU. SWMU 76, known as the Mixed Waste Landfill (MWL), is a 2.6-acre site at Sandia National Laboratories, located on Kirtland Air Force Base immediately southeast of Albuquerque, New Mexico. Radioactive wastes and mixed wastes (radioactive wastes that are also hazardous wastes) were disposed of in the MWL from March 1959 through December 1988. The maximum depth of burial is approximately 25 feet below the ground surface. Groundwater occurs approximately 500 feet below the ground surface at the MWL. DOE and Sandia have implemented corrective measures at SWMU 76 in accordance with the requirements of the Permit; an April 2004 Compliance Order on Consent between NMED, DOE, and Sandia; and the plans approved by NMED. On January 8, 2014, NMED approved a long-term monitoring and maintenance plan (LTMMP) for SWMU 76. DOE and Sandia have implemented the approved LTMMP, maintaining the controls established through the corrective measures. The permit modification request consists of a letter with two enclosures: 1) a brief history of corrective action at SWMU 76, and 2) an index of the supporting documents that comprise the justification for the permit modification request. The supporting documents are included in an 8-volume set: Justification for Class 3 Permit Modification for Corrective Action Complete With Controls, Solid Waste Management Unit 76, Mixed Waste Landfill. Volume/pages: I/858, II/420, III/556, IV/1128, V/848, VI/1110, VII/914, VIII/866.

    COMPLEX QUESTION ANSWERING BASED ON A SEMANTIC DOMAIN MODEL OF CLINICAL MEDICINE

    Much research in recent years has focused on question answering. Due to significant advances in answering simple fact-seeking questions, research is moving towards resolving complex questions. An approach adopted by many researchers is to decompose a complex question into a series of fact-seeking questions and reuse techniques developed for answering simple questions. This thesis presents an alternative novel approach to domain-specific complex question answering based on consistently applying a semantic domain model to question and document understanding as well as to answer extraction and generation. This study uses a semantic domain model of clinical medicine to encode (a) a clinician's information need expressed as a question and (b) the meaning of scientific publications, yielding a common representation. It is hypothesized that this approach will work well for (1) finding documents that contain answers to clinical questions and (2) extracting these answers from the documents. The domain of clinical question answering was selected primarily because of its unparalleled resources, which permit providing a proof by construction for this hypothesis. In addition, a working prototype of a clinical question answering system will support research in informed clinical decision making. The proposed methodology is based on the semantic domain model developed within the paradigm of Evidence-Based Medicine. Three basic components of this model - the clinical task, a framework for capturing a synopsis of the clinical scenario that generated the question, and the strength of evidence presented in an answer - are identified and discussed in detail. Algorithms and methods were developed that combine knowledge-based and statistical techniques to extract the basic components of the domain model from abstracts of biomedical articles. These algorithms serve as the foundation for the prototype end-to-end clinical question answering system that was built and evaluated to test the hypotheses. Evaluation of the system on test collections developed in the course of this work and based on real-life clinical questions demonstrates the feasibility of complex question answering and high-accuracy information retrieval using a semantic domain model.
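    As an illustrative sketch of the frame-matching idea (the field names, weights, and matching rule below are invented; the thesis's actual domain model is richer), a question and a document could be encoded into a common representation and scored:

```python
# Hypothetical sketch: encode a clinical question and a citation into a
# common frame (clinical task plus scenario elements plus strength of
# evidence) and score the match. Field names, weights, and the matching
# rule are assumptions, not the thesis's actual model.
from dataclasses import dataclass

@dataclass
class ClinicalFrame:
    task: str                      # e.g., "therapy", "diagnosis"
    problem: str = ""              # condition in the clinical scenario
    intervention: str = ""         # treatment under consideration
    strength_of_evidence: int = 3  # 1 (strong) .. 3 (weak)

def frame_match(question: ClinicalFrame, doc: ClinicalFrame) -> float:
    score = 0.0
    if question.task == doc.task:
        score += 0.5
    if question.problem and question.problem == doc.problem:
        score += 0.3
    if question.intervention and question.intervention == doc.intervention:
        score += 0.2
    # Among equally relevant documents, prefer stronger evidence.
    return score - 0.01 * doc.strength_of_evidence

q = ClinicalFrame(task="therapy", problem="otitis media",
                  intervention="amoxicillin")
d = ClinicalFrame(task="therapy", problem="otitis media",
                  intervention="amoxicillin", strength_of_evidence=1)
print(frame_match(q, d))  # higher scores suggest better candidate answers
```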

    A Cross-domain and Cross-language Knowledge-based Representation of Text and its Meaning

    Thesis by compendium. Natural Language Processing (NLP) is a field of computer science, artificial intelligence, and computational linguistics concerned with the interactions between computers and human languages. One of its most challenging aspects involves enabling computers to derive meaning from human natural language. To do so, several meaning or context representations have been proposed with competitive performance. However, these representations still have room for improvement when working in cross-domain or cross-language scenarios. In this thesis we study the use of knowledge graphs as a cross-domain and cross-language representation of text and its meaning. A knowledge graph is a graph that expands and relates the original concepts belonging to a set of words. We obtain its characteristics using a wide-coverage multilingual semantic network as the knowledge base. This provides coverage of hundreds of languages and millions of general and specific concepts. As the starting point of our research we employ knowledge graph-based features - along with other traditional ones and meta-learning - for the NLP task of single- and cross-domain polarity classification. The analysis and conclusions of that work provide evidence that knowledge graphs capture meaning in a domain-independent way. The next part of our research takes advantage of the multilingual semantic network and focuses on cross-language Information Retrieval (IR) tasks. First, we propose a fully knowledge graph-based model of similarity analysis for cross-language plagiarism detection. Next, we improve that model to cover out-of-vocabulary words and verbal tenses and apply it to cross-language document retrieval, categorisation, and plagiarism detection. Finally, we study the use of knowledge graphs for the NLP tasks of community question answering, native language identification, and language variety identification. The contributions of this thesis demonstrate the potential of knowledge graphs as a cross-domain and cross-language representation of text and its meaning for NLP and IR tasks. These contributions have been published in several international conferences and journals. Franco Salvador, M. (2017). A Cross-domain and Cross-language Knowledge-based Representation of Text and its Meaning [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/84285
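    As a toy illustration of the knowledge-graph representation (the mini semantic network and Jaccard scoring below are assumptions standing in for the wide-coverage multilingual resource), two texts in different languages can be compared through the concepts their words expand to:

```python
# Hypothetical sketch: expand each text's words into concepts via a tiny
# stand-in semantic network, then compare the resulting "knowledge graphs"
# (reduced here to concept sets) with Jaccard similarity. The mini-network
# is a toy; the thesis uses a wide-coverage multilingual semantic network.
semantic_network = {
    "dog":   {"Canine", "Pet", "Animal"},
    "perro": {"Canine", "Pet", "Animal"},   # Spanish for "dog"
    "barks": {"AnimalSound"},
    "ladra": {"AnimalSound"},               # Spanish for "barks"
}

def knowledge_concepts(text):
    concepts = set()
    for word in text.lower().split():
        concepts |= semantic_network.get(word, set())
    return concepts

def graph_similarity(text_a, text_b):
    a, b = knowledge_concepts(text_a), knowledge_concepts(text_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Cross-language comparison: both sentences map to the same concepts.
print(graph_similarity("the dog barks", "el perro ladra"))  # -> 1.0
```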

    User behavior modeling: Towards solving the duality of interpretability and precision

    User behavior modeling has become an indispensable tool with the proliferation of socio-technical systems to provide a highly personalized experience to users. These socio-technical systems are used in sectors as diverse as education, health, law, e-commerce, and social media. The two main challenges for user behavioral modeling are building an in-depth understanding of online user behavior and using advanced computational techniques to capture behavioral uncertainties accurately. This thesis addresses both challenges by developing interpretable models that aid in understanding user behavior at scale and by developing sophisticated models that accurately model user behavior. Specifically, we propose two distinct interpretable approaches to understand explicit and latent user behavioral characteristics. First, in Chapter 3, we propose an interpretable Gaussian Hidden Markov Model-based cluster model leveraging user activity data to identify users with similar patterns of behavioral evolution. We apply our approach to identify researchers with similar patterns of evolution in their research interests. We further show the utility of our interpretable framework by identifying differences in gender distribution and the value of awarded grants among the identified archetypes. We also demonstrate the generality of our approach by applying it to StackExchange to identify users with similar changes in usage patterns. Next, in Chapter 4, we estimate latent user behavioral characteristics by leveraging user-generated content (questions and answers) in Community Question Answering (CQA) platforms. In particular, we estimate the latent aspect-based reliability representations of users in the forum to infer the trustworthiness of their answers. We also simultaneously learn the semantic meaning of their answers through text representations. We empirically show that the estimated behavioral representations can accurately identify topical experts. We further propose to improve current behavioral models by modeling explicit and implicit user-to-user influence on user behavior. To this end, in Chapter 5, we propose a novel attention-based approach to incorporate influence from both a user's social connections and other similar users on their preferences in recommender systems. Additionally, we incorporate implicit influence in the item space by considering frequently co-occurring and similar feature items. Our modular approach captures the different influences efficiently and later fuses them in an interpretable manner. Extensive experiments show that incorporating user-to-user influence outperforms approaches relying solely on user data. User behavior remains broadly consistent across a platform; thus, incorporating user behavioral information can be beneficial for estimating the characteristics of user-generated content. To verify this, in Chapter 6, we focus on the task of best answer selection in CQA forums, which traditionally considers only textual features. We induce multiple connections between items of user-generated content, i.e., answers, based on similarity and contrast in the behavior of the authoring users on the platform. These induced connections enable information sharing between connected answers and, consequently, aid in estimating answer quality. We also develop convolution operators to encode these semantically different graphs and later merge them using boosting.
Finally, in Chapter 7, we propose an alternative approach to incorporating user behavioral information by jointly estimating the latent behavioral representations of users together with text representations. We evaluate our approach on the offensive language prediction task on Twitter. Specifically, we learn an improved text representation by leveraging syntactic dependencies between the words in a tweet. We also estimate the abusive behavior of users, i.e., their likelihood of posting offensive content online, from their tweets. We further show that combining textual and user behavioral features can outperform sophisticated textual baselines.
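    As a hedged sketch of one ingredient of Chapter 4 (the counting and smoothing scheme below is an assumed simplification of the learned latent representations), a user's aspect-based reliability could be seeded from how their past answers on each topic were received:

```python
# Hypothetical sketch: estimate per-aspect (topic) user reliability from
# answer feedback, a simplified stand-in for the latent representations
# learned in the thesis. The smoothing scheme is an assumption.
from collections import defaultdict

# (user, topic, answer_was_accepted) tuples from a CQA platform.
history = [
    ("u1", "python", True), ("u1", "python", True), ("u1", "sql", False),
    ("u2", "python", False), ("u2", "sql", True), ("u2", "sql", True),
]

accepted = defaultdict(int)
total = defaultdict(int)
for user, topic, ok in history:
    total[(user, topic)] += 1
    accepted[(user, topic)] += ok

def reliability(user, topic, prior=0.5, prior_weight=2):
    # Smoothed acceptance rate: pulls sparse users toward the prior.
    n = total[(user, topic)]
    return (accepted[(user, topic)] + prior * prior_weight) / (n + prior_weight)

print(reliability("u1", "python"))  # 0.75 -> more trustworthy on python
print(reliability("u2", "python"))  # 0.33
```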

    Utilization of common human queries in ranking automatically generated questions

    We address a form of the paragraph-to-question generation task. We propose a question generation system that can be used to generate questions from a body of text. Our goal is to rank the generated questions by using community-based question answering systems to calculate the importance of the questions, alongside tree kernel functions to assess how grammatically correct they are. The main assumption our project is based on is that each body of text is related to a topic of interest and contains comprehensive information about that topic.
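    A minimal sketch of the ranking idea follows; both scoring functions are toy stand-ins for the tree-kernel grammaticality component and the CQA-based importance component described above:

```python
# Hypothetical sketch: rank generated questions by combining an importance
# score (overlap with queries in a CQA corpus) with a grammaticality score.
# Both scorers are invented simplifications of the system's components.

cqa_corpus = [
    "what is the capital of france",
    "what is the population of france",
    "how tall is the eiffel tower",
]

def importance(question):
    # Fraction of corpus queries sharing at least one word with the question.
    q_words = set(question.lower().split())
    hits = sum(1 for q in cqa_corpus if q_words & set(q.split()))
    return hits / len(cqa_corpus)

def grammaticality(question):
    # Toy proxy: well-formed questions start with a wh-word and end with "?".
    wh_words = {"what", "who", "when", "where", "why", "how"}
    starts_ok = question.lower().split()[0] in wh_words
    return 0.5 * starts_ok + 0.5 * question.endswith("?")

def rank(questions, w=0.6):
    scored = [(w * importance(q) + (1 - w) * grammaticality(q), q)
              for q in questions]
    return sorted(scored, reverse=True)

generated = ["What is the capital of France?", "France capital the is what"]
for score, q in rank(generated):
    print(round(score, 2), q)
```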

    Improving collaborative filtering music recommender systems: A focus on user characterization based on behavioral and contextual factors

    The popularization of the digital distribution of multimedia content, known as streaming, gives an ever-growing number of users access to virtually all existing music from anywhere, without the limitation of device storage capacity. This enormous availability, together with the great variety of providers of these services, makes it very difficult for users to find music that fits their tastes. Hence the strong current interest in developing recommendation algorithms that help users filter and discover, from the enormous amount of musical content available in the digital space, the music that matches their preferences. Most platforms provide search services, and some offer recommendation mechanisms and personalized playlists, although many improvements are still needed. The methods used in recommender systems are highly varied, with those based on collaborative filtering (CF) among the most widespread. The recommendations they provide are based on the ratings that users give to the items to be recommended, which in music recommender systems are songs or artists. Recommendations for a given user are based on the ratings made by other users with similar tastes. The results of this kind of technique are quite good; however, the difficulty of obtaining explicit item ratings from users means that the number of ratings is insufficient, causing sparsity problems that prevent or hinder the application of such methods. For this reason, implicit ways of obtaining this information are sometimes used, which are usually complex and not always effective. Other problems, caused by the incorporation of new users or new products into the system, are cold start and first rater, respectively. To this must be added the difficulty of offering reliable recommendations to users with unusual tastes (gray sheep users). To address these problems, content-based algorithms have been proposed as an alternative to CF methods. These methods can be used to recommend any item by making use of its features, so that the user receives recommendations of items similar to others in which they have shown interest in the past. Most current recommender systems use hybrid techniques intended to exploit the advantages of both approaches and avoid their drawbacks. These methods make use of item and user attributes in addition to rating information. This work focuses on user characterization with the aim of increasing the degree of personalization and thus improving the recommendations provided by collaborative filtering methods. The proposals presented, although they could be extended to other application domains, focus on the music domain because the way music is consumed differs significantly from the way other products are consumed and, consequently, some aspects of recommendation also differ.
The different approaches proposed for characterizing users have in common that they require only the information available on music streaming platforms, with no need for additional data such as user demographics or item attributes. In addition, explicit ratings of music items are not available and must be obtained implicitly from each user's plays of artists or songs. The first proposal addresses the gray sheep problem by characterizing users according to the popularity of the music they listen to, which is closely related to the power-law distribution of item play frequency. This approach is applicable to recommending both artists and songs, and in the latter case recommendations can be improved by taking into account the position of songs within user sessions. Time is another important factor related to user behavior and habits. The proposal for improving recommendation methods with respect to this factor is approached from three user-centered perspectives: modeling the evolution of user preferences, modeling listening habits as a function of time, and using time as a contextual variable to generate context-aware recommendations. The preference evolution model is included in the process of deriving implicit ratings. Another way to characterize users is through their social context. Music streaming platforms do not hold much information of this kind; however, the available data on friendship relations and social tagging can be used for this purpose. Specifically, this information has been used in this work to model users' degree of influence, based on the properties of trust and homophily, and their level of expertise, respectively. Although the methods presented are not specifically designed to address the cold start problem, some of them have been tested in this scenario, showing that they also help to mitigate that problem.
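    As a hedged illustration of two ingredients described above (the data, rating scale, and formulas are invented assumptions, not the thesis's actual estimators), implicit ratings might be derived from play counts and users characterized by the global popularity of what they play:

```python
# Hypothetical sketch: derive implicit ratings from play counts and
# characterize users by the popularity of what they listen to, to flag
# potential "gray sheep" users. All values and formulas are illustrative.
from collections import Counter

# user -> {artist: play count}
plays = {
    "u1": {"radiohead": 50, "obscure_band": 2},
    "u2": {"obscure_band": 40, "rare_act": 30},
}

# Global popularity = total plays per artist (power-law distributed in practice).
popularity = Counter()
for counts in plays.values():
    popularity.update(counts)

def implicit_rating(user, artist):
    # Map the user's relative play frequency to a 1-5 rating.
    total = sum(plays[user].values())
    share = plays[user].get(artist, 0) / total
    return min(5, 1 + int(share * 10))

def mainstreamness(user):
    # Play-weighted mean global popularity of the user's artists;
    # low values suggest a gray sheep user.
    counts = plays[user]
    total = sum(counts.values())
    return sum(popularity[a] * c for a, c in counts.items()) / total

print(implicit_rating("u1", "radiohead"))        # 5
for u in plays:
    print(u, round(mainstreamness(u), 1))
```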

    A Computational Model of Trust Based on Dynamic Interaction in the Stack Overflow Community

    A member's reputation in an online community is a quantified representation of their trustworthiness within the community. Reputation is calculated using rules-based algorithms that are primarily tied to the upvotes or downvotes a member receives on posts. The main drawback of this form of reputation calculation is its inability to consider dynamic factors such as a member's activity (or inactivity) within the community. This research involves the construction of dynamic mathematical models to calculate reputation and then determines to what extent their results compare with those of rules-based models. It begins with exploratory research into the existing corpus of knowledge, followed by constructive research to build dynamic mathematical models, and then empirical research to determine the effectiveness of the models. Data collected from the Stack Overflow (SO) database are used by the models to calculate rules-based and dynamic member reputations, and statistical correlation tests (i.e., Pearson and Spearman) are then applied to determine the extent of the relationship. Statistically significant results with a moderate relationship size were found from correlation testing between the rules-based and dynamic temporal models. The significance of the research and its conclusion that dynamic and temporal models can indeed produce results comparable to those of subjective vote-based systems is important in the context of building trust in online communities. Developing models to determine reputation in online communities based upon member post and comment activity avoids the potential drawbacks associated with vote-based reputation systems.
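    A minimal sketch of the comparison step follows (the reputation values are invented; scipy.stats provides the two correlation tests named above):

```python
# Hypothetical sketch: correlate rules-based reputation with a dynamic,
# activity-aware score, as in the study's comparison step. The example
# values are invented; only the test procedure mirrors the abstract.
from scipy.stats import pearsonr, spearmanr

# Reputation for the same five members under each model.
rules_based = [12450, 8300, 5100, 940, 120]
dynamic     = [0.91, 0.78, 0.80, 0.35, 0.05]

r, p_r = pearsonr(rules_based, dynamic)
rho, p_rho = spearmanr(rules_based, dynamic)

print(f"Pearson r = {r:.2f} (p = {p_r:.3f})")
print(f"Spearman rho = {rho:.2f} (p = {p_rho:.3f})")
```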