11 research outputs found

    Intelligent RSS Tool

    Project carried out in collaboration with Aalto University. Easy access to the wide range of information available online encourages people to explore it in search of ever more interesting content. This opportunity often leads to the problem of finding interesting and relevant information in a sea of knowledge. This is often called the information overload problem, and it becomes harder to deal with as the amount of information available online grows. In this thesis, one source of information is exploited and organized so that discovering new content becomes easier. We use Really Simple Syndication (RSS) as our source of information and two methods to categorize it: document clustering with K-Means and Latent Dirichlet Allocation (LDA). We use the textual information that the RSS feeds contain; each RSS feed usually covers a specific set of topics. Our first goal is to perform document clustering on the data in order to generate meaningful clusters, using natural language processing (NLP) techniques to preprocess the data. Our second goal is to analyze the clustered RSS feeds and exploit the similarities between the documents to generate meaningful user models based on user feed subscriptions. The third goal is to provide relevant recommendations based on the user models we have learned. We combine current state-of-the-art methods and present novel methods to compare feeds. Our novel method exploits WordNet shallow ontologies to create generalized representations of the feeds. The final goal is to develop a functional application that applies the methods we developed with the help of machine learning libraries.
The method we propose is a combination of document clustering techniques, text similarity, feed modeling and a recommendation system. The results of our experiments show that K-Means clustered documents combined with recommendations based on the feed contents yield the best results. Using WordNet to measure the similarity of words also provides promising results. Further exploring the advantages of semantic similarities would be an interesting research topic for document similarity measures.
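
The clustering step named in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes plain term-frequency vectors, Euclidean distance and a deterministic start from the first k documents, and the function names (`tokenize`, `vectorize`, `kmeans`) and the sample texts are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and keep purely alphabetic tokens (a stand-in for real NLP preprocessing).
    return [w for w in text.lower().split() if w.isalpha()]


def vectorize(docs):
    # Build one term-frequency vector per document over a shared, sorted vocabulary.
    vocab = sorted({w for d in docs for w in tokenize(d)})
    vectors = []
    for d in docs:
        counts = Counter(tokenize(d))
        total = sum(counts.values()) or 1
        vectors.append([counts[w] / total for w in vocab])
    return vectors


def kmeans(vectors, k, iterations=20):
    # Assumption: the first k documents serve as initial centroids (deterministic).
    centroids = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: each document goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical feed item texts: two about sport, two about finance.
docs = [
    "football match goal team",
    "stock market shares fall",
    "team wins football cup",
    "market shares rise stock",
]
labels = kmeans(vectorize(docs), k=2)
```

With these sample texts the two sport items share one label and the two finance items share the other. A real pipeline would likely add stop-word removal, stemming and TF-IDF weighting before clustering.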

    An Intelligent Technique for Extracting Subjects from User Profile Using ODP Ontology-Driven Reasoning

    Abstract: The amount of available information, especially on the Web, keeps increasing. In this field, user modeling and personalized information access play a vital role. Traditional techniques such as bag of words (BOW) limit recommendations to the words that have been stored in the profile. In other words, news items that relate semantically to the user's interests cannot be recognized and recommended to the user. In addition, the BOW technique suffers from the curse of dimensionality, so reducing the computational burden is essential for handling a large number of terms efficiently in practical applications. This study focuses on the problem of choosing a document representation that is suitable both for inducing concept-based user profiles and for supporting a content-based retrieval process. A new approach is proposed that constructs a ranked semantic user profile by extracting the related subjects. New items can be recommended by collecting information from the user's selections, based on the existing domain ontology ODP. The efficiency of the proposed technique is shown by embedding it into an intelligent aggregator, an RSS (Really Simple Syndication) feed reader, which was trained and evaluated by different, heterogeneous users. The experimental results show that an incoming news item that relates semantically to the profile is highly recommended to the user despite sharing no common words with the profile.

    Differences in intention to use educational RSS feeds between Lebanese and British students: A multi‑group analysis based on the technology acceptance model

    Really Simple Syndication (RSS) offers a means for university students to receive timely updates from virtual learning environments. However, despite its utility, only 21% of home students surveyed at a university in Lebanon claim to have ever used the technology. To investigate whether national culture could influence intention to use RSS, the survey was extended to British students in the UK. Using the Technology Acceptance Model (TAM) as a research framework, 437 students responded to a questionnaire containing four constructs: behavioral intention to use; attitude towards benefit; perceived usefulness; and perceived ease of use. Principal components analysis and structural equation modelling were used to explore the psychometric qualities and utility of TAM in both contexts. The results show that adoption was significantly higher, but also modest, in the British context at 36%. Configural and metric invariance were fully supported, while scalar and factorial invariance were partially supported. Further analysis shows significant differences between perceived usefulness and perceived ease of use across the two contexts studied. Therefore, it is recommended that faculty demonstrate to students how educational RSS feeds can be used effectively, to increase awareness and emphasize usefulness in both contexts.

    Development and evaluation of a technological solution to improve access to digital content for socially excluded people

    The aim of this master's thesis (TFM) is the development, deployment and subsequent evaluation of a novel tool called RSS_PROYECT, based on RSS content syndication technology. Dozens of Web content syndicators exist today, but none is designed to search for news about women at risk of social exclusion, and none offers the features and degree of configurability of the RSS_PROYECT syndicator presented in this paper. To obtain content, we use two filters (generic and selective), configured by the administrator from the RSS_PROYECT module installed in Joomla. The generic filter searches for the entered words in a set of sources indexed by the user. This filter shows every source that contains the word, without exception. With the selective filter, a source is shown only if it contains all of the words in the filter. To carry out this project we used several languages and technologies, including PHP, MySQL, HTML, XML and the Joomla! Application Program Interface (API). We also used the Firebug tool to measure the module's response time in two cases: with the selective filter and with the generic filter. The results were favourable for the selective filter and very favourable for the generic filter, leading to the conclusion that processing time is low and the module runs efficiently. The RSS_PROYECT module achieved better average times than the other Joomla! modules analysed. Today, the tool is used by the Centro Integral de Ayuda a la Mujer (CIAM) in Valladolid, Spain. Teoría de la Señal y Comunicaciones e Ingeniería Telemática. Máster en Investigación en Tecnologías de la Información y las Comunicaciones
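
The two filters described in this abstract can be sketched as follows. This is an illustration only, not the RSS_PROYECT code (which the abstract says is a PHP module for Joomla): it reads the generic filter as "at least one entered word matches" and the selective filter as "every entered word matches", and the function names and sample headlines are hypothetical.

```python
import re


def _words(text):
    # Split a text into a set of lowercase word tokens.
    return set(re.findall(r"\w+", text.lower()))


def generic_filter(items, keywords):
    # Keep every item that contains at least one of the keywords (assumed reading).
    wanted = {k.lower() for k in keywords}
    return [item for item in items if wanted & _words(item)]


def selective_filter(items, keywords):
    # Keep only the items that contain all of the keywords.
    wanted = {k.lower() for k in keywords}
    return [item for item in items if wanted <= _words(item)]


# Hypothetical feed headlines.
items = [
    "Women at risk of exclusion receive new support",
    "New support centre opens in Valladolid",
    "Local football results",
]
any_match = generic_filter(items, ["women", "support"])
all_match = selective_filter(items, ["women", "support"])
```

Here `any_match` holds the first two headlines and `all_match` holds only the first. Note that with an empty keyword list this sketch returns nothing from the generic filter and everything from the selective filter; a real module would probably treat that case explicitly.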

    Design implications for task-specific search utilities for retrieval and re-engineering of code

    The importance of information retrieval systems in modern society is unquestionable, and both individuals and enterprises recognise the benefits of being able to find information effectively. Current code-focused information retrieval systems such as Google Code Search, Codeplex or Koders produce results based on specific keywords. However, these systems do not take into account the developer's context, such as development language, technology framework, goal of the project, project complexity and the developer's domain expertise. They also impose an additional cognitive burden on users, who must switch between different interfaces and click through to find the relevant code. Hence, they are not used by software developers. In this paper, we discuss how software engineers interact with information and general-purpose information retrieval systems (e.g. Google, Yahoo!) and investigate to what extent domain-specific search and recommendation utilities can be developed to support their work-related activities. To investigate this, we conducted a user study and found that software engineers followed many identifiable and repeatable work tasks and behaviours. These behaviours can be used to develop implicit relevance feedback-based systems based on the observed retention actions. Moreover, we discuss the implications for the development of task-specific search and collaborative recommendation utilities embedded with the Google standard search engine and Microsoft IntelliSense for retrieval and re-engineering of code. Based on implicit relevance feedback, we have implemented a prototype of the proposed collaborative recommendation system, which was evaluated in a controlled environment simulating the real-world situation of professional software engineers. The evaluation has achieved promising initial results on the precision and recall performance of the system.

    A series of case studies to enhance the social utility of RSS

    RSS (really simple syndication, rich site summary or RDF site summary) is a dialect of XML that provides a method of syndicating on-line content, where postings consist of frequently updated news items, blog entries and multimedia. RSS feeds, produced by organisations or individuals, are often aggregated, and delivered to users for consumption via readers. The semi-structured format of RSS also allows the delivery/exchange of machine-readable content between different platforms and systems. Articles on web pages frequently include icons that represent social media services which facilitate social data. Amongst these, RSS feeds deliver data which is typically presented in the journalistic style of headline, story and snapshot(s). Consequently, applications and academic research have employed RSS on this basis. Therefore, within the context of social media, the question arises: can the social function, i.e. utility, of RSS be enhanced by producing from it data which is actionable and effective? This thesis is based upon the hypothesis that the fluctuations in the keyword frequencies present in RSS can be mined to produce actionable and effective data, to enhance the technology's social utility. To this end, we present a series of laboratory-based case studies which demonstrate two novel and logically consistent RSS-mining paradigms. Our first paradigm allows users to define mining rules to mine data from feeds. The second paradigm employs a semi-automated classification of feeds and correlates this with sentiment. We visualise the outputs produced by the case studies for these paradigms, where they can benefit users in real-world scenarios, varying from statistics and trend analysis to mining financial and sporting data. 
The contributions of this thesis to web engineering and text mining are the demonstration of the proof of concept of our paradigms, through the integration of an array of open-source, third-party products into a coherent and innovative alpha-version software prototype, implemented in a Java JSP/servlet-based web application architecture.
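
The hypothesis of this thesis, that fluctuations in keyword frequencies in RSS can be mined, can be illustrated with a small sketch. It is not the thesis's Java prototype, and the abstract does not specify its mining rules: the sketch assumes feed item titles grouped into consecutive time windows and reports the change in each keyword's count from one window to the next. The function names and sample titles are hypothetical.

```python
import re
from collections import Counter


def keyword_counts(titles, keywords):
    # Count how often each tracked keyword occurs across one window of titles.
    wanted = {k.lower() for k in keywords}
    counts = Counter()
    for title in titles:
        for word in re.findall(r"\w+", title.lower()):
            if word in wanted:
                counts[word] += 1
    return counts


def fluctuations(windows, keywords):
    # windows: list of title lists, oldest first (assumed grouping, e.g. one per day).
    # Returns, per keyword, the count difference between consecutive windows.
    per_window = [keyword_counts(titles, keywords) for titles in windows]
    return {
        k.lower(): [later[k.lower()] - earlier[k.lower()]
                    for earlier, later in zip(per_window, per_window[1:])]
        for k in keywords
    }


# Hypothetical titles from three consecutive windows.
windows = [
    ["Shares fall as markets open", "Weather calm"],
    ["Shares rally", "Shares hit record", "Markets steady"],
    ["Quiet day"],
]
deltas = fluctuations(windows, ["shares", "markets"])
# deltas == {"shares": [1, -2], "markets": [0, -1]}
```

A user-defined rule in this spirit could then flag a keyword whenever its delta exceeds a chosen threshold.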

    Personalization and usage data in academic libraries : an exploratory study

    Personalization is a service pattern for ensuring proactive information delivery tailored to an individual, based on the learned or perceived needs of the person. It is credited as a remedy for the information explosion, especially in the academic environment, and its importance to libraries has been described as great enough to justify their existence. Numerous novel approaches and technical specifications have been put forward for realizing personalization in libraries. However, the literature shows that implementation of these services in libraries is minimal, which implies the need for a thorough analysis and discussion of the issues underlying the practicality of this service in the library environment. This study was initiated by this need, with the objective of finding answers to questions related to library usage data, user profiles and privacy, which are among the factors determining the success of personalized services in academic libraries. To find comprehensive answers, five distinct cases representing different approaches to academic library personalization were chosen for thorough analysis, and the themes extracted from them were substantiated by an extensive literature review. Moreover, to gather more information, unstructured questions were presented to the libraries running the services. The overall finding shows that personalization can be realized in academic libraries, but it has to address issues related to collecting and processing user/usage data, user interest management, safeguarding user privacy, library privacy laws and other important matters discovered in the course of the study. Joint Master Degree in Digital Library Learning (DILL)