Intelligent RSS Tool
Project carried out in collaboration with Aalto University.
Easy access to a wide range of information available online encourages people to
explore it in search of ever more interesting content.
This opportunity often leads to a problem of finding interesting and relevant information
from the sea of knowledge. This problem is often referred to as the
information overload problem, which is getting harder and harder to deal with
as the amount of information available online grows. In this thesis, one source of
information is exploited and organized in such a way that the task of discovering
new content is made easier.
We use Really Simple Syndication (RSS) as our source of information and two
methods to categorize it: document clustering with K-Means and Latent Dirichlet
Allocation (LDA). We use the textual information that the RSS feeds contain; each
RSS feed usually covers a specific set of topics. Our first goal is to perform
document clustering on the data, in order to generate meaningful clusters with
the help of natural language processing (NLP) techniques to preprocess the data.
Our second goal is to analyze the clustered RSS feeds and exploit the similarities
between the documents to generate meaningful user models based on user feed
subscriptions. The third goal is to provide relevant recommendations based on
the user models we have learned. We combine the current state-of-the-art methods
and present novel methods to compare feeds. We exploit WordNet shallow
ontologies in our novel method to create generalized representations of the feeds.
The final goal is to develop a functional application that can leverage the methods
we developed with the help of machine learning libraries. The method we
propose is a combination of document clustering techniques, text similarity, feed
modeling and a recommendation system.
The results of our experiments show that K-Means clustered documents combined
with recommendations based on the feed contents yield the best results.
Using WordNet to measure the similarity of words also provides promising results.
Further exploring the advantages of using semantic similarities would be
an interesting research topic in document similarity measures.
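The K-Means clustering step mentioned in this abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the sample documents, the `tf_vector` and `kmeans` helper names, the plain whitespace tokenization (no NLP preprocessing) and the deterministic choice of the first k documents as starting centroids are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_vector(text, vocab):
    """Build a length-normalized term-frequency vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    vec = [float(counts[word]) for word in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=10):
    """Plain K-Means. Assumption: the first k vectors seed the centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical feed item texts: two about sport, two about finance.
docs = [
    "football match goal league",
    "stock market shares bank",
    "league football team goal",
    "bank shares market profit",
]
vocab = sorted({word for doc in docs for word in doc.split()})
vectors = [tf_vector(doc, vocab) for doc in docs]
labels = kmeans(vectors, k=2)
print(labels)
```

In this toy data the two sport items share one label and the two finance items share the other. A real pipeline would add the preprocessing the abstract refers to (for example stop-word removal) and would normally rely on a machine learning library rather than a hand-written loop.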
An Intelligent Technique for Extracting Subjects from User Profile Using ODP Ontology-Driven Reasoning
Abstract: Nowadays, the amount of available information, especially on the Web, is increasing. In this field, the role of user modeling and personalized information access is vital. Traditional techniques such as BOW (bag of words) limit recommendations to the words that have been stored in the profile. In other words, news items that relate semantically to the user's interests cannot be recognized and recommended to the user. In addition, the BOW technique suffers from the curse of dimensionality, so reducing the computational burden is essential for handling a large number of terms efficiently in practical applications. This study focuses on the problem of choosing a document representation that is suitable both for inducing concept-based user profiles and for supporting a content-based retrieval process. A new approach is proposed that constructs a ranked semantic user profile by extracting the related subjects. New items can be recommended by collecting information from the user's selections, based on the existing domain ontology ODP. The efficiency of the proposed technique is shown by embedding it in an intelligent aggregator, an RSS (Really Simple Syndication) feed reader, which was trained and evaluated by different and heterogeneous users. The experimental results show that an incoming news item that relates semantically to the profile is highly recommended to the user even when it shares no common words with the profile.
Differences in intention to use educational RSS feeds between Lebanese and British students: A multi‑group analysis based on the technology acceptance model
Really Simple Syndication (RSS) offers a means for university students to receive timely updates from virtual learning environments. However, despite its utility, only 21% of home students surveyed at a university in Lebanon claim to have ever used the technology. To investigate whether national culture could be an influence on intention to use RSS, the survey was extended to British students in the UK. Using the Technology Acceptance Model (TAM) as a research framework, 437 students responded to a questionnaire containing four constructs: behavioral intention to use; attitude towards benefit; perceived usefulness; and perceived ease of use. Principal components analysis and structural equation modelling were used to explore the psychometric qualities and utility of TAM in both contexts. The results show that adoption was significantly higher, but still modest, in the British context at 36%. Configural and metric invariance were fully supported, while scalar and factorial invariance were partially supported. Further analysis shows significant differences between perceived usefulness and perceived ease of use across the two contexts studied. Therefore, it is recommended that faculty demonstrate to students how educational RSS feeds can be used effectively to increase awareness and emphasize usefulness in both contexts.
Development and evaluation of a technological solution aimed at improving access to digital content for socially excluded populations
The aim of this master's thesis is the development, deployment and subsequent evaluation of a novel tool called RSS_PROYECT, based on RSS content syndication technology. Dozens of Web content aggregators exist today, but none is designed for finding news about women at risk of social exclusion, and none offers the features and degree of configurability of the RSS_PROYECT aggregator presented in this paper.
To obtain the content, we use two filters (generic and selective), configured by the administrator from the RSS_PROYECT module installed in Joomla.
The generic filter searches for the entered words in a set of sources indexed by the user. This filter shows every source that contains the word, without exception. With the selective filter, a source is shown only if it contains all of the words in the selective filter.
To carry out this project we used several languages and technologies: PHP, MySQL, HTML, XML and the Joomla! Application Programming Interface (API). We also used Firebug to measure the module's response time in two cases: with the selective filter and with the generic filter. The results were favorable for the selective filter and very favorable for the generic filter, from which we conclude that the processing time is low and the module runs efficiently.
The RSS_PROYECT module achieved better average times than the other Joomla! modules analyzed.
Today, this tool is used by the Centro Integral de Ayuda a la Mujer (CIAM) in Valladolid, Spain.
Signal Theory and Communications and Telematics Engineering. Master's in Research in Information and Communication Technologies
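The difference between the generic and selective filters described in this abstract can be sketched as follows. The original module is written in PHP for Joomla; this Python version is only an illustration, and the function names, the tokenizer and the sample items are hypothetical. It also assumes that the generic filter matches a source containing any one of the entered words, which is one reading of the description.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def generic_filter(items, words):
    """Keep items that contain at least one of the filter words (assumed reading)."""
    wanted = {w.lower() for w in words}
    return [item for item in items if wanted & set(tokenize(item))]


def selective_filter(items, words):
    """Keep only items that contain every one of the filter words."""
    wanted = {w.lower() for w in words}
    return [item for item in items if wanted <= set(tokenize(item))]


# Hypothetical feed item titles.
items = [
    "Women at risk of exclusion find support",
    "New support centre opens",
    "Local football results",
]
any_match = generic_filter(items, ["women", "support"])
all_match = selective_filter(items, ["women", "support"])
print(any_match)
print(all_match)
```

With these sample items, the generic filter keeps the first two titles, while the selective filter keeps only the first, which contains both words.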
Design implications for task-specific search utilities for retrieval and re-engineering of code
The importance of information retrieval systems is unquestionable in modern society, and both individuals and enterprises recognise the benefits of being able to find information effectively. Current code-focused information retrieval systems such as Google Code Search, Codeplex or Koders produce results based on specific keywords. However, these systems do not take into account the developer's context, such as development language, technology framework, goal of the project, project complexity and the developer's domain expertise. They also impose an additional cognitive burden on users, who must switch between different interfaces and click through to find the relevant code. Hence, they are not used by software developers. In this paper, we discuss how software engineers interact with information and general-purpose information retrieval systems (e.g. Google, Yahoo!) and investigate to what extent domain-specific search and recommendation utilities can be developed to support their work-related activities. To investigate this, we conducted a user study and found that software engineers followed many identifiable and repeatable work tasks and behaviours. These behaviours can be used to develop implicit relevance feedback-based systems based on the observed retention actions. Moreover, we discuss the implications for the development of task-specific search and collaborative recommendation utilities embedded with the Google standard search engine and Microsoft IntelliSense for retrieval and re-engineering of code. Based on implicit relevance feedback, we have implemented a prototype of the proposed collaborative recommendation system, which was evaluated in a controlled environment simulating the real-world situation of professional software engineers. The evaluation has achieved promising initial results on the precision and recall performance of the system.
A series of case studies to enhance the social utility of RSS
RSS (really simple syndication, rich site summary or RDF site summary) is a dialect of
XML that provides a method of syndicating on-line content, where postings consist of
frequently updated news items, blog entries and multimedia. RSS feeds, produced by
organisations or individuals, are often aggregated, and delivered to users for consumption
via readers. The semi-structured format of RSS also allows the delivery/exchange of
machine-readable content between different platforms and systems.
Articles on web pages frequently include icons that represent social media services
which facilitate social data. Amongst these, RSS feeds deliver data which is typically
presented in the journalistic style of headline, story and snapshot(s). Consequently, applications
and academic research have employed RSS on this basis. Therefore, within the
context of social media, the question arises: can the social function, i.e. utility, of RSS be
enhanced by producing from it data which is actionable and effective?
This thesis is based upon the hypothesis that the fluctuations in the keyword frequencies
present in RSS can be mined to produce actionable and effective data, to enhance
the technology's social utility. To this end, we present a series of laboratory-based case
studies which demonstrate two novel and logically consistent RSS-mining paradigms.
Our first paradigm allows users to define mining rules to mine data from feeds. The second
paradigm employs a semi-automated classification of feeds and correlates this with sentiment.
We visualise the outputs produced by the case studies for these paradigms, where
they can benefit users in real-world scenarios, varying from statistics and trend analysis
to mining financial and sporting data.
The contributions of this thesis to web engineering and text mining are the demonstration
of the proof of concept of our paradigms, through the integration of an array of
open-source, third-party products into a coherent and innovative, alpha-version prototype
software implemented in a Java JSP/servlet-based web application architecture.
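The idea of mining fluctuations in keyword frequencies with a user-defined rule can be sketched roughly as below. This is not the thesis prototype, which is a Java JSP/servlet application; the function names, the dated sample items and the "minimum increase" rule are hypothetical and serve only to show the principle.

```python
import re


def keyword_counts(items_by_period, keyword):
    """Count how often a keyword occurs in the item texts of each period."""
    kw = keyword.lower()
    return {
        period: sum(re.findall(r"\w+", text.lower()).count(kw) for text in texts)
        for period, texts in items_by_period.items()
    }


def rising_periods(counts, min_increase):
    """Hypothetical mining rule: flag periods whose count rose by at least
    min_increase compared with the previous period."""
    periods = sorted(counts)
    return [
        current
        for previous, current in zip(periods, periods[1:])
        if counts[current] - counts[previous] >= min_increase
    ]


# Hypothetical feed items grouped by publication date.
items_by_period = {
    "2024-01-01": ["Bank shares fall", "Weather update"],
    "2024-01-02": ["Bank shares rally", "Shares hit record", "Analysts watch shares"],
    "2024-01-03": ["Shares steady"],
}
counts = keyword_counts(items_by_period, "shares")
flagged = rising_periods(counts, min_increase=2)
print(counts)
print(flagged)
```

Here the keyword appears once, three times and once on the three days, so only the second day satisfies the rule. The second paradigm in the abstract (semi-automated classification correlated with sentiment) is not covered by this sketch.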
Personalization and usage data in academic libraries : an exploratory study
Personalization is a service pattern for ensuring proactive information delivery tailored to an individual based on the learned or perceived needs of that person. It is credited as a remedy for information explosion, especially in the academic environment, and its importance to libraries has been described as great enough to justify their existence. Numerous novel approaches and technical specifications have been put forward for realizing personalization in libraries. However, the literature shows that implementation of these services in libraries is minimal, which implies the need for a thorough analysis and discussion of the issues underlying the practicality of this service in the library environment. This study was initiated by that need, with the objective of answering questions related to library usage data, user profiles and privacy, which are among the factors determining the success of personalized services in academic libraries. To find comprehensive answers, five distinct cases representing different approaches to academic library personalization were chosen for thorough analysis, and the themes extracted from them were substantiated by an extensive literature review. Moreover, to obtain more information, unstructured questions were presented to the libraries running the services. The overall finding shows that personalization can be realized in academic libraries, but it has to address issues related to collecting and processing user/usage data, user interest management, safeguarding user privacy, library privacy laws and other important matters discovered in the course of the study.
Joint Master Degree in Digital Library Learning (DILL)