11 research outputs found

    Estratégias de classificação de informação em instituições de I&D: um estudo de caso sobre o INESC-Porto [Information classification strategies in R&D institutions: a case study of INESC-Porto]

    Master's thesis. Information Management. Faculdade de Engenharia. Universidade do Porto. 200

    The European Language Resources and Technologies Forum: Shaping the Future of the Multilingual Digital Europe

    Proceedings of the 1st FLaReNet Forum on the European Language Resources and Technologies, held in Vienna, at the Austrian Academy of Sciences, on 12-13 February 2009.

    Context-Aware Service Creation On The Semantic Web

    With the increase in the computational power of mobile devices, their new capabilities and the addition of new context sensors, it is possible to obtain more information from mobile users and to offer new ways and tools to facilitate the content creation process. Service creators can exploit all this information to provide mobile services with a higher degree of personalization, which translates into better experiences. Currently, many web data sources containing user-generated content (UGC) expose it through classical web mechanisms (built on a small set of standards), that is, custom web APIs that promote the fragmentation of the Web. To address this issue, Tim Berners-Lee proposed the Linked Data principles to provide guidelines for the use of standard web technologies, thus allowing the publication of structured data on the Web that can be accessed using standard database mechanisms. The growing volume of Linked Data published on the Web increases the opportunities for mobile services to take advantage of it as a huge source of data, information and knowledge, whether user-generated or not. This dissertation proposes a framework for creating mobile services that exploit context information, the content generated by their users, and the data, information and knowledge present on the Web of Data. In addition, we present several mobile services created to take advantage of these elements, in which the proposed framework has been implemented (at least partially). Each of these services belongs to a different domain, and each highlights the advantages provided to its end user.

    Contributions to privacy in web search engines

    Web search engines (WSEs) collect and store information about their users in order to better tailor their services to users' needs. However, in exchange for a personalized service, users lose control over their own data. Search logs can disclose sensitive information and the identities of users, creating risks of privacy breaches. In this thesis we discuss the problem of limiting these disclosure risks while minimizing information loss. The first part of the thesis focuses on methods to prevent the gathering of information by WSEs. Since search logs are needed in order to receive an accurate service, the aim is to provide logs that are still suitable for personalization. We propose a protocol which uses a social network to obfuscate users' profiles. The second part deals with the dissemination of search logs. We propose microaggregation techniques which allow the publication of search logs, providing k-anonymity while minimizing information loss.

    SemAware: An Ontology-Based Web Recommendation System

    Web Recommendation Systems (WRSs) are used to recommend items and future page views to World Wide Web users. Web usage mining lays the platform for WRSs, as the results of mining user browsing patterns are used for recommendation and prediction. Existing WRSs are still limited by several problems, including recommending items to a new user whose browsing history is not available (Cold Start), sparse data structures (Sparsity), and a lack of diversity in the set of recommended items (Content Overspecialization). Existing WRSs also fail to make full use of the semantic information about items and the relations (e.g., is-a, has-a, part-of) among them. A domain ontology, advocated by the Semantic Web, provides a formal representation of domain knowledge with relations, concepts and axioms. This thesis proposes the SemAware system, which integrates domain ontology into web usage mining and web recommendation, and increases the effectiveness and efficiency of the system by solving the problems of cold start, sparsity, content overspecialization and complexity-accuracy tradeoffs. The SemAware technique includes enriching the web log with semantic information through a proposed semantic distance measure based on the Jaccard coefficient. A matrix of semantic distances is then used in Semantics-aware Sequential Pattern Mining (SPM) of the web log, and is also integrated with the transition probability matrix of Markov models built from the web log. In the recommendation phase, the proposed SPM and Markov models are used to add interpretability. The proposed recommendation engine uses a vector-space model to build an item-concept correlation matrix in combination with user-provided tags to generate top-n recommendations. Experimental studies show that SemAware outperforms popular recommendation algorithms, and that its proposed components are effective and efficient for solving the contradicting predictions problem, the scalability and sparsity of SPM and top-n recommendations, and content overspecialization problems.

    An infrastructure for the development of Semantic Desktop applications

    The extent to which our daily lives are digitized is continuously growing. Many of our everyday activities manifest themselves in digital form, either explicitly, when we actively use digital information for work or leisure, or implicitly, when information is indirectly created or manipulated as a consequence of our actions. A large fraction of these data volumes can be considered personal information, that is, information that has a certain class of relationship to us as human beings. The storage and processing capacity of the devices that we use to interact with these data has increased enormously over recent years, and we can expect this development to continue. However, while the power of physical data storage keeps increasing, the logical data organization capabilities of personal devices have stagnated since the invention of the first personal computers. Hierarchical file systems are still the de facto standard for data organization on personal devices. File systems represent information as a set of discrete data units (files) that are arranged as leaves on a tree of labeled nodes (directories). On the one hand, this structure can be easily understood by humans, since the separation into small information units supports the manual manageability of the personal data space, in comparison to systems that employ continuous data structures. On the other hand, hierarchical structures suffer from a number of deficiencies which have a negative impact on the quality of personal information management, and they lack expressive mechanisms that would help to improve information retrieval according to user needs. Significant research effort has been invested in improving the mechanisms for personal information management. The resulting works represent potential alternatives or supplements for systems in place, but sometimes run the risk of over-formalizing information management, a problem that is especially apparent in situations where a non-expert end user is the direct consumer of such services. The contribution of this thesis is an alternative organizational model for the management of personal data that strikes a balance between the unstructured nature of file systems and the highly formal characteristics of logic-based systems. After a comparative analysis of the current situation and recent research effort in this direction, it describes this organizational metaphor on three levels. First, on a conceptual level, it discusses an abstract data model, a corresponding query algebra, and a set of abstract operations on this data model. This formal framework is suitable for representing common data structures and usage patterns found in personal information management, but at the same time does not enforce a complete paradigm shift away from established systems. Second, on a representation level, it discusses how this model can be efficiently processed, stored, and exchanged between different systems. Third, on an implementation level, it describes how concrete realizations of this data model can be built and used in various application scenarios.

    Technologies sémantiques pour un système actif d’apprentissage [Semantic technologies for an active learning system]

    Learning methods keep evolving, and new paradigms are being added to traditional teaching models in which information and communication systems, particularly the Web, are an essential part. In order to improve the processing capacity of information systems, the Semantic Web defines a model for describing resources (Resource Description Framework, RDF) and a language for defining ontologies (Web Ontology Language, OWL). Based on concepts, methods and learning theories, and following a systemic approach, we have used Semantic Web technologies to provide a learning system that is able to enrich and personalize the experience of the learner. As a result of our work we propose a prototype for an Active Semantic Learning System (SASA). Following the identification and modeling of the entities involved in the learning process, we created six ontologies that summarize the characteristics of these entities: (1) learner ontology, (2) learning object ontology, (3) learning objective ontology, (4) evaluation object ontology, (5) annotation object ontology and (6) learning framework ontology. Integrating certain rules in the declared ontologies, combined with the reasoning capacities of the inference engines embedded in the kernel of the SASA, allows the adaptation of learning content to the characteristics of learners. The use of semantic technologies facilitates the identification of existing learning resources on the Web as well as the interpretation and aggregation of these resources within the context of SASA.