
    Simple identification tools in FishBase

    Simple identification tools for fish species have been included in the FishBase information system from its inception. Early tools made use of the relational model and characters such as fin-ray meristics. Pictures and drawings were soon added as a further aid, similar to a field guide. Later came the computerization of existing dichotomous keys, again in combination with pictures and other information, and the ability to restrict the candidate species by country, area, or taxonomic group. Today, www.FishBase.org offers four different ways to identify species. This paper describes these tools with their advantages and disadvantages, and suggests various options for further development. It explores the possibility of a holistic, integrated computer-aided strategy.
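    The character-based tools described above are, at heart, relational selections over a species table: observed counts are matched against stored meristic ranges, optionally restricted by country, area, or taxonomic group. A minimal sketch of that pattern follows; the table, its columns, and all counts are hypothetical stand-ins, not FishBase data or its schema.

        # Hypothetical species table; ranges are (min, max) fin-ray counts.
        species = [
            {"name": "Species A", "dorsal_rays": (10, 12), "anal_rays": (8, 9), "country": "PH"},
            {"name": "Species B", "dorsal_rays": (13, 15), "anal_rays": (8, 10), "country": "PH"},
            {"name": "Species C", "dorsal_rays": (10, 11), "anal_rays": (7, 8), "country": "ID"},
        ]

        def identify(dorsal, anal, country=None):
            # Keep species whose meristic ranges contain the observed counts,
            # optionally restricted by country, as the FishBase tools allow.
            hits = []
            for sp in species:
                lo_d, hi_d = sp["dorsal_rays"]
                lo_a, hi_a = sp["anal_rays"]
                if lo_d <= dorsal <= hi_d and lo_a <= anal <= hi_a:
                    if country is None or sp["country"] == country:
                        hits.append(sp["name"])
            return hits

        print(identify(dorsal=11, anal=8, country="PH"))  # ['Species A']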

    A multi-agent system for on-the-fly web map generation and spatial conflict resolution

    The Internet has become a medium of choice for disseminating geographic information, offering more and more web mapping services accessible to thousands of users worldwide. However, the quality of these services needs to be improved, especially in terms of personalization: the generated map should correspond as closely as possible to the user's needs, preferences, and context. This can be achieved by applying suitable transformations, in real time, to the spatial objects at each map generation cycle. An underlying challenge of such on-the-fly map generation is resolving the spatial conflicts that appear between objects, mainly because of the limited space of display screens. In this dissertation, we propose a multiagent-based approach to on-the-fly web map generation and spatial conflict resolution. The approach is based on multiple representation and cartographic generalization. It solves conflicts and generates the requested maps according to an innovative strategy: progressive map generation by layers of interest. A layer of interest contains all the objects that have the same degree of importance to the user; its content, which depends on the user's needs and the map's context of use, is determined on the fly at the start of the map generation process. Our multiagent approach generates and transfers the map in parallel: as soon as a layer of interest is generated, it is transmitted to the user. To resolve spatial conflicts, and thereby generate the requested map, we assign a software agent to every spatial object. The agents then compete for the available space, driven by a set of priorities corresponding to the objects' degrees of importance for the user. During conflict resolution, the agents take the user's needs and preferences into account to improve the personalization of the map: they improve the legibility of important objects and use symbols that help the user better understand the geographic space. The user can interrupt the map generation process at any time, once the data already transmitted answers his needs; his waiting time is thus reduced, since he does not have to wait for the rest of the map to be generated. To illustrate our approach, we apply it to web and mobile mapping contexts, categorizing our data about Québec City into four layers of interest: the objects explicitly requested by the user, landmark objects, the road network, and ordinary objects that have no particular importance for the user. Our multiagent system aims at solving the following problems of on-the-fly web map generation:
    1. How can map contents be adapted, on the fly, to users' needs?
    2. How can spatial conflicts be resolved so as to improve map legibility while taking the user's needs into account?
    3. How can data generation and transfer to users be sped up?
    The main contributions of this thesis are:
    1. The resolution of spatial conflicts using multiagent systems, cartographic generalization, and multiple representation.
    2. The on-the-fly generation of web and mobile maps using multiagent systems, cartographic generalization, and multiple representation.
    3. The real-time adaptation of map contents to users' needs at the source (during the first generation of the map).
    4. A new model of geographic space based on a multi-layer multiagent system architecture.
    5. A progressive map generation approach based on layers of interest.
    6. The parallel generation and transfer of web and mobile maps to users.
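    The central mechanism above is a population of object agents competing for screen space under user-driven priorities, emitted progressively by layer of interest. The following sketch illustrates that idea under strong simplifications (rectangular footprints, a crude rightward displacement rule, invented object names); it is not the thesis implementation.

        # Illustrative sketch: priority-driven competition for display space
        # between object agents, emitted progressively by layer of interest.
        from dataclasses import dataclass

        @dataclass
        class ObjectAgent:
            name: str
            layer: int   # 0 = requested, 1 = landmarks, 2 = roads, 3 = ordinary
            bbox: list   # [xmin, ymin, xmax, ymax] in screen units

        def overlaps(a, b):
            return not (a[2] <= b[0] or b[2] <= a[0] or a[3] <= b[1] or b[3] <= a[1])

        def resolve_and_emit(agents):
            # Lower layer index = higher priority; losing agents are shifted
            # right until they no longer conflict with already-placed objects.
            placed = []
            for layer in sorted({a.layer for a in agents}):
                batch = [a for a in agents if a.layer == layer]
                for agent in batch:
                    while any(overlaps(agent.bbox, p.bbox) for p in placed):
                        width = agent.bbox[2] - agent.bbox[0]
                        agent.bbox[0] += width
                        agent.bbox[2] += width
                    placed.append(agent)
                yield layer, batch   # transmit this layer to the user at once

        agents = [ObjectAgent("museum", 0, [0, 0, 2, 2]),
                  ObjectAgent("kiosk", 3, [1, 1, 3, 3])]
        for layer, batch in resolve_and_emit(agents):
            print(layer, [(a.name, a.bbox) for a in batch])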

    Learner models in online personalized educational experiences: an infrastructure and some experiments

    Technologies are changing the world around us, and education is not immune from their influence: the field of teaching and learning supported by Information and Communication Technologies (ICTs), also known as Technology Enhanced Learning (TEL), has witnessed a huge expansion in recent years. This wide adoption happened thanks to the massive diffusion of broadband connections and to pervasive needs for education, closely tied to the evolution of science and technology, and it has pushed the usage of online education (distance and blended methodologies for educational experiences) to previously unexpected rates. Alongside their well-known potential, digital educational tools come with a number of downsides, such as possible disengagement on the part of the learner, absence of the social pressures that normally exist in a classroom environment, difficulty or even inability of learners to self-regulate and, last but not least, depletion of the stimulus to actively participate and cooperate with lecturers and peers. These difficulties impact the teaching process and the outcomes of the educational experience (i.e., the learning process), seriously limiting and questioning the broader applicability of TEL solutions. To overcome these issues, tools are needed to support the learning process. In the literature, one known approach is to rely on a user profile that collects data during the use of eLearning platforms or tools. The resulting profile can be used to adapt the behaviour of the system and the contents proposed to the learner. On top of this model, some studies have stressed the positive effects of disclosing the model itself for inspection by the learner; such a disclosed model is known as an Open Learner Model (OLM). The idea of opening learners' profiles and eventually integrating them with external online resources is not new, and has the ultimate goal of creating global, long-run indicators of the learner's profile. The representation of the learner model also plays a role, moving from the more traditional textual, analytic/extensive representation to graphical indicators able to summarise and present one or more model characteristics in a way that is more effective and natural for the user. Relying on the same learner models, and exploiting different aggregation and representation capabilities, it is possible either to support the learner's self-reflection or to foster the tutoring process and allow proper supervision by the tutor/teacher. Both objectives can be reached through graphical representation of the relevant information, presented in different ways. Furthermore, with such an open approach to the learner model, the concepts of personalisation and adaptation acquire a central role in the TEL experience, overcoming previous limits related to the impossibility of observing and explaining to the learner the reasons for an intervention by the tool itself. As a consequence, the introduction of different tools, platforms, widgets and devices in the learning process, together with adaptation based on learner profiles, can create a personal space for fruitful usage of the rich and widespread amount of resources available to the learner.
    This work analysed how a learner model can be visually presented to system users, exploring the effects and performance for learners and teachers. It then investigated how adaptive and social visualisations of OLMs affect the student experience in a TEL context. The motivation was twofold: on one side, to show that mixing data from heterogeneous, previously unrelated data sources can have a meaningful didactic interpretation; on the other, to measure the perceived impact of introducing adaptivity (and social aspects) into the graphical visualisations produced by such a tool. To achieve these objectives, the present work merged user data from learning platforms into a learner profile. This was accomplished by creating a tool, named GVIS, that elaborates on user actions collected in remote-teaching platforms. A number of test cases were performed and analysed, adopting the developed tool as the provider to extract, aggregate and represent the data for the learner model. The impact of the GVIS tool was then estimated with self-evaluation questionnaires, analysis of log files, and knowledge quiz results, considering dimensions such as perceived usefulness, impact on motivation and commitment, generated cognitive overload, and the impact of social data disclosure. The main result was that the tool affects the behaviour of online learners when used to provide them with indicators about their activities, especially when enhanced with social capabilities. The effects appear to be amplified when widget usage is kept as simple as possible. On the learner side, the results suggest that learners appreciate the tool and recognise its value: its introduction as part of the online learning experience can act as a positive pressure factor, enhanced by the peer-comparison functionality. This functionality can also reinforce student engagement and positive commitment to the educational experience by transmitting a sense of community and stimulating healthy competition between learners. On the teacher/tutor side, teachers seemed to be better supported by the presentation of compact, intuitive and just-in-time information (i.e., actions that have an educational interpretation or impact) about the monitored user or group. This gave them a clearer picture of how the class was performing and enabled them to address performance issues by adapting resources and the teaching (and learning) approach accordingly. Although a drawback was identified regarding cognitive overload, the data collected showed that users generally considered this kind of support useful. There are also indications that further analyses would be worthwhile to explore the effects introduced into teaching practices by the availability and usage of such a tool.
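    GVIS, as described above, merges user actions from heterogeneous platforms into a learner profile and discloses it through indicators such as peer comparison. The sketch below shows that aggregation pattern with invented event records and field names; it is an assumption for illustration, not the actual GVIS data model.

        # Illustrative aggregation of platform events into an open learner
        # model plus a peer-comparison indicator. Records are invented.
        from collections import defaultdict
        from statistics import mean

        events = [  # merged from heterogeneous sources (quizzes, forum, ...)
            {"user": "alice", "kind": "quiz",  "score": 0.8},
            {"user": "alice", "kind": "forum", "score": 1.0},
            {"user": "bob",   "kind": "quiz",  "score": 0.4},
        ]

        def learner_model(user):
            # Per-activity averages: the model disclosed to the learner.
            buckets = defaultdict(list)
            for e in events:
                if e["user"] == user:
                    buckets[e["kind"]].append(e["score"])
            return {kind: mean(vals) for kind, vals in buckets.items()}

        def peer_comparison(user, kind):
            # Indicator fed to the visualisation: learner vs. class average.
            own = [e["score"] for e in events
                   if e["user"] == user and e["kind"] == kind]
            everyone = [e["score"] for e in events if e["kind"] == kind]
            return mean(own), mean(everyone)

        print(learner_model("alice"))            # {'quiz': 0.8, 'forum': 1.0}
        print(peer_comparison("alice", "quiz"))  # ~ (0.8, 0.6)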

    Pervasive data science applied to the society of services

    Master's dissertation (integrated) in Information Systems Engineering and Management. With the technological progress of the last few years, and now with the actual implementation of the Internet of Things concept, an enormous amount of data is collected every minute. This raises a problem: how can we process such an amount of data and extract relevant knowledge from it in useful time? That is not an easy issue to solve, because one often has to deal not just with huge volumes but also with different kinds of data, which makes the problem even more complex. Today, and increasingly, huge quantities of the most varied types of data are produced. These data alone do not add value to the organizations that collect them, but when subjected to data analytics processes they can be converted into crucial sources of information at the core of the business. The focus of this project is therefore to explore this problem and give it a modular solution, adaptable to different realities, built with recent technologies and allowing users to access information wherever and whenever they wish. In the first phase of this dissertation, bibliographic research and a review of the collected sources were carried out to establish which kinds of solutions already exist and which questions remain open. A solution was then developed, composed of four layers: the data are submitted to a treatment process (comprising eleven treatment functions that populate the previously designed multidimensional data model), and an OLAP layer, suited to unstructured as well as structured data, was constructed on top. In the end, it is possible to consult a set of four dashboards (available in a web application) based on more than twenty basic queries, with filtering through a dynamic query. As a proof of concept, the case study used IOTech, the company that provided the data needed for this dissertation and on whose data five Key Performance Indicators were defined. Two methodologies were applied during the project: Design Science Research for the research component and SCRUM for the practical component.
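    The four-layer solution above hinges on chaining treatment functions that populate a multidimensional model before the OLAP layer serves dashboard queries. A minimal sketch of that chaining pattern follows, with two stand-in treatment functions and invented fields; the dissertation's eleven functions and IOTech's schema are not reproduced here.

        # Illustrative treatment pipeline feeding a fact table. The treatment
        # functions and record fields are stand-ins, not the thesis's schema.
        raw = [
            {"sensor": "s1", "ts": "2021-03-01T10:00", "value": " 21.5 "},
            {"sensor": "s1", "ts": "2021-03-01T10:05", "value": None},
        ]

        def drop_missing(rows):
            return [r for r in rows if r["value"] is not None]

        def normalise_types(rows):
            return [{**r, "value": float(r["value"])} for r in rows]

        TREATMENTS = [drop_missing, normalise_types]  # the thesis chains eleven such steps

        def run_pipeline(rows):
            for treat in TREATMENTS:
                rows = treat(rows)
            return rows  # ready to load into the multidimensional model

        facts = run_pipeline(raw)
        print(facts)  # [{'sensor': 's1', 'ts': '2021-03-01T10:00', 'value': 21.5}]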

    TopX: efficient and versatile top-k query processing for text, structured, and semistructured data

    TopX is a top-k retrieval engine for text and XML data. Unlike Boolean engines, it stops query processing as soon as it can safely determine the k top-ranked result objects according to a monotonic score aggregation function with respect to a multidimensional query. The main contributions of the thesis unfold into four main points, confirmed by previous publications at international conferences and workshops:
    • Top-k query processing with probabilistic guarantees.
    • Index-access optimized top-k query processing.
    • Dynamic, self-tuning, incremental query expansion for top-k query processing.
    • Efficient support for ranked XML retrieval and full-text search.
    Our experiments demonstrate the viability and improved efficiency of our approach compared to existing related work across a broad variety of retrieval scenarios.
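    The early-termination behaviour described above, stopping once the top-k can be safely determined under a monotonic aggregation, is characteristic of threshold-style algorithms. The sketch below implements a generic threshold algorithm (in the style of Fagin's TA) with summation as the monotonic aggregation over score-sorted lists; it illustrates the family of techniques, not TopX's actual engine or index structures.

        # Generic threshold-style top-k over per-dimension posting lists,
        # each sorted by descending score; stops as soon as the k-th best
        # exact score is at least the best any unseen document could reach.
        import heapq

        lists = [  # (doc_id, score) pairs, sorted by descending score
            [("d1", 0.9), ("d2", 0.8), ("d3", 0.1)],
            [("d2", 0.9), ("d1", 0.3), ("d3", 0.2)],
        ]

        def top_k(lists, k):
            exact = {}                        # doc -> exact aggregated score
            index = [dict(l) for l in lists]  # random-access lookup per list
            for depth in range(max(len(l) for l in lists)):
                for l in lists:
                    if depth < len(l):
                        doc, _ = l[depth]
                        if doc not in exact:  # random access across all lists
                            exact[doc] = sum(ix.get(doc, 0.0) for ix in index)
                # threshold: aggregate of the scores at the current scan depth
                threshold = sum(l[depth][1] if depth < len(l) else 0.0 for l in lists)
                best = heapq.nlargest(k, exact.items(), key=lambda kv: kv[1])
                if len(best) == k and best[-1][1] >= threshold:
                    return best               # safe early termination
            return heapq.nlargest(k, exact.items(), key=lambda kv: kv[1])

        print(top_k(lists, k=1))  # [('d2', 1.7)], found without a full scan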

    MementoMap: A Web Archive Profiling Framework for Efficient Memento Routing

    With the proliferation of public web archives, it is becoming more important to better profile their contents, both to understand their immense holdings and to support the routing of requests in Memento aggregators. A memento is a past version of a web page, and a Memento aggregator is a tool or service that aggregates mementos from many different web archives. To save resources, a Memento aggregator should poll only the archives that are likely to hold a copy of the requested Uniform Resource Identifier (URI). Using the Crawler Index (CDX), we generate profiles of the archives that summarize their holdings and use them to inform the routing of the Memento aggregator's URI requests. Additionally, we use full-text search (when available) or sample URI lookups to build an understanding of an archive's holdings. Previous work in profiling ranged from using full URIs (no false positives, but large profiles) to using only top-level domains (TLDs) (smaller profiles, but many false positives). This work explores strategies between these two extremes. For evaluation we used CDX files from Archive-It, UK Web Archive, Stanford Web Archive Portal, and Arquivo.pt, as well as web server access log files from the Internet Archive's Wayback Machine, UK Web Archive, Arquivo.pt, LANL's Memento Proxy, and ODU's MemGator server. In addition, we utilized a historical dataset of URIs from DMOZ. In early experiments with various URI-based static profiling policies, we successfully identified about 78% of the URIs that were not present in the archive at less than 1% relative cost compared to the complete knowledge profile, and 94% of such URIs at less than 10% relative cost, without any false negatives. In another experiment we found that we can correctly route 80% of the requests, while maintaining about 0.9 recall, by discovering only 10% of the archive holdings and generating a profile that costs less than 1% of the complete knowledge profile. We created MementoMap, a framework that allows web archives and third parties to express the holdings and/or voids of an archive of any size, with varying levels of detail, to fulfil various application needs. Our archive profiling framework enables tools and services to predict and rank the archives in which mementos of a requested URI are likely to be present. In static profiling policies, we predefined for each policy the maximum depth of the host and path segments of URIs used as URI keys. This gave us a good baseline for evaluation, but was not suitable for merging profiles with different policies. We therefore introduced a more flexible representation of URI keys that uses wildcard characters to indicate whether a URI key was truncated, and we developed an algorithm to roll URI keys up dynamically at arbitrary depths when sufficient archiving activity is detected under certain URI prefixes. In an experiment with dynamic profiling of archival holdings, we found that a MementoMap of less than 1.5% relative cost can correctly identify the presence or absence of 60% of the lookup URIs in the corresponding archive without any false negatives (i.e., 100% recall). In addition, we separately evaluated archival voids based on the most frequently accessed resources in the access logs and found that we could have avoided more than 8% of the false positives without introducing any false negatives. We defined a routing score that can be used for Memento routing.
    Using a cut-off threshold on our routing score, we achieved over 96% accuracy if we accept about 89% recall, and for a recall of 99% we achieved about 68% accuracy, which translates to about 72% savings in wasted lookup requests in our Memento aggregator. Moreover, when routing to the top-k archives ranked by our routing score, choosing only the topmost archive missed only about 8% of the sample URIs that are present in at least one archive, while selecting the top two archives missed less than 2% of these URIs. We also evaluated a machine-learning-based routing approach, which resulted in overall better accuracy but poorer recall, due to the low prevalence of the sample lookup URI dataset in the different web archives. We contributed various algorithms, such as a space- and time-efficient approach to ingesting large lists of URIs to generate MementoMaps, and a Random Searcher Model to discover samples of the holdings of web archives. We contributed numerous tools to support various aspects of web archiving and replay, such as MemGator (a Memento aggregator), InterPlanetary Wayback (a novel archival replay system), Reconstructive (a client-side request-rerouting ServiceWorker), and AccessLog Parser. Moreover, this work yielded a draft file format specification called Unified Key Value Store (UKVS) that we use for the serialization and dissemination of MementoMaps. It is a flexible and extensible file format that allows easy interaction with Unix text processing tools, and it can be used in many applications beyond MementoMaps.
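    Central to the profiling framework are URI keys: a URI reduced to a reversed, SURT-style host and a bounded number of path segments, with a wildcard marking truncation so that profiles built under different policies remain mergeable. The sketch below shows one way to derive such keys under a static host/path-depth policy; it is a simplification for illustration, not the exact MementoMap/UKVS key grammar.

        # Illustrative URI-key generation for archive profiling: the host is
        # reversed and truncated to host_depth labels, the path to path_depth
        # segments; '/*' flags that deeper content is summarized.
        from urllib.parse import urlsplit

        def uri_key(uri, host_depth=3, path_depth=1):
            parts = urlsplit(uri)
            labels = list(reversed((parts.hostname or "").split(".")))
            host_trunc = len(labels) > host_depth
            key = ",".join(labels[:host_depth])
            segs = [s for s in parts.path.split("/") if s]
            path_trunc = len(segs) > path_depth
            key += ")/" + "/".join(segs[:path_depth]).lower()
            if host_trunc or path_trunc:
                key += "/*"           # marks a truncated, summarized key
            return key

        print(uri_key("https://www.example.com/a/b/c"))
        # 'com,example,www)/a/*'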

    Yavaa: supporting data workflows from discovery to visualization

    Recent years have witnessed an increasing number of data silos being opened up, both within organizations and to the general public: scientists publish their raw data as supplements to articles, or even as standalone artifacts, to enable others to verify and extend their work. Governments pass laws to open up formerly protected data treasures to improve accountability and transparency, as well as to enable new business ideas based on this public good. Even companies share structured information about their products and services to advertise their use and thus increase revenue. Exploiting this wealth of information holds many challenges for users, though. Data is often provided as tables whose seemingly endless rows of daunting numbers are barely accessible. Information visualization (InfoVis) can mitigate this gap; however, the offered visualization options are generally very limited, and next to no support is given in applying any of them. The same holds true for data wrangling: only very few options exist to adjust the data to current needs, and barely any safeguards are in place to prevent even the most obvious mistakes. When it comes to data from multiple providers, the situation is even bleaker. Only recently have tools emerged to reasonably search for datasets across institutional borders; easy-to-use ways to combine these datasets are still missing. Finally, results generally lack proper documentation of their provenance, so even the most compelling visualizations can be called into question when their coming-about remains unclear. The foundations for a vivid exchange and exploitation of open data are set, but the barrier to entry remains relatively high, especially for non-expert users. This thesis aims to lower that barrier by providing tools and assistance that reduce the amount of prior experience and skill required. It covers the whole workflow, ranging from identifying suitable datasets, through possible transformations, up to the export of the result in the form of suitable visualizations.
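    A workflow tool of the kind described above must carry provenance through every step so that the final visualization's coming-about is documented. The sketch below shows the general pattern of recording each transformation alongside its parameters; the class, operation names, and dataset URL are invented for illustration and do not reflect Yavaa's actual API.

        # Illustrative sketch: a tiny workflow wrapper that records the
        # provenance of every transformation applied to a dataset.
        import json

        class Workflow:
            def __init__(self, source_uri, data):
                self.data = data
                self.provenance = [{"op": "load", "source": source_uri}]

            def apply(self, op_name, fn, **params):
                # Run the step and log what was done, with which parameters.
                self.data = fn(self.data, **params)
                self.provenance.append({"op": op_name, "params": params})
                return self

        wf = Workflow("http://example.org/dataset.csv", [3, 1, 2])
        wf.apply("filter", lambda d, threshold: [x for x in d if x >= threshold],
                 threshold=2)
        wf.apply("sort", lambda d: sorted(d))
        print(wf.data)                  # [2, 3]
        print(json.dumps(wf.provenance))  # machine-readable audit trail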

    Proceedings of the 3rd Open Source Geospatial Research & Education Symposium OGRS 2014

    The third Open Source Geospatial Research & Education Symposium (OGRS) was held in Helsinki, Finland, from 10 to 13 June 2014. The symposium was hosted and organized by the Department of Civil and Environmental Engineering, Aalto University School of Engineering, in partnership with the OGRS Community, on the Espoo campus of Aalto University. These proceedings contain the 20 papers presented at the symposium. OGRS is a meeting dedicated to exchanging ideas in, and results from, the development and use of open source geospatial software in both research and education. The symposium offers several opportunities for discussing, learning, and presenting results, principles, methods, and practices while supporting its primary theme: how to carry out research and educate students using, contributing to, and launching open source geospatial initiatives. Participating in open source initiatives can boost innovation as a value-creating process requiring joint collaboration among academia, foundations, associations, developer communities, and industry. Additionally, open source software can improve the efficiency and impact of university education by introducing open, freely usable tools and research results to students and encouraging them to get involved in projects, which may eventually lead to new community projects and businesses. The symposium contributes to the validation of the open source model in research and education in geoinformatics.