
    Mining Missing Hyperlinks from Human Navigation Traces: A Case Study of Wikipedia

    Hyperlinks are an essential feature of the World Wide Web. They are especially important for online encyclopedias such as Wikipedia: an article can often only be understood in the context of related articles, and hyperlinks make it easy to explore this context. But important links are often missing, and several methods have been proposed to alleviate this problem by learning a linking model based on the structure of the existing links. Here we propose a novel approach to identifying missing links in Wikipedia. We build on the fact that the ultimate purpose of Wikipedia links is to aid navigation. Rather than merely suggesting new links that are in tune with the structure of existing links, our method finds missing links that would immediately enhance Wikipedia's navigability. We leverage data sets of navigation paths collected through a Wikipedia-based human-computation game in which users must find a short path from a start to a target article by only clicking links encountered along the way. We harness human navigational traces to identify a set of candidates for missing links and then rank these candidates. Experiments show that our procedure identifies missing links of high quality.
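
    The candidate-mining step described above can be illustrated with a short sketch. It is a simplification under assumed inputs (navigation paths as lists of article titles, existing links as a set of ordered pairs) and ranks candidates by raw frequency, which is only a stand-in for the paper's actual ranking procedure.

```python
from collections import Counter
from itertools import combinations

def mine_missing_link_candidates(paths, existing_links):
    """Collect (source, target) pairs that appear in order on navigation
    paths but are not yet linked, counting how often each pair occurs."""
    counts = Counter()
    for path in paths:
        # combinations() preserves path order, so `source` was visited
        # before `target`; a direct link would have shortened the path.
        for source, target in combinations(path, 2):
            if source != target and (source, target) not in existing_links:
                counts[(source, target)] += 1
    return counts

# Toy usage with two navigation traces and two existing links.
paths = [
    ["Banana", "Fruit", "Tropics", "Brazil"],
    ["Banana", "Plant", "Brazil"],
]
existing = {("Banana", "Fruit"), ("Fruit", "Tropics")}
for (src, dst), n in mine_missing_link_candidates(paths, existing).most_common(3):
    print(f"{src} -> {dst}: helpful on {n} path(s)")
```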

    Learner models in online personalized educational experiences: an infrastructure and some experim

    Technologies are changing the world around us, and education is not immune to their influence: the field of teaching and learning supported by Information and Communication Technologies (ICTs), also known as Technology Enhanced Learning (TEL), has witnessed a huge expansion in recent years. This wide adoption happened thanks to the massive diffusion of broadband connections and to the pervasive need for education, closely connected to the evolution of sciences and technologies, and it has pushed the usage of online education (distance and blended methodologies for educational experiences) up to previously unexpected rates. Alongside their well-known potential, digital educational tools come with a number of downsides, such as possible disengagement on the part of the learner, absence of the social pressures that normally exist in a classroom environment, difficulty or even inability of learners to self-regulate and, last but not least, depletion of the stimulus to actively participate and cooperate with lecturers and peers. These difficulties affect the teaching process and the outcomes of the educational experience (i.e. the learning process), constituting a serious limitation and questioning the broader applicability of TEL solutions. To overcome these issues, there is a need for tools that support the learning process. In the literature, one known approach is to rely on a user profile that collects data during the use of eLearning platforms or tools. The resulting profile can be used to adapt the behaviour and the contents proposed to the learner. On top of this model, some studies have stressed the positive effects of disclosing the model itself to the learner for inspection; this disclosed model is known as an Open Learner Model (OLM). The idea of opening learners' profiles and eventually integrating them with external online resources is not new, and it has the ultimate goal of creating global, long-run indicators of the learner's profile. The representation of the learner model also plays a role, moving from the more traditional textual, analytic/extensive representation to graphical indicators that summarise and present one or more of the model's characteristics in a way that is more effective and natural for the user to consume. Relying on the same learner models, and stressing different aggregation and representation capabilities, it is possible either to support self-reflection by the learner or to foster the tutoring process and allow proper supervision by the tutor/teacher. Both objectives can be reached through graphical representation of the relevant information, presented in different ways. Furthermore, with such an open approach to the learner model, the concepts of personalisation and adaptation acquire a central role in the TEL experience, overcoming the previous impossibility of observing and explaining to the learner the reasons for such an intervention by the tool itself. As a consequence, the introduction of different tools, platforms, widgets and devices in the learning process, together with an adaptation process based on learner profiles, can create a personal space for a potentially fruitful usage of the rich and widespread amount of resources available to the learner.
This work aimed to analyse how a learner model could be represented visually to system users, exploring the effects and performance for learners and teachers. Subsequently, it concentrated on investigating how the adoption of adaptive and social visualisations of the OLM could affect the student experience within a TEL context. The motivation was twofold: on one side, to show that mixing data from heterogeneous and previously unrelated data sources could have meaningful didactic interpretations; on the other, to measure the perceived impact of introducing adaptivity (and social aspects) into the graphical visualisations produced by such a tool in online experiences. To achieve these objectives, the present work merged user data from learning platforms into a learner profile. This was accomplished by means of a tool, named GVIS, which elaborates on the user actions collected in platforms enabling remote teaching. A number of test cases were performed and analysed, adopting the developed tool as the provider used to extract, aggregate and represent the data for the learners' model. The impact of the GVIS tool was then estimated with self-evaluation questionnaires, with the analysis of log files and with knowledge quiz results. Dimensions such as perceived usefulness, impact on motivation and commitment, the cognitive overload generated, and the impact of social data disclosure were taken into account. The main result of applying the developed tool in TEL experiences was its impact on the behaviour of online learners when used to provide them with indicators about their activities, especially when enhanced with social capabilities. The effects appear to be amplified where the widget usage is kept as simple as possible. On the learner side, the results suggested that learners appreciate the tool and recognise its value; for them, its introduction as part of the online learning experience could act as a positive pressure factor, enhanced by the peer-comparison functionality. This functionality could also be used to reinforce student engagement and positive commitment to the educational experience, by transmitting a sense of community and stimulating healthy competition between learners. On the teacher/tutor side, teachers seemed to be better supported by the presentation of compact, intuitive and just-in-time information (i.e. actions that have an educational interpretation or impact) about the monitored user or group. This gave them a clearer picture of how the class was performing and enabled them to address performance issues by adapting the resources and the teaching (and learning) approach accordingly. Although a drawback was identified regarding cognitive overload, the data collected showed that users generally considered this kind of support useful. There are also indications that further analyses would be of interest to explore the effects introduced into teaching practices by the availability and usage of such a tool.
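
As an illustration of the extract-aggregate-represent pipeline described above, the following is a minimal sketch that turns raw activity logs into per-learner indicators plus a class average suitable for peer comparison. The event schema and weights are hypothetical and do not reflect the actual GVIS implementation.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple

@dataclass
class ActivityEvent:
    """One logged learner action (hypothetical schema, not GVIS's own)."""
    learner_id: str
    action: str      # e.g. "forum_post", "quiz_attempt", "page_view"
    weight: float    # didactic relevance assigned to the action type

def aggregate_indicators(events: Iterable[ActivityEvent]) -> Tuple[Dict[str, float], float]:
    """Aggregate raw events into per-learner indicators and a class average,
    the kind of compact summary a widget could then render graphically."""
    per_learner: Dict[str, float] = defaultdict(float)
    for ev in events:
        per_learner[ev.learner_id] += ev.weight
    class_avg = sum(per_learner.values()) / len(per_learner) if per_learner else 0.0
    return dict(per_learner), class_avg

# Toy usage: two learners and a few logged actions.
events = [
    ActivityEvent("alice", "forum_post", 2.0),
    ActivityEvent("alice", "quiz_attempt", 3.0),
    ActivityEvent("bob", "page_view", 0.5),
]
scores, avg = aggregate_indicators(events)
print(scores, "class average:", avg)  # peer comparison = own score vs. class average
```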

    Provenance of "after the fact" harmonised community-based demographic and HIV surveillance data from ALPHA cohorts

    Background: Data about data, or metadata, for describing Health and Demographic Surveillance System (HDSS) data have often received insufficient attention. This thesis studied how to develop provenance metadata within the context of HDSS data harmonisation in the network for Analysing Longitudinal Population-based HIV/AIDS data on Africa (ALPHA). Technologies from the data documentation community were customised, among them a process model, the Generic Longitudinal Business Process Model (GLBPM); two metadata standards, the Data Documentation Initiative (DDI) and the Statistical Data and Metadata eXchange (SDMX); and a language for describing data transformations, the Structured Data Transform Language (SDTL). Methods: A framework with three complementary facets was used: creating a recipe for annotating primary HDSS data using the GLBPM and DDI; approaches for documenting data transformations (at a business level, prospective and retrospective documentation using GLBPM and DDI, and at a lower level, retrospectively recovering the more granular details using SDMX and SDTL); and a requirements analysis for a user-friendly provenance metadata browser. Results: A recipe for the annotation of HDSS data was created, outlining considerations to guide HDSS sites on metadata entry, staff training and software costs. Regarding data transformations, a specialised process model for the HDSS domain was created at the business level, with algorithm steps for each data transformation sub-process and their data inputs and outputs. At a lower level, SDMX and SDTL captured about 80% (17/21) of the variable-level transformations. The requirements elicitation study yielded requirements for a provenance metadata browser to guide developers. Conclusions: This is a first attempt at creating detailed metadata for this resource or similar resources in this field. HDSS sites can implement these recipes to document their data. This will increase transparency and facilitate reuse, thus potentially bringing down the costs of data management. It will arguably promote the longevity and the wide and accurate use of these data.
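
    To make the idea of variable-level transformation provenance concrete, the sketch below shows the kind of information such a record can carry (inputs, outputs, a business-level description and the command that was run). The record layout, field names and example command are hypothetical; real SDTL/DDI serialisations differ.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TransformStep:
    """One variable-level transformation in a harmonisation pipeline
    (hypothetical record layout, not an actual SDTL/DDI encoding)."""
    step_id: str
    description: str              # human-readable business-level account
    input_variables: List[str]
    output_variables: List[str]
    command: str                  # the statistical-package command that was run

pipeline: List[TransformStep] = [
    TransformStep(
        step_id="alpha-01",
        description="Derive age at HIV test from birth date and test date",
        input_variables=["dob", "hiv_test_date"],
        output_variables=["age_at_test"],
        command="gen age_at_test = (hiv_test_date - dob) / 365.25",
    ),
]

# A provenance browser could walk such records backwards from any
# harmonised variable to the source variables it was derived from.
for step in pipeline:
    print(step.output_variables, "<-", step.input_variables, ":", step.description)
```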

    Pirates and Samaritans: A Decade of Measurements on Peer Production and their Implications for Net Neutrality and Copyright

    This study traces the evolution of commons-based peer production through a measurement-based analysis of case studies and discusses the impact of peer production on net neutrality and copyright law. The measurements include websites such as Suprnova.org, Youtube.com, and Facebook.com, and the Peer-to-Peer (P2P) systems Kazaa, BitTorrent, and Tribler. The measurements show the two sides of peer production: the pirate side, with free availability of Hollywood movies on these P2P systems, and the samaritan side, exhibited by the quick joining of 400,000+ people in a community to organize protests against events in Burma. The telecommunications and content industries are disrupted by this form of peer production. As a consequence, revenues of both industries are likely to suffer in the coming years. On the other hand, innovative P2P systems could win the battle on merit over classical distribution technologies. As a result, a continuation is expected of both legal actions against P2P and possible blocking of P2P traffic, violating net neutrality. It is argued that this hinders innovation and causes a large discrepancy between legal and user perspectives. A reform of copyright laws is clearly needed; otherwise they will be unenforceable around 2010. Key words: P2P, collaboration, commons-based peer production, copyright

    Establishing a Legitimate Expectation of Privacy in Clickstream Data

    This Article argues that Web users should enjoy a legitimate expectation of privacy in clickstream data. Fourth Amendment jurisprudence as developed over the last half-century does not support an expectation of privacy. However, reference to the history of the Fourth Amendment and the intent of its drafters reveals that government investigation and monitoring of clickstream data is precisely the type of activity the Framers sought to limit. Courts must update outdated methods of expectation-of-privacy analysis to address the unique challenges posed by the Internet in order to fulfill the Amendment's purpose. Part I provides an overview of the Internet and clickstream data collection, and explains the value of this data to law enforcement. Part II discusses general Fourth Amendment principles, then explores how these principles have been, and are likely to be, applied to the Internet. Part III explores the intent of the Fourth Amendment's drafters, analogizes clickstream searches to the general searches the Framers sought to prohibit, and argues that the values underlying the Fourth Amendment require courts to eschew the traditional two-prong expectation of privacy test in favor of a normative inquiry which recognizes a legitimate expectation of privacy in clickstream data.

    Connected Information Management

    Society is currently inundated with more information than ever, making efficient management a necessity. Alas, most current information management suffers from several levels of disconnectedness: applications partition data into segregated islands, small notes don’t fit into traditional application categories, navigating the data is different for each kind of data, and data is either available on a certain computer or only online, but rarely both. Connected information management (CoIM) is an approach to information management that avoids these forms of disconnectedness. The core idea of CoIM is to keep all information in a central repository, with generic means of organization such as tagging. The heterogeneity of data is taken into account by offering specialized editors. The central repository eliminates the islands of application-specific data and is formally grounded by a CoIM model. The foundation for structured data is an RDF repository. The RDF editing meta-model (REMM) enables form-based editing of this data, similar to database applications such as MS Access. Further kinds of data are supported by extending RDF, as follows. Wiki text is stored as RDF and can both contain structured text and be combined with structured data. Files are also supported by the CoIM model and are kept externally. Notes can be quickly captured and annotated with meta-data. Generic means of organization and navigation apply to all kinds of data. Ubiquitous availability of data is ensured via two CoIM implementations, the web application HYENA/Web and the desktop application HYENA/Eclipse. All data can be synchronized between these applications. The applications were used to validate the CoIM ideas.
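
    The central-repository-with-tagging idea can be sketched with a few RDF triples using the rdflib library: a note is stored alongside its tags, and generic navigation retrieves everything carrying a given tag regardless of its kind. The ex:Note and ex:tag vocabulary is invented for illustration and is not HYENA's or REMM's actual schema.

```python
# A minimal sketch of the central-repository idea: a note and its tags
# stored as RDF triples in one graph that stands in for the repository.
from rdflib import Graph, Literal, Namespace, RDF, URIRef

EX = Namespace("http://example.org/coim/")  # illustrative vocabulary only

g = Graph()  # stands in for the central RDF repository
note = URIRef(EX["note/42"])
g.add((note, RDF.type, EX.Note))
g.add((note, EX.text, Literal("Call the library about the overdue book")))
g.add((note, EX.tag, Literal("todo")))
g.add((note, EX.tag, Literal("reading")))

# Generic navigation: find everything carrying a given tag, whatever its kind.
for subject in g.subjects(EX.tag, Literal("todo")):
    print(subject, g.value(subject, EX.text))
```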

    Spatial ontologies for architectural heritage

    Informatics and artificial intelligence have generated new requirements for digital archiving, information, and documentation. Semantic interoperability has become fundamental for the management and sharing of information. Constraints on data interpretation enable both database interoperability, for sharing and reusing data and schemas, and information retrieval in large datasets. Another challenging issue is the exploitation of automated reasoning possibilities. The solution is the use of domain ontologies as a reference for data modelling in information systems. The architectural heritage (AH) domain is considered in this thesis. Documentation in this field, particularly complex and multifaceted, is well known to be critical for the preservation, knowledge, and promotion of monuments. For these reasons, digital inventories, also exploiting standards and new semantic technologies, are developed by international organisations (Getty Institute, ONU, European Union). Geometric and geographic information is an essential part of a monument. It comprises a number of aspects (spatial, topological, and mereological relations; accuracy; multi-scale representation; time; etc.). Currently, geomatics permits obtaining very accurate and dense 3D models (possibly enriched with textures) and derived products, in both raster and vector format. Many standards have been published for the geographic field or the cultural heritage domain. However, the former are limited in the representation scales they foresee (the maximum is achieved by OGC CityGML), and their semantic values do not capture the full semantic richness of AH. The latter (especially the core ontology CIDOC CRM, the Conceptual Reference Model of the Documentation Committee of the International Council of Museums) were employed to document museums’ objects. Even though CIDOC CRM was recently extended to standing buildings and a spatial extension was included, the integration of complex 3D models has not yet been achieved. In this thesis, the aspects (especially spatial issues) to consider in the documentation of monuments are analysed. In light of them, OGC CityGML is extended for the management of AH complexity. An approach ‘from the landscape to the detail’ is used, considering the monument within a wider system, which is essential for analysis and reasoning about such complex objects. An implementation test is conducted on a case study, preferring open source applications.
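
    The ontology-extension idea can be sketched as follows: a hypothetical architectural-heritage class is declared as a specialisation of a generic CityGML building concept, and a mereological (part-of) relation links a monument to one of its elements. The namespaces, class names and instances are illustrative placeholders, not the extension actually defined in the thesis or an official CityGML ADE.

```python
# Sketch of a domain-ontology extension: schema-level subclassing plus an
# instance-level part-of link, expressed as RDF triples with rdflib.
from rdflib import Graph, Namespace, RDF, RDFS, URIRef

CITYGML = Namespace("http://example.org/citygml#")  # placeholder namespace
AH = Namespace("http://example.org/ah#")            # hypothetical AH extension

g = Graph()

# Schema level: HistoricBuilding specialises the generic Building concept.
g.add((AH.HistoricBuilding, RDFS.subClassOf, CITYGML.Building))
g.add((AH.hasPart, RDF.type, RDF.Property))

# Instance level: a monument and one of its architectural elements.
monument = URIRef("http://example.org/data/monument_01")
portal = URIRef("http://example.org/data/monument_01/portal")
g.add((monument, RDF.type, AH.HistoricBuilding))
g.add((monument, AH.hasPart, portal))

print(g.serialize(format="turtle"))
```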