2,825 research outputs found

XML content warehousing: Improving sociological studies of mailing lists and web data

Author: Colazzo Dario
Dudouet François-Xavier
Manolescu Ioana
Nguyen Benjamin
Senellart Pierre
Vion Antoine
Publication date: 01/01/2011

In this paper, we present guidelines for an XML-based approach to the sociological study of Web data, such as the analysis of mailing lists or databases available online. The use of an XML warehouse is a flexible solution for storing and processing this kind of data. We propose an implemented solution and show possible applications with our case study of profiles of experts involved in W3C standard-setting activity. We illustrate the sociological use of semi-structured databases by presenting our XML Schema for mailing-list warehousing. An XML Schema allows data sources to be added or cross-referenced without modifying existing data sets, while still allowing the structure to evolve. We also show that the existence of hidden data implies increased complexity for traditional SQL users. XML content warehousing allows both exhaustive warehousing and recursive queries through contents, with far less dependence on the initial storage. We finally present the possibility of exporting the data stored in the warehouse to commonly used advanced software devoted to sociological analysis.
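As a rough illustration of the recursive queries through contents mentioned in the abstract above, the sketch below counts messages per author in a nested mailing-list document. The element and attribute names (`warehouse`, `list`, `thread`, `message`, `author`) are hypothetical and are not taken from the authors' XML Schema; the sample data is invented for the example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical warehouse fragment: replies are nested inside the message
# they answer, so the depth of a thread is not known in advance.
SAMPLE = """
<warehouse>
  <list name="example-list">
    <thread>
      <message author="alice@example.org">
        <subject>Draft</subject>
        <message author="bob@example.org">
          <subject>Re: Draft</subject>
        </message>
      </message>
    </thread>
    <thread>
      <message author="alice@example.org">
        <subject>Agenda</subject>
      </message>
    </thread>
  </list>
</warehouse>
"""


def posts_per_author(xml_text):
    """Count <message> elements per author at any nesting depth."""
    root = ET.fromstring(xml_text)
    # Element.iter() walks the whole subtree, so nested replies are included.
    return Counter(msg.get("author") for msg in root.iter("message"))


if __name__ == "__main__":
    for author, count in posts_per_author(SAMPLE).most_common():
        print(author, count)
```

Because the traversal descends through every level, adding deeper reply chains to the sample would not require changing the query, which is the kind of independence from the initial storage layout the abstract alludes to.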

arXiv.org e-Print Archive

Base de publications de l'université Paris-Dauphine

Crossref

INRIA a CCSD electronic archive server

HAL UVSQ

HAL-Rennes 1

Social Search with Missing Data: Which Ranking Algorithm?

Author: Denham Chris
Eisenstadt Marc
Goncalves Alexandre
Song Dawei
Uren Victoria
Zhu Jianhan
Publication date: 01/10/2007

Online social networking tools are extremely popular, but can miss potential discoveries latent in the social 'fabric'. Matchmaking services that do naive profile matching with old database technology are too brittle in the absence of key data, and even modern ontological markup, though powerful, can be onerous at data-input time. In this paper, we present a system called BuddyFinder which can automatically identify buddies who best match a user's search requirements specified in a term-based query, even in the absence of stored user profiles. We deploy and compare five statistical measures, namely our own CORDER, mutual information (MI), phi-squared, improved MI and Z score, and two TF/IDF-based baseline methods to find online users who best match the search requirements based on 'inferred profiles' of these users in the form of scavenged web pages. These measures identify statistically significant relationships between online users and a term-based query. Our user evaluation on two groups of users shows that BuddyFinder can find users highly relevant to search queries, and that CORDER achieved the best average ranking correlations among all seven algorithms and improved the performance of both baseline methods.
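Two of the measures named in the abstract above, phi-squared and mutual information, can be sketched from a 2x2 table of page counts. This is a minimal sketch using the textbook formulas (with MI in its pointwise form), which may differ from the definitions used in the paper; CORDER, improved MI and the Z score are not reproduced. The function names and the sample counts are assumptions made for illustration.

```python
import math


def phi_squared(n11, n10, n01, n00):
    """Textbook phi-squared association for a 2x2 co-occurrence table.

    n11: pages mentioning both the user and the query term
    n10: pages mentioning the user only
    n01: pages mentioning the term only
    n00: pages mentioning neither
    """
    numerator = (n11 * n00 - n10 * n01) ** 2
    denominator = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    return numerator / denominator if denominator else 0.0


def pointwise_mutual_information(n11, n10, n01, n00):
    """Pointwise MI (base 2) between 'user appears' and 'term appears'."""
    total = n11 + n10 + n01 + n00
    if total == 0 or n11 == 0:
        return float("-inf")  # the user and the term never co-occur
    p_joint = n11 / total
    p_user = (n11 + n10) / total
    p_term = (n11 + n01) / total
    return math.log2(p_joint / (p_user * p_term))


def rank_users(counts, measure):
    """Order users by association with the query term, strongest first."""
    return sorted(counts, key=lambda user: measure(*counts[user]), reverse=True)


if __name__ == "__main__":
    # Hypothetical page counts per user: (n11, n10, n01, n00).
    counts = {"alice": (10, 0, 0, 10), "bob": (5, 5, 5, 5)}
    print(rank_users(counts, phi_squared))
```

Both functions take the same four counts, so the ranking step can swap one measure for another without other changes.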

Open Access Institutional Repository at Robert Gordon University

Open Research Online (The Open University)

RACOFI: A Rule-Applying Collaborative Filtering System

Author: Boley Harold
Lemire Daniel
Publication date: 01/01/2003

In this paper we give an overview of the RACOFI (Rule-Applying Collaborative Filtering) multidimensional rating system and its related technologies. This will be exemplified with RACOFI Music, an implemented collaboration agent that assists on-line users in the rating and recommendation of audio (Learning) Objects. It lets users rate contemporary Canadian music in the five dimensions of impression, lyrics, music, originality, and production. The collaborative filtering algorithms STI Pearson, STIN2, and the Per Item Average algorithms are then employed together with RuleML-based rules to recommend music objects that best match user queries. RACOFI has been on-line since August 2003 at http://racofi.elg.ca.
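The simplest of the algorithms named in the abstract above, a per-item average, can be sketched as follows. This is an illustrative baseline on a single rating dimension under assumed data structures; it does not reproduce RACOFI's five-dimensional ratings, the STI Pearson or STIN2 algorithms, or the RuleML rules. The function names and sample ratings are hypothetical.

```python
from collections import defaultdict


def per_item_average(ratings):
    """Map each item to the mean of the scores it received.

    ratings: list of (user, item, score) tuples.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _user, item, score in ratings:
        totals[item] += score
        counts[item] += 1
    return {item: totals[item] / counts[item] for item in totals}


def recommend(ratings, user, top_n=3):
    """Suggest items the user has not rated yet, highest average first."""
    averages = per_item_average(ratings)
    seen = {item for rater, item, _score in ratings if rater == user}
    unseen = [item for item in averages if item not in seen]
    return sorted(unseen, key=lambda item: averages[item], reverse=True)[:top_n]


if __name__ == "__main__":
    # Hypothetical ratings on one dimension (for example "originality").
    ratings = [
        ("ann", "song_a", 4),
        ("bob", "song_a", 2),
        ("bob", "song_b", 5),
        ("cat", "song_c", 1),
    ]
    print(recommend(ratings, "ann"))
```

A per-item average ignores who is asking, which is why it is normally treated as a baseline for personalised methods to beat.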

CogPrints Cognitive Sciences Eprint Archive

The impact of XML on library procedures and services

Author: Van Herwijnen Eric
Publication date: 16/01/2000

CERN Document Server

Advanced Knowledge Technologies at the Midterm: Tools and Methods for the Semantic Web

Author: Ciravegna Fabio
Domingue John
Hall Wendy
Motta Enrico
O'Hara Kieron
Robertson David
Shadbolt Nigel
Sleeman Derek
Tate Austin
Wilks Yorick
Publication venue: School of Electronics and Computer Science, University of Southampton
Publication date: 01/01/2004

The University of Edinburgh and research sponsors are authorised to reproduce and distribute reprints and on-line copies for their purposes notwithstanding any copyright annotation hereon. The views and conclusions contained herein are the author’s and shouldn’t be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of other parties.
In a celebrated essay on the new electronic media, Marshall McLuhan wrote in 1962:
Our private senses are not closed systems but are endlessly translated into each other in that experience which we call consciousness. Our extended senses, tools, technologies, through the ages, have been closed systems incapable of interplay or collective awareness. Now, in the electric age, the very instantaneous nature of co-existence among our technological instruments has created a crisis quite new in human history. Our extended faculties and senses now constitute a single field of experience which demands that they become collectively conscious. Our technologies, like our private senses, now demand an interplay and ratio that makes rational co-existence possible. As long as our technologies were as slow as the wheel or the alphabet or money, the fact that they were separate, closed systems was socially and psychically supportable. This is not true now when sight and sound and movement are simultaneous and global in extent. (McLuhan 1962, p.5, emphasis in original)
Over forty years later, the seamless interplay that McLuhan demanded between our technologies is still barely visible. McLuhan’s predictions of the spread, and increased importance, of electronic media have of course been borne out, and the worlds of business, science and knowledge storage and transfer have been revolutionised. Yet the integration of electronic systems as open systems remains in its infancy.
Advanced Knowledge Technologies (AKT) aims to address this problem, to create a view of knowledge and its management across its lifecycle, to research and create the services and technologies that such unification will require. Half way through its six-year span, the results are beginning to come through, and this paper will explore some of the services, technologies and methodologies that have been developed. We hope to give a sense in this paper of the potential for the next three years, to discuss the insights and lessons learnt in the first phase of the project, to articulate the challenges and issues that remain.
The WWW provided the original context that made the AKT approach to knowledge management (KM) possible. AKT was initially proposed in 1999, it brought together an interdisciplinary consortium with the technological breadth and complementarity to create the conditions for a unified approach to knowledge across its lifecycle. The combination of this expertise, and the time and space afforded the consortium by the IRC structure, suggested the opportunity for a concerted effort to develop an approach to advanced knowledge technologies, based on the WWW as a basic infrastructure.
The technological context of AKT altered for the better in the short period between the development of the proposal and the beginning of the project itself with the development of the semantic web (SW), which foresaw much more intelligent manipulation and querying of knowledge. The opportunities that the SW provided for e.g., more intelligent retrieval, put AKT in the centre of information technology innovation and knowledge management services; the AKT skill set would clearly be central for the exploitation of those opportunities.
The SW, as an extension of the WWW, provides an interesting set of constraints to the knowledge management services AKT tries to provide. As a medium for the semantically-informed coordination of information, it has suggested a number of ways in which the objectives of AKT can be achieved, most obviously through the provision of knowledge management services delivered over the web as opposed to the creation and provision of technologies to manage knowledge.
AKT is working on the assumption that many web services will be developed and provided for users. The KM problem in the near future will be one of deciding which services are needed and of coordinating them. Many of these services will be largely or entirely legacies of the WWW, and so the capabilities of the services will vary. As well as providing useful KM services in their own right, AKT will be aiming to exploit this opportunity, by reasoning over services, brokering between them, and providing essential meta-services for SW knowledge service management.
Ontologies will be a crucial tool for the SW. The AKT consortium brings a lot of expertise on ontologies together, and ontologies were always going to be a key part of the strategy. All kinds of knowledge sharing and transfer activities will be mediated by ontologies, and ontology management will be an important enabling task. Different applications will need to cope with inconsistent ontologies, or with the problems that will follow the automatic creation of ontologies (e.g. merging of pre-existing ontologies to create a third). Ontology mapping, and the elimination of conflicts of reference, will be important tasks. All of these issues are discussed along with our proposed technologies.
Similarly, specifications of tasks will be used for the deployment of knowledge services over the SW, but in general it cannot be expected that in the medium term there will be standards for task (or service) specifications. The brokering metaservices that are envisaged will have to deal with this heterogeneity.
The emerging picture of the SW is one of great opportunity but it will not be a well-ordered, certain or consistent environment. It will comprise many repositories of legacy data, outdated and inconsistent stores, and requirements for common understandings across divergent formalisms. There is clearly a role for standards to play to bring much of this context together; AKT is playing a significant role in these efforts. But standards take time to emerge, they take political power to enforce, and they have been known to stifle innovation (in the short term). AKT is keen to understand the balance between principled inference and statistical processing of web content. Logical inference on the Web is tough. Complex queries using traditional AI inference methods bring most distributed computer systems to their knees. Do we set up semantically well-behaved areas of the Web? Is any part of the Web in which semantic hygiene prevails interesting enough to reason in? These and many other questions need to be addressed if we are to provide effective knowledge technologies for our content on the web.

Southampton (e-Prints Soton)

Edinburgh Research Archive

Enticing Local Governments to Produce FAIR Freedom of Information Act Dossiers

Author: Kamps J.
Larooij M.
Marx M.
Perasedillo F.
Publication date: 01/01/2023

Government transparency is central in a democratic society, and increasingly governments at all levels are required to publish records and data either proactively, or upon so-called Freedom of Information (FIA) requests. However, public bodies that are required by law to publish many of their documents turn out to have great difficulty doing so. And what they publish is often in a format that still breaches the requirements of the law, which stipulates principles comparable to the FAIR data principles. Hence, this demo addresses a timely problem: the FAIR publication of FIA dossiers, which has been obligatory in The Netherlands since May 1st 2022.

International Migration, Integration and Social Cohesion online publications

Ontologies on the semantic web

Author: Ashburner
Berners-Lee
Berners-Lee
Bollobas
Borgida
Brachman
Brachman
Brooks
Buchanan
Burton-Jones
Bush
Cayzer
Chisholm
Copeland
Cost
Cruse
De Bruijn
Decker
Fensel
Fensel
Frege
Genesereth
Goble
Gruber
Gruber
Guha
Harré
Heery
Heflin
Hendler
Hendler
Horrocks
Horrocks
Kant
Kirk
Klein
Legg
Lenat
Lenat
Lenat
Lenat
Lindsay
Lowe
Lowe
Maedche
McCool
McGuinness
McIlraith
Minsky
Noy
Noy
Pease
Peirce
Peirce
Quillian
Quine
Rorty
Rozenberg
Schlick
Sicilia
Smith
Smith
Smith
Sowa
Sowa
Sowa
Weinberger
Weiss
Zalta
Publication venue: Wiley
Publication date: 01/01/2007

As an informational technology, the World Wide Web has enjoyed spectacular success. In just ten years it has transformed the way information is produced, stored, and shared in arenas as diverse as shopping, family photo albums, and high-level academic research. The “Semantic Web” was touted by its developers as equally revolutionary but has not yet achieved anything like the Web’s exponential uptake. This 17,000-word survey article explores why this might be so, from a perspective that bridges both philosophy and IT.

Deakin Research Online

Crossref

Research Commons@Waikato

Private Enforcement in the States

Author: Guha Neel
Peters Austin
Xia Jeffrey
Zambrano Diego A.
Publication venue: Penn Carey Law: Legal Scholarship Repository
Publication date: 01/01/2024

Scholarship on U.S. litigation and civil procedure has scarcely studied the role of private enforcement in the states. Over the past two decades, scholars have established that, almost uniquely in the world, the U.S. often relies on private parties rather than administrative agencies to enforce important statutory provisions. Take your pick of any area in American governance, and you will find private rights of action: environmental law, civil rights, employment discrimination, antitrust, consumer protection, business competition, securities fraud, and so on. In each of these areas, Congress has deliberately empowered private plaintiffs instead of, or in addition to, government agencies. Yet, despite the vast importance of private enforcement at the federal level, we have no account of how prevalent private rights of action are in state law. And this question is particularly pressing now that a number of states, triggered by the Texas abortion law S.B. 8, are using private enforcement to weaken constitutional rights. Is private enforcement a meaningful method of governance in the states or just at the federal level? Which political conditions lead to the adoption of state private enforcement? And why does it exist? In this Article, we conduct the first systematic empirical investigation of the hidden world of state private enforcement. Using computational linguistics and machine learning, we identify private-enforcement provisions across a unique dataset of all fifty states’ laws going back to 2003. Our results show that private enforcement is ubiquitous at the state level. Even by conservative estimates, there are more than 3,500 private-rights-of-action provisions in state law, ranging from traditional areas like antitrust and employment all the way to privacy violations, lawsuits against police, gravedigging, veterinary care, and waste disposal.
Counterintuitively, private-enforcement provisions are expanding the most in an ideologically mixed group of small states like Utah, New Hampshire, Connecticut, Nebraska, and Wisconsin. One takeaway from these results is that state private enforcement is strikingly different from that of the federal system: it is sprawling, messy, and even chaotic. We also use our data to test conventional theories behind private-enforcement adoption. The most prominent one, the separation-of-powers theory, posits that Congress enacts private rights of action when the executive is controlled by another political party. Our empirical bottom line is that we broadly fail to find evidence in favor of any of the theories, including separation of powers. Regression analyses based on our best estimates of private-enforcement provisions do not yield a statistically meaningful relationship between divided government and private-enforcement adoption. And, while some of our measures for fee-shifting and damage clauses unearth some evidence pointing toward the separation-of-powers theory, our preferred measures of such clauses do not. We even find no correlation between an increased adoption of private enforcement and legislative control by either Democrats or Republicans. It appears the political economy of private enforcement in the states diverges radically from that of the federal government. With an eye toward future theorizing and empirical testing, we put forth three institutional differences between the states and federal government that may explain this divergence. And we sketch a future comparative research agenda focused on studying federal–state divergence. Reaffirming the central role that private enforcement plays in our system reveals the need to reorient civil procedure and incorporate state private rights of action more explicitly into its core teachings.

bepress Legal Repository

Penn Law: Legal Scholarship Repository