2,825 research outputs found
XML content warehousing: Improving sociological studies of mailing lists and web data
In this paper, we present the guidelines for an XML-based approach for the
sociological study of Web data such as the analysis of mailing lists or
databases available online. The use of an XML warehouse is a flexible solution
for storing and processing this kind of data. We propose an implemented
solution and show possible applications with our case study of profiles of
experts involved in W3C standard-setting activity. We illustrate the
sociological use of semi-structured databases by presenting our XML Schema for
mailing-list warehousing. An XML Schema allows many adjunctions or crossings of
data sources, without modifying existing data sets, while allowing possible
structural evolution. We also show that the existence of hidden data implies
increased complexity for traditional SQL users. XML content warehousing allows
altogether exhaustive warehousing and recursive queries through contents, with
far less dependence on the initial storage. We finally present the possibility
of exporting the data stored in the warehouse to commonly-used advanced
software devoted to sociological analysis
Social Search with Missing Data: Which Ranking Algorithm?
Online social networking tools are extremely popular, but can miss potential discoveries latent in the social 'fabric'. Matchmaking services which can do naive profile matching with old database technology are too brittle in the absence of key data, and even modern ontological markup, though powerful, can be onerous at data-input time. In this paper, we present a system called BuddyFinder which can automatically identify buddies who can best match a user's search requirements specified in a term-based query, even in the absence of stored user-profiles. We deploy and compare five statistical measures, namely, our own CORDER, mutual information (MI), phi-squared, improved MI and Z score, and two TF/IDF based baseline methods to find online users who best match the search requirements based on 'inferred profiles' of these users in the form of scavenged web pages. These measures identify statistically significant relationships between online users and a term-based query. Our user evaluation on two groups of users shows that BuddyFinder can find users highly relevant to search queries, and that CORDER achieved the best average ranking correlations among all seven algorithms and improved the performance of both baseline methods
RACOFI: A Rule-Applying Collaborative Filtering System
In this paper we give an overview of the RACOFI (Rule-Applying Collaborative Filtering) multidimensional rating system and its related technologies. This will be exemplified with RACOFI Music, an implemented collaboration agent that assists on-line users in the rating and recommendation of audio (Learning) Objects. It lets users rate contemporary Canadian music in the five dimensions of impression, lyrics, music, originality, and production. The collaborative filtering algorithms STI Pearson, STIN2, and the Per Item Average algorithms are then employed together with RuleML-based rules to recommend music objects that best match user queries. RACOFI has been on-line since August 2003 at http://racofi.elg.ca.
Advanced Knowledge Technologies at the Midterm: Tools and Methods for the Semantic Web
The University of Edinburgh and research sponsors are authorised to reproduce and distribute reprints and on-line copies for their purposes notwithstanding any copyright annotation hereon. The views and conclusions contained herein are the authorâs and shouldnât be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of other parties.In a celebrated essay on the new electronic media, Marshall McLuhan wrote in 1962:Our private senses are not closed systems but are endlessly translated into each other in that experience which we call consciousness. Our extended senses, tools, technologies, through the ages, have been closed systems incapable of interplay or collective awareness. Now, in the electric age, the very
instantaneous nature of co-existence among our technological instruments has created a crisis quite new in human history. Our extended faculties and senses now constitute a single field of experience which demands that they become collectively conscious. Our technologies, like our private senses, now demand an interplay and ratio that makes rational co-existence possible. As long as our technologies were as slow as the wheel or the alphabet or money, the fact that
they were separate, closed systems was socially and psychically supportable. This is not true now when sight and sound and movement are simultaneous and global in extent. (McLuhan 1962, p.5, emphasis in original)Over forty years later, the seamless interplay that McLuhan demanded between our
technologies is still barely visible. McLuhanâs predictions of the spread, and increased importance, of electronic media have of course been borne out, and the worlds of business, science and knowledge storage and transfer have been revolutionised. Yet
the integration of electronic systems as open systems remains in its infancy.Advanced Knowledge Technologies (AKT) aims to address this problem, to create a view of knowledge and its management across its lifecycle, to research and create the
services and technologies that such unification will require. Half way through its sixyear span, the results are beginning to come through, and this paper will explore some of the services, technologies and methodologies that have been developed. We hope to give a sense in this paper of the potential for the next three years, to discuss the insights and lessons learnt in the first phase of the project, to articulate the challenges and issues that remain.The WWW provided the original context that made the AKT approach to knowledge
management (KM) possible. AKT was initially proposed in 1999, it brought together an interdisciplinary consortium with the technological breadth and complementarity to create the conditions for a unified approach to knowledge across its lifecycle. The
combination of this expertise, and the time and space afforded the consortium by the
IRC structure, suggested the opportunity for a concerted effort to develop an approach
to advanced knowledge technologies, based on the WWW as a basic infrastructure.The technological context of AKT altered for the better in the short period between the development of the proposal and the beginning of the project itself with the development of the semantic web (SW), which foresaw much more intelligent manipulation and querying of knowledge. The opportunities that the SW provided for e.g., more intelligent retrieval, put AKT in the centre of information technology innovation and knowledge management services; the AKT skill set would clearly be central for the exploitation of those opportunities.The SW, as an extension of the WWW, provides an interesting set of constraints to
the knowledge management services AKT tries to provide. As a medium for the
semantically-informed coordination of information, it has suggested a number of ways in which the objectives of AKT can be achieved, most obviously through the
provision of knowledge management services delivered over the web as opposed to the creation and provision of technologies to manage knowledge.AKT is working on the assumption that many web services will be developed and provided for users. The KM problem in the near future will be one of deciding which services are needed and of coordinating them. Many of these services will be largely or entirely legacies of the WWW, and so the capabilities of the services will vary. As well as providing useful KM services in their own right, AKT will be aiming to exploit this opportunity, by reasoning over services, brokering between them, and providing essential meta-services for SW knowledge service management.Ontologies will be a crucial tool for the SW. The AKT consortium brings a lot of expertise on ontologies together, and ontologies were always going to be a key part of the strategy. All kinds of knowledge sharing and transfer activities will be mediated by ontologies, and ontology management will be an important enabling task. Different
applications will need to cope with inconsistent ontologies, or with the problems that will follow the automatic creation of ontologies (e.g. merging of pre-existing
ontologies to create a third). Ontology mapping, and the elimination of conflicts of
reference, will be important tasks. All of these issues are discussed along with our
proposed technologies.Similarly, specifications of tasks will be used for the deployment of knowledge services over the SW, but in general it cannot be expected that in the medium term there will be standards for task (or service) specifications. The brokering metaservices
that are envisaged will have to deal with this heterogeneity.The emerging picture of the SW is one of great opportunity but it will not be a wellordered, certain or consistent environment. It will comprise many repositories of legacy data, outdated and inconsistent stores, and requirements for common understandings across divergent formalisms. There is clearly a role for standards to play to bring much of this context together; AKT is playing a significant role in these efforts. But standards take time to emerge, they take political power to enforce, and they have been known to stifle innovation (in the short term). AKT is keen to understand the balance between principled inference and statistical processing of web content. Logical inference on the Web is tough. Complex queries using traditional AI inference methods bring most distributed computer systems to their knees. Do we set up semantically well-behaved areas of the Web? Is any part of the Web in which
semantic hygiene prevails interesting enough to reason in? These and many other
questions need to be addressed if we are to provide effective knowledge technologies
for our content on the web
Enticing Local Governments to Produce FAIR Freedom of Information Act Dossiers
Government transparency is central in a democratic society, and increasingly governments at all levels are required to publish records and data either proactively, or upon so-called Freedom of Information (FIA) requests. However, public bodies who are required by law to publish many of their documents turn out to have great difficulty to do so. And what they publish often is in a format that still breaches the requirements of the law, stipulating principles comparable to the FAIR data principles. Hence, this demo is addressing a timely problem: the FAIR publication of FIA dossiers, which is obligatory in The Netherlands since May 1st 2022.</p
Ontologies on the semantic web
As an informational technology, the World Wide Web has enjoyed spectacular success. In just ten years it has transformed the way information is produced, stored, and shared in arenas as diverse as shopping, family photo albums, and high-level academic research. The âSemantic Webâ was touted by its developers as equally revolutionary but has not yet achieved anything like the Webâs exponential uptake. This 17 000 word survey article explores why this might be so, from a perspective that bridges both philosophy and IT
Private Enforcement in the States
Scholarship on U.S. litigation and civil procedure has scarcely studied the role of private enforcement in the states. Over the past two decades, scholars have established that, almost uniquely in the world, the U.S. often relies on private parties rather than administrative agencies to enforce important statutory provisions. Take your pick of any area in American governance, and you will find private rights of action: environmental law, civil rights, employment discrimination, antitrust, consumer protection, business competition, securities fraud, and so on. In each of these areas, Congress has deliberately empowered private plaintiffs instead of, or in addition to, government agencies. Yet, despite the vast importance of private enforcement at the federal level, we have no account of how prevalent private rights of action are in state law. And this question is particularly pressing now that a number of statesâ triggered by the Texas abortion law S.B. 8âare using private enforcement to weaken constitutional rights. Is private enforcement a meaningful method of governance in the states or just at the federal level? Which political conditions lead to the adoption of state private enforcement? And why does it exist?
In this Article, we conduct the first systematic empirical investigation of the hidden world of state private enforcement. Using computational linguistics and machine learning, we identify private-enforcement provisions across a unique dataset of all fifty statesâ laws going back to 2003. Our results show that private enforcement is ubiquitous at the state level. Even by conservative estimates, there are more than 3,500 private-rights-of-action provisions in state law, ranging from traditional areas like antitrust and employment all the way to privacy violations, lawsuits against police, gravedigging, veterinary care, and waste disposal. Counterintuitively, private-enforcement provisions are expanding the most in an ideologically mixed group of small states like Utah, New Hampshire, Connecticut, Nebraska, and Wisconsin. One takeaway from these results is that state private enforcement is strikingly different from that of the federal systemâit is sprawling, messy, and even chaotic.
We also use our data to test conventional theories behind private-enforcement adoption. The most prominent oneâthe separation-of-powers theoryâposits that Congress enacts private rights of action when the executive is controlled by another political party. Our empirical bottom line is that we broadly fail to find evidence in favor of any of the theories, including separation of powers. Regression analyses based on our best estimates of private-enforcement provisions do not yield a statistically meaningful relationship between divided government and private-enforcement adoption. And, while some of our measures for fee-shifting and damage clauses unearth some evidence pointing toward the separations-of-powers theory, our preferred measures of such clauses do not. We even find no correlation between an increased adoption of private enforcement and legislative control by either Democrats or Republicans. It appears the political economy of private enforcement in the states diverges radically from that of the federal government. With an eye toward future theorizing and empirical testing, we put forth three institutional differences between the states and federal government that may explain this divergence. And we sketch a future comparative research agenda focused on studying federalâstate divergence. Reaffirming the central role that private enforcement plays in our system reveals the need to reorient civil procedure and incorporate state private rights of action more explicitly into its core teachings
- âŠ