Data Structures and Data Types in Object-Oriented Databases
The possibility of finding a static type system for object-oriented programming languages was initiated by Cardelli [Car88, CW85] who showed that it is possible to express the polymorphic nature of functions such a
The Infati Data
The ability to perform meaningful empirical studies is of essence in research
in spatio-temporal query processing. Such studies are often necessary to gain
detailed insight into the functional and performance characteristics of
proposals for new query processing techniques.
We present a collection of spatio-temporal data, collected during an
intelligent speed adaptation project, termed INFATI, in which some two dozen
cars equipped with GPS receivers and logging equipment took part. We describe
how the data was collected and how it was "modified" to afford the drivers some
degree of anonymity.
We also present the road network in which the cars were moving during data
collection.
The GPS data is publicly available for non-commercial purposes. It is our
hope that this resource will help the spatio-temporal research community in its
efforts to develop new and better query processing techniques.
FedCLIP: Fast Generalization and Personalization for CLIP in Federated Learning
Federated learning (FL) has emerged as a new paradigm for privacy-preserving
computation in recent years. Unfortunately, FL faces two critical challenges
that hinder its actual performance: data distribution heterogeneity and high
resource costs brought by large foundation models. Specifically, the non-IID
data across clients makes it hard for existing FL algorithms to converge, while
the high resource costs, including computational and communication costs,
increase the deployment difficulty in real-world scenarios. In this paper, we
propose an effective yet simple method, named FedCLIP, to achieve fast
generalization and personalization for CLIP in federated learning. Concretely,
we design an attention-based adapter for the large model, CLIP, and the
remaining operations depend only on the adapters. Lightweight adapters can make
the most of pretrained model information and ensure that models adapt to
clients in specific tasks. Simultaneously, small-scale operations can mitigate
the computational and communication burden caused by large models. Extensive
experiments are conducted on three datasets with distribution shifts.
Qualitative and quantitative results demonstrate that FedCLIP significantly
outperforms other baselines (9% overall improvements on PACS) and effectively
reduces computational and communication costs (283x faster than FedAVG). Our
code will be available at: https://github.com/microsoft/PersonalizedFL.
Comment: Accepted by IEEE Data Engineering Bulletin; code is at:
https://github.com/microsoft/PersonalizedF
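The abstract says that everything besides the frozen CLIP backbone depends only on the adapters, which suggests that only adapter weights need to travel between clients and server. The following is a minimal sketch of that idea, not the authors' implementation: it assumes the server combines client adapters with a sample-size-weighted average in the style of FedAvg, and the names `Adapter` and `average_adapters` are hypothetical.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of adapter weights.
# The CLIP backbone is assumed frozen and is never sent or averaged.
Adapter = Dict[str, List[float]]


def average_adapters(client_adapters: List[Adapter],
                     client_sizes: List[int]) -> Adapter:
    """Size-weighted average of client adapter weights (FedAvg-style assumption)."""
    if not client_adapters or len(client_adapters) != len(client_sizes):
        raise ValueError("need one sample count per client adapter")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Adapter = {}
    for name, reference in client_adapters[0].items():
        averaged[name] = [
            sum(adapter[name][i] * size
                for adapter, size in zip(client_adapters, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


# Two clients with a tiny one-tensor adapter; the second holds three times more data.
global_adapter = average_adapters(
    [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}],
    [1, 3],
)
print(global_adapter)
```

Because only the small adapter dictionaries are exchanged in this sketch, the communication cost per round scales with the adapter size rather than with the size of the full model.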
Repositories for Institutional Open Access: Mandated Deposit Policies
Only 15% of articles are currently being made Open Access (OA) through spontaneous self-archiving efforts by their authors. They average 25%-250% more citations in all 12 disciplines tested so far. Ninety-four percent of journals endorse immediate OA self-archiving. There is no evidence that self-archiving induces subscription cancellations. The "OA advantage" consists of: Early Advantage (early self-archiving produces both earlier and more citations), Usage Advantage (more downloads for OA articles, correlated with later citations), Competitive Advantage (relative citation advantage of OA over non-OA articles: disappears at 100% OA), Quality Advantage (OA advantage is higher, the higher the quality of the article) and Quality Bias (authors selectively self-archiving their higher quality articles, a non-causal component: disappears at 100% OA). We are currently comparing the OA advantage for mandated and spontaneous (self-selected) self-archiving. Deposit rates in Institutional Repositories (IRs) remain at 15% if unmandated, but climb toward 100% OA if mandated, confirming surveys that predicted 95% compliance. In the UK, 4 of the 8 research funding councils and the Wellcome Trust mandate self-archiving and it is being considered by the European Commission and the US federal FRPAA. There is no reason for universities to wait for the passage of the legislation. Five universities and two research institutions (including CERN) have already mandated it, with documented success. An Immediate-Deposit/Optional-Access Mandate covers all cases and moots all legal issues: metadata are immediately visible webwide and, where needed, access to the postprint can be set as Closed Access instead of OA throughout any embargo period. Software to support this approach (that allows the author to email individual copies of non-Open Access papers to individual requesters) has been created for both EPrints and DSpace repository platforms.
QueRIE: Collaborative Database Exploration
Interactive database exploration is a key task in information mining. However, users who lack SQL expertise or familiarity with the database schema face great difficulties in performing this task. To aid these users, we developed the QueRIE system for personalized query recommendations. QueRIE continuously monitors the user's querying behavior and finds matching patterns in the system's query log, in an attempt to identify previous users with similar information needs. Subsequently, QueRIE uses these "similar" users and their queries to recommend queries that the current user may find interesting. In this work we describe an instantiation of the QueRIE framework, where the active user's session is represented by a set of query fragments. The recorded fragments are used to identify similar query fragments in the previously recorded sessions, which are in turn assembled in potentially interesting queries for the active user. We show through experimentation that the proposed method generates meaningful recommendations on real-life traces from the SkyServer database and propose a scalable design that enables the incremental update of similarities, making real-time computations on large amounts of data feasible. Finally, we compare this fragment-based instantiation with our previously proposed tuple-based instantiation, discussing the advantages and disadvantages of each approach.
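The fragment-based idea above, in which a session is a set of query fragments and similar past sessions supply candidate fragments, can be illustrated with a small sketch. This is not the QueRIE algorithm itself: the Jaccard similarity measure, the scoring rule, the function names, and the example fragments are all assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two fragment sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_fragments(active: Set[str],
                        past_sessions: List[Set[str]],
                        top_k: int = 3) -> List[str]:
    """Rank fragments the active user has not used, weighted by session similarity."""
    scores: Dict[str, float] = defaultdict(float)
    for session in past_sessions:
        similarity = jaccard(active, session)
        if similarity == 0.0:
            continue  # sessions with no overlap contribute nothing
        for fragment in session - active:
            scores[fragment] += similarity
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores, key=lambda f: (-scores[f], f))
    return ranked[:top_k]


# Illustrative fragments only; not taken from real SkyServer traces.
active_session = {"FROM PhotoObj", "WHERE ra > 180"}
query_log = [
    {"FROM PhotoObj", "WHERE ra > 180", "SELECT objID"},
    {"FROM PhotoObj", "SELECT dec"},
    {"FROM SpecObj"},
]
print(recommend_fragments(active_session, query_log))
```

In a full system the ranked fragments would still have to be assembled into complete queries, a step this sketch leaves out.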
Distributed stream reasoning
Stream Reasoning is the combination of reasoning techniques with data streams. In this paper, we present our approach to enable rule-based reasoning on semantic data streams in a distributed manne
Within-Journal Demonstrations of the Open-Access Impact Advantage: PLoS, Pipe-Dreams and Peccadillos (LETTER)
Eysenbach's (2006) study in PLoS Biology on 1492 articles published during one 6-month period in one journal (PNAS) found that the Open Access (OA) articles were more cited than the non-OA ones. The online bibliography on the OA citation advantage http://opcit.eprints.org/oacitation-biblio.html records a number of prior within-journal comparisons that found exactly the same effect: freely available articles are read and cited more. Eysenbach's further finding that the OA advantage (in this particular 6-month, 3-option, 1-journal PLoS/PNAS study) is greater for articles that have paid for OA publication than for those that have merely been self-archived will require replication on much larger samples, as most of the prior evidence for the OA advantage comes from self-archived articles and is based on sample sizes four orders of magnitude larger for both the number of articles and the number of journals tested.