3,508 research outputs found

    Efficiency-Quality Tradeoffs for Vector Score Aggregation


    Multi-criteria assessment of ethical aspects in fresh tomato systems: Plant genomics technology innovation and food policy uses

    Product assessment for imperceptible characteristics such as environmental impact, healthfulness, naturalness, and fairness is a helpful tool in product innovation and for enhancing socially responsible conduct. In this study we apply multiple criteria analysis to the assessment of fresh tomatoes in terms of consumer perceptions of the above characteristics. The generated indices provide an explicit and comprehensive representation of consumer perceptions. Existing tomato products from the Dutch market are ranked alongside (reasonable conjectures of) potential products to be developed with the use of plant genomics technology. The results are interpreted to provide insights into the socially optimal use of (plant genomics) technology for fresh tomato production. Policy uses are highlighted.
    Keywords: ethical assessment, corporate societal responsibility, multiple criteria, demand and price analysis, research and development/technology change/emerging technologies
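    The aggregation step described above can be sketched as a weighted average of per-criterion perception scores. This is a minimal illustrative sketch: the product names, criteria scores, and weights below are assumed, not taken from the paper.

```python
# Minimal sketch of multi-criteria index aggregation (illustrative only;
# product names, per-criterion scores, and weights are assumed).

def composite_index(scores, weights):
    """Weighted average of per-criterion scores (higher = better)."""
    total_weight = sum(weights.values())
    return sum(scores[c] * w for c, w in weights.items()) / total_weight

# Assumed consumer-perception scores on a 0-10 scale for four criteria.
products = {
    "conventional":  {"environment": 4, "health": 6, "naturalness": 5, "fairness": 5},
    "organic":       {"environment": 8, "health": 7, "naturalness": 9, "fairness": 7},
    "genomics-bred": {"environment": 7, "health": 8, "naturalness": 4, "fairness": 6},
}
weights = {"environment": 0.3, "health": 0.3, "naturalness": 0.2, "fairness": 0.2}

ranking = sorted(products, key=lambda p: composite_index(products[p], weights),
                 reverse=True)
```

    With these assumed numbers, the composite indices rank the hypothetical products from most to least preferred; different weightings would of course reorder them, which is the lever the policy discussion turns on.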

    Environmental efficiency of small farms in selected EU NMS

    The objective of this paper is to investigate the relationship between environmental efficiency and farm commercialisation in selected EU New Member States (NMS): Bulgaria, Romania and Poland. Using a cross-section of agricultural households, environmental technical efficiency scores are calculated using a hyperbolic distance function approach. The results indicate a negative relation between increased commercialisation of small farms and the production of negative externalities such as nitrogen surplus.
    Keywords: distance function, nitrogen surplus, market integration, stochastic frontier analysis, community/rural/urban development
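    The hyperbolic distance function used here scales desirable output up and the undesirable externality (nitrogen surplus) down by the same factor. One common formulation, given here as an assumption since the paper may parameterise it differently, for inputs x, desirable output y, and bad output b relative to a technology set T, is:

```latex
D_H(x, y, b) = \min\{\theta > 0 : (x,\; y/\theta,\; \theta b) \in T\}
```

    so that D_H <= 1, with D_H = 1 indicating a farm on the environmentally efficient frontier.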

    Pretrained Transformers for Text Ranking: BERT and Beyond

    The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in response to a query. Although the most common formulation of text ranking is search, instances of the task can also be found in many natural language processing applications. This survey provides an overview of text ranking with neural network architectures known as transformers, of which BERT is the best-known example. The combination of transformers and self-supervised pretraining has been responsible for a paradigm shift in natural language processing (NLP), information retrieval (IR), and beyond. In this survey, we provide a synthesis of existing work as a single point of entry for practitioners who wish to gain a better understanding of how to apply transformers to text ranking problems and researchers who wish to pursue work in this area. We cover a wide range of modern techniques, grouped into two high-level categories: transformer models that perform reranking in multi-stage architectures and dense retrieval techniques that perform ranking directly. There are two themes that pervade our survey: techniques for handling long documents, beyond typical sentence-by-sentence processing in NLP, and techniques for addressing the tradeoff between effectiveness (i.e., result quality) and efficiency (e.g., query latency, model and index size). Although transformer architectures and pretraining techniques are recent innovations, many aspects of how they are applied to text ranking are relatively well understood and represent mature techniques. However, there remain many open research questions, and thus in addition to laying out the foundations of pretrained transformers for text ranking, this survey also attempts to prognosticate where the field is heading.
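    The multi-stage architecture the survey describes can be sketched as a cheap first-stage retriever followed by a more expensive reranker applied only to the top-k candidates. The two scoring functions below are deliberately trivial stand-ins (term overlap and an overlap-per-length score), not BM25 or BERT; only the pipeline shape reflects the survey's taxonomy.

```python
# Toy two-stage ranking pipeline: cheap candidate generation, then
# expensive reranking over the shortlist. Both scorers are stand-ins.

def first_stage_score(query, doc):
    """Cheap lexical score: number of query terms appearing in the document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def rerank_score(query, doc):
    """Stand-in for an expensive cross-encoder: overlap weighted by brevity."""
    return first_stage_score(query, doc) / (1 + len(doc.split()))

def search(query, corpus, k=3):
    # Stage 1: retrieve top-k candidates with the cheap scorer.
    candidates = sorted(corpus, key=lambda d: first_stage_score(query, d),
                        reverse=True)[:k]
    # Stage 2: rerank only those k candidates with the expensive scorer.
    return sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)

corpus = [
    "neural text ranking with transformers",
    "transformers for text ranking",
    "cooking pasta at home",
]
results = search("text ranking transformers", corpus, k=2)
```

    The efficiency/effectiveness tradeoff the survey highlights lives in the choice of k: a larger shortlist raises quality but multiplies reranker latency.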

    Dynamic User-Defined Similarity Searching in Semi-Structured Text Retrieval

    Modern text retrieval systems often provide a similarity search utility that allows the user to efficiently find a fixed number k of documents in the data set that are most similar to a given query (here a query is either a simple sequence of keywords or the identifier of a full document, found in previous searches, that is considered of interest). We consider the case of a textual database made of semi-structured documents. For example, in a corpus of bibliographic records any record may be structured into three fields: title, authors and abstract, where each field is an unstructured free text. Each field, in turn, is modelled with a specific vector space. The problem is more complex when we also allow each such vector space to have an associated user-defined dynamic weight that influences its contribution to the overall dynamic aggregated and weighted similarity. This dynamic problem was tackled in a recent paper by Singitham et al. in VLDB 2004. Their proposed solution, which we take as a baseline, is a variant of the cluster-pruning technique that has the potential to scale to very large corpora of documents, and is far more efficient than naive exhaustive search. We devise an alternative way of embedding weights in the data structure, coupled with a non-trivial application of a clustering algorithm based on the furthest-point-first heuristic for the metric k-center problem. Experiments show that we significantly improve the tradeoff between query time and output quality with respect to the baseline method in VLDB 2004, and also with respect to a novel method by Chierichetti et al. to appear in ACM PODS 2007. We also speed up the pre-processing time by a factor of at least thirty.
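    The two building blocks named above can be sketched independently of the paper's weighted multi-field setting: the furthest-point-first (Gonzalez) heuristic picks k well-spread centers, and cluster pruning then answers a query by searching only the cluster of the nearest center. The 2-D toy vectors below are assumptions for illustration; the paper's embedding of user-defined weights is omitted.

```python
import math

# Furthest-point-first (Gonzalez) k-center heuristic plus cluster-pruned
# nearest-neighbor search, on toy 2-D points.

def dist(a, b):
    return math.dist(a, b)  # Euclidean distance (Python 3.8+)

def furthest_point_first(points, k):
    """Pick k centers: start anywhere, then repeatedly take the point
    furthest from its nearest already-chosen center."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(dist(p, c) for c in centers)))
    return centers

def build_clusters(points, centers):
    """Assign every point to its nearest center."""
    clusters = {i: [] for i in range(len(centers))}
    for p in points:
        i = min(range(len(centers)), key=lambda j: dist(p, centers[j]))
        clusters[i].append(p)
    return clusters

def pruned_nn(query, centers, clusters):
    """Approximate NN: search only the cluster of the nearest center."""
    i = min(range(len(centers)), key=lambda j: dist(query, centers[j]))
    return min(clusters[i], key=lambda p: dist(query, p))

points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
centers = furthest_point_first(points, 2)
clusters = build_clusters(points, centers)
```

    Pruning trades exactness for speed: a query near a cluster boundary may miss its true nearest neighbor, which is exactly the quality/time tradeoff the abstract reports improving.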

    Measuring Eco-efficiency of Production: A Frontier Approach

    Eco-efficiency of production is an important concept from the viewpoints of both society and the business community; as yet, however, there is no unambiguous way to measure it. The purpose of this paper is to present a general measurement framework based on production theory and the activity analysis approach. Although we exploit existing methods and techniques, our approach diverges essentially from the usual treatments of the environmental performance of firms in productive efficiency analysis. The main difference between our approach and earlier studies is that we build on the definition of eco-efficiency as the ratio of economic value added to an environmental damage index. Related to this orientation, we also approach eco-efficiency from a more aggregate perspective. Our general framework is illustrated by an empirical application to the evaluation of the eco-efficiency of road transportation in Finland.
    Keywords: eco-efficiency, environmental pressures, aggregation, benefit-of-the-doubt weighting, distance function, activity analysis, data envelopment analysis, road transportation
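    The core definition above, economic value added divided by an environmental damage index, can be sketched directly. For simplicity this sketch uses fixed, assumed pressure weights; the paper's benefit-of-the-doubt (DEA) approach would instead choose, for each unit, the weights most favourable to it. All figures below are invented for illustration.

```python
# Minimal sketch of the eco-efficiency ratio: value added divided by a
# weighted environmental damage index. Weights and data are assumed.

def eco_efficiency(value_added, pressures, weights):
    damage_index = sum(weights[p] * q for p, q in pressures.items())
    return value_added / damage_index

# Assumed data for two transport units; pressures in arbitrary units.
weights = {"CO2": 0.5, "NOx": 0.3, "noise": 0.2}
unit_a = eco_efficiency(100, {"CO2": 40, "NOx": 20, "noise": 10}, weights)
unit_b = eco_efficiency(80,  {"CO2": 50, "NOx": 30, "noise": 20}, weights)
```

    Under benefit-of-the-doubt weighting the weights would be decision variables of a linear program per unit rather than the fixed constants shown here.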

    Bringing Salary Transparency to the World: Computing Robust Compensation Insights via LinkedIn Salary

    The recently launched LinkedIn Salary product has been designed with the goal of providing compensation insights to the world's professionals and thereby helping them optimize their earning potential. We describe the overall design and architecture of the statistical modeling system underlying this product. We focus on the unique data mining challenges encountered while designing and implementing the system, and describe the modeling components, such as Bayesian hierarchical smoothing, that help to compute and present robust compensation insights to users. We report on extensive evaluation with nearly one year of de-identified compensation data collected from over one million LinkedIn users, thereby demonstrating the efficacy of the statistical models. We also highlight the lessons learned through the deployment of our system at LinkedIn.
    Comment: ACM International Conference on Information and Knowledge Management (CIKM 2017)
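    One idea behind hierarchical smoothing can be sketched generically: the estimate for a sparse cohort (say, title plus region) is shrunk toward its parent cohort (title nationwide), with the amount of shrinkage shrinking as the cohort's sample size grows. This is a standard empirical-Bayes-style formula, not LinkedIn's actual model; the prior strength k is an assumed parameter.

```python
# Generic empirical-Bayes-style shrinkage toward a parent cohort mean.
# Not LinkedIn's model; k (prior strength) is an assumed parameter.

def smoothed_mean(samples, parent_mean, k=10.0):
    """Shrunk mean: (n * sample_mean + k * parent_mean) / (n + k).

    With n = 0 this returns the parent estimate; as n grows, the
    cohort's own sample mean dominates.
    """
    n = len(samples)
    if n == 0:
        return parent_mean
    sample_mean = sum(samples) / n
    return (n * sample_mean + k * parent_mean) / (n + k)
```

    The robustness claim in the abstract corresponds to exactly this behavior: a cohort with five responses reports a value pulled well toward its parent, while a cohort with thousands reports essentially its own mean.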

    Graph Convolutional Neural Networks for Web-Scale Recommender Systems

    Recent advancements in deep neural networks for graph-structured data have led to state-of-the-art performance on recommender system benchmarks. However, making these methods practical and scalable to web-scale recommendation tasks with billions of items and hundreds of millions of users remains a challenge. Here we describe a large-scale deep recommendation engine that we developed and deployed at Pinterest. We develop a data-efficient Graph Convolutional Network (GCN) algorithm, PinSage, which combines efficient random walks and graph convolutions to generate embeddings of nodes (i.e., items) that incorporate both graph structure and node feature information. Compared to prior GCN approaches, we develop a novel method based on highly efficient random walks to structure the convolutions, and design a novel training strategy that relies on harder-and-harder training examples to improve the robustness and convergence of the model. We also develop an efficient MapReduce model inference algorithm to generate embeddings using a trained model. We deploy PinSage at Pinterest and train it on 7.5 billion examples on a graph with 3 billion nodes representing pins and boards, and 18 billion edges. According to offline metrics, user studies and A/B tests, PinSage generates higher-quality recommendations than comparable deep learning and graph-based alternatives. To our knowledge, this is the largest application of deep graph embeddings to date and paves the way for a new generation of web-scale recommender systems based on graph convolutional architectures.
    Comment: KDD 2018
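    The random-walk idea used to structure the convolutions can be sketched on its own: instead of convolving over all of a node's graph neighbors, run short random walks from the node and keep the most-visited nodes as an importance-weighted neighborhood. The tiny pin/board graph and all parameters below are toy assumptions, not values from the paper.

```python
import random
from collections import Counter

# Sketch of random-walk neighborhood selection: the top-T most-visited
# nodes from short walks become the node's weighted "neighborhood".
# Graph and parameters are toy assumptions.

def walk_neighborhood(graph, start, num_walks=200, walk_len=2, top_t=2, seed=0):
    rng = random.Random(seed)  # fixed seed for reproducibility
    visits = Counter()
    for _ in range(num_walks):
        node = start
        for _ in range(walk_len):
            node = rng.choice(graph[node])
            if node != start:          # do not count the start node itself
                visits[node] += 1
    total = sum(visits.values())
    # Top-T visited nodes, with normalized visit counts as importance weights.
    return [(n, c / total) for n, c in visits.most_common(top_t)]

graph = {
    "pin":    ["board1", "board2"],
    "board1": ["pin", "pin2"],
    "board2": ["pin", "pin3"],
    "pin2":   ["board1"],
    "pin3":   ["board2"],
}
nbrs = walk_neighborhood(graph, "pin")
```

    Because the neighborhood size is fixed at T regardless of a node's degree, memory and compute per node stay bounded, which is what makes the approach viable on a billions-of-nodes graph.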