
    The 2003 Invasion of Iraq - A Study of the Underlying Reasons for the Nordic Countries' Positions

    The aim of the study has been to map why the Nordic states Sweden, Norway and Denmark took different positions on the Iraq conflict. The study is a comparative investigation that answers the question of which underlying factors influenced the positions taken. In examining these, five variables grounded in the theoretical starting points, the system and state levels, were studied. The following variables were examined: the UN, the EU, NATO, domestic political conditions, and the decision to go to war. Sweden and Norway argued that the invasion lacked a basis in international law. The view of the UN as a guarantor of small states' sovereignty and the pursuit of multilateral agreements guided their decisions. For Sweden, domestic political conditions also played a role. Denmark chose to support the invasion militarily, which is rooted in its changed foreign and security policy, a change that has entailed a downgrading of the role of the UN. Together with Denmark's opt-out from defence cooperation within the EU, this policy shift has led to an almost unconditional identification with American security policy objectives and strategies.

    Hierarchical Forecasting at Scale

    Existing hierarchical forecasting techniques scale poorly when the number of time series increases. We propose to learn a coherent forecast for millions of time series with a single bottom-level forecast model by using a sparse loss function that directly optimizes the hierarchical product and/or temporal structure. The benefit of our sparse hierarchical loss function is that it gives practitioners a method of producing bottom-level forecasts that are coherent to any chosen cross-sectional or temporal hierarchy. In addition, removing the post-processing step required by traditional hierarchical forecasting techniques reduces the computational cost of the prediction phase in the forecasting pipeline. On the public M5 dataset, our sparse hierarchical loss function performs up to 10% better (in RMSE) than the baseline loss function. We implemented our sparse hierarchical loss function within an existing forecasting model at bol, a large European e-commerce platform, improving forecasting performance by 2% at the product level. Finally, we found an increase in forecasting performance of about 5-10% when evaluating across the cross-sectional hierarchies that we defined. These results demonstrate the usefulness of our sparse hierarchical loss applied to a production forecasting system at a major e-commerce platform.
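    A minimal sketch of the idea behind such a loss, assuming a PyTorch setup and a hand-built sparse aggregation matrix S that maps bottom-level series to higher levels of the hierarchy (the matrix, toy dimensions, and squared-error objective are illustrative assumptions, not the paper's exact formulation):

```python
import torch

def hierarchical_loss(bottom_pred, bottom_true, S, level_weights=None):
    """Squared-error loss evaluated on all hierarchy levels at once.

    bottom_pred, bottom_true: (batch, n_bottom) bottom-level forecasts/targets.
    S: sparse (n_total, n_bottom) 0/1 aggregation matrix; each row sums a
       subset of bottom-level series into one node of the hierarchy.
    """
    # Aggregate bottom-level series up the hierarchy with one sparse matmul,
    # so coherence holds by construction (aggregates are sums of bottom levels).
    pred_all = torch.sparse.mm(S, bottom_pred.T).T   # (batch, n_total)
    true_all = torch.sparse.mm(S, bottom_true.T).T
    err = (pred_all - true_all) ** 2
    if level_weights is not None:                    # optionally reweight nodes
        err = err * level_weights
    return err.mean()

# Toy hierarchy: 4 bottom series, 2 mid-level groups, 1 total = 7 nodes.
rows = torch.tensor([0, 1, 2, 3, 4, 4, 5, 5, 6, 6, 6, 6])
cols = torch.tensor([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3])
vals = torch.ones(len(rows))
S = torch.sparse_coo_tensor(torch.stack([rows, cols]), vals, (7, 4)).coalesce()

pred = torch.randn(8, 4, requires_grad=True)
true = torch.randn(8, 4)
loss = hierarchical_loss(pred, true, S)
loss.backward()   # gradients flow only to the bottom-level forecasts
```

    Because the aggregation happens inside the loss, only one bottom-level model is trained and no reconciliation post-processing is needed at prediction time.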

    A Comparison of Supervised Learning to Match Methods for Product Search

    The vocabulary gap is a core challenge in information retrieval (IR). In e-commerce applications like product search, the vocabulary gap is reported to be a bigger challenge than in more traditional IR application areas, such as news search or web search. As recent learning to match methods have made important advances in bridging the vocabulary gap for these traditional IR areas, we investigate their potential in the context of product search. In this paper we provide insights into using recent learning to match methods for product search. We compare both the effectiveness and the efficiency of these methods in a product search setting and analyze their performance on two product search datasets, with 50,000 queries each. One is an open dataset made available as part of a community benchmark activity at CIKM 2016. The other is a proprietary query log obtained from a European e-commerce platform. This comparison is conducted towards a better understanding of the trade-offs in choosing a preferred model for this task. We find that (1) models that have been specifically designed for short text matching, like MV-LSTM and DRMMTKS, are consistently among the top three methods in all experiments; however, taking efficiency and accuracy into account at the same time, ARC-I is the preferred model for real-world use cases; and (2) the performance of a state-of-the-art BERT-based model is mediocre, which we attribute to the fact that the text BERT is pre-trained on is very different from the text we have in product search. We also provide insights into factors that can influence model behavior for different types of queries, such as the length of the retrieved list and query complexity, and discuss the implications of our findings for e-commerce practitioners with respect to choosing a well-performing method. Comment: 10 pages, 5 figures, Accepted at SIGIR Workshop on eCommerce 202
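    As a rough illustration of the kind of lightweight short-text matcher that the efficiency/effectiveness trade-off favours, here is a minimal siamese convolutional scorer loosely in the spirit of ARC-I (a PyTorch sketch; the class name, layer sizes, and scoring head are assumptions, not the architecture evaluated in the paper):

```python
import torch
import torch.nn as nn

class ArcILikeMatcher(nn.Module):
    """Siamese convolutional matcher: encode the query and the product title
    separately, then score the concatenated representations with a small MLP."""

    def __init__(self, vocab_size, emb_dim=128, n_filters=64, kernel_size=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size, padding=1)
        self.scorer = nn.Sequential(
            nn.Linear(2 * n_filters, 64), nn.ReLU(), nn.Linear(64, 1))

    def encode(self, ids):
        x = self.emb(ids).transpose(1, 2)          # (batch, emb_dim, seq_len)
        x = torch.relu(self.conv(x))               # (batch, n_filters, seq_len)
        return x.max(dim=2).values                 # max-pool over positions

    def forward(self, query_ids, title_ids):
        q, t = self.encode(query_ids), self.encode(title_ids)
        return self.scorer(torch.cat([q, t], dim=1)).squeeze(-1)

# Toy forward pass with random token ids (real use needs a shared vocabulary
# and a pairwise or pointwise ranking loss on click or purchase labels).
model = ArcILikeMatcher(vocab_size=10_000)
scores = model(torch.randint(1, 10_000, (4, 8)), torch.randint(1, 10_000, (4, 20)))
```

    Such representation-based models encode query and title independently, which keeps inference cheap compared to interaction-heavy or BERT-based matchers.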

    Improving Retrieval-Augmented Large Language Models via Data Importance Learning

    Retrieval augmentation enables large language models to take advantage of external knowledge, for example on tasks like question answering and data imputation. However, the performance of such retrieval-augmented models is limited by the data quality of their underlying retrieval corpus. In this paper, we propose an algorithm based on multilinear extension for evaluating the data importance of retrieved data points. There are exponentially many terms in the multilinear extension, and one key contribution of this paper is a polynomial-time algorithm that, given a retrieval-augmented model with an additive utility function and a validation set, exactly computes the data importance of data points in the retrieval corpus using the multilinear extension of the model's utility function. We further propose an even more efficient (ε, δ)-approximation algorithm. Our experimental results illustrate that we can enhance the performance of large language models by only pruning or reweighting the retrieval corpus, without requiring further training. For some tasks, this even allows a small model (e.g., GPT-JT), augmented with a search engine API, to outperform GPT-3.5 (without retrieval augmentation). Moreover, we show that weights based on multilinear extension can be computed efficiently in practice (e.g., in less than ten minutes for a corpus with 100 million elements).
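    To make the quantity being computed concrete, here is a brute-force Monte Carlo sketch of multilinear-extension-based importance under an assumed toy additive utility. It illustrates the definition only, not the paper's polynomial-time exact or (ε, δ)-approximation algorithms; the function names and the toy utility are assumptions:

```python
import random

def multilinear_importance(corpus, utility, p=0.5, n_samples=200, seed=0):
    """Monte Carlo estimate of each point's importance: the expected marginal
    contribution U(S + {i}) - U(S), where S includes every other corpus point
    independently with probability p. This is the partial derivative of the
    multilinear extension of U evaluated at (p, ..., p)."""
    rng = random.Random(seed)
    importance = []
    for i in range(len(corpus)):
        others = corpus[:i] + corpus[i + 1:]
        total = 0.0
        for _ in range(n_samples):
            subset = [x for x in others if rng.random() < p]
            total += utility(subset + [corpus[i]]) - utility(subset)
        importance.append(total / n_samples)
    return importance

# Toy additive utility: each retrieved snippet either helps or hurts a
# validation metric; a real setup would score the retrieval-augmented model
# on a validation set instead of using fixed per-snippet gains.
corpus = ["relevant fact A", "outdated fact", "relevant fact B", "noise"]
gain = {"relevant fact A": 1.0, "outdated fact": -1.0,
        "relevant fact B": 1.0, "noise": -0.5}
utility = lambda subset: sum(gain[x] for x in subset)
print(multilinear_importance(corpus, utility))
```

    Points with negative importance are candidates for pruning or down-weighting in the retrieval corpus, which is how the paper improves model performance without further training.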

    Garbage In, Garbage Out? Do Machine Learning Application Papers in Social Computing Report Where Human-Labeled Training Data Comes From?

    Many machine learning projects for new application areas involve teams of humans who label data for a particular purpose, from hiring crowdworkers to the paper's authors labeling the data themselves. Such a task is quite similar to (or a form of) structured content analysis, which is a longstanding methodology in the social sciences and humanities, with many established best practices. In this paper, we investigate to what extent a sample of machine learning application papers in social computing (specifically, papers from ArXiv and traditional publications performing an ML classification task on Twitter data) give specific details about whether such best practices were followed. Our team conducted multiple rounds of structured content analysis of each paper, making determinations such as: does the paper report who the labelers were, what their qualifications were, whether they independently labeled the same items, whether inter-rater reliability metrics were disclosed, what level of training and/or instructions was given to labelers, whether compensation for crowdworkers is disclosed, and whether the training data is publicly available. We find a wide divergence in whether such practices were followed and documented. Much of machine learning research and education focuses on what is done once a "gold standard" of training data is available, but we discuss issues around the equally important question of whether such data is reliable in the first place. Comment: 18 pages, includes appendix
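    Since inter-rater reliability reporting is one of the practices examined, here is a minimal sketch of one common such metric, Cohen's kappa for two labelers (plain Python; the toy label lists are made up for illustration):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two labelers, corrected for the
    agreement expected by chance from their individual label distributions."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Two annotators labeling the same ten tweets as relevant ("rel") or not ("irr").
a = ["rel", "rel", "irr", "rel", "irr", "irr", "rel", "rel", "irr", "rel"]
b = ["rel", "irr", "irr", "rel", "irr", "rel", "rel", "rel", "irr", "rel"]
print(round(cohens_kappa(a, b), 3))
```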

    Full-Length L1CAM and Not Its Δ2Δ27 Splice Variant Promotes Metastasis through Induction of Gelatinase Expression

    Tumour-specific splicing is known to contribute to cancer progression. In the case of the L1 cell adhesion molecule (L1CAM), which is expressed in many human tumours and often linked to bad prognosis, alternative splicing results in a full-length form (FL-L1CAM) and a splice variant lacking exons 2 and 27 (SV-L1CAM). It has not been elucidated so far whether SV-L1CAM, classically considered tumour-associated, or FL-L1CAM is the metastasis-promoting isoform. Here, we show that both variants were expressed in human ovarian carcinoma and that exposure of tumour cells to pro-metastatic factors led to an exclusive increase of FL-L1CAM expression. Selective overexpression of one isoform in different tumour cells revealed that only FL-L1CAM promoted experimental lung and/or liver metastasis in mice. In addition, metastasis formation upon up-regulation of FL-L1CAM correlated with increased invasive potential and elevated matrix metalloproteinase (MMP)-2 and -9 expression and activity in vitro, as well as enhanced gelatinolytic activity in vivo. In conclusion, we identified FL-L1CAM as the metastasis-promoting isoform, thereby exemplifying that high expression of a so-called tumour-associated variant, here SV-L1CAM, is not per se equivalent to a decisive role of this isoform in tumour progression.

    Scalable Data Analysis in Massively Parallel Dataflow Systems

    This thesis lays the groundwork for enabling scalable data mining on large datasets in massively parallel dataflow systems. Such datasets have become ubiquitous. We illustrate common fallacies with respect to scalable data mining: it is in no way sufficient to naively implement textbook algorithms on parallel systems; bottlenecks on all layers of the stack prevent such naive implementations from scaling. We argue that scalability in data mining is a multi-level problem and must therefore be approached through the interplay of algorithms, systems, and applications, and we discuss a selection of scalability problems at these different levels. We investigate algorithm-specific scalability aspects of collaborative filtering algorithms for computing recommendations, a popular data mining use case with many industry deployments. We show how to efficiently execute the two most common approaches, namely neighborhood methods and latent factor models, on MapReduce, and describe a specialized architecture for scaling collaborative filtering to extremely large datasets, which we implemented at Twitter. We then turn to system-specific scalability aspects, where we improve system performance during the distributed execution of a special class of iterative algorithms by drastically reducing the overhead required for guaranteeing fault tolerance. To this end, we propose a novel optimistic approach to fault tolerance that exploits the robust convergence properties of a large class of fixpoint algorithms and does not incur measurable overhead in failure-free cases. Finally, we present work on an application-specific scalability aspect of scalable data mining. A common problem when deploying machine learning applications in real-world scenarios is that the prediction quality of ML models heavily depends on hyperparameters that have to be chosen in advance. We propose an algorithmic framework for an important subproblem occurring during hyperparameter search at scale: efficiently generating samples from block-partitioned matrices in a shared-nothing environment. For every selected problem, we show how to execute the resulting computation automatically in a parallel and scalable manner, and we evaluate our proposed solutions on large datasets with billions of data points.
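    For a flavour of the neighborhood-style collaborative filtering that the thesis scales out, here is a minimal map/reduce-style item co-occurrence computation in plain Python (a local sketch of the dataflow pattern, not the thesis's MapReduce implementation; the data and names are illustrative):

```python
from collections import defaultdict
from itertools import combinations

# Toy interaction log: (user, item) pairs; at scale these would be partitioned
# across workers and the two phases below would run as map and reduce steps.
interactions = [
    ("alice", "book"), ("alice", "dvd"), ("alice", "game"),
    ("bob", "book"), ("bob", "dvd"),
    ("carol", "dvd"), ("carol", "game"),
]

# "Map" phase: group items per user.
items_per_user = defaultdict(set)
for user, item in interactions:
    items_per_user[user].add(item)

# "Reduce" phase: count item co-occurrences across users; these counts are the
# raw similarities a neighborhood method would later normalize and rank.
cooccurrence = defaultdict(int)
for items in items_per_user.values():
    for a, b in combinations(sorted(items), 2):
        cooccurrence[(a, b)] += 1

for (a, b), count in sorted(cooccurrence.items(), key=lambda kv: -kv[1]):
    print(f"{a} ~ {b}: co-occurs for {count} user(s)")
```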