Automated Keyword Generation in the Public Radio Sector Using Word Embeddings
Public broadcasters find themselves in a difficult situation when it comes to digital offers. A growing number of use cases require metadata, e.g. to allow radio editors to search for content pieces, to set up content-based recommendation services, to allow users to browse by categories or tags, or to optimize content for search engines. Broadcasters need proper metadata to manage digital products and to offer new and timely services, and they often have their own taxonomy of keywords at hand. However, manually distilling metadata, particularly in the form of keywords, may become an operational bottleneck, whereas automatic keyword generation does not always provide the desired accuracy and requires continuous human effort for training classifiers and monitoring their accuracy. Building on recent word embedding techniques, we present a novel approach that assigns keywords from a taxonomy to documents on the basis of distributed representations of words and documents. The approach does not require annotation by human experts, and we evaluate it on a large dataset from a German nationwide broadcaster. Preliminary results suggest that keywords can be generated automatically in an unsupervised way in the public radio sector.
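The idea of assigning taxonomy keywords through distributed representations can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy three-dimensional vectors, the averaging of word vectors into a document vector, and the helper names `mean_vector`, `cosine`, and `rank_keywords`. A real system would load embeddings trained on a large corpus.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# A real system would load vectors trained on a large corpus.
VECTORS = {
    "election": [0.9, 0.1, 0.0],
    "parliament": [0.8, 0.2, 0.1],
    "vote": [0.85, 0.15, 0.05],
    "goal": [0.1, 0.9, 0.0],
    "match": [0.05, 0.85, 0.1],
    "politics": [0.9, 0.1, 0.05],
    "sports": [0.05, 0.95, 0.05],
}


def mean_vector(tokens):
    """Average the vectors of all tokens that have an embedding."""
    known = [VECTORS[t] for t in tokens if t in VECTORS]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_keywords(doc_tokens, taxonomy):
    """Rank taxonomy keywords by similarity to the document vector."""
    doc_vec = mean_vector(doc_tokens)
    if doc_vec is None:
        return []
    scored = [(kw, cosine(doc_vec, VECTORS[kw])) for kw in taxonomy if kw in VECTORS]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_keywords(["election", "parliament", "vote"], ["politics", "sports"])
```

No labeled training documents are involved: the ranking depends only on the embeddings and the taxonomy, which is what makes an approach of this kind unsupervised.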
Designing Radio in a Personalized World
Radio broadcasting is currently undergoing major changes. Radio broadcasting agencies are experiencing a shift from the classic linear stream to non-linear, personalized playouts on smart devices. Many broadcasting agencies are experimenting with ways to innovate their offerings, but designing personalized radio is not straightforward. We contribute a design science artefact that considers requirements from different stakeholders (listeners, broadcasting agencies and the public) and present design requirements and design principles with a corresponding architecture for personalized radio.
Improving Recall and Precision in Unsupervised Multi-Label Document Classification Tasks by Combining Word Embeddings with TF-IDF
Multi-label document classification is a common task and has become increasingly important for current business needs. However, generating keywords is not easy because, in addition to methodological challenges, labeled training data for supervised classification is not always available in the desired amount or quality. Therefore, methods that do not require labeled training data (e.g., unsupervised learning or statistical approaches) are valuable in practice. As none of these approaches alone provides optimal results in terms of recall and precision, we show that it is worth examining existing approaches for complementary strengths in order to combine them. We found such complementary strengths in an unsupervised word embedding method and the term frequency–inverse document frequency (TF-IDF) method and propose a combined approach. For evaluation, we test the combined approach on a data set from a public broadcaster in Germany and show that recall and precision can be significantly improved.
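A minimal sketch of how TF-IDF scores might be combined with embedding-based scores follows. The abstract does not state the combination rule, so the union of each method's top-ranked taxonomy keywords used here is purely an assumption, as are the smoothed IDF variant, the function names `tfidf_scores` and `combine`, the toy corpus, and the hard-coded embedding scores.

```python
import math
from collections import Counter


def tfidf_scores(doc_tokens, corpus):
    """TF-IDF of each term in doc_tokens against a corpus of token lists."""
    counts = Counter(doc_tokens)
    n_docs = len(corpus)
    scores = {}
    for term, count in counts.items():
        df = sum(1 for doc in corpus if term in doc)
        # Smoothed IDF (an assumed variant; other definitions exist).
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        scores[term] = (count / len(doc_tokens)) * idf
    return scores


def combine(tfidf, embedding, taxonomy, top_k=1):
    """Union of the top_k taxonomy keywords from each method (assumed rule)."""
    def top(scores):
        candidates = [kw for kw in taxonomy if kw in scores]
        return sorted(candidates, key=lambda kw: scores[kw], reverse=True)[:top_k]

    return sorted(set(top(tfidf)) | set(top(embedding)))


# Toy corpus of tokenized documents, invented for illustration.
corpus = [
    ["budget", "vote", "parliament"],
    ["match", "goal", "football"],
    ["budget", "football", "stadium"],
]
taxonomy = ["budget", "football", "stadium", "sports", "politics"]

tfidf = tfidf_scores(corpus[2], corpus)
# Hypothetical similarity scores from an embedding-based method.
embedding = {"sports": 0.8, "politics": 0.4}

keywords = combine(tfidf, embedding, taxonomy)
```

In this toy example TF-IDF contributes a keyword that occurs literally in the document ("stadium"), while the embedding scores contribute a related keyword that does not ("sports"). This is one plausible reading of the complementary strengths the abstract mentions.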
Profiling Cryptocurrency Pump and Dump Schemes in DeFi: A Chain-Level Analysis of Coins and Participants
Pump and dump (P&D) schemes are a phenomenon known from stock trading that quickly expanded into cryptocurrency trading. Several studies have already dealt with analysis and prediction models for such schemes in cryptocurrencies, but only a few researchers have used data on the underlying transactions available on decentralized exchanges (DEXs). In line with current calls to action, we apply both value and transaction analytics to profile P&D schemes and derive insights into coins as well as participant behavior during such activities. Based on an analysis of over 1.3M transactions related to 288 cryptocurrencies traded on DEXs and 314 P&D attempts, we find that particular cryptocurrencies are targeted by P&D activities more frequently than others, and some seem specifically designed for that purpose. Our work contributes to the growing research on fraud in financial markets and characterizes actors and their behavior at the blockchain level.