25 research outputs found

Guess who? Multilingual approach for the automated generation of author-stylized poetry

Author: Tikhonov Alexey
Yamshchikov Ivan P.
Publication date: 17/09/2018

This paper addresses the problem of stylized text generation in a multilingual setup. A version of a language model based on a long short-term memory (LSTM) artificial neural network with extended phonetic and semantic embeddings is used for stylized poetry generation. The quality of the resulting poems generated by the network is estimated through bilingual evaluation understudy (BLEU), a survey, and a new cross-entropy-based metric suggested for problems of this type. The experiments show that the proposed model consistently outperforms random-sample and vanilla-LSTM baselines; humans also tend to associate machine-generated texts with the target author.

arXiv.org e-Print Archive

Portfolio optimization in the case of an asset with a given liquidation time distribution

Author: Bordag Ljudmila A.
Yamshchikov Ivan P.
Zhelezov Dmitry
Publication date: 11/07/2014

Management of portfolios containing low-liquidity assets is a tedious problem. The buyer proposes a price that can differ greatly from the paper value estimated by the seller; the seller, on the other hand, cannot liquidate his portfolio instantly and waits for a more favorable offer. To minimize losses in this case we need to develop new methods. One of the steps moving the theory towards practical needs is to take into account the time lag of the liquidation of an illiquid asset. This task became especially significant for practitioners in the time of the global financial crises. Working in Merton's optimal consumption framework with continuous time, we consider an optimization problem for a portfolio with an illiquid, a risky and a risk-free asset. While a standard Black-Scholes market describes the liquid part of the investment, the illiquid asset is sold at a random moment with a prescribed liquidation-time distribution. At the moment of liquidation it generates additional liquid wealth dependent on the illiquid asset's paper value. The investor has a logarithmic utility function as a limit case of a HARA-type utility. Different distributions of the liquidation time of the illiquid asset are under consideration: a classical exponential distribution and a Weibull distribution, which is more practically relevant. Under certain conditions we show the existence of the viscosity solution in both cases. Applying numerical methods, we compare classical Merton's strategies and the optimal consumption-allocation strategies for portfolios with different liquidation-time distributions of an illiquid asset. Comment: 30 pages, 1 figur
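
For orientation, the classical Merton benchmark mentioned in the abstract has a well-known closed form under logarithmic utility. The sketch below assumes an infinite horizon and a constant discount rate. It shows only that textbook benchmark, not the paper's solution for the illiquid asset. The function name and all parameter values are hypothetical.

```python
def merton_log_utility(mu, r, sigma, rho, wealth):
    """Classical Merton rule for log utility (textbook benchmark only).

    mu     : drift of the risky asset (assumed constant)
    r      : risk-free rate
    sigma  : volatility of the risky asset
    rho    : subjective discount rate of the investor
    wealth : current liquid wealth
    """
    # Fraction of wealth held in the risky asset: (mu - r) / sigma^2
    risky_fraction = (mu - r) / sigma ** 2
    # Optimal consumption rate for log utility on an infinite horizon: rho * wealth
    consumption_rate = rho * wealth
    return risky_fraction, consumption_rate


# Hypothetical parameter values, chosen only for illustration
fraction, consumption = merton_log_utility(
    mu=0.08, r=0.02, sigma=0.2, rho=0.05, wealth=100.0
)
```

With these illustrative inputs the risky fraction exceeds one, which corresponds to a leveraged position in the risky asset.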

arXiv.org e-Print Archive

ORB Academic Journals

What is Wrong with Language Models that Can Not Tell a Story?

Author: Tikhonov Alexey
Yamshchikov Ivan P.
Publication date: 10/11/2022

This paper argues that a deeper understanding of narrative and the successful generation of longer, subjectively interesting texts is a vital bottleneck that hinders progress in modern Natural Language Processing (NLP) and perhaps even in the whole field of Artificial Intelligence. We demonstrate that there are no adequate datasets, evaluation methods, or even operational concepts that could be used to start working on narrative processing.

arXiv.org e-Print Archive

Vocabulary Transfer for Medical Texts

Author: Mosin Vladislav D.
Yamshchikov Ivan P.
Publication date: 04/08/2022

Vocabulary transfer is a transfer learning subtask in which language models are fine-tuned with a corpus-specific tokenization instead of the default one used during pretraining. This usually improves the resulting performance of the model, and in this paper we demonstrate that vocabulary transfer is especially beneficial for medical text processing. Using three different medical natural language processing datasets, we show that vocabulary transfer provides up to ten extra percentage points of downstream classifier accuracy.

arXiv.org e-Print Archive

Pragmatic Constraint on Distributional Semantics

Author: Filippov Nikolai
Yamshchikov Ivan P.
Zhemchuzhina Elizaveta
Publication date: 20/11/2022

This paper studies the limits of language models' statistical learning in the context of Zipf's law. First, we demonstrate that a Zipf-law token distribution emerges irrespective of the chosen tokenization. Second, we show that the Zipf distribution is characterized by two distinct groups of tokens that differ both in terms of their frequency and their semantics. Namely, the tokens that have a one-to-one correspondence with one semantic concept have different statistical properties than those with semantic ambiguity. Finally, we demonstrate how these properties interfere with statistical learning procedures motivated by distributional semantics.
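
As a rough illustration of the Zipf's-law claim, one simple check is to fit a line to log frequency against log rank; a slope near -1 is the usual signature. The sketch below is a generic check and not the paper's procedure. The function name and the toy corpus are hypothetical.

```python
from collections import Counter
import math


def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank).

    Needs at least two distinct tokens, otherwise the slope is undefined.
    """
    # Token frequencies sorted from most to least frequent
    counts = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Hypothetical toy corpus whose frequencies are roughly proportional to 1/rank
toy_tokens = []
for rank in range(1, 51):
    toy_tokens.extend([f"tok{rank}"] * (600 // rank))

slope = zipf_slope(toy_tokens)
```

On real text the same function could be applied to the output of any tokenizer, which is one way to compare tokenizations in the spirit of the abstract's first claim.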

arXiv.org e-Print Archive