Handwritten text generation and strikethrough characters augmentation
We introduce two data augmentation techniques which, used with a ResNet-BiLSTM-CTC network, significantly reduce Word Error Rate and Character Error Rate beyond the best reported results on handwritten text recognition tasks. We apply a novel augmentation that simulates struck-through text (HandWritten Blots) and a handwritten text generation method based on printed text (StackMix), both of which proved to be very effective for handwritten text recognition. StackMix uses a weakly supervised framework to obtain character boundaries. Because these data augmentation techniques are independent of the network used, they could also be applied to enhance the performance of other networks and approaches to handwritten text recognition. Extensive experiments on ten handwritten text datasets show that the HandWritten Blots augmentation and StackMix significantly improve the quality of handwritten text recognition models.
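The strikethrough idea described above can be illustrated with a minimal sketch: draw a random wavy dark stroke across a greyscale text-line image. This is only an assumption-laden toy version of the general idea, not the paper's actual HandWritten Blots implementation (function name and parameters are invented for illustration):

```python
import numpy as np

def blot_augment(img, rng=None):
    """Toy strikethrough augmentation for a 2-D uint8 greyscale
    text-line image (0 = ink, 255 = paper). Paints a wavy dark
    stroke near the text midline, mimicking struck-through words.
    Illustrative sketch only, not the paper's HandWritten Blots."""
    rng = np.random.default_rng() if rng is None else rng
    out = img.copy()
    h, w = out.shape
    thickness = max(1, h // 10)                    # stroke width scales with line height
    amp = h / 12                                   # waviness amplitude
    freq = rng.uniform(1.0, 3.0)                   # oscillations across the line
    phase = rng.uniform(0.0, 2 * np.pi)
    y0 = h // 2 + rng.integers(-h // 6, h // 6 + 1)  # random vertical offset
    xs = np.arange(w)
    ys = (y0 + amp * np.sin(2 * np.pi * freq * xs / w + phase)).astype(int)
    for x, y in zip(xs, ys):                       # paint a short vertical bar per column
        lo, hi = max(0, y - thickness // 2), min(h, y + thickness // 2 + 1)
        out[lo:hi, x] = 0
    return out
```

In practice such an augmentation would be applied on the fly during training, so the recognizer sees a different random stroke on each epoch.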
Many heads but one brain: FusionBrain — a single multimodal multitask architecture and a competition
Supporting the current trend in the AI community, we present the AI Journey 2021 Challenge called FusionBrain, the first competition targeted at building a universal architecture that can process different modalities (in this case, images, text, and code) and solve multiple vision-and-language tasks. The FusionBrain Challenge combines the following tasks: Code2code Translation, Handwritten Text Recognition, Zero-shot Object Detection, and Visual Question Answering. We have created datasets for each task to test the participants' submissions on them. Moreover, we have collected and made publicly available a new handwritten dataset in both English and Russian, which consists of 94,128 pairs of images and texts. We also propose a multimodal and multitask architecture as a baseline solution, at the centre of which is a frozen foundation model, trained in Fusion mode alongside Single-task mode. The proposed Fusion approach proves to be competitive with, and more energy-efficient than, the task-specific one. We would like to thank Sber and SberCloud for granting us GPU resources to experiment with different architectures, and the participants for training their models, and for supporting the FusionBrain Challenge in general.
Multiwavelength monitoring and reverberation mapping of a changing look event in the Seyfert galaxy NGC 3516
We present the results of photometric and spectroscopic monitoring campaigns of the changing-look AGN NGC 3516 carried out in 2018 to 2020, covering the wavelength range from the X-ray to the optical. The facilities included the telescopes of the CMO SAI MSU, the 2.3-m WIRO telescope, and the XRT and UVOT of Swift. We found that NGC 3516 brightened to a high state and could be classified as Sy1.5 during the late spring of 2020. We have measured time delays in the responses of the Balmer and He ii λ4686 lines to continuum variations. In the case of the best-characterized broad Hβ line, the delay to continuum variability is about 17 d in the blue wing and is clearly shorter, 9 d, in the red, which is suggestive of inflow. As the broad lines strengthened, the blue side came to dominate the Balmer lines, resulting in very asymmetric profiles with blueshifted peaks during this high state. During the outburst the X-ray flux reached its maximum on 2020 April 1, and it was the highest value ever observed for NGC 3516 by the Swift observatory. The X-ray hard photon index became softer, ∼1.8 at the maximum on 2020 April 21 compared to the mean ∼0.7 during earlier epochs before 2020. We have found that the UV and optical variations correlated well (with a small time delay of 1–2 d) with the X-ray until the beginning of 2020 April, but later, until the end of 2020 June, these variations were not correlated. We suggest that this fact may be a consequence of partial obscuration by Compton-thick clouds crossing the line of sight.
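The time delays quoted above are the essence of reverberation mapping: the emission-line light curve echoes the continuum with a lag set by light travel time across the broad-line region. A minimal, assumption-laden sketch of the standard cross-correlation approach (function name and simplified logic are illustrative; real ICCF analyses add error weighting, flux randomisation, and subset sampling for uncertainties):

```python
import numpy as np

def lag_by_crosscorr(t_cont, f_cont, t_line, f_line, lags):
    """Estimate the delay of a line light curve behind the continuum:
    for each trial lag, linearly interpolate the continuum onto the
    shifted line epochs and compute the Pearson correlation; return
    the lag that maximises it. A simplified ICCF-style sketch only."""
    best_lag, best_r = lags[0], -np.inf
    for lag in lags:
        # continuum flux at (line epoch - lag); np.interp clamps at the edges
        cont_shifted = np.interp(t_line - lag, t_cont, f_cont)
        r = np.corrcoef(cont_shifted, f_line)[0, 1]
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r
```

Applied to the blue and red wings of Hβ separately, this kind of analysis yields the wing-dependent lags (about 17 d vs 9 d) reported in the abstract.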