Holographic dual of the Standard Model on the throat
We apply recent techniques to construct geometries, based on local Calabi-Yau
manifolds, leading to warped throats with 3-form fluxes in string theory, with
interesting structure at their bottom. We provide their holographic dual
description in terms of RG flows for gauge theories with almost conformal
duality cascades and infrared confinement. We describe a model of a throat with
D-branes at its bottom, realizing a 3-family Standard Model like chiral sector.
We provide the explicit holographic dual gauge theory RG flow, and describe the
appearance of the SM degrees of freedom after confinement. As a second
application, we describe throats within throats, namely warped throats with
discontinuous warp factor in different regions of the radial coordinate, and
discuss possible model building applications.
Comment: 46 pages, 21 figures, reference added
Pseudolocalization Realization for Skype Software
The aim of this bachelor's thesis is to create a pseudolocalization application for Skype. Localization is the adaptation of software to a culture; it covers translation as well as the adaptation of number formats, images, and so on to the local culture. The main localization problems are text expansion in translations (especially when the source language is English), hard-coded text that therefore remains untranslated, text assembled from several separate strings, incorrect character encoding, differing number formats, and unsuitable images. One way to prevent the problems that arise during localization is pseudolocalization.
Pseudolocalization imitates the localization process before the text is actually sent for translation. This means modifying the text fields of a text file in a suitable way and then viewing the file in the user interface. Different pseudolocalization methods help prevent problems related to hard-coded untranslated text, text expansion, character encoding, and text assembled from several separate strings.
The existing pseudolocalization applications, which are also evaluated in this work, are not suitable for Skype. A standalone pseudolocalization application was therefore developed. It implements the bracket method (the string is replaced with "X string X"), the key method (the string is replaced with its language-file identifier), and the machine translation method using Microsoft Translator (the string is replaced with its machine translation). The bracket method helps find concatenated strings and hard-coded text; the machine translation method helps find potential problems with text expansion and character encoding. The key method helps quickly locate, in the language file, the string that causes a problem in the user interface.
The application is already in use by the Skype localization team, and by preventing problems it helps save the resources that would otherwise be spent on fixing them later.
In addition, a connection to Microsoft Translator was programmed in Java during this work; no such connection existed before, and it can also be used by others.
The aim of the current work is to develop a pseudolocalization application for solving possible issues in user interfaces for Skype clients.
Localization is the process of adapting a product to a target market. Localization involves translation as well as customization related to different number formats, graphical imagery, etc.
Localization is carried out at the end of the development process in order to avoid the need to re-localize due to changes in the design or text. Therefore all mistakes found during the localization process come with a high price. Common problems are text expansion in translated texts compared to the original (especially if the source text is in English), hard-coded text, character encoding problems, format issues, unsuitable imagery etc.
One of the methods to foresee localization problems is pseudolocalization. Pseudolocalization is the practice of simulating the localization process before the actual translation work by replacing the translatable text with modified text [6]. Different pseudolocalization methods address various localization problems, including text expansion, hard-coded strings, concatenated strings, and encoding problems.
This work gives a short overview of common localization problems as well as a number of well-known pseudolocalization methods that can be used to prevent those problems. The existing solutions for pseudolocalization were analysed as well. This information can be useful when choosing which solution to use for pseudolocalization or deciding how pseudolocalization can be of help in the development process.
The application developed in the practical part of this thesis implements three pseudolocalization methods (machine translation, prefix/suffix, and string identifier) and works with the XML file types used internally at Skype. These methods help prevent localization issues related to text expansion, encoding, hard-coded text, and string concatenation; the string identifier method also helps locate problematic strings in the language file more quickly. The application is already in use by the Skype localization team.
During the development of the pseudolocalization application for Skype, a Java connection to Microsoft Translator was developed, which did not exist before. Others can also use it to connect to Microsoft Translator from Java.
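As a rough illustration of the prefix/suffix and string identifier methods named in this abstract, the following minimal Python sketch applies both to an in-memory table of strings. The function names, the "X" marker, and the sample identifiers are assumptions made for the example; the thesis application itself works on Skype's internal XML files and is not reproduced here.

```python
def prefix_suffix_method(strings, marker="X"):
    """Wrap every translatable string in a visible marker.

    A string shown in the UI without the marker is likely hard-coded;
    a marker appearing mid-sentence hints at string concatenation.
    """
    return {key: f"{marker} {text} {marker}" for key, text in strings.items()}


def string_identifier_method(strings):
    """Replace every string with its own identifier, so a problematic
    UI element can be traced back to its entry in the language file."""
    return {key: key for key in strings}


# Hypothetical language-file content (identifier -> source text).
strings = {
    "login.title": "Sign in",
    "login.button": "Continue",
}

wrapped = prefix_suffix_method(strings)
keyed = string_identifier_method(strings)
print(wrapped["login.title"])   # X Sign in X
print(keyed["login.button"])    # login.button
```

Loading the pseudolocalized table into the user interface instead of the real language file is what makes the issues visible before any translation work is ordered.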
Universal Compressed Text Indexing
The rise of repetitive datasets has lately generated a lot of interest in
compressed self-indexes based on dictionary compression, a rich and
heterogeneous family that exploits text repetitions in different ways. For each
such compression scheme, several different indexing solutions have been
proposed in the last two decades. To date, the fastest indexes for repetitive
texts are based on the run-length compressed Burrows-Wheeler transform and on
the Compact Directed Acyclic Word Graph. The most space-efficient indexes, on
the other hand, are based on the Lempel-Ziv parsing and on grammar compression.
Indexes for more universal schemes such as collage systems and macro schemes
have not yet been proposed. Very recently, Kempa and Prezza [STOC 2018] showed
that all dictionary compressors can be interpreted as approximation algorithms
for the smallest string attractor, that is, a set of text positions capturing
all distinct substrings. Starting from this observation, in this paper we
develop the first universal compressed self-index, that is, the first indexing
data structure based on string attractors, which can therefore be built on top
of any dictionary-compressed text representation. Let $\gamma$ be the size of a
string attractor for a text of length $n$. Our index takes
$O(\gamma\log(n/\gamma))$ words of space and supports locating the $occ$
occurrences of any pattern of length $m$ in $O(m\log n + occ\log^{\epsilon} n)$
time, for any constant $\epsilon>0$. This is, in particular, the first index
for general macro schemes and collage systems. Our result shows that the
relation between indexing and compression is much deeper than what was
previously thought: the simple property standing at the core of all dictionary
compressors is sufficient to support fast indexed queries.
Comment: Fixed with reviewer's comment
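To make the notion of a string attractor concrete, here is a brute-force Python check of the property the abstract describes: every distinct substring must have at least one occurrence that covers a position in the set. This is only a sketch for small inputs, using 0-based positions and a hypothetical function name; it is not the indexing structure of the paper.

```python
def is_string_attractor(text, positions):
    """Return True if every distinct substring of `text` has at least one
    occurrence that spans a position in `positions` (0-based indices)."""
    n = len(text)
    positions = set(positions)
    for length in range(1, n + 1):
        seen = set()      # all distinct substrings of this length
        covered = set()   # those with an occurrence crossing an attractor position
        for start in range(n - length + 1):
            sub = text[start:start + length]
            seen.add(sub)
            if any(start <= p < start + length for p in positions):
                covered.add(sub)
        if seen != covered:
            return False
    return True


print(is_string_attractor("aaaa", {0}))     # True
print(is_string_attractor("abab", {0}))     # False: "b" never covers position 0
print(is_string_attractor("abab", {0, 1}))  # True
```

The highly repetitive text "aaaa" needs only one position, which matches the intuition that repetitive texts admit small attractors.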
The study of probability model for compound similarity searching
The main task of an Information Retrieval (IR) system is to retrieve relevant documents according to the user's query. One of the most popular IR retrieval models is the Vector Space Model. This model assumes relevance based on similarity, which is defined as the distance between query and document in the concept space. All currently existing chemical compound database systems have adapted the vector space model to calculate the similarity of a database entry to a query compound. However, the model assumes that the fragments represented by the bits are independent of one another, which is not necessarily true. Hence, this work explores the possibility of applying another IR model, the Probabilistic Model (PM), to chemical compound searching. This model estimates the probability that a chemical structure has the same bioactivity as a target compound. It is envisioned that ranking chemical structures in decreasing order of their probability of relevance to the query structure can increase the effectiveness of a molecular similarity searching system. Both fragment dependence and independence assumptions are taken into consideration in improving compound similarity searching. After a series of simulated similarity searches, it is concluded that the PM approaches performed better than existing similarity searching, giving better results on all evaluation criteria. Of the probability models, the BD model showed an improvement over the BIR model.
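The abstract gives no formulas, so the following Python sketch only illustrates the general idea of probabilistic ranking under a fragment-independence assumption. It uses the textbook binary independence weighting with +0.5 smoothing on bit-vector fingerprints; the function names and the toy data are assumptions for the example and should not be read as the study's actual BIR or BD models.

```python
import math


def bir_weights(actives, inactives, n_bits):
    """Estimate one log-odds weight per fingerprint bit from training
    fingerprints (lists of 0/1), using the common +0.5 smoothing."""
    weights = []
    for i in range(n_bits):
        r = sum(fp[i] for fp in actives)    # actives with bit i set
        s = sum(fp[i] for fp in inactives)  # inactives with bit i set
        p = (r + 0.5) / (len(actives) + 1.0)
        q = (s + 0.5) / (len(inactives) + 1.0)
        weights.append(math.log(p * (1 - q) / (q * (1 - p))))
    return weights


def bir_score(query, candidate, weights):
    """Sum the weights of the bits set in both query and candidate."""
    return sum(w for qb, cb, w in zip(query, candidate, weights) if qb and cb)


def rank(query, database, weights):
    """Return database indices in decreasing order of score."""
    scores = [bir_score(query, fp, weights) for fp in database]
    return sorted(range(len(database)), key=lambda i: scores[i], reverse=True)


# Toy 3-bit fingerprints (hypothetical data).
actives = [[1, 1, 0], [1, 0, 0]]
inactives = [[0, 1, 1], [0, 0, 1]]
weights = bir_weights(actives, inactives, n_bits=3)

database = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
print(rank([1, 1, 1], database, weights))  # [1, 2, 0]
```

Bits that occur more often in active compounds receive positive weights, so candidates sharing those bits with the query rise to the top of the ranking.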