Limitations of Cross-Lingual Learning from Image Search

Author: Hartmann Mareike
Soegaard Anders
Publication date: 18/09/2017

Cross-lingual representation learning is an important step in making NLP scale to all the world's languages. Recent work on bilingual lexicon induction suggests that it is possible to learn cross-lingual representations of words based on similarities between images associated with these words. However, that work focused on the translation of selected nouns only. In our work, we investigate whether the meaning of other parts of speech, in particular adjectives and verbs, can be learned in the same way. We also experiment with combining the representations learned from visual data with embeddings learned from textual data. Our experiments across five language pairs indicate that previous work does not scale to the problem of learning cross-lingual representations beyond simple nouns.

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System

Neural Machine Translation Inspired Binary Code Similarity Comparison beyond Function Pairs

Author: Li Xiaopeng
Luo Lannan
Young Patrick
Zeng Qiang
Zhang Zhexin
Zuo Fei
Publication venue: 'Internet Society'
Publication date: 16/12/2018

Binary code analysis allows analyzing binary code without having access to the corresponding source code. A binary, after disassembly, is expressed in an assembly language. This inspires us to approach binary analysis by leveraging ideas and techniques from Natural Language Processing (NLP), a rich area focused on processing text of various natural languages. We notice that binary code analysis and NLP share many analogous topics, such as semantics extraction, summarization, and classification. This work uses these ideas to address two important code similarity comparison problems: (I) given a pair of basic blocks for different instruction set architectures (ISAs), determining whether their semantics are similar; and (II) given a piece of code of interest, determining whether it is contained in another piece of assembly code for a different ISA. The solutions to these two problems have many applications, such as cross-architecture vulnerability discovery and code plagiarism detection. We implement a prototype system, INNEREYE, and perform a comprehensive evaluation. A comparison between our approach and existing approaches to Problem I shows that our system outperforms them in terms of accuracy, efficiency and scalability. The case studies using the system demonstrate that our solution to Problem II is effective. Moreover, this research showcases how to apply ideas and techniques from NLP to large-scale binary code analysis.

Comment: Accepted by Network and Distributed Systems Security (NDSS) Symposium 201

arXiv.org e-Print Archive

Crossref

Program Transformations for Asynchronous and Batched Query Submission

Author: Chavan Mahendra
Guravannavar Ravindra
Ramachandra Karthik
Sudarshan S
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/02/2014

The performance of database/Web-service-backed applications can be significantly improved by submitting queries/requests asynchronously, well ahead of the point where the results are needed, so that the results are likely to have been fetched already by the time they are used. However, manually writing applications to exploit asynchronous query submission is tedious and error-prone. In this paper we address the issue of automatically transforming a program written assuming synchronous query submission into one that exploits asynchronous query submission. Our program transformation method is based on data flow analysis and is framed as a set of transformation rules. Our rules can handle query executions within loops, unlike some of the earlier work in this area. We also present a novel approach that, at runtime, can combine multiple asynchronous requests into batches, thereby achieving the benefits of batching in addition to those of asynchronous submission. We have built a tool that implements our transformation techniques on Java programs that use JDBC calls; our tool can be extended to handle Web service calls. We have carried out a detailed experimental study on several real-life applications, which shows the effectiveness of the proposed rewrite techniques, both in terms of their applicability and the performance gains achieved.

Comment: 14 page

arXiv.org e-Print Archive

CiteSeerX

Research Archive of Indian Institute of Technology Hyderabad

Dspace at IIT Bombay

EOSDB: The Database for Nuclear EoS

Author: Ishizuka Chikako
Ohnishi Akira
Suda Takuma
Sumiyoshi Kohsuke
Suzuki Hideyuki
Toki Hiroshi
Publication venue: 'Oxford University Press (OUP)'
Publication date: 29/09/2014

The nuclear equation of state (EoS) plays an important role in understanding the formation of compact objects such as neutron stars and black holes. The true nature of the EoS has been a matter of debate at every density range, not only in nuclear physics but also in astronomy and astrophysics. We have constructed a database of EoSs by compiling data from the literature. Our database contains the basic properties of the nuclear EoS of symmetric nuclear matter and of pure neutron matter. It also includes detailed information about the theoretical models, for example the methods and assumptions adopted in individual models. The novelty of the database is that it considers new experimental probes such as the symmetry energy, its slope relative to the baryon density, and the incompressibility, which enables users to check their model dependences. We demonstrate the performance of the EOSDB by examining the model dependence among different nuclear EoSs. It is revealed that some theoretical EoSs that are commonly used in astrophysics do not satisfactorily agree with the experimental constraints.

Comment: 30 pages, 5 figures, submitted to Publications of the Astronomical Society of Japan (revised

arXiv.org e-Print Archive

Kyoto University Research Information Repository

Neural Architecture for Question Answering Using a Knowledge Graph and Web Corpus

Author: Sawant Uma
Garg Saurabh
Chakrabarti Soumen
Ramakrishnan Ganesh
Publication date: 06/12/2018

In Web search, entity-seeking queries often trigger a special Question Answering (QA) system. It may use a parser to translate the question into a structured query, execute that on a knowledge graph (KG), and return direct entity responses. QA systems based on precise parsing tend to be brittle: minor syntax variations may dramatically change the response. Moreover, KG coverage is patchy. At the other extreme, a large corpus may provide broader coverage, but in an unstructured, unreliable form. We present AQQUCN, a QA system that gracefully combines KG and corpus evidence. AQQUCN accepts a broad spectrum of query syntax, from well-formed questions to short 'telegraphic' keyword sequences. In the face of inherent query ambiguities, AQQUCN aggregates signals from KGs and large corpora to directly rank KG entities, rather than committing to one semantic interpretation of the query. AQQUCN models the ideal interpretation as an unobservable or latent variable. Interpretations and candidate entity responses are scored as pairs, by combining signals from multiple convolutional networks that operate collectively on the query, KG and corpus. On four public query workloads, amounting to over 8,000 queries with diverse query syntax, we see a 5-16% absolute improvement in mean average precision (MAP), compared to the entity ranking performance of recent systems. Our system is also competitive at entity set retrieval, almost doubling F1 scores for challenging short queries.

Comment: Accepted to Information Retrieval Journa

arXiv.org e-Print Archive

Biblioteca Digital de la Comunidad de Madrid