Pointer Disambiguation via Strict Inequalities
The design and implementation of static analyses that disambiguate pointers has been a focus of research since the early days of compiler construction. One of the challenges that arise in this context is the analysis of languages that support pointer arithmetic, such as C, C++ and assembly dialects. This paper contributes to solving this challenge. We start from an obvious, yet unexplored, observation: if a pointer is strictly less than another, they cannot alias. Motivated by this remark, we use the abstract interpretation framework to build strict less-than relations between pointers. To this end, we construct a program representation that bestows the Static Single Information (SSI) property onto our dataflow analysis. SSI gives us an efficient sparse algorithm, which, once seen as a form of abstract interpretation, is correct by construction. We have implemented our static analysis in LLVM. It runs in time linear in the number of program variables, and, depending on the benchmark, it can be as much as six times more precise than the pointer disambiguation techniques already in place in that compiler.
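The central observation, that a pointer proven strictly less than another cannot alias it, can be illustrated with a minimal sketch (all names are illustrative; it assumes symbolic offset ranges supplied by a prior range analysis, not the paper's actual LLVM implementation):

```python
# Toy model of the "strict less-than implies no alias" observation.
# Each pointer is abstracted as a symbolic range (lo, hi) of byte
# offsets, as a prior range analysis might provide.

def may_alias(p_range, q_range):
    """Return False only when one pointer is provably strictly below the other."""
    p_lo, p_hi = p_range
    q_lo, q_hi = q_range
    if p_hi < q_lo:   # every address of p is strictly less than every address of q
        return False
    if q_hi < p_lo:   # symmetric case: q strictly below p
        return False
    return True       # ranges may overlap: conservatively report a possible alias

# p spans offsets [0, 3] and q spans [8, 15]: p < q strictly, so no alias.
print(may_alias((0, 3), (8, 15)))   # False
print(may_alias((0, 9), (8, 15)))   # True
```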
Latent Relational Metric Learning via Memory-based Attention for Collaborative Ranking
This paper proposes a new neural architecture for collaborative ranking with implicit feedback. Our model, LRML (\textit{Latent Relational Metric Learning}), is a novel metric learning approach for recommendation. More specifically, instead of simple push-pull mechanisms between user and item pairs, we propose to learn latent relations that describe each user-item interaction. This helps to alleviate the potential geometric inflexibility of existing metric learning approaches. This enables not only better performance but also a greater extent of modeling capability, allowing our model to scale to a larger number of interactions. In order to do so, we employ an augmented memory module and learn to attend over these memory blocks to construct latent relations. The memory-based attention module is controlled by the user-item interaction, making the learned relation vector specific to each user-item pair. Hence, this can be interpreted as learning an exclusive and optimal relational translation for each user-item interaction. The proposed architecture demonstrates state-of-the-art performance across multiple recommendation benchmarks. LRML outperforms other metric learning models in terms of Hits@10 and nDCG@10 on large datasets such as Netflix and MovieLens20M. Moreover, qualitative studies also demonstrate evidence that our proposed model is able to infer and encode explicit sentiment, temporal and attribute information despite being only trained on implicit feedback. As such, this ascertains the ability of LRML to uncover hidden relational structure within implicit datasets.
Comment: WWW 201
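The attention-over-memory mechanism described above can be roughly sketched as follows (a toy NumPy sketch with invented dimensions and parameter names; the actual LRML architecture, loss, and training procedure are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 8, 4                       # embedding size, number of memory slots (toy values)

# Hypothetical parameters; in the real model these are learned.
memory = rng.normal(size=(m, d))  # shared memory blocks
key = rng.normal(size=(d, m))     # projects the user-item interaction to slot scores

def relation_vector(u, v):
    """Attend over memory blocks, keyed by the user-item interaction."""
    joint = u * v                 # element-wise user-item interaction
    att = np.exp(joint @ key)
    att = att / att.sum()         # softmax over memory slots
    return att @ memory           # latent relation r, specific to this pair

def score(u, v):
    """Translation-style metric score: smaller ||u + r - v|| means a better match."""
    r = relation_vector(u, v)
    return -np.linalg.norm(u + r - v)

u, v = rng.normal(size=d), rng.normal(size=d)
print(score(u, v))
```

Because the attention weights depend on the interaction `u * v`, every user-item pair receives its own relation vector, which is the "exclusive relational translation" the abstract refers to.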
POS Tagging Using Relaxation Labelling
Relaxation labelling is an optimization technique used in many fields to solve constraint satisfaction problems. The algorithm finds a combination of values for a set of variables that satisfies, to the maximum possible degree, a set of given constraints. This paper describes some experiments in applying it to POS tagging, and the results obtained. It also ponders the possibility of applying it to word sense disambiguation.
Comment: compressed & uuencoded postscript file. Paper length: 39 pages
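A minimal sketch of relaxation labelling, assuming the classic support-based reinforcement update (the toy variables, labels, and weights below are illustrative, not the paper's POS-tagging constraint model):

```python
# Minimal relaxation labelling sketch: each variable holds a probability
# distribution over labels; support[i][a][j][b] >= 0 rewards variable i
# taking label a when variable j takes label b (a toy stand-in for
# weighted constraints).

def relax(p, support, iters=50):
    n, L = len(p), len(p[0])
    for _ in range(iters):
        new_p = []
        for i in range(n):
            # Support each label of i receives from the current labelling.
            s = [sum(support[i][a][j][b] * p[j][b]
                     for j in range(n) if j != i
                     for b in range(L))
                 for a in range(L)]
            # Classic update rule: reinforce labels in proportion to support.
            w = [p[i][a] * (1 + s[a]) for a in range(L)]
            z = sum(w)
            new_p.append([x / z for x in w])
        p = new_p
    return p

# Labels (0 for var 0, 1 for var 1) support each other with weight 2;
# the competing (1, 0) pairing, with weight 1, loses as iteration converges.
support = [
    [[[0, 0], [0, 2]],   # i=0, a=0: supported by (j=1, b=1) with weight 2
     [[0, 0], [1, 0]]],  # i=0, a=1: supported by (j=1, b=0) with weight 1
    [[[0, 1], [0, 0]],   # i=1, a=0: supported by (j=0, b=1) with weight 1
     [[2, 0], [0, 0]]],  # i=1, a=1: supported by (j=0, b=0) with weight 2
]
p = relax([[0.5, 0.5], [0.5, 0.5]], support)
print([[round(x, 3) for x in row] for row in p])
```

Starting from uniform distributions, the mutually supporting stronger constraint wins and the distributions converge to near-certain labels, which is the "maximum possible degree" of constraint satisfaction the abstract describes.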
A Case Study in Modular Programming: Using AspectJ and OCaml in an Undergraduate Compiler Project
We report our experience in using two different languages to build the same software project. Specifically, we have converted an entire undergraduate compiler course from using AspectJ, an aspect-oriented language, to using OCaml, a functional language. The course has evolved over a period of eight years with, on average, 60 students completing it every year. In this article, we analyze our usage of the two programming languages and we compare and contrast the two software projects on a number of parameters, including how they enable students to write and test individual compiler phases in a modular way.
An Algorithm to Determine Peer-Reviewers
The peer-review process is the most widely accepted certification mechanism for officially accepting the written results of researchers within the scientific community. An essential component of peer review is the identification of competent referees to review a submitted manuscript. This article presents an algorithm to automatically determine the most appropriate reviewers for a manuscript by way of a co-authorship network data structure and a relative-rank particle-swarm algorithm. This approach is novel in that it is not limited to a pre-selected set of referees, is computationally efficient, requires no human intervention, and, in some instances, can automatically identify conflict-of-interest situations. A useful application of this algorithm would be in open commentary peer-review systems because it provides a weighting for each referee with respect to their expertise in the domain of a manuscript. The algorithm is validated using referee bid data from the 2005 Joint Conference on Digital Libraries.
Comment: Rodriguez, M.A., Bollen, J., "An Algorithm to Determine Peer-Reviewers", Conference on Information and Knowledge Management, in press, ACM, LA-UR-06-2261, October 2008; ISBN:978-1-59593-991-
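The idea of ranking referees by proximity in a co-authorship network can be loosely sketched with random-walk energy spreading, a simplification of the relative-rank particle-swarm method described above (the graph, seeds, and parameters below are invented for illustration):

```python
import random
random.seed(0)

# Toy co-authorship network: author -> co-authors.
graph = {
    "a": ["b", "c"], "b": ["a", "c", "d"],
    "c": ["a", "b"], "d": ["b", "e"], "e": ["d"],
}

def rank_referees(seeds, steps=4, particles=200, decay=0.85):
    """Spread particles from the manuscript's seed authors; energy deposited
    along decaying random walks approximates topical closeness."""
    energy = {v: 0.0 for v in graph}
    for _ in range(particles):
        node, e = random.choice(seeds), 1.0
        for _ in range(steps):
            node = random.choice(graph[node])
            e *= decay
            energy[node] += e
    return sorted(energy, key=energy.get, reverse=True)

# Authors close to the seed accumulate more energy and rank higher.
print(rank_referees(["a"]))
```

Note how conflict-of-interest detection falls out naturally in such a scheme: direct co-authors of the seed accumulate the most energy and can be flagged rather than invited.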
Embedding Web-based Statistical Translation Models in Cross-Language Information Retrieval
Although more and more language pairs are covered by machine translation
services, there are still many pairs that lack translation resources.
Cross-language information retrieval (CLIR) is an application which needs
translation functionality of a relatively low level of sophistication since
current models for information retrieval (IR) are still based on a bag-of-words representation. The Web provides a vast resource for the automatic construction
of parallel corpora which can be used to train statistical translation models
automatically. The resulting translation models can be embedded in several ways
in a retrieval model. In this paper, we will investigate the problem of
automatically mining parallel texts from the Web and different ways of
integrating the translation models within the retrieval process. Our
experiments on standard test collections for CLIR show that the Web-based
translation models can surpass commercial MT systems in CLIR tasks. These
results open the perspective of constructing a fully automatic query
translation device for CLIR at a very low cost.
Comment: 37 pages
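One simple way to embed a statistical translation model in a bag-of-words retrieval pipeline can be sketched as follows (the probability table and cutoff are invented toy values; the paper evaluates several more sophisticated integration methods):

```python
# Structural query translation sketch for CLIR: a translation table
# p(target_term | source_term), as might be trained from Web-mined
# parallel text, expands each query term into weighted translations
# that a bag-of-words retrieval model can consume directly.

trans = {  # toy translation probabilities (illustrative numbers)
    "maison": {"house": 0.7, "home": 0.3},
    "blanche": {"white": 0.9, "blank": 0.1},
}

def translate_query(query, cutoff=0.2):
    """Return {target_term: weight}, keeping translations above `cutoff`."""
    weights = {}
    for term in query:
        for t, p in trans.get(term, {}).items():
            if p >= cutoff:
                weights[t] = weights.get(t, 0.0) + p
    return weights

print(translate_query(["maison", "blanche"]))
# {'house': 0.7, 'home': 0.3, 'white': 0.9}
```

The translation probabilities become term weights in the target-language query, so no translation decision has to be made before retrieval, which is what keeps the required level of translation sophistication low.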
Modelling, Detection And Exploitation Of Lexical Functions For Analysis.
Lexical functions (LF) model relations between terms in the lexicon. These relations can be knowledge about the world (Napoleon was an emperor) or knowledge about the language (‘destiny’ is a synonym of ‘fate’).
Adaptable Pulse Compression in φ-OTDR With Direct Digital Synthesis of Probe Waveforms and Rigorously Defined Nonlinear Chirping
Recent research in Phase-Sensitive Optical Time Domain Reflectometry (φ-OTDR) has focused, among other directions, on performing spatially resolved measurements with various methods including the use of frequency modulated probes. However, conventional schemes either rely on phase-coded sequences, involve inflexible generation of the probe frequency modulation or mostly employ simple linear frequency modulated (LFM) pulses which suffer from elevated sidelobes introducing degradation in range resolution. In this contribution, we propose and experimentally demonstrate a novel φ-OTDR scheme which employs a readily adaptable Direct Digital Synthesis (DDS) of pulses with custom frequency modulation formats and demonstrate advanced optical pulse compression with a nonlinear frequency modulated (NLFM) waveform containing a complex, rigorously defined modulation law optimized for bandwidth-limited synthesis and sidelobe suppression. The proposed method offers high fidelity chirped waveforms, and when employed in resolving a 50-cm event at ∼1.13 km using a 1.2-μs probe pulse, matched filtering with the DDS-generated NLFM waveform results in a significant reduction in range ambiguity owing to autocorrelation sidelobe suppression of ∼20 dB without averaging or windowing functions, for an improvement of ∼16 dB compared to conventional linear chirping. Experimental results also show that the contribution of autocorrelation sidelobes to the power in the compressed backscattering responses around localized events is suppressed by up to ∼18 dB when advanced pulse compression with an optical NLFM pulse is employed.
On Extracting Coarse-Grained Function Parallelism from C Programs
To efficiently utilize the emerging heterogeneous multi-core architecture, it is essential to exploit the inherent coarse-grained parallelism in applications. In addition to data parallelism, applications like telecommunication, multimedia, and gaming can also benefit from the exploitation of coarse-grained function parallelism. To exploit coarse-grained function parallelism, the common wisdom is to rely on programmers to explicitly express the coarse-grained data-flow between coarse-grained functions using data-flow or streaming languages.
This research is set to explore another approach to exploiting coarse-grained function parallelism, that is, to rely on the compiler to extract coarse-grained data-flow from imperative programs. We believe imperative languages and the von Neumann programming model will still be the dominant programming languages and programming model in the future.
This dissertation discusses the design and implementation of a memory data-flow analysis system which extracts coarse-grained data-flow from C programs. The memory data-flow analysis system partitions a C program into a hierarchy of program regions. It then traverses the program region hierarchy from the bottom up, summarizing the exposed memory access patterns for each program region while deriving conservative producer-consumer relations between program regions. An ensuing top-down traversal of the program region hierarchy then refines the producer-consumer relations by pruning spurious ones.
We built an inlining-based prototype of the memory data-flow analysis system on top of the IMPACT compiler infrastructure. We applied the prototype to analyze the memory data-flow of several MediaBench programs. The experimental results showed that while the prototype performed reasonably well for the tested programs, the inlining-based implementation may not be efficient for larger programs. Also, there is still room for improving the effectiveness of the memory data-flow analysis system. We performed root-cause analysis of the inaccuracies in the memory data-flow analysis results, which provided us with insights on how to improve the memory data-flow analysis system in the future.
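The bottom-up summarization pass described above can be sketched as follows (a toy model with illustrative names; the real system's region hierarchy, access-pattern summaries, and top-down pruning pass are far richer):

```python
# Toy sketch of the bottom-up pass: each program region summarizes the
# memory locations it may read/write; a parent's summary is the union of
# its children's, and two sibling regions stand in a producer-consumer
# relation when one's write set intersects the other's read set.

class Region:
    def __init__(self, name, reads=(), writes=(), children=()):
        self.name, self.children = name, list(children)
        self.reads, self.writes = set(reads), set(writes)

def summarize(region):
    """Bottom-up traversal: fold children's access summaries into the parent."""
    for child in region.children:
        summarize(child)
        region.reads |= child.reads
        region.writes |= child.writes

def producer_consumer(siblings):
    """Conservative relation: producer's writes may feed consumer's reads."""
    return [(p.name, c.name) for p in siblings for c in siblings
            if p is not c and p.writes & c.reads]

stage1 = Region("decode", reads={"in"}, writes={"frame"})
stage2 = Region("filter", reads={"frame"}, writes={"out"})
top = Region("main", children=[stage1, stage2])
summarize(top)
print(producer_consumer(top.children))  # [('decode', 'filter')]
```

The derived relation is exactly the coarse-grained data-flow edge a streaming language would have made the programmer write by hand; the subsequent top-down pass would prune any such edge shown to be spurious.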
A Hybrid Environment for Syntax-Semantic Tagging
The thesis describes the application of the relaxation labelling algorithm to NLP disambiguation. Language is modelled through context constraints inspired by Constraint Grammars. The constraints enable the use of a real value stating "compatibility". The technique is applied to POS tagging, Shallow Parsing and Word Sense Disambiguation. Experiments and results are reported. The proposed approach enables the use of multi-feature constraint models, the simultaneous resolution of several NL disambiguation tasks, and the collaboration of linguistic and statistical models.
Comment: PhD Thesis. 120 pages