5,285 research outputs found
On Optimization Modulo Theories, MaxSMT and Sorting Networks
Optimization Modulo Theories (OMT) is an extension of SMT which allows for
finding models that optimize given objectives. (Partial weighted) MaxSMT --or
equivalently OMT with Pseudo-Boolean objective functions, OMT+PB-- is a
highly relevant strict subcase of OMT. We classify existing approaches for MaxSMT
or OMT+PB in two groups: MaxSAT-based approaches exploit the efficiency of
state-of-the-art MaxSAT solvers, but they are specific-purpose and not always
applicable; OMT-based approaches are general-purpose, but they suffer from
intrinsic inefficiencies on MaxSMT/OMT+PB problems.
We identify a major source of such inefficiencies, and we address it by
enhancing OMT by means of bidirectional sorting networks. We implemented this
idea on top of the OptiMathSAT OMT solver. We run an extensive empirical
evaluation on a variety of problems, comparing MaxSAT-based and OMT-based
techniques, with and without sorting networks, implemented on top of
OptiMathSAT and νZ. The results support the effectiveness of this idea, and
provide interesting insights about the different approaches. Comment: 17 pages, submitted at Tacas 1
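As a rough illustration of why sorting networks suit Pseudo-Boolean objectives, the sketch below sorts a vector of Boolean "soft clause violated" flags with a plain odd-even transposition network, so that output wire k-1 is true exactly when at least k inputs are true. This is only a minimal sketch under the assumption of unit weights; it is not the bidirectional encoding of the paper, and the names comparator, sort_bits and at_least are hypothetical helpers, not part of OptiMathSAT or νZ.

```python
from itertools import product


def comparator(a, b):
    """Boolean comparator: returns (max, min), i.e. (a or b, a and b)."""
    return (a or b), (a and b)


def sort_bits(bits):
    """Sort Booleans with an odd-even transposition network.

    Runs n rounds of adjacent comparators on n wires and returns the
    values in non-increasing order (all True values first).
    """
    wires = list(bits)
    n = len(wires)
    for rnd in range(n):
        # Even rounds compare pairs (0,1), (2,3), ...; odd rounds (1,2), (3,4), ...
        for i in range(rnd % 2, n - 1, 2):
            wires[i], wires[i + 1] = comparator(wires[i], wires[i + 1])
    return wires


def at_least(bits, k):
    """True iff at least k inputs are True, read off output wire k-1."""
    if k <= 0:
        return True
    if k > len(bits):
        return False
    return sort_bits(bits)[k - 1]


def exhaustive_check(n):
    """Compare the network against Python's sort on all 2**n inputs."""
    return all(
        sort_bits(list(b)) == sorted(b, reverse=True)
        for b in product([False, True], repeat=n)
    )
```

Because the sorted output is a unary count, a bound such as "at most two soft clauses violated" reduces to forcing a single output wire to false, which is the kind of propagation a solver can exploit.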
Enhancing Sensitivity Classification with Semantic Features using Word Embeddings
Government documents must be reviewed to identify any sensitive information
they may contain, before they can be released to the public. However,
traditional paper-based sensitivity review processes are not practical for reviewing
born-digital documents. Therefore, there is a timely need for automatic sensitivity
classification techniques, to assist the digital sensitivity review process.
However, sensitivity is typically a product of the relations between combinations
of terms, such as who said what about whom. Automatic sensitivity
classification is therefore a difficult task. Vector representations of terms, such as word
embeddings, have been shown to be effective at encoding latent term features
that preserve semantic relations between terms, which can also be beneficial to
sensitivity classification. In this work, we present a thorough evaluation of the
effectiveness of semantic word embedding features, along with term and grammatical
features, for sensitivity classification. On a test collection of government
documents containing real sensitivities, we show that extending text classification
with semantic features and additional term n-grams results in significant improvements
in classification effectiveness, correctly classifying 9.99% more sensitive
documents compared to the text classification baseline.
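The feature combination described above can be sketched as follows: term n-gram counts are concatenated with an averaged word-embedding vector to form one feature vector per document. This is only an illustrative sketch; the embedding values in TOY_EMBEDDINGS are invented, the function names are hypothetical, and the abstract does not state how the authors actually combine their features.

```python
from collections import Counter

# Hypothetical 2-dimensional embeddings, invented purely for illustration.
TOY_EMBEDDINGS = {
    "minister": [0.9, 0.1],
    "said": [0.2, 0.7],
    "informant": [0.8, 0.3],
}


def term_ngrams(tokens, n):
    """Return the list of term n-grams (joined by spaces) in a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def mean_embedding(tokens, embeddings, dim):
    """Average the embeddings of known tokens; zeros if none are known."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return [0.0] * dim
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]


def featurize(text, vocab, embeddings, dim):
    """Concatenate unigram/bigram counts over vocab with a mean embedding."""
    tokens = text.lower().split()
    counts = Counter(tokens + term_ngrams(tokens, 2))
    return [float(counts[t]) for t in vocab] + mean_embedding(tokens, embeddings, dim)
```

The resulting vectors could then be passed to any standard text classifier; the choice of classifier is outside the scope of this sketch.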
Night-time oxidation of surfactants at the air–water interface: effects of chain length, head group and saturation
Reactions of the key atmospheric night-time oxidant NO3 with organic monolayers at the air–water interface are used as proxies for the ageing of organic-coated aqueous aerosols. The surfactant molecules chosen for this study are oleic acid (OA), palmitoleic acid (POA), methyl oleate (MO) and stearic acid (SA) to investigate the effects of chain length, head group and degree of unsaturation on the reaction kinetics and products formed. Fully and partially deuterated surfactants were studied using neutron reflectometry (NR) to determine the reaction kinetics of organic monolayers with NO3 at the air–water interface for the first time. Kinetic modelling allowed us to determine the rate coefficients for the oxidation of OA, POA and MO monolayers to be (2.8 ± 0.7) × 10⁻⁸ cm² molecule⁻¹ s⁻¹, (2.4 ± 0.5) × 10⁻⁸ cm² molecule⁻¹ s⁻¹ and (3.3 ± 0.6) × 10⁻⁸ cm² molecule⁻¹ s⁻¹, respectively. The corresponding uptake coefficients were found to be (2.1 ± 0.5) × 10⁻³, (1.7 ± 0.3) × 10⁻³ and (2.1 ± 0.4) × 10⁻³. For the much slower NO3-initiated oxidation of the saturated surfactant SA we found a loss rate of (5 ± 1) × 10⁻¹² cm² molecule⁻¹ s⁻¹ which we consider to be an upper limit for the reactive loss, and estimated an uptake coefficient of (5 ± 1) × 10⁻⁷. Our investigations demonstrate that NO3 will contribute substantially to the processing of unsaturated surfactants at the air–water interface during night-time given its reactivity is ca. two orders of magnitude higher than that of O3. Furthermore, the relative contributions of NO3 and O3 to the oxidative losses vary massively between species that are closely related in structure: NO3 reacts ca. 400 times faster than O3 with the common model surfactant oleic acid, but only ca. 60 times faster with its methyl ester MO. It is therefore necessary to perform a case-by-case assessment of the relative contributions of the different degradation routes for any specific surfactant.
The overall impact of NO3 on the fate of saturated surfactants is slightly less clear given the lack of prior kinetic data for comparison, but NO3 is likely to contribute significantly to the loss of saturated species and dominate their loss during night-time. The retention of the organic character at the air–water interface differs fundamentally between the different surfactant species: the fatty acids studied (OA and POA) form products with a yield of ∼20% that are stable at the interface while NO3-initiated oxidation of the methyl ester MO rapidly and effectively removes the organic character (≤ 3% surface-active products). The film-forming potential of reaction products in real aerosol is thus likely to depend on the relative proportions of saturated and unsaturated surfactants as well as the head group properties. Atmospheric lifetimes of unsaturated species are much longer than those determined with respect to their reactions at the air–water interface, so that they must be protected from oxidative attack, e.g., by incorporation into a complex aerosol matrix or in mixed surface films with yet unexplored kinetic behaviour.
From LTL and Limit-Deterministic Büchi Automata to Deterministic Parity Automata
Controller synthesis for general linear temporal logic (LTL) objectives is a
challenging task. The standard approach involves translating the LTL objective
into a deterministic parity automaton (DPA) by means of the Safra-Piterman
construction. One of the challenges is the size of the DPA, which often grows
very fast in practice, and can reach double exponential size in the length of
the LTL formula. In this paper we describe a single exponential translation
from limit-deterministic Büchi automata (LDBA) to DPA, and show that it can
be concatenated with a recent efficient translation from LTL to LDBA to yield a
double exponential, "Safraless" LTL-to-DPA construction. We also report
on an implementation, a comparison with the SPOT library, and performance on
several sets of formulas, including instances from the 2016 SyntComp
competition.
Insights from the classical MD simulations
Salt bridges and ionic interactions play an important role in protein
stability, protein-protein interactions, and protein folding. Here, we present
classical MD simulations of the structure and IR signatures of the
arginine (Arg)–glutamate (Glu) salt bridge. The Arg-Glu model is based on the
infinite polyalanine antiparallel two-stranded β-sheet structure. The 1 μs NPT
simulations show that it preferably exists as a salt bridge (a contact ion
pair). Bidentate (the end-on and side-on structures) and monodentate (the
backside structure) configurations are localized [Donald et al., Proteins 79,
898–915 (2011)]. These structures are stabilized by the short +N–H⋯O− bonds.
Their relative stability depends on the force field used in the MD simulations.
The side-on structure is the most stable in terms of the OPLS-AA force field.
If AMBER ff99SB-ILDN is used, the backside structure is the most stable.
Compared with experimental data, simulations using the OPLS all-atom (OPLS-AA)
force field describe the stability of the salt bridge structures quite
realistically. It decreases in the following order: side-on > end-on >
backside. The most stable side-on structure lives for several nanoseconds. The
less stable backside structure exists for a few tenths of a nanosecond. Several
short-lived species (solvent-shared ion pairs, completely separately solvated ionic
groups, etc.) are also localized. Their lifetime is a few tens of
picoseconds or less. Conformational flexibility of amino acids forming the
salt bridge is investigated. The spectral signature of the Arg-Glu salt bridge
is the IR-intensive band around 2200 cm⁻¹. It is caused by the asymmetric
stretching vibrations of the +N–H⋯O− fragment. The results of the present paper
suggest that infrared spectroscopy in the 2000–2800 cm⁻¹ frequency region may be a
rapid and quantitative method for the study of salt bridges in peptides and
ionic interactions between proteins. This region is usually not considered in
spectroscopic studies of peptides and proteins.
On Security and Sparsity of Linear Classifiers for Adversarial Settings
Machine-learning techniques are widely used in security-related applications,
like spam and malware detection. However, in such settings, they have been
shown to be vulnerable to adversarial attacks, including the deliberate
manipulation of data at test time to evade detection. In this work, we focus on
the vulnerability of linear classifiers to evasion attacks. This can be
considered a relevant problem, as linear classifiers have been increasingly
used in embedded systems and mobile devices for their low processing time and
memory requirements. We exploit recent findings in robust optimization to
investigate the link between regularization and security of linear classifiers,
depending on the type of attack. We also analyze the relationship between the
sparsity of feature weights, which is desirable for reducing processing cost,
and the security of linear classifiers. We further propose a novel octagonal
regularizer that allows us to achieve a proper trade-off between them. Finally,
we empirically show how this regularizer can improve classifier security and
sparsity in real-world application examples including spam and malware
detection.
Evolving rules for document classification
We describe a novel method for using Genetic Programming to create compact classification rules based on combinations of N-Grams (character strings). Genetic programs acquire fitness by producing rules that are effective classifiers in terms of precision and recall when evaluated against a set of training documents. We describe a set of functions and terminals and provide results from a classification task using the Reuters-21578 dataset. We also suggest that, because the induced rules are meaningful to a human analyst, they may have a number of other uses beyond classification and provide a basis for text mining applications.
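To make the notion of a rule built from character n-grams concrete, the following sketch scores one hand-written rule by precision and recall on a tiny labelled set. It is a minimal illustration only: the combinators (has, AND, OR, NOT), the example rule and the five documents are all invented here, F1 is an assumed way of combining precision and recall into one fitness value, and no genetic search is performed.

```python
def has(gram):
    """Terminal: true if the character n-gram occurs in the lowercased text."""
    return lambda text: gram in text.lower()


def AND(p, q):
    return lambda text: p(text) and q(text)


def OR(p, q):
    return lambda text: p(text) or q(text)


def NOT(p):
    return lambda text: not p(text)


def evaluate(rule, docs):
    """Return (precision, recall, f1) of a rule on (text, is_relevant) pairs."""
    tp = fp = fn = 0
    for text, relevant in docs:
        predicted = rule(text)
        if predicted and relevant:
            tp += 1
        elif predicted and not relevant:
            fp += 1
        elif relevant:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Invented toy documents for a hypothetical "crude oil" category.
DOCS = [
    ("Crude prices rose sharply", True),
    ("Oil exports fell in March", True),
    ("Olive oil harvest improves", False),
    ("Petroleum output steady", True),
    ("Boiler maker reports profit", False),
]

# Example rule: "crud" OR ("oil" AND NOT "oliv").
RULE = OR(has("crud"), AND(has("oil"), NOT(has("oliv"))))
```

Calling evaluate(RULE, DOCS) shows the kind of trade-off a fitness function has to weigh: the n-gram "oil" also fires inside "Boiler", which costs precision, while "Petroleum" is missed, which costs recall.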
Applying machine learning to the problem of choosing a heuristic to select the variable ordering for cylindrical algebraic decomposition
Cylindrical algebraic decomposition (CAD) is a key tool in computational
algebraic geometry, particularly for quantifier elimination over real-closed
fields. When using CAD, there is often a choice for the ordering placed on the
variables. This can be important, with some problems infeasible with one
variable ordering but easy with another. Machine learning is the process of
fitting a computer model to a complex function based on properties learned from
measured data. In this paper we use machine learning (specifically a support
vector machine) to select between heuristics for choosing a variable ordering,
outperforming each of the separate heuristics. Comment: 16 page
Strong Anisotropy in Liquid Water upon Librational Excitation using Terahertz Laser Fields
Tracking the excitation of water molecules in the homogeneous liquid is
challenging due to the ultrafast dissipation of rotational excitation energy
through the hydrogen-bonded network. Here we demonstrate strong transient
anisotropy of liquid water through librational excitation using single-color
pump-probe experiments at 12.3 THz. We deduce a third-order response χ^(3)
exceeding previously reported values in the optical range by three orders of
magnitude. Using a theory that replaces the nonlinear response with a material
response property amenable to molecular dynamics simulation, we show that the
rotationally damped motion of water molecules in the librational band is
resonantly driven at this frequency, which could explain the enhancement of the
anisotropy in the liquid by the external terahertz field. By addition of salt
(MgSO4), the hydration water is instead dominated by the local electric field
of the ions, resulting in a reduction in the number of water molecules that can be dynamically
perturbed by THz pulses.
An efficient k.p method for calculation of total energy and electronic density of states
An efficient method for calculating the electronic structure in large systems
with a fully converged Brillouin zone (BZ) sampling is presented. The method is based on a
k.p-like approximation developed in the framework of the density functional
perturbation theory. The reliability and efficiency of the method are
demonstrated in test calculations on Ar and Si supercells.
