Design of selective peptide antibiotics by using the sequence moment concept
New antibiotics against multidrug-resistant bacteria are urgently needed, but rapid acquisition of resistance limits their usefulness. Endogenous antimicrobial peptides (AMPs) with moderate selectivity but a multimodal mechanism of action have remained effective against bacteria for millions of years. Their therapeutic application, however, requires optimizing the balance between antibacterial activity and selectivity, so rational design methods for increasing selectivity are highly desirable. We created training (n=36) and testing (n=37) sets from frog-derived AMPs with a determined therapeutic index (TI). The 'sequence moments' concept then enabled us to find a one-parameter linear model giving a good correlation between measured and predicted TI (r^2 = 0.83 and 0.64 for the two sets, respectively). The concept was then used in the AMP-Designer algorithm to propose primary structures for highly selective AMPs against Gram-negative bacteria. Testing the activity of one such peptide yielded TI > 200, compared with TI = 125 for the best AMP in the database.
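For readers unfamiliar with moment-style descriptors, the sketch below illustrates the general flavor of such an approach, not the authors' exact method: per-residue scale values are summed as vectors placed at successive angles (a hydrophobic-moment-style sum), and a one-parameter linear model is then fit against measured TI values. The scale, angular step, sequences, and TI numbers are all hypothetical placeholders.

```python
# Illustrative sketch only, not the authors' exact descriptor.
import numpy as np

TOY_SCALE = {"A": 0.62, "L": 1.06, "K": -1.50, "G": 0.48, "F": 1.19, "S": -0.18}

def moment(seq, step_deg=100.0):
    """Magnitude of the vector sum of residue values at successive angles."""
    angles = np.deg2rad(step_deg) * np.arange(len(seq))
    vals = np.array([TOY_SCALE.get(aa, 0.0) for aa in seq])
    return np.hypot((vals * np.cos(angles)).sum(), (vals * np.sin(angles)).sum())

# One-parameter linear model TI ~ a * descriptor, least squares without intercept.
seqs = ["GLFKALKA", "KLAKSLFG", "FLGALFKA"]          # placeholder peptides
ti = np.array([20.0, 85.0, 140.0])                   # placeholder measured TI
d = np.array([moment(s) for s in seqs])
a = (d @ ti) / (d @ d)
r2 = 1.0 - ((ti - a * d) ** 2).sum() / ((ti - ti.mean()) ** 2).sum()
print(f"slope={a:.2f}, r^2={r2:.2f}")
```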
Efficient HTTP-based I/O on very large datasets for high performance computing with the libdavix library
Remote data access for data analysis in high performance computing is
commonly done with specialized data access protocols and storage systems. These
protocols are highly optimized for high throughput on very large datasets,
multi-stream transfers, high availability, low latency, and efficient parallel I/O. The
purpose of this paper is to describe how we have adapted a generic protocol,
the Hypertext Transfer Protocol (HTTP), to make it a competitive alternative
for high performance I/O and data analysis applications in a global computing
grid: the Worldwide LHC Computing Grid. In this work, we first analyze the
design differences between the HTTP protocol and the most common high
performance I/O protocols, pointing out the main performance weaknesses of
HTTP. Then, we describe in detail how we solved these issues. Our solutions
have been implemented in a toolkit called davix, available through several
recent Linux distributions. Finally, we describe the results of our benchmarks,
in which we compare the performance of davix against an HPC-specific protocol
for a data analysis use case. Comment: Presented at Very Large Data Bases (VLDB) 2014, Hangzhou
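The key primitive that makes HTTP viable for partial and parallel I/O is the byte-range request. The sketch below shows that generic mechanism in plain Python; it is not the davix API (the paper's optimizations, such as session reuse and multi-range reads, go well beyond this minimal call), and the URL is a placeholder.

```python
# Generic illustration of an HTTP byte-range GET, the building block of
# partial I/O over HTTP. Not the davix API; URL is a placeholder.
import urllib.request

def read_range(url: str, offset: int, length: int) -> bytes:
    """Fetch `length` bytes starting at `offset` via an HTTP Range request."""
    req = urllib.request.Request(url)
    req.add_header("Range", f"bytes={offset}-{offset + length - 1}")
    with urllib.request.urlopen(req) as resp:
        # 206 Partial Content means the server honored the byte range.
        if resp.status not in (200, 206):
            raise IOError(f"unexpected HTTP status {resp.status}")
        return resp.read()

chunk = read_range("https://example.org/dataset.root", offset=1024, length=4096)
print(len(chunk))
```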
GR@PPA 2.8: initial-state jet matching for weak boson production processes at hadron collisions
The initial-state jet matching method introduced in our previous studies has
been applied to the event generation of single W and Z production processes
and diboson (WW, WZ and ZZ) production processes at hadron
collisions in the framework of the GR@PPA event generator. The generated events
reproduce the transverse momentum spectra of weak bosons continuously in the
entire kinematical region. The matrix elements (ME) for hard interactions are
still at the tree level. As in previous versions, the decays of weak bosons are
included in the matrix elements. Therefore, spin correlations and phase-space
effects in the decay of weak bosons are exact at the tree level. The program
package includes custom-made parton shower programs as well as ME-based hard
interaction generators in order to achieve self-consistent jet matching. The
generated events can be passed to general-purpose event generators to make the
simulation proceed down to the hadron level. Comment: 29 pages, 14 figures; minor changes to clarify the discussions, and corrections of typos
OPUCEM: A Library with Error Checking Mechanism for Computing Oblique Parameters
After a brief review of the electroweak radiative corrections to gauge-boson
self-energies, otherwise known as the direct and oblique corrections, a tool
for calculation of the oblique parameters is presented. This tool, named
OPUCEM, brings together formulas from multiple physics models and provides an
error-checking machinery to improve the reliability of numerical results. It also
sets a novel example for an "open-formula" concept, which is an attempt to
improve the reliability and reproducibility of computations in scientific
publications by encouraging the authors to open-source their numerical
calculation programs. Finally, we demonstrate the use of OPUCEM in two detailed
case studies related to the fourth Standard Model family. The first is a
generic fourth-family study to find relations between the parameters compatible
with the EW precision data, and the second is a particular study of the Flavor
Democracy predictions for both Dirac and Majorana-type neutrinos. Comment: 10 pages, 19 figures; sections 3 and 4 reviewed, results unchanged; typo corrections
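As a taste of the kind of formula such a tool implements, the sketch below evaluates the standard one-loop contribution of a single extra fermion doublet to the oblique T parameter, using the textbook Veltman function. This is generic literature physics, not OPUCEM's code; the masses and the low-energy value of alpha are illustrative choices.

```python
# Textbook one-loop contribution of one extra fermion doublet (m1, m2) to T:
# T = N_c * G_F * F(m1^2, m2^2) / (8 * sqrt(2) * pi^2 * alpha).
import math

G_F = 1.1663787e-5      # Fermi constant [GeV^-2]
ALPHA = 1.0 / 137.036   # fine-structure constant (illustrative normalization)

def veltman_f(x: float, y: float) -> float:
    """F(x, y) = x + y - 2xy/(x - y) * ln(x/y); vanishes for degenerate masses."""
    if math.isclose(x, y):
        return 0.0
    return x + y - 2.0 * x * y / (x - y) * math.log(x / y)

def delta_t(m1: float, m2: float, n_c: int = 3) -> float:
    """Doublet contribution to T for masses in GeV (n_c = 3 for quarks)."""
    return n_c * G_F * veltman_f(m1**2, m2**2) / (8.0 * math.sqrt(2.0) * math.pi**2 * ALPHA)

print(delta_t(400.0, 350.0))  # a mildly split heavy quark doublet
```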
Resource provisioning in Science Clouds: Requirements and challenges
Cloud computing has permeated into the information technology industry in the
last few years, and it is emerging nowadays in scientific environments. Science
user communities are demanding a broad range of computing power, such as local
clusters, high-performance computing systems, and computing grids, to satisfy
the needs of high-performance applications. Different computational models give
rise to different workloads, and the cloud is already considered a promising
paradigm for serving them. The scheduling and allocation of resources is always
a challenging matter in any form of computation, and clouds are no exception.
Science applications have unique features that differentiate their workloads;
hence, their requirements have to be taken into account when building a Science
Cloud. This paper discusses the main scheduling and resource allocation
challenges for any Infrastructure as a Service provider supporting scientific
applications.
Plotting the Differences Between Data and Expectation
This article proposes a way to improve the presentation of histograms where
data are compared to expectation. Sometimes, it is difficult to judge by eye
whether the difference between the bin content and the theoretical expectation
(provided by either a fitting function or another histogram) is just due to
statistical fluctuations. More importantly, there could be statistically
significant deviations which are completely invisible in the plot. We propose
to add a small inset at the bottom of the plot, in which the statistical
significance of the deviation observed in each bin is shown. Even though the
numerical routines we developed are intended for illustration only, it turns
out that they are based on formulae which could be used to perform statistical
inference in a proper way. An implementation of our computation is available at
https://github.com/dcasadei/psde . Comment: 10 pages, 7 figures. Code: https://github.com/dcasadei/psde
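A minimal sketch of the idea (not the repository's exact routines): for each bin, convert the Poisson probability of a deviation at least as extreme as the observed count into a signed z-value, and draw those z-values in a small inset under the main plot. The toy expectation below is a placeholder.

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
expected = 100.0 * np.exp(-0.5 * np.linspace(-3, 3, 30) ** 2)  # toy model
observed = rng.poisson(expected)

# Signed significance: positive for upward fluctuations, negative for downward.
p_up = stats.poisson.sf(observed - 1, expected)   # P(X >= observed)
p_dn = stats.poisson.cdf(observed, expected)      # P(X <= observed)
z = np.where(observed >= expected, stats.norm.isf(p_up), -stats.norm.isf(p_dn))

fig, (ax, inset) = plt.subplots(2, 1, sharex=True,
                                gridspec_kw={"height_ratios": [3, 1]})
bins = np.arange(len(observed))
ax.errorbar(bins, observed, yerr=np.sqrt(observed), fmt="o", label="data")
ax.step(bins, expected, where="mid", label="expectation")
ax.legend()
inset.bar(bins, z)                  # the proposed significance inset
inset.set_ylabel("z")
plt.show()
```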
Type Ia supernova parameter estimation: a comparison of two approaches using current datasets
By using the Sloan Digital Sky Survey (SDSS) first year type Ia supernova (SN
Ia) compilation, we compare two different approaches (traditional \chi^2 and
complete likelihood) to determine parameter constraints when the magnitude
dispersion is to be estimated as well. We consider cosmological constant + Cold
Dark Matter (\Lambda CDM) and spatially flat, constant w Dark Energy + Cold
Dark Matter (FwCDM) cosmological models and show that, for current data, there
is a small difference in the best-fit values and around a 30% difference in
confidence contour areas when the MLCS2k2 light-curve fitter is adopted. For
the SALT2 light-curve fitter the differences are less significant (up to a
13% difference in areas). In both cases the likelihood approach gives more
restrictive constraints. We argue for the importance of using the complete
likelihood instead of the \chi^2 approach when dealing with parameters in the
expression for the variance. Comment: 16 pages, 5 figures. More complete analysis, including peculiar
velocities and correlations among SALT2 parameters; use of 2D contours instead
of 1D intervals for comparison. There can now be a significant difference
between the approaches, around 30% in contour area for MLCS2k2 and up to 13%
for SALT2. General streamlining of text and suppression of the section on model
selection.
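The reason the two approaches can differ is that, once the magnitude dispersion is itself a fit parameter, the ln(variance) normalization term of the Gaussian likelihood can no longer be dropped; chi^2 alone can always be reduced by inflating the dispersion. The sketch below contrasts the two on toy residuals (not the SDSS data); all inputs are placeholders.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
sigma_meas = 0.10 * np.ones(50)                       # per-point measurement errors
resid = rng.normal(0.0, np.hypot(sigma_meas, 0.12))   # toy truth: sigma_int = 0.12

def chi2(params):
    offset, sig_int = params
    var = sigma_meas**2 + sig_int**2
    return np.sum((resid - offset) ** 2 / var)        # no ln(var): ill-posed in sig_int

def neg2loglike(params):
    offset, sig_int = params
    var = sigma_meas**2 + sig_int**2
    return np.sum((resid - offset) ** 2 / var + np.log(var))  # complete likelihood

best = minimize(neg2loglike, x0=[0.0, 0.10])
print(best.x)   # offset near 0 and sig_int near its toy input value
```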
ROOT - A C++ Framework for Petabyte Data Storage, Statistical Analysis and Visualization
ROOT is an object-oriented C++ framework conceived in the high-energy physics
(HEP) community, designed for storing and analyzing petabytes of data in an
efficient way. Any instance of a C++ class can be stored into a ROOT file in a
machine-independent compressed binary format. In ROOT the TTree object
container is optimized for statistical data analysis over very large data sets
by using vertical data storage techniques. These containers can span a large
number of files on local disks, the web, or a number of different shared file
systems. In order to analyze these data, the user can choose from a wide set of
mathematical and statistical functions, including linear algebra classes,
numerical algorithms such as integration and minimization, and various methods
for performing regression analysis (fitting). In particular, ROOT offers
packages for complex data modeling and fitting, as well as multivariate
classification based on machine learning techniques. A central piece of these
analysis tools is the set of histogram classes, which provide binning of one- and
multi-dimensional data. Results can be saved in high-quality graphical formats
like PostScript and PDF or in bitmap formats like JPG or GIF. The result can
also be stored into ROOT macros that allow a full recreation and rework of the
graphics. Users typically create their analysis macros step by step, making use
of the interactive C++ interpreter CINT, while running over small data samples.
Once the development is finished, they can run these macros at full compiled
speed over large data sets, using on-the-fly compilation, or by creating a
stand-alone batch program. Finally, if processing farms are available, the user
can reduce the execution time of intrinsically parallel tasks - e.g. data
mining in HEP - by using PROOF, which will take care of optimally distributing
the work over the available resources in a transparent way.
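A minimal PyROOT sketch of the workflow described above (it requires a ROOT installation with its Python bindings): fill a histogram, fit it, and store the result in a machine-independent ROOT file. The histogram name, title, and output file name are arbitrary choices.

```python
import ROOT

h = ROOT.TH1F("h_mass", "Toy distribution;x;entries", 100, -5.0, 5.0)
h.FillRandom("gaus", 10000)          # sample 10k entries from a unit Gaussian
fit = h.Fit("gaus", "S")             # "S" returns the fit result object
print("fitted sigma:", fit.Get().Parameter(2))

out = ROOT.TFile("analysis.root", "RECREATE")
h.Write()                            # object serialized in compressed binary form
out.Close()
```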
\sqrt{\hat{s}}_{min} resurrected
We discuss the use of the variable \sqrt{\hat{s}}_{min}, which has been proposed
in order to measure the hard scale of a multi parton final state event using
inclusive quantities only, on a SUSY data sample for a 14 TeV LHC. In its
original version, in which this variable was defined at the calorimeter level, the
direct correlation to the hard scattering scale does not survive when effects
from soft physics are taken into account. Here we show that, when using
reconstructed objects instead of calorimeter energies and momenta as input, we
do recover this correlation for the parameter point considered here. We
furthermore discuss the effect of including W + jets and t\bar{t} + jets
backgrounds in our analysis, and the use of \sqrt{\hat{s}}_{min} for the
suppression of SM-induced background in new physics searches. Comment: 23 pages, 9 figures; v2: 1 figure, several
subsections and references added, as well as a new author affiliation.
Corresponds to the published version.
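A sketch of the variable computed from reconstructed objects, assuming the commonly quoted definition \sqrt{\hat{s}}_{min}(M) = \sqrt{E^2 - P_z^2} + \sqrt{M^2 + MET^2}, where E and P_z sum over the reconstructed visible objects and M is the assumed total invisible mass. The four-vectors and MET below are hypothetical placeholder values.

```python
import math

def sqrt_shat_min(objects, met_x, met_y, m_invisible=0.0):
    """objects: list of (E, px, py, pz) four-vectors of reconstructed objects."""
    e_tot = sum(o[0] for o in objects)
    pz_tot = sum(o[3] for o in objects)
    met = math.hypot(met_x, met_y)
    return (math.sqrt(max(e_tot**2 - pz_tot**2, 0.0))
            + math.sqrt(m_invisible**2 + met**2))

jets = [(350.0, 120.0, -40.0, 310.0), (200.0, -80.0, 60.0, 160.0)]  # placeholders
print(sqrt_shat_min(jets, met_x=-45.0, met_y=-15.0))
```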