DutchHatTrick: semantic query modeling, ConText, section detection, and match score maximization
This report discusses the collaborative work of the ErasmusMC, the University of Twente, and the University of Amsterdam on the TREC 2011 Medical track. Here, the task is to retrieve patient visits from the University of Pittsburgh NLP Repository for 35 topics. The repository consists of 101,711 patient reports, and each patient visit is recorded in one or more reports.
An Extended Relevance Model for Session Search
The session search task aims at best serving the user's information need
given her previous search behavior during the session. We propose an extended
relevance model that captures the user's dynamic information need in the
session. Our relevance modelling approach is directly driven by the user's
query reformulation (change) decisions and the estimate of how much the user's
search behavior affects such decisions. Overall, we demonstrate that the
proposed approach significantly boosts session search performance.
The EU, the WTO and indirect land use change
Efforts to meet the European Union’s (EU) alternative energy targets have resulted in increased production of biofuels. This production has resulted in deforestation-related emissions through displacement of agricultural production, a problem known as indirect land-use change. The European Commission (EC) has proposed regulatory options to respond to this problem, but all risk not being in conformity with World Trade Organization (WTO) law. Trade law challenges result from the underlying methodological uncertainty, and from the attempt to address a systemic problem on the level of individual producers. Yet, this does not necessarily indicate that the intent of these regulations is to protect EU markets. Thus, this is an instructive case study to examine the relationship between WTO law and complex, emerging environmental problems.
Kolmogorov Complexity in perspective. Part II: Classification, Information Processing and Duality
We survey diverse approaches to the notion of information: from Shannon
entropy to Kolmogorov complexity. Two of the main applications of Kolmogorov
complexity are presented: randomness and classification. The survey is divided
into two parts published in the same volume. Part II is dedicated to the relation
between logic and information systems, within the scope of Kolmogorov
algorithmic information theory. We present a recent application of Kolmogorov
complexity: classification using compression, an idea with provocative
implementation by authors such as Bennett, Vitanyi and Cilibrasi. This stresses
how Kolmogorov complexity, besides being a foundation to randomness, is also
related to classification. Another approach to classification is also
considered: the so-called "Google classification". It uses another original and
attractive idea which is connected to the classification using compression and
to Kolmogorov complexity from a conceptual point of view. We present and unify
these different approaches to classification in terms of Bottom-Up versus
Top-Down operational modes, whose fundamental principles and underlying
duality we point out. We look at the way these two dual modes are used in
different approaches to information systems, particularly the relational model
for databases introduced by Codd in the 1970s. This allows us to point out diverse
forms of a fundamental duality. These operational modes are also reinterpreted
in the context of the comprehension schema of axiomatic set theory ZF. This
leads us to develop how Kolmogorov complexity is linked to intensionality,
abstraction, classification and information systems. Comment: 43 pages
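The idea of classification using compression mentioned in this abstract can be illustrated with the normalized compression distance of Cilibrasi and Vitanyi, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. The following is only a minimal sketch: zlib stands in as an assumed practical substitute for the (uncomputable) Kolmogorov complexity, and the function names and sample data are hypothetical, not taken from the survey.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (assumed stand-in for C)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    # Hypothetical sample data: a repetitive text and random bytes of the same length.
    text = b"the quick brown fox jumps over the lazy dog " * 20
    noise = bytes(random.Random(0).getrandbits(8) for _ in range(len(text)))
    print("text vs text :", ncd(text, text))
    print("text vs noise:", ncd(text, noise))
```

Objects that share structure compress well together and obtain a small distance, so a clustering method applied to the pairwise distances groups similar objects without any domain-specific features.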
Finding Cycles and Trees in Sublinear Time
We present sublinear-time (randomized) algorithms for finding simple cycles
of length at least $k \geq 3$ and tree-minors in bounded-degree graphs. The
complexity of these algorithms is related to the distance of the graph from
being $C_k$-minor-free (resp., free from having the corresponding tree-minor).
In particular, if the graph is far (i.e., $\Omega(1)$-far) from being
cycle-free, i.e., if one has to delete a constant fraction of edges to make it
cycle-free, then the algorithm finds a cycle of polylogarithmic length in time
$\tilde{O}(\sqrt{N})$, where $N$ denotes the number of vertices. This time
complexity is optimal up to polylogarithmic factors.
The foregoing results are the outcome of our study of the complexity of
one-sided error property testing algorithms in the bounded-degree graphs
model. For example, we show that cycle-freeness of $N$-vertex graphs can be
tested with one-sided error within time complexity
$\tilde{O}(\mathrm{poly}(1/\epsilon) \cdot \sqrt{N})$. This matches the known
$\Omega(\sqrt{N})$ query lower bound, and contrasts with the fact that any minor-free property
admits a two-sided error tester of query complexity that only depends on
the proximity parameter $\epsilon$. For any constant $k \geq 3$, we extend this result
to testing whether the input graph has a simple cycle of length at least $k$.
On the other hand, for any fixed tree $T$, we show that $T$-minor-freeness has
a one-sided error tester of query complexity that only depends on the proximity
parameter $\epsilon$.
Our algorithm for finding cycles in bounded-degree graphs extends to general
graphs, where distances are measured with respect to the actual number of
edges. Such an extension is not possible with respect to finding tree-minors in
$o(\sqrt{N})$ complexity. Comment: Keywords: Sublinear-Time Algorithms, Property Testing, Bounded-Degree
Graphs, One-Sided vs Two-Sided Error Probability. Updated version
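To make the notion of one-sided error concrete, the sketch below rejects cycle-freeness only when it holds an explicit cycle as a witness, so a cycle-free graph is never rejected. It is not the algorithm of the paper and does not achieve the stated time bound: it simply runs a size-bounded breadth-first search from randomly sampled vertices. The function names, the adjacency-dictionary input format, and the parameters `num_samples` and `max_visits` are assumptions made for illustration, and the graph is assumed to be simple and undirected.

```python
import random
from collections import deque


def _cycle_from_tree(parent, u, v):
    """Build the cycle closed by non-tree edge (u, v) using BFS parent pointers."""
    path_u = []
    x = u
    while x is not None:
        path_u.append(x)
        x = parent[x]
    index_in_u = {node: i for i, node in enumerate(path_u)}
    path_v = []
    x = v
    while x not in index_in_u:
        path_v.append(x)
        x = parent[x]
    # x is now the lowest common ancestor of u and v in the BFS tree.
    return path_u[: index_in_u[x] + 1] + path_v[::-1]


def find_cycle_near(adj, start, max_visits):
    """Bounded BFS from start; returns a cycle (list of vertices) or None."""
    parent = {start: None}
    queue = deque([start])
    # Stop expanding once more than max_visits vertices have been discovered.
    while queue and len(parent) <= max_visits:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
            elif parent[u] != v:
                # An edge to an already discovered non-parent vertex closes a cycle.
                return _cycle_from_tree(parent, u, v)
    return None


def sample_for_cycle(adj, num_samples, max_visits, rng=None):
    """Return a witness cycle if one is found (reject), else None (accept)."""
    rng = rng or random.Random()
    vertices = list(adj)
    for _ in range(num_samples):
        cycle = find_cycle_near(adj, rng.choice(vertices), max_visits)
        if cycle is not None:
            return cycle
    return None
```

Because every rejection comes with a cycle that can be checked against the input, the only possible error is accepting a graph that is far from cycle-free, which is exactly the one-sided guarantee discussed in the abstract.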
3D mapping of young stars in the solar neighbourhood with Gaia DR2
We study the three dimensional arrangement of young stars in the solar
neighbourhood using the second release of the Gaia mission (Gaia DR2) and we
provide a new, original view of the spatial configuration of the star forming
regions within 500 pc from the Sun. By smoothing the star distribution through
a Gaussian filter, we construct three dimensional density maps for early-type
stars (upper-main sequence, UMS) and pre-main sequence (PMS) sources. The PMS
and the UMS samples are selected through a combination of photometric and
astrometric criteria. A side product of the analysis is a three dimensional,
G-band extinction map, which we use to correct our colour-magnitude diagram for
extinction and reddening. Both density maps show three prominent structures,
Scorpius-Centaurus, Orion, and Vela. The PMS map shows a plethora of lower mass
star forming regions, such as Taurus, Perseus, Cepheus, Cassiopeia, and
Lacerta, which are less visible in the UMS map, due to the lack of large
numbers of bright, early-type stars. We report the finding of a candidate new
open cluster towards , which could be
related to the Orion star forming complex. We estimate ages for the PMS sample
and we study the distribution of PMS stars as a function of their age. We find
that younger stars cluster in dense, compact clumps, and are surrounded by
older sources, whose distribution is instead more diffuse. The youngest groups
that we find are mainly located in Scorpius-Centaurus, Orion, Vela, and Taurus.
Cepheus, Cassiopeia, and Lacerta are instead more evolved and less numerous.
Finally, we find that the three dimensional density maps show no evidence for
the existence of the ring-like structure which is usually referred to as the
Gould Belt. Comment: 17 pages, 17 figures, 6 appendixes; accepted for publication in A&A;
image quality decreased to comply with the arXiv.org rules on file size
THE KINETIC CHARACTERIZATION OF TWO PUTATIVE GALACTOSIDASES, BT2857 AND SCO6594, AND THE PRELIMINARY ANALYSES OF SCO6595-97
Bacteroides thetaiotaomicron is a prolific bacterium found in the distal intestinal tract of humans that possesses the ability to break down complex polysaccharides through the release of carbohydrate-active enzymes (CAZymes). Overall, the saccharolytic prowess of B. thetaiotaomicron presents an intriguing model for understanding microbial-host nutrient exchange within the gut and for identifying novel mechanisms for accessing carbohydrates. The successful expression, purification, and preliminary kinetic characterization of BT2857 are reported. BT2857 was found to be enzymatically active against the artificial substrate 4-nitrophenyl β-D-galactopyranoside (pNP-Gal), suggesting its role as a putative β-galactosidase and allowing BT2857 activity to be quantified through the Michaelis-Menten parameters kcat, KM, and Vmax. A combinatorial molecular modeling approach of genomic context predicted a two-domain structure formed primarily of beta-sheets and loops. Polysaccharide Utilization Loci (PUL) analysis revealed that BT2851-55 collectively belong to GH36, GH42, GH3 and GH2, providing insight into potential substrates downstream from BT2857. The natural substrate remains unknown despite experiments demonstrating the upregulation of this region of the B. thetaiotaomicron genome in the presence of varying carbohydrates. Crystallization of the truncations of BT2857 will be a key step toward structural and functional characterization.
Streptomyces coelicolor is the representative species within its genus and is predominantly found in soil with access to a diverse environment of carbohydrate sources. The main contribution of S. coelicolor is the metabolism of insoluble remains from plants and animals, indicating positive effects on plant growth and the rhizosphere. To date, the kinetic characterization of a putative galactosidase, SCO6594, has been explored, including the determination of Michaelis-Menten parameters. Solubility screening of SCO6595-97 also displays potential for expression, and characterization of this gene cluster will enable an understanding of protein regulation and carbohydrate interaction.
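The Michaelis-Menten parameters named in this abstract are related by v = Vmax [S] / (KM + [S]) and kcat = Vmax / [E]total. The sketch below shows one common way such parameters can be estimated from rate measurements, a Lineweaver-Burk (double-reciprocal) linear fit. It is an illustration only: the abstract does not state which fitting method was used, and the function names, concentrations, and parameter values are hypothetical rather than data from the thesis.

```python
def mm_rate(s, vmax, km):
    """Michaelis-Menten initial rate at substrate concentration s."""
    return vmax * s / (km + s)


def fit_lineweaver_burk(substrate, rates):
    """Estimate (vmax, km) from the line 1/v = (km/vmax) * (1/s) + 1/vmax."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


def turnover_number(vmax, enzyme_conc):
    """kcat = Vmax / total enzyme concentration (same concentration units)."""
    return vmax / enzyme_conc


if __name__ == "__main__":
    # Hypothetical, noise-free data generated from assumed parameters.
    true_vmax, true_km = 2.0, 0.8
    substrate = [0.1, 0.25, 0.5, 1.0, 2.0, 5.0]
    rates = [mm_rate(s, true_vmax, true_km) for s in substrate]
    vmax, km = fit_lineweaver_burk(substrate, rates)
    print("Vmax:", vmax, "KM:", km, "kcat:", turnover_number(vmax, 0.01))
```

The double-reciprocal fit is easy to follow but weights low-concentration points heavily; with noisy measurements, a direct nonlinear fit of the rate equation is usually preferred.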