474 research outputs found
Ecotourism illustrated by municipal-level nature conservation: Хостенин
The article describes the settlement of Хостенин, where alternative energy sources are used with the support of municipal authorities. Since the mid-1990s the settlement has become known for ecological projects that rely on local resources, on conserving and using renewable energy sources, in particular solar energy and biomass, and on environmentally safe technologies that support the sustainable development of the area
Matching Subsequences in Trees
Given two rooted, labeled trees $P$ and $T$, the tree path subsequence problem
is to determine which paths in $P$ are subsequences of which paths in $T$. Here
a path begins at the root and ends at a leaf. In this paper we propose this
problem as a useful query primitive for XML data, and provide new algorithms
improving the previously best known time and space bounds.
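The problem statement above can be illustrated with a naive sketch. This is not the algorithm from the paper: it simply enumerates every root-to-leaf path in both trees and tests each pair, and the tuple-based tree encoding and all function names are assumptions made for illustration.

```python
from typing import Iterator

# Hypothetical encoding: a tree is (label, [children]), and a leaf has no children.
Tree = tuple


def root_to_leaf_paths(tree: Tree) -> Iterator[list]:
    """Yield the label sequence of every root-to-leaf path."""
    label, children = tree
    if not children:
        yield [label]
        return
    for child in children:
        for path in root_to_leaf_paths(child):
            yield [label] + path


def is_subsequence(short: list, long: list) -> bool:
    """True if `short` can be obtained from `long` by deleting elements."""
    remaining = iter(long)
    # `x in remaining` consumes the iterator, so order is respected.
    return all(x in remaining for x in short)


def matching_paths(p: Tree, t: Tree) -> list:
    """Brute force: pair each path of p with every path of t that contains it."""
    t_paths = list(root_to_leaf_paths(t))
    return [
        (p_path, t_path)
        for p_path in root_to_leaf_paths(p)
        for t_path in t_paths
        if is_subsequence(p_path, t_path)
    ]


pattern = ("a", [("b", []), ("c", [])])
target = ("a", [("x", [("b", [])]), ("d", [])])
print(matching_paths(pattern, target))
```

The brute-force version compares every pair of paths, which is exactly the cost the improved bounds in the abstract aim to avoid.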
Supermetric search with the four-point property
Metric indexing research is concerned with the efficient evaluation of queries in metric spaces. In general, a large space of objects is arranged in such a way that, when a further object is presented as a query, those objects most similar to the query can be efficiently found. Most such mechanisms rely upon the triangle inequality property of the metric governing the space. The triangle inequality property is equivalent to a finite embedding property, which states that any three points of the space can be isometrically embedded in two-dimensional Euclidean space. In this paper, we examine a class of semimetric space which is finitely 4-embeddable in three-dimensional Euclidean space. In mathematics this property has been extensively studied and is generally known as the four-point property. All spaces with the four-point property are metric spaces, but they also have some stronger geometric guarantees. We coin the term supermetric space as, in terms of metric search, they are significantly more tractable. We show some stronger geometric guarantees deriving from the four-point property which can be used in indexing to great effect, and show results for two of the SISAP benchmark searches that are substantially better than any previously published
Accelerating Metric Filtering by Improving Bounds on Estimated Distances
Filtering is a fundamental strategy of metric similarity indexes to minimise the number of computed distances. Given a triple of objects for which distances of two pairs are known, the lower and upper bounds on the third distance can be set as the difference and the sum of these two already known distances, due to the triangle inequality rule of the metric space. For efficiency reasons, the tightness of bounds is crucial, but as angles within triangles of distances can be arbitrary, the worst case with zero and straight angles must also be considered for correctness. However, in data of real-life applications, the distribution of possible angles is skewed and extremes are very unlikely to occur. In this paper, we enhance the existing definition of bounds on the unknown distance with information about possible angles within triangles. We show that two lower bounds and one upper bound on each distance exist in case of limited angles. We analyse their filtering power and confirm high improvements of efficiency by experiments on several real-life datasets
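As a minimal sketch of the classical bounds described above (not the angle-limited bounds the paper proposes), the difference and the sum of two known distances bracket the unknown third distance. The function names and the range-query setting are assumptions made for illustration.

```python
import math


def triangle_bounds(d_qp: float, d_po: float) -> tuple:
    """Bounds on d(q, o) given d(q, p) and d(p, o), by the triangle inequality."""
    lower = abs(d_qp - d_po)  # zero-angle worst case
    upper = d_qp + d_po       # straight-angle worst case
    return lower, upper


def can_skip(d_qp: float, d_po: float, radius: float) -> bool:
    """In a range query of the given radius around q, object o can be
    discarded without computing d(q, o) if the lower bound exceeds the radius."""
    lower, _ = triangle_bounds(d_qp, d_po)
    return lower > radius


# Check the bounds on three points in the Euclidean plane.
q, p, o = (0.0, 0.0), (3.0, 4.0), (6.0, 0.0)
lower, upper = triangle_bounds(math.dist(q, p), math.dist(p, o))
print(lower, math.dist(q, o), upper)
```

Here the true distance sits well inside the interval, which is the looseness that tighter bounds are meant to reduce.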
Using MILOS to Build a Multimedia Digital Library Application: The PhotoBook Experience
The digital library field has recently been broadening its scope of applicability, and it is continuously adapting to the frequent changes occurring in the internet society. Accordingly, digital libraries are gradually moving from a controlled environment accessible only to professionals and domain experts to environments accessible to casual users who want to exploit the potential offered by digital library technology. These new trends require, for instance, new search paradigms to be offered, new media content to be managed, and new description extraction techniques to be used. Building digital library applications, and effectively adapting them to emerging trends, requires a platform that offers standard and powerful building blocks to support application developers. In this paper we discuss our experience of using MILOS, a multimedia content management system oriented to the construction of digital libraries, to build a demanding application dedicated to non-professional users. Specifically, we discuss the design and implementation of an on-line photo album (PhotoBook), which is a digital library application that allows people to manage their own photos, to share them with friends, and to make them publicly available and searchable. PhotoBook uses a complex internal metadata schema (MPEG-7) and allows users to express complex queries simply (combining similarity search and fielded search), enabling them to retrieve material of interest even if metadata are imprecise or missing.
The Tree Inclusion Problem: In Linear Space and Faster
Given two rooted, ordered, and labeled trees $P$ and $T$, the tree inclusion
problem is to determine if $P$ can be obtained from $T$ by deleting nodes in
$T$. This problem has recently been recognized as an important query primitive
in XML databases. Kilpeläinen and Mannila [SIAM J. Comput. 1995]
presented the first polynomial time algorithm using quadratic time and space.
Since then several improved results have been obtained for special cases when
$P$ and $T$ have a small number of leaves or small depth. However, in the worst
case these algorithms still use quadratic time and space. Let $n_S$, $l_S$, and
$d_S$ denote the number of nodes, the number of leaves, and the depth
of a tree $S \in \{P, T\}$. In this paper we show that the tree inclusion
problem can be solved in space $O(n_T)$ and time
$O(\min(l_P n_T,\; l_P l_T \log\log n_T + n_T,\; \frac{n_P n_T}{\log n_T} + n_T \log n_T))$.
This improves or
matches the best known time complexities while using only linear space instead
of quadratic. This is particularly important in practical applications, such as
XML databases, where the space is likely to be a bottleneck.
Detecting Advanced Network Threats Using a Similarity Search
In this paper, we propose a novel approach for the detection of advanced network threats. We combine knowledge-based detections with similarity search techniques commonly utilized for automated image annotation. This unique combination could provide effective detection of common network anomalies together with their unknown variants. In addition, it approaches network data analysis much as a security analyst does. Our research is focused on understanding the similarity of anomalies in network traffic and their representation within complex behaviour patterns. This will lead to a proposal of a system for the real-time analysis of network data based on similarity. This goal should be achieved within a period of three years as a part of a PhD thesis
Techniques for Complex Analysis of Contemporary Data
Contemporary data objects are typically complex, semi-structured, or entirely unstructured. Moreover, objects are also related to one another, forming a network. In such a situation, data analysis requires not only the traditional attribute-based access but also access based on similarity as well as data mining operations. Though tools for such operations do exist, they usually specialise in operation and are available for specialized data structures supported by specific computer system environments. In contrast, advanced analyses are obtained by applying several elementary access operations, which in turn requires expert knowledge in multiple areas. In this paper, we propose a unification platform for various data analytical operators, specified as a general-purpose analytical system, ADAMiSS. An extensible set of data-mining and similarity-based operators over a common versatile data structure allows the recursive application of heterogeneous operations, and thus the definition of the complex analytical processes necessary to solve contemporary analytical tasks. As a proof of concept, we present results that were obtained by our prototype implementation on two real-world data collections: the Twitter Higgs boson and the Kosarak datasets
Socioeconomic indicators and ethnicity as determinants of regional mortality rates in Slovakia
Regional differences in mortality might reflect socioeconomic and ethnic differences between regions. The present study examines the relationship between education, unemployment, income, Roma population and regional mortality in the Slovak Republic. Separately for males and females, data on standardised mortality in the Slovak population aged 20-64 years in the year 2002 were calculated for each of the 79 districts. Similarly, the proportions of respondents with tertiary education, unemployed status, Roma ethnicity and income data were calculated per district. A linear regression model was used to analyse the data. Socioeconomic differences in regional mortality were found among males, but not among females. While education and unemployment rate significantly contributed to mortality differences between regions, income and the proportion of Roma population did not. The model explained 32.9% of the variance in standardised mortality rate among districts for males and 7.6% for females. Low education and a high unemployment rate seem to be indicators of regions with high mortality among males and therefore should be targeted by policy measures aimed at decreasing mortality in productive age
Reference point hyperplane trees
Our context of interest is tree-structured exact search in metric spaces. We make the simple observation that, the deeper a data item is within the tree, the higher the probability of that item being excluded from a search. Assuming a fixed and independent probability p of any subtree being excluded at query time, the probability of an individual data item being accessed is (1−p)^d for a node at depth d. In a balanced binary tree half of the data will be at the maximum depth of the tree so this effect should be significant and observable. We test this hypothesis with two experiments on partition trees. First, we force a balance by adjusting the partition/exclusion criteria, and compare this with unbalanced trees where the mean data depth is greater. Second, we compare a generic hyperplane tree with a monotone hyperplane tree, where also the mean depth is greater. In both cases the tree with the greater mean data depth performs better in high-dimensional spaces. We then experiment with increasing the mean depth of nodes by using a small, fixed set of reference points to make exclusion decisions over the whole tree, so that almost all of the data resides at the maximum depth. Again this can be seen to reduce the overall cost of indexing. Furthermore, we observe that having already calculated reference point distances for all data, a final filtering can be applied if the distance table is retained. This reduces further the number of distance calculations required, whilst retaining scalability. The final structure can in fact be viewed as a hybrid between a generic hyperplane tree and a LAESA search structure
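The depth argument above can be made concrete with a short sketch. It assumes, as the abstract does, a fixed and independent exclusion probability p per subtree. It also assumes, purely for illustration, a complete binary tree that stores one item at every node; the function names are hypothetical.

```python
def access_probability(p: float, depth: int) -> float:
    """Probability that an item at the given depth is accessed, i.e. that
    none of the `depth` subtrees on its root path was excluded."""
    return (1.0 - p) ** depth


def expected_accesses(p: float, height: int) -> float:
    """Expected number of items accessed in a complete binary tree of the
    given height, with 2**d items at depth d (an illustrative assumption)."""
    return sum((2 ** d) * access_probability(p, d) for d in range(height + 1))


print(access_probability(0.5, 3))  # an item three levels down
print(expected_accesses(0.5, 3))   # out of the 15 items in the tree
```

Under these assumptions, pushing more of the data to greater depth lowers the chance that any given item is touched, which is the effect the experiments test.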