A solution for secure use of Kibana and Elasticsearch in multi-user environment
Monitoring is indispensable for checking the status, activity, and resource
usage of IT services. A combination of Kibana and Elasticsearch is used for
monitoring at many sites, such as KEK, CC-IN2P3, and CERN, as well as in
non-HEP communities.
Kibana provides a web interface for rich visualization, and Elasticsearch is a
scalable distributed search engine. However, these tools do not support
authentication and authorization features by default. When a single Kibana
and Elasticsearch service is shared among many users, any user who can
access Kibana can retrieve other users' information from Elasticsearch. In a
multi-user environment, fine-grained access control is necessary so that users
can protect their own data from others or share part of it within a group.
The CERN cloud service group provided a cloud utilization dashboard to each
user through Elasticsearch and Kibana. They deployed a homemade Elasticsearch
plugin to restrict data access based on the user authenticated by the CERN
Single Sign On system. It gave each user a separate Kibana dashboard for cloud
usage that other users could not access. Building on that solution, we propose
an alternative that enables user/group-based Elasticsearch access control and
separation of Kibana objects. It is more flexible and can be applied not only
to the cloud service but also to various other situations. We confirmed that
our solution works at CC-IN2P3. Moreover, a pre-production platform for
CC-IN2P3 is under construction.
We will describe our solution for the secure use of Kibana and Elasticsearch,
including the integration of Kerberos authentication, the development of a
Kibana plugin that allows Kibana objects to be separated by user/group, and
our contributions to Search Guard, an Elasticsearch plugin that enables
user/group-based access control. We will also describe the performance impact
of using Search Guard.
Comment: International Symposium on Grids and Clouds 2017 (ISGC 2017)
Towards the cloudification of the social networks analytics
In recent years, with the growth of available data from social networks and the rise of big data technologies, social data has emerged as one of the most profitable markets for companies seeking to increase their profits. In addition, social computation scientists see such data as a vast ocean of information for studying modern human societies. Today, enterprises and researchers either develop their own mining tools in house or outsource their social media mining needs to specialised companies, with the consequent economic cost. In this paper, we present the first cloud computing service that facilitates the deployment of social media analytics applications, allowing data practitioners to use social mining tools as a service. The main advantage of this service is the ability to run different queries at the same time and combine their results in real time. We also introduce twearch, a prototype for developing Twitter mining algorithms as services in the cloud.
Peer Reviewed. Postprint (author’s final draft)
Substring filtering for low-cost linked data interfaces
Recently, Triple Pattern Fragments (TPFs) were introduced as a low-cost server-side interface for cases where large numbers of clients need to evaluate SPARQL queries. Scalability is achieved by moving part of the query execution to the client, at the cost of longer query times. Since the TPF interface purposely does not support complex constructs such as SPARQL filters, queries that use them must be executed mostly on the client, resulting in long execution times. We therefore investigated the impact of adding a literal substring matching feature to the TPF interface, with the goal of improving query performance while keeping server cost low. In this paper, we discuss the client/server setup and compare the performance of SPARQL queries on multiple implementations, including Elasticsearch and a case-insensitive FM-index. Our evaluations indicate that these improvements allow for faster query execution without significantly increasing the load on the server. Offering the substring feature on TPF servers allows users to obtain faster responses for filter-based SPARQL queries. Furthermore, substring matching can be used to support other filters such as complete regular expressions or range queries.
EAGLE—A Scalable Query Processing Engine for Linked Sensor Data
Recently, many approaches have been proposed to manage sensor data using semantic web technologies for effective heterogeneous data integration. However, our empirical observations revealed that these solutions focus primarily on semantic relationships and pay less attention to spatio–temporal correlations. Most semantic approaches have no spatio–temporal support. Some have attempted to provide full spatio–temporal support, but perform poorly on complex spatio–temporal aggregate queries. In addition, while the volume of sensor data is growing rapidly, the challenge of querying and managing the massive volumes of data generated by sensing devices remains unsolved. In this article, we introduce EAGLE, a spatio–temporal query engine for querying sensor data based on the linked data model. The ultimate goal of EAGLE is to provide an elastic and scalable system that allows fast searching and analysis with respect to the relationships of space, time and semantics in sensor data. We also extend SPARQL with a set of new query operators to support spatio–temporal computing in the linked sensor data context.
EC/H2020/732679/EU/ACTivating InnoVative IoT smart living environments for AGEing well/ACTIVAGE
EC/H2020/661180/EU/A Scalable and Elastic Platform for Near-Realtime Analytics for The Graph of Everything/SMARTE