From access and integration to mining of secure genomic data sets across the grid
The UK Department of Trade and Industry (DTI) funded BRIDGES project (Biomedical Research Informatics Delivered by Grid Enabled Services) has developed a Grid infrastructure to support cardiovascular research. This includes the provision of a compute Grid and a data Grid infrastructure with security at its heart. In this paper we focus on the BRIDGES data Grid. A primary aim of the BRIDGES data Grid is to help control the complexity of accessing and integrating a myriad of genomic data sets through simple Grid-based tools. We outline these tools and how they are delivered to end-user scientists. We also describe how these tools are to be extended in the BBSRC-funded Grid Enabled Microarray Expression Profile Search (GEMEPS) project to support a richer vocabulary of search capabilities for mining microarray data sets. As with BRIDGES, fine-grained Grid security underpins GEMEPS.
Digital curation and the cloud
Digital curation involves a wide range of activities, many of which could benefit from cloud
deployment to a greater or lesser extent. These range from infrequent, resource-intensive tasks
which benefit from the ability to rapidly provision resources to day-to-day collaborative activities
which can be facilitated by networked cloud services. Associated benefits are offset by risks
such as loss of data or service level, legal and governance incompatibilities and transfer
bottlenecks. There is considerable variability across both risks and benefits according to the
service and deployment models being adopted and the context in which activities are
performed. Some risks, such as legal liabilities, are mitigated by the use of alternative
models, e.g. private clouds, but this typically comes at the expense of benefits such as
resource elasticity and economies of scale. An Infrastructure as a Service model may provide
a basis on which more specialised software services can be offered.
There is considerable work to be done in helping institutions understand the cloud and its
associated costs, risks and benefits, and how these compare to their current working methods,
in order that the most beneficial uses of cloud technologies may be identified. Specific
proposals, echoing recent work coordinated by EPSRC and JISC, are the development of
advisory, costing and brokering services to facilitate appropriate cloud deployments, the
exploration of opportunities for certifying or accrediting cloud preservation providers, and
the targeted publicity of outputs from pilot studies to the full range of stakeholders within the
curation lifecycle, including data creators and owners, repositories, institutional IT support
professionals and senior managers.
Quality-aware model-driven service engineering
Service engineering and service-oriented architecture, as integration and platform technologies, constitute a recent approach to software systems integration. Quality aspects
ranging from interoperability to maintainability to performance are of central importance for the integration of heterogeneous, distributed service-based systems. Architecture models can substantially influence quality attributes of the implemented software systems. Besides the benefits of explicit architectures on maintainability and reuse, architectural constraints such as styles, reference architectures and architectural patterns can influence observable software properties such as performance. Empirical performance evaluation is a process of measuring and evaluating the performance of implemented software. We present an approach for addressing the quality of services and service-based systems at the model-level in the context of model-driven service engineering. The focus on architecture-level models is a consequence of the black-box
character of services.
The Systematization of Disturbances Act upon E-commerce Systems
There are many processes on the Internet, on web servers, in ERP systems, and in companies running an e-commerce system that can be influenced by disturbances. In order to minimize their impact, it is necessary to identify and collect all disturbances, to determine an evaluation metric for them, and to propose necessary remedies. Proposed modifications should be tested by means of modeling, taking the needs of the internal and external environment into consideration. The necessary information can be captured by monitoring the e-commerce system components. Particular system environment properties, such as company structure, system architecture, hardware, software, the method of connection with the supplier's e-commerce system, and the customer communication interface, are to be taken into account. Important social indicators, such as legislative and economic development and the development of the global information society, should also be considered. Disturbance and failure models can be designed using various methods, e.g. multi-agent modeling, simulation, or fuzzy-method modeling. A generic e-commerce system model using the control circuit as a fundamental notion can be used as a base for modeling.
Keywords: e-commerce system, disturbances, categorization of disturbances, modeling of disturbances, agent, simulation of disturbances
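The control-circuit view mentioned at the end of the abstract can be illustrated with a toy feedback loop, in which a disturbance perturbs a monitored metric and a controller compensates. The metric, the shape of the disturbance, and the proportional control law below are illustrative assumptions for the sketch, not the paper's actual model.

```python
def simulate(setpoint, disturbance, steps=50, gain=0.4):
    """Toy control-circuit view of an e-commerce system: a monitored
    metric (e.g. order throughput; an assumed example) is pushed off
    its setpoint by a disturbance at each step, and a proportional
    controller applies a corrective action."""
    value, history = setpoint, []
    for t in range(steps):
        value += disturbance(t)             # external/internal disturbance
        value += gain * (setpoint - value)  # proportional correction
        history.append(value)
    return history

# A one-off shock at t=10 is gradually compensated by the feedback loop.
trace = simulate(100.0, lambda t: -30.0 if t == 10 else 0.0)
print(round(trace[9], 1), round(trace[10], 1), round(trace[-1], 1))
# → 100.0 82.0 100.0  (the metric recovers toward its setpoint)
```

The same loop structure can host richer disturbance models (agent-based, fuzzy, stochastic) by swapping in a different `disturbance` function.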
Reasoning about Independence in Probabilistic Models of Relational Data
We extend the theory of d-separation to cases in which data instances are not
independent and identically distributed. We show that applying the rules of
d-separation directly to the structure of probabilistic models of relational
data inaccurately infers conditional independence. We introduce relational
d-separation, a theory for deriving conditional independence facts from
relational models. We provide a new representation, the abstract ground graph,
that enables a sound, complete, and computationally efficient method for
answering d-separation queries about relational models, and we present
empirical results that demonstrate effectiveness.
Comment: 61 pages; substantial revisions to formalisms, theory, and related work
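The classical (i.i.d.) d-separation that the abstract extends can be checked with the standard moralization criterion: restrict the DAG to the ancestors of the query nodes, marry co-parents and drop edge directions, delete the conditioning set, and test undirected reachability. The sketch below is a minimal plain-Python illustration of that baseline only, not of the relational d-separation or abstract ground graphs the paper introduces.

```python
from collections import deque

def d_separated(parents, xs, ys, zs):
    """Check whether xs and ys are d-separated given zs in a DAG,
    using the moralization criterion. `parents` maps node -> parent set."""
    # 1. Restrict to ancestors of the query nodes.
    relevant, stack = set(), list(xs | ys | zs)
    while stack:
        n = stack.pop()
        if n not in relevant:
            relevant.add(n)
            stack.extend(parents.get(n, set()))

    # 2. Moralize: link each node to its parents, and marry co-parents.
    adj = {n: set() for n in relevant}
    for n in relevant:
        ps = sorted(parents.get(n, set()) & relevant)
        for p in ps:
            adj[n].add(p); adj[p].add(n)
        for i in range(len(ps)):
            for j in range(i + 1, len(ps)):
                adj[ps[i]].add(ps[j]); adj[ps[j]].add(ps[i])

    # 3. Delete the conditioning set and test undirected reachability.
    blocked = set(zs)
    seen = set(xs) - blocked
    queue = deque(seen)
    while queue:
        n = queue.popleft()
        if n in ys:
            return False          # a path survives: not d-separated
        for m in adj[n]:
            if m not in blocked and m not in seen:
                seen.add(m); queue.append(m)
    return True

# Classic collider A -> C <- B: A and B are marginally independent,
# but conditioning on the collider C opens the path.
par = {"C": {"A", "B"}, "A": set(), "B": set()}
print(d_separated(par, {"A"}, {"B"}, set()))   # → True
print(d_separated(par, {"A"}, {"B"}, {"C"}))   # → False
```

The paper's point is that applying these same rules naively to relational model structures yields incorrect independence conclusions, which is what the abstract ground graph representation repairs.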
Leveraging Decision Making in Cyber Security Analysis through Data Cleaning
Security Operations Centers (SOCs) have been built in many institutions for intrusion detection and incident response. A SOC employs various cyber defense technologies to continually monitor and control network traffic. Given the voluminous monitoring data, cyber security analysts need to identify suspicious network activities to detect potential attacks. Because the network monitoring data are generated at a rapid speed and contain a great deal of noise, analysts are so burdened by tedious and repetitive data triage tasks that they can hardly concentrate on the in-depth analysis needed for further decision making. It is therefore critical to employ data cleaning methods in cyber situational awareness. In this paper, we investigate the main characteristics and categories of cyber security data, with a special emphasis on their heterogeneous features. We also discuss how cyber analysts attempt to understand the incoming data through the data analytical process. Based on this understanding, the paper discusses five categories of data cleaning methods for heterogeneous data and addresses the main challenges of applying data cleaning in cyber situational awareness. The goal is to create a dataset that contains accurate information for cyber analysts to work with and thus to achieve higher levels of data-driven decision making in cyber defense.
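As an illustration of the kind of data cleaning step discussed, the sketch below normalizes and de-duplicates heterogeneous alert records. The field names (`src`, `ts`, `sig`) and the one-minute de-duplication window are assumptions made for the example, not taken from any particular SOC tool or from the paper's five categories.

```python
import ipaddress
from datetime import datetime, timezone

def clean_alerts(raw_alerts):
    """Normalize and de-duplicate heterogeneous alert records:
    canonicalize IPs, parse epoch timestamps to UTC, unify signature
    case, drop malformed rows, and collapse repeats within a minute."""
    cleaned, seen = [], set()
    for a in raw_alerts:
        try:
            src = str(ipaddress.ip_address(a["src"].strip()))  # canonical IP
            ts = datetime.fromtimestamp(float(a["ts"]), tz=timezone.utc)
            sig = a["sig"].strip().lower()                     # unify case
        except (KeyError, ValueError):
            continue                                           # drop malformed rows
        key = (src, sig, ts.replace(second=0, microsecond=0))  # 1-min window
        if key not in seen:
            seen.add(key)
            cleaned.append({"src": src, "ts": ts.isoformat(), "sig": sig})
    return cleaned

raw = [
    {"src": "10.0.0.1 ", "ts": "1700000000", "sig": "Port-Scan"},
    {"src": "10.0.0.1", "ts": "1700000005", "sig": "port-scan"},  # same burst
    {"src": "bad-ip", "ts": "1700000000", "sig": "x"},            # malformed
]
print(clean_alerts(raw))  # one normalized record survives
```

Real SOC pipelines would add per-source parsers and schema mapping before a step like this, since alert formats differ across sensors.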