Towards an aspect weaving BPEL engine
This position paper proposes the use of dynamic aspects and
the visitor design pattern to obtain a highly configurable and
extensible BPEL engine. Using these two techniques, the
core of this infrastructural software can be customised to
meet new requirements and add features such as debugging,
execution monitoring, or changing to another Web Service
selection policy. Additionally, it can easily be extended to
cope with customer-specific BPEL extensions. We propose
the use of dynamic aspects not only on the engine itself
but also on the workflow in order to tackle the problems of
Web Service hot deployment and hot fixes to long running
processes. In this way, composing a Web Service "on-the-fly"
means weaving its choreography interface into the workflow.
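The extensibility idea above can be sketched in a few lines. This is a minimal illustration in Python (the activity and visitor names are hypothetical, not taken from the paper): the engine's activity tree stays fixed, while a visitor plugs in a cross-cutting feature such as execution monitoring without touching the core classes.

```python
# Visitor pattern over a toy BPEL-like activity tree: the engine core
# (Activity subclasses) is closed for modification, while new concerns
# (monitoring, debugging) are added as visitors.

class Activity:
    def accept(self, visitor):
        raise NotImplementedError

class Invoke(Activity):
    def __init__(self, service):
        self.service = service

    def accept(self, visitor):
        return visitor.visit_invoke(self)

class Sequence(Activity):
    def __init__(self, *children):
        self.children = children

    def accept(self, visitor):
        # traversal logic lives in the tree; behaviour lives in the visitor
        return [child.accept(visitor) for child in self.children]

class MonitoringVisitor:
    """Execution monitoring added without changing any Activity class."""
    def __init__(self):
        self.log = []

    def visit_invoke(self, invoke):
        self.log.append(f"invoking {invoke.service}")
        return invoke.service

workflow = Sequence(Invoke("ShippingService"), Invoke("BillingService"))
monitor = MonitoringVisitor()
workflow.accept(monitor)
print(monitor.log)  # ['invoking ShippingService', 'invoking BillingService']
```

Swapping in a different visitor (say, a debugger that pauses before each invoke) customises the engine in the way the abstract describes, without recompiling the core.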
Process control and configuration of a reconfigurable production system using a multi-agent software system
Thesis (M. Tech. (Information Technology)) -- Central University of Technology, Free State, 2011. Traditional designs for component-handling platforms are rigidly linked to the product being produced. Control and monitoring methods for these platforms consist of various proprietary hardware controllers containing the control logic for the production process. Should the configuration of the component-handling platform change, the controllers need to be taken offline and reprogrammed to take the changes into account.
The current thinking in component-handling system design is the notion of re-configurability. Reconfigurability means that with minimum or no downtime the system can be adapted to produce another product type or overcome a device failure. The re-configurable component-handling platform is built up from groups of independent devices. These groups or cells are each responsible for some aspect of the overall production process. By moving or swapping different versions of these cells within the component-handling platform, re-configurability is achieved. Such a dynamic system requires a flexible communications platform and high-level software control architecture to accommodate the reconfigurable nature of the system.
This work represents the design and testing of the core of a re-configurable production control software platform. Multiple software components work together to control and monitor a re-configurable component handling platform.
The design and implementation of a production database, production ontology, communications architecture and the core multi-agent control application linking all these components together are presented.
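The cell-swapping idea can be made concrete with a small sketch. This is a toy Python illustration under assumed names (CellAgent, Coordinator, and the cell operations are all hypothetical): the coordinator only knows the abstract cell interface, so a cell can be replaced without taking the control logic offline and reprogramming it.

```python
# Toy model of re-configurability: independent cell agents are registered
# with a coordinator, and a cell can be swapped for a different version
# without changing the coordinator's control logic.

class CellAgent:
    def __init__(self, name, operation):
        self.name = name
        self.operation = operation

    def process(self, part):
        return f"{part}->{self.operation}"

class Coordinator:
    def __init__(self):
        self.cells = []

    def register(self, cell):
        self.cells.append(cell)

    def swap(self, old_name, new_cell):
        # reconfigure the line by replacing one cell in place
        self.cells = [new_cell if c.name == old_name else c
                      for c in self.cells]

    def run(self, part):
        for cell in self.cells:
            part = cell.process(part)
        return part

line = Coordinator()
line.register(CellAgent("cell1", "drill"))
line.register(CellAgent("cell2", "weld"))
print(line.run("blank"))                          # blank->drill->weld
line.swap("cell2", CellAgent("cell2b", "glue"))   # reconfigure the platform
print(line.run("blank"))                          # blank->drill->glue
```

In the thesis the coordination runs over a multi-agent communications architecture rather than direct method calls, but the decoupling principle is the same.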
Statistical properties and privacy guarantees of an original distance-based fully synthetic data generation method
Introduction: The amount of data generated by original research is growing
exponentially. Publicly releasing them is recommended to comply with the Open
Science principles. However, data collected from human participants cannot be
released as-is without raising privacy concerns. Fully synthetic data represent
a promising answer to this challenge. This approach is explored by the French
Centre de Recherche en Épidémiologie et Santé des Populations in
the form of a synthetic data generation framework based on Classification and
Regression Trees and an original distance-based filtering. The goal of this
work was to develop a refined version of this framework and to assess its
risk-utility profile with empirical and formal tools, including novel ones
developed for the purpose of this evaluation.
Materials and Methods: Our
synthesis framework consists of four successive steps, each of which is
designed to prevent specific risks of disclosure. We assessed its performance
by applying two or more of these steps to a rich epidemiological dataset.
Privacy and utility metrics were computed for each of the resulting synthetic
datasets, which were further assessed using machine learning approaches.
Results: Computed metrics showed a satisfactory level of protection
against attribute disclosure attacks for each synthetic dataset, especially
when the full framework was used. Membership disclosure attacks were formally
prevented without significantly altering the data. Machine learning approaches
showed a low risk of success for simulated singling out and linkability
attacks. Distributional and inferential similarity with the original data were
high with all datasets.
Discussion: This work showed the technical feasibility
of generating publicly releasable synthetic data using a multi-step framework.
Formal and empirical tools specifically developed for this demonstration are a
valuable contribution to this field. Further research should focus on the
extension and validation of these tools, in an effort to specify the intrinsic
qualities of alternative data synthesis methods.
Conclusion: By successfully
assessing the quality of data produced using a novel multi-step synthetic data
generation framework, we showed the technical and conceptual soundness of the
Open-CESP initiative, which seems ripe for full-scale implementation.
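Two of the framework's ideas can be sketched compactly: sequential synthesis, where each variable is drawn conditionally on the ones already generated, and distance-based filtering, which discards synthetic records that lie too close to a real record. The following toy Python sketch uses a one-level split as a stand-in for the CART models the framework actually fits, and the data, threshold, and function names are all illustrative assumptions, not the Open-CESP implementation.

```python
import random

# Toy sequential synthesis with distance-based filtering.
# real: a tiny (sex, age) dataset standing in for the epidemiological data.
random.seed(0)
real = [("F", 34), ("F", 36), ("M", 51), ("M", 49), ("F", 35), ("M", 50)]

def synthesize(n):
    rows = []
    for _ in range(n):
        sex = random.choice([r[0] for r in real])           # marginal draw
        ages = [r[1] for r in real if r[0] == sex]          # conditional on sex
        age = random.choice(ages) + random.randint(-2, 2)   # perturbed draw
        rows.append((sex, age))
    return rows

def filter_close(synth, min_dist=1):
    # distance-based filtering: drop any synthetic record whose age is
    # within min_dist of a same-sex real record, so no real record can
    # be re-identified from its synthetic neighbour.
    return [s for s in synth
            if all(abs(s[1] - r[1]) >= min_dist
                   for r in real if r[0] == s[0])]

synth = filter_close(synthesize(20))
print(all(s not in real for s in synth))  # True: no real record is copied
```

The filtering step is what formally prevents a synthetic record from being an exact disclosure of a real one; the price is a loss of records near the real data, which is why the abstract evaluates the risk-utility trade-off of each step.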
Big Data Analytics for Complex Systems
The evolution of technology in all fields has led to the generation of vast amounts of data by modern systems. Using data to extract information, make predictions, and make decisions is the current trend in artificial intelligence. The advancement of big data analytics tools has made accessing and storing data easier and faster than ever, and machine learning algorithms help to identify patterns in and extract information from data. The current tools and machines in health, computer technologies, and manufacturing can generate massive raw data about their products or samples. The author of this work proposes a modern integrative system that can utilize big data analytics, machine learning, super-computer resources, and industrial health machines’ measurements to build a smart system that can mimic the human intelligence skills of observation, detection, prediction, and decision-making. The applications of the proposed smart systems are included as case studies to highlight the contributions of each system. The first contribution is the ability to utilize revolutionary big data and deep learning technologies on production lines to diagnose incidents and take proper action. In the current digital transformational industrial era, Industry 4.0 has been receiving research attention because it can be used to automate production-line decisions. Reconfigurable manufacturing systems (RMS) have been widely used to reduce the setup cost of restructuring production lines. However, the current RMS modules are not linked to the cloud for online decision-making; these modules must connect to an online server (super-computer) that has big data analytics and machine learning capabilities. Online here means that data is centralized in the cloud (on a supercomputer) and accessible in real time.
In this study, deep neural networks are utilized to detect the decisive features of a product and build a prediction model with which the iFactory will make the necessary decision for defective products. The Spark ecosystem is used to manage the access, processing, and storage of the streaming big data. This contribution is implemented as a closed cycle which, to the best of our knowledge, is the first in the literature to introduce big data analysis using deep learning for real-time applications in a manufacturing system. The implementation shows a high accuracy of 97% for classifying normal versus defective items. The second contribution, which is in bioinformatics, is the ability to build supervised machine learning approaches based on the gene expression of patients to predict the proper treatment for breast cancer. In the trial, to personalize treatment, the machine learns the genes that are active in the patient cohort with a five-year survival period. The initial condition here is that each group must only undergo one specific treatment. After learning about each group (or class), the machine can personalize the treatment of a new patient by diagnosing the patient's gene expression. The proposed model will help in the diagnosis and treatment of the patient. The future work in this area involves building a protein-protein interaction network with the selected genes for each treatment to first analyze the motifs of the genes and target them with the proper drug molecules. In the learning phase, a couple of feature-selection techniques and supervised standard classifiers are used to build the prediction model. Most of the nodes show high performance measurements, where accuracy, sensitivity, specificity, and F-measure range around 100%. The third contribution is the ability to build semi-supervised learning for the breast cancer survival treatment that advances the second contribution.
By understanding the relations between the classes, we can design the machine learning phase based on the similarities between classes. In the proposed research, the researcher used the Euclidean distance matrix among the survival treatment classes to build the hierarchical learning model. The distance information, learned through an unsupervised approach, can help the prediction model to select the classes that are far from each other, maximizing the distance between classes and yielding wider class groups. The performance measurement of this approach shows a slight improvement over the second model. However, this model reduced the number of discriminative genes from 47 to 37. The model in the second contribution studies each class individually, while this model focuses on the relationships between the classes and uses this information in the learning phase. Hierarchical clustering is completed to draw the borders between groups of classes before building the classification models. Several distance measurements are tested to identify the best linkages between classes. Most of the nodes show high performance measurements, where accuracy, sensitivity, specificity, and F-measure range from 90% to 100%. All the case study models showed high performance measurements in the prediction phase. These modern models can be replicated for different problems within different domains. The comprehensive models of the newer technologies are reconfigurable and modular; any newer learning phase can be plugged in at both ends of the learning phase. Therefore, the output of the system can be an input for another learning system, and a newer feature can be added to the input to be considered for the learning phase.
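The class-distance idea above can be illustrated with a small sketch. This toy Python example (the class labels and centroid values are invented for illustration) computes pairwise Euclidean distances between class mean-expression profiles and identifies the closest pair, which is the first merge a single-linkage hierarchical clustering would perform before the borders between class groups are drawn.

```python
import math

# Mean expression profiles (centroids) for three hypothetical
# survival-treatment classes over three genes.
centroids = {
    "treatA": [1.0, 0.9, 0.1],
    "treatB": [1.1, 1.0, 0.2],   # close to treatA
    "treatC": [0.1, 0.2, 1.5],   # far from both
}

def dist(u, v):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Enumerate all unordered class pairs with their distances; the smallest
# distance gives the first merge of a single-linkage hierarchy.
pairs = sorted(
    (dist(centroids[a], centroids[b]), a, b)
    for a in centroids for b in centroids if a < b)
closest = pairs[0]
print(closest[1], closest[2])  # treatA treatB
```

Grouping the two nearby classes first, and separating them from the distant one, is what lets the hierarchical model "maximize the distance between classes" as the abstract puts it.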
The Benefits and Costs of Online Privacy Legislation
Many people are concerned that information about their private life is more readily available and more easily captured on the Internet as compared to offline technologies. Specific concerns include unwanted email, credit card fraud, identity theft, and harassment. This paper analyzes key issues surrounding the protection of online privacy. It makes three important contributions: First, it provides the most comprehensive assessment to date of the estimated benefits and costs of regulating online privacy. Second, it provides the most comprehensive evaluation of legislation and legislative proposals in the U.S. aimed at protecting online privacy. Finally, it offers some policy prescriptions for the regulation of online privacy and suggests areas for future research. After analyzing the current debate on online privacy and assessing the potential costs and benefits of proposed regulations, our specific recommendations concerning the government's involvement in protecting online privacy include the following: The government should fund research that evaluates the effectiveness of existing privacy legislation before considering new regulations. The government should not generally regulate matters of privacy differently based on whether an issue arises online or offline. The government should not require a Web site to provide notification of its privacy policy because the vast majority of commercial U.S.-based Web sites already do so. The government should distinguish between how it regulates the use and dissemination of highly sensitive information, such as certain health records or Social Security numbers, versus more general information, such as consumer name and purchasing habits. The government should not require companies to provide consumers broad access to the personal information that is collected online for marketing purposes because the benefits do not appear to be significant and the costs could be quite high. 
The government should make it easier for the public to obtain information on online privacy and the tools available for consumers to protect their own privacy. The message of this paper is not that online privacy should be unregulated, but rather that policy makers should think through their options carefully, weighing the likely costs and benefits of each proposal.
Simulation modelling: Problem understanding in healthcare management
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. One of the main problems that face decision makers in healthcare systems is complexity and the lack of a well-defined problem. This causes a lack of understanding about the system. Another problem associated with healthcare systems is that usually there are several stakeholders involved in decision making. In such cases different stakeholders may have different views about the problem. In addition to the lack of understanding and intercommunication, there is the tendency in healthcare management to use quantitative methods for analysing the system. These methods are highly data dependent and usually based on historical data, which may not reflect the system's performance under the present circumstances, given the changing pace of healthcare services and structure. Also, data may not be available in the first place.
This research looks at how modelling techniques may help healthcare stakeholders to understand their system and increase their level of intercommunication (in the case of multiple stakeholders) with minimum dependency on data. Two main aspects are considered in this research: first, appraising the existing modelling techniques with regard to problem understanding and intercommunication, and second, looking for an effective modelling approach for achieving such objectives. Discrete Event Simulation (DES) offers good facilities for modelling for understanding. However, DES could be used more effectively to enable viable understanding and means of communication. It is assumed that, in order to enhance stakeholders' understanding and intercommunication, it is better to involve them in the process of modelling from the beginning, using an iterative modelling process, and without being restricted to logical steps.
To achieve this a case study strategy is followed in order to devise a modelling framework that helps in enhancing stakeholders' understanding and intercommunication. In this particular research Single Case approach is employed using two case studies. The first case study is used as an attempt to evaluate the hypotheses and tackle research questions which are raised based on an analysis of findings from the literature. The experimentation and analysis part are used to refine the initial hypotheses. These hypotheses are then examined using the second case study to establish a picture about how to achieve the research objectives. In both case studies simulation modelling is examined with regard to the research questions.
The thesis concludes by identifying a modelling approach that has high versatility and flexibility to enhance stakeholders' understanding and intercommunication. The approach is called MAPIU2, which stands for a Modelling Approach that is Iterative Participative for Understanding. From its name it can be deduced that the main factors of this approach are based on involving the stakeholders in the modelling process from the beginning in an iterative manner. One of the main lessons learned is that, to achieve better results from simulation modelling, stakeholders should be involved with the modelling process rather than just receiving the final results, which helps in implementing any decisions or recommendations arising from the model.
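The mechanism underneath DES is simple enough to sketch. This minimal Python example (the clinic events are invented for illustration) shows the core of any discrete-event simulator: a priority queue of timestamped events, with a clock that jumps from event to event rather than advancing in fixed steps, which is what lets a model replay patient flow for stakeholders.

```python
import heapq

# A minimal discrete-event simulation loop: events are (time, description)
# pairs in a priority queue, processed in time order regardless of the
# order in which they were scheduled.
events = []
heapq.heappush(events, (0, "patient 1 arrives"))
heapq.heappush(events, (5, "patient 1 seen by nurse"))
heapq.heappush(events, (3, "patient 2 arrives"))

trace = []
while events:
    time, what = heapq.heappop(events)   # clock jumps to the next event
    trace.append((time, what))

print(trace)  # replayed in time order: t=0, t=3, t=5
```

A real DES model would, while handling one event, schedule future events (service completions, departures); the iterative, participative modelling the thesis advocates amounts to building and refining that event logic together with the stakeholders.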
Space Station Freedom data management system growth and evolution report
The Information Sciences Division at the NASA Ames Research Center has completed a 6-month study of portions of the Space Station Freedom Data Management System (DMS). This study looked at the present capabilities and future growth potential of the DMS, and the results are documented in this report. Issues have been raised that were discussed with the appropriate Johnson Space Center (JSC) management and Work Package-2 contractor organizations. Areas requiring additional study have been identified, and suggestions for long-term upgrades have been proposed. This activity has allowed the Ames personnel to develop a rapport with the JSC civil service and contractor teams that permits an independent check-and-balance technique for the DMS.
New Statistical Algorithms for the Analysis of Mass Spectrometry Time-Of-Flight Mass Data with Applications in Clinical Diagnostics
Mass spectrometry (MS) based techniques have emerged as a standard for
large-scale protein analysis. The ongoing progress in terms of more sensitive
machines and improved data analysis algorithms led to a constant expansion of
its fields of applications. Recently, MS was introduced into clinical proteomics
with the prospect of early disease detection using proteomic pattern matching.
Analyzing biological samples (e.g. blood) by mass spectrometry generates
mass spectra that represent the components (molecules) contained in a
sample as masses and their respective relative concentrations.
In this work, we are interested in those components that are constant within a
group of individuals but differ much between individuals of two distinct groups.
These distinguishing components, which depend on a particular medical condition,
are generally called biomarkers. Since not all biomarkers found by the
algorithms are of equal (discriminating) quality we are only interested in a
small biomarker subset that - as a combination - can be used as a
fingerprint for a disease. Once a fingerprint for a particular disease
(or medical condition) is identified, it can be used in clinical diagnostics to
classify unknown spectra.
In this thesis we have developed new algorithms for automatic extraction of
disease specific fingerprints from mass spectrometry data. Special emphasis has
been put on designing highly sensitive methods with respect to signal detection.
Thanks to our statistically based approach, our methods are able to detect
signals, such as hormones, even below the noise level inherent in data acquired
by common MS machines.
To provide access to these new classes of algorithms to collaborating groups
we have created a web-based analysis platform that provides all necessary
interfaces for data transfer, data analysis and result inspection.
To prove the platform's practical relevance it has been utilized in several
clinical studies two of which are presented in this thesis. In these studies it
could be shown that our platform is superior to commercial systems with respect
to fingerprint identification. As an outcome of these studies several
fingerprints for different cancer types (bladder, kidney, testicle, pancreas,
colon and thyroid) have been detected and validated. The clinical partners in
fact emphasize that these results would be impossible with a less sensitive
analysis tool (such as the currently available systems).
In addition to the issue of reliably finding and handling signals in noise we
faced the problem to handle very large amounts of data, since an average dataset
of an individual is about 2.5 Gigabytes in size and we have data of hundreds to
thousands of persons. To cope with these large datasets, we developed a new
framework for a heterogeneous (quasi) ad-hoc Grid - an infrastructure that
allows the integration of thousands of computing resources (e.g. desktop
computers, computing clusters, or specialized hardware such as IBM's Cell
processor in a PlayStation 3).
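The fingerprint-extraction idea can be illustrated with a small sketch. This toy Python example (the m/z channels, intensities, and threshold are invented for illustration, not the thesis's actual statistics) keeps only channels whose intensity is stable within each group but differs strongly between groups, and uses that small subset as the disease fingerprint.

```python
import statistics

# Toy intensities for two m/z channels across two groups of spectra.
healthy = {"mz_1200": [10, 11, 10, 9],  "mz_3400": [5, 6, 5, 6]}
disease = {"mz_1200": [10, 10, 11, 10], "mz_3400": [14, 15, 13, 14]}

def score(a, b):
    """Separation between group means relative to within-group spread
    (a crude stand-in for a proper test statistic)."""
    spread = statistics.stdev(a) + statistics.stdev(b) + 1e-9
    return abs(statistics.mean(a) - statistics.mean(b)) / spread

# Keep only channels that discriminate the groups strongly.
fingerprint = [mz for mz in healthy
               if score(healthy[mz], disease[mz]) > 2.0]
print(fingerprint)  # ['mz_3400'] - only this channel separates the groups
```

Real spectra add the hard parts the thesis addresses: peaks must first be detected in noisy, gigabyte-scale raw data, which is why the sensitive signal-detection methods and the Grid infrastructure matter.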
- …