A survey on cost-effective context-aware distribution of social data streams over energy-efficient data centres
Social media have emerged in the last decade as a viable and ubiquitous means of communication. The ease of user content generation within these platforms, e.g. check-in information and multimedia data, along with the proliferation of Global Positioning System (GPS)-enabled, always-connected capture devices, has led to data streams of unprecedented volume and a radical change in information sharing. Social data streams raise a variety of practical challenges, including the derivation of real-time meaningful insights from effectively gathered social information, as well as a paradigm shift in content distribution that leverages contextual data associated with user preferences, geographical characteristics and devices in general. In this article we present a comprehensive survey that outlines the state of the art and organizes the challenges concerning social media streams and the infrastructure of the data centres supporting efficient access to those streams, in terms of content distribution, data diffusion, data replication, energy efficiency and network infrastructure. We systematize the existing literature and identify and analyse the main research points and industrial efforts in the area as far as modelling, simulation and performance evaluation are concerned.
Contextual Analysis of Large-Scale Biomedical Associations for the Elucidation and Prioritization of Genes and their Roles in Complex Disease
Vast amounts of biomedical associations are easily accessible in public resources, spanning gene-disease associations, tissue-specific gene expression, gene function and pathway annotations, and many other data types. Despite this mass of data, information most relevant to the study of a particular disease remains loosely coupled and difficult to incorporate into ongoing research. Current public databases are difficult to navigate and do not interoperate well due to the plethora of interfaces and varying biomedical concept identifiers used. Because no coherent display of data within a specific problem domain is available, finding the latent relationships associated with a disease of interest is impractical.
This research describes a method for extracting the contextual relationships embedded within associations relevant to a disease of interest. After applying the method to a small test data set, a large-scale integrated association network is constructed for application of a network propagation technique that helps uncover more distant latent relationships. Together these methods are adept at uncovering highly relevant relationships without any a priori knowledge of the disease of interest.
The combined contextual search and relevance methods power a tool that makes pertinent biomedical associations easier to find, easier to assimilate into ongoing work, and more prominent than in currently available databases. Increasing the accessibility of current information is an important component of understanding high-throughput experimental results and surviving the data deluge.
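The abstract does not specify which network propagation technique is used; a common choice for uncovering distant latent relationships in an association network is random walk with restart (RWR). Below is a minimal sketch under that assumption; the function name, parameters and restart value are illustrative, not the authors' implementation, and the network is assumed undirected with no isolated nodes.

```python
import numpy as np

def network_propagation(adj, seeds, restart=0.3, tol=1e-8, max_iter=1000):
    """Random walk with restart over an association network.

    adj:   symmetric adjacency matrix (every node has >= 1 edge)
    seeds: indices of entities known to be associated with the disease
    Returns a relevance score per node; scores decay with network
    distance from the seed set, surfacing latent relationships.
    """
    # Column-normalise so each step redistributes probability mass.
    W = adj / adj.sum(axis=0, keepdims=True)
    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)
    p = p0.copy()
    for _ in range(max_iter):
        # With probability `restart`, jump back to the seed set.
        p_next = (1 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            break
        p = p_next
    return p_next
```

On a chain graph seeded at one end, the converged scores decrease monotonically with distance from the seed, which is the behaviour that lets such methods rank relationships without a priori knowledge of the disease.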
Designing Resilient Manufacturing Systems In the Presence of Change
Economic and technical changes force manufacturers to redesign and enhance their operational systems. The implications of such changes within a complex system such as manufacturing and the supply chain can be very challenging. In particular, where the number of system elements and their connections results in a high level of complexity, the potential effects of a change can be costly with respect to delivery time and cost targets, as a change to one part or element of a design requires additional changes throughout the system.
Companies need to understand the characteristics of their manufacturing systems that make them resilient to change. Considered from a system perspective, the structures of the system, and its elements and connections, contribute greatly to the characteristics and behavior of the system and hence potential resilience. A change prediction method can help to analyse the change properties and improve complex systems by focusing on the underlying structural elements and dependencies.
This thesis proposes a novel system change method that enables a review of the current manufacturing system and an understanding of how to design a more robust or adaptable system that addresses resilience. The method combines matrix-based approaches with methods to assess the interaction between elements of the product and its manufacturing process, in order to understand the risk of changes propagating through the system. Risk assessment across the layers of a system can give valuable insight into how a change to one element interacts with the rest of the system. The goal of this thesis is to contribute to a fundamental understanding of manufacturing system resilience by developing a method to evaluate change propagation, assess performance robustness or adaptability, and achieve high resilience.
Universal Oil Products (a Honeywell Company); Laing O'Rourke
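Matrix-based change prediction of the kind described typically works from two element-by-element matrices: a likelihood that a change to one element forces a change to another, and the impact if it does. The sketch below is a much simplified illustration of that idea (combining direct and two-hop propagation paths as if they were independent), not the thesis's actual method; all names are hypothetical.

```python
import numpy as np

def combined_change_risk(L, I):
    """Simplified change-propagation risk between system elements.

    L[i, j]: likelihood that a change to element j causes a change to i.
    I[i, j]: impact (e.g. cost fraction) on i of that propagated change.
    A single step's risk is likelihood * impact; a multi-hop path
    multiplies step risks, and parallel paths combine as
    1 - prod(1 - path_risk), assuming path independence.
    """
    n = L.shape[0]
    step_risk = L * I
    no_risk = 1.0 - step_risk            # start from the direct paths
    # Fold in every two-hop path i <- k <- j.
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if k in (i, j):
                    continue
                path = step_risk[i, k] * step_risk[k, j]
                no_risk[i, j] *= 1.0 - path
    return 1.0 - no_risk
```

Even this toy version shows the structural point the thesis makes: elements with no direct dependency can still carry non-zero change risk through intermediate elements, which is why focusing on underlying structure and dependencies matters for resilience.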
CLOI-NET: Class segmentation of industrial facilities' point cloud datasets
Shape segmentation from point cloud data is a core step of the digital twinning process for industrial facilities. However, it is also a very labor-intensive step, which counteracts the perceived value of the resulting model. The state-of-the-art method for automating cylinder detection achieves 62% precision and 70% recall, while other shapes must still be segmented manually, so full shape segmentation is not achieved. This performance is promising, but far from eliminating the manual labor cost. We argue that class segmentation deep learning algorithms have the theoretical potential to perform better in terms of per-point accuracy while requiring less manual segmentation time. However, such algorithms could not be used so far due to the lack of a dataset of laser-scanned industrial shapes for pre-training, as well as the lack of appropriate geometric features from which to learn these shapes. In this paper, we tackle both problems in three steps. First, we parse the industrial point cloud through a novel class segmentation solution (CLOI-NET) that consists of an optimized PointNET++-based deep learning network and post-processing algorithms that enforce stronger contextual relationships per point. We then allow the user to choose the optimal manual annotation of a test facility by means of active learning to further improve the results. We achieve the first step by clustering points into meaningful spatial 3D windows based on their location. Then we apply a class segmentation deep network, output a probability distribution over all label categories per point, and improve the predicted labels by enforcing post-processing rules. We finally optimize the results by finding the optimal amount of data to be used for training experiments.
We validate our method on the largest richly annotated dataset of the industrial shapes that are most important to model (CLOI), and achieve 82% average per-point accuracy, 95.6% average AUC across all classes and an estimated 70% saving in labor hours for class segmentation. Our method is thus the first to automatically segment industrial point cloud shapes with no prior knowledge at commercially viable performance, and forms the foundation for efficient industrial shape modeling in cluttered point clouds.
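The abstract describes clustering points into spatial 3D windows before feeding them to the network, but gives no detail. A minimal sketch of one plausible reading, axis-aligned fixed-size blocking by floor division of coordinates, is shown below; the function name and window size are illustrative assumptions, not the paper's implementation.

```python
from collections import defaultdict
import numpy as np

def spatial_windows(points, window=2.0):
    """Group points into axis-aligned 3D windows of side `window`
    (e.g. metres) so that each block can be processed separately
    by a per-point class segmentation network.

    points: (N, 3+) array; columns 0..2 are x, y, z.
    Returns {window_key: array of point indices in that window}.
    """
    # Integer window coordinates: floor(coord / window) per axis.
    keys = np.floor(points[:, :3] / window).astype(int)
    blocks = defaultdict(list)
    for idx, key in enumerate(map(tuple, keys)):
        blocks[key].append(idx)
    return {k: np.asarray(v) for k, v in blocks.items()}
```

Blocking like this keeps each network input to a bounded spatial extent, which is the usual motivation for windowing large facility scans before PointNet++-style per-point inference.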
The Use and Misuse of Biomedical Data: Is Bigger Really Better?
Very large biomedical research databases, containing electronic health records (EHR) and genomic data from millions of patients, have recently been heralded for their potential to accelerate scientific discovery and produce dramatic improvements in medical treatments. Research enabled by these databases may also lead to profound changes in law, regulation, social policy, and even litigation strategies. Yet, is "big data" necessarily better data?
This paper makes an original contribution to the legal literature by focusing on what can go wrong in the process of biomedical database research and what precautions are necessary to avoid critical mistakes. We address three main reasons for a cautious approach to such research and to relying on its outcomes for purposes of public policy or litigation. First, the data contained in databases is surprisingly likely to be incorrect or incomplete. Second, systematic biases, arising from both the nature of the data and the preconceptions of investigators, are serious threats to the validity of biomedical database research, especially in answering causal questions. Third, data mining of biomedical databases makes it easier for individuals with political, social, or economic agendas to generate ostensibly scientific but misleading research findings for the purpose of manipulating public opinion and swaying policy makers.
In short, this paper sheds much-needed light on the problems of credulous and uninformed uses of biomedical databases. An understanding of the pitfalls of big data analysis is of critical importance to anyone who will rely on or dispute its outcomes, including lawyers, policy makers, and the public at large. The article also recommends technical, methodological, and educational interventions to combat the dangers of database errors and abuses.