Search CORE

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

User oriented access to secure biomedical resources through the grid

Author: Ajayi O.
Jiang J.
Sinnott R.O.
Stell A.J.
Watt J.
Publication venue
Publication date: 01/01/2006
Field of study

The life science domain is typified by heterogeneous data sets that are evolving at an exponential rate. Numerous post-genomic databases and areas of post-genomic life science research have been established and are being actively explored. Whilst many of these databases are public and freely accessible, it is often the case that researchers have data that is not so freely available and access to this data needs to be strictly controlled when distributed collaborative research is undertaken. Grid technologies provide one mechanism by which access to and integration of federated data sets is possible. Combining such data access and integration technologies with fine grained security infrastructures facilitates the establishment of virtual organisations (VO). However experience has shown that the general research (non-Grid) community are not comfortable with the Grid and its associated security models based upon public key infrastructures (PKIs). The Internet2 Shibboleth technology helps to overcome this through users only having to log in to their home site to gain access to resources across a VO – or in Shibboleth terminology a federation. In this paper we outline how we have applied the combination of Grid technologies, advanced security infrastructures and the Internet2 Shibboleth technology in several biomedical projects to provide a user-oriented model for secure access to and usage of Grid resources. We believe that this model may well become the de facto mechanism for undertaking e-Research on the Grid across numerous domains including the life sciences

CiteSeerX

User-oriented security supporting inter-disciplinary life science research across the grid

Author: Ajayi O.
Jiang J.
Sinnott R.O.
Stell A.J.
Watt J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007
Field of study

Understanding potential genetic factors in disease or development of personalised e-Health solutions require scientists to access a multitude of data and compute resources across the Internet from functional genomics resources through to epidemiological studies. The Grid paradigm provides a compelling model whereby seamless access to these resources can be achieved. However, the acceptance of Grid technologies in this domain by researchers and resource owners must satisfy particular constraints from this community - two of the most critical of these constraints being advanced security and usability. In this paper we show how the Internet2 Shibboleth technology combined with advanced authorisation infrastructures can help address these constraints. We demonstrate the viability of this approach through a selection of case studies across the complete life science spectrum

CiteSeerX

Making Thin Data Thick: User Behavior Analysis with Minimum Information

Author
Publication venue
Publication date: 01/01/2015
Field of study

abstract: With the rise of social media, user-generated content has become available at an unprecedented scale. On Twitter, 1 billion tweets are posted every 5 days and on Facebook, 20 million links are shared every 20 minutes. These massive collections of user-generated content have introduced the human behavior's big-data. This big data has brought about countless opportunities for analyzing human behavior at scale. However, is this data enough? Unfortunately, the data available at the individual-level is limited for most users. This limited individual-level data is often referred to as thin data. Hence, researchers face a big-data paradox, where this big-data is a large collection of mostly limited individual-level information. Researchers are often constrained to derive meaningful insights regarding online user behavior with this limited information. Simply put, they have to make thin data thick. In this dissertation, how human behavior's thin data can be made thick is investigated. The chief objective of this dissertation is to demonstrate how traces of human behavior can be efficiently gleaned from the, often limited, individual-level information; hence, introducing an all-inclusive user behavior analysis methodology that considers social media users with different levels of information availability. To that end, the absolute minimum information in terms of both link or content data that is available for any social media user is determined. Utilizing only minimum information in different applications on social media such as prediction or recommendation tasks allows for solutions that are (1) generalizable to all social media users and that are (2) easy to implement. However, are applications that employ only minimum information as effective or comparable to applications that use more information? In this dissertation, it is shown that common research challenges such as detecting malicious users or friend recommendation (i.e., link prediction) can be effectively performed using only minimum information. More importantly, it is demonstrated that unique user identification can be achieved using minimum information. Theoretical boundaries of unique user identification are obtained by introducing social signatures. Social signatures allow for user identification in any large-scale network on social media. The results on single-site user identification are generalized to multiple sites and it is shown how the same user can be uniquely identified across multiple sites using only minimum link or content information. The findings in this dissertation allows finding the same user across multiple sites, which in turn has multiple implications. In particular, by identifying the same users across sites, (1) patterns that users exhibit across sites are identified, (2) how user behavior varies across sites is determined, and (3) activities that are observed only across sites are identified and studied.Dissertation/ThesisDoctoral Dissertation Computer Science 201

ASU Digital Repository

Security-oriented data grids for microarray expression profiles

Author: Bayliss C.
Jiang J.
Sinnott R.O.
Publication venue: 'IOS Press'
Publication date: 01/01/2007
Field of study

Microarray experiments are one of the key ways in which gene activity can be identified and measured thereby shedding light and understanding for example on biological processes. The BBSRC funded Grid enabled Microarray Expression Profile Search (GEMEPS) project has developed an infrastructure which allows post-genomic life science researchers to ask and answer the following questions: who has undertaken microarray experiments that are in some way similar or relevant to mine; and how similar were these relevant experiments? Given that microarray experiments are expensive to undertake and may possess crucial information for future exploitation (both academically and commercially), scientists are wary of allowing unrestricted access to their data by the wider community until fully exploited locally. A key requirement is thus to have fine grained security that is easy to establish and simple (or ideally transparent) to use across inter-institutional virtual organisations. In this paper we present an enhanced security-oriented data Grid infrastructure that supports the definition of these kinds of queries and the analysis and comparison of microarray experiment results

arXiv.org e-Print Archive

An Army of Me: Sockpuppets in Online Discussion Communities

Author: Argamon S.
Cheng J.
Gilbert R. L.
Hutto C. J.
Kafai Y. B.
Mukherjee A.
Paul P. P.
Pennebaker J. W.
Qian T.
Seife C.
Silvestri G.
Solorio T.
Stone B.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 21/03/2017
Field of study

In online discussion communities, users can interact and share information and opinions on a wide variety of topics. However, some users may create multiple identities, or sockpuppets, and engage in undesired behavior by deceiving others or manipulating discussions. In this work, we study sockpuppetry across nine discussion communities, and show that sockpuppets differ from ordinary users in terms of their posting behavior, linguistic traits, as well as social network structure. Sockpuppets tend to start fewer discussions, write shorter posts, use more personal pronouns such as "I", and have more clustered ego-networks. Further, pairs of sockpuppets controlled by the same individual are more likely to interact on the same discussion at the same time than pairs of ordinary users. Our analysis suggests a taxonomy of deceptive behavior in discussion communities. Pairs of sockpuppets can vary in their deceptiveness, i.e., whether they pretend to be different users, or their supportiveness, i.e., if they support arguments of other sockpuppets controlled by the same user. We apply these findings to a series of prediction tasks, notably, to identify whether a pair of accounts belongs to the same underlying user or not. Altogether, this work presents a data-driven view of deception in online discussion communities and paves the way towards the automatic detection of sockpuppets.Comment: 26th International World Wide Web conference 2017 (WWW 2017

arXiv.org e-Print Archive

Web Tracking: Mechanisms, Implications, and Defenses

Author: Barlet-Ros Pere
Bujlow Tomasz
Carela-Español Valentín
Solé-Pareta Josep
Publication venue
Publication date: 01/01/2015
Field of study

This articles surveys the existing literature on the methods currently used by web services to track the user online as well as their purposes, implications, and possible user's defenses. A significant majority of reviewed articles and web resources are from years 2012-2014. Privacy seems to be the Achilles' heel of today's web. Web services make continuous efforts to obtain as much information as they can about the things we search, the sites we visit, the people with who we contact, and the products we buy. Tracking is usually performed for commercial purposes. We present 5 main groups of methods used for user tracking, which are based on sessions, client storage, client cache, fingerprinting, or yet other approaches. A special focus is placed on mechanisms that use web caches, operational caches, and fingerprinting, as they are usually very rich in terms of using various creative methodologies. We also show how the users can be identified on the web and associated with their real names, e-mail addresses, phone numbers, or even street addresses. We show why tracking is being used and its possible implications for the users (price discrimination, assessing financial credibility, determining insurance coverage, government surveillance, and identity theft). For each of the tracking methods, we present possible defenses. Apart from describing the methods and tools used for keeping the personal data away from being tracked, we also present several tools that were used for research purposes - their main goal is to discover how and by which entity the users are being tracked on their desktop computers or smartphones, provide this information to the users, and visualize it in an accessible and easy to follow way. Finally, we present the currently proposed future approaches to track the user and show that they can potentially pose significant threats to the users' privacy.Comment: 29 pages, 212 reference

CiteSeerX

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Towards a virtual research environment for paediatric endocrinology across Europe

Author: Ahmed F.
Jiang J.
Sinnott R.O.
Stell A.
Watt J.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2009
Field of study

Paediatric endocrinology is a medical specialty dealing with variations of physical growth and sexual development in childhood. Genetic anomalies that can cause disorders of sexual development in children are rare. Given this, sharing and collaboration on the small number of cases that occur is needed by clinical experts in the field. The EU-funded EuroDSD project (www.eurodsd.eu) is one such collaboration involving clinical centres and clinical and genetic experts across Europe. Through the establishment of a virtual research environment (VRE) supporting sharing of data and a variety of clinical and bioinformatics analysis tools, EuroDSD aims to provide a research infrastructure for research into disorders of sex development. Security, ethics and information governance are at the heart of this infrastructure. This paper describes the infrastructure that is being built and the inherent challenges in security, availability and dependability that must be overcome for the enterprise to succeed