
    Optimal Computation of Avoided Words

    The deviation of the observed frequency of a word w from its expected frequency in a given sequence x is used to determine whether or not the word is avoided. This concept is particularly useful in DNA linguistic analysis. The value of the standard deviation of w, denoted by std(w), effectively characterises the extent of a word by its edge contrast in the context in which it occurs. A word w of length k > 2 is a ρ-avoided word in x if std(w) ≤ ρ, for a given threshold ρ < 0. Notice that such a word may be completely absent from x. Hence computing all such words naïvely can be a very time-consuming procedure, in particular for large k. In this article, we propose an O(n)-time and O(n)-space algorithm to compute all ρ-avoided words of length k in a given sequence x of length n over a fixed-sized alphabet. We also present a time-optimal O(σn)-time and O(σn)-space algorithm to compute all ρ-avoided words (of any length) in a sequence of length n over an alphabet of size σ. Furthermore, we provide a tight asymptotic upper bound for the number of ρ-avoided words and the expected length of the longest one. We make available an open-source implementation of our algorithm. Experimental results, using both real and synthetic data, show the efficiency of our implementation.
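    The naïve baseline the abstract contrasts against can be sketched directly from these definitions. The sketch below assumes the commonly used maximal-order Markov estimate E(w) = f(prefix) · f(suffix) / f(infix) and std(w) = (f(w) − E(w)) / max(√E(w), 1); the exact deviation function used by the authors may differ.

    ```python
    from itertools import product

    def occurrences(x, w):
        """Count (possibly overlapping) occurrences of w in x."""
        return sum(1 for i in range(len(x) - len(w) + 1) if x[i:i + len(w)] == w)

    def avoided_words(x, k, rho, alphabet="ACGT"):
        """Naively enumerate all rho-avoided words of length k in x.

        Candidates include words completely absent from x: their expected
        frequency E(w) is estimated from the counts of w's longest proper
        prefix, suffix, and infix, and w is rho-avoided if std(w) <= rho.
        """
        avoided = []
        for letters in product(alphabet, repeat=k):
            w = "".join(letters)
            f_prefix = occurrences(x, w[:-1])
            f_suffix = occurrences(x, w[1:])
            f_infix = occurrences(x, w[1:-1])
            if f_infix == 0:
                continue  # expected frequency undefined for this candidate
            expected = f_prefix * f_suffix / f_infix
            std_w = (occurrences(x, w) - expected) / max(expected ** 0.5, 1.0)
            if std_w <= rho:
                avoided.append(w)
        return avoided
    ```

    This enumerates all σ^k candidates and counts each one by scanning x, which is exactly the exponential blow-up for large k that the linear-time suffix-tree algorithm avoids.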

    Big Data Analysis

    The value of big data is predicated on the ability to detect trends and patterns and, more generally, to make sense of large volumes of data that often comprise a heterogeneous mix of formats, structures, and semantics. Big data analysis is the component of the big data value chain that focuses on transforming raw acquired data into a coherent, usable resource suitable for analysis. Using a range of interviews with key stakeholders in small and large companies and academia, this chapter outlines key insights, state of the art, emerging trends, future requirements, and sectoral case studies for data analysis.

    Bootstrapping Trust in Online Dating: Social Verification of Online Dating Profiles

    Online dating is an increasingly thriving business which boasts billion-dollar revenues and attracts users in the tens of millions. Notwithstanding its popularity, online dating is not impervious to worrisome trust and privacy concerns raised by the disclosure of potentially sensitive data as well as the exposure to self-reported (and thus potentially misrepresented) information. Nonetheless, little research has, thus far, focused on how to enhance privacy and trustworthiness. In this paper, we report on a series of semi-structured interviews involving 20 participants, and show that users are significantly concerned with the veracity of online dating profiles. To address some of these concerns, we present the user-centered design of an interface, called Certifeye, which aims to bootstrap trust in online dating profiles using existing social network data. Certifeye verifies that the information users report on their online dating profile (e.g., age, relationship status, and/or photos) matches that displayed on their own Facebook profile. Finally, we present the results of a 161-user Mechanical Turk study assessing whether our veracity-enhancing interface successfully reduced concerns in online dating users and find a statistically significant trust increase. Comment: In Proceedings of Financial Cryptography and Data Security (FC) Workshop on Usable Security (USEC), 201

    Naturally Rehearsing Passwords

    We introduce quantitative usability and security models to guide the design of password management schemes: systematic strategies to help users create and remember multiple passwords. In the same way that security proofs in cryptography are based on complexity-theoretic assumptions (e.g., hardness of factoring and discrete logarithm), we quantify usability by introducing usability assumptions. In particular, password management relies on assumptions about human memory, e.g., that a user who follows a particular rehearsal schedule will successfully maintain the corresponding memory. These assumptions are informed by research in cognitive science and validated through empirical studies. Given rehearsal requirements and a user's visitation schedule for each account, we use the total number of extra rehearsals that the user would have to do to remember all of his passwords as a measure of the usability of the password scheme. Our usability model leads us to a key observation: password reuse benefits users not only by reducing the number of passwords that the user has to memorize, but more importantly by increasing the natural rehearsal rate for each password. We also present a security model which accounts for the complexity of password management with multiple accounts and associated threats, including online, offline, and plaintext password leak attacks. Observing that current password management schemes are either insecure or unusable, we present Shared Cues, a new scheme in which the underlying secret is strategically shared across accounts to ensure that most rehearsal requirements are satisfied naturally while simultaneously providing strong security. The construction uses the Chinese Remainder Theorem to achieve these competing goals.
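    The abstract does not spell out how the Chinese Remainder Theorem is applied in Shared Cues. As a purely illustrative sketch of the underlying mechanism: CRT lets a single secret be split into residues modulo pairwise-coprime moduli, so that holding a sufficiently large subset of residues reconstructs the secret exactly. The moduli and helper names below are hypothetical, not the paper's parameters.

    ```python
    from math import prod

    def crt_reconstruct(residues, moduli):
        """Recover x mod prod(moduli) from the residues x mod m_i.

        Assumes the moduli are pairwise coprime; pow(Mi, -1, m) (Python 3.8+)
        computes the modular inverse needed by the standard CRT formula.
        """
        M = prod(moduli)
        x = 0
        for r, m in zip(residues, moduli):
            Mi = M // m
            x += r * Mi * pow(Mi, -1, m)
        return x % M

    # Split a secret into per-account shares, then reconstruct it.
    secret = 123456
    moduli = [101, 103, 107]              # pairwise coprime; product > secret
    shares = [secret % m for m in moduli]
    assert crt_reconstruct(shares, moduli) == secret
    ```

    The security/usability trade-off in the paper comes from how shares are distributed across accounts; this sketch shows only the number-theoretic reconstruction step.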

    Context Is Everything: Sociality and Privacy in Online Social Network Sites

    Social Network Sites (SNSs) pose many privacy issues. Apart from the fact that privacy in an online social network site may sound like an oxymoron, significant privacy issues are caused by the way social structures are currently handled in SNSs. Conceptually different social groups are generally conflated into the singular notion of 'friend'. This chapter argues that attention should be paid to the social dynamics of SNSs and the way people handle social contexts. It shows that SNS technology can be designed to support audience segregation, which should mitigate at least some of the privacy issues in Social Network Sites.

    Muslim Diaspora in the West and International HRM

    Interest in Islam and how Muslims organise themselves within the so-called Western world has largely stemmed from the flow of Muslim immigration since the 1960s and the 1970s (Loobuyck, Debeer, & Meier, 2013). Many of these immigrants have come to these new lands in the hope of making a better life for themselves economically, or to escape the political or religious pressures of their homeland (Lebl, 2014). Initially, governments across many jurisdictions deemed the influx of these foreigners largely irrelevant and showed little interest in their presence. Typically, scant interest was shown towards entering into dialogue with the Muslim immigrant community. Indeed, until the 1990s, it was not uncommon for Islam to be perceived as a strange, foreign religion that was best managed through outsourcing to respective consulates (Loobuyck et al., 2013). Yet migration and work-based mobility have a significant influence on the world of work and the societies in which organisations are embedded. Many individuals migrate for better employment prospects, as well as through chain migration, for a better quality of life, or to flee famine, war and terror zones globally (Sharma & Reimer-Kirkham, 2015; Valiūnienė, 2016). Migration could involve upward as well as downward mobility/wages, depending on the country and organisation. For example, minimum wages differ from € 184 in Bulgaria up to € 1923 in Luxembourg (Valiūnienė, 2016). Migration also contributes to the lived religion of diasporic communities as they navigate their faith at work (Sharma & Reimer-Kirkham, 2015).

    Market research & the ethics of big data

    The term ‘big data’ has recently emerged to describe a range of technological and commercial trends enabling the storage and analysis of huge amounts of customer data, such as that generated by social networks and mobile devices. Much of the commercial promise of big data is in the ability to generate valuable insights from collecting new types and volumes of data in ways that were not previously economically viable. At the same time, a number of questions have been raised about the implications for individual privacy. This paper explores key perspectives underlying the emergence of big data and considers both the opportunities and ethical challenges raised for market research.

    Pseudonymization risk analysis in distributed systems

    In an era of big data, online services are becoming increasingly data-centric; they collect, process, analyse and anonymously disclose growing amounts of personal data in the form of pseudonymized data sets. It is crucial that such systems are engineered both to protect individual user (data subject) privacy and to give back control of personal data to the user. In terms of pseudonymized data this means that unwanted individuals should not be able to deduce sensitive information about the user. However, the plethora of pseudonymization algorithms and tuneable parameters that currently exist make it difficult for a non-expert developer (data controller) to understand and realise strong privacy guarantees. In this paper we propose a principled Model-Driven Engineering (MDE) framework to model data services in terms of their pseudonymization strategies and identify the risks to breaches of user privacy. A developer can explore alternative pseudonymization strategies to determine the effectiveness of their pseudonymization strategy in terms of quantifiable metrics: i) violations of privacy requirements for every user in the current data set; ii) the trade-off between conforming to these requirements and the usefulness of the data for its intended purposes. We demonstrate through an experimental evaluation that the information provided by the framework is useful, particularly in complex situations where privacy requirements are different for different users, and can inform decisions to optimize a chosen strategy in comparison to applying an off-the-shelf algorithm.
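    The paper's concrete metrics are not reproduced here, but one common instantiation of a per-user privacy requirement is a k-anonymity-style group-size constraint over quasi-identifiers. The sketch below, with hypothetical field names, counts the users whose own requirement k_i is violated by a given pseudonymization of the data set; it is an illustration of the kind of check the framework quantifies, not the authors' implementation.

    ```python
    from collections import Counter

    def privacy_violations(records, quasi_identifiers, required_k):
        """Return the ids of users whose per-user group-size requirement fails.

        records: list of dicts, one per user, already pseudonymized
        quasi_identifiers: field names an attacker could link on
        required_k: dict mapping user id -> that user's required group size
        """
        key = lambda r: tuple(r[q] for q in quasi_identifiers)
        group_sizes = Counter(key(r) for r in records)
        return [r["id"] for r in records if group_sizes[key(r)] < required_k[r["id"]]]

    records = [
        {"id": 1, "zip": "481**", "age": "30-39"},
        {"id": 2, "zip": "481**", "age": "30-39"},
        {"id": 3, "zip": "482**", "age": "40-49"},
    ]
    # User 3 is alone in their (zip, age) group, so any requirement k > 1 fails.
    violators = privacy_violations(records, ["zip", "age"], {1: 2, 2: 2, 3: 2})
    assert violators == [3]
    ```

    Requirements can differ per user, matching the paper's observation that such heterogeneous cases are where tool support matters most: generalising the data further shrinks the violation list but reduces its analytical usefulness.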

    Transmembrane Protein Oxygen Content and Compartmentalization of Cells

    Recently, there was a report that explored the oxygen content of transmembrane proteins over macroevolutionary time scales, in which the authors observed a correlation between the geological time of appearance of compartmentalized cells and atmospheric oxygen concentration. The authors predicted, characterized and correlated the differences in the structure and composition of transmembrane proteins from the three kingdoms of life with atmospheric oxygen concentrations on a geological timescale. They hypothesized that transmembrane proteins in ancient taxa were selectively excluding oxygen and that, as this constraint relaxed over time with the increase in the levels of atmospheric oxygen, the size and number of communication-related transmembrane proteins increased. In summary, they concluded that compartmentalized and non-compartmentalized cells can be distinguished by how oxygen is partitioned at the proteome level. They derived this conclusion from an analysis of 19 taxa. We extended their analysis to a larger sample of taxa comprising 309 eubacterial, 34 archaeal, and 30 eukaryotic complete proteomes and observed that one cannot absolutely separate the two groups of cells based on the partition of oxygen in their membrane proteins. In addition, the origin of compartmentalized cells is likely to have been driven by an innovation that happened 2700 million years ago in the membrane composition of cells that led to the evolution of endocytosis and exocytosis, rather than by the rise in concentration of atmospheric oxygen.
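    At its core, proteome-level oxygen content reduces to counting oxygen atoms per residue in each protein sequence. A minimal sketch using the standard one-letter amino-acid code is below; the original study's exact normalisation (e.g., oxygen as a fraction of all atoms, or restriction to transmembrane segments) may differ, so this is illustrative only.

    ```python
    # Oxygen atoms per residue in a peptide chain: one backbone carbonyl oxygen
    # plus any side-chain oxygens (Asp/Glu carboxyl = 2, Ser/Thr/Tyr hydroxyl = 1,
    # Asn/Gln amide = 1). The C-terminal OH and modifications are ignored.
    SIDE_CHAIN_O = {"D": 2, "E": 2, "S": 1, "T": 1, "Y": 1, "N": 1, "Q": 1}

    def oxygen_content(seq):
        """Mean oxygen atoms per residue for a one-letter protein sequence."""
        seq = seq.upper()
        total = sum(1 + SIDE_CHAIN_O.get(aa, 0) for aa in seq)
        return total / len(seq)

    assert oxygen_content("GG") == 1.0   # glycine contributes only the backbone oxygen
    assert oxygen_content("D") == 3.0    # aspartate: backbone + carboxyl pair
    ```

    Comparing this statistic across the transmembrane proteomes of the 373 taxa is then a matter of averaging per-protein values within each group of cells.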