91 research outputs found

    Optimal Sketching Bounds for Sparse Linear Regression

    We study oblivious sketching for $k$-sparse linear regression under various loss functions such as an $\ell_p$ norm, or from a broad class of hinge-like loss functions, which includes the logistic and ReLU losses. We show that for sparse $\ell_2$ norm regression, there is a distribution over oblivious sketches with $\Theta(k\log(d/k)/\varepsilon^2)$ rows, which is tight up to a constant factor. This extends to $\ell_p$ loss with an additional additive $O(k\log(k/\varepsilon)/\varepsilon^2)$ term in the upper bound. This establishes a surprising separation from the related sparse recovery problem, which is an important special case of sparse regression. For this problem, under the $\ell_2$ norm, we observe an upper bound of $O(k \log (d)/\varepsilon + k\log(k/\varepsilon)/\varepsilon^2)$ rows, showing that sparse recovery is strictly easier to sketch than sparse regression. For sparse regression under hinge-like loss functions including sparse logistic and sparse ReLU regression, we give the first known sketching bounds that achieve $o(d)$ rows, showing that $O(\mu^2 k\log(\mu n d/\varepsilon)/\varepsilon^2)$ rows suffice, where $\mu$ is a natural complexity parameter needed to obtain relative error bounds for these loss functions. We again show that this dimension is tight, up to lower order terms and the dependence on $\mu$. Finally, we show that similar sketching bounds can be achieved for LASSO regression, a popular convex relaxation of sparse regression, where one aims to minimize $\|Ax-b\|_2^2+\lambda\|x\|_1$ over $x\in\mathbb{R}^d$. We show that sketching dimension $O(\log(d)/(\lambda \varepsilon)^2)$ suffices and that the dependence on $d$ and $\lambda$ is tight. Comment: AISTATS 202
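
To make the LASSO statement above concrete, here is a minimal sketch of evaluating the objective $\|Ax-b\|_2^2+\lambda\|x\|_1$ on sketched data $(SA, Sb)$ instead of $(A, b)$. The dense Gaussian sketch, the toy dimensions, and all function names are assumptions for illustration only; the abstract does not say which sketch distribution the paper uses, and a check at a single fixed $x$ is much weaker than the paper's guarantee.

```python
import random

def gaussian_sketch(m, n, rng):
    """Dense m-by-n sketch with i.i.d. N(0, 1/m) entries (an assumed choice)."""
    scale = m ** -0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]

def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]

def matmul(S, A):
    """Matrix-matrix product S @ A for list-of-rows matrices."""
    cols = list(zip(*A))
    return [[sum(s * a for s, a in zip(row, col)) for col in cols] for row in S]

def lasso_objective(A, b, x, lam):
    """Return ||Ax - b||_2^2 + lam * ||x||_1."""
    residual = [ax - bi for ax, bi in zip(matvec(A, x), b)]
    return sum(r * r for r in residual) + lam * sum(abs(xi) for xi in x)

# Hypothetical toy instance: n data rows, d features, m sketch rows.
rng = random.Random(0)
n, d, m, lam = 1000, 5, 300, 0.1
A = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
x_true = [1.0, 0.0, 0.0, -2.0, 0.0]  # a 2-sparse ground truth
b = [bi + rng.gauss(0.0, 0.1) for bi in matvec(A, x_true)]

# The sketch is drawn without looking at A or b, which is what "oblivious" means.
S = gaussian_sketch(m, n, rng)
SA, Sb = matmul(S, A), matvec(S, b)

x = [0.5, 0.0, 0.0, -1.0, 0.0]  # an arbitrary sparse candidate
full = lasso_objective(A, b, x, lam)
sketched = lasso_objective(SA, Sb, x, lam)
```

Here `sketched` is computed from 300 rows rather than 1000, and for a fixed candidate it concentrates around `full` as `m` grows. The bounds in the abstract concern the stronger requirement that the approximation holds for every candidate the regression problem allows.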

    Streaming Euclidean Max-Cut: Dimension vs Data Reduction

    Max-Cut is a fundamental problem that has been studied extensively in various settings. We design an algorithm for Euclidean Max-Cut, where the input is a set of points in $\mathbb{R}^d$, in the model of dynamic geometric streams, where the input $X\subseteq [\Delta]^d$ is presented as a sequence of point insertions and deletions. Previously, Frahling and Sohler [STOC 2005] designed a $(1+\epsilon)$-approximation algorithm for the low-dimensional regime, i.e., it uses space $\exp(d)$. To tackle this problem in the high-dimensional regime, which is of growing interest, one must improve the dependence on the dimension $d$, ideally to space complexity $\mathrm{poly}(\epsilon^{-1} d \log\Delta)$. Lammersen, Sidiropoulos, and Sohler [WADS 2009] proved that Euclidean Max-Cut admits dimension reduction with target dimension $d' = \mathrm{poly}(\epsilon^{-1})$. Combining this with the aforementioned algorithm that uses space $\exp(d')$, they obtain an algorithm whose overall space complexity is indeed polynomial in $d$, but unfortunately exponential in $\epsilon^{-1}$. We devise an alternative approach of \emph{data reduction}, based on importance sampling, and achieve space bound $\mathrm{poly}(\epsilon^{-1} d \log\Delta)$, which is exponentially better (in $\epsilon$) than the dimension-reduction approach. To implement this scheme in the streaming model, we employ a randomly-shifted quadtree to construct a tree embedding. While this is a well-known method, a key feature of our algorithm is that the embedding's distortion $O(d\log\Delta)$ affects only the space complexity, and the approximation ratio remains $1+\epsilon$.

    A subspace constrained randomized Kaczmarz method for structure or external knowledge exploitation

    We study a subspace constrained version of the randomized Kaczmarz algorithm for solving large linear systems in which the iterates are confined to the space of solutions of a selected subsystem. We show that the subspace constraint leads to an accelerated convergence rate, especially when the system has structure such as having coherent rows or being approximately low-rank. On Gaussian-like random data, it results in a form of dimension reduction that effectively improves the aspect ratio of the system. Furthermore, this method serves as a building block for a second, quantile-based algorithm for the problem of solving linear systems with arbitrary sparse corruptions, which is able to efficiently exploit partial external knowledge about uncorrupted equations and achieve convergence in difficult settings such as in almost-square systems. Numerical experiments on synthetic and real-world data support our theoretical results and demonstrate the validity of the proposed methods for even more general data models than guaranteed by the theory. Comment: 34 pages
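
For readers unfamiliar with the baseline, the following is a minimal sketch of the classical, unconstrained randomized Kaczmarz iteration that the abstract builds on. It is not the subspace constrained or quantile-based variant described above; the toy system and the function name are illustrative assumptions.

```python
import random

def randomized_kaczmarz(A, b, iters=2000, seed=0):
    """Plain randomized Kaczmarz for a consistent system Ax = b.

    Rows are sampled with probability proportional to their squared norm,
    and the iterate is projected onto the hyperplane of the sampled equation.
    Assumes every row of A is nonzero.
    """
    rng = random.Random(seed)
    row_norms_sq = [sum(a * a for a in row) for row in A]
    x = [0.0] * len(A[0])
    for _ in range(iters):
        i = rng.choices(range(len(A)), weights=row_norms_sq, k=1)[0]
        residual = b[i] - sum(a * xj for a, xj in zip(A[i], x))
        step = residual / row_norms_sq[i]
        x = [xj + step * a for xj, a in zip(x, A[i])]
    return x

# Hypothetical small overdetermined, consistent system.
A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
x_true = [1.0, -2.0]
b = [sum(a * xt for a, xt in zip(row, x_true)) for row in A]
x_hat = randomized_kaczmarz(A, b)
```

On this toy system `x_hat` converges to `x_true`. As described in the abstract, the subspace constrained method differs by keeping every iterate inside the solution space of a selected subsystem.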

    Narrative relations : resources for meaning-making and person-centred practices in geriatric care

    Narrative approaches in healthcare have attracted a lot of academic attention, suggesting a strong potential in narrativity to help shift healthcare towards more compassionate and person-centred practices. Yet, there is still a need to better understand how narrativity might be understood, made relevant, and realized by healthcare staff in their everyday practices. Drawing on ethnographic fieldwork, healthcare professionals’ practice-based experiences shared in focus groups discussions, and narrative theory, this thesis puts everyday healthcare practices at the centre of inquiry, with the overall aim to develop a deepened understanding of narrativity as a potential resource for person-centredness and meaning-making in inpatient geriatric care practice. This compilation thesis includes four academic papers, each contributing to illuminating different aspects of narrativity in everyday practices. The initial studies shaped the design of the latter, thus building cumulative knowledge pertaining to the overall aim. Drawing on ethnographic fieldwork, Paper I explores how narrative meaning-making takes place and unfolds on a geriatric ward and discusses that in relation to contextual conditions and person-centred care. The findings render a multifaceted portrayal of the relational and intersubjective character of narrative meaning-making in healthcare practices and show how mundane events and activities of everyday life on a ward were often undervalued in terms of offering opportunities for exploring and co-creating possible understanding of patient situations between them and staff. Papers II & III are based on a constructivist grounded theory methodology. Vignettes developed from the previous ethnographic fieldwork were used to prompt focus group discussions with healthcare professionals. Paper II explores healthcare professionals' experiences and reflections about the use of narration in their everyday work. 
The findings reflect narration as an ongoing practice of mutual narrative interchange between multiple narrators, including patients, significant others, and staff, and thus introduce the notion of engaging in narrative relations. Moreover, the findings suggest potential consequences for clinical practice of people’s engagement in narrative relations. Paper III expands understanding about the notion of narrative relations by exploring how and where narrative relations are adopted and enacted in everyday practice on a geriatric ward. A main finding was the existence of a twofold practice whereby some activities and actions were generally approved as authorized tasks or routines, i.e. acknowledged practice, while other activities were not assigned this status, and thus took place as underground practices. Together with the concepts of clinical frontstage and backstage, the analysis constructed four distinct arenas for engaging in narrative relations. The findings discuss the transboundary function of narrative relations to interconnect these arenas and contribute to continuity in everyday practices. Finally, Paper IV explores conditions for engaging in narrative relations on a geriatric ward by delving into how healthcare staff interpret conditions for their practices. The findings from a hermeneutic analysis contribute to a deepening understanding of how everyday healthcare practices unfold not only governed by predefined organizational conditions, but that these conditions are continuously interpreted by people, which affect how practices are enacted. Whilst some interpretations were aligned with attitudes and activities enhancing narrative relations, others simultaneously thwarted narrative relations by enacting task-orientation, division, and a focus on measurable biomedical or functional improvements and outcomes. 
In summary, this thesis suggests a broadened understanding of narrativity that expands the focus beyond eliciting verbal narratives and coherent stories when aiming to foster person-centredness, to entail a relational approach of continuously tapping into the ongoing narrative meaning-making that people – both staff and patients – engage in. This approach builds on the notion that multiple narratives continuously communicate through narrative relations. When consciously and ethically cultivated, staff practices of engaging in narrative relations may contribute to upholding foundational relational qualities in healthcare.

    Malleable zero-knowledge proofs and applications

    In recent years, the field of privacy-preserving technologies has experienced considerable expansion, with zero-knowledge proofs (ZKPs) playing one of the most prominent roles. Although ZKPs have been a well-established theoretical construct for three decades, recent efficiency improvements and novel privacy applications within decentralized finance have become the main drivers behind the surge of interest and investment in this area. This momentum has subsequently sparked unprecedented technical advances. Non-interactive ZKPs (NIZKs) are now regularly implemented across a variety of domains, encompassing, but not limited to, privacy-enabling cryptocurrencies, credential systems, voting, mixing, secure multi-party computation, and other cryptographic protocols. This thesis, although covering several areas of ZKP technologies and their application, focuses on one important aspect of NIZKs, namely their malleability. Malleability is a quality of a proof system that describes the potential for altering an already generated proof. Different properties may be desired in different application contexts. On the one end of the spectrum, non-malleability ensures proof immutability, an important requirement in scenarios such as prevention of replay attacks in anonymous cryptocurrencies. At the other end, some NIZKs enable proof updatability, recursively and directly, a feature that is integral for a variety of contexts, such as private smart contracts, compact blockchains, ZK rollups, ZK virtual machines, and MPC protocols generally. This work starts with a detailed analysis of the malleability and overarching security of a popular NIZK, known as Groth16. Here we adopt a more definitional approach, studying certain properties of the proof system, and its setup ceremony, that are crucial for its precise modelling within bigger systems. 
Subsequently, the work explores the malleability of transactions within a private cryptocurrency variant, where we show that relaxing non-malleability assumptions enables a functionality, specifically an atomic asset swap, that is useful for cryptocurrency applications. The work culminates with a study of a less general, algebraic NIZK, and particularly its updatability properties, whose applicability we present within the context of ensuring privacy for regulatory compliance purposes.

    LIPIcs, Volume 274, ESA 2023, Complete Volume

    LIPIcs, Volume 274, ESA 2023, Complete Volume

    Shallow shadows: Expectation estimation using low-depth random Clifford circuits

    We provide practical and powerful schemes for learning many properties of an unknown n-qubit quantum state using a sparing number of copies of the state. Specifically, we present a depth-modulated randomized measurement scheme that interpolates between two known classical shadows schemes based on random Pauli measurements and random Clifford measurements. These can be seen within our scheme as the special cases of zero and infinite depth, respectively. We focus on the regime where depth scales logarithmically in n and provide evidence that this retains the desirable properties of both extremal schemes whilst, in contrast to the random Clifford scheme, also being experimentally feasible. We present methods for two key tasks: estimating expectation values of certain observables from generated classical shadows, and computing upper bounds on the depth-modulated shadow norm, thus providing rigorous guarantees on the accuracy of the output estimates. We consider observables that can be written as a linear combination of poly(n) Paulis and observables that can be written as a low bond dimension matrix product operator. For the former class of observables, both tasks are solved efficiently in n. For the latter class, we do not guarantee efficiency but present a method that works in practice: variationally computing a heralded approximate inverse of a tensor network that can then be used for efficiently executing both these tasks. Comment: 22 pages, 12 figures. Version 2: new MPS variational inversion algorithm and new numeric

    Modelling spillover effects in spatial stochastic frontier analysis

    In the last two decades, authors have begun to expand classical stochastic frontier (SF) models to also include spatial components. Indeed, firms tend to concentrate in clusters, taking advantage of positive agglomeration externalities due to cooperation, shared ideas and emulation, resulting in increased productivity levels. Until now, scholars have introduced spatial dependence into SF models following two different paths: evaluating global and local spatial spillover effects related to the frontier, or considering spatial cross-sectional correlation in the inefficiency and/or in the error term. In this thesis, we extend the current literature on spatial SF models by introducing two novel specifications for panel data. First, besides considering productivity and input spillovers, we introduce the possibility of evaluating the specific spatial effects arising from each inefficiency determinant through their spatial lags, aiming to also capture knowledge spillovers. Second, we develop a comprehensive spatial SF model that includes both frontier and error-based spillovers in order to consider four different sources of spatial dependence (i.e. productivity and input spillovers related to the frontier function, and behavioural and environmental correlation associated with the two error terms). Finally, we test the finite sample properties of the two proposed spatial SF models through simulations, and we provide two empirical applications to the Italian accommodation and agricultural sectors. From a practical perspective, policymakers can use results from these models to obtain precise, detailed and distinct insights into the spillover effects affecting the productive performance of neighbouring spatial units, and thus relevant suggestions for policy decisions.

    Secure Sampling with Sublinear Communication

    Random sampling from specified distributions is an important tool with wide applications for analysis of large-scale data. In this paper we study how to randomly sample when the distribution is partitioned among two parties' private inputs. Of course, a trivial solution is to have one party send a (possibly encrypted) description of its weights to the other party who can then sample over the entire distribution (possibly using homomorphic encryption). However, this approach requires communication that is linear in the input size, which is prohibitively expensive in many settings. In this paper, we investigate secure 2-party sampling with \emph{sublinear communication} for many standard distributions. We develop protocols for $L_1$ and $L_2$ sampling. Additionally, we investigate the feasibility of sublinear product sampling, showing impossibility for the general problem and showing a protocol for a restricted case of the problem. We additionally show how such product sampling can be used to instantiate a sublinear communication 2-party exponential mechanism for differentially-private data release.
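
As a point of reference for the $L_1$ and $L_2$ sampling mentioned above, the sketch below shows the target functionality computed in the clear, using the standard definition in which index $i$ is drawn with probability $|x_i|^p / \|x\|_p^p$. It offers no privacy and uses linear communication, so it corresponds to the trivial baseline rather than the paper's protocols. The additive split of `x` between two parties and all names are assumptions made for illustration.

```python
import random

def lp_sampling_probs(x, p):
    """Probability of each index under L_p sampling: |x_i|^p / ||x||_p^p."""
    weights = [abs(v) ** p for v in x]
    total = sum(weights)
    return [w / total for w in weights]

def lp_sample(x, p, rng):
    """Draw one index from the L_p sampling distribution of x."""
    return rng.choices(range(len(x)), weights=[abs(v) ** p for v in x], k=1)[0]

# Assumed setup: each party holds one additive share of the vector x.
share_a = [1.0, -1.0, 0.5]
share_b = [2.0, -3.0, -0.5]
x = [u + v for u, v in zip(share_a, share_b)]  # combined in the clear: [3.0, -4.0, 0.0]

l1_probs = lp_sampling_probs(x, 1)  # [3/7, 4/7, 0]
l2_probs = lp_sampling_probs(x, 2)  # [9/25, 16/25, 0]
idx = lp_sample(x, 1, random.Random(0))
```

A secure protocol would have to produce a sample from the same distribution without either party learning the other's share, and with communication sublinear in the length of `x`.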

    Analyzing and Applying Cryptographic Mechanisms to Protect Privacy in Applications

    Privacy-Enhancing Technologies (PETs) emerged as a technology-based response to the increased collection and storage of data as well as the associated threats to individuals' privacy in modern applications. They rely on a variety of cryptographic mechanisms that allow performing some computation without directly obtaining knowledge of plaintext information. However, many challenges have so far prevented effective real-world usage in many existing applications. For one, some mechanisms leak some information or have been proposed outside of security models established within the cryptographic community, leaving open how effective they are at protecting privacy in various applications. Additionally, a major challenge causing PETs to remain largely academic is their practicality, in both efficiency and usability. Cryptographic mechanisms introduce a lot of overhead, which is mostly prohibitive, and due to a lack of high-level tools are very hard to integrate for outsiders. In this thesis, we move towards making PETs more effective and practical in protecting privacy in numerous applications. We take a two-sided approach of first analyzing the effective security (cryptanalysis) of candidate mechanisms and then building constructions and tools (cryptographic engineering) for practical use in specified emerging applications in the domain of machine learning crucial to modern use cases. In the process, we incorporate an interdisciplinary perspective for analyzing mechanisms and collaboratively build privacy-preserving architectures with requirements from the application domains' experts. Cryptanalysis. While mechanisms like Homomorphic Encryption (HE) or Secure Multi-Party Computation (SMPC) provably leak no additional information, Encrypted Search Algorithms (ESAs) and Randomization-only Two-Party Computation (RoTPC) possess additional properties that require cryptanalysis to determine effective privacy protection. 
ESAs allow for search on encrypted data, an important functionality in many applications. Most efficient ESAs possess some form of well-defined information leakage, which is cryptanalyzed via a breadth of so-called leakage attacks proposed in the literature. However, it is difficult to assess their practical effectiveness given that previous evaluations were closed-source, used restricted data, and made assumptions about (among others) the query distribution because real-world query data is very hard to find. For these reasons, we re-implement known leakage attacks in an open-source framework and perform a systematic empirical re-evaluation of them using a variety of new data sources that, for the first time, contain real-world query data. We obtain many more complete and novel results where attacks work much better or much worse than what was expected based on previous evaluations. RoTPC mechanisms require cryptanalysis as they do not rely on established techniques and security models, instead obfuscating messages using only randomizations. A prominent protocol is a privacy-preserving scalar product protocol by Lu et al. (IEEE TPDS'13). We show that this protocol is formally insecure and that this translates to practical insecurity by presenting attacks that even allow testing for certain inputs, making the case for more scrutiny of RoTPC protocols used as PETs. This part of the thesis is based on the following two publications: [KKM+22] S. KAMARA, A. KATI, T. MOATAZ, T. SCHNEIDER, A. TREIBER, M. YONLI. “SoK: Cryptanalysis of Encrypted Search with LEAKER - A framework for LEakage AttacK Evaluation on Real-world data”. In: 7th IEEE European Symposium on Security and Privacy (EuroS&P’22). Full version: https://ia.cr/2021/1035. Code: https://encrypto.de/code/LEAKER. IEEE, 2022, pp. 90–108. Appendix A. [ST20] T. SCHNEIDER, A. TREIBER. “A Comment on Privacy-Preserving Scalar Product Protocols as proposed in “SPOC””. 
In: IEEE Transactions on Parallel and Distributed Systems (TPDS) 31.3 (2020). Full version: https://arxiv.org/abs/1906.04862. Code: https://encrypto.de/code/SPOCattack, pp. 543–546. CORE Rank A*. Appendix B. Cryptographic Engineering. Given the above results about cryptanalysis, we investigate using the leakage-free and provably-secure cryptographic mechanisms of HE and SMPC to protect privacy in machine learning applications. As much of the cryptographic community has focused on PETs for neural network applications, we focus on two other important applications and models: speaker recognition and sum-product networks. We particularly show the efficiency of our solutions in possible real-world scenarios and provide tools usable for non-domain experts. In speaker recognition, a user's voice data is matched with reference data stored at the service provider. Using HE and SMPC, we build the first privacy-preserving speaker recognition system that includes the state-of-the-art technique of cohort score normalization using cohort pruning via SMPC. Then, we build a privacy-preserving speaker recognition system relying solely on SMPC, which we show outperforms previous solutions based on HE by a factor of up to 4000x. We show that both our solutions comply with specific standards for biometric information protection and, thus, are effective and practical PETs for speaker recognition. Sum-Product Networks (SPNs) are noteworthy probabilistic graphical models that, like neural networks, also need efficient methods for privacy-preserving inference as a PET. We present CryptoSPN, which uses SMPC for privacy-preserving inference of SPNs that (due to a combination of machine learning and cryptographic techniques and contrary to most works on neural networks) even hides the network structure. Our implementation is integrated into the prominent SPN framework SPFlow and evaluates medium-sized SPNs within seconds. 
This part of the thesis is based on the following three publications: [NPT+19] A. NAUTSCH, J. PATINO, A. TREIBER, T. STAFYLAKIS, P. MIZERA, M. TODISCO, T. SCHNEIDER, N. EVANS. “Privacy-Preserving Speaker Recognition with Cohort Score Normalisation”. In: 20th Conference of the International Speech Communication Association (INTERSPEECH’19). Online: https://arxiv.org/abs/1907.03454. International Speech Communication Association (ISCA), 2019, pp. 2868–2872. CORE Rank A. Appendix C. [TNK+19] A. TREIBER, A. NAUTSCH, J. KOLBERG, T. SCHNEIDER, C. BUSCH. “Privacy-Preserving PLDA Speaker Verification using Outsourced Secure Computation”. In: Speech Communication 114 (2019). Online: https://encrypto.de/papers/TNKSB19.pdf. Code: https://encrypto.de/code/PrivateASV, pp. 60–71. CORE Rank B. Appendix D. [TMW+20] A. TREIBER, A. MOLINA, C. WEINERT, T. SCHNEIDER, K. KERSTING. “CryptoSPN: Privacy-preserving Sum-Product Network Inference”. In: 24th European Conference on Artificial Intelligence (ECAI’20). Full version: https://arxiv.org/abs/2002.00801. Code: https://encrypto.de/code/CryptoSPN. IOS Press, 2020, pp. 1946–1953. CORE Rank A. Appendix E. Overall, this thesis contributes to a broader security analysis of cryptographic mechanisms and new systems and tools to effectively protect privacy in various sought-after applications.