
    Comparing knowledge sources for nominal anaphora resolution

    We compare two ways of obtaining lexical knowledge for antecedent selection in other-anaphora and definite noun phrase coreference. Specifically, we compare an algorithm that relies on links encoded in the manually created lexical hierarchy WordNet and an algorithm that mines corpora by means of shallow lexico-semantic patterns. As corpora we use the British National Corpus (BNC), as well as the Web, which has not previously been used for this task. Our results show that (a) the knowledge encoded in WordNet is often insufficient, especially for anaphor-antecedent relations that exploit subjective or context-dependent knowledge; (b) for other-anaphora, the Web-based method outperforms the WordNet-based method; (c) for definite NP coreference, the Web-based method yields results comparable to those obtained using WordNet over the whole dataset and outperforms the WordNet-based method on subsets of the dataset; (d) in both case studies, the BNC-based method is worse than the other methods because of data sparseness. Thus, in our studies, the Web-based method alleviated the lexical knowledge gap often encountered in anaphora resolution, and handled examples with context-dependent relations between anaphor and antecedent. Because it is inexpensive and requires no hand-modelling of lexical knowledge, it is a promising knowledge source to integrate into anaphora resolution systems.
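    A minimal Python sketch of the corpus-mining idea described above, assuming a Hearst-style pattern of the form "X and other Ys" and a toy in-memory corpus standing in for BNC or Web counts (the pattern, function names, and corpus here are illustrative, not the paper's exact ones):

        import re
        from collections import Counter

        # Toy stand-in for a large corpus (the BNC or Web snippets in the paper's setting).
        CORPUS = ("She plays the violin and other instruments with ease. "
                  "He sold the house and other properties. "
                  "They toured Rome and other cities.")

        def pattern_score(candidate, anaphor_head, corpus=CORPUS):
            # Count matches of the shallow lexico-semantic pattern
            # "<candidate> and other <anaphor-head>", a cue that the
            # candidate is a plausible antecedent for the anaphor.
            pattern = rf"\b{re.escape(candidate)} and other {re.escape(anaphor_head)}\b"
            return len(re.findall(pattern, corpus, flags=re.IGNORECASE))

        def select_antecedent(candidates, anaphor_head):
            # Rank candidate antecedents by pattern frequency; return no
            # answer when the corpus offers no evidence at all.
            scores = Counter({c: pattern_score(c, anaphor_head) for c in candidates})
            best, score = scores.most_common(1)[0]
            return best if score > 0 else None

        print(select_antecedent(["violin", "house"], "instruments"))  # -> violin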

    What Can We Learn Privately?

    Learning problems form an important category of computational tasks that generalizes many of the computations researchers apply to large real-life data sets. We ask: what concept classes can be learned privately, namely, by an algorithm whose output does not depend too heavily on any one input or specific training example? More precisely, we investigate learning algorithms that satisfy differential privacy, a notion that provides strong confidentiality guarantees in contexts where aggregate information is released about a database containing sensitive information about individuals. We demonstrate that, ignoring computational constraints, it is possible to privately agnostically learn any concept class using a sample size approximately logarithmic in the cardinality of the concept class. Therefore, almost anything learnable is learnable privately: specifically, if a concept class is learnable by a (non-private) algorithm with polynomial sample complexity and output size, then it can be learned privately using a polynomial number of samples. We also present a computationally efficient private PAC learner for the class of parity functions. Local (or randomized response) algorithms are a practical class of private algorithms that have received extensive investigation. We provide a precise characterization of local private learning algorithms. We show that a concept class is learnable by a local algorithm if and only if it is learnable in the statistical query (SQ) model. Finally, we present a separation between the power of interactive and noninteractive local learning algorithms. Comment: 35 pages, 2 figures.
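    The generic construction behind the logarithmic sample bound is an exponential-mechanism learner; a minimal sketch under that reading (the hypothesis class, data, and names below are illustrative):

        import numpy as np

        def private_agnostic_learn(hypotheses, data, epsilon, rng=None):
            # Exponential mechanism: sample hypothesis h with probability
            # proportional to exp(-epsilon * err(h) / 2), where err(h) is the
            # number of misclassified examples (changing one example shifts
            # every err(h) by at most 1, which yields differential privacy).
            rng = rng or np.random.default_rng(0)
            errors = np.array([sum(h(x) != y for x, y in data) for h in hypotheses], float)
            weights = np.exp(-epsilon * (errors - errors.min()) / 2.0)  # stable shift
            probs = weights / weights.sum()
            return hypotheses[rng.choice(len(hypotheses), p=probs)]

        # Toy use: privately learn a threshold function on {0, ..., 9}.
        data = [(x, int(x >= 5)) for x in range(10)]
        hypotheses = [lambda x, t=t: int(x >= t) for t in range(11)]
        h = private_agnostic_learn(hypotheses, data, epsilon=1.0)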

    Under pressure: costs of living, financial hardship and emergency relief in Victoria

    'Under pressure: Costs of living, financial hardship and emergency relief in Victoria' presents the findings of a research project, conducted between 2007 and 2008, on demand for emergency relief in Victoria. The research project was a partnership between the Victorian Council of Social Service (VCOSS), RMIT University and the emergency relief peak body ER Victoria. Emergency relief can be defined as the provision of critical support to individuals and families experiencing a financial emergency or crisis. Emergency relief assistance can include a food voucher or parcel, household goods, clothing and financial assistance for utilities or food. Emergency relief is currently provided by over 700 non-government organisations in Victoria.

    Differential Privacy and the Fat-Shattering Dimension of Linear Queries

    In this paper, we consider the task of answering linear queries under the constraint of differential privacy. This is a general and well-studied class of queries that captures other commonly studied classes, including predicate queries and histogram queries. We show that the accuracy to which a set of linear queries can be answered is closely related to its fat-shattering dimension, a property that characterizes the learnability of real-valued functions in the agnostic-learning setting. Comment: Appears in APPROX 2010.
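    For context, the baseline that such accuracy bounds are measured against is the Laplace mechanism with noise scaled to the number of queries; a brief sketch of that standard baseline (not the paper's own mechanism):

        import numpy as np

        def answer_linear_queries(db, queries, epsilon, rng=None):
            # Each linear query maps a record to [0, 1] and is summed over the
            # database; one record changes each of the k answers by at most 1,
            # so adding Laplace noise of scale k/epsilon to every answer
            # satisfies epsilon-differential privacy by basic composition.
            rng = rng or np.random.default_rng(0)
            k = len(queries)
            true_answers = np.array([sum(q(x) for x in db) for q in queries], float)
            return true_answers + rng.laplace(scale=k / epsilon, size=k)

        # Toy use: three predicate queries over a small database of ages.
        db = [23, 35, 47, 62, 71]
        queries = [lambda x: int(x < 30), lambda x: int(30 <= x < 60), lambda x: int(x >= 60)]
        print(answer_linear_queries(db, queries, epsilon=1.0))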

    HI 21-centimetre emission from an ensemble of galaxies at an average redshift of one

    The baryonic processes in galaxy evolution include gas infall onto galaxies to form neutral atomic hydrogen (HI), the conversion of HI to the molecular state (H$_2$), and, finally, the conversion of H$_2$ to stars. Understanding galaxy evolution thus requires understanding the evolution of both the stars and the neutral atomic and molecular gas, the primary fuel for star formation, in galaxies. For the stars, the cosmic star-formation rate density is known to peak in the redshift range $z \approx 1-3$, and to decline by an order of magnitude over the next $\approx 10$ billion years; the causes of this decline are not known. For the gas, the weakness of the hyperfine HI 21 cm transition, the main tracer of the HI content of galaxies, has meant that it has not hitherto been possible to measure the atomic gas mass of galaxies at redshifts higher than $\approx 0.4$; this is a critical lacuna in our understanding of galaxy evolution. Here, we report a measurement of the average HI mass of star-forming galaxies at a redshift $z \approx 1$, obtained by stacking their individual HI 21 cm emission signals. We obtain an average HI mass similar to the average stellar mass of the sample. We also estimate the average star-formation rate of the same galaxies from the 1.4 GHz radio continuum, and find that the HI mass can fuel the observed star-formation rates for only $\approx 1-2$ billion years in the absence of fresh gas infall. This suggests that gas accretion onto galaxies at $z < 1$ may have been insufficient to sustain high star-formation rates in star-forming galaxies. This is likely to be the cause of the decline in the cosmic star-formation rate density at redshifts below 1. Comment: 43 pages, 8 figures. Published in Nature: https://www.nature.com/articles/s41586-020-2794-7. In the original version of the paper, the upper x-axis "Lookback time (Gyr)" of Fig. 3 had incorrectly placed tick marks. This is now fixed in the updated version. We thank Sambit Roychowdhury for pointing out the error.
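    A compact Python sketch of the stacking step described above, assuming each galaxy contributes an observed-frame spectrum and a known optical redshift (the grid limits and function names are illustrative, not taken from the paper):

        import numpy as np

        HI_REST_MHZ = 1420.405752  # rest frequency of the HI 21 cm line

        def stack_hi_spectra(spectra, redshifts, freq_axes):
            # Shift each spectrum to the galaxy's rest frame using its known
            # redshift, interpolate onto a common velocity grid, and average:
            # individually undetected 21 cm signals add coherently while the
            # noise averages down roughly as 1/sqrt(N).
            c = 299792.458  # km/s
            grid = np.linspace(-500.0, 500.0, 201)  # common velocity grid, km/s
            shifted = []
            for spec, z, freqs in zip(spectra, redshifts, freq_axes):
                rest = freqs * (1.0 + z)  # de-redshift the frequency axis
                vel = c * (HI_REST_MHZ - rest) / HI_REST_MHZ  # freq -> velocity offset
                shifted.append(np.interp(grid, vel[::-1], spec[::-1]))  # interp needs ascending x
            return grid, np.mean(shifted, axis=0)

        # The quoted fuelling time is simple arithmetic: t_dep = M_HI / SFR.
        m_hi, sfr = 1.0e10, 8.0  # illustrative values, M_sun and M_sun/yr
        print(m_hi / sfr / 1e9)  # ~1.3 Gyr, of the order of the 1-2 Gyr found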

    Probing star formation in galaxies at $z \approx 1$ via a Giant Metrewave Radio Telescope stacking analysis

    We have used the Giant Metrewave Radio Telescope (GMRT) to carry out deep 610 MHz continuum imaging of four sub-fields of the DEEP2 Galaxy Redshift Survey. We stacked the radio emission in the GMRT images from a near-complete (absolute blue magnitude $M_B \leq -21$) sample of 3698 blue star-forming galaxies with redshifts $0.7 \lesssim z \lesssim 1.45$ to detect (at $\approx 17\sigma$ significance) the median rest-frame 1.4 GHz radio continuum emission of the sample galaxies. The stacked emission is unresolved, with a rest-frame 1.4 GHz luminosity of $L_{1.4\,\mathrm{GHz}} = (4.13 \pm 0.24) \times 10^{22}$ W Hz$^{-1}$. We used the local relation between total star-formation rate (SFR) and 1.4 GHz luminosity to infer a median total SFR of $(24.4 \pm 1.4)\, M_\odot$ yr$^{-1}$ for blue star-forming galaxies with $M_B \leq -21$ at $0.7 \lesssim z \lesssim 1.45$. We detect the main-sequence relation between SFR and stellar mass, $M_\star$, obtaining $\mathrm{SFR} = (13.4 \pm 1.8) \times [M_\star/(10^{10}\, M_\odot)]^{0.73 \pm 0.09}\, M_\odot\,\mathrm{yr}^{-1}$; the power-law index shows no change over $z \approx 0.7-1.45$. We find that the nebular line emission suffers less extinction than the stellar continuum, contrary to the situation in the local Universe; the ratio of nebular extinction to stellar extinction increases with decreasing redshift. We obtain an upper limit of 0.87 Gyr to the atomic gas depletion time of a sub-sample of the DEEP2 galaxies at $z \approx 1.3$; neutral atomic gas thus appears to be a transient phase in high-$z$ star-forming galaxies. Comment: 16 pages, 7 figures; accepted for publication in ApJ. Minor changes to match the version accepted for publication.
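    The two quantitative relations in this abstract are straightforward to evaluate; a short sketch using the quoted best-fit values (the radio SFR calibration coefficient below is one common local calibration, e.g. Yun et al. 2001, assumed here rather than taken from the paper):

        # SFR from rest-frame 1.4 GHz luminosity via a local calibration:
        # SFR [M_sun/yr] ~= 5.9e-22 * L_1.4GHz [W/Hz] (Yun et al. 2001).
        def sfr_from_radio(l_1p4_w_per_hz, calib=5.9e-22):
            return calib * l_1p4_w_per_hz

        # Main-sequence relation quoted above, uncertainties omitted:
        # SFR = 13.4 * (M_star / 1e10 M_sun)^0.73 [M_sun/yr].
        def main_sequence_sfr(m_star_msun):
            return 13.4 * (m_star_msun / 1e10) ** 0.73

        print(sfr_from_radio(4.13e22))  # ~24 M_sun/yr, matching the median SFR above
        print(main_sequence_sfr(1e10))  # 13.4 M_sun/yr at M_star = 1e10 M_sun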

    Targeted delivery of anti-inflammatory therapy to rheumatoid tissue by fusion proteins containing an IL-4-linked synovial targeting peptide

    We provide first-time evidence that the synovial endothelium-targeting peptide (SyETP) CKSTHDRLC successfully delivers conjugated IL-4 to human rheumatoid synovium transplanted into SCID mice. SyETP, previously isolated by in vivo phage display and shown to preferentially localize to synovial xenografts, was linked by recombinant technology to hIL-4 via an MMP-cleavable sequence. Both IL-4 and the MMP-cleavable sequence were shown to be functional. IL-4-SyETP augmented production of IL-1ra by synoviocytes stimulated with IL-1β in a dose-dependent manner. In vivo imaging confirmed increased retention of SyETP-linked IL-4 in synovial grafts, which was enhanced by increasing the number of SyETP copies (one to three) in the constructs. Strikingly, SyETP delivered bioactive IL-4 in vivo, as demonstrated by increased pSTAT6 in synovial grafts. Thus, this study provides proof of concept for peptide-mediated, tissue-specific targeted immunotherapy in rheumatoid arthritis. This technology is potentially applicable to other biological therapies, offering enhanced potency at inflammatory sites and reduced systemic toxicity.

    Formalizing Data Deletion in the Context of the Right to be Forgotten

    The right of an individual to request the deletion of their personal data by an entity that might be storing it -- referred to as the right to be forgotten -- has been explicitly recognized, legislated, and exercised in several jurisdictions across the world, including the European Union, Argentina, and California. However, much of the discussion surrounding this right offers only an intuitive notion of what it means for it to be fulfilled -- of what it means for such personal data to be deleted. In this work, we provide a formal definitional framework for the right to be forgotten using tools and paradigms from cryptography. In particular, we provide a precise definition of what could be (or should be) expected from an entity that collects individuals' data when a request is made of it to delete some of this data. Our framework captures several, though not all, relevant aspects of typical systems involved in data processing. While it cannot be viewed as expressing the statements of current laws (especially since these are rather vague in this respect), our work offers technically precise definitions that represent possibilities for what the law could reasonably expect, and alternatives for what future versions of the law could explicitly require. Finally, with the goal of demonstrating the applicability of our framework and definitions, we consider various natural and simple scenarios where the right to be forgotten comes up. For each of these scenarios, we highlight the pitfalls that arise even in genuine attempts at implementing systems offering deletion guarantees, and also describe technological solutions that provably satisfy our definitions. These solutions bring together techniques built by various communities.
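    As a toy illustration (not the paper's formal definition) of one pitfall the scenarios above expose, namely that deletion must cover derived state and not just raw records, consider a minimal Python store; all names here are hypothetical:

        from dataclasses import dataclass, field

        @dataclass
        class DeletionCompliantStore:
            records: dict = field(default_factory=dict)
            count: int = 0  # derived statistic that must also "forget"

            def collect(self, uid, data):
                self.records[uid] = data
                self.count += 1

            def delete(self, uid):
                # After deletion, the observable state should look as if the
                # user's data had never been collected: the raw record goes,
                # and so does anything derived from it (caches, aggregates).
                if uid in self.records:
                    del self.records[uid]
                    self.count -= 1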

    Intelligent OS X malware threat detection with code inspection

    With the increasing market share of the Mac OS X operating system, there is a corresponding increase in the number of malicious programs (malware) designed to exploit vulnerabilities on Mac OS X platforms. However, existing manual and heuristic OS X malware detection techniques are not capable of coping with such a high rate of malware. While machine learning techniques offer promising results in automated detection of Windows and Android malware, there have been limited efforts in extending them to OS X malware detection. In this paper, we propose a supervised machine learning model. The model applies a kernel-based Support Vector Machine (SVM) and a novel weighting measure based on application library calls to detect OS X malware. For training and evaluating the model, a dataset combining 152 malware and 450 benign samples was created. Using common supervised machine learning algorithms on this dataset, we obtain over 91% detection accuracy with a 3.9% false-alarm rate. We also utilize the Synthetic Minority Over-sampling Technique (SMOTE) to create three synthetic datasets with different distributions, based on the refined version of the collected dataset, to investigate the impact of different sample sizes on the accuracy of malware detection. Using the SMOTE datasets, we achieve over 96% detection accuracy with a false-alarm rate of less than 4%. All malware classification experiments are evaluated using cross-validation. Our results show that increasing the sample size in the synthetic datasets improves detection accuracy, while also increasing the false-alarm rate compared to the original dataset.
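    A hedged sketch of the pipeline described above, with randomly generated features standing in for the paper's library-call weights (the weighting scheme itself is the paper's own; scikit-learn's SVC and imbalanced-learn's SMOTE are used here as generic stand-ins):

        import numpy as np
        from sklearn.svm import SVC
        from sklearn.model_selection import cross_val_score
        from imblearn.over_sampling import SMOTE

        rng = np.random.default_rng(0)

        # One row per application, one column per library call; the values play
        # the role of the call-weighting features (synthetic data, with the
        # paper's 450 benign / 152 malware class imbalance).
        X = np.vstack([rng.normal(0.0, 1.0, (450, 20)),
                       rng.normal(1.0, 1.0, (152, 20))])
        y = np.array([0] * 450 + [1] * 152)

        # SMOTE synthesizes extra minority-class (malware) points to balance classes.
        X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)

        clf = SVC(kernel="rbf")  # kernel-based SVM, as in the paper
        print(cross_val_score(clf, X_res, y_res, cv=5).mean())  # cross-validated accuracy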