Tree Dependent Identically Distributed Learning
We view a dataset of points or samples as having an underlying, yet unspecified, tree structure and exploit this assumption in learning problems. Such a tree structure assumption is equivalent to treating a dataset as being tree dependent identically distributed (tdid) and preserves exchangeability. This extends traditional iid assumptions on data, since each datum can be sampled sequentially after being conditioned on a parent. Instead of hypothesizing a single best tree structure, we infer a richer Bayesian posterior distribution over tree structures from a given dataset. We compute this posterior over (directed or undirected) trees via the Laplacian of conditional distributions between pairs of input data points. This posterior distribution is efficiently normalized by the Laplacian's determinant and also facilitates novel maximum likelihood estimators, efficient expectations, and other useful inference computations. In a classification setting, tdid assumptions yield a criterion that maximizes the determinant of a matrix of conditional distributions between pairs of input and output points. This leads to a novel classification algorithm we call the Maximum Determinant Machine. Unsupervised and supervised experiments are shown.
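As a rough illustration of the normalization step this abstract describes (a sketch under assumptions, not the authors' code; the weight matrix W and its interpretation are illustrative), the Matrix-Tree theorem turns the sum over all spanning trees into a single determinant:

```python
import numpy as np

# Minimal sketch: for a symmetric weight matrix W, where W[i, j] might hold
# a pairwise conditional likelihood, the total weight of all spanning trees
# equals a cofactor of the graph Laplacian, so a posterior over trees can be
# normalized without enumerating trees.
def spanning_tree_partition(W):
    W = W * (1.0 - np.eye(len(W)))      # zero the diagonal: no self-loops
    L = np.diag(W.sum(axis=0)) - W      # graph Laplacian
    return np.linalg.det(L[1:, 1:])     # delete row/col 0 to get a cofactor

# Example: the complete graph on 3 nodes with unit weights has 3 spanning trees.
print(spanning_tree_partition(np.ones((3, 3))))   # 3.0
```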
When Enough is Enough: Location Tracking, Mosaic Theory, and Machine Learning
Since 1967, when it decided Katz v. United States, the Supreme Court has tied the right to be free of unwanted government scrutiny to the concept of reasonable expectations of privacy.[1] An evaluation of reasonable expectations depends, among other factors, upon an assessment of the intrusiveness of government action. When making such an assessment, the Court has historically considered police conduct with clear temporal, geographic, or substantive limits. However, in an era where new technologies permit the storage and compilation of vast amounts of personal data, things are becoming more complicated. A school of thought known as “mosaic theory” has stepped into the void, sounding the alarm that our old tools for assessing the intrusiveness of government conduct potentially undervalue privacy rights.
Mosaic theorists advocate a cumulative approach to the evaluation of data collection. Under the theory, searches are “analyzed as a collective sequence of steps rather than as individual steps.”[2] The approach is based on the recognition that comprehensive aggregation of even seemingly innocuous data reveals greater insight than consideration of each piece of information in isolation. Over time, discrete units of surveillance data can be processed to create a mosaic of habits, relationships, and much more. Consequently, a Fourth Amendment analysis that focuses only on the government’s collection of discrete units of trivial data fails to appreciate the true harm of long-term surveillance—the composite.
In the context of location tracking, the Court has previously suggested that the Fourth Amendment may (at some theoretical threshold) be concerned with the accumulated information revealed by surveillance.[3] Similarly, in the Court’s recent decision in United States v. Jones, a majority of concurring justices indicated willingness to explore such an approach.[4] However, in general, the Court has rejected any notion that technological enhancement matters to the constitutional treatment of location tracking.[5] Rather, it has found that such surveillance in public spaces, which does not require physical trespass, is equivalent to a human tail and thus not regulated by the Fourth Amendment. In this way, the Court has avoided quantitative analysis of the amendment’s protections.
The Court’s reticence is built on the enticingly direct assertion that objectivity under the mosaic theory is impossible. This is true in large part because no rationale has yet been offered to objectively distinguish relatively short-term monitoring from monitoring of greater duration. As Justice Scalia recently observed in Jones: “it remains unexplained why a 4-week investigation is ‘surely’ too long.”[6] This article suggests that, by combining the lessons of machine learning with the mosaic theory and applying the pairing to the Fourth Amendment, we can see the contours of a response. Machine learning makes clear that mosaics can be created. Moreover, it offers important lessons about when that is the case.
Machine learning is the branch of computer science that studies systems that can draw inferences from collections of data, generally by means of mathematical algorithms. In a recent competition called “The Nokia Mobile Data Challenge,”[7] researchers evaluated machine learning’s applicability to GPS and cell phone tower data. From a user’s location history alone, the researchers were able to estimate the user’s gender, marital status, occupation and age.[8] Algorithms developed for the competition were also able to predict a user’s likely future location by observing past location history. The prediction of a user’s future location could be even further improved by using the location data of friends and social contacts.[9]
Machine learning of the sort on display during the Nokia competition seeks to harness the data deluge of today’s information society by efficiently organizing data, finding statistical regularities and other patterns in it, and making predictions therefrom. Machine learning algorithms are able to deduce information—including information that has no obvious linkage to the input data—that may otherwise have remained private due to the natural limitations of manual and human-driven investigation. Analysts can “train” machine learning programs using one dataset to find similar characteristics in new datasets. When applied to the digital “bread crumbs” of data generated by people, machine learning algorithms can make targeted personal predictions. The greater the number of data points evaluated, the greater the accuracy of the algorithm’s results.
In five parts, this article advances the conclusion that the duration of investigations is relevant to their substantive Fourth Amendment treatment because duration affects the accuracy of the predictions. Though it was previously difficult to explain why an investigation of four weeks was substantively different from an investigation of four hours, we now have a better understanding of the value of aggregated data when viewed through a machine learning lens. In some situations, predictions of startling accuracy can be generated with remarkably few data points. In other situations, accuracy increases dramatically above certain thresholds. For example, a 2012 study found that the ability to deduce ethnicity remained flat through five weeks of phone data monitoring, jumped sharply to a new plateau at that point, and then increased sharply again after twenty-eight weeks.[10] More remarkably, the accuracy of identification of a target’s significant other improved dramatically after five days’ worth of data inputs.[11] Experiments like these support the notion of a threshold, a point at which it makes sense to draw a Fourth Amendment line.
In order to provide an objective basis for distinguishing between law enforcement activities of differing duration, the results of machine learning algorithms can be combined with privacy metrics such as k-anonymity or l-diversity. While reasonable minds may dispute the most suitable minimum accuracy threshold, this article makes the case that the collection of data points allowing predictions that exceed selected thresholds should be deemed unreasonable searches in the absence of a warrant.[12] Moreover, any new rules should take into account not only the data being collected but also the foreseeable improvements in the machine learning technology that will ultimately be brought to bear on it, including the application of future algorithms to older data.
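To make the k-anonymity metric mentioned above concrete, here is a minimal illustrative sketch (not drawn from the article or the cited studies; the records, field names, and grouping are hypothetical). A dataset is k-anonymous when every combination of quasi-identifiers is shared by at least k records:

```python
from collections import Counter

# Each record is reduced to a tuple of quasi-identifiers (here, coarse
# location cells); k is the size of the smallest indistinguishable group.
def k_anonymity(records, quasi_identifiers):
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

# Hypothetical example: finer-grained tracking (more quasi-identifiers)
# shrinks k, i.e. individuals become easier to single out.
records = [
    {"home_cell": "A", "work_cell": "X"},
    {"home_cell": "A", "work_cell": "X"},
    {"home_cell": "A", "work_cell": "Y"},
]
print(k_anonymity(records, ["home_cell"]))               # 3
print(k_anonymity(records, ["home_cell", "work_cell"]))  # 1
```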
In 2001, the Supreme Court asked “what limits there are upon the power of technology to shrink the realm of guaranteed privacy.”[13] In this piece, we explore an answer and investigate what lessons there are in the power of technology to protect the realm of guaranteed privacy. After all, as technology takes away, it also gives. The objective understanding of data compilation and analysis that is revealed by machine learning provides important Fourth Amendment insights. We should begin to consider these insights more closely.
[1] Katz v. United States, 389 U.S. 347, 361 (1967) (Harlan, J., concurring).
[2] Orin Kerr, The Mosaic Theory of the Fourth Amendment, 111 Mich. L. Rev. 311, 312 (2012).
[3] United States v. Knotts, 460 U.S. 276, 284 (1983).
[4] Justice Scalia writing for the majority left the question open. United States v. Jones, 132 S. Ct. 945, 954 (2012) (“It may be that achieving the same result [as in traditional surveillance] through electronic means, without an accompanying trespass, is an unconstitutional invasion of privacy, but the present case does not require us to answer that question.”).
[5] Compare Knotts, 460 U.S. at 276 (rejecting the contention that an electronic beeper should be treated differently than a human tail) and Smith v. Maryland, 442 U.S. 735, 744 (1979) (approving the warrantless use of a pen register in part because the justices were “not inclined to hold that a different constitutional result is required because the telephone company has decided to automate”) with Kyllo v. United States, 533 U.S. 27, 33 (2001) (recognizing that advances in technology affect the degree of privacy secured by the Fourth Amendment).
[6] United States v. Jones, 132 S. Ct. 945 (2012); see also Kerr, supra note 2, at 329-30.
[7] See Nokia Research Center, Mobile Data Challenge 2012 Workshop, http://research.nokia.com/page/12340.
[8] Sanja Brdar, Dubravko Culibrk & Vladimir Crnojevic, Demographic Attributes Prediction on the Real-World Mobile Data, Nokia Mobile Data Challenge Workshop 2012.
[9] Manlio de Domenico, Antonio Lima & Mirco Musolesi, Interdependence and Predictability of Human Mobility and Social Interactions, Nokia Mobile Data Challenge Workshop 2012.
[10] See Yaniv Altshuler, Nadav Aharony, Michael Fire, Yuval Elovici & Alex Pentland, Incremental Learning with Accuracy Prediction of Social and Individual Properties from Mobile-Phone Data, WS3P, IEEE Social Computing (2012), Figure 10.
[11] Id., Figure 9.
[12] Admittedly, there are differing views on sources of authority beyond the Constitution that might justify location tracking. See, e.g., Stephanie K. Pell & Christopher Soghoian, Can You See Me Now? Toward Reasonable Standards for Law Enforcement Access to Location Data That Congress Could Enact, 27 Berkeley Tech. L.J. 117 (2012).
[13] Kyllo, 533 U.S. at 34.
Belief-Propagation for Weighted b-Matchings on Arbitrary Graphs and its Relation to Linear Programs with Integer Solutions
We consider the general problem of finding the minimum weight b-matching on arbitrary graphs. We prove that, whenever the linear programming (LP) relaxation of the problem has no fractional solutions, the belief propagation (BP) algorithm converges to the correct solution. We also show that when the LP relaxation has a fractional solution, the BP algorithm can be used to solve the LP relaxation. Our proof is based on the notion of graph covers and extends the analysis of Bayati, Shah & Sharma (2005) and Huang & Jebara (2007).
These results are notable in two regards: (1) it is one of a very small number of proofs showing correctness of BP without any constraint on the graph structure; (2) variants of the proof work for both synchronous and asynchronous BP, making it the first proof of convergence and correctness of an asynchronous BP algorithm for a combinatorial optimization problem.

Comment: 28 pages, 2 figures. Submitted to the SIAM Journal on Discrete Mathematics on March 19, 2009; accepted for publication (in revised form) August 30, 2010; published electronically July 1, 2011.
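As a rough, self-contained illustration of the LP relaxation this abstract discusses (not the authors' code; the graph, weights, and b-values below are made up), one can check whether the relaxation of a small minimum-weight b-matching instance has an integral optimum:

```python
import numpy as np
from scipy.optimize import linprog

# One variable x_e in [0, 1] per edge; each vertex v must be matched
# exactly b(v) times. Per the abstract, when the LP optimum is integral,
# BP converges to that solution.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
w = np.array([1.0, 2.0, 1.5, 2.5, 1.0])   # edge weights (illustrative)
b = np.array([1, 1, 1, 1])                 # b(v) = 1: a perfect matching

A_eq = np.zeros((len(b), len(edges)))      # degree constraints, one row per vertex
for j, (u, v) in enumerate(edges):
    A_eq[u, j] = A_eq[v, j] = 1.0

res = linprog(w, A_eq=A_eq, b_eq=b, bounds=(0, 1))
print(res.x)   # [1, 0, 0, 0, 1]: integral, matching {(0,1), (2,3)} of weight 2
```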
A highly osmotolerant rhizobial strain confers a better tolerance of nitrogen fixation and enhances protective activities to nodules of Phaseolus vulgaris under drought stress
The effect of water deficiency on nodules of common bean (Phaseolus vulgaris) inoculated with three rhizobial strains differing in their osmotolerance was investigated in two experiments on sterile sand. In the first experiment, control plants were maintained at 90% field capacity (FC) and water-deficient plants were grown at 35% FC. Nitrogen fixation and growth parameters drastically decreased under water deficiency; however, the three rhizobial strains, Rhizobium etli A32 (sensitive), Rhizobium tropici CIAT899 (tolerant), and Ensifer meliloti 4H41 (highly tolerant), showed different symbiotic performances. E. meliloti 4H41 allowed the best acetylene reduction activity (ARA) and biomass production and the highest number of large-sized nodules, while no significant effect was observed on lipid peroxidation or on protein and leghemoglobin contents, and the effect on antioxidant activities was the lowest. In the second experiment, plants were maintained at 90% FC for 45 days, after which watering was stopped. The results showed that the response to water deficit was quite similar for the three analyzed symbioses down to 35% FC, but below this value the symbiosis involving strain E. meliloti 4H41 was the most tolerant. This tolerance was accompanied, in both experiments, by a stability of metabolic indices and protective antioxidant activities. These results suggest that the relative tolerance of the nodules induced by strain 4H41 could be due to a constitutive adaptation involving a specific cortex structure and stress-adapted metabolic activities acquired during nodule formation and growth, rather than to a timely inducible response due to the stimulation of antioxidant enzymes. This suggestion should be confirmed through microscopic structure analysis and by assaying supplemental key enzymes in nodule metabolism such as sucrose synthase and malate dehydrogenase.

Key words: antioxidant activities, in-pot experiments, leghemoglobin content, nodule, rhizobia, osmotolerance, symbiotic efficiency, water deficiency
Evaluation of the 2019 ACTp Pilot [Final Report]
In 2019, the first Additional Cost of Teaching Pharmacy (ACTp) funded experiential learning (EL) was piloted across Scotland. These pilots ran alongside existing EL in all years of the MPharm in two Scottish Schools of Pharmacy. Sixty undergraduate MPharm students participated in the pilot: 29 from Robert Gordon University (RGU) and 31 from the University of Strathclyde (UoS). A total of 41 sites hosted students, including 18 general practices, 10 community pharmacies, 10 community/specialist hospitals (e.g. mental health, prison service), NHS 24 (two sites), and one ‘combined package’. Two community pharmacies had extended opening hours, and the eight remaining community pharmacies were in remote and rural locations. The sites were distributed across most of the Scottish Health Boards. This evaluation explored stakeholder opinions and experiences of the pilots and identified areas for future improvement.
Global avian influenza surveillance in wild birds: A strategy to capture viral diversity
Wild birds play a major role in the evolution, maintenance, and spread of avian influenza viruses. However, surveillance for these viruses in wild birds is sporadic, geographically biased, and often limited to the last outbreak virus. To identify opportunities to optimize wild bird surveillance for understanding viral diversity, we reviewed responses to a World Organisation for Animal Health-administered survey, government reports to this organization, articles on Web of Knowledge, and the Influenza Research Database. At least 119 countries conducted avian influenza virus surveillance in wild birds during 2008-2013, but coordination and standardization were lacking among surveillance efforts, and most focused on limited subsets of influenza viruses. Given the high financial and public health burdens of recent avian influenza outbreaks, we call for sustained, cost-effective investments in locations with high avian influenza diversity in wild birds and for efforts to promote standardized sampling, testing, and reporting methods, including full-genome sequencing.
Computational Social Science
A field is emerging that leverages the capacity to collect and analyze data at a scale that may reveal patterns of individual and group behaviors.
A Novel Visual Word Co-occurrence Model for Person Re-identification
Person re-identification aims to maintain the identity of an individual in diverse locations through different non-overlapping camera views. The problem is fundamentally challenging due to appearance variations resulting from differing poses, illumination, and configurations of camera views. To deal with these difficulties, we propose a novel visual word co-occurrence model. We first map each pixel of an image to a visual word using a codebook, which is learned in an unsupervised manner. The appearance transformation between camera views is encoded by a co-occurrence matrix of visual word joint distributions in probe and gallery images. Our appearance model naturally accounts for spatial similarities and variations caused by pose, illumination, and configuration changes across camera views. Linear SVMs are then trained as classifiers using these co-occurrence descriptors. On the VIPeR and CUHK Campus benchmark datasets, our method achieves 83.86% and 85.49% at rank-15 on the Cumulative Match Characteristic (CMC) curves, beating the state-of-the-art results by 10.44% and 22.27%, respectively.

Comment: Accepted at ECCV Workshop on Visual Surveillance and Re-Identification, 2014.
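A minimal sketch of the kind of pipeline this abstract describes, under stated assumptions (k-means as the unsupervised codebook and a plain joint-count co-occurrence matrix; the paper's spatial pooling is omitted, and all names are illustrative, not the authors' released code):

```python
import numpy as np
from sklearn.cluster import KMeans

# Learn a codebook of visual words from a pool of per-pixel feature vectors.
def learn_codebook(pixel_features, n_words=64):
    return KMeans(n_clusters=n_words, n_init=10).fit(pixel_features)

# Describe a (probe, gallery) image pair by the joint distribution of their
# visual word assignments; the flattened matrix can feed a linear SVM.
def cooccurrence_descriptor(codebook, probe_pixels, gallery_pixels):
    k = codebook.n_clusters
    a = np.bincount(codebook.predict(probe_pixels), minlength=k)
    b = np.bincount(codebook.predict(gallery_pixels), minlength=k)
    C = np.outer(a, b).astype(float)   # counts of (probe word, gallery word) pairs
    return (C / C.sum()).ravel()
```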