Tree Dependent Identically Distributed Learning
We view a dataset of points or samples as having an underlying, yet unspecified, tree structure and exploit this assumption in learning problems. Such a tree structure assumption is equivalent to treating a dataset as being tree dependent identically distributed (tdid) and preserves exchangeability. This extends traditional iid assumptions on data, since each datum can be sampled sequentially after being conditioned on a parent. Instead of hypothesizing a single best tree structure, we infer a richer Bayesian posterior distribution over tree structures from a given dataset. We compute this posterior over (directed or undirected) trees via the Laplacian of conditional distributions between pairs of input data points. This posterior distribution is efficiently normalized by the Laplacian's determinant and also facilitates novel maximum likelihood estimators, efficient expectations, and other useful inference computations. In a classification setting, tdid assumptions yield a criterion that maximizes the determinant of a matrix of conditional distributions between pairs of input and output points. This leads to a novel classification algorithm we call the Maximum Determinant Machine. Unsupervised and supervised experiments are shown.
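As a rough illustration of the normalization step this abstract describes (a sketch under assumptions, not the authors' code; the weight matrix W and its interpretation are illustrative), the Matrix-Tree theorem turns the sum over all spanning trees into a single determinant:

```python
import numpy as np

# Minimal sketch: for a symmetric weight matrix W, where W[i, j] might hold
# a pairwise conditional likelihood, the total weight of all spanning trees
# equals a cofactor of the graph Laplacian, so a posterior over trees can be
# normalized without enumerating trees.
def spanning_tree_partition(W):
    W = W * (1.0 - np.eye(len(W)))      # zero the diagonal: no self-loops
    L = np.diag(W.sum(axis=0)) - W      # graph Laplacian
    return np.linalg.det(L[1:, 1:])     # delete row/col 0 to get a cofactor

# Example: the complete graph on 3 nodes with unit weights has 3 spanning trees.
print(spanning_tree_partition(np.ones((3, 3))))   # 3.0
```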
When Enough is Enough: Location Tracking, Mosaic Theory, and Machine Learning
Since 1967, when it decided Katz v. United States, the Supreme Court has tied the right to be free of unwanted government scrutiny to the concept of reasonable expectations of privacy.[1] An evaluation of reasonable expectations depends, among other factors, upon an assessment of the intrusiveness of government action. When making such an assessment, the Court has historically considered police conduct with clear temporal, geographic, or substantive limits. However, in an era where new technologies permit the storage and compilation of vast amounts of personal data, things are becoming more complicated. A school of thought known as “mosaic theory” has stepped into the void, sounding the alarm that our old tools for assessing the intrusiveness of government conduct potentially undervalue privacy rights.
Mosaic theorists advocate a cumulative approach to the evaluation of data collection. Under the theory, searches are “analyzed as a collective sequence of steps rather than as individual steps.”[2] The approach is based on the recognition that comprehensive aggregation of even seemingly innocuous data reveals greater insight than consideration of each piece of information in isolation. Over time, discrete units of surveillance data can be processed to create a mosaic of habits, relationships, and much more. Consequently, a Fourth Amendment analysis that focuses only on the government’s collection of discrete units of trivial data fails to appreciate the true harm of long-term surveillance—the composite.
In the context of location tracking, the Court has previously suggested that the Fourth Amendment may (at some theoretical threshold) be concerned with the accumulated information revealed by surveillance.[3] Similarly, in the Court’s recent decision in United States v. Jones, a majority of concurring justices indicated willingness to explore such an approach.[4] However, in general, the Court has rejected any notion that technological enhancement matters to the constitutional treatment of location tracking.[5] Rather, it has found that such surveillance in public spaces, which does not require physical trespass, is equivalent to a human tail and thus not regulated by the Fourth Amendment. In this way, the Court has avoided quantitative analysis of the amendment’s protections.
The Court’s reticence is built on the enticingly direct assertion that objectivity under the mosaic theory is impossible. This is true in large part because no rationale has yet been offered to objectively distinguish relatively short-term monitoring from monitoring of greater duration. As Justice Scalia recently observed in Jones: “it remains unexplained why a 4-week investigation is ‘surely’ too long.”[6] This article suggests that, by combining the lessons of machine learning with the mosaic theory and applying the pairing to the Fourth Amendment, we can see the contours of a response. Machine learning makes clear that mosaics can be created. Moreover, it offers important lessons about when that is the case.
Machine learning is the branch of computer science that studies systems that can draw inferences from collections of data, generally by means of mathematical algorithms. In a recent competition called “The Nokia Mobile Data Challenge,”[7] researchers evaluated machine learning’s applicability to GPS and cell phone tower data. From a user’s location history alone, the researchers were able to estimate the user’s gender, marital status, occupation and age.[8] Algorithms developed for the competition were also able to predict a user’s likely future location by observing past location history. The prediction of a user’s future location could be even further improved by using the location data of friends and social contacts.[9]
Machine learning of the sort on display during the Nokia competition seeks to harness the data deluge of today’s information society by efficiently organizing data, finding statistical regularities and other patterns in it, and making predictions therefrom. Machine learning algorithms are able to deduce information—including information that has no obvious linkage to the input data—that may otherwise have remained private due to the natural limitations of manual and human-driven investigation. Analysts can “train” machine learning programs using one dataset to find similar characteristics in new datasets. When applied to the digital “bread crumbs” of data generated by people, machine learning algorithms can make targeted personal predictions. The greater the number of data points evaluated, the greater the accuracy of the algorithm’s results.
In five parts, this article advances the conclusion that the duration of investigations is relevant to their substantive Fourth Amendment treatment because duration affects the accuracy of the predictions. Though it was previously difficult to explain why an investigation of four weeks was substantively different from an investigation of four hours, we now have a better understanding of the value of aggregated data when viewed through a machine learning lens. In some situations, predictions of startling accuracy can be generated with remarkably few data points. In other situations, accuracy increases dramatically above certain thresholds. For example, a 2012 study found that the ability to deduce ethnicity remained flat through five weeks of phone data monitoring, jumped sharply to a new plateau at that point, and then increased sharply again after twenty-eight weeks.[10] More remarkably, the accuracy of identification of a target’s significant other improved dramatically after five days’ worth of data inputs.[11] Experiments like these support the notion of a threshold, a point at which it makes sense to draw a Fourth Amendment line.
In order to provide an objective basis for distinguishing between law enforcement activities of differing duration, the results of machine learning algorithms can be combined with privacy metrics such as k-anonymity or l-diversity. While reasonable minds may dispute the most suitable minimum accuracy threshold, this article makes the case that the collection of data points allowing predictions that exceed selected thresholds should be deemed unreasonable searches in the absence of a warrant.[12] Moreover, any new rules should take into account not only the data being collected but also the foreseeable improvements in the machine learning technology that will ultimately be brought to bear on it, including the application of future algorithms to older data.
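To make the k-anonymity metric mentioned above concrete, here is a minimal illustrative sketch (not drawn from the article or the cited studies; the records, field names, and grouping are hypothetical). A dataset is k-anonymous when every combination of quasi-identifiers is shared by at least k records:

```python
from collections import Counter

# Each record is reduced to a tuple of quasi-identifiers (here, coarse
# location cells); k is the size of the smallest indistinguishable group.
def k_anonymity(records, quasi_identifiers):
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

# Hypothetical example: finer-grained tracking (more quasi-identifiers)
# shrinks k, i.e. individuals become easier to single out.
records = [
    {"home_cell": "A", "work_cell": "X"},
    {"home_cell": "A", "work_cell": "X"},
    {"home_cell": "A", "work_cell": "Y"},
]
print(k_anonymity(records, ["home_cell"]))               # 3
print(k_anonymity(records, ["home_cell", "work_cell"]))  # 1
```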
In 2001, the Supreme Court asked “what limits there are upon the power of technology to shrink the realm of guaranteed privacy.”[13] In this piece, we explore an answer and investigate what lessons there are in the power of technology to protect the realm of guaranteed privacy. After all, as technology takes away, it also gives. The objective understanding of data compilation and analysis that is revealed by machine learning provides important Fourth Amendment insights. We should begin to consider these insights more closely.
[1] Katz v. United States, 389 U.S. 347, 361 (1967) (Harlan, J., concurring).
[2] Orin Kerr, The Mosaic Theory of the Fourth Amendment, 111 Mich. L. Rev. 311, 312 (2012).
[3] United States v. Knotts, 460 U.S. 276, 284 (1983).
[4] Justice Scalia writing for the majority left the question open. United States v. Jones, 132 S. Ct. 945, 954 (2012) (“It may be that achieving the same result [as in traditional surveillance] through electronic means, without an accompanying trespass, is an unconstitutional invasion of privacy, but the present case does not require us to answer that question.”).
[5] Compare Knotts, 460 U.S. at 276 (rejecting the contention that an electronic beeper should be treated differently than a human tail) and Smith v. Maryland, 442 U.S. 735, 744 (1979) (approving the warrantless use of a pen register in part because the justices were “not inclined to hold that a different constitutional result is required because the telephone company has decided to automate”) with Kyllo v. United States, 533 U.S. 27, 33 (2001) (recognizing that advances in technology affect the degree of privacy secured by the Fourth Amendment).
[6] United States v. Jones, 132 S. Ct. 945 (2012); see also Kerr, supra note 2, at 329-30.
[7] See Nokia Research Center, Mobile Data Challenge 2012 Workshop, http://research.nokia.com/page/12340.
[8] Sanja Brdar, Dubravko Culibrk & Vladimir Crnojevic, Demographic Attributes Prediction on the Real-World Mobile Data, Nokia Mobile Data Challenge Workshop 2012.
[9] Manlio de Domenico, Antonio Lima & Mirco Musolesi, Interdependence and Predictability of Human Mobility and Social Interactions, Nokia Mobile Data Challenge Workshop 2012.
[10] See Yaniv Altshuler, Nadav Aharony, Michael Fire, Yuval Elovici & Alex Pentland, Incremental Learning with Accuracy Prediction of Social and Individual Properties from Mobile-Phone Data, WS3P, IEEE Social Computing (2012), Figure 10.
[11] Id., Figure 9.
[12] Admittedly, there are differing views on sources of authority beyond the Constitution that might justify location tracking. See, e.g., Stephanie K. Pell & Christopher Soghoian, Can You See Me Now? Toward Reasonable Standards for Law Enforcement Access to Location Data That Congress Could Enact, 27 Berkeley Tech. L.J. 117 (2012).
[13] Kyllo, 533 U.S. at 34.
Belief-Propagation for Weighted b-Matchings on Arbitrary Graphs and its Relation to Linear Programs with Integer Solutions
We consider the general problem of finding the minimum weight b-matching on arbitrary graphs. We prove that, whenever the linear programming (LP) relaxation of the problem has no fractional solutions, the belief propagation (BP) algorithm converges to the correct solution. We also show that when the LP relaxation has a fractional solution, the BP algorithm can be used to solve the LP relaxation. Our proof is based on the notion of graph covers and extends the analysis of Bayati, Shah & Sharma (2005) and Huang & Jebara (2007).
These results are notable in two regards: (1) it is one of a very small number of proofs showing correctness of BP without any constraint on the graph structure; (2) variants of the proof work for both synchronous and asynchronous BP, making it the first proof of convergence and correctness of an asynchronous BP algorithm for a combinatorial optimization problem.

Comment: 28 pages, 2 figures. Submitted to the SIAM Journal on Discrete Mathematics on March 19, 2009; accepted for publication (in revised form) August 30, 2010; published electronically July 1, 2011.
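As a rough, self-contained illustration of the LP relaxation this abstract discusses (not the authors' code; the graph, weights, and b-values below are made up), one can check whether the relaxation of a small minimum-weight b-matching instance has an integral optimum:

```python
import numpy as np
from scipy.optimize import linprog

# One variable x_e in [0, 1] per edge; each vertex v must be matched
# exactly b(v) times. Per the abstract, when the LP optimum is integral,
# BP converges to that solution.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
w = np.array([1.0, 2.0, 1.5, 2.5, 1.0])   # edge weights (illustrative)
b = np.array([1, 1, 1, 1])                 # b(v) = 1: a perfect matching

A_eq = np.zeros((len(b), len(edges)))      # degree constraints, one row per vertex
for j, (u, v) in enumerate(edges):
    A_eq[u, j] = A_eq[v, j] = 1.0

res = linprog(w, A_eq=A_eq, b_eq=b, bounds=(0, 1))
print(res.x)   # [1, 0, 0, 0, 1]: integral, matching {(0,1), (2,3)} of weight 2
```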
A highly osmotolerant rhizobial strain confers a better tolerance of nitrogen fixation and enhances protective activities to nodules of Phaseolus vulgaris under drought stress
The effect of water deficiency on nodules of common bean (Phaseolus vulgaris) inoculated with three rhizobial strains differing in their osmotolerance was investigated in two experiments on sterile sand. In the first experiment, control plants were maintained at 90% field capacity (FC) and water-deficient plants were grown at 35% FC. Nitrogen fixation and growth parameters drastically decreased under water deficiency; however, the three rhizobial strains, Rhizobium etli A32 (sensitive), Rhizobium tropici CIAT899 (tolerant), and Ensifer meliloti 4H41 (highly tolerant), showed different symbiotic performances. E. meliloti 4H41 allowed the best acetylene reduction activity (ARA) and biomass production and the highest number of large-sized nodules, while no significant effect was observed on lipid peroxidation or on protein and leghemoglobin contents, and the effect on antioxidant activities was the lowest. In the second experiment, plants were maintained at 90% FC for 45 days, after which watering was stopped. The results showed that the response to water deficit was quite similar for the three analyzed symbioses down to 35% FC, but below this value the symbiosis involving strain E. meliloti 4H41 was the most tolerant. This tolerance was accompanied, in both experiments, by a stability of metabolic indices and protective antioxidant activities. These results suggest that the relative tolerance of the nodules induced by strain 4H41 could be due to a constitutive adaptation involving a specific cortex structure and stress-adapted metabolic activities acquired during nodule formation and growth, rather than to a timely inducible response due to the stimulation of antioxidant enzymes. This suggestion should be confirmed through microscopic structure analysis and by assaying supplemental key enzymes in nodule metabolism such as sucrose synthase and malate dehydrogenase.

Key words: antioxidant activities, in-pot experiments, leghemoglobin content, nodule, rhizobia, osmotolerance, symbiotic efficiency, water deficiency
Evaluation of the 2019 ACTp Pilot [Final Report]
In 2019, the first Additional Cost of Teaching Pharmacy (ACTp) funded experiential learning (EL) was piloted across Scotland. These pilots ran alongside existing EL in all years of the MPharm in two Scottish Schools of Pharmacy. Sixty undergraduate MPharm students participated in the pilot: 29 from Robert Gordon University (RGU) and 31 from the University of Strathclyde (UoS). A total of 41 sites hosted students, including 18 general practices, 10 community pharmacies, 10 community/specialist hospitals (e.g. mental health, prison service), NHS 24 (two sites), and one ‘combined package’. Two community pharmacies had extended opening hours, and the eight remaining community pharmacies were in remote and rural locations. The sites were distributed across most of the Scottish Health Boards. This evaluation explored stakeholder opinions and experiences of the pilots and identified areas for future improvement.
Global avian influenza surveillance in wild birds: A strategy to capture viral diversity
Wild birds play a major role in the evolution, maintenance, and spread of avian influenza viruses. However, surveillance for these viruses in wild birds is sporadic, geographically biased, and often limited to the last outbreak virus. To identify opportunities to optimize wild bird surveillance for understanding viral diversity, we reviewed responses to a World Organisation for Animal Health-administered survey, government reports to this organization, articles on Web of Knowledge, and the Influenza Research Database. At least 119 countries conducted avian influenza virus surveillance in wild birds during 2008-2013, but coordination and standardization were lacking among surveillance efforts, and most focused on limited subsets of influenza viruses. Given the high financial and public health burdens of recent avian influenza outbreaks, we call for sustained, cost-effective investments in locations with high avian influenza diversity in wild birds and for efforts to promote standardized sampling, testing, and reporting methods, including full-genome sequencing.
Computational Social Science
A field is emerging that leverages the capacity to collect and analyze data at a scale that may reveal patterns of individual and group behaviors.
A Novel Visual Word Co-occurrence Model for Person Re-identification
Person re-identification aims to maintain the identity of an individual in diverse locations through different non-overlapping camera views. The problem is fundamentally challenging due to appearance variations resulting from differing poses, illumination, and configurations of camera views. To deal with these difficulties, we propose a novel visual word co-occurrence model. We first map each pixel of an image to a visual word using a codebook, which is learned in an unsupervised manner. The appearance transformation between camera views is encoded by a co-occurrence matrix of visual word joint distributions in probe and gallery images. Our appearance model naturally accounts for spatial similarities and variations caused by pose, illumination, and configuration changes across camera views. Linear SVMs are then trained as classifiers using these co-occurrence descriptors. On the VIPeR and CUHK Campus benchmark datasets, our method achieves 83.86% and 85.49% at rank-15 on the Cumulative Match Characteristic (CMC) curves, beating the state-of-the-art results by 10.44% and 22.27%, respectively.

Comment: Accepted at ECCV Workshop on Visual Surveillance and Re-Identification, 2014.
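A minimal sketch of the kind of pipeline this abstract describes, under stated assumptions (k-means as the unsupervised codebook and a plain joint-count co-occurrence matrix; the paper's spatial pooling is omitted, and all names are illustrative, not the authors' released code):

```python
import numpy as np
from sklearn.cluster import KMeans

# Learn a codebook of visual words from a pool of per-pixel feature vectors.
def learn_codebook(pixel_features, n_words=64):
    return KMeans(n_clusters=n_words, n_init=10).fit(pixel_features)

# Describe a (probe, gallery) image pair by the joint distribution of their
# visual word assignments; the flattened matrix can feed a linear SVM.
def cooccurrence_descriptor(codebook, probe_pixels, gallery_pixels):
    k = codebook.n_clusters
    a = np.bincount(codebook.predict(probe_pixels), minlength=k)
    b = np.bincount(codebook.predict(gallery_pixels), minlength=k)
    C = np.outer(a, b).astype(float)   # counts of (probe word, gallery word) pairs
    return (C / C.sum()).ravel()
```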