Computing Bits of Algebraic Numbers
We initiate the complexity-theoretic study of the problem of computing the
bits of (real) algebraic numbers. This extends the work of Yap on computing the
bits of transcendental numbers like $\pi$ in Logspace.
Our main result is that computing a bit of a fixed real algebraic number is
in $C_{=}NC^1 \subseteq$ Logspace when the bit position has a verbose (unary)
representation, and in the counting hierarchy when it has a succinct (binary)
representation.
Our tools are drawn from elementary analysis and numerical analysis, and
include the Newton-Raphson method. The proof of our main result is entirely
elementary, relying on the elementary Liouville's theorem rather than the much
deeper Roth's theorem for algebraic numbers.
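As a concrete illustration of the Newton-Raphson tool (a minimal sketch, not the paper's algorithm, and carrying none of its complexity guarantees), the following Python snippet computes bits of the algebraic number $\sqrt{2}$, a root of $x^2 - 2$, using exact rational arithmetic; the function name and the iteration count are illustrative choices of ours.

```python
# Minimal sketch (not the paper's algorithm): Newton-Raphson on
# f(x) = x^2 - 2 with exact rationals, so low-order bits are reliable.
from fractions import Fraction

def sqrt2_bit(i, extra=8):
    """Return the i-th bit after the binary point of sqrt(2), i >= 1."""
    x = Fraction(3, 2)                      # initial guess near the root
    # Newton's method roughly doubles the correct bits each step,
    # so O(log i) iterations suffice for i + extra correct bits.
    for _ in range((i + extra).bit_length()):
        x -= (x * x - 2) / (2 * x)          # x <- x - f(x) / f'(x)
    return int(x * 2**i) & 1                # read the i-th fractional bit

print([sqrt2_bit(i) for i in range(1, 9)])  # sqrt(2) = 1.01101010... in binary
```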
We leave the possibility of proving non-trivial lower bounds for the problem
of computing the bits of an algebraic number, given the bit position in binary,
as our main open question. In this direction, we make very limited progress by
proving a lower bound for rationals.
Testing Uniformity of Stationary Distribution
A random walk on a directed graph gives a Markov chain on the vertices of the
graph. An important question that often arises in the context of Markov chains
is whether the uniform distribution on the vertices of the graph is a
stationary distribution of the Markov chain. The stationary distribution of a
Markov chain is a global property of the graph. In this paper, we prove that
for a regular directed graph, whether the uniform distribution on the vertices
of the graph is a stationary distribution depends on a local property of the
graph, namely: if (u,v) is a directed edge, then outdegree(u) is equal to
indegree(v).
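The local criterion is easy to check directly. A small illustration (variable and function names are ours) for the random walk that moves from a vertex to a uniformly random out-neighbour:

```python
# Check the paper's local criterion: for a regular directed graph, the
# uniform distribution is stationary for the random walk iff, for every
# directed edge (u, v), outdegree(u) == indegree(v).
from collections import defaultdict

def uniform_is_stationary(edges):
    edges = list(edges)
    outdeg, indeg = defaultdict(int), defaultdict(int)
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
    return all(outdeg[u] == indeg[v] for u, v in edges)

# A directed 3-cycle: every vertex has outdegree = indegree = 1.
print(uniform_is_stationary([(0, 1), (1, 2), (2, 0)]))  # True
```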
This result also has an application to the problem of testing whether a given
distribution is uniform or "far" from uniform, a well-studied problem in
property testing and statistics. If the distribution is the stationary
distribution of the lazy random walk on a directed graph and the graph is given
as input, how many bits of the input graph does one need to query in order to
decide whether the distribution is uniform or "far" from it? This is a problem
of graph property testing, and we consider it in the orientation model
(introduced by Halevy et al.). We reduce this problem to testing (in the
orientation model) whether a directed graph is Eulerian. Using a result of
Fischer et al. on the query complexity of testing (in the orientation model)
whether a graph is Eulerian, we obtain bounds on the query complexity of
testing whether the stationary distribution is uniform.
Efficient Compression Technique for Sparse Sets
Recent technological advancements have led to the generation of huge amounts
of data over the web, such as text, images, audio, and video. Most of this data
is high dimensional and sparse; e.g., the bag-of-words representation used for
representing text. In many applications, such as clustering, nearest neighbour
search, ranking, and indexing, an efficient search for similar data points
needs to be performed. Even though there have been significant increases in
computational power, a simple brute-force similarity search on such datasets is
inefficient and at times impossible. Thus, it is desirable to get a compressed
representation which preserves the similarity between data points. In this
work, we consider the data points as sets and use Jaccard similarity as the
similarity measure. Compression techniques are generally evaluated on the
following parameters: 1) the randomness required for compression, 2) the time
required for compression, 3) the dimension of the data after compression, and
4) the space required to store the compressed data. Ideally, the compressed
representation of the data should be such that the similarity between each pair
of data points is preserved, while keeping the time and the randomness required
for compression as low as possible.
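For reference, the similarity measure being preserved is the Jaccard similarity of sets, J(A, B) = |A ∩ B| / |A ∪ B|; a small Python illustration:

```python
# Jaccard similarity of two sets: |A intersection B| / |A union B|.
def jaccard(a, b):
    return len(a & b) / len(a | b) if (a or b) else 1.0

doc1 = {"the", "cat", "sat"}   # toy bag-of-words sets
doc2 = {"the", "cat", "ran"}
print(jaccard(doc1, doc2))     # 2 / 4 = 0.5
```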
We show that the compression technique suggested by Pratap and Kulkarni also
works well for Jaccard similarity. We present a theoretical proof of this and
complement it with rigorous experiments on synthetic as well as real-world
datasets. We also compare our results with the state-of-the-art "min-wise
independent permutation" and show that our compression algorithm achieves
almost equal accuracy while significantly reducing the compression time and the
randomness required.
Improved Outlier Robust Seeding for k-means
The $k$-means is a popular clustering objective, although it is inherently
non-robust and sensitive to outliers. Its popular seeding or initialization,
called $k$-means++, uses $D^2$ sampling and comes with a provable $O(\log k)$
approximation guarantee \cite{AV2007}. However, in the presence of adversarial
noise or outliers, $D^2$ sampling is more likely to pick centers from distant
outliers instead of inlier clusters, and therefore its approximation guarantee
\textit{w.r.t.} the $k$-means solution on the inliers does not hold.
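For context, here is a minimal sketch of the standard, non-robust $D^2$ seeding of $k$-means++ (the baseline that this paper's variant modifies, not the proposed robust procedure); sampling proportional to the squared distance from the current centers is precisely why distant outliers are over-sampled.

```python
# Standard k-means++ D^2 seeding (the non-robust baseline): each new
# center is sampled with probability proportional to the squared
# distance to the nearest center chosen so far.
import numpy as np

def d2_seeding(X, k, rng=np.random.default_rng(0)):
    centers = [X[rng.integers(len(X))]]              # first center: uniform
    for _ in range(k - 1):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centers)
```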
Assuming that the outliers constitute a constant fraction of the given data,
we propose a simple variant of the $D^2$ sampling distribution, which makes it
robust to the outliers. Our algorithm runs in time linear in the size of the
data, outputs $O(k)$ clusters, discards marginally more points than the optimal
number of outliers, and comes with a provable approximation guarantee.
Our algorithm can also be modified to output exactly $k$ clusters instead of
$O(k)$ clusters, while keeping its running time linear in $n$ and $d$. This is
an improvement over previous results for robust $k$-means based on LP
relaxation and rounding \cite{Charikar}, \cite{KrishnaswamyLS18} and
\textit{robust $k$-means++} \cite{DeshpandeKP20}. Our empirical results show
the advantage of our algorithm over $k$-means++~\cite{AV2007}, uniform random
seeding, greedy sampling for $k$-means~\cite{tkmeanspp}, and robust
$k$-means++~\cite{DeshpandeKP20}, on standard real-world and synthetic data
sets used in previous work. Our proposal is easily amenable to scalable,
faster, parallel implementations of $k$-means++ \cite{Bahmani,BachemL017} and
is of independent interest for coreset constructions in the presence of
outliers \cite{feldman2007ptas,langberg2010universal,feldman2011unified}.
Minwise-Independent Permutations with Insertion and Deletion of Features
In their seminal work, Broder \textit{et al.}~\citep{BroderCFM98} introduce
the minwise hashing (minhash) algorithm, which computes a low-dimensional
sketch of high-dimensional binary data that closely approximates pairwise
Jaccard similarity. Since its invention, minhash has been commonly used by
practitioners in various big data applications. Further, the data is dynamic in
many real-life scenarios, and feature sets evolve over time. We consider the
case when features are dynamically inserted and deleted in the dataset. We note
that a naive solution to this problem is to repeatedly recompute minhash with
respect to the updated dimension. However, this is an expensive task, as it
requires generating fresh random permutations. To the best of our knowledge, no
systematic study of minhash is recorded in the context of dynamic insertion and
deletion of features. In this work, we initiate this study and suggest
algorithms that make the minhash sketches adaptable to the dynamic insertion
and deletion of features. We give a rigorous theoretical analysis of our
algorithms and complement it with extensive experiments on several real-world
datasets. Empirically, we observe a significant speed-up in running time while
offering performance comparable to running minhash from scratch. Our proposal
is efficient, accurate, and easy to implement in practice.
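A hedged, simplified illustration of the static sketch that the paper makes dynamic: universal hash functions stand in for minwise-independent permutations, and the agreement rate of two sketches estimates Jaccard similarity. The function names and parameters below are ours, not the paper's.

```python
# Simplified minhash: random linear hash functions approximate random
# permutations; the fraction of agreeing sketch coordinates estimates
# the Jaccard similarity of the underlying sets.
import random

P = 2_147_483_647                      # a prime larger than the universe

def minhash_sketch(s, num_hashes=100, seed=42):
    rnd = random.Random(seed)
    params = [(rnd.randrange(1, P), rnd.randrange(P)) for _ in range(num_hashes)]
    return [min((a * x + b) % P for x in s) for a, b in params]

def estimate_jaccard(sig1, sig2):
    return sum(u == v for u, v in zip(sig1, sig2)) / len(sig1)

A, B = {1, 2, 3, 4}, {2, 3, 4, 5}      # true Jaccard = 3/5
print(estimate_jaccard(minhash_sketch(A), minhash_sketch(B)))
```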
Efficient Sketching Algorithm for Sparse Binary Data
Recent advancements in the WWW, IoT, social networks, e-commerce, etc. have
generated a large volume of data. These datasets are mostly high dimensional
and sparse. Many fundamental subroutines of common data-analytic tasks, such as
clustering, classification, ranking, and nearest neighbour search, scale poorly
with the dimension of the dataset. In this work, we address this problem and
propose a sketching (alternatively, dimensionality reduction) algorithm --
\binsketch (Binary Data Sketch) -- for sparse binary datasets. \binsketch
preserves the binary nature of the dataset after sketching and maintains
estimates for multiple similarity measures, such as Jaccard, Cosine, and
Inner-Product similarities and Hamming distance, on the same sketch. We present
a theoretical analysis of our algorithm and complement it with extensive
experiments on several real-world datasets. We compare the performance of our
algorithm with the state-of-the-art algorithms on mean-square-error and ranking
tasks. Our proposed algorithm offers comparable accuracy while achieving a
significant speedup in dimensionality reduction time with respect to the other
candidate algorithms. Our proposal is simple and easy to implement, and can
therefore be adopted in practice.
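As a rough illustration of how a sketch can stay binary (our reading of a bucket-and-OR idea; the actual \binsketch construction and its similarity estimators are specified in the paper), each coordinate is hashed to a bucket and each bucket stores the OR of the bits mapped to it:

```python
# Hedged sketch of bucket-and-OR binary compression: hash each of the
# d coordinates to one of n buckets; a bucket holds the OR of its bits,
# so the compressed vector is itself binary.
import random

def binary_sketch(nonzero_coords, d, n, seed=7):
    rnd = random.Random(seed)
    bucket = [rnd.randrange(n) for _ in range(d)]   # random map [d] -> [n]
    sketch = [0] * n
    for i in nonzero_coords:                        # sparse input: 1-bits only
        sketch[bucket[i]] = 1
    return sketch

print(binary_sketch({3, 10, 57}, d=100, n=16))
```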
One-Pass Additive-Error Subset Selection for ℓ_p Subspace Approximation
We consider the problem of subset selection for ℓ_p subspace approximation, that is, to efficiently find a small subset of data points such that solving the problem optimally for this subset gives a good approximation to solving the problem optimally for the original input. Previously known subset selection algorithms based on volume sampling and adaptive sampling [Deshpande and Varadarajan, 2007], for the general case of p ∈ [1, ∞), require multiple passes over the data. In this paper, we give a one-pass subset selection with an additive approximation guarantee for ℓ_p subspace approximation, for any p ∈ [1, ∞). Earlier subset selection algorithms that give a one-pass multiplicative (1+ε) approximation work only in special cases. Cohen et al. [Michael B. Cohen et al., 2017] give a one-pass subset selection that offers a multiplicative (1+ε) approximation guarantee for the special case of ℓ_∞ subspace approximation. Mahabadi et al. [Sepideh Mahabadi et al., 2020] give a one-pass noisy subset selection with a (1+ε) approximation guarantee for ℓ_p subspace approximation when p ∈ {1, 2}. Our subset selection algorithm gives a weaker, additive approximation guarantee, but it works for any p ∈ [1, ∞).
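As a purely illustrative sketch of the flavour of such an algorithm (an assumption of ours, not the paper's procedure or analysis): one classical route to one-pass subset selection is to sample rows with probability proportional to ||a_i||_p^p via weighted reservoir sampling. All names and the sampling distribution below are hypothetical.

```python
# Hypothetical one-pass sampler in the spirit of additive-error subset
# selection: stream the rows once, keep s rows sampled with probability
# proportional to ||a_i||_p^p using Efraimidis-Spirakis reservoir keys.
# The paper's actual distribution and guarantees may differ.
import heapq
import numpy as np

def one_pass_subset(rows, s, p, rng=np.random.default_rng(0)):
    reservoir = []                               # min-heap of (key, index, row)
    for idx, a in enumerate(rows):               # single pass over the stream
        w = np.linalg.norm(a, ord=p) ** p        # sampling weight ||a||_p^p
        if w == 0:
            continue
        key = rng.random() ** (1.0 / w)          # larger key = more likely kept
        if len(reservoir) < s:
            heapq.heappush(reservoir, (key, idx, a))
        elif key > reservoir[0][0]:
            heapq.heapreplace(reservoir, (key, idx, a))
    return [a for _, _, a in reservoir]
```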
Optical coherence tomography and subclinical optic neuritis in longitudinally extensive transverse myelitis
Objective: The aim is to compare the retinal nerve fiber layer (RNFL) thickness of longitudinally extensive transverse myelitis (LETM) eyes without previous optic neuritis with that of healthy control subjects. Methods: Twenty LETM eyes and 20 normal control eyes were included in the study and subjected to optical coherence tomography to evaluate and compare the RNFL thickness. Results: Significant RNFL thinning was observed at the 8 o'clock position in LETM eyes as compared to the control eyes (P = 0.038). No significant differences were seen in other RNFL measurements. Conclusion: Even in the absence of previous optic neuritis, LETM can lead to subclinical axonal damage, causing focal RNFL thinning.