Exploring the synergistic potential of quantum annealing and gate model computing for portfolio optimization
Portfolio optimization is one of the most studied problems for demonstrating
the near-term applications of quantum computing. However, large-scale problems
cannot be solved on today's quantum hardware. In this work, we extend upon a
study to use the best of both quantum annealing and gate-based quantum
computing systems to enable solving large-scale optimization problems
efficiently on the available hardware. The existing work uses a method called
Large System Sampling Approximation (LSSA) that involves dividing the large
problem into several smaller problems and then combining the multiple solutions
to approximate the solution to the original problem. This paper introduces a
novel technique to modify the sampling step of LSSA. We divide the portfolio
optimization problem into sub-systems of smaller sizes by selecting a diverse
set of assets that act as representatives of the entire market and capture the
highest correlations among assets. We conduct tests on real-world stock data
from the Indian stock market on up to 64 assets. Our experimentation shows that
the hybrid approach performs at par with the traditional classical optimization
methods with a good approximation ratio. We also demonstrate the effectiveness
of our approach on a range of portfolio optimization problems of different
sizes. We present the effects of different parameters on the proposed method
and compare its performance with the earlier work. Our findings suggest that
hybrid annealer-gate quantum computing can be a valuable tool for portfolio
managers seeking to optimize their investment portfolios in the near future. Comment: 12 pages, 4 figures, 1 table
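The divide-and-combine idea behind LSSA can be sketched entirely classically. The snippet below is a minimal illustration, not the paper's method: it partitions assets into contiguous groups (whereas the paper selects a diverse set of representative assets capturing the highest correlations), brute-forces each small sub-QUBO in place of a quantum annealer or gate-model sampler, and stitches the sub-solutions into one candidate portfolio. The names `solve_subproblem` and `lssa_approximate` are invented for illustration.

```python
import itertools

def solve_subproblem(Q, idx):
    """Brute-force the QUBO restricted to the assets in idx.
    Stands in for the quantum annealer / gate-model sampler."""
    best, best_e = None, float("inf")
    for bits in itertools.product([0, 1], repeat=len(idx)):
        e = sum(Q[idx[a]][idx[b]] * bits[a] * bits[b]
                for a in range(len(idx)) for b in range(len(idx)))
        if e < best_e:
            best, best_e = bits, e
    return dict(zip(idx, best))

def lssa_approximate(Q, group_size=4):
    """Divide the full n-asset problem into sub-systems of at most
    group_size assets and combine the sub-solutions."""
    n = len(Q)
    solution = {}
    for start in range(0, n, group_size):
        idx = list(range(start, min(start + group_size, n)))
        solution.update(solve_subproblem(Q, idx))
    return [solution[i] for i in range(n)]
```

The key trade-off the abstract describes is visible here: each sub-problem fits on small hardware, and the quality of the combined solution depends on how well the grouping captures cross-group correlations.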
Campaign Ad - Betty Sutton
Campaign ad for Political Science class - PSCI 217 - Media and Politics
PlantDoc: A Dataset for Visual Plant Disease Detection
India loses 35% of the annual crop yield due to plant diseases. Early
detection of plant diseases remains difficult due to the lack of lab
infrastructure and expertise. In this paper, we explore the possibility of
computer vision approaches for scalable and early plant disease detection. The
lack of availability of sufficiently large-scale non-lab data set remains a
major challenge for enabling vision based plant disease detection. Against this
background, we present PlantDoc: a dataset for visual plant disease detection.
Our dataset contains 2,598 data points in total across 13 plant species and up
to 17 classes of diseases, involving approximately 300 human hours of effort in
annotating internet scraped images. To show the efficacy of our dataset, we
learn 3 models for the task of plant disease classification. Our results show
that modelling using our dataset can increase the classification accuracy by up
to 31%. We believe that our dataset can help reduce the entry barrier of
computer vision techniques in plant disease detection. Comment: 5 pages, 6 figures, 3 tables
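Training a classifier on a dataset like this typically starts from a folder-per-class layout. The sketch below assumes such a layout for illustration (the abstract does not specify PlantDoc's on-disk format) and builds labelled (path, label) pairs with a deterministic train/validation split, using only the standard library.

```python
import os
import random

def index_dataset(root, seed=0, val_frac=0.2):
    """Index a folder-per-class image dataset: each subdirectory of
    root is one class. Returns (train, val, label_map)."""
    classes = sorted(d for d in os.listdir(root)
                     if os.path.isdir(os.path.join(root, d)))
    label = {c: i for i, c in enumerate(classes)}
    samples = [(os.path.join(root, c, f), label[c])
               for c in classes
               for f in sorted(os.listdir(os.path.join(root, c)))]
    rng = random.Random(seed)      # fixed seed -> reproducible split
    rng.shuffle(samples)
    n_val = int(len(samples) * val_frac)
    return samples[n_val:], samples[:n_val], label
```

The (path, label) pairs can then be fed to any image-classification training loop.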
Sampling Semantic Data Stream: Resolving Overload and Limited Storage Issues
Semantic Web technologies are increasingly used to exploit relations between data. At the same time, real-time systems such as social networks, sensors, cameras, and weather services continuously generate data, so both the data and the links between them are becoming extremely vast. Such huge quantities of data need to be analyzed and processed, as well as stored if necessary. In this paper, we propose sampling operators that drop RDF triples from the incoming data, thereby reducing the load on existing engines such as CQELS and C-SPARQL, which are able to deal with big and linked data. As a result, processing effort, time, and required storage space are reduced remarkably. We propose Uniform Random Sampling, Reservoir Sampling, and Chain Sampling operators, which may be implemented depending on the application
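Of the three operators named, Reservoir Sampling is the most standard: it maintains a uniform random sample of fixed size k over a stream of unknown length in O(k) memory, which is exactly the overload-and-limited-storage setting the abstract describes. A minimal version over a stream of triples (the triple representation here is an assumption; the paper targets RDF triples):

```python
import random

def reservoir_sample(stream, k, seed=None):
    """Keep a uniform random sample of k items from a stream of
    unknown length, replacing stored items with probability k/(i+1)."""
    rng = random.Random(seed)
    reservoir = []
    for i, triple in enumerate(stream):
        if i < k:
            reservoir.append(triple)       # fill phase
        else:
            j = rng.randint(0, i)          # uniform in [0, i]
            if j < k:                      # happens with prob. k/(i+1)
                reservoir[j] = triple
    return reservoir
```

A Uniform Random Sampling operator would instead keep each incoming triple independently with a fixed probability, trading a fixed sample size for constant per-item work.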
Revisiting Prompt Engineering via Declarative Crowdsourcing
Large language models (LLMs) are incredibly powerful at comprehending and
generating data in the form of text, but are brittle and error-prone. There has
been an advent of toolkits and recipes centered around so-called prompt
engineering-the process of asking an LLM to do something via a series of
prompts. However, for LLM-powered data processing workflows, in particular,
optimizing for quality, while keeping cost bounded, is a tedious, manual
process. We put forth a vision for declarative prompt engineering. We view LLMs
like crowd workers and leverage ideas from the declarative crowdsourcing
literature-including leveraging multiple prompting strategies, ensuring
internal consistency, and exploring hybrid-LLM-non-LLM approaches-to make
prompt engineering a more principled process. Preliminary case studies on
sorting, entity resolution, and imputation demonstrate the promise of our
approach.
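The crowdsourcing analogy suggests a concrete aggregation primitive: treat each prompting strategy as a separate worker and take a majority vote for internal consistency. The sketch below is an illustration of that idea, not the paper's system; `llm` is a hypothetical callable standing in for an LLM query.

```python
from collections import Counter

def majority_vote(answers):
    """Aggregate answers from several prompting strategies (analogous
    to multiple crowd workers) into one label plus an agreement score."""
    counts = Counter(answers)
    winner, n = counts.most_common(1)[0]
    return winner, n / len(answers)

def ask_with_ensemble(llm, item, prompts):
    """llm: hypothetical callable (prompt, item) -> answer string."""
    answers = [llm(p, item) for p in prompts]
    return majority_vote(answers)
```

Low agreement scores flag items worth escalating to a more expensive strategy (or a non-LLM method), which is how quality can be optimized while keeping cost bounded.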
LLM-Assisted Code Cleaning For Training Accurate Code Generators
Natural language to code generation is an important application area of LLMs
and has received wide attention from the community. The majority of relevant
studies have exclusively concentrated on increasing the quantity and functional
correctness of training sets while disregarding other stylistic elements of
programs. More recently, data quality has garnered a lot of interest and
multiple works have showcased its importance for improving performance. In this
work, we investigate data quality for code and find that making the code more
structured and readable leads to improved code generation performance of the
system. We build a novel data-cleaning pipeline that uses these principles to
transform existing programs by 1.) renaming variables, 2.) modularizing and
decomposing complex code into smaller helper sub-functions, and 3.) inserting
natural-language based plans via LLM based transformations. We evaluate our
approach on two challenging algorithmic code generation benchmarks and find
that fine-tuning CodeLLaMa-7B on our transformed modularized programs improves
the performance by up to 30% compared to fine-tuning on the original dataset.
Additionally, we demonstrate improved performance from using a smaller amount
of higher-quality data, finding that a model fine-tuned on the entire original
dataset is outperformed by a model trained on 15% of our cleaned dataset. Even
in comparison to closed-source models, our models outperform the much larger
AlphaCode models.
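Step 1 of the pipeline, renaming variables, can be done mechanically once a mapping from cryptic to descriptive names is available. In the paper that mapping comes from an LLM; the sketch below supplies it explicitly and uses Python's `ast` module (an assumption; the paper's target language and tooling are not stated in the abstract) to apply it safely, touching only identifier nodes rather than raw text.

```python
import ast

class RenameVars(ast.NodeTransformer):
    """Rewrite identifier nodes according to a rename mapping."""
    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        if node.id in self.mapping:
            node.id = self.mapping[node.id]
        return node

def rename_variables(source, mapping):
    """Parse, rename, and unparse a program (requires Python 3.9+)."""
    tree = ast.parse(source)
    tree = RenameVars(mapping).visit(tree)
    return ast.unparse(tree)
```

Operating on the AST rather than on strings avoids accidentally rewriting substrings inside literals or unrelated names, which matters when the transformation is applied at dataset scale.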