59 research outputs found
Online sketch-based image retrieval using keyshape mining of geometrical objects
Online image retrieval has become an active form of information sharing owing to the massive use of the Internet. The key challenges are the semantic gap between low-level visual features and high-level semantic perception and interpretation, which arises from the complexity of understanding images; the hand-drawn query input, which is not a regular form of input; and the huge number of web images. In addition, state-of-the-art research seeks to combine multiple types of feature representation to close the semantic gap. This study developed a new schema to retrieve images directly from the web repository. It comprises three major phases. First, a new online input representation was designed, based on pixel mining, to detect sketch shape features and correlate them with the semantic meaning of the sketched objects. Second, a training process was developed that uses Singular Value Decomposition (SVD) to detect common sketch templates. The outcome of this step is a dictionary of varied sketch templates. Last, the retrieval phase matched and compared the sketch with the image repository using metadata annotation to retrieve the most relevant images. The sequence of processes in this schema converts the drawn input sketch to a string that contains the sketch object elements. The string is then matched against the templates dictionary to determine the sketch metadata name. The selected name is sent to a web repository to match and retrieve the relevant images. A series of experiments was conducted to evaluate the performance of the schema against the state of the art found in the literature, using the same datasets comprising one million images from FlickerIm and 0.2 million images from ImageNet. In all cases the schema achieved 100% precision for the first five retrieved images, whereas the state of the art achieved only 88.8%.
The schema addresses many low-level feature obstacles to accurate retrieval, such as imperfect sketches, rotation, transposition, and scaling. It solves all of these problems by using high-level semantics to retrieve accurate images from large databases and the web.
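The matching step described above (sketch string compared against a templates dictionary to pick a metadata name) can be illustrated with a minimal sketch. It is not the authors' method: the string encoding, the template entries, and the function name `match_sketch` are all hypothetical, and the similarity measure is Python's standard `difflib.SequenceMatcher`, swapped in here only for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical templates dictionary: metadata name -> string of sketch elements.
# The abstract does not specify the real encoding; this one is an assumption.
TEMPLATES = {
    "house": "SQUARE,TRIANGLE",
    "bicycle": "CIRCLE,CIRCLE,TRIANGLE",
    "sun": "CIRCLE,LINE,LINE,LINE",
}


def match_sketch(sketch_string, templates=TEMPLATES):
    """Return (name, score) for the template most similar to sketch_string."""
    best_name, best_score = None, -1.0
    for name, template in templates.items():
        # ratio() is a similarity in [0, 1]; 1.0 means the strings are identical.
        score = SequenceMatcher(None, sketch_string, template).ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# The returned name would then be sent to the image repository as a text query.
name, score = match_sketch("SQUARE,TRIANGLE")
```

Approximate string matching is used here so that an imperfect sketch string can still select a template; whether the original schema tolerates mismatches in this way is not stated in the abstract.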
CONTENT BASED IMAGE RETRIEVAL (CBIR) SYSTEM
Advances in hardware and telecommunication technology have boosted the creation
and distribution of digital visual content. However, this rapid growth in visual content
creation has not been matched by the simultaneous emergence of technologies to support
efficient image analysis and retrieval. Although there have been attempts to solve this problem
using metadata text annotation, this approach is not practical when it comes to
large data collections.
This system uses 7 different feature vectors that focus on 3 main low-level feature
groups (color, shape and texture). The system takes the image that the user supplies and
searches the database for images with similar features, subject to a
threshold value. One of the most important aspects of CBIR is determining the correct
threshold value. Setting the correct threshold value is important in CBIR because setting
it too low results in fewer images being retrieved, which might exclude relevant data. Setting
the threshold too high might cause irrelevant data to be retrieved and increase the
search time for image retrieval.
Results show that this project was able to increase the image accuracy to an average of 70% by
combining 7 different feature vectors at the correct threshold value.
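The threshold behaviour described above can be illustrated with a minimal sketch, assuming the threshold is a maximum allowed distance between the query and a database image. The feature names, the Euclidean distance, the averaging across feature groups, and the function names are assumptions for illustration; the abstract does not specify how the 7 feature vectors are actually combined.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def combined_distance(query, candidate):
    """Average the per-feature distances (an assumed combination rule)."""
    dists = [euclidean(query[name], candidate[name]) for name in query]
    return sum(dists) / len(dists)


def retrieve(query, database, threshold):
    """Return (image_id, distance) pairs within the threshold, nearest first."""
    hits = []
    for image_id, features in database.items():
        d = combined_distance(query, features)
        if d <= threshold:
            hits.append((image_id, d))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical toy database with one vector per low-level feature group.
database = {
    "a": {"color": [0, 0], "shape": [0], "texture": [0]},
    "b": {"color": [3, 4], "shape": [0], "texture": [0]},
    "c": {"color": [30, 40], "shape": [0], "texture": [0]},
}
query = {"color": [0, 0], "shape": [0], "texture": [0]}

strict = retrieve(query, database, threshold=0.5)    # few results
loose = retrieve(query, database, threshold=100.0)   # includes distant images
```

With a low threshold only the closest image is returned, while a high threshold also admits the distant image `c`, which mirrors the trade-off between missing relevant images and retrieving irrelevant ones.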
Recent Advances in Signal Processing
The signal processing task is a very critical issue in the majority of new technological inventions and challenges in a variety of applications in both science and engineering fields. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian. They have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward both students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand. These five categories are ordered to address image processing, speech processing, communication systems, time-series analysis, and educational packages respectively. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity
CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap
After addressing the state of the art during the first year of Chorus and establishing the existing landscape in
multimedia search engines, we identified and analyzed gaps within the European research effort during our second year.
In this period we focused on three directions: technological issues, user-centred issues and use cases, and
socio-economic and legal aspects. These were assessed through two central studies: first, a concerted vision of the functional breakdown
of a generic multimedia search engine, and second, representative use-case descriptions with a related discussion of the
requirements for technological challenges. Both studies were carried out in cooperation and consultation with the
community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our
Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as
national initiative coordinators. Based on the feedback obtained, we identified two types of gaps: core
technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research
challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal
challenges
Towards Robust Long-form Text Generation Systems
Text generation is an important emerging AI technology that has seen significant research advances in recent years. Due to its closeness to how humans communicate, mastering text generation technology can unlock several important applications such as intelligent chat-bots, creative writing assistance, or newer applications like task-agnostic few-shot learning. Most recently, the rapid scaling of large language models (LLMs) has resulted in systems like ChatGPT, capable of generating fluent, coherent and human-like text. However, despite their remarkable capabilities, LLMs still suffer from several limitations, particularly when generating long-form text. In particular, (1) long-form generated text is filled with factual inconsistencies with world knowledge and the input prompt; (2) it is difficult to accurately evaluate the quality of long-form generated text; (3) it is difficult to identify whether a piece of long-form text was AI-generated, a task necessary to prevent widespread misinformation and plagiarism.
In this thesis I design algorithms aimed at making progress towards these three issues in current LLMs. I will first describe a retrieval-augmented system we built for long-form question answering, to improve factual correctness of long-form generated text. However, a careful empirical analysis reveals issues related to input/output consistency of generated text, and an inherent difficulty in evaluation. I will then describe our model RankGen, which uses large-scale contrastive learning on documents to significantly outperform competing long-form text generation methods to generate text more faithful to the input. Next, I will describe our efforts to improve human evaluation of long-form generation (issue #2) by proposing the LongEval guidelines. LongEval is a set of three simple empirically-motivated ideas to make human evaluation of long-form generation more consistent, less expensive, and cognitively easier for evaluators. Finally, I describe my work on AI-generated text detection (issue #3), and showcase the brittleness of existing methods to paraphrasing attacks I designed. I will describe a simple new AI-generated text detection algorithm using information retrieval, which is significantly more robust to paraphrasing attacks.
Finally, I conclude this thesis with some future research directions that I am excited about, including plan-based long-form text generation, and a deeper dive into understanding large language model training dynamics
Digital watermarking methods for data security and authentication
Philosophiae Doctor - PhD
Cryptology is the study of systems that typically originate from a consideration of the ideal circumstances under which secure information exchange is to take place. It involves the study of cryptographic and other processes that might be introduced for breaking the output of such systems - cryptanalysis. This includes the introduction of formal mathematical methods for the design of a cryptosystem and for estimating its theoretical level of securit