148,286 research outputs found
A New Approach for Text String Detection from Natural Scenes By Grouping & Partition
In this paper we have reviewed and analyzed different methods to find strings of characters from natural scene images. We have reviewed different techniques like extraction of character string regions from scenery images based on contours and thickness of characters, efficient binarization and enhancement technique followed by a suitable connected component analysis procedure, text string detection from natural scenes by structure - based partition and grouping, and a robust algorithm for text detection in images. It is assumed that characters have closed contours, and a character string consists of characters which lie on a straight line in most cases. Therefore, by extracting closed contours and searching neighbors of them, character string regions can be extracted; Image binarization successfully processed natural scene images having shadows, non - uniform illumination, low contrast and large signal - dependent noise. Connected component analysis is used to define the final binary images that mainly consist of text regions. One technique chooses the candidate text characters from connected components by gradient feature and color feature. The text line grouping method performs Hough transform to fit text line among the centroids of text candidates. Each fitte d text line describes the orientation of a potential text string. The detected text string is presented by a rectangle region coveri ng all characters whose centroids are cascaded in its text line. To improve efficiency and accuracy, our algorithms are carried out in multi - scales. The proposed methods outperform the state - of - the - art results on the public Robust Reading Dataset, which contains text only in horizontal orientation. Furthermore, the effectiveness of our methods to detect text strings with arbitrary orientations is evaluated on the Oriented Scene Text Dataset collected by ourselves containing text strings in no horizontal orientations
Recommended from our members
An investigation to study the feasibility of on-line bibliographic information retrieval system using an APP
This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University.This thesis reports an investigation on the feasibility study of a
searching mechanism using an APP suitable for an on-line bibliographic
retrieval, operation, especially for retrospective searches.
From the study of the searching methods used in the conventional
systems it is seen that elaborate file- and data- structures are
introduced to improve the response time of the system. These
consequently lead to software and hardware redundancies. To mask
these complexities of the system an expensive computer with higher
capabilities and more powerful instruction set is commonly used.
Thus the service of the systen becomes cost-ineffective.
On the other hand the primitive operations of a searching mechanism,
such as, association, domain selection, intersection and unions, are
the intrinsic features of an associative parallel processor. Therefore
it is important to establish the feasibility of an APP as a cost-effective
searching mechanise.
In this thesis a searching mechanism using an 'ON-THE-FLY' searching
technique has been proposed. The parallel search unit uses a Byte-oriented
VRL-APP for efficient character string processing.
At the time of undertaking this work the specification for neither the
retrieval systems nor the BO-VRL APP's were well established; hence a
two-phase investigation was originated. In the Phase I of the work a
bottom up approach was adopted to derive a formal and precise
specification for the BO-VRL-APP. During the Phase II of the work
a top-down approach was opted for the implementation of the searching
mechanism.
An experimental research vehicle has been developed to establish
the feasibility of an APP as a cost-effective searching mechanism.
Although rigorous proof of the feasibility has not been obtained,
the thesis establishes that the APP is well suited for on-line
bibligraphic information retrieval operations where substring searches
including boolean selection and threshold weights are efficiently
supported
Energy Efficient Hardware Accelerators for Packet Classification and String Matching
This thesis focuses on the design of new algorithms and energy efficient high throughput hardware accelerators that implement packet classification and fixed string matching. These computationally heavy and memory intensive tasks are used by networking equipment to inspect all packets at wire speed. The constant growth in Internet usage has made them increasingly difficult to implement at core network line speeds. Packet classification is used to sort packets into different flows by comparing their headers to a list of rules. A flow is used to decide a packet’s priority and the manner in which it is processed. Fixed string matching is used to inspect a packet’s payload to check if it contains any strings associated with known viruses, attacks or other harmful activities.
The contributions of this thesis towards the area of packet classification are hardware accelerators that allow packet classification to be implemented at core network line speeds when classifying packets using rulesets containing tens of thousands of rules. The hardware accelerators use modified versions of the HyperCuts packet classification algorithm. An adaptive clocking unit is also presented that dynamically adjusts the clock speed of a packet classification hardware accelerator so that its processing capacity matches the processing needs of the network traffic. This keeps dynamic power consumption to a minimum.
Contributions made towards the area of fixed string matching include a new algorithm that builds a state machine that is used to search for strings with the aid of default transition pointers. The use of default transition pointers keep memory consumption low, allowing state machines capable of searching for thousands of strings to be small enough to fit in the on-chip memory of devices such as FPGAs. A hardware accelerator is also presented that uses these state machines to search through the payloads of packets for strings at core network line speeds
Regular Expression Search on Compressed Text
We present an algorithm for searching regular expression matches in
compressed text. The algorithm reports the number of matching lines in the
uncompressed text in time linear in the size of its compressed version. We
define efficient data structures that yield nearly optimal complexity bounds
and provide a sequential implementation --zearch-- that requires up to 25% less
time than the state of the art.Comment: 10 pages, published in Data Compression Conference (DCC'19
Ultra-high throughput string matching for deep packet inspection
Deep Packet Inspection (DPI) involves searching a packet's header and payload against thousands of rules to detect possible attacks. The increase in Internet usage and growing number of attacks which must be searched for has meant hardware acceleration has become essential in the prevention of DPI becoming a bottleneck to a network if used on an edge or core router. In this paper we present a new multi-pattern matching algorithm which can search for the fixed strings contained within these rules at a guaranteed rate of one character per cycle independent of the number of strings or their length. Our algorithm is based on the Aho-Corasick string matching algorithm with our modifications resulting in a memory reduction of over 98% on the strings tested from the Snort ruleset. This allows the search structures needed for matching thousands of strings to be small enough to fit in the on-chip memory of an FPGA. Combined with a simple architecture for hardware, this leads to high throughput and low power consumption. Our hardware implementation uses multiple string matching engines working in parallel to search through packets. It can achieve a throughput of over 40 Gbps (OC-768) when implemented on a Stratix 3 FPGA and over 10 Gbps (OC-192) when implemented on the lower power Cyclone 3 FPGA
Searching by approximate personal-name matching
We discuss the design, building and evaluation of a method to access theinformation of a person, using his name as a search key, even if it has deformations. We present a similarity function, the DEA function, based
on the probabilities of the edit operations accordingly to the involved
letters and their position, and using a variable threshold. The efficacy
of DEA is quantitatively evaluated, without human relevance judgments,
very superior to the efficacy of known methods. A very efficient
approximate search technique for the DEA function is also presented
based on a compacted trie-tree structure.Postprint (published version
ESPOON: Enforcing Security Policies In Outsourced Environments
Data outsourcing is a growing business model offering services to individuals
and enterprises for processing and storing a huge amount of data. It is not
only economical but also promises higher availability, scalability, and more
effective quality of service than in-house solutions. Despite all its benefits,
data outsourcing raises serious security concerns for preserving data
confidentiality. There are solutions for preserving confidentiality of data
while supporting search on the data stored in outsourced environments. However,
such solutions do not support access policies to regulate access to a
particular subset of the stored data.
For complex user management, large enterprises employ Role-Based Access
Controls (RBAC) models for making access decisions based on the role in which a
user is active in. However, RBAC models cannot be deployed in outsourced
environments as they rely on trusted infrastructure in order to regulate access
to the data. The deployment of RBAC models may reveal private information about
sensitive data they aim to protect. In this paper, we aim at filling this gap
by proposing \textbf{} for enforcing RBAC policies in
outsourced environments. enforces RBAC policies in an
encrypted manner where a curious service provider may learn a very limited
information about RBAC policies. We have implemented
and provided its performance evaluation showing a limited overhead, thus
confirming viability of our approach.Comment: The final version of this paper has been accepted for publication in
Elsevier Computers & Security 2013. arXiv admin note: text overlap with
arXiv:1306.482
Direct Observation of Cosmic Strings via their Strong Gravitational Lensing Effect: II. Results from the HST/ACS Image Archive
We have searched 4.5 square degrees of archival HST/ACS images for cosmic
strings, identifying close pairs of similar, faint galaxies and selecting
groups whose alignment is consistent with gravitational lensing by a long,
straight string. We find no evidence for cosmic strings in five large-area HST
treasury surveys (covering a total of 2.22 square degrees), or in any of 346
multi-filter guest observer images (1.18 square degrees). Assuming that
simulations ccurately predict the number of cosmic strings in the universe,
this non-detection allows us to place upper limits on the unitless Universal
cosmic string tension of G mu/c^2 < 2.3 x 10^-6, and cosmic string density of
Omega_s < 2.1 x 10^-5 at the 95% confidence level (marginalising over the other
parameter in each case). We find four dubious cosmic string candidates in 318
single filter guest observer images (1.08 square degrees), which we are unable
to conclusively eliminate with existing data. The confirmation of any one of
these candidates as cosmic strings would imply G mu/c^2 ~ 10^-6 and Omega_s ~
10^-5. However, we estimate that there is at least a 92% chance that these
string candidates are random alignments of galaxies. If we assume that these
candidates are indeed false detections, our final limits on G mu/c^2 and
Omega_s fall to 6.5 x 10^-7 and 7.3 x 10^-6. Due to the extensive sky coverage
of the HST/ACS image archive, the above limits are universal. They are quite
sensitive to the number of fields being searched, and could be further reduced
by more than a factor of two using forthcoming HST data.Comment: 21 pages, 18 figure
- …