
    Parameterized Matching in the Streaming Model

    We study the problem of parameterized matching in a stream, where we want to output matches between a pattern of length m and the last m symbols of the stream before the next symbol arrives. Parameterized matching is a natural generalisation of exact matching where an arbitrary one-to-one relabelling of pattern symbols is allowed. We show how this problem can be solved in constant time per arriving stream symbol and sublinear, near optimal space with high probability. Our results are surprising and important: it has been shown that almost no streaming pattern matching problems can be solved (not even with randomisation) in less than Theta(m) space, with exact matching as the only known problem to have a sublinear, near optimal space solution. Here we demonstrate that a similar sublinear, near optimal space solution is achievable for an even more challenging problem. The proof is considerably more complex than that for exact matching. Comment: 19 pages, 3 figures
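
    To make the definition concrete, the following is a minimal sketch of a naive parameterized-match check. It is not the algorithm from the paper: it stores the full window (Theta(m) space) and spends O(m) time per arriving symbol, whereas the abstract claims constant time and sublinear space. The function names `parameterized_match` and `naive_stream_matches` are hypothetical and chosen only for illustration.

```python
from collections import deque
from typing import Hashable, Iterable, Iterator, Sequence


def parameterized_match(pattern: Sequence[Hashable], window: Sequence[Hashable]) -> bool:
    """Return True if a one-to-one relabelling of pattern symbols yields window."""
    if len(pattern) != len(window):
        return False
    forward = {}   # pattern symbol -> window symbol
    backward = {}  # window symbol -> pattern symbol (enforces one-to-one)
    for p, w in zip(pattern, window):
        if forward.setdefault(p, w) != w or backward.setdefault(w, p) != p:
            return False
    return True


def naive_stream_matches(pattern: Sequence[Hashable], stream: Iterable[Hashable]) -> Iterator[int]:
    """Yield each stream index at which the last m symbols parameterized-match pattern.

    Naive baseline (assumption, not the paper's method): keeps all m symbols.
    """
    m = len(pattern)
    window = deque(maxlen=m)  # holds the last m symbols of the stream
    for i, symbol in enumerate(stream):
        window.append(symbol)
        if len(window) == m and parameterized_match(pattern, window):
            yield i
```

    For example, `abab` parameterized-matches `xyxy` (a to x, b to y) but not `xyyx`, and `ab` does not match `xx` because the relabelling must be one-to-one.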

    Desirable properties for XML update mechanisms

    The adoption of XML as the default data interchange format and the standardisation of the XPath and XQuery languages have resulted in significant research in the development and implementation of XML databases capable of processing queries efficiently. The ever-increasing deployment of XML in industry and the real-world requirement to support efficient updates to XML documents have more recently prompted research in dynamic XML labelling schemes. In this paper, we provide an overview of the recent research in dynamic XML labelling schemes. Our motivation is to define a set of properties that represent a more holistic dynamic labelling scheme and present our findings through an evaluation matrix for most of the existing schemes that provide update functionality.

    An Optimal Linear Time Algorithm for Quasi-Monotonic Segmentation

    Monotonicity is a simple yet significant qualitative characteristic. We consider the problem of segmenting an array into up to K segments. We want segments to be as monotonic as possible and to alternate signs. We propose a quality metric for this problem, present an optimal linear time algorithm based on a novel formalism, and experimentally compare its performance to a linear time top-down regression algorithm. We show that our algorithm is faster and more accurate. Applications include pattern recognition and qualitative modeling.

    New techniques and framework for sentiment analysis and tuning of CRM structure in the context of Arabic language

    A thesis submitted to the University of Bedfordshire in partial fulfilment of the requirements for the degree of Doctor of Philosophy. Knowing customers’ opinions regarding services received has always been important for businesses. It has been acknowledged that both Customer Experience Management (CEM) and Customer Relationship Management (CRM) can help companies take informed decisions to improve their performance in the decision-making process. However, real-world applications are not so straightforward. A company may face hard decisions over the differences between the opinions predicted by CRM and actual opinions collected in CEM via social media platforms. Until recently, how to integrate the unstructured feedback from CEM directly into CRM, especially for the Arabic language, was still an open question. Furthermore, an accurate labelling of unstructured feedback is essential for the quality of CEM. Finally, CRM needs to be tuned and revised based on the feedback from social media to realise its full potential. However, the tuning mechanism for CEM of different levels has not yet been clarified. Facing these challenges, in this thesis, key techniques and a framework are presented to integrate Arabic sentiment analysis into CRM. First, as text pre-processing and classification are considered crucial to sentiment classification, an investigation is carried out to find the optimal techniques for the pre-processing and classification of Arabic sentiment analysis. Recommendations for using sentiment analysis classification in MSA as well as Saudi dialects are proposed. Second, to deal with the complexities of the Arabic language and to help operators identify possible conflicts in their original labelling, this study proposes techniques to improve the labelling process of Arabic sentiment analysis with the introduction of neutral classes and relabelling. Finally, a framework is proposed for adjusting CRM via CEM, addressing both the structure of the CRM system (on the sentence level) and the inaccuracy of the criteria or weights employed in the CRM system (on the aspect level). To ensure the robustness and the repeatability of the proposed techniques and framework, the results of the study are further validated with real-world applications from different domains.

    Image Processing Using FPGAs

    This book presents a selection of papers representing current research on using field programmable gate arrays (FPGAs) for realising image processing algorithms. These papers are reprints of papers selected for a Special Issue of the Journal of Imaging on image processing using FPGAs. A diverse range of topics is covered, including parallel soft processors, memory management, image filters, segmentation, clustering, image analysis, and image compression. Applications include traffic sign recognition for autonomous driving, cell detection for histopathology, and video compression. Collectively, they represent the current state of the art in image processing using FPGAs.

    Exponential Separation of Quantum Communication and Classical Information

    We exhibit a Boolean function for which the quantum communication complexity is exponentially larger than the classical information complexity. An exponential separation in the other direction was already known from the work of Kerenidis et al. [SICOMP 44, pp. 1550-1572], hence our work implies that these two complexity measures are incomparable. As classical information complexity is an upper bound on quantum information complexity, which in turn is equal to amortized quantum communication complexity, our work implies that a tight direct sum result for distributional quantum communication complexity cannot hold. The function we use to present such a separation is the Symmetric k-ary Pointer Jumping function introduced by Rao and Sinha [ECCC TR15-057], whose classical communication complexity is exponentially larger than its classical information complexity. In this paper, we show that the quantum communication complexity of this function is polynomially equivalent to its classical communication complexity. The high-level idea behind our proof is arguably the simplest so far for such an exponential separation between information and communication, driven by a sequence of round-elimination arguments, allowing us to simplify further the approach of Rao and Sinha. As another application of the techniques that we develop, we give a simple proof for an optimal trade-off between Alice's and Bob's communication while computing the related Greater-Than function on n bits: say Bob communicates at most b bits, then Alice must send n/exp(O(b)) bits to Bob. This holds even when allowing pre-shared entanglement. We also present a classical protocol achieving this bound. Comment: v1, 36 pages, 3 figures

    Processing nested complex sequence pattern queries over event streams

    Complex event processing (CEP) has become increasingly important for tracking and monitoring applications ranging from healthcare and supply chain management to surveillance. These monitoring applications submit complex event queries to track sequences of events that match a given pattern. As these systems mature, the need for increasingly complex nested sequence queries arises, while the state-of-the-art CEP systems mostly focus on the execution of flat sequence queries only. In this paper, we introduce an iterative execution strategy for nested CEP queries composed of sequence, negation, AND and OR operators. Lastly, we explore the promise of applying selective caching of intermediate results to optimize the execution. Our experimental study using real-world stock trades evaluates the performance of our proposed iterative execution strategy for different query types. HP Labs Innovation Research Program; National Science Foundation; TÜBİTAK. Post-print