17 research outputs found

    An analysis of Japanese foreign exchange interventions, 1991-2002

    The effectiveness of Japanese interventions over the past decade depended in large part on the frequency and size of the transactions. Prior to June 1995, Japanese interventions only had value as a forecast that the previous day's yen appreciation or depreciation would moderate during the current day. After June 1995, Japanese purchases of dollars had value as a forecast that the yen would depreciate. Probit analysis confirms that large, infrequent interventions, which characterized the later period, had a higher likelihood of success than small, frequent interventions.
    Keywords: Foreign exchange - Japan

    Faster Approximate Pattern Matching: A Unified Approach

    Approximate pattern matching is a natural and well-studied problem on strings: Given a text T, a pattern P, and a threshold k, find (the starting positions of) all substrings of T that are at distance at most k from P. We consider the two most fundamental string metrics: the Hamming distance and the edit distance. Under the Hamming distance, we search for substrings of T that have at most k mismatches with P, while under the edit distance, we search for substrings of T that can be transformed to P with at most k edits. Exact occurrences of P in T have a very simple structure: If we assume for simplicity that |T| ≤ 3|P|/2 and trim T so that P occurs both as a prefix and as a suffix of T, then both P and T are periodic with a common period. However, an analogous characterization for the structure of occurrences with up to k mismatches was proved only recently by Bringmann et al. [SODA'19]: Either there are O(k^2) k-mismatch occurrences of P in T, or both P and T are at Hamming distance O(k) from strings with a common period O(m/k). We tighten this characterization by showing that there are O(k) k-mismatch occurrences in the case when the pattern is not (approximately) periodic, and we lift it to the edit distance setting, where we tightly bound the number of k-edit occurrences by O(k^2) in the non-periodic case. Our proofs are constructive and let us obtain a unified framework for approximate pattern matching for both considered distances. We showcase the generality of our framework with results for the fully-compressed setting (where T and P are given as a straight-line program) and for the dynamic setting (where we extend a data structure of Gawrychowski et al. [SODA'18]).
    Comment: 74 pages, 7 figures, FOCS'2
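
    To make the problem statement in the abstract above concrete, here is a minimal Python sketch of the Hamming-distance variant. It is a naive O(|T|·|P|) baseline written only for illustration; it is not the algorithm of the paper, and the function name `k_mismatch_occurrences` is an assumption introduced here.

```python
def k_mismatch_occurrences(text: str, pattern: str, k: int) -> list[int]:
    """Return all starting positions i such that text[i:i+m] and pattern
    differ in at most k positions (naive baseline, not the paper's method)."""
    n, m = len(text), len(pattern)
    positions = []
    for i in range(n - m + 1):
        mismatches = 0
        for a, b in zip(text[i:i + m], pattern):
            if a != b:
                mismatches += 1
                if mismatches > k:
                    break  # this window already exceeds the threshold
        if mismatches <= k:
            positions.append(i)
    return positions
```

    For instance, with text "abcabd" and pattern "abc", the window at position 3 ("abd") has exactly one mismatch, so it is reported for k = 1 but not for k = 0.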

    A Hybrid Model for Android Malware Detection using Decision Tree and KNN

    Malware has become a major problem for the Android operating system worldwide. Malware, short for malicious software, is software developed to harm or exploit other hardware or software; the term covers Trojans, viruses, and various kinds of spyware. Many techniques for protecting Android from malware have been developed over the last decade. However, the existing techniques have numerous drawbacks, such as limited accuracy in quickly identifying the type of malware in real time. In this article, the authors develop a hybrid model for Android malware detection that uses a decision tree and the KNN (k-nearest neighbours) technique. First, the Dalvik opcodes as well as the real opcodes are extracted by reverse engineering the Android software. Second, the eigenvectors of the samples are produced using the n-gram model. The proposed hybrid model combines KNN with the decision tree to detect Android malware in real time. The results indicate that the hybrid model detects malware on Android both quickly and accurately. The experiment used 815 normal samples and 3268 malicious samples. The reported values of precision, ACC, recall, and F1 are 0.93, 0.98, 0.96, and 0.99, along with 0.94, 0.99, 0.93, and 0.99, respectively. There remains considerable scope for further research on new methods for Android malware detection.

    Time Series Residual Momentum

    Hierarchical Relative Lempel-Ziv Compression

    Relative Lempel-Ziv (RLZ) parsing is a dictionary compression method in which a string S is compressed relative to a second string R (called the reference) by parsing S into a sequence of substrings that occur in R. RLZ is particularly effective at compressing sets of strings that have a high degree of similarity to the reference string, such as a set of genomes of individuals from the same species. With the now low cost of DNA sequencing, such datasets have become extremely abundant and are growing rapidly. In this paper, instead of using a single reference string for the entire collection, we investigate the use of different reference strings for subsets of the collection, with the aim of improving compression. In particular, we propose a new compression scheme, hierarchical relative Lempel-Ziv (HRLZ), which forms a rooted tree (or hierarchy) on the strings and then compresses each string using RLZ with its parent as reference, storing only the root of the tree in plain text. To decompress, we traverse the tree in BFS order starting at the root, decompressing children with respect to their parent. We show that this approach leads to a twofold improvement in compression on bacterial genome datasets, with negligible effect on decompression time compared to the standard single-reference approach. We show that an effective hierarchy for a given set of strings can be constructed by computing the optimal arborescence of a complete weighted digraph of the strings, with weights given by the number of phrases in the RLZ parsing of the source and destination vertices. We further show that, instead of computing the complete graph, a sparse graph derived using locality-sensitive hashing can significantly reduce the cost of computing a good hierarchy, without adversely affecting compression performance.
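
    The parsing and the BFS decompression described in the abstract above can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the greedy longest-match search is quadratic, the literal fallback `(char, 0)` for characters absent from the reference is an assumption added here, the hierarchy is supplied by hand rather than computed as an arborescence, and all function names are hypothetical.

```python
from collections import deque


def rlz_parse(s, ref):
    """Greedily parse s into (position, length) phrases copied from ref.
    A character that does not occur in ref is stored as the literal (char, 0);
    this fallback is an assumption of the sketch."""
    phrases = []
    i = 0
    while i < len(s):
        best_pos, best_len = -1, 0
        for j in range(len(ref)):
            length = 0
            while (i + length < len(s) and j + length < len(ref)
                   and s[i + length] == ref[j + length]):
                length += 1
            if length > best_len:
                best_pos, best_len = j, length
        if best_len == 0:
            phrases.append((s[i], 0))
            i += 1
        else:
            phrases.append((best_pos, best_len))
            i += best_len
    return phrases


def rlz_decode(phrases, ref):
    """Rebuild a string from its phrases and the reference string."""
    return "".join(pos if length == 0 else ref[pos:pos + length]
                   for pos, length in phrases)


def hrlz_compress(strings, parent, root):
    """Parse every non-root string relative to its parent in the hierarchy."""
    return {name: rlz_parse(s, strings[parent[name]])
            for name, s in strings.items() if name != root}


def hrlz_decompress(root, root_text, parsed, children):
    """Traverse the tree in BFS order, decoding each child from its parent."""
    decoded = {root: root_text}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            decoded[child] = rlz_decode(parsed[child], decoded[node])
            queue.append(child)
    return decoded
```

    In this sketch a parent must be decoded before its children, which is exactly what the BFS order guarantees; only the root is kept in plain text.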

    Approximate Circular Pattern Matching under Edit Distance

    In the k-Edit Circular Pattern Matching (k-Edit CPM) problem, we are given a length-n text T, a length-m pattern P, and a positive integer threshold k, and we are to report all starting positions of the substrings of T that are at edit distance at most k from some cyclic rotation of P. In the decision version of the problem, we are to check if any such substring exists. Very recently, Charalampopoulos et al. [ESA 2022] presented O(nk^2)-time and O(nk log^3 k)-time solutions for the reporting and decision versions of k-Edit CPM, respectively. Here, we show that the reporting and decision versions of k-Edit CPM can be solved in O(n+(n/m)k^6) time and O(n+(n/m)k^5 log^3 k) time, respectively, thus obtaining the first algorithms with a complexity of the type O(n+(n/m) poly(k)) for this problem. Notably, our algorithms run in O(n) time when m=Ω(k^6) and are superior to the previous respective solutions when m=ω(k^4). We provide a meta-algorithm that yields efficient algorithms in several other interesting settings, such as when the strings are given in a compressed form (as straight-line programs), when the strings are dynamic, or when we have a quantum computer. We obtain our solutions by exploiting the structure of approximate circular occurrences of P in T, when T is relatively short w.r.t. P. Roughly speaking, either the starting positions of approximate occurrences of rotations of P form O(k^4) intervals that can be computed efficiently, or some rotation of P is almost periodic (is at a small edit distance from a string with small period). Dealing with the almost periodic case is the most technically demanding part of this work; we tackle it using properties of locked fragments (originating from [Cole and Hariharan, SICOMP 2002]).
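
    As a reference point for the reporting version of k-Edit CPM defined in the abstract above, the following Python sketch solves it by brute force in O(n·m^2) time. It is not the algorithm of the paper: it tries every rotation and runs the classical Sellers dynamic program on the reversed strings, so that end positions in the reversed text become starting positions in T. The function names are hypothetical, and the sketch assumes k < m so that no match is an empty substring.

```python
def approx_end_positions(text, pattern, k):
    """Sellers' DP: return every j such that some substring text[a:j]
    is within edit distance k of pattern."""
    m = len(pattern)
    # col[i] = min edit distance between pattern[:i] and a substring of
    # text ending at the current position (the substring may start anywhere).
    col = list(range(m + 1))
    ends = set()
    if col[m] <= k:
        ends.add(0)
    for j, c in enumerate(text, start=1):
        prev_diag = col[0]  # value for (i - 1, j - 1)
        col[0] = 0          # an occurrence may start at any position
        for i in range(1, m + 1):
            cur = min(prev_diag + (pattern[i - 1] != c),  # match or substitute
                      col[i] + 1,                         # extra text character
                      col[i - 1] + 1)                     # skipped pattern character
            prev_diag = col[i]
            col[i] = cur
        if col[m] <= k:
            ends.add(j)
    return ends


def k_edit_cpm_starts(text, pattern, k):
    """Report all starting positions of substrings of text within edit
    distance k of some cyclic rotation of pattern (assumes k < len(pattern))."""
    n = len(text)
    rev_text, rev_pat = text[::-1], pattern[::-1]
    starts = set()
    for r in range(len(pattern)):
        # Rotations of the reversed pattern are the reverses of the rotations.
        rotation = rev_pat[r:] + rev_pat[:r]
        for j in approx_end_positions(rev_text, rotation, k):
            if j > 0:
                starts.add(n - j)
    return sorted(starts)
```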

    Efficient Representations for Large Dynamic Sequences in ML

    The use of sequence containers, including stacks, queues, and double-ended queues, is ubiquitous in programming. When the maximal number of elements is not known in advance, containers need to grow dynamically. For this purpose, most ML programs rely on either lists or vectors. These structures are inefficient in terms of both time and space usage. We investigate the use of chunk-based data structures. Such structures save a lot of memory and may deliver better performance than classic container data structures. We observe a 2x speedup compared with vectors, and up to a 3x speedup compared with long lists.

    Counting patterns in strings and graphs

    We study problems related to finding and counting patterns in strings and graphs. In the string regime, we are interested in counting how many substrings of a text T are at Hamming (or edit) distance at most k from a pattern P. Among others, we are interested in the fully-compressed setting, where both T and P are given in a compressed representation. For both distance measures, we give the first algorithm that runs in (almost) linear time in the size of the compressed representations. We obtain the algorithms by new and tight structural insights into the solution structure of the problems. In the graph regime, we study problems related to counting homomorphisms between graphs. In particular, we study the parameterized complexity of the problem #IndSub(Φ), where we are to count all k-vertex induced subgraphs of a graph that satisfy the property Φ. Based on a theory of Lovász, Curticapean et al., we express #IndSub(Φ) as a linear combination of graph homomorphism numbers to obtain #W[1]-hardness and almost tight conditional lower bounds for properties Φ that are monotone or that depend only on the number of edges of a graph. Thereby, we prove a conjecture by Jerrum and Meeks. In addition, we investigate the parameterized complexity of the problem #Hom(ℋ → 𝒢) for graph classes ℋ and 𝒢. In particular, we show that for any problem P in the class #W[1], there are classes ℋ_P and 𝒢_P such that P is equivalent to #Hom(ℋ_P → 𝒢_P).