An Evaluation of Popular Copy-Move Forgery Detection Approaches
A copy-move forgery is created by copying and pasting content within the same
image, and potentially post-processing it. In recent years, the detection of
copy-move forgeries has become one of the most actively researched topics in
blind image forensics. A considerable number of different algorithms have been
proposed focusing on different types of postprocessed copies. In this paper, we
aim to answer which copy-move forgery detection algorithms and processing steps
(e.g., matching, filtering, outlier detection, affine transformation
estimation) perform best in various postprocessing scenarios. The focus of our
analysis is to evaluate the performance of previously proposed feature sets. We
achieve this by casting existing algorithms in a common pipeline. In this
paper, we examine the 15 most prominent feature sets. We analyze the
detection performance on a per-image basis and on a per-pixel basis. We created
a challenging real-world copy-move dataset and a software framework for
systematic image manipulation. Experiments show that the keypoint-based
features SIFT and SURF, as well as the block-based DCT, DWT, KPCA, PCA, and
Zernike features, perform very well. These feature sets exhibit the best
robustness against various noise sources and downsampling while reliably
identifying the copied regions.
Comment: Main paper: 14 pages; supplemental material: 12 pages. The main paper appeared in IEEE Transactions on Information Forensics and Security.
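As a rough illustration of the block-based side of such a pipeline, the following sketch is entirely hypothetical: a real detector would use DCT, PCA, or Zernike features, lexicographic sorting, and robust outlier removal rather than raw pixel blocks and exact matching. It slides a window over a grayscale image, groups identical blocks, and keeps only shift vectors supported by several block pairs (a bare-bones shift-vector filter):

```python
from collections import defaultdict

def detect_copy_move(img, block=2, min_votes=2):
    """Return shift vectors supported by at least `min_votes` matched blocks."""
    h, w = len(img), len(img[0])
    seen = defaultdict(list)              # feature -> origins of matching blocks
    for y in range(h - block + 1):
        for x in range(w - block + 1):
            feat = tuple(img[y + dy][x + dx]
                         for dy in range(block) for dx in range(block))
            seen[feat].append((y, x))
    votes = defaultdict(list)             # shift vector -> matched block pairs
    for origins in seen.values():
        for i in range(len(origins)):
            for j in range(i + 1, len(origins)):
                (y1, x1), (y2, x2) = origins[i], origins[j]
                votes[(y2 - y1, x2 - x1)].append((origins[i], origins[j]))
    # keep only shift vectors that recur often enough to suggest a copied region
    return {s: p for s, p in votes.items() if len(p) >= min_votes}

img = [[y * 6 + x for x in range(6)] for y in range(6)]   # distinct background
for dy in range(3):
    for dx in range(3):
        img[3 + dy][3 + dx] = img[dy][dx]                 # paste a 3x3 patch at (3, 3)
print(detect_copy_move(img))              # one dominant shift vector: (3, 3)
```

The vote threshold plays the role of the outlier-detection step: isolated coincidental matches produce scattered shift vectors, while a genuine copy-move yields many block pairs agreeing on one shift.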
Draft genomes of two Artocarpus plants, jackfruit (A. heterophyllus) and breadfruit (A. altilis)
Two of the most economically important plants in the Artocarpus genus are jackfruit (A. heterophyllus Lam.) and breadfruit (A. altilis (Parkinson) Fosberg). Both species are long-lived trees that have been cultivated for thousands of years in their native regions. Today they are grown throughout tropical to subtropical areas as an important source of starch and other valuable nutrients. There are hundreds of breadfruit varieties native to Oceania, of which the most commonly distributed types are seedless triploids. Jackfruit is likely native to the Western Ghats of India and produces one of the largest tree-borne fruit structures (reaching up to 45 kg). To date, there is limited genomic information for these two economically important species. Here, we generated 273 Gb and 227 Gb of raw data from jackfruit and breadfruit, respectively. The high-quality reads from jackfruit were assembled into 162,440 scaffolds totaling 982 Mb with 35,858 genes. Similarly, the breadfruit reads were assembled into 180,971 scaffolds totaling 833 Mb with 34,010 genes. A total of 2822 and 2034 expanded gene families were found in jackfruit and breadfruit, respectively, enriched in pathways including starch and sucrose metabolism, photosynthesis, and others. The copy number of several starch synthesis-related genes was found to be increased in jackfruit and breadfruit compared to closely related species, and their tissue-specific expression may underlie the sugar-rich and starch-rich characteristics of these fruits. Overall, the publication of high-quality genomes for jackfruit and breadfruit provides information about their specific composition and the underlying genes involved in sugar and starch metabolism.
Optimizing the Memory Subsystem for Efficient System Resource Utilization by Data-Intensive Applications
Thesis (Ph.D.) -- Graduate School of Seoul National University, College of Engineering, Department of Electrical and Computer Engineering, August 2020. Advisor: 염헌영.
With explosive data growth, data-intensive applications, such as relational databases and key-value stores, have become increasingly popular in a variety of domains in recent years. To meet the growing performance demands of data-intensive applications, it is crucial to efficiently and fully utilize memory resources for the best possible performance.
However, general-purpose operating systems (OSs) are designed to provide system resources fairly, at the system level, to all applications running on a system. A single application may find it difficult to fully exploit the system's best performance due to this system-level fairness. For performance reasons, many data-intensive applications therefore re-implement mechanisms that OSs already provide, under the assumption that they know their data better than the OS does. Such implementations can be greedily optimized for performance, but this may result in inefficient use of system resources.
In this dissertation, we claim that simple OS support with minor application modifications can yield even higher application performance without sacrificing system-level resource utilization. We optimize and extend the OS memory subsystem to better support applications, addressing three memory-related issues in data-intensive applications. First, we introduce a memory-efficient cooperative caching approach between application and kernel buffers to address the double-caching problem, where the same data resides in multiple layers. Second, we present a memory-efficient, transparent zero-copy read I/O scheme to avoid the performance interference caused by memory copies during I/O. Third, we propose a memory-efficient fork-based checkpointing mechanism for in-memory database systems to mitigate the memory-footprint problem of the existing fork-based checkpointing scheme, whose memory usage grows incrementally (up to 2x) during checkpointing for update-intensive workloads.
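The third mechanism builds on the copy-on-write semantics of fork. A minimal Python sketch of the generic fork-based checkpointing idea (not the dissertation's optimized mechanism; all names and values below are invented) might look like:

```python
import json
import os
import tempfile

def checkpoint(state, path):
    """Fork a child that persists a copy-on-write snapshot of `state`."""
    pid = os.fork()                  # POSIX-only; child CoW-shares pages with parent
    if pid == 0:                     # child: sees `state` as of the fork
        with open(path, "w") as f:
            json.dump(state, f)
        os._exit(0)
    return pid                       # parent: keep serving updates

state = {"balance": 100}
path = os.path.join(tempfile.gettempdir(), "ckpt_demo.json")
pid = checkpoint(state, path)
state["balance"] = 999               # parent-side update during checkpointing
os.waitpid(pid, 0)
with open(path) as f:
    snapshot = json.load(f)
print(snapshot["balance"])           # 100: the snapshot is isolated from updates
```

Under an update-heavy workload, each parent-side write during the checkpoint triggers a page copy, which is exactly the up-to-2x footprint growth the dissertation targets.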
To show the effectiveness of our approach, we implement and evaluate our schemes on real multi-core systems. The experimental results demonstrate that our cooperative approach can address the above issues of data-intensive applications more effectively than existing non-cooperative approaches, while delivering better performance (in terms of transaction processing speed, I/O throughput, or memory footprint).
With the recent explosive growth of data, data-intensive applications such as databases and key-value stores have been gaining popularity in a variety of domains. To meet their high performance demands, it is crucial to use the given memory resources efficiently and fully. However, general-purpose operating systems (OSs) are designed to prioritize providing resources fairly, at the system level, to all applications running on the system. Because of this limitation, imposed to preserve system-level fairness, it is difficult for a single application to fully exploit the system's best performance. For this reason, many data-intensive applications implement similar functionality at the application level instead of relying on what the OS provides. Such an approach permits greedy optimization and can thus improve performance, but it can also lead to inefficient use of system resources.
This dissertation shows that OS support combined with minor application modifications can achieve higher application performance without wasting system resources. To this end, the OS memory subsystem is optimized and extended to solve three memory-related problems that arise in data-intensive applications. First, to solve the double-caching problem, in which the same data resides in multiple layers, a memory-efficient cooperative caching scheme between the application and the kernel buffer is proposed. Second, to avoid the performance interference caused by memory copies during I/O, a memory-efficient zero-copy read I/O scheme is proposed. Third, a memory-efficient fork-based checkpointing technique for in-memory database systems is proposed to mitigate the growth in memory usage of existing fork-based checkpointing, which could increase incrementally up to 2x while checkpointing update-intensive workloads.
To demonstrate the effectiveness of the proposed methods, they were implemented on real multi-core systems and their performance was evaluated. The experimental results confirm that the proposed cooperative approach enables data-intensive applications to use memory resources more efficiently than existing non-cooperative approaches and thus delivers higher performance.
Chapter 1 Introduction
1.1 Motivation
1.1.1 Importance of Memory Resources
1.1.2 Problems
1.2 Contributions
1.3 Outline
Chapter 2 Background
2.1 Linux Kernel Memory Management
2.1.1 Page Cache
2.1.2 Page Reclamation
2.1.3 Page Table and TLB Shootdown
2.1.4 Copy-on-Write
2.2 Linux Support for Applications
2.2.1 fork
2.2.2 madvise
2.2.3 Direct I/O
2.2.4 mmap
Chapter 3 Memory Efficient Cooperative Caching
3.1 Motivation
3.1.1 Problems of Existing Datastore Architecture
3.1.2 Proposed Architecture
3.2 Related Work
3.3 Design and Implementation
3.3.1 Overview
3.3.2 Kernel Support
3.3.3 Migration to DBIO
3.4 Evaluation
3.4.1 System Configuration
3.4.2 Methodology
3.4.3 TPC-C Benchmarks
3.4.4 YCSB Benchmarks
3.5 Summary
Chapter 4 Memory Efficient Zero-copy I/O
4.1 Motivation
4.1.1 The Problems of Copy-Based I/O
4.2 Related Work
4.2.1 Zero Copy I/O
4.2.2 TLB Shootdown
4.2.3 Copy-on-Write
4.3 Design and Implementation
4.3.1 Prerequisites for z-READ
4.3.2 Overview of z-READ
4.3.3 TLB Shootdown Optimization
4.3.4 Copy-on-Write Optimization
4.3.5 Implementation
4.4 Evaluation
4.4.1 System Configurations
4.4.2 Effectiveness of the TLB Shootdown Optimization
4.4.3 Effectiveness of CoW Optimization
4.4.4 Analysis of the Performance Improvement
4.4.5 Performance Interference Intensity
4.4.6 Effectiveness of z-READ in Macrobenchmarks
4.5 Summary
Chapter 5 Memory Efficient Fork-based Checkpointing
5.1 Motivation
5.1.1 Fork-based Checkpointing
5.1.2 Approach
5.2 Related Work
5.3 Design and Implementation
5.3.1 Overview
5.3.2 OS Support
5.3.3 Implementation
5.4 Evaluation
5.4.1 Experimental Setup
5.4.2 Performance
5.5 Summary
Chapter 6 Conclusion
Abstract (in Korean)
Engineering Crowdsourced Stream Processing Systems
A crowdsourced stream processing (CSP) system is a system that incorporates
crowdsourced tasks in the processing of a data stream. This can be seen as
enabling crowdsourcing work to be applied on a sample of large-scale data at
high speed, or equivalently, enabling stream processing to employ human
intelligence. It also leads to a substantial expansion of the capabilities of
data processing systems. Engineering a CSP system requires the combination of
human and machine computation elements. From a general systems theory
perspective, this means taking into account inherited as well as emerging
properties from both these elements. In this paper, we position CSP systems
within a broader taxonomy, outline a series of design principles and evaluation
metrics, present an extensible framework for their design, and describe several
design patterns. We showcase the capabilities of CSP systems by performing a
case study that applies our proposed framework to the design and analysis of a
real system (AIDR) that classifies social media messages during time-critical
crisis events. Results show that, compared to a pure stream processing system,
AIDR achieves higher data classification accuracy, while compared to a pure
crowdsourcing solution, it makes better use of human workers by requiring much
less manual effort.
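The human/machine division of labor described above can be caricatured in a few lines: a machine classifier handles confident items, and low-confidence items are routed to a crowdsourcing queue. The classifier, labels, and threshold below are all invented for illustration:

```python
def machine_classify(text):
    """Toy stand-in for the machine element: a keyword rule with a made-up confidence."""
    if "flood" in text:
        return "disaster", 0.9
    return "other", 0.4

def process(stream, threshold=0.6):
    """Route confident items to automatic output, the rest to the crowd."""
    auto, crowd_queue = [], []
    for item in stream:
        label, conf = machine_classify(item)
        if conf >= threshold:
            auto.append((item, label))
        else:
            crowd_queue.append(item)      # pending human labeling
    return auto, crowd_queue

auto, crowd = process(["flood in the city", "hello world"])
print(auto, crowd)
```

Tuning the threshold trades machine accuracy against the amount of manual work requested from the crowd, which is the trade-off the AIDR case study quantifies.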
Big data analytics for large-scale wireless networks: Challenges and opportunities
© 2019 Association for Computing Machinery. The wide proliferation of various wireless communication systems and wireless devices has led to the arrival of the big data era in large-scale wireless networks. Big data of large-scale wireless networks has the key features of wide variety, high volume, real-time velocity, and huge value, leading to unique research challenges that differ from those of existing computing systems. In this article, we present a survey of state-of-the-art big data analytics (BDA) approaches for large-scale wireless networks. In particular, we categorize the life cycle of BDA into four consecutive stages: Data Acquisition, Data Preprocessing, Data Storage, and Data Analytics. We then present a detailed survey of the technical solutions to the challenges in BDA for large-scale wireless networks according to each stage in the life cycle of BDA. Moreover, we discuss the open research issues and outline the future directions in this promising area.
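The four-stage life cycle can be pictured as a toy pipeline; all data and function names below are fabricated for illustration:

```python
def acquire():
    # stage 1, Data Acquisition: collect raw measurements (fabricated RSSI readings)
    return [{"rssi": -60}, {"rssi": -72}, {"rssi": None}]

def preprocess(records):
    # stage 2, Data Preprocessing: clean the stream (drop incomplete readings)
    return [r for r in records if r["rssi"] is not None]

def store(db, records):
    # stage 3, Data Storage: persist into a (toy, in-memory) store
    db.extend(records)
    return db

def analyze(db):
    # stage 4, Data Analytics: derive value (mean signal strength)
    return sum(r["rssi"] for r in db) / len(db)

db = store([], preprocess(acquire()))
print(analyze(db))   # -66.0
```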
Shuttle Ground Operations Efficiencies/Technologies (SGOE/T) study. Volume 2: Ground Operations evaluation
The Ground Operations Evaluation describes the breadth and depth of the various study elements selected as a result of an operational analysis conducted during the early part of the study. Analysis techniques used for the evaluation are described in detail. Elements selected for further evaluation are identified; the results of the analysis are documented; and a follow-on course of action is recommended. The background and rationale for developing recommendations for the current Shuttle or for future programs are presented.
End-to-End Entity Resolution for Big Data: A Survey
One of the most important tasks for improving data quality and the
reliability of data analytics results is Entity Resolution (ER). ER aims to
identify different descriptions that refer to the same real-world entity, and
remains a challenging problem. While previous works have studied specific
aspects of ER (and mostly in traditional settings), in this survey, we provide
for the first time an end-to-end view of modern ER workflows, and of the novel
aspects of entity indexing and matching methods in order to cope with more than
one of the Big Data characteristics simultaneously. We present the basic
concepts, processing steps and execution strategies that have been proposed by
different communities, i.e., database, semantic Web and machine learning, in
order to cope with the loose structuredness, extreme diversity, high speed and
large scale of entity descriptions used by real-world applications. Finally, we
provide a synthetic discussion of the existing approaches, and conclude with a
detailed presentation of open research directions.
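A minimal sketch of such a workflow, with hypothetical records, a hypothetical blocking key, and an arbitrary similarity threshold, combines blocking (entity indexing) to prune the comparison space with within-block pairwise matching:

```python
from collections import defaultdict
from difflib import SequenceMatcher

def block_index(records):
    """Entity indexing (blocking): group records by a cheap key (first letter)."""
    idx = defaultdict(list)
    for r in records:
        idx[r["name"][0].lower()].append(r)
    return idx

def resolve(records, threshold=0.8):
    """Matching: compare only within blocks; link near-duplicate descriptions."""
    pairs = []
    for blk in block_index(records).values():
        for i in range(len(blk)):
            for j in range(i + 1, len(blk)):
                sim = SequenceMatcher(None, blk[i]["name"], blk[j]["name"]).ratio()
                if sim >= threshold:
                    pairs.append((blk[i]["id"], blk[j]["id"]))
    return pairs

records = [{"id": 1, "name": "apple inc."},
           {"id": 2, "name": "apple inc"},
           {"id": 3, "name": "banana co"}]
print(resolve(records))   # [(1, 2)]
```

Blocking keeps the quadratic comparison step tractable at scale, which is why the indexing and matching steps are treated end-to-end rather than in isolation.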
- …