Efficient simulation of application-specific instruction-set processors (ASIPs) is a challenging task in VLSI design. This paper explores the use of ASIP simulators for ASIP simulation. The proposed study covers the simulation of cache memory designs with ASIP simulators such as SimpleScalar and VEX. We have implemented memory configurations tailored to the desired application. These simulators report cache-related results such as cache name, number of sets, associativity, block size, and replacement policy for a specific application.
INTRODUCTION
Designing ASIPs is a challenging task in high-performance embedded system design. An ASIP can target a big-endian or little-endian architecture, and it can reduce cost, code size, and power consumption while increasing speed and performance. We have used two ASIP simulators, SimpleScalar and VEX. The SimpleScalar tool set consists of a compiler, assembler, linker, and simulation tools for the SimpleScalar PISA and Alpha AXP architectures. It contains many simulators, ranging from a fast functional simulator to a detailed out-of-order issue processor with a multi-level memory system. SimpleScalar also provides an extensible, portable, high-performance architecture for high-performance embedded systems design. A specific application is compiled with the SimpleScalar tools, and its simulation generates application-specific cache results. The other ASIP simulator, VEX, defines a parametric space of architectures that share a common set of application and system resources. VEX is a 32-bit clustered VLIW ISA that is scalable and customizable to specific application domains.
RELATED WORK
Jain, M. K., Balakrishnan, M. and Kumar, A. [1] proposed a scheduler-based technique for exploring register windows and cache configurations. Kin, J., Gupta, M. and Mangione-Smith, W. H. [2] analyzed energy efficiency by filtering cache references through an unusually small first-level cache. A second-level cache, similar in size and structure to a conventional first-level cache, is positioned behind the filter cache and serves to mitigate the performance loss. Performance for different register file sizes is estimated by predicting the number of memory spills and their delay.
Vivekanadarajah, K. and Thambipillai, S. [3] showed that tuning the filter cache to the needs of a particular application can save power and energy. In addition, a simple loop-profiler-directed methodology is proposed to deduce the optimal or near-optimal filter cache without having to simulate all possible combinations of cache parameters from the specified space. The technique employed does not require explicit register assignment. Shuie, W. T. [4] proposed three performance metrics: cache size, memory access time, and energy consumption. Extensive experiments indicate that a small filter cache can still achieve a high hit rate and good performance. This approach allows the second-level cache to be in a low-power mode most of the time, thus resulting in power savings. Prikryl Z., Kroustck I., Hruska, T. and Kolar, D. [5] proposed an automatically generated just-in-time translated simulator with profiling capabilities. Gremzow, C. [8] used virtual machine architectures for ASIP synthesis and quantitative global data flow analysis for code partitioning, applied to several "real world" applications from the domain of digital video signal processing. D. Fischer, J. Teich, M., Weper, R. [9] designed an efficient exploration algorithm for architecture/compiler co-design of application-specific instruction-set processors. Guzman, V., Bhattacharyya, S. S., Kellomaki, E. and Takala, J. [10] developed an integration of SDF- and ASIP-oriented design flows, used this integrated design flow to explore trade-offs in the space of hardware/software implementations, and explored an approach to ASIP implementation in terms of "critical" and "non-critical" applications.
SIMPLESCALAR SIMULATOR
The SimpleScalar simulator [6] is based on the MIPS architecture and supports both big-endian and little-endian executables. The target prefixes for the big-endian and little-endian architectures are ssbig-na-sstrix and sslittle-na-sstrix, respectively. To determine the endianness of our host environment, we ran the endian program located in the simplesim-2.0/ directory. The SimpleScalar simulator provides fast cache simulation. It is a target-specific simulator: we used a 32-bit (i-386) or 64-bit (i-686) host platform and, after targeting little-endian, analyzed the cache memory results. We compiled various application benchmarks with the SimpleScalar version of GCC, which generates SimpleScalar assembly. The SimpleScalar assembler and loader, along with the necessary ported libraries, produce SimpleScalar executables that can then be fed directly into one of the provided simulators (the simulators themselves are compiled for the host platform) (see Figure 1). Simulators such as Sim-Cache and Sim-Safe were used for simulation.
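The host endianness check described above can be illustrated with a minimal Python sketch. It is not the endian program shipped with SimpleScalar; it only shows the same decision, using Python's own sys.byteorder. The helper name target_prefix and the lookup table are assumptions introduced for illustration; the two prefixes are the ones named in the text.

```python
import sys

# Target prefixes named in the text for the two endian variants.
TARGET_BY_ENDIAN = {
    "big": "ssbig-na-sstrix",
    "little": "sslittle-na-sstrix",
}


def target_prefix(byteorder: str = sys.byteorder) -> str:
    """Return the SimpleScalar target prefix matching a byte order.

    Hypothetical helper: it only maps 'big' or 'little' to the prefix
    mentioned in the paper and does not invoke any SimpleScalar tool.
    """
    return TARGET_BY_ENDIAN[byteorder]


if __name__ == "__main__":
    # Reports the byte order of the machine running this script.
    print(sys.byteorder, "->", target_prefix())
```

On a typical x86 host the script reports a little-endian byte order, which corresponds to the little-endian target used in this work.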
SimpleScalar internal processor simulators
SimpleScalar [6] contains five execution-driven processor simulators. The SimpleScalar processor simulators support non-blocking caches and speculative execution.
We have used Sim-Cache for cache simulation.
The execution-driven processor simulators are:
Sim-safe
This is a functional simulator that checks alignment and access permissions for each memory reference.
Sim-cache
The SimpleScalar Sim-cache simulator models the cache with set-associative mapping. Set-associative mapping is an improvement over the direct-mapping organization in that each word of cache can store two or more words of memory under the same index address. Each data word is stored together with its tag, and the tag-data items in one word of cache are said to form a set. With the help of Sim-Cache we have implemented a memory configuration tailored to a specific application. This cache simulator reports cache-related results such as cache name, number of sets, associativity, block size, and replacement policy (see Figure 3).
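To make the set-associative organization concrete, the following is a minimal Python sketch of a set-associative cache with LRU replacement. It is not SimpleScalar code. The class SetAssocCache and the function parse_cache_config are hypothetical names, and the colon-separated specification string (name, sets, block size, associativity, replacement policy) is an assumption modelled on the form commonly shown for sim-cache options; it should be checked against the tool's own help output.

```python
from collections import OrderedDict


class SetAssocCache:
    """Toy set-associative cache with LRU replacement (illustrative only)."""

    def __init__(self, name, nsets, bsize, assoc):
        self.name = name
        self.nsets = nsets    # number of sets
        self.bsize = bsize    # block size in bytes
        self.assoc = assoc    # ways (tag-data items) per set
        self.hits = 0
        self.misses = 0
        # One ordered map per set: keys are tags, order tracks recency.
        self.sets = [OrderedDict() for _ in range(nsets)]

    def access(self, addr):
        """Look up a byte address; return True on a hit, False on a miss."""
        block = addr // self.bsize
        index = block % self.nsets    # selects the set
        tag = block // self.nsets     # identifies the block within the set
        ways = self.sets[index]
        if tag in ways:
            ways.move_to_end(tag)     # mark as most recently used
            self.hits += 1
            return True
        if len(ways) >= self.assoc:
            ways.popitem(last=False)  # evict the least recently used tag
        ways[tag] = None
        self.misses += 1
        return False


def parse_cache_config(spec):
    """Parse 'name:nsets:bsize:assoc:repl' (assumed format); LRU ('l') only."""
    name, nsets, bsize, assoc, repl = spec.split(":")
    if repl != "l":
        raise ValueError("this sketch only models LRU replacement")
    return SetAssocCache(name, int(nsets), int(bsize), int(assoc))
```

For example, parse_cache_config("dl1:2:16:2:l") builds a two-set, two-way cache with 16-byte blocks. Accessing addresses 0, 32, 0, 64, 0, 32 in that order gives two hits and four misses, because all six addresses map to set 0 and compete for its two ways.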
Sim-cheetah
The Sim-Cheetah cache simulation engine generates simulation results for multiple cache configurations with a single simulation. It efficiently simulates fully associative caches, as well as a sometimes-optimal replacement policy.
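The idea of evaluating several cache configurations in one pass can be sketched with the classical LRU stack-distance technique. This is an illustrative assumption about the general approach, not a description of the Cheetah engine's actual algorithms, which are more elaborate. The function names are hypothetical, and the trace is a list of block numbers rather than real addresses.

```python
from collections import Counter


def lru_stack_distances(block_trace):
    """Yield the LRU stack distance of each reference (None for a first use)."""
    stack = []  # most recently used block is kept at the end
    for block in block_trace:
        if block in stack:
            pos = stack.index(block)
            # Depth of the block counted from the most recently used end.
            yield len(stack) - pos
            stack.pop(pos)
        else:
            yield None
        stack.append(block)


def hits_for_capacities(block_trace, capacities):
    """Count hits for fully associative LRU caches of several sizes at once."""
    histogram = Counter(lru_stack_distances(block_trace))
    return {
        cap: sum(
            count
            for dist, count in histogram.items()
            if dist is not None and dist <= cap
        )
        for cap in capacities
    }
```

A reference hits in a fully associative LRU cache of capacity C blocks exactly when its stack distance is at most C, so one histogram of distances answers the question for every capacity. For the trace [1, 2, 3, 1, 2, 3, 4, 1], capacities 1 and 2 give no hits, capacity 3 gives three hits, and capacity 4 gives four hits.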
Sim-profile
SimpleScalar Sim-profile simulator generates detailed profiles on instruction classes and addresses, text symbols, memory accesses, branches, and data segment symbols.
Sim-outorder
The SimpleScalar simulator supports an out-of-order processor memory system that employs a load/store queue. Store values are placed in the queue, and loads are dispatched to the memory system when the addresses of all previous stores are known. Loads may be satisfied either by the memory system or by an earlier store value residing in the queue, if their addresses match. The Sim-outorder processor simulator makes it easy to model the memory system and the processor together.
We can also specify the processor core parameters.
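The load/store queue behaviour described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the Sim-outorder implementation: StoreEntry and resolve_load are hypothetical names, memory is a plain dictionary, and accesses are assumed to be whole words at identical addresses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoreEntry:
    addr: Optional[int]  # None while the store address is still unresolved
    value: int


def resolve_load(addr, older_stores, memory):
    """Decide how a load at `addr` is satisfied.

    `older_stores` lists the earlier stores in program order, oldest first.
    Returns a (source, value) pair, where source is 'stall', 'forward',
    or 'memory'.
    """
    # A load may not be dispatched until every earlier store address is known.
    if any(store.addr is None for store in older_stores):
        return ("stall", None)
    # Search youngest-first so that the most recent matching store wins.
    for store in reversed(older_stores):
        if store.addr == addr:
            return ("forward", store.value)
    # No matching store in the queue: the memory system supplies the value.
    return ("memory", memory.get(addr, 0))
```

With two earlier stores to the same address, the load receives the value of the younger store. With an unresolved store address ahead of it, the load waits.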
SimpleScalar Cache Simulation
The SimpleScalar simulator is an application-specific simulator that produces target-specific cache memory results (see Figure 4). It contains a cache simulator that can emulate a system with multiple levels of instruction and data caches. After a SimpleScalar simulation we obtain parameters such as the total number of instructions, sim-mem ref., sim-elapsed time, and sim-inst-rate (see Table 1). We have analyzed the memory references against the total number of instructions executed (see Figure 5). Our model assumes a two-level data cache. The SimpleScalar tool suite defines both little-endian and big-endian versions (targets) of the architecture to improve portability (the version used on a given host machine is the one that matches the endianness of the host). Many features are available, but we have used only a limited set of parameters for ASIP simulation.
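The relation between memory references and executed instructions can be computed from a simulator statistics dump with a short script such as the sketch below. The line layout (a name, a number, and an optional trailing comment) and the counter names sim_num_insn and sim_num_refs are assumptions about the output format. The sample text is invented for illustration and is not a result from this paper.

```python
import re

# Assumed layout: <name> <number> [# comment]
STAT_LINE = re.compile(r"^(\S+)\s+(-?\d+(?:\.\d+)?)\s*(?:#.*)?$")


def parse_stats(text):
    """Collect 'name value' pairs from a statistics dump into a dict."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:
            stats[match.group(1)] = float(match.group(2))
    return stats


def refs_per_instruction(stats):
    """Memory references per executed instruction (assumed counter names)."""
    return stats["sim_num_refs"] / stats["sim_num_insn"]


# Made-up sample, for illustration only.
SAMPLE = """\
sim: ** simulation statistics **
sim_num_insn      1000 # total number of instructions executed
sim_num_refs       250 # total number of loads and stores executed
sim_elapsed_time     1 # total simulation time in seconds
"""

if __name__ == "__main__":
    print(refs_per_instruction(parse_stats(SAMPLE)))
```

Lines that do not follow the assumed layout, such as banner lines, are skipped rather than treated as errors.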
Fig 4: SimpleScalar Simulation Results

International Journal of Computer Applications (0975 -8887) Volume 90 -No 13, March 2014
Table 1. SimpleScalar Simulation Result
VEX SIMULATOR
VEX [7] provides a parametric space of architectures that share a common set of application and system resources, such as registers and operations. VEX is a 32-bit clustered VLIW ISA that is scalable and customizable to individual applications. The VEX simulator is an architecture-level (functional) simulator that uses compiled simulation technology to achieve a speed of many equivalent "MIPS". The simulation system comes with a set of POSIX-like libc and libm libraries, a cache simulator (level-1 cache only), and an API that enables modeling of the memory system. VEX contains two qualifiers that specify streaming access (access to objects that only exhibit spatial locality) and local access (access to objects that exhibit strong temporal locality).
VEX Cluster architecture
VEX uses a clustered architecture (see Figure 6): it provides scalability of issue width and functionality using modular execution clusters. Each cluster is a collection of register files and a tightly coupled set of functional units. Functional units within a cluster directly access only the cluster's register files. Data cache ports and private memories are associated with each cluster. VEX allows multiple memory accesses to execute simultaneously.
Customization of VEX
VEX uses a load/store architecture, meaning that only load and store operations can access memory, and that memory operations only target general-purpose registers. VEX generally uses a big-endian byte-ordering target model.
VEX cache customization
We can easily choose the cache configuration to suit the desired application. VEX exposes various cache properties such as cache size, sets, line size, number of cache lines, and cache miss penalty (see Figure 7). In the VEX compiled simulator we can easily specify the execution-driven parameters, such as clock and bus cycles and cache parameters (size, associativity, refill latency).
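The relationship between these cache properties is plain arithmetic and can be sketched as below. The class CacheGeometry and its field names are hypothetical and are not VEX configuration keys. The numbers in the usage note are illustrative, not values taken from this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheGeometry:
    """Generic cache parameters (hypothetical names, not VEX config keys)."""
    size_bytes: int    # total cache capacity
    line_bytes: int    # line (block) size
    ways: int          # associativity
    miss_penalty: int  # refill latency in cycles per miss

    @property
    def num_lines(self):
        # Total number of cache lines.
        return self.size_bytes // self.line_bytes

    @property
    def num_sets(self):
        # Lines are grouped into sets of `ways` lines each.
        return self.num_lines // self.ways

    def stall_cycles(self, misses):
        # First-order estimate: every miss costs the full penalty.
        return misses * self.miss_penalty
```

For instance, CacheGeometry(32768, 32, 4, 25) describes a 32 KB, 4-way cache with 32-byte lines: it has 1024 lines in 256 sets, and 10 misses cost an estimated 250 stall cycles under this simple model.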
VEX cluster customization
The default VEX cluster contains two register files, four integer ALUs, two 16x32-bit multiply units, and a data cache port. The register set consists of 64 general-purpose 32-bit registers (GRs) and eight 1-bit branch registers (BRs).
VEX VCG
VEX provides visualization tools, which are often very useful in developing tuned applications and in profiling and optimizing a complex application. Such profiling is usually necessary regardless of the target architecture. VEX includes the rgg utility, which converts standard gprof output into a VCG call graph (see Figure 8). Each application can be visualized with VEX VCG and then easily optimized.
VEX Cache Simulation
The VEX development system (VEX tool chain) provides a set of tools that allow application benchmarks compiled for a VEX target to be simulated on a host workstation. The VEX tool chain is mainly used for architecture exploration, application development, and benchmarking. It includes a very fast architectural simulator that uses a form of binary translation to convert VEX assembler files. When we simulate an application with the VEX simulator, assembly files are generated (see Figure 9); these assembly files are simulated, and we obtain execution statistics including cache misses. The pcntl utility is used for this purpose. VEX links with a simple cache simulation library, which models L1 instruction and data caches. The cache simulator is really a trace simulator, which is embedded in the same binary for performance reasons. The VEX simulator supports gprof when invoked with "-mas_G"; gprof runs in the host environment. At the end of simulation, four files are created: gmon.out, containing profile data that includes cache simulation; gmon-nocache.out, containing profile data that does not include cache simulation; and gmon-icache/gmon-dcache, containing data for only the instruction and data cache statistics, respectively (see Figure 10). The VEX gmon-icache/gmon-dcache files contain complete details of instruction and data memory operations, stall cycles, cache hit rate, cache miss rate, etc.
The VEX output file contains the complete statistics, such as cycles (total, execution, stall, operations, time), branch statistics (executed, taken, conditional, unconditional), instruction memory statistics (estimated code size, hits/misses), data memory statistics (hits/misses, bus conflicts), bus statistics (bandwidth usage fraction), and simulation speed (MIPS, simulation time) (see Figure 11). In the VEX cache simulation process we used various standard benchmark applications; after simulation we obtain I-cache and D-cache results against the total number of instructions executed (see Tables 2 and 3) and analyze the I-cache/D-cache behavior for the desired application (see Figure 12).
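Hit and miss rates and the share of stall cycles follow directly from the counters listed above. The sketch below shows that arithmetic. The function names are hypothetical, the treatment of total cycles as execution plus stall cycles is an assumption, and the numbers in the usage note are invented for illustration rather than measured results.

```python
def rate(part, whole):
    """Return part / whole, or 0.0 when there were no events at all."""
    return part / whole if whole else 0.0


def summarize(hits, misses, execution_cycles, stall_cycles):
    """Derive cache and cycle ratios from raw simulation counters."""
    accesses = hits + misses
    # Assumption: total cycles consist only of execution and stall cycles.
    total_cycles = execution_cycles + stall_cycles
    return {
        "hit_rate": rate(hits, accesses),
        "miss_rate": rate(misses, accesses),
        "stall_fraction": rate(stall_cycles, total_cycles),
    }
```

With 90 hits, 10 misses, 750 execution cycles, and 250 stall cycles, summarize returns a hit rate of 0.9, a miss rate of 0.1, and a stall fraction of 0.25.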
CONCLUSION
ASIP simulators allow cache memory designs to be simulated in an efficient manner. We have used two ASIP simulators, SimpleScalar and VEX, which produce target-specific cache memory results. SimpleScalar is a MIPS-based architecture used in design space exploration and performs two-level cache simulation. VEX defines a 32-bit clustered VLIW ISA that is scalable and customizable to specific applications and performs single-level cache simulation. Using these ASIP simulators we have customized the memory configuration for the desired application, and we can obtain complete details of the I-cache and D-cache against the total number of instructions executed.
ACKNOWLEDGMENTS
Our thanks to the developers of the SimpleScalar and VEX tools.
