Abstract: Recent commercial microprocessors concentrate on multi-core CPU architectures, while most parallel and distributed computing methods target multi-CPU architectures. Traditional parallel algorithms therefore need to be analyzed and adapted for the new multi-core environments. In this paper, we use matrix multiplication as the target problem and implement it with several methods, including a traditional serial version and parallel versions using OpenMP and Windows threads, among others. We measure the execution time of each implementation and analyze their overall performance. According to our experimental results, the most important factor for execution time is the efficient use of the level-2 caches in the CPU. We expect to develop a more efficient implementation method and to design a new matrix multiplication method for multi-core CPUs.

Key-Words: Multi-core CPU, parallel computing, performance analysis.