Shannon, in his seminal work, formalized the transmission of data over a communication channel and determined its fundamental limits. He characterized the relation between communication rate and error probability and showed that, as long as the communication rate is below the capacity of the channel, the error probability can be made as small as desired by using appropriate coding over the channel and letting the codeword length approach infinity. He also provided the formula for the capacity of a discrete memoryless channel. However, his proposed coding scheme was too complex to be practical in communication systems. Polar codes, recently introduced by Arıkan, are the first practical codes known to achieve the capacity of a large class of channels with low encoding and decoding complexity. The original polar codes of Arıkan achieve a block error probability that decays exponentially in the square root of the block length as the block length goes to infinity. However, their performance at finite length is also of interest, since every practical communication scheme operates at finite block length. In this dissertation, after a brief overview of polar codes, we introduce a practical framework for simulating error-correcting codes in general. We use importance sampling to efficiently evaluate the performance of polar codes at finite block length. Next, based on simulation results, we investigate the performance of different genie-aided decoders as a means of mitigating the poor performance of polar codes at low to moderate block lengths, and we propose single-error correction methods that improve performance dramatically at the expense of decoder complexity. In this context, we also study the correlation between error events in a successive cancellation decoder. Finally, we investigate the performance of polar codes over non-binary channels. We compare the code construction of Sasoglu for Q-ary channels with classical multilevel codes.
We construct multilevel polar codes for Q-ary channels and provide a thorough comparison of the complexity and performance of the two methods at finite block length.