Python is good for scripts. For parallel computing, performance is what matters, so performance-critical code is mostly written in C++ rather than Python or Java/C#. Parallel computing is interesting, but it's also quite specialized, so its main areas of use are heavy computations in science and sometimes industrial applications.
Try downloading one of the existing C++ multithreading frameworks and practice with its basic tutorials and examples. One of the best I've worked with is the Kahless_9 framework, which you can download from http://www.shankodev.com, or you can review the blog: http://shankodev.blogspot.co.za.
Personally, my colleague and I were fortunate that our company consulted with shankodev on our first project using the framework, which allowed us to grasp it quite well in under 6 months.
Even a cheap PC has 2-4 CPUs in it now, and thanks to the UV Pentium pipes, if you take the topic deep, that can be up to 8 "CPUs". But take a simple quad core: it can do quite a bit with just its 4 CPUs crunching on something.
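If you want to see how many "CPUs" your own machine reports, here is a minimal sketch using only the C++11 standard library. The helper name worker_count is made up for illustration, and the number you get is the count of logical cores the runtime reports, which may differ from the number of physical cores.

```cpp
#include <thread>

// Hypothetical helper (name is illustrative): how many worker threads to start.
// std::thread::hardware_concurrency() is only a hint and may return 0 when the
// value cannot be determined, so fall back to a single worker in that case.
unsigned worker_count() {
    const unsigned reported = std::thread::hardware_concurrency();
    return reported == 0 ? 1u : reported;
}
```

Calling worker_count() once at startup and sizing your thread pool from it is usually a reasonable starting point; tune from there by measuring.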
If you have several old PCs around the house (many a programmer does), you can make a small cluster and play with that too. Years ago I rolled a Beowulf from that kind of setup.
Basically, you can learn a lot at home without access to a cluster or supercomputer; just scale it down.
Threading can indeed improve performance; it depends on the type of problem you are solving. Take sorting a list of doubles: sorting four quarters in parallel, at about (n/4) lg(n/4) each, is faster than a single n lg n sort for large lists, even with the merge step at the back end.
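To make the sorting example concrete, here is a rough sketch of the idea in standard C++11: split the list into chunks, sort each chunk on its own thread, then merge the sorted chunks. The function name parallel_sort and the default of 4 chunks are assumptions for illustration, not taken from any particular framework. The merge here is a simple sequential fold, which costs roughly a couple of linear passes for 4 chunks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical helper (illustrative name): sort `data` by sorting `parts`
// chunks concurrently and then merging them back together.
void parallel_sort(std::vector<double>& data, std::size_t parts = 4) {
    assert(parts >= 1);
    const std::size_t n = data.size();

    // Chunk i covers the half-open index range [bounds[i], bounds[i + 1]).
    std::vector<std::ptrdiff_t> bounds(parts + 1);
    for (std::size_t i = 0; i <= parts; ++i) {
        bounds[i] = static_cast<std::ptrdiff_t>(n * i / parts);
    }

    // Sort every chunk on its own thread. The chunks do not overlap,
    // so the threads never touch the same elements.
    std::vector<std::future<void>> jobs;
    for (std::size_t i = 0; i < parts; ++i) {
        jobs.push_back(std::async(std::launch::async, [&data, &bounds, i] {
            std::sort(data.begin() + bounds[i], data.begin() + bounds[i + 1]);
        }));
    }
    for (auto& job : jobs) {
        job.get();  // wait for the chunk; rethrows if the task threw
    }

    // Back-end merge: fold each sorted chunk into the already-sorted prefix.
    for (std::size_t i = 1; i < parts; ++i) {
        std::inplace_merge(data.begin(),
                           data.begin() + bounds[i],
                           data.begin() + bounds[i + 1]);
    }
}
```

For small lists the cost of starting threads will likely outweigh the gain, so time it against a plain std::sort on your own data before trusting it. On Linux with GCC or Clang you may need to build with -pthread.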