Quicksort lends itself well to parallelization, because the sorting is done recursively on independent sub-arrays. This can be delegated to threads whose only synchronization is to wait for all the threads to finish their sorting, in order for the sub-array to be sorted.
It would be correct, but uninteresting, to exchange lines 20 and 21. In effect, that would wait for the lower portion of the array to be sorted before starting the sort of the upper portion. We would thus obtain the behavior of a sequential program, with the additional cost of threads but without obtaining any benefit.
In practice, we should limit parallelization to a reasonable factor and continue sequentially after that.