Monday, November 14, 2011

Learning Scala : writing Quicksort the functional programming way

Authored by Win Myo Htet


The previous post has been about doing Scala exercise from the Haskell example. Thus, the code is already written in the functional programming paradigm and I have just translated it into Scala language. To learn Scala and functional programming, we have to learn to think in the functional programming way. How do we do that? As usual, there is no easy way. We can only learn through getting our hands dirty by exercising through thinking in functional programming way.

So, Quicksort has been picked for this exercise and the Quicksort wiki will be used as a common reference. I will start implementing Quicksort in Scala by following the pseudo code described under the sub title "In-Place version". The Wiki has talked about some methods of choosing the pivot. However, it is not trivial to choose the pivot and it all depends on the set of data for the pivot-chooising-method to be meaningful. So, we will just pick the value of the first index of the array. Remember, the first value of the array can still be any value. So, here they are, very Java like.

Partition method:
def partition(left : Int, right : Int, pivotIndex : Int, array : Array[Int]) : Int = {
 val pivotValue = array(pivotIndex)
 array(pivotIndex) = array(right)
 array(right) = pivotValue
 var storeIndex = left
 for (i <- left until right) {
  if (array(i) <= pivotValue) {
   val temp = array(i)
   array(i) = array(storeIndex)
   array(storeIndex) = temp
   storeIndex = storeIndex + 1
  }
 }
 val temp2 = array(storeIndex)
 array(storeIndex) = array(right)
 array(right) = temp2
 storeIndex
}

quickSort method
def quickSort(left : Int, right : Int, array : Array[Int], direction : String = "begin") : Unit = {
 if (left < right) {
  val pivotIndex = 0
  val newPivotIndex = partition(left, right, pivotIndex, array)
  quickSort(left, newPivotIndex - 1, array, "left")
  quickSort(newPivotIndex + 1, right, array, "right")
 }
}

The codes are simply the implementation of the pseudo code. The only thing that  differ from pseudo code might be that I have used the Scala's default argument variable for "direction" for debugging purpose.

To be honest, I don't quite know how to re-implement this code into functional programming way. So, I have to get back to the basic abstract under sub title, "Algorithm".
1. pick a pivot(done!)
2. put a lesser value on the left, greater value on the right, put pivot where it belongs, which is between lesser and greater, of course.
3. do the same process for lesser and greater by way of recursion.

So how do we do step No. 2. Well, first, let's pick List as the data structure for our method since List is the preferred data structure when it comes to functional programming. The input will be List. Then, there are lesser and greater created based on the condition that the value of element of the List lesser(<) than pivot go to lesser List and  the value of element of the List greater(>) than pivot go to greater List. How do we do that? We have to feed our variable List into some looping mechanism that will give us back two Lists.

From my experience with Oleg's Tic Tac Toe code. He makes use of foldLeft quite a lot. So, folding might be a solution?  Yes, it is. If you read that blog on my understanding of the folding process, you will see that we can feed any collection into folding and can process the data for all kind of type of data as an end result. If you want to know how I come into this conclusion along with the thinking  process, you need to read through that  Tic Tac Toe blogs series.

So, folding will be the looping mechanism. I have understood the folding process to involve (foldLeftSeedValue,foldLeftElement). The foldLeftSeedValue will be our end result and we want it to be two Lists namely: lesser and greater. So, the seed value has to be a tuple of two Lists : lesser and greater. Then, we will do the matching on the conditions : lesser value will be added to the left List and greater value will be added to the right List. If you can visualize this far, that's good enough and let's look at the code.

def myQuickSort(list : List[Int]) : List[Int] = {
 def myQuickSort(list : List[Int], 
      direction : String = "begin") : List[Int] = {
  if (list.length <= 1) return list;
  val pivot = list(0)
  val pl = list.foldLeft(List[Int](), List[Int]()) {
   case ((l, r), e) if e < pivot => (e :: l, r)
   case ((l, r), e) if e > pivot => (l, e :: r)
   case ((l, r), e) if e == pivot => (l :+ e, r)
   case ((l, r), e) => println("Error processing : " + e);
                                     sys.exit()
  }
  myQuickSort(pl._1.slice(0, pl._1.length - 1), "left") 
                ++ List(pivot) ++ myQuickSort(pl._2, "right")
 }
 myQuickSort(list)
}

I have used nested method, which is not necessary, here. Let's get back to our foldLeft and case condition here. The first case line start with the form of (foldLeftSeedValue,foldLeftElement) as expected, foldLeftSeedValue is (l,r) tuple. foldLeftElement is e. Then if the condtion expression is met, the foldLeftElement is added to the head of the list l and the (l,r) tuple is passed on for the next iteration. The same can be said of the second match case. The third match case add the value to the end of the list l. The forth match is added because of the compiler warning that matching is not exhaustive. It makes me smile tho. :)

Now, we have a tuple of two Lists: l being lesser and r being greater(on the right side of the pivot value). So we process the lesser and greater List the same way by means of iteration. We have to subtract one value from the lesser because we have added value equal to pivot at the end of the lesser list including the pivot itself. The pivot value has been put into a List which is placed between lesser and greater. The recursion will be exhausted when the list size is less than or equal 1. You know what's next.

So, that's the end of my Quicksort in functional programming way in my own style(TM).  However, you cannot count on me to write a truly efficient and fp style Quicksort because I am still a noob! Let's see how Quicksort is coded in Haskell land.

quicksort :: Ord a => [a] -> [a]
quicksort []     = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
    where
        lesser  = filter (< p) xs
        greater = filter (>= p) xs

Wow! Much way simpler! If it were not for Alan J. Perlis's wisdom: "Simplicity does not precede complexity, but follows it.", I would have shot myself! Let's look into this Haskell quicksort code. The detail analysis of this code can be found here at section 3.1. The first line is method signature. The second line said that if the list is empty, return empty list. The third line syntax is a bit tricky. If you don't know that tricky syntax (along with the last line of greater = ... ), you will think that there is a bug(I do think tho). If you were like me, you would have deduced that p is pivot. Good. xs is an input List. Hmm...

If xs  were input List,  the resulting List size would have increased because in the last line, greater includes values equal to pivot while we are also manually adding pivot in a List in the middle of lesser and greater. Then, what is xs? Trust me, I am not Omniscience nor super genius. I have to ask my buddy! :D So, this Haskell buddy tells me that xs is a tail List and p(pivot) is the first element of the input List. Now, it all makes sense!

Here is the Scala implementation of that QuickSort

def hsQuickSort(list : List[Int], 
  direction : String = "begin") : List[Int] = {
 if (list.length <= 1) return list
 val tail = list.tail
 val pivot = list(0)
 val lesser = (for { l <- tail
     if l < pivot
     } yield l).toList
 val greater = (for { g <- tail
     if g >= pivot
     } yield g).toList
 hsQuickSort(lesser, "lesser") ++ List(pivot)
 ++ hsQuickSort(greater, "greater")
}

So, this has been my exercise on thinking in functional programming way. My original FP quicksort is not the most optimized nor simple but I can proudly say that it is my original. I would also encourage you to try to make yourself think and code in FP way. In the beginning, the code will be messy and inefficient but don't get discouraged. Jon Bentley, who gives talk on "Three Beautiful QuickSort", has mentioned of going over 11 versions to a piece of code to achieve the utmost simplicity. Cheer!

Authored by Win Myo Htet

UPDATE

MergeSort
def mergeSort(list:List[Int]):List[Int]={
 if(list.length<=1) return list
 val mid=list.length/2
 val left=mergeSort(list.slice(0,mid))
 val right=mergeSort(list.slice(mid,list.length))
 merge(left,right)
}

def merge(left:List[Int],right:List[Int]):List[Int]={
 if(!left.isEmpty && right.isEmpty)return left
 if(left.isEmpty && !right.isEmpty)return right
 if(left.isEmpty && right.isEmpty)return List[Int]()
 
 if(left(0)<=right(0)) return List(left(0))++merge(left.tail,right)
 else return List(right(0))++merge(left,right.tail)
}

Haskell MergeSort
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys) = if x <= y
                      then x : merge xs (y:ys)
                      else y : merge (x:xs) ys

mergesort [] = []
mergesort [x] = [x]
mergesort xs = let (as, bs) = splitAt (length xs `quot` 2) xs
               in merge (mergesort as) (mergesort bs)



Authored by Win Myo Htet
 

2 comments:

  1. I think Scala can do the same like Haskel for quick sort too

    def quickSort(list:List[Int]):List[Int] = list match {
    case Nil => Nil
    case pivot::rest => quickSort(rest.filter(_pivot))
    }

    ReplyDelete