Wednesday, November 30, 2011

Learning Scala : "case class", twitter interview question?

Authored by Win Myo Htet


I have been learning Scala actively and reading blogs, stackoverflow, and books; browsing repo, getting my hands dirty coding snippet and writing blog. I come to this subject "case class". Well, I have to admit that reading the books is boring unlike blog, yet, the books are more reliable sources of information (oh yeah? You don't know that ?) Then, there is stackoverflow, which is more reliable, does not detour and gives instant enlightened answers/insights proof-read by professionals. Still, the book does a better job of explaining (better experience on ebook with the help of Glossary and Index) Yes, I will talk about the "case class". I digress in introduction!

As I crawl the internet to learn Scala, I have downloaded some twitter's open source popular repos on github. I have browsed, skimmed and searched the repos to see how the early scala adopter is using it and defining the Scala usage pattern. Here are the downloaded repos(watchers) : flockdb (1,133), finagle(592), gizzard(966), killdeer(18) and ostrich(179). With only these opened in the eclipse, I use eclispe search tool for some of the thing I am interested in. Search keyword and number found paired.

tailrec 17
foldLeft 19
foldRight 0
for 2945 /*This includes "for" from code comment*/
while 152 /*This also includes "while" from code comment*/

Yup, at that time, I want to know how pervasive is the TCO'ed recursion and fold usages. I guess, for-comprehension rules the day. And I also skim through the codes and notice quite a lot of case block matchings. So here are some stats:

case 1658 /*This includes "just in case ..." kind of comment*/
case class 187 /*don't know if this words pair can be in comment*/

Now I hope to have your attention on the seriousness of the subject matter. It will be on the twitter interview questions! Ok, how is "case class" different from "class"? What is "case class"? "case class" is just a class. Ok... It adds extra niceties-methods : copy, equal, hashcode and (prettier) toString. The important part is that when "case class" is used, the compiler creates companion object which has both apply and unapply methods. "apply(variables)" method in Scala can be replaced with just "(variables)", thus "case class" enables constructor without keyword "new".

scala> case class MyCaseClass(name:String,isCase:Boolean)
defined class MyCaseClass

scala> val myCaseClass=MyCaseClass("myCaseClass",true)
myCaseClass: MyCaseClass = MyCaseClass(myCaseClass,true)

scala> class MyClass(name:String,isCase:Boolean)
defined class MyClass

scala> val myClass=MyClass("myClass",false)
<console>:7: error: not found: value MyClass
       val myClass=MyClass("myClass",false)
                   ^

scala> val myClass=new MyClass("myClass",false)
myClass: MyClass = MyClass@be7f971

So the interviewer asks "Why is that "case class" constructor does not require "new" keyword while the regular one does?" Then, you will answer "Cause the compiler generate "companion object" with "apply" method which can be omitted" Then, the interviewer will ask "We don't care much about "new" keyword but we just want to use regular class and want the benefit of case pattern matching. How would you do that?" Don't worry. Here is the answer for you. "case class" pattern matching is made possible by companion object's unapply extractor method. So we just need to do what "case class" does!

Welcome to Scala version 2.9.1.final (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_26).
Type in expressions to have them evaluated.
Type :help for more information.

scala> case class MyCaseClass(name:String,isCase:Boolean)
defined class MyCaseClass

scala> :paste
// Entering paste mode (ctrl-D to finish)

class MyClass(val name:String,val isCase:Boolean)
object MyClass {
  def unapply(myClass:MyClass):Option[(String,Boolean)]={
    if(myClass eq null) None
    Some((myClass.name,myClass.isCase))
  }
}

// Exiting paste mode, now interpreting.

defined class MyClass
defined module MyClass

scala> val myCaseClass=MyCaseClass("myCaseClass",true)
myCaseClass: MyCaseClass = MyCaseClass(myCaseClass,true)

scala> val myClass=new MyClass("myClass",false)
myClass: MyClass = MyClass@7bc2f501

scala> val classList=List(myClass,myCaseClass)
classList: List[ScalaObject] = List(MyClass@7bc2f501, MyCaseClass(myCaseClass,true))

scala> for( x <- classList) x match{
     |   case MyCaseClass(name,isCase)=>println(x+" "+name+" "+isCase)
     |   case MyClass(name,isCase)=>println(x+" "+name+" "+isCase)
     | }
$line2.$read$$iw$$iw$MyClass@7bc2f501 myClass false
MyCaseClass(myCaseClass,true) myCaseClass true

scala> 


In case, if you miss it, I have started the new repl session and redefine the classes. You will notice the ":paste" before MyClass definition tho. It is because we want the repl session to know that the following object with unapply method is the companion object to the class above. If you forget ":paste", don't worry, the repl will warn you. "unapply" method takes MyClass object and return the fields like the default constructor wrapped in the Option. Then we see the unapply extractor method of MyClass at work for pattern matching in the for loop near the bottom like MyCaseClass. Do you notice how pretty "case class" makes MyCaseClass's toString ? Your interviewer will then said "Ok, I am convinced that "case class" is better. What else does it do? How about if I want a new constructor?" Well, "case class" makes default constructor fields to be immutable (Scala always encourages immutable type.), so you don't have to state "val" explicitly. However, you will have to do that for "var" tho. If you want new constructor, you can create them yourself but to use it, you have to use "new" keyword because there won't be any compiler generated overloaded "apply" method for that new constructor.

Then, the interviewer is happy and you get the job at twitter. Don't forget me when twitter goes IPO and you become rich,OK?

Update
So, one interviewee came back to me and said that the interviewer said there are more to the case class. Well, if it is said so, let's take a look:

aunndroid@ubuntu:/host/linux/learning_scala/notes$ cat MyCaseClass.scala
case class MyCaseClass(name:String,isCase:Boolean){
  override def productPrefix="MyPrettyCaseClass"
}


We will use scalac -print option to peep into it. Please do it at home, it is safe. Actually I have omitted some interesting lines of codes for brevity, that's why.

aunndroid@ubuntu:/host/linux/learning_scala/notes$ scalac -print MyCaseClass.scala 
[[syntax trees at end of cleanup]]// Scala source: MyCaseClass.scala
package <empty> {
  case class MyCaseClass extends java.lang.Object with ScalaObject with Product with Serializable {
  .
  .
  .

  };
  final <synthetic> object MyCaseClass extends scala.runtime.AbstractFunction2 with ScalaObject with Serializable {
  .
  .
  .
  
  }
}


Now, we know that case class also has mixin of Serializable and Product trait. We know SerializableWhat is Product? Product enables the case class' fields to be accessed without reflection. Here is the Product scaladoc API and the following is the code snippet.

aunndroid@ubuntu:/host/linux/learning_scala/notes$ scala
Welcome to Scala version 2.9.1.final (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_26).
Type in expressions to have them evaluated.
Type :help for more information.

scala> :load MyCaseClass.scala
Loading MyCaseClass.scala...
defined class MyCaseClass

scala> val myCaseClass=MyCaseClass("aunnDroidCaseClass",true)
myCaseClass: MyCaseClass = MyPrettyCaseClass(aunnDroidCaseClass,true)

scala> myCaseClass.productArity
res3: Int = 2

scala> myCaseClass.productElement(0)
res4: Any = aunnDroidCaseClass

scala> myCaseClass.productElement(1)
res5: Any = true

scala> for(x <- myCaseClass.productIterator)println(x)
aunnDroidCaseClass
true

scala> myCaseClass.productPrefix
res7: java.lang.String = MyPrettyCaseClass


We have skipped the canEqual function. productArity provides the number of fileds in the class while productElement (n: Int) let us access those field with index. productIterator gives us the Iterator[Any] object which we use with for comprehension to print out the fields. We have overriden productPrefix, which is part of the reason why the case class has a pretty toString method.

Hope the interviewer is happy this time.

Authored by Win Myo Htet

Tuesday, November 22, 2011

Learning Scala : Recursion and TCO 2

Authored by Win Myo Htet



In the previous post, we talk about the TCO. Now, we will do some TCO'ed recursion exercise. I have picked an example  from "Programming Interviews Exposed" on P.97. Here is the problem definition : "Implement a function that prints all possible combinations of the characters  in a string. These combinations range in length from one to the length of the string. Two combinations that differ only in ordering of their characters are the same combination. In other words, "ab" and "ca" are different combinations from their input string "abc". But "ba" is the same as "ab""

We want to write a tail call recursive code with accumulator, which will give us the final result, at the end of iteration. We have to remember that we don't want any computation on the way back in these recursive calls. We will accumulate the result along the way so that we don't have to do backtrack computing. Before we go on analyzing the code, let me disclose something first. In the past, when I started coding, I tried to implement this problem by myself without reading at the authors' analysis. I failed. :) I was quite frustrated with my attempts and didn't try it again until I have started learning recursion in Scala. I try again this time. Yay! I made it. ;) With these experience, I have understood the problem and answer well. I am able to do educated grouping to form some patterns.

Yet, I still need paper and pencil! Nothing can beat paper and pencil in this age of computing when it comes to do pattern analyzing scratches for the problem. Holding pencil in the right hand like a sword, while paper in the left hand like a shield, I march on! When I am implementing the simple recursion for this problem, I look at the two logically grouped tables in the book (I won't be drawing these tables here. Buy the book!) It helps me with my thinking and visualizing of the data grouping and the flow of the data in the code too. Now, I am trying to go through the data flow where I don't have to backtrack for computing, the following data flow emerge.


input: w x y z 
=======
result: w wx wxy wxyz wxz wy wyz wz x xy xyz xz y yz z 

mydata flow
-----------
w
x
  wx 
y
  wy  xy
          wxy
z
  wz  xz  
          wxz
               yz
                   wyz  xyz  
                             wxyz  

Authored by Win Myo Htet


First, we will have 'w', which we will carry it in our accumulator to the next iteration. Then, we have 'x'. We will keep the 'w'. We want to add 'x' and also our first combination of character 'wx' too. I would like to go over this again. We start with 'w'. Carry it over. Get a new character 'x', keep the 'w', add the new 'x' and add 'w' and 'x' combination. Carry all these over in the accumulator to the next iteration. We still want to keep whatever we have before and add a new 'y' and the new characters combinations of 'y' and the previous list. Now, you know what happens to Z. If you don't, well, read this whole paragraph recursively until you get it. ;)

So basically, we have the accumulator and add a new character to the accumulator and also add a new list which has the combination of the new character with the string in the accumulator. Cool!

import scala.annotation.tailrec
 final def tailReccursionEx(str : String) : List[String] = {

  @tailrec
  def doTailRecursionEx(str : String, pos : Int, accu : List[String]) : List[String] = {
   if (pos == str.length) return accu
   else {
    val newAccu = accu ++ accu.foldLeft(List[String](str(`pos`).toString)) {
     (l, ch) => l :+ ch + str(`pos`)
    }
    doTailRecursionEx(str, pos + 1, newAccu)
   }
  }

  doTailRecursionEx(str, 0, List[String]())
 }

To make life simpler, I have started foldLeft process with the List with the new character. You can get the complete source code here .

When you look at the complete source code from the provided git repo link. You will see the comparison of imperative, recursion and TCO'ed recursion. They all look comparable. Here is another FP recursion implementation from the book (the one after the above example). When I implement the regular recursion, I am able to write it out. However, I have to make considerable effort to get it TCO'ed. The TCO'ed code is more verbose than regular recursion and at the same time also suffer from the performance impact.

Why do we want TCO again? To prevent side effect. To prevent stack over flow. To prevent our precious application from crashing. Is TCO worth the time and energy? Do you have better tip and trick ? Please let me know. I get to this haskell thread when I start looking into recursion via Ackermann.  I was thinking of implementing it TCO'ed. Having read the thread and seeing the TCO destroy the beauty of recursion and the extent needed to go to achieve the TCO, I don't want to try anymore. Yup, these are the ugly sides of FP and recursion. It is the different side of the same coin. We need to have discussion going how to tackle these issue to get FP traction for wider use. Here is one of the methods called trampoline to address some TCO issue in Scala. It is still not easy to use. It is advanced stuff. Well, it's an option.

StackOverflow discussion on this.


Authored by Win Myo Htet

Learning Scala : Recursion and TCO 1


Authored by Win Myo Htet



fib=1:1[a+b|(a,b)<-zip fib(tail fib)]
My buddy texted me the above Haskell Fibonacci code, which uses  recursion, of course. Very neat, huh? We have also seen recursion making QuickSort self documenting  and gracefully simple. I have also used recursion in my Boyer-Moore search without making a conscious effort. Yes, recursion is at home in functional programming.

Recursion can make thing simple. Yet, simple is not easy! It takes recursively using recursion to master recursion. Ok, that's not a very good joke. But you get the point that you need to keep using recursion whenever possible to learn to think in recursion if you want to be the master in the land of functional programming. In Scala, you can get away with, without using recursion, then, you are not really in functional programming land. Assume that you have decided to start using recursion buying into my much lauded hype, well, let me raise the bar a bit then. With great power comes ... I mean, "With Recursion comes this nasty bug stack overflow."

Great... Well, all hope is not lost, sort of... The stack overflow happens because the function keeps calling itself thus putting itself on the stack memory again and again until it runs out of stack memory. The remedy to this is TCO (Tail Call Optimization). In TCO, we are using a particular pattern of recursion so that the compiler can optimize the code to avoid the stack overflow. Let's try lookat some code.

scala> List(1,2,3,4,5,6).foldLeft(10)(_ + _)
res0: Int = 31

Are we still talking about recursion? Yes, we are. I have mentioned in one of the blog before that foldLeft is preferred over foldRight. The reason being is that foldLeft is TCO'ed while foldRight is not. Here is a bit detail analysis of their process.

Here is the foldLeft sequence:
((((((10 + 1) + 2) + 3) + 4) + 5) + 6)
((((((11) + 2) + 3) + 4) + 5) + 6)
(((((13) + 3) + 4) + 5) + 6)
((((16) + 4) + 5) + 6)
(((20) + 5) + 6)
((25) + 6)
(31)

Here is the foldRight sequence:
(1 + (2 + (3 + (4 + (5 + (6 + 10))))))
(1 + (2 + (3 + (4 + (5 + (16))))))
(1 + (2 + (3 + (4 + (21)))))
(1 + (2 + (3 + (25))))
(1 + (2 + (28)))
(1 + (30))
(31)

In foldLeft, you can see that the result can be computed before the next iteration, while foldRight have to traverse all the way to the end, then compute and backtrack for more computing. Alright, how do we do that for our own Fibonacci code? Following the above foldLeft example, here is our Fibonacci code

object Fibonacci{

  def main(args:Array[String]):Unit={
    println(fibonacci(6))  
  }

  import scala.annotation.tailrec
  private def fibonacci(n:Int):Int={
  
    @tailrec
    def fibonacciTCO(n:Int,accu:Int):Int={
      if(n==1) return n+accu
      fibonacciTCO(n-1,n+accu)
    }
    
    fibonacciTCO(n,10)
  }  
}

Authored by Win Myo Htet

We have to include proper import above fibonacci and @tailrec annotation above fibonacciTCO. The fibonacci function has to be declared private or final so that the method cannot be overridden to undo the TCO. You also need to have accumulator (e.g. In our case accu) so that we can carry on our result and return  the final result at the last iteration without needing to backtrack all the way back. If you look carefully at the code, you will notice that, our (near ?) result(accu) is always in the same function scope and the stack reference to the previous call is not needed any more. Now, to make sure that our method is TCO'ed, we will look into the byte code of our compilation using javap with -c and -private option on the class file of Fibonacci$.class itself.

aunndroid@ubuntu:/host/Users/aunndroid/workspace/source_code$ javap -c -private Fibonacci\$
Compiled from "fibonacci.scala"
public final class Fibonacci$ extends java.lang.Object implements scala.ScalaObject{
public static final Fibonacci$ MODULE$;

public static {};
  Code:
   0: new #9; //class Fibonacci$
   3: invokespecial #12; //Method "":()V
   6: return

public void main(java.lang.String[]);
  Code:
   0: getstatic #19; //Field scala/Predef$.MODULE$:Lscala/Predef$;
   3: aload_0
   4: bipush 6
   6: invokespecial #24; //Method fibonacci:(I)I
   9: invokestatic #30; //Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
   12: invokevirtual #34; //Method scala/Predef$.println:(Ljava/lang/Object;)V
   15: return

private int fibonacci(int);
  Code:
   0: aload_0
   1: iload_1
   2: bipush 10
   4: invokespecial #42; //Method fibonacciTCO$1:(II)I
   7: ireturn

private final int fibonacciTCO$1(int, int);
  Code:
   0: iload_1
   1: iconst_1
   2: if_icmpne 9
   5: iload_1
   6: iload_2
   7: iadd
   8: ireturn
   9: iload_1
   10: iconst_1
   11: isub
   12: iload_1
   13: iload_2
   14: iadd
   15: istore_2
   16: istore_1
   17: goto 0

private Fibonacci$();
  Code:
   0: aload_0
   1: invokespecial #48; //Method java/lang/Object."":()V
   4: aload_0
   5: putstatic #50; //Field MODULE$:LFibonacci$;
   8: return

}

aunndroid@ubuntu:/host/Users/aunndroid/workspace/source_code$ 

Don't worry too much if you don't know anything here. Well, if you know everything and if you are the master of the Scala Universe, you might even not bother to read this, right ? ;) So, we find our TCO'ed fibonacciTCO$1, at line 30. At line 47, instead of re-referencing itself, we find this 17: goto 0. When you see that goto expression, you can be rest assured that your tail call recursion has been TCO'ed. The tail call recursion get optimized by scala compiler using goto. Compare to other functional languages, Scala has some restraints. Since Scala run on JVM and JVM does not support TCO(yet), Scala requires that tail call recursion must be self recursive tail call (e.g. fibonacciTCO must call fibonacciTCO).

Yup, that's quite a bit of info. I don't conceive these knowledge auto-magically. I am not the creator of Scala. I have read the blogs and books. So, I also recommend you to keep on reading about recursion and encourage you to blog about it too. Reading the same info from different narration will help you grasp the concept better. Here are the two I have read: Graham's blog and Nick's blog. They are also explaining the same info in their own way. As usual, we will be getting into some code in the next blog in this series.



Authored by Win Myo Htet

Friday, November 18, 2011

Learning Scala : Boyer–Moore search 3

Authored by Win Myo Htet


We have both the first table and second. Now we just need to create a startSearch method, which will take the two tables and two Strings : searchBody and searchString. We will return a List[Int],  which will have positions of the match strings, thus the size will be the number of the matches.

def startSearch(searchBody : String, searchString : String,
  pos : Int, table1 : Map[Int, Int],
  table2 : Map[Char, Int]) : List[Int] = {
  
  def search(pos : Int, inc : Int = 0) : List[Int] = {
   if (pos >= searchBody.length - 1) return List[Int]()
   val jump = matchChar(pos + searchString.length - 1, inc)
   if (jump > 0) return search(pos + jump, 0)
   if (jump == 0) return search(pos, inc + 1)
   if (jump == -1) 
     return pos :: search(pos + searchString.length, 0)
   List[Int]()
  }

  def matchChar(pos : Int, inc : Int) : Int = {
   if (!searchBody.isDefinedAt(pos - inc)) return -2
   
   if (!table2.isDefinedAt(searchBody(pos))) 
     return searchString.length
   if (searchBody(pos - inc) 
       == searchString(searchString.length - 1 - inc)) {
    if (searchString.length - 1 - (inc + 1) == -1) return -1
    return 0
   }
   val result1 = table1.get(inc).get
   
   if (!table2.isDefinedAt(searchBody(pos - inc))) 
     return result1   
   val result2 = table2.get(searchBody(pos - inc)).get
   if (result2 > result1) return result2
   result1
  }
  
  search(pos)
 }


We have two nested functions inside our function. We check the validity of our search continuity by searchBody string length. We invoke matchChar function to get the jump value. -1 means we found a string match. 0 means the character match. All the positive value means, we will jump that much position. The sort of default return is empty list. Inside matchChar, we will do the validity check and return -2, which will fall through our default empty list return in search function. We return the length of the searchString if the last character is not a valid character according to table2. Then, we do the individual character matching. If it is matched, we will check to see it is the last character to match. If it is the last character matched, we found a match String and return -1. If it is not the last character matched, we will return 0. If the characters are not matched, the two tables are referred for the jump value and larger value is return. In case of the character not being in the table2, we cannot jump the whole searchString's length since there still is a possible match inside the range.

Below is some output. I admit that it is not well tested. You can get the complete source code here.

Look at the two Map values of ANPANMAN
aunndroid@ubuntu:/host/Users/aunndroid/workspace/source_code$ scala boyer_moore.scala 

Usage: boyer_more string_body search_string
e.g. : boyer_more "HERE IS A SIMPLE EXAMPLE" "ANPANMAN"
Map(A -> 1, M -> 2, N -> 3, P -> 5)
Map(0 -> 1, 5 -> 6, 1 -> 8, 6 -> 6, 2 -> 3, 7 -> 6, 3 -> 6, 4 -> 6)
ANPANMAN
HERE IS A SIMPLE EXAMPLE
       -
        ANPANMAN
HERE IS A SIMPLE EXAMPLE
               -
                ANPANMAN
HERE IS A SIMPLE EXAMPLE
                       -
Match found : 0 Pos : 

Here is the search replica of step by step B-M search demo from the official site <?>

aunndroid@ubuntu:/host/Users/aunndroid/workspace/source_code$ scala boyer_moore.scala "HERE IS A SIMPLE EXAMPLE" "EXAMPLE"
Map(E -> 6, X -> 5, A -> 4, M -> 3, L -> 1, P -> 2)
Map(0 -> 1, 5 -> 6, 1 -> 7, 6 -> 6, 2 -> 6, 3 -> 6, 4 -> 6)
EXAMPLE
HERE IS A SIMPLE EXAMPLE
      -
       EXAMPLE
HERE IS A SIMPLE EXAMPLE
             -
         EXAMPLE
HERE IS A SIMPLE EXAMPLE
               -
         EXAMPLE
HERE IS A SIMPLE EXAMPLE
              -
         EXAMPLE
HERE IS A SIMPLE EXAMPLE
             -
         EXAMPLE
HERE IS A SIMPLE EXAMPLE
            -
         EXAMPLE
HERE IS A SIMPLE EXAMPLE
           -
               EXAMPLE
HERE IS A SIMPLE EXAMPLE
                     -
                 EXAMPLE
HERE IS A SIMPLE EXAMPLE
                       -
                 EXAMPLE
HERE IS A SIMPLE EXAMPLE
                      -
                 EXAMPLE
HERE IS A SIMPLE EXAMPLE
                     -
                 EXAMPLE
HERE IS A SIMPLE EXAMPLE
                    -
                 EXAMPLE
HERE IS A SIMPLE EXAMPLE
                   -
                 EXAMPLE
HERE IS A SIMPLE EXAMPLE
                  -
                 EXAMPLE
HERE IS A SIMPLE EXAMPLE
                 -
Match found : 1 Pos : 17
aunndroid@ubuntu:/host/Users/aunndroid/workspace/source_code$

That's it, folk!


Authored by Win Myo Htet

Learning Scala : Boyer–Moore search 2

Authored by Win Myo Htet



In the previous post, I have explained the second table before the first table. The reason is that the second table is easy to understand. We will do the same and code the second table first.

def createTable2(str : String) : Map[Char, Int] = {
  val strOps = new StringOps(str).reverse.tail

    val result = strOps.foldLeft(Map[Char,Int](),1)  {
      case ((mMap,pos),ch) if ! mMap.isDefinedAt(ch) 
        => (mMap ++ Map(ch -> pos), pos+1)
      case ((mMap,pos),ch)  => (mMap,pos +1)
    }
    return result._1
We convert our search string, str, to StringOps type, reverse it and drop the first charcter while keeping the rest. If it is of StringOps format, we can do manipulation like List. It is being reversed so that we can operate foldLeft (foldLeft is preferred over foldRight whenever it is possible). We drop the first character, which was the last character before because the last character is not considered in the second table. After that, our usual foldLeft manipulates the string to have a tuple of Map[Char,Int] and Int. The last Int is to track the position of the iteration process. We add each character into the Map if it is not there. Anything else, we just increment the pos to track the iteration position. We return the Map from the tuple.

The second table is done and we will work on the first table. It is pretty much the same pattern: StringOps, reverse, foldLeft, tuple, Map, etc ... The only differences will be calling calculate function and using the filter on the List. filter filters out character based on the condition. Here is filter's signature "def filter (p: (Char) ⇒ Boolean): String"

def createTable1(str : String) : Map[Int, Int] = {
  val strOps = new StringOps(str).reverse
  val strList = strOps.foldLeft(List[Char]()) {
   case (l, ch) if !l.contains(ch) => ch :: l
   case (l, ch) => l
  }
  val result = strOps.foldLeft(Map[Int, Int](), 0, "") {
   case ((mMap, pos, subStr), ch) if pos != 0 => (mMap ++ Map(pos -> calculate(pos, subStr, `str`,
    `strList`.filter {
     case x if x == ch => false
     case _ => true
    })), pos + 1, ch + subStr)
   case ((mMap, pos, subStr), ch) => (Map(0 -> 1), pos + 1, ch + subStr)
  }
  result._1
 }

There also is a (`backtick quote`), which allows us to refer to the variables outside of the scope. So, the general structure of the code is similar to that of the second table. Here we have Map[Int,Int] instead of Map[Char,Int]. I have decided to use Int, the length of sub string for the Map key. strList with filter will give us a new List with valid character. For example, in valid characters List (A,M,N,P), M will generate List (A,N,P). The calculate function will use strList to create a new sub string and get their position in the search string by calling subSetJump function.

def calculate(pos : Int, subStr : String,
  str : String, strList : List[Char]) : Int = {
  
  val jList = strList.foldLeft(List[Int]()) {
   (j, ch) => subSetJump(ch + `subStr`, `str`) :: j
  }

  if (jList.max < 0) {
   if (subStr.length == 1) return str.length
   return calculate(pos - 1, subStr.slice(0, subStr.length),
    str, List[Char]())
  }
  if (str.length - 1 - pos - jList.max == 0)
   return calculate(pos - 1, subStr.slice(1, subStr.length),
    str, List[Char]())
  str.length - 1 - pos - jList.max
 }

def subSetJump(subStr : String, str : String) : Int = {
  val list = (for (i <- 0 until str.length)
   yield str.slice(i, i + subStr.length)).toList
  list.indexOf(subStr)
 }

subSetJump function will return the index or -1 if there is no match. We check the condition on end-of-string, mis-match and zero-jump; and recursively call the calculate function again with the updated pos and subStr. When we call the calculate function again, we are doing the case for backtrack. That means there won't be any variation of sub strings. Thus, I have added the code for that above jList. It can be a bit of redundant and messey but I am a bit more into getting the working code fast ;) .

calculate function with more code
def calculate(pos : Int, subStr : String,
  str : String, strList : List[Char]) : Int = {
  if (strList.isEmpty) {
   val jListMax = subSetJump(subStr, str)
   if (jListMax < 0)
    return calculate(pos - 1, subStr.slice(1, subStr.length),
     str, List[Char]())
   if (str.length - 1 - pos - jListMax == 0)
    return calculate(pos - 1, subStr.slice(1, subStr.length),
     str, List[Char]())
   return str.length - 1 - pos - jListMax
  }

  val jList = strList.foldLeft(List[Int]()) {
   (j, ch) => subSetJump(ch + `subStr`, `str`) :: j
  }

  if (jList.max < 0) {
   if (subStr.length == 1) return str.length
   return calculate(pos - 1, subStr.slice(0, subStr.length),
    str, List[Char]())
  }
  if (str.length - 1 - pos - jList.max == 0)
   return calculate(pos - 1, subStr.slice(1, subStr.length),
    str, List[Char]())
  str.length - 1 - pos - jList.max
 }

Now, we are done with the first table too. Next we will use these two tables to do the search.

Authored by Win Myo Htet

Thursday, November 17, 2011

Learning Scala : Boyer–Moore search 1

Authored by Win Myo Htet


I have been tipped off about Boyer-Moore for this Learning Scala series. Here is the wiki. Basically, it tries to do the matching of the search string at the rear, thus if it does not match, it can hop the length of the search string. So, the longer the search string, the more efficient it is.

As usual, the wiki will be our common reference. It is said that BM maintain two tables. The second table check if the character is in the search string or not. If the character is not in the search string at all, then, you can easily jump to the length of the search string. It also stores each character of the search string with the value which is the distance from the end of the string of the character.

It is the first table that is quite complex to construct. I would recommend you  to  read again and again until you understand the construction of this table before you get to the code itself. This table will determine if the program will do proper thorough search without bugs or not. Let's try to dive into the first 4 rows of the table since the rest of the value is the same as the 4th row.

(The current character inspected is in bold and next nearest possible match is underlined )
Case : AANPANMAN 
1st row. If the character is not N, it'd better be A or M or P. Otherwise, we will hop the length of the search string, which is 8, the length of the search string, according to the second table. So what is the minimum jump we can make, now that the character is not N. It has to be 1 jump to cover the possibility of the character being A. We assign the value 1 here.

Case : ANPANMPNANPANMAN 
2nd row. Now that we have the last character matching N and get to the character before it, we have the case of string "AN". So if the character is not A, as usual, it'd better be M or N or P. Then, we have strings of "MN", "NN" and "PN". None of them are part of the actual search string "ANPANMAN". The value for not matching the needed character lead us to assign 8, the length of the search string.

Case : AAAANPANMAN
3rd row. "MAN" means "AAN", "NAN" and "PAN". Of these, "PAN" is the sub string of our "ANPANMAN". The wiki said that its jump value is 3. How do we get 3 ?  We know that the matched P (which is M) from our current search position is 5. And P position in the search string is 2. We will subtract these values and we get 3.

Case : "AAAAAMANPANMAN"
4th row. "NMAN" does not give us valid sub string.  So, we will backtrack one more to look for the valid sub string, "NMAN", "MAN", which is a substring of "ANPANMAN". Our position of M in our search String is 5. The position of the first sub string of MAN in search string also is 5. If we do the difference, 5-5=0 and we are not making any jump. We don't want that! Let's backtrack one more time. "NMAN", "AN" is a sub string of "ANPANMAN".  Our A position in sub string NMAN in our search is 6 and the position of the first sub string AN in ANPANMAN is 0. 6-0=6.

If you have thus far followed me, good! Let's go over this table  one more time!

1st row redux. This is the first case and there are little info to do any sort of meaningful processing. We simply can assume that the value will always be 1.

2nd row redux. This row is an odd one in the sequence of 1, 8, 3, 6, ... The jump value 8 is meaningful for the search string ANPANMAN, where AN is not a sub string of the search string,ANPANMAN . How about if these two characters string is a substring of the search string. For example, search String is ANPANMPN and our case is AAAAAAANPANMPN, where PN is AN. So, it is similar to 4th row above now. So, in this case, we will have have 6-0=6.

Now that we have a bit of general pattern, we won't go over the rest of the row. The first row will always be 1. From 2nd and onward, we can do processing in a way that if the string of length 2 is not substring of the search string, the value will be the length of the search string, else we will always look for the difference of the position of the character in a current string to the position of the first valid sub string in the search string. If the difference is 0, we will backtrack the character and do the processing again. Now that we have analyzed what we can, let's try to put them into code next.

Authored by Win Myo Htet

Monday, November 14, 2011

Learning Scala : writing Quicksort the functional programming way

Authored by Win Myo Htet


The previous post has been about doing Scala exercise from the Haskell example. Thus, the code is already written in the functional programming paradigm and I have just translated it into Scala language. To learn Scala and functional programming, we have to learn to think in the functional programming way. How do we do that? As usual, there is no easy way. We can only learn through getting our hands dirty by exercising through thinking in functional programming way.

So, Quicksort has been picked for this exercise and the Quicksort wiki will be used as a common reference. I will start implementing Quicksort in Scala by following the pseudo code described under the sub title "In-Place version". The Wiki has talked about some methods of choosing the pivot. However, it is not trivial to choose the pivot and it all depends on the set of data for the pivot-chooising-method to be meaningful. So, we will just pick the value of the first index of the array. Remember, the first value of the array can still be any value. So, here they are, very Java like.

Partition method:
def partition(left : Int, right : Int, pivotIndex : Int, array : Array[Int]) : Int = {
 val pivotValue = array(pivotIndex)
 array(pivotIndex) = array(right)
 array(right) = pivotValue
 var storeIndex = left
 for (i <- left until right) {
  if (array(i) <= pivotValue) {
   val temp = array(i)
   array(i) = array(storeIndex)
   array(storeIndex) = temp
   storeIndex = storeIndex + 1
  }
 }
 val temp2 = array(storeIndex)
 array(storeIndex) = array(right)
 array(right) = temp2
 storeIndex
}

quickSort method
def quickSort(left : Int, right : Int, array : Array[Int], direction : String = "begin") : Unit = {
 if (left < right) {
  val pivotIndex = 0
  val newPivotIndex = partition(left, right, pivotIndex, array)
  quickSort(left, newPivotIndex - 1, array, "left")
  quickSort(newPivotIndex + 1, right, array, "right")
 }
}

The codes are simply the implementation of the pseudo code. The only thing that  differ from pseudo code might be that I have used the Scala's default argument variable for "direction" for debugging purpose.

To be honest, I don't quite know how to re-implement this code into functional programming way. So, I have to get back to the basic abstract under sub title, "Algorithm".
1. pick a pivot(done!)
2. put a lesser value on the left, greater value on the right, put pivot where it belongs, which is between lesser and greater, of course.
3. do the same process for lesser and greater by way of recursion.

So how do we do step No. 2. Well, first, let's pick List as the data structure for our method since List is the preferred data structure when it comes to functional programming. The input will be List. Then, there are lesser and greater created based on the condition that the value of element of the List lesser(<) than pivot go to lesser List and  the value of element of the List greater(>) than pivot go to greater List. How do we do that? We have to feed our variable List into some looping mechanism that will give us back two Lists.

From my experience with Oleg's Tic Tac Toe code. He makes use of foldLeft quite a lot. So, folding might be a solution?  Yes, it is. If you read that blog on my understanding of the folding process, you will see that we can feed any collection into folding and can process the data for all kind of type of data as an end result. If you want to know how I come into this conclusion along with the thinking  process, you need to read through that  Tic Tac Toe blogs series.

So, folding will be the looping mechanism. I have understood the folding process to involve (foldLeftSeedValue,foldLeftElement). The foldLeftSeedValue will be our end result and we want it to be two Lists namely: lesser and greater. So, the seed value has to be a tuple of two Lists : lesser and greater. Then, we will do the matching on the conditions : lesser value will be added to the left List and greater value will be added to the right List. If you can visualize this far, that's good enough and let's look at the code.

def myQuickSort(list : List[Int]) : List[Int] = {
 def myQuickSort(list : List[Int], 
      direction : String = "begin") : List[Int] = {
  if (list.length <= 1) return list;
  val pivot = list(0)
  val pl = list.foldLeft(List[Int](), List[Int]()) {
   case ((l, r), e) if e < pivot => (e :: l, r)
   case ((l, r), e) if e > pivot => (l, e :: r)
   case ((l, r), e) if e == pivot => (l :+ e, r)
   case ((l, r), e) => println("Error processing : " + e);
                                     sys.exit()
  }
  myQuickSort(pl._1.slice(0, pl._1.length - 1), "left") 
                ++ List(pivot) ++ myQuickSort(pl._2, "right")
 }
 myQuickSort(list)
}

I have used nested method, which is not necessary, here. Let's get back to our foldLeft and case condition here. The first case line start with the form of (foldLeftSeedValue,foldLeftElement) as expected, foldLeftSeedValue is (l,r) tuple. foldLeftElement is e. Then if the condtion expression is met, the foldLeftElement is added to the head of the list l and the (l,r) tuple is passed on for the next iteration. The same can be said of the second match case. The third match case add the value to the end of the list l. The forth match is added because of the compiler warning that matching is not exhaustive. It makes me smile tho. :)

Now, we have a tuple of two Lists: l being lesser and r being greater(on the right side of the pivot value). So we process the lesser and greater List the same way by means of iteration. We have to subtract one value from the lesser because we have added value equal to pivot at the end of the lesser list including the pivot itself. The pivot value has been put into a List which is placed between lesser and greater. The recursion will be exhausted when the list size is less than or equal 1. You know what's next.

So, that's the end of my Quicksort in functional programming way in my own style(TM).  However, you cannot count on me to write a truly efficient and fp style Quicksort because I am still a noob! Let's see how Quicksort is coded in Haskell land.

quicksort :: Ord a => [a] -> [a]
quicksort []     = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
    where
        lesser  = filter (< p) xs
        greater = filter (>= p) xs

Wow! Much way simpler! If it were not for Alan J. Perlis's wisdom: "Simplicity does not precede complexity, but follows it.", I would have shot myself! Let's look into this Haskell quicksort code. The detail analysis of this code can be found here at section 3.1. The first line is method signature. The second line said that if the list is empty, return empty list. The third line syntax is a bit tricky. If you don't know that tricky syntax (along with the last line of greater = ... ), you will think that there is a bug(I do think tho). If you were like me, you would have deduced that p is pivot. Good. xs is an input List. Hmm...

If xs  were input List,  the resulting List size would have increased because in the last line, greater includes values equal to pivot while we are also manually adding pivot in a List in the middle of lesser and greater. Then, what is xs? Trust me, I am not Omniscience nor super genius. I have to ask my buddy! :D So, this Haskell buddy tells me that xs is a tail List and p(pivot) is the first element of the input List. Now, it all makes sense!

Here is the Scala implementation of that QuickSort

def hsQuickSort(list : List[Int], 
  direction : String = "begin") : List[Int] = {
 if (list.length <= 1) return list
 val tail = list.tail
 val pivot = list(0)
 val lesser = (for { l <- tail
     if l < pivot
     } yield l).toList
 val greater = (for { g <- tail
     if g >= pivot
     } yield g).toList
 hsQuickSort(lesser, "lesser") ++ List(pivot)
 ++ hsQuickSort(greater, "greater")
}

So, this has been my exercise on thinking in functional programming way. My original FP quicksort is not the most optimized nor simple but I can proudly say that it is my original. I would also encourage you to try to make yourself think and code in FP way. In the beginning, the code will be messy and inefficient but don't get discouraged. Jon Bentley, who gives talk on "Three Beautiful QuickSort", has mentioned of going over 11 versions to a piece of code to achieve the utmost simplicity. Cheer!

Authored by Win Myo Htet

UPDATE

MergeSort
def mergeSort(list:List[Int]):List[Int]={
 if(list.length<=1) return list
 val mid=list.length/2
 val left=mergeSort(list.slice(0,mid))
 val right=mergeSort(list.slice(mid,list.length))
 merge(left,right)
}

def merge(left:List[Int],right:List[Int]):List[Int]={
 if(!left.isEmpty && right.isEmpty)return left
 if(left.isEmpty && !right.isEmpty)return right
 if(left.isEmpty && right.isEmpty)return List[Int]()
 
 if(left(0)<=right(0)) return List(left(0))++merge(left.tail,right)
 else return List(right(0))++merge(left,right.tail)
}

Haskell MergeSort
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys) = if x <= y
                      then x : merge xs (y:ys)
                      else y : merge (x:xs) ys

mergesort [] = []
mergesort [x] = [x]
mergesort xs = let (as, bs) = splitAt (length xs `quot` 2) xs
               in merge (mergesort as) (mergesort bs)



Authored by Win Myo Htet
 

Sunday, November 6, 2011

Learning Scala: doing Scala exercise from Haskell's example

Authored by Win Myo Htet

In my previous post, I complain about the lack of exercise from the Scala book for the learner to do exercise. So, I have been thinking of getting myself some exercise and talk with my friend who has inspired me to try functional programming. I have done the mandatory fizzbuzz interview on him regarding Haskell. I have updated his Haskell solution along with my Scala one in the previous post. I tell him about my plight and he blesses me with the following Haskell code from here.

type Point = (Float,Float)
type Color = (Int,Int,Int)
type Polygon = [Point]

writePoint :: Point -> String 
writePoint (x,y) = (show x)++","++(show y)++" "

writePolygon :: (Color,Polygon) -> String 
writePolygon ((r,g,b),p) = "<polygon points=\""++(concatMap writePoint p)++"\" style=\"fill:#cccccc;stroke:rgb("++(show r)++","++(show g)++","++(show b)++");stroke-width:2\"/>"

writePolygons :: [(Color,Polygon)] -> String 
writePolygons p = "<svg xmlns=\"http://www.w3.org/2000/svg\">"++(concatMap writePolygon p)++"</svg>"

colorize :: Color -> [Polygon] -> [(Color,Polygon)] 
colorize = zip.repeat

rainbow@[red,green,blue,yellow,purple,teal] = map colorize [(255,0,0),(0,255,0),(0,0,255),(255,255,0),(255,0,255),(0,255,255)]

main = do
writeFile "tut0.svg" $ writePolygons (blue [[(100,100),(200,100),(200,200),(100,200)],
                                            [(200,200),(300,200),(300,300),(200,300)]])

The code has been modified to be able to compile as filename.hs. What does the code do? It creates a svg file, which draw a rectangluar with blue line. Since I have already provided the original link, I won't be going into it in detail. Let's try to implement our Scala equivalent.

The first three lines of the Haskell code is pretty much class declaration in Scala with Polygon having a List of Point.

In Haskell, the syntax structure for method is that method signature come first and then method body come next with leading method name. Haskell is space sensitive like Python.  In method signature, methodName::input->outPut is the format.

writePoint is getting the value of Point in String format and it is pretty much Point's toString. 

writePolygon is getting a String with the values of Color and Polygon embedded in some XML format. So basically, Color and Polygon does not have dedicated method like Point to print their value. Their values have been constructed in the line.


writePolygons create the XML String with the parameter being List of (Color,Polygon) Tuple, which get feed into writePolygon individually for the respective XML string result.

Ok, that's good enough to try some Scala, let's create three classes: Point, Color and Polygon. Since their values are always printed, we will override toString and return the desire format. It is straight forward to do that for Point and Color. Polygon has the List Collection of Point as its value and we will be using foldRight to print out the desire value. Now that overriding toString is done, we will just drop these object into our writePolyGon and writePolyGons method.

That is easy. Here comes more interesting part. Before I start coding, when I was look at colorize and my friend asks me to read it, I said that it is currying and he said that it is correct. He said that the method definition's expression has dropped their parameter since it is inferred. Thus, colorize=zip.repeat. Now, I have to read a bit at the site to see what they do so that I know how to code them in Scala. It is said that repeat make use of Haskell's lazy pattern. Ok, it seems alright, Scala also has lazy val. Basically, repeat wait for the the next parameter List and create a new List of the size of the paramter List and fill it with the first parameter . zip zipps'em up these two. Sound good.

I don't know rainbow@ line. My friend is gone now. But it seems that it is a list of partial function, functional literal, created by colorize by having an initialize color value. I skip to the last line and read it and it appears so. Since, my Scala knowledge is quite weak, I don't recall anyway to emulate the Haskell's rainbow@ line. (I will be very happy to learn if there is a way). So, I won't be having rainbow for now :(  but blue only. So, let's see what I have done.

class Point(val x:Float,val y:Float){
override def toString()=x+","+y
}

class Color(val r:Int,val g:Int,val b:Int){
override def toString()=r+","+g+","+b
}

class Polygon(val pList:List[Point]){
override def toString()=pList.foldRight(""){(x,s)=>x+" "+s}
}

def writePolyGon(c:Color,p:Polygon):String="<polygon points=\""+p+"\" style=\"fill:#cccccc;stroke:rgb("+c+");stroke-width:2\"/>"

def writePolyGons(tList:List[(Color,Polygon)]):String
={"<svg xmlns=\"http://www.w3.org/2000/svg\">"+
tList.foldLeft(""){(s,x)=>s+writePolyGon(x._1,x._2)}+
"</svg>"
}

def colorize(c:Color)(pList:List[Polygon]):List[(Color,Polygon)]
={
lazy val repeat=(for(p<-pList)yield(c,p)).toList
repeat
}

val blue=new Color(0,0,255)
val blue_ = colorize(blue)_

val rectangle1=new Polygon(List(new Point(100,100),new Point(200,100),new Point(200,200),new Point(100,200)))
val rectangle2=new Polygon(List(new Point(200,200),new Point(300,200),new Point(300,300),new Point(200,300)))

printToFile("tut0ofScala.svg",writePolyGons(blue_((List(rectangle1,rectangle2)))))



def printToFile(fileName:String, content:String):Unit
={printToFile(new java.io.File(fileName))(p=>p.println(content))}

/*http://stackoverflow.com/questions/4604237/how-to-write-to-a-file-in-scala*/
def printToFile(f :java.io.File)(op : java.io.PrintWriter => Unit)
={  val p=new java.io.PrintWriter(f)
try{ op(p)} finally{p.close()}
}

I have created 2 rectangle objects instead of feeding non descriptive data. Scala not having the file writing mechanism as part of the standard library does not also help. I have to run to SO for that as it can be seen in the comment line. Definitely, my Scala code will have a lot of room for improvement and I sure want to learn them.



Authored by Win Myo Htet

Friday, November 4, 2011

Learning Scala : Advice of the exp... Noob

Authored by Win Myo Htet



So I have learned a bit of Scala and I am exposed a bit to some Scala community and learning material. Well, I have one idea on helping the Scala learner. I have disclosed my level of knowledge on Scala and the length of exposure to Scala here. Having said that, let me jump the Gun! Of course the official Scala site would like to tout about the available resources to learn: the mailing list, the books, blog, irc and some training. It seems quite a lot but still they are not enough.

I want to Scala IRC channel. I feel like being in Shangri-la, very tranquil and quiet with all the sage like participants. Very very occasionally, there will be some discussions of topic beyond my level. I feel uncomfortable to disturb this tranquility with my ignorant questions. I have very little luck with mailing list  before and I don't even bother. The training courses are few and far between at far away location with the Enterprise price ... Books and blogs are my source of knowledge. Let's not talk about the collection of books you can choose from to begin learning Scala. I can recommend you two books: "Programming in Scala" by Martin Odersky (the Scala language author) and "Programming Scala" by Dean Wampler and Alex Payne . I pick the later.

From reading the book and some blogs, I have this idea to help learner grasp functional programming better. It is a very innovative idea of having an exercise with solution after each chapter! You see, I have been typing in the examples from the book I am reading. Yet, I am a bit clueless on where to start using these knowledge on my own. I would like to try them out but I don't know my capabilities or limits or the proper usage. I have learned that Scala is such a powerful language so that the developer can even shoot their own foot if not use properly from the comment from this blog.

So, I think that the book author can provide some exercises at the end of each chapter for the learners to get their hands dirty with. Then, they can also provides the solution with proper comments to guide the learners to learn the proper usage patterns and thus to avoid shooting their own foot. Isn't my idea wonderful?

Anyway, let me try to get my hands dirty on my own. I have heard that it is required to code fizzbuzz in the interview. So let's give it a try.

def createInList():List[Int]=(for(i <- 1 to 100)yield i).toList

def fizzBuzz(ilist:List[Int]):List[String]
= ilist.foldRight(List[String]()){
 (x,list)=>{
  if (x%15==0) "fizzBuzz"::list
  else if(x%5==0)"Buzz"::list
  else if(x%3==0)"fizz"::list
  else x.toString ::list
 }}

val inlist= createInList
println(inlist)
println(fizzBuzz(inlist))

Here it is, so am I in? When can I start ?

Authored by Win Myo Htet

UPDATE
my friend Haskell's fizzbuzz

fizzBuzz :: Int -> [Char]
fizzBuzz x | x `mod` 15 == 0 = "FizzBuzz"
| x `mod` 3 == 0 = "Fizz"
| x `mod` 5 == 0 = "Buzz"
| otherwise = show 


Authored by Win Myo Htet

Wednesday, November 2, 2011

Chewy code : scala TicTacToe part 4

Authored by Win Myo Htet


The print statements have helped me see a lot of the moving parts, enhanced the performance and demystified some of the code as stated in Part 3. Especially, the yield line in whoWon method. Even then, it still takes a while for me to understand won_? which is called inside whoWon.

private def won_?(moves: Seq[Option[Move]], m: Move): Boolean = moves.foldLeft(List(0)) {
    case (count :: rest, Some(`m`)) => count + 1 :: rest
    case (counts, _) => 0 :: counts
  }.max >= WinCount

The fault lies with my untrained eye for Scala reading and the column alignment of "count ::" and "counts," and the new introduction of back tick, Some(`m`), does not help too. At first, I don't even know why I cannot read the code. I keep staring at it and even the print statements are not helping. Then, I am able to analyze the construction pattern after a while. I came to realize that I don't know back tick and I am wondering how else the comma "," works in List. I know the con(constructor) "::" usage but not comma ",". Then, the eureka moment set in(please don't visualize me doing anything crazy, I just sit there before the screen and enjoy the moment.). Comma "," works like comma does and I started seeing the (foldLeftSeedValue,foldLeftElement) pattern. From there, the mysteries start breaking away.

For the first case match, foldLeftSeedValue is count :: rest, which deduced to List,  and foldLeftElement is Some(`m`). Since we are doing the matching here, we are matching foldLeftElement. The back tick allow us to refer back to the variable out of this code segment scope, thus we are matching foldLeftElement, which is in the form of Option[Move], to see if it is the same as object m of Move from the parameter, boxed inside Some. If it is, then we are passing over the expression Int::List while increasing the count.

If the foldLeftElement is not Some(`m`) or rather correctly, for anything else, _, the second case will be triggered. 0 is prepended to the List returned.

I still don't like how the Int object count is created in the thin air tho. Anyway, at the end max looks for the largest element in the List and compare. Thus, we come to the completion of understanding Oleg's Tic Tac Toe code. I have done a fork version of it with my glorified print statement and modification here. Again, I thanks Oleg for such an awesome Scala code and permission to blog about it. I hope to be able to write like this soon. Yeh, the code is quite chewy.





Authored by Win Myo Htet

Chewy code : scala TicTacToe part 3

Authored by Win Myo Htet


In part 2, we get to the method whoWon and I am stuck there. I couldn't understand the code here on ward. I am thinking that I want to see the moving part of the code literally. I have been just reading the code visualizing the flow in my own mind. Now that I cannot understand the code, I can't visualize the code flow. I want to see them. Then, I notice that what I have been missing is time-tested-log-print-out from these methods. At first, I don't even know where and how to add the logging print statement in this tightly compacted code. I look and look, and start to see  those one liner code. I made them two liner code.

One liner code
private def horizMoves(board: Board, x: Int, y: Int) 
= for (xx <- x until board.size) yield board(xx)(y)

Two liner code
private def horizMoves(board: Board, x: Int, y: Int) 
  ={println("horizMoves:"+x+","+y); 
  for (xx <- x until board.size) yield board(xx)(y)}

From my very limited exposure of the functional programming code (well, only this Tic Tac Toe code ;) ), I start to make assumption.  No wonder it is said that functional programming is hard. FP is touted as a very powerful language and thus, it will only need a few lines of code where else OO and other imperative  languages are very verbose. So FP dev would make an effort to show off their prowess by writing as few line as they can. There is nothing wrong with that. Of course, I would like to show off too. Well, the motive of writing a very compacted code has side effects. It requires more efforts on the next person to understand the code and to modify the code since the codes are tightly/compactly integrated. The  developers might overlook the benefit of a few more lines of code because they want to write this very cool compacted code. I might be wrong, please enlighten me. For me, method won_? is a testament to that. It is a one liner code with foldLeft (I will come back for this foldLeft later). After I added the print log statement and watching the printed logs, I kinda notice that it can have less execution of code by one condition statement.

  private def won_?(moves: Seq[Option[Move]], m: Move): Boolean 
  = {println(m+":"+moves)
  if (moves.size<WinCount) return false
  moves.foldLeft(List(0)) {
    case (count :: rest, Some(`m`)) =>println(count+"::"+rest); count + 1 :: rest
    case (counts, _) =>println("counts : "+counts); 0 :: counts
  }.max >= WinCount}

In method won_? , from the printed log, I notice that it is unnecessary to process on collection of size less than 3. Why not do the check and return if the collection size is less than 3 instead of processing everything.

To be honest, I am awed by Oleg, the original developer of this code. I wonder how he can manage to think up these code without needing a print statement to see the flow. I also wonder if I will get his mental prowess of thinking up all these code in my mind when I am proficient enough in FP. Well, printing a  log statement still pays tho. In method whoWon, where I am stuck at, I am able to reduce more execution because of the printed log. In the printed log, I come to see that this method is processing both X and O while the current mover can only win or draw, thus the other mover does not matter, so we only need to process current mover. Finally, it is also the printed statement, which I have inserted there after dismantling some code, and which lead me to the understanding of this code block and I am able to reduce more code execution.

def whoWon(game: Finished[Board],turn:Move): Option[Move] 
  ={println("\nwhoWon game turn : Function called
             \n==========================")
    val forYield=(for {
      x <- 0 until game.board.size
      y <- 0 until game.board(0).size
      curr <- game.board(x)(y)
      if curr==turn
      if won_?(game.board, curr, x, y) 
    } yield Some(curr)) 
    println("forYield : "+forYield)
    forYield.find (_.isDefined) getOrElse None
}

Here, I have modified whoWon and added another paramter Move object turn and if condition statement to make sure it is only executed for current mover. I would like to think that's 75% performance improvement. I have also introduced variable forYield  because I am having a hard time understanding what the yield line does. So instead of returning directly from one liner, I have introduced print log at the top, a variable and print log for that variable. Then, I notice that the developer has written the code with no (programming)break mindset. Thus, the loop is allowed to finish and only after the loop, the validity of the result is checked before returning. Below is the revised code which will not need to do complete loop. If the result is valid, it will just return, thus not needing the break as intended in Scala design. When there is no valid result, the loop will finish and None is return. It's just that we developers are so used to a break from other languages.

def whoWon(game: Finished[Board],turn:Move): Option[Move] 
  ={println("\nwhoWon game turn : Function called
             \n==========================")
    (for {
      x <- 0 until game.board.size
      y <- 0 until game.board(0).size
      curr <- game.board(x)(y)
      if curr==turn
      if won_?(game.board, curr, x, y) 
    } return Some(curr)); return None
}

I have been introducing new lines or replacing the code to increase the performance. Well, I also reduce the code to increase the performance.

Shall we say, 1.5 liner code ?
private def finished_?(board: Board) 
   =board.flatMap(_.flatten).size == board.size * board(0).size 
   || whoWon(Finished(board)).isDefined
my truly 1 liner code.
private def finished_?(board: Board) =whoWon(Finished(board)).isDefined

In my 1 liner case, we don't even need finished_? and put its one line code where it is being used.

Here is the fix for the exception indexOutofBound bug mentioned in blog part 2

Here is the bug
 case List(Int(x), Int(y)) if x < 3 && y < 3 
  => Some(Position(x, y))

Here is the fix
 case List(Int(x), Int(y)) if (0 until 3 contains x) 
  && (0 until 3 contains y)  
  => Some(Position(x, y))

So I have put all the code that I have modified here and stated my assumption and reasoning behind these modification. I have been tinkering with this modification as I am learning and studying Scala and the code itself. I might be making the wrong assumption or making an unnecessary modification which can result in unaccounted cases of exceptions or they can be of not FP pattern and usage. Please do enlighten me.

Well, I still have left some of the codes I am stuck at. If you still have the stamina, please read on in the next part.

http://blog.aunndroid.com/2011/11/chewy-code-scala-tictactoe-part-4.html


Authored by Win Myo Htet


UPDATE
Oleg has written a blog on a very neat debug print method.

http://hacking-scala.posterous.com/side-effecting-without-braces


Authored by Win Myo Htet

Tuesday, November 1, 2011

Chewy code : scala TicTacToe part 2


Authored by Win Myo Htet


We have not got far in the previous blog, part 1, have we? We have not even made a move yet! In the previous post, we get the object Move from  whoseTurn to be printed as a turn prompt below the board, after that we return game. We will enter the breakable now. breakable(Java's break) is not part of the Scala language but included in the Scala standard library because of the popular demand (so the story goes) starting from Scala 2.8. The lack of break in Scala before has some impact on the developers who have to code with the knowledge that there is no break (no pun intended). We will even see such impact here later.

breakable {
    Source.stdin.getLines.map(_.split("\\s*,\\s*").toList 
    match {
      case List(Int(x), Int(y)) if x < 3 && y < 3 
           => Some(Position(x, y))
      case _ => println("Invalid position, should be: x, y");
           None
    }).filter(_.isDefined).map(_.get)
      .foldLeft(game: Game[Board]) {
      case (g @ InProgress(_), p) =>
        move(g, p) match {
          case game @ InProgress(_) => prompt(game)
          case game @ Broken(_, problem) 
          => print("Problem: " + problem); prompt(g)
          case game @ Finished(_) =>
            println(draw(game) +"\n" + 
            "Game finished, " + whoWon(game) + " won!");
           break; game
        }
      case _ => println("Wrong state!"); game
    }
  }

The program is running and waiting for the input from the prompt. Once the input data in the format x,y is read from the stdin, it is converted to a collection of List type. Then, do the match to see if the List is of a collection of 2 integers where both integers are to be below 3. (Oh! I just notice that there is a bug. If the coordinates are below 0, it will crash with exception, I have looked up and found the fix in SO! I will include the fix in the code at the end of the blog.) The valid entries are used to create an object Position which is boxed in Some and returned. Anything else will result in println("Invalid position, should be: x, y") and return None(Option, Some and None is used there to avoid "the billion dollar mistake", null. Here is some reading on Option.) The line with the keyword filter said that if the valid data boxed in Option is defined, (_.get) will unbox object Position from Option and feed it into foldLeft. If the Game[Board] g, seed value, is of InProgress type and foldLeft element p of Position object, we will do the move!

def move(game: InProgress[Board], p: Position): Game[Board] 
   =(game.board(p.x)(p.y),
    placeMove(game.board, p, whoseTurn(game))) 
    match {
      case (Some(move),  board) 
      => Broken(board,"Position was already taken by "+move)
      case (None, board) if finished_?(board) 
      => Finished(board)
      case (None, board)  => InProgress(board)
}

Method move is a one liner code, sort of, because the return data is executed by the very first line where the keyword match is. The rest of the lines are part of that match's scope only. So what is match matching here? match is matching the tuple of (Option[Move], Game[Board]). (More on tuple here) placeMove, which return Game[Board], is used as a function literal (delegate in C#, no direct Java equivalent but Java's interface+anonymous inner class comes close) here.  From here, the next method to read is finished_?, which lead to whoWon. So, I have been able to follow the methods flow quite nicely and quite happy about it. Yup, the time has come and whoWon stops my progress for another day.

http://blog.aunndroid.com/2011/11/chewy-code-scala-tictactoe-part-3.html



Authored by Win Myo Htet

Chewy code : scala TicTacToe part 1

Authored by Win Myo Htet


First of all, I would like to thanks Oleg for permission to write a blog about his Scala Tic Tac Toe code. I got to his code through Tony Morris' Scala exercise with types and abstraction which defines the requirement.

I have started learning Scala 10 days ago after getting some inspiration from a friend who is poking around Haskell on a side. With my background in java and after a few R&D on the web, I have chosen Scala, my language of choice for functional programming. Since, I have never been exposed to functional programming in  any form before, I might be making an ignorance assumption and I would be glad to be enlightened.

When I first saw Oleg's Tic Tac Toe code, I have barely finished reading the first 3 chapters and half of Chapter 8 of "Programming Scala" by Dean Wampler and Alex Payne. Scala run on JVM and is credited with having a similar syntax to Java. Yet, the code looks very foreign to me at first sight. I don't even know how to run the program. So before I can digest the Scala really well, I have to chew this code thoroughly. As I chew, I am able to increase the performance of the code 80%-120% (my own guesstimate.) Let's the chewing begin.

God bless StackOverFlow! SO have been such a great resource for me in this journey of understanding the code. I am a bit eager and overwhelmed, and does not know how to run the program, while running the code should have been as easy as follow :

aunndroid@ubuntu:/host/linux/learning_scala$ scalac TicTacToe.scala 
warning: there were 1 deprecation warnings; re-run with -deprecation for details
one warning found
aunndroid@ubuntu:/host/linux/learning_scala$ scala TicTacToeGame
Welcome to Tic Tac Toe!

  |   |  
--+---+--
  |   |  
--+---+--
  |   |  
X>

Scala does not require that the file name be the class name (though "Programming Scala" has already mentioned about that in the chapters I have read.) and the real execution begin in the singleton TicTacToeGame, thus, a bit of confusion ensues and I have to run to SO and still have to figure them out myself. (The deprecation is because I am compiling with Scala 2.9.) I have fixed that in my learning code. I am able to run the program, Yay!

Since, I have not read the chapter 4 of "Programming Scala", I don't know much about trait, one of the major feature of Scala. I didn't know the keyword sealed also, I just take a brief look at the object structure of the inheritance (I even didn't notice the keyword case, which go in hands with sealed, being there at that time.) For now, I am more interested in the moving part and the flow. So, I get into TicTacToeGame object scope. First method, createGame, which return a Board with the data structure being Seq[Seq[Option[Move]]] (which can be similar to Option<Move>[][] in java lingua) created using for and yield. OK, I am able to follow them.

Next come method prompt, then method draw. Because I don't know that mkString is like Java's StringUtil.Join, (mkString is used in the book in Chapter 2 and I just don't remember.) I was a bit lost on how those lines are drawn nicely but all is good when I google "scaladoc mkString" and find this. All is well.

Then came method whoseTurn, where I am stuck for a day.

def whoseTurn(game: InProgress[Board]): Move =
    game.board.flatMap(_.flatten).foldLeft((0, 0)) {
      case ((x, o), X) => (x + 1, o)
      case ((x, o), O) => (x, o + 1)
    } match {
      case (x, o) if x - o <= 0 => X
      case _ => O
    }

I have seen foldLeft in the book but not flatMap and flatten. Again, SO to my rescue! game.board of Seq[Seq[Option[Move]]], has been changed from 2 dimension collection to 1 dimension collection of Seq[Option[Move]] by flatMap. After that, flatten has unboxed Option so that the collection will become Seq[Move], on which foldLeft has iterated through for data processing. Wow, quite easy to narrate these two process now. Easier said than done, I guess.

We still have foldLeft to talk about. I have to refer back to the book cause I  have basically been skimming that chapter 8 : "Functional Programming in Scala" to the half way where it has touched the subject, folding. It covers folding but not in conjunction with matching.  foldLeft's parameter is a seed value and also is "the end result data type(and value)" (that is my wording, correct me if I am wrong.) Through foldLeft(or foldRight), the collection can be manipulated/processed to be a single variable or a totally different form of collection as a result. Here we are feeding Seq[Move] to foldLeft where it will give us a pair of integers as an "end result data type" as its parameter(seed value) (0,0) dictates.

Usually inside foldLeft scope, the data to be processed is in the form of (foldLeftSeedValue,foldLeftElement) upon which the expression is applied. So we will have ((Int,Int),Move). I have to express using the data type because the variable names: x and o along with unhelping object types:X and O can confuse the beginner (it sure does to me.) Here the developer has used pattern matching to check Move type to powerful effect. If element Move is of object type X, variable x is incremented by 1 (Java's x++ won't work) maintaining the form of  (Int,Int) thus (x+1,o). Then the seed value (Int,Int) is passed along for the next iteration. The case will be the same for object O.

Finally, we finish the folding and get to the line with keyword match where it is doing the matching again. If the end result is of (Int,Int) tuple with the if condition, then the object X of descendant of Move is returned. Otherwise, for anything else, the object O is returned. The variable x,o of (x,o) inside this match has nothing to do with the  x and o of the previous foldLeft scope. Can you imagine the kind of effect these x,o,X,O,x,o, <=,=>; and double matching and double flattening has on me ? I sure would never turn back to Scala, if Oleg, the developer, had used operator /: instead of method name foldLeft. :) (Please tweet using hashtag #dietscala to discuss about it)


Having chewed through this code and compiled some note on flatmap, flatten and foldLeft from my research, I thought I get foldLeft good. I am in for surprise by more chewy code of foldLeft coming up, along with the delight in enabling performance enhancement and some find on debugging. 


http://blog.aunndroid.com/2011/11/chewy-code-scala-tictactoe-part-2.html


Authored by Win Myo Htet