Wednesday, December 3, 2008

Problem Solving Episode

In this section of the slog, i have decided to write out my solution for question 4 in assignment 1. This was perhaps one question in all the questions we did in all lectures/assignments which caused me the most confusion and took the most time so i think it is worthy of being given a detailed planned solution for.

1) How i Understood the problem:

I had to read the question multiple times to understand the gist of it. I am a visual learner so its very difficult for me to grasp a concept until i can visualize it and make neat diagrams of. I was to soon realize in next couple of weeks that this was perhaps the best example of a question where drawing diagrams is your enemy!

The problem seemed to be revolved around ternary trees, an unheard of term for me. We are told that an empty tree /\ is an element of this ternary tree. This immediately rings bells for a base case in the making. Next it described a typical element in a ternary tree where (T_L, T_r,T_m, R) would combine together in a unique way to set up a tree. R is also given to be a node which is not common to either of T_L, T_r, T_m so i was thinking of it as a root node.

We are also given the definition of equivalent ternary trees, and told that there is one ternary trees with zero nodes and one ternary trees for one node. This seems to be leading to 2 base cases.

We are to find a recursive formula for the number of non-equivalent ternary trees for n nodes and prove that our formula is correct. It was here where i had a hunch that this would be a classic example of a complete induction proof since :

a) Complete induction wasn't tested for in any of above 3 questions in A1
b) It looks like our formula will account for the number of nodes of each subtrees which will naturally be smaller than the number of trees in the entire ternary tree. So complete induction is a perfect tool for a proof of such a case where a problem is non-uniformly dividing into many sub-problems.

2) How i devised a plan.

As i mentioned above, the most natural thing for me to do was to immediately start drawing diagrams. And so i began. Within half an an hour i was able to draw 3 non-equivalent trees for two-node, 12 non-equivalent trees for three-node. A 4-node tree was much more complicated but i eventually drew out a model which counted up to 55 trees. I was hoping i hadn't ran into the famous off-by-one computer science error. And it added to my anxiety when i caught a discussion on the bulletin boards where a group of students were switching back and forth between 55 and 54 trees. Danny soon cleared the air and let it be known that there indeed are 55 trees. I was glad i got it right in the first shot!

And then i got worked on making pictures for 5-node trees. I wished one of the imaginary angels i had learnt about in high school poems had descended down on the earth and whispered a piece of wisdom in my ears warning me against heading this direction. It sure would have helped me save ton of time. I wont walk through my process of devising pictures and bore the already bored reader of this episode but i did wind up somehow making around 350 different trees which shamefully contradicted Danny assertion that there exists only 273 trees for 5-node ternary trees.

After 3 days of annoyance and head-beating against my table i decided to change my course of action, devise a new strategy, think out of the box. I reduced the problem from a ternary tree question to a binary tree question and again started on ill-advised long and arduous task of drawing pictures. But drawing pictures for binary trees turned out to be much more sweeter than for the ternary trees and immediately i was staring to see a pattern. However it was combinatorial and not recursive so i did not help my cause.

I went to see Danny in his office hours and whined about his evil genius mind and how he can devise such problems. He was kind enough to realize the gravity of the difficulty of the question and announced a new due date. It was much appreciated but i was sure not going to leave this problem until later. So i went back to work later that night.

This time, i had convinced myself to not spend 3 hours in drawing pictures only. Danny mentioned that the number of nodes in left, middle and right subtrees will add up to be n-1 (quite obvious since we don't account for the root node). After playing around with this number for a while, i finally had my "ah-ha" moment. Suddenly i was kicking myself at not getting this at the first attempt. I had a plan.

3) Carrying out the plan

I noticed that since the sum of the three subtrees will add up to be n-1, the left subtree can at most have n-1 nodes ( since by the definition of the question, if the right and middle trees are empty then left tree is not then we have non-equivalent subtree). This property was critical at realizing the solution to the problem. If the middle and right subtrees were empty, the left most could have at most n-1 nodes so i formed the first part of my equation

T(n) = n-1

Immediately after this i snapped that even if the middle and right subtrees are not empty, the left tree can still not have more than n - 1 nodes. I was starting to smell a summation equation here

T(n) = \Sigma(from i=Zero to i=n-1)T(i)..........

Then it was easy to notice that if the left subtrees have any nodes, then the middle subtree can have at most (n-1-(total number of nodes in left subtree) ) so i could further develop my equation above and get :
T(n) = \Sigma(from i=Zero to i=n-1)\Sigma(from j=Zero to j=n-l-i)T(i)T(j)

And then the right most subtree will end up with n-1-i-j nodes from basic math so the total equation was finalised to be

T(n) = \Sigma(from i=Zero to i=n-1)\Sigma(from j=Zero to j=n-l-i)T(i)T(j)T(n-1-i-j)

It was important for all terms to be multiplied by each other since for each i, there would be different pair of nodes for j and hence the nodes for right subtree ( I must mention that i would not have realized this easily if i had not taken STA257 this semester)

4) Now i of course had to prove this equation as well.

First my preposition,
P(n): n For n=0,1 T(n)=1
For all n>1, T(n)=Sigma(i;0;n-1)Sigma(j;0;n-1-i)T(i)T(j)T(n-1-i-j)

As i had a hunch, the question did turn out to be a complete induction proof. (Duh!!)

Proof by using complete induction:

Assume j is a natural number such that 0<=j
Case 1: If n=0, 1.
If ternary tree T has zero node, then there is only one ternary tree with zero nodes. The number of non-equivalent ternary will be one. Alternatively, if n has one node then there will be one ternary tree. The number of non-equivalent ternary will be one as well. As our preposition states, so P(0),P(1) holds.

Case 1 can easily go down as our base case as well but since Danny likes to break down his complete induction proofs on cases, i am following the in-class method to proof this question

Now our induction Step:

Case 2: If n>1, then there is at least one node in this ternary tree T.
Then T must have a root node and subtrees or subtrees. We call them L,M,R, namely, left subtrees, middle subtrees, right subtrees, respectively.

* Then the ternary tree with n nodes will have n-1 nodes in its subtrees.
* Then number of nodes in L (denoted as i), will be in range of [0,n-1], so by induction hypothesis, P(L) holds.
* Similarly, the number of nodes in M (denoted as j) will be in range of [0,n-1-i], since L occupies up to i nodes. So by induction hypothesis, P(M) holds.
* So the remaining number of nodes (denoted as k) in this ternary tree must be in R with it holding up to n-1-i-j nodes.(range[0,n-1-i-j] where L and M will occupy i+j nodes.) So by induction hypothesis, P(R) holds.

If we have i nodes in the Left subtree, j nodes in the middle subtree, and k in the right, then one of the possible tree, say, denoted by T[i,j,k]. Since we know by induction hypotheses, P(i), P(j), P(k) holds. Thus, for the tree T[i,j,k], the number of non-equivalent subtrees will be T(i)*T(j)*T(k). Moreover, i,j and k all having the possible values in their ranges as mentioned above. The non-equivalent tree will n nodes will be the sum of all the possible T[i,j,k] trees.

Since T(i),T(j),T(k) holds as proven above, the sum of product of these subtrees give us total number of non-equivalent trees for nodes n. The formula is presented as Sigma(i;0;n-1)Sigma(j;0;n-1-i)T(i)T(j)T(n-1-i-j) holds for n nodes ternary tree.

By the prove above, we conclude that for n /in N, P(n) holds


The above case 2 basically follows from our discussion above.


Looking back

Woah. In my two years in university, i guess this is the first question which has really exercised the hell out of my brain. It was however very enjoyable to think about a tree question in such an analytical way. It perhaps taught me a lesson on how to systematically connect every step to one another and devise a neat proof by putting it all together. A Perfect assignment question!

No comments: