Zen of programming

March 18th, 2006

A key belief of Buddhism is that human life is suffering (苦海): suffering from birth, aging, sickness and death (生老病死). One important theory of Buddhism is that how the world looks is determined by how you think about it (象由心生). The cause of suffering is really greed, emotions and addiction (贪,嗔,痴). To avoid all these, you need to become a Buddhist, which means taking a training in which you learn more and more deeply about your life and rank higher and higher as you go. It is very much like a school system: you start in primary school and gradually move up to middle school, high school, college, graduate school, postdoc, etc. In Buddhism, depending on how well you are trained, you get different titles as well, and all these titles are named after Buddhist gods. This encourages people to see themselves as becoming different levels of Buddhist gods, and also allows people to compare with each other what level they are at.

So what is “Zen”? Zen is simply one way (法门) of getting trained, among many, many different ways. Zen is an accelerated program, just like a talented-student program, that gives people who are born smart (慧根) a quicker route to becoming the highest-ranked Buddhist god. The key theory of Zen is to “think outside the box” (顿悟). When I point my finger at the moon and tell you, “hey, look at that moon”, you shouldn’t look at my finger; you should look at the moon. When you read a book, you don’t care how the letters are printed; you want to grasp the meaning behind those characters. Zen is all about thinking on a higher level. When a flag is waving, everyone sees that the flag is waving, but a Zen student should identify that it’s the wind that waves the flag. An even better trained Zen student should realize that it’s neither the flag nor the wind; it’s really your mind that makes you perceive a waving flag. It is this deeper and deeper understanding of why we are here, what we are, and how we feel about different things that finally makes you understand why human life suffers, and eventually saves your life from being miserable.

Taking a break here: you may have learned a different interpretation of Buddhism and Zen, and if yours is different from what’s here, yours could be wrong or a derivative :-) There are so many different interpretations out there, and they have really polluted Buddhist and Zen theories. Both are actually very concise and elegant, yet people have derived so much unrelated stuff from them. Maybe I should start a new blog on Chinese philosophy and Buddhist theories. Every single bit of it is a shining piece of wisdom.

Anyway, how is this related to programming at all? Well, practicing the Zen of programming is just my way of saying: generalize your program so that it runs on a higher level. A higher level means that it can take a larger set of inputs, or in other words, that it can handle more situations, so that it becomes a more general program.

But why? Isn’t extreme programming telling us the exact opposite, “don’t do any extra work to over-engineer anything”? No, no. It is dangerous for a young, junior programmer to take advice like that; it is only good for veteran and experienced fellows who need to pull themselves back from a habit of over-engineering. When I first started programming, all my focus was on how to write a program that finished the job. There was absolutely no over-engineering at all. I didn’t really know how to make my programs general enough to handle more problems. Oh well, not that I “didn’t know”; I just didn’t have any tendency to make that happen. Only when you develop yourself to a higher level, so that finishing a programming task is no longer a problem, do you start to pay more attention to the interactions of different programs in a system, to how the quality of software affects maintenance, to how easily a program breaks on new requirements, to how important the data model is in minimizing code changes, etc.

So, when you are young, don’t read anything about over-engineering, because you won’t do it, and you are just not capable of doing it. When you are older, maybe.

More importantly, making your program more general is a learning process. It is a process that helps you identify how programs are related to each other, helps you understand data transformation, helps you understand code transformation, and helps you understand how software should be maintained over years. Yes, “years”; we are not talking about some graduate school projects that only have a life cycle of a couple of months. When you write a piece of code that you know for sure is only for a short period of time, just get the job done and forget about it. When you are writing a piece of code that’s meant to last, you had better be serious about over-engineering it. The longer it will last, the more thought you should put into thinking ahead.

Many have argued with me that working this way isn’t fast: code that would otherwise take 3 days takes longer to finish. Yes, and this is exactly why you need to keep practicing, so that 3 days become 4 days instead of 4 months. This is exactly why you should take every daily programming task as a test that develops your habit of being a good “think-ahead”-er. It is all about experience. Once you are used to it, planning never takes an unreasonable amount of time. When you are on a higher level, “whether I should over-engineer or not” is never a problem, because there is no time difference at all!

Many have also argued with me that a more general program is cryptic. This can happen, but normally it’s not because of the complexity the program takes on; it’s really because the code is not written cleanly enough, and this is why I’m writing all these blog posts :-) . There is of course a tendency toward that, esp. with higher-order constructs like templates and generics, but there are a lot more times when a program all of a sudden becomes so clear, just because it’s written on a higher level.

In real life, a nicely shared component, meaning a general-purpose library, will save tons of development time and QA time. How much of that saving is counted in the “over-” vs. “under-” engineering equation? It is only badly written over-engineered components that might be taken as bad examples of “planning too much”. The Zen training of programming will help you avoid exactly that.

How, though? How can I improve my code to make it more general? I may use my next blog post to show you a real example, after which you may tell me it’s not that hard at all.

 

Cross ruff or set up long suit

March 18th, 2006

Very often you need to ruff one or two cards to set up a long suit, and sometimes you need to cross ruff to ruthlessly take the number of tricks you need. There are also times when you need to prepare for both.

Board 21

North led HA and H8. It was a piece of cake for me: I simply cashed diamond AK and club A, ruffed one round of hearts with spade 2, and then claimed that I would cross ruff the rest to make 6.

What if North leads a trump instead? East takes it and immediately plays clubs to start setting up the club suit. Ruff with spade AQJ, so that West can play two trumps back to dummy for 2 more rounds of club ruffing. Ruff diamonds for two rounds to get two additional entries to East, so that you can finally cash the good club suit. Still making 6.

Well, we only bid to 4S…

Mission impossible

March 11th, 2006

Sometimes there is only one chance to make magic happen. This 4H is just that close to me:

2006-03-04-01

East led s9, and West played s4! I didn’t ask LaoGao why he played that card, but it gave me really the only chance to make this 4H. When North plays H10 to start drawing trumps, surprisingly enough East plays the A, strongly indicating that it’s a singleton! The best defense now is to play two rounds of spades, and North must not ruff. Instead, throw a diamond, and 4H is sure to make. No matter what suit West returns, play a diamond to the A, and use H9 to play two rounds of trumps to capture the HQ. Losing two spade tricks and one trick to the HA, 4H just makes.

This isn’t what happened, though. I pulled H4 for no reason (just like the s4 from West) and sadly saw the HA drop. The moment I saw the HA, I knew that I had blown it: an otherwise beautiful 4-1 heart distribution silently slipped away with all the agony of defeat. W-E didn’t make any further defensive mistakes and continued to play spades until I ruffed in the North hand. My last hope was to cash 3 club tricks and overruff the HQ on a diamond, but it didn’t go my way. So down 2 on this hand, with all regrets. But hey, that was the 1st hand, at 6:00am in the morning, and yeah, I told myself “wake up”.

Data-code separation

March 7th, 2006

When I was writing my last post, I spent most of my time struggling with WordPress. It just doesn’t handle code examples well. I lost my formatting every time, and I had to go back and fix it manually. Anyway, I felt I had left something out of that post that I’d like to add here.

The goal of data-code separation in logic-oriented programming is to purify logic. A piece of logic is considered more “purified” if it depends less on the values of its input data. For example, the string parser code I gave should continue to run when the delimiter is some other character. When logic is purified, it becomes a more general program that can potentially handle more input data (the set of acceptable inputs is larger). This increases the chances of code sharing and makes it easier to localize code logic. [Of course there is a tradeoff between convenience and power, but how to balance them is a separate discussion.]

One great example of data-code separation is the success of XML. No matter how many people dislike it and consider it bulky, there are a lot more people who have adopted it. One of the reasons is that it helps many applications separate data from code really well. No doubt other formats can do the job as well (or better), but historically XML became the first winner. A winner of what? Data-code separation. By describing input/output data in a generic way, data can be organized in a hierarchy and even be constrained by means of schemas. All of a sudden, a program that was written for one piece of data also works for other pieces of data, as long as they have the same shape (conform to the same schema). All of a sudden, programs written by different groups of people work on the same shape of data as well. This process of generalizing a piece of code so that it handles a whole group of data is called logic purification.

Logic purification can be as simple as converting a literal into a variable of a primitive type (like the string delimiter example), or grouping a couple of variables to form a class, all the way up to defining a schema of XML objects. They are all great examples of identifying code that would work for a whole group of data. In this sense, object-oriented programming does a job similar to XML’s in terms of separating data from code, except that the two have different emphases.

This reminds me of another data-code separation story: SQL and relational databases. By defining data in tables and describing the relationships between them, generic code like SQL statements can be written to perform the ultimate quadruple: INSERT, UPDATE, DELETE and SELECT. As in the XML story, all of a sudden an SQL statement works on any table with the same schema. Standardization like SQL92 and the data access model of ODBC are great examples of how far we can escalate our code to a very high level once data become uniform. This level of code-data separation benefits lots of people and applications.

But both XML and SQL are examples of the best we can do to generalize data. In real life, we don’t necessarily have to borrow them for simple tasks. A point can be defined simply as a class with (int x, int y), and that is already good enough to write simple functions like DrawDot() against. It is the habit of always looking for possible ways to generalize your data model that eventually benefits you and your group a lot, by letting you share pieces of code to a great extent. When that happens, it’s called “Zen of programming”. See my next post about what I mean by that.
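As a rough illustration of the point-and-DrawDot() idea, here is a minimal sketch. Only Point and DrawDot come from the text above; the text canvas (Canvas, MakeCanvas) is a hypothetical stand-in for a real drawing surface, assumed just to keep the example self-contained.

```cpp
#include <string>
#include <vector>

// The data model: a point is just two integers with a name.
struct Point {
  int x;
  int y;
};

// Hypothetical text canvas standing in for a real drawing surface.
typedef std::vector<std::string> Canvas;

Canvas MakeCanvas(int width, int height) {
  return Canvas(height, std::string(width, '.'));
}

// Works for any Point; points outside the canvas are silently ignored.
void DrawDot(Canvas& canvas, const Point& p) {
  if (p.y < 0 || p.y >= static_cast<int>(canvas.size())) return;
  if (p.x < 0 || p.x >= static_cast<int>(canvas[p.y].size())) return;
  canvas[p.y][p.x] = '*';
}
```

Nothing in DrawDot() knows where a particular point came from, so any code that produces a Point can share it.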

Offensive unblocking

February 27th, 2006

My partner gave me this hand.

            North nbyuhong
            S  9 6 3 2
            H  K 9 6
            D  5 2
            C  Q J 9 3

West                        East
S  K Q 10 8 7               S  A 5
H  10                       H  J 7 5 4 3
D  J 10 8 7 4 3             D  A K Q
C  K                        C  7 6 4

            South LaoShen
            S  J 4
            H  A Q 8 2
            D  9 6
            C  A 10 8 5 2

West    North   East    South
pass    pass    1H      2C
2S      pass    3D      pass
4D      pass    4S      all pass
==============================================

North led CQ-4-A-K, and South continued with C5-H10-C9-6, and so did North CJ-7-10-S7. Plan the play.

It would be dumb for West to draw trumps down to the last one, then play diamonds, get stuck and surrender. It’s also not that clever to play 3 rounds of trumps and then 3 rounds of diamonds, waiting for North to ruff, as North will not ruff until East has to play a heart or club into West. With no trump control, West has no way to cash all the diamonds. My partner found the best play: play one round of trumps to East’s A, and immediately play one round of diamonds. Then start drawing ALL the trumps from the West hand, throwing away East’s remaining diamonds (K and Q)!

Data make code dirty — purifying logics

February 25th, 2006

Last time I talked about the one master principle of logic-oriented programming: “keep one and only one logic in one single place”. Today I’m going to talk about a technique that derives from this master principle. The technique is very simple, and you may already have been using it in practice for a long time: separate data from code.

When you mix data with code, your code becomes (1) less general, because it has been more or less specialized to handle just the data that are hard-coded; (2) prone to copy-and-paste, because when someone else wants to reuse the same logic and is afraid of modifying your code, he/she will more often than not copy your code, paste it into a new place and make some minor changes for a new set of data; (3) normally less readable.

In general, data and code are different beasts. To organize data better, you need to aggregate them. You need to collect them rather than let them scatter all over the place, and sweep them into one single place so that the rest of your program doesn’t contain them. Code is the opposite: it is better to spread it across different files and places, organize it into different packages and layers, and draw lines and interfaces between them.

I have read date/time code in which a format string like “%Y-%m-%d %H:%M:%S” is repeated and hard-coded everywhere. In general, repeating a literal value in different places is simply bad practice, and it directly violates “keeping a single logic in one place”. Code like this is easy to break: a typo, or a later change of the value that misses one place, will break the code in mysterious ways. No matter how sure you are about the string delimiter you will use, give it a name and keep it in one single place like this:

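// The delimiter literal appears exactly once and has a name.
const char MY_SO_SURE_DELIMITER = '\t';

// Visit every occurrence of the delimiter in mystring.
for (string::size_type pos = mystring.find(MY_SO_SURE_DELIMITER);
     pos != string::npos;
     pos = mystring.find(MY_SO_SURE_DELIMITER, pos + 1)) {
  ...
}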

Use #defines to give literals names that tell other people what they mean, rather than explaining them in comments. Or, even better, use “static const” data members of a class to scope them. In Java, use static final fields. In C#, use const (which is implicitly static) or static readonly. In Perl, hmm, what about this:

use constant BUFFER_SIZE => 4096;
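For C++, a class-scoped constant might look like the sketch below, applied to the date/time format string mentioned earlier. The Timestamp class and its member names are hypothetical, chosen only for illustration; std::strftime is the standard library call.

```cpp
#include <cstddef>
#include <ctime>
#include <string>

// Hypothetical class; the names are illustrative, not from any real codebase.
class Timestamp {
public:
  // The format literal lives in exactly one place and has a name.
  static const char* const kFormat;
  static const std::size_t kBufferSize = 32;

  // Formats a broken-down time using the shared format string.
  static std::string Format(const std::tm& when) {
    char buffer[kBufferSize];
    const std::size_t written =
        std::strftime(buffer, kBufferSize, kFormat, &when);
    return std::string(buffer, written);
  }
};

const char* const Timestamp::kFormat = "%Y-%m-%d %H:%M:%S";
```

If the format ever has to change, the edit happens on one line, and every caller of Timestamp::Format() picks it up.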

Of course, when a literal doesn’t have to be a constant value, you can always extract it by writing a function and passing it in as a parameter. In this case, the “function” is used purely for separating data from code, and this is totally okay in logic-oriented programming, even if your function is only called from one place. In other words, a “function” is not conceived of just as a code-sharing construct, but as a construct that we can use in any situation where it helps. This extends to all other programming constructs. Any construct, regardless of why it was originally devised, can be used this way as long as it helps our logic-oriented goals and style. I’ll show you more of this later.
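Here is a minimal sketch of that idea, with the delimiter passed in as a parameter instead of being baked into the loop. SplitFields is a hypothetical name assumed for this example; it is not the parser from my earlier post.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: the delimiter is data, so it comes in as a parameter.
std::vector<std::string> SplitFields(const std::string& text, char delimiter) {
  std::vector<std::string> fields;
  std::string::size_type start = 0;
  for (std::string::size_type pos = text.find(delimiter);
       pos != std::string::npos;
       pos = text.find(delimiter, start)) {
    fields.push_back(text.substr(start, pos - start));
    start = pos + 1;
  }
  fields.push_back(text.substr(start));  // the piece after the last delimiter
  return fields;
}
```

The same loop now handles tab-separated and comma-separated input alike, for example SplitFields(line, '\t') and SplitFields(line, ','), even if only one of those calls exists today.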

Here comes an interesting comparison between the functional style of programming and the object-oriented one. I’m pretty sure you have written something like this:

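class CountByOne {
public:
  virtual ~CountByOne() {}
  // Stateful step: works directly on the shared data member.
  virtual void Inc() { m_count++; }

protected:
  int m_count = 0;
};

class CountByTwo : public CountByOne {
public:
  // Relies on m_count existing in the base class with the same meaning.
  virtual void Inc() { m_count += 2; }
};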

This is typical object-oriented polymorphism. There may be a problem here, though, because the Inc() function is stateful and relies on an internal data member, “m_count”, being available. Once CountByTwo is written, CountByOne really can’t change the implementation of m_count much (for example, by removing m_count or changing its meaning), as that may break all derived classes. Note that this is a very simple example, so the problem isn’t a very big deal. But when you deal with complex classes, with complex derivations, with virtual functions working on data members from base classes and making assumptions about those data members, you are subject to this “stateful function” problem. Side effects of functions are simply harder to deal with, esp. when you ask different classes (even base and derived) to co-manage state that may require some integrity. So what can be an alternative here? You may feel this is silly, but think harder about why this is sometimes better:

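class CountByOne {
public:
  virtual ~CountByOne() {}
  // Stateless step: the value to change is passed in by the caller.
  virtual void Inc(int &count) { count++; }

private:
  int m_count = 0;  // visible to CountByOne only
};

class CountByTwo : public CountByOne {
public:
  virtual void Inc(int &count) { count += 2; }
};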

Here m_count is completely managed by CountByOne, which separates this piece of data from the rest of the world. This is probably the hardest kind of data-code separation to understand. However, it doesn’t matter what we are separating: we are keeping the m_count code in one single place, and we are free to remove the data member or change its meaning if we want to.
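To make it concrete how the base class can stay the only owner of the counter, here is one possible completion of the example. Increment() and Count() are names assumed for this sketch; they do not appear in the snippet above.

```cpp
// Hypothetical completion of the example; Increment() and Count()
// are names assumed for this sketch.
class CountByOne {
public:
  virtual ~CountByOne() {}

  // Only the base class ever touches m_count.
  void Increment() { Inc(m_count); }
  int Count() const { return m_count; }

protected:
  // Derived classes override the stateless step, not the storage.
  virtual void Inc(int& count) const { count++; }

private:
  int m_count = 0;
};

class CountByTwo : public CountByOne {
protected:
  void Inc(int& count) const override { count += 2; }
};
```

With this shape, CountByOne could rename m_count or store the count some other way, and CountByTwo would not need to change.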

Don’t take the above example as “never use protected data members” :-) I was just giving you one extra option for when you do have problems, or a requirement that one data member may change in the future. Logic-oriented programming only recommends doing things in certain ways, and there is no way that every piece of code can be completely separated from every other when handling different logics. But it is certain that code becomes cleaner when you do separate them.

Losing Trick Count (LTC)

February 25th, 2006

board 4

If you know how Losing Trick Count works, my 5-point 4S bid starts to make sense. My partner’s 1C normally has 6-8 losing tricks, with 7 the most likely. He actually had 6 on this hand. I had 7, so I knew 4S had a great chance here. If I wanted to try 4S, I should give the opponents no chance to communicate in any way, so I jumped directly to 4S. A double was entirely expected.

South led club 3. If I had played club 10, 4S might have been easier, but there was a chance South had club A, and stealing a club trick would also make 4S a lot easier. So I played club K. North won with the A. It turned out that the only way to defeat 4S was to return a diamond!

- Returning a spade? I will win with dummy’s spade A, use club Q to throw away the diamond 6 from my hand, and ruff one club to set up club 10. Then I’ll play a small heart to dummy. Because HJ is well placed, I will only lose one trick to HK and one trick to spade Q, plus the 1st trick to club A. I shouldn’t have to worry about South ruffing club 10, as South may have a singleton sK or sKQ. Either case is okay.

- Returning a heart? It will only help me to an overtrick.

- Returning club? Then I don’t even have to play club and heart before spade. I can just clear trumps and club is good to go.

Unfortunately, it was not obvious to North that a diamond was a good return, esp. since he expected South’s club 3 to be a singleton. So North returned a small club instead. This made my life easier: I won with club 10, quickly threw away the diamond 6 and a small heart on club 10 and Q, and started to draw trumps. When North won with spade Q, he returned HK. This wasn’t a mistake, as the only chance to defeat the contract was to find South with HA. Again, unfortunately for him, I won with HA, drew the last trump, and finessed HJ with H10. 4Sx was made with an overtrick. Sweet.

C vs. C++ — is object-oriented programming going to make code clean?

January 18th, 2006

When we all start to learn coding, the most intuitive and straightforward way to write a piece of code is the imperative style. You finish one job after another, and finally get an answer. It’s called “imperative” because a piece of code can be conceived of as a sequence of commands that you give to your computer. As you write more and more code, inevitably there will be repetitive commands that you want to execute, so you start to write loops and functions. Defining your own functions feels natural, because almost every language has built-in system functions that you can call, and they normally have exactly the same form as those you write. So we start to have functions here and there. This is pretty much what you can do with C programs: write a “main” program and call different functions. Functions can in turn call other functions.

Everyone loves C. But when the era of object-oriented programming came, everyone loved C++ more. In addition to functions, code can be further organized with classes. Functions can be overridden or extended. Wow. Do I have to say more about how fundamentally OOP has changed people’s perception of coding and programs? This is where I got my impression that C++ is strictly better than C in terms of how much it helps people write clean code. Is this really always true, though?

No, it isn’t. I’ve seen great C code (or functional-style code in C++) written by veteran, disciplined C programmers, well-organized C libraries, and large-scale C packages used by millions of people. I’ve also seen lousy C++ code with bad classes and namespaces, badly defined static functions (or, amusingly enough, a list of static functions simply grouped together and called a class), dauntingly big base classes, messy hierarchies and multiple inheritance that is easy to get lost in.

So what is going on here? Why isn’t OOP enough to make people automatically write well-organized code? Why does C sometimes seem just as good as C++? The truth (or my understanding) is that both functional-style programming and object-oriented programming are logic-oriented, with slightly different emphases. To write clean code, you need to understand not only how to write in either style but also why they make code clean. Once you know what is really going on behind the scenes, you don’t have to stick to either of them, and your code will look just as clean in any style or any language.

Functional programming localizes a piece of logic into a function so that it’s well isolated from the rest of the world. Any change to this logic only needs to be made within a small block of code. The same logic isn’t littered in other places, which gives people a feeling of “clean”. A function declaration lists all the input parameters that the function ever needs and nothing else, so people feel they are seeing all of its requirements, which in turn gives a feeling of “certainty”. One function can be called from different places, helping different parts of the system and giving people control over the entire system by manipulating very few lines of code, which gives a feeling of “power”. Greater control and greater visibility yield better-looking code.

Object-oriented programming emphasizes grouping “related” functions by the types of data that they all work on. It localizes logic into classes and isolates them from other classes or functions. A class declaration tells the outside world what operations are available on this type of data and what their minimum requirements are. Compared to functional programming, it’s a data-centric approach, and it emphasizes the relationships between different types of data. These relationships are expressed as shared functions (base classes) and differences (virtual function overrides).

Watch the word “localization”. Localizing code is very important in both functional programming and object-oriented programming. “Localizing code” means moving code around so that lines of code that are tightly bound to each other are as close to each other as possible, where “tightly bound” means that changing one line of code will normally require changing the other lines bound to it. The reason some lines of code are tightly bound to each other is that they convey the same logic. The goal of logic-oriented programming is to have the code for one logic in one single place, and to have each single place contain the code for only one logic.

If someone asks you to write a network protocol that has marshaling and unmarshaling as two counterparts, or a data transformation program that has imports and exports, or object serialization that serializes and deserializes, then whether you choose C or C++, as long as you organize your code so that the two functions are written together in one file, preferably right next to each other, it’s a great example of logic-oriented programming. This is something that’s never dictated by functional programming or object-oriented programming, but it is required by logic-oriented programming. In logic-oriented programming, all functions and classes are treated as programming constructs that serve as “tools” to achieve one single goal, i.e., to organize code so that one single place deals with only one logic and one logic lives in only one single place. Logic-oriented programming emphasizes organizing code in a logical way and uses criteria or guidelines to help people move code around mechanically. When code is in its right place, code becomes clean and the beauty of programming starts to show through…
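As a small illustration of keeping the two counterparts together, here is a sketch of a marshal/unmarshal pair. The wire format (a single 32-bit value, big-endian) and the function names are assumptions made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical wire format for this sketch: one 32-bit value, big-endian.
// Both directions sit side by side, so a format change touches one place.
void MarshalU32(std::uint32_t value, std::vector<std::uint8_t>& out) {
  for (int shift = 24; shift >= 0; shift -= 8) {
    out.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
  }
}

// Reads the value written by MarshalU32, starting at `offset`.
// Returns false if fewer than four bytes are available.
bool UnmarshalU32(const std::vector<std::uint8_t>& in, std::size_t offset,
                  std::uint32_t& value) {
  if (offset > in.size() || in.size() - offset < 4) return false;
  value = 0;
  for (std::size_t i = 0; i < 4; ++i) {
    value = (value << 8) | in[offset + i];
  }
  return true;
}
```

If the byte order or width ever changes, both functions are on the same screen, which is the whole point of putting them next to each other.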

(I’m comparing C vs. C++, and I also mentioned functional programming vs. object-oriented programming. Technically these are different comparisons. In addition, functional programming also requires that functions have no “state” or “side effects”, which is consistent with one of my logic-oriented programming “theories”: that logic can be purified or generalized by separating data from code. I’ll come back to this later when I talk about why data make code dirty.)

Another 4H

January 17th, 2006

It is not easy to tell the best approach to every 4H without seeing all the hands.

board 11

South’s D9 was clearly a singleton to me (North) when East played D3, because I could see all the small diamonds and there is no way my partner would lead from KJ9. A diamond return got us a ruff, and South switched to club 9, really putting East under a lot of pressure. East cannot afford a 2nd ruff if North has club K. So East took it with the A, losing a great chance of setting up a 2nd club trick. East drew two rounds of trumps, then played DJ, SA and DQ, throwing a small spade from the East hand. East then played club 2 from dummy, and North played club 5. Decisions, decisions, decisions. Unfortunately East put in club Q, losing both club tricks, down 1 on the 4H deal.

If East had known D9 was a singleton, East would have thrown DJ under DA. That would have given me a big headache reading the diamonds. If East has DKJ, a diamond return would be a disaster for our defense, because spade A would be a stopper that lets East dump a small spade on DQ before we cash a spade trick. Most likely I would return a spade. East can win the trick with the A and draw three rounds of trumps. I (North) have to throw away either a spade or a diamond, as keeping only two clubs in hand would make 4H too easy (just cash club A and play a small club, and my J will be exposed). East can play a small spade, and whether North or South takes it, no club can be returned. East eventually ruffs a spade or plays diamonds until everyone has 4 cards in hand:

                       s -
                       h -
                       d T
                       c J54
s -                                             s -
h -                                             h T
d 7                                             d -
c A62                                           c QT8
                       s Q
                       h -
                       d -
                       c K97

East plays d7 from dummy and ruffs in hand; what does South do? Again, throwing away a club is disastrous, as East can then play cT from hand, ducking in dummy. Hoping South has cJx or cKx, East can play small and use AQ to win the last two tricks.

So South has to throw away the last spade, sQ. East can play cQ from hand, expecting cKJ to be split, with either cK in South, or both cK and c9 in North (if North wins with the K and plays a small club back, put in club 8 as the very last chance).

As you can see, defense becomes much harder for both north and south and east can make 4H with higher than 50% probability. But who can see that far at the 1st trick? I won’t :-)

College bridge team reunion after 15 years

January 8th, 2006

I cannot believe we can still “sit” down and play bridge together 15 years after college. Although we cannot see each other’s faces, everything is so familiar and nothing has really changed: the bidding, the card play, the aggressiveness and the mistakes :-)

Yeah, mistakes. I can’t believe I didn’t pay attention that my club 5 was good, down 1 on that 3NT. But I’m glad I made this 4H:

board 22

South’s S2 lead certainly helped me (East) a bit in spades, but not enough for the whole hand. After I won the 1st trick with S7, I quickly played two rounds of spades, throwing away two clubs from dummy. I was hoping to pull out spade Q, and then D K would not be a problem at all. But this didn’t happen. Oh, boy, that long suit of diamonds in dummy looked good to me, but with only one entry in hearts, I couldn’t really count on it. So I played the last spade and threw club 8 from dummy. South’s diamond 5 worried me a bit. Given his double of 1H, South should have diamond K. My partner was even clearer on this after the game: if he didn’t lead club K, chances are good he didn’t have club AK, so all the other points were in his hand. Anyway, I played diamond 2 for safety. North won with diamond J, and played H 10 back. South didn’t take it, and played the 7 instead. Diamond ace from dummy pulled the diamond K from South. H3-5-A-9. C6-7-HQ-4 ruff. Diamond Q - 9 - club Q.

If we knew D K was on the right side, the simplest line would be to ruff one spade at the very beginning and start drawing trumps. But that’s a 50% play, an intermediate level of play. I think my play is about 60%-70%, and I will make it as long as one of D K or J is in South’s hand (and Kx).