kodiak
The random moments of a programmer and his web wanderings
Dvorak in the house
July 2, 2007 on 11:00 pm | In programming | 3 Comments
Well, I guess it was inevitable, but I have started developing RSI. Eight years of coding and writing have added up and the sum total is pain. I began having trouble about a month after I started at Google- but that is just coincidence. In order to counter the effects, I have had to make a couple of changes- the most notable of which was switching to the Dvorak keyboard layout.
I have to say that the first three weeks were very difficult. I basically had to re-learn to type. Not easy. My productivity was near zero, which was so frustrating!!! Having to think about typing is laborious at best, and it almost reduced me to tears more than once. Consider going from 40-60 words per minute down to under 10. The simplest actions and commands took forever, never mind writing an email or chatting with friends on AIM.
I think the hardest part has been unlearning the years of muscle memory for Emacs shortcuts and UNIX commands. I still type “je” sometimes when I really want “cd”, but at least the new layout of Emacs shortcuts has pretty much stopped my single-hand chording, which was probably the source of many of the issues.
After four months on the Dvorak wagon, I am so glad I made the switch. Most of the issues have gone away entirely and I am almost back up to my previous typing speed. I am not out of the woods yet, but things are on the upswing and definitely much better than they were three months ago. Fingers crossed!
See you in Banff!
March 3, 2007 on 10:15 am | In travel, programming | No Comments
Unfortunately I don’t have something to present this year, but I am heading to WWW2007 anyway. The plenary program looks strong this year and as always- there are papers from some of the best people in web research, systems, data mining and machine learning. I am only heading up for Wednesday, Thursday and Friday due to time constraints on either side of the conference, but I am excited to see the material for those days.
Register now, because the deadline for early registration expires on Monday. Not that it is cheap in the first place, but if you want to go- there is no point in paying more.
For those of you debating whether or not to submit, the poster deadline is the 12th and the developer track deadline is the 16th. I have to admit that I am considering putting something quick together on web services with Python and JSON for the developer track, but maybe that is better left to the experts :-P
See you there!
The Constant Coder
February 12, 2007 on 9:49 pm | In programming | 1 Comment
It has been too long since I had an entry about programming. I mean a real entry about programming, not just some random quip about Java or the wonder of Python. Today let’s talk about C++ a little.
C++ was designed a few decades ago and for all its potential power- it is a double edged sword. I am not the first person to say this, but it is so undeniably true that I don’t feel too bad about repeating it. C gave you plenty of rope to hang yourself and C++ sort of continues that thread, but goes a little further- providing rope in a variety of lengths, colors and even ropes that have a noose already tied in them.
That being said, software can be written in C++- good software, and it doesn’t have to be insanely painful. Simple guidelines can really help this. Basic things like, prefer references to pointers. Why prefer using a reference?
1) Using references tends to encourage allocating memory on the stack instead of the heap, and it is hard to leak something you never allocated, and 2) a reference cannot be re-assigned. Yup, no more memory leaks from careless pointer manipulation.
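Here is a minimal sketch of both points, just to make them concrete. The names append_suffix and reference_demo are made up for illustration. Nothing here calls new or delete, and the "re-assignment" of the reference turns out to be a plain copy into the original object.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: takes its argument by reference, so the caller's
// object is modified in place and our code never calls new or delete.
void append_suffix(std::string& text) {
    text += "_done";
}

// Hypothetical demo of both points; returns the final value of `first`.
std::string reference_demo() {
    std::string first = "job";    // automatic storage, cleaned up on return
    std::string second = "other";

    append_suffix(first);         // first is now "job_done"

    std::string& ref = first;     // ref is bound to first for its whole life
    ref = second;                 // copies second INTO first; does not rebind
    assert(&ref == &first);       // still refers to the very same object
    return first;
}
```

After reference_demo() runs, first holds a copy of second's contents, and ref never pointed anywhere else.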
What is a reference anyway? A reference literally is the variable it refers to, also known as the “referent”. That is why you cannot re-assign it- the operation doesn’t have a meaning. Consider a simple reference and its pointer equivalent:
string& my_str = func_returns_a_ref();
is the same as:
string* const my_str = func_returns_a_ptr();
Don’t confuse that with const string* my_str; they are very different. The former is a constant pointer, whereas the latter is a pointer to a constant. If you poke around with references you will find that you can use them pretty much anywhere you would have used a pointer. Some things do look a little unnatural, though, if you try to force the use of references. For example:
string* my_string_function() {
  // do something and return a string pointer
}
...
string& mystr = *(my_string_function());
That looks weird to me too. So maybe forcing the reference thing isn’t such a good idea; maybe we should just use them when they make sense. Since you are already limited in the operations you can use on a reference, it is pretty common to take the next logical step and use references for const variables, instead of pointers. This is quite useful for accessor methods that return handles to member variables. Instead of saying:
class MyClass {
 public:
  // Returns a pointer to the member; the caller cannot modify the string.
  const string* my_var() const { return &_var; }
 private:
  string _var;
};
use this form:
class MyClass {
 public:
  // Returns a reference to the member; no address-of needed.
  const string& my_var() const { return _var; }
 private:
  string _var;
};
Great! The code is cleaner (we lost that ampersand on _var) and we have really locked down the handle we returned to _var. Not only is the variable const, but the handle to the variable is immutable as well. Sweet! We protected our data, and stopped the caller from potentially doing something stupid as well! Double word score!
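If you want to try the accessor out, here is a small self-contained sketch. The constructor and the accessor_demo function are my additions for illustration (the class above never says how _var gets its value), so treat them as assumptions rather than part of the original design.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

class MyClass {
 public:
  // Assumed constructor, added only so the example has a value to return.
  explicit MyClass(std::string value) : _var(std::move(value)) {}

  // Read-only handle to the member: no copy is made, and the caller
  // cannot modify _var through it.
  const std::string& my_var() const { return _var; }

 private:
  std::string _var;
};

// Compile-time check that the accessor really returns a reference to const.
static_assert(
    std::is_same<decltype(std::declval<const MyClass&>().my_var()),
                 const std::string&>::value,
    "my_var() should return const std::string&");

// Hypothetical demo: reads through the handle and returns a copy of it.
std::string accessor_demo() {
  MyClass obj("hello");
  const std::string& handle = obj.my_var();
  assert(&handle == &obj.my_var());  // both calls name the same member
  // handle += "!";  // would not compile: handle refers to a const string
  return handle;
}
```

The commented-out line is the point: the compiler rejects any attempt to write through the handle, so the mistake never makes it into a build.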
For more info on this kind of stuff, check out the C++ FAQ Lite. It has more examples and most importantly- good explanations of C++ constructs. Enjoy!
Binary Search is Broken
June 6, 2006 on 10:54 pm | In programming | No Comments
Here is a great post from the Google Research Blog. I love it because it points out a couple of great pearls of programming wisdom. Please read for yourself, and then go fix your code!
Every programmer makes mistakes, or fails to account for all possible interactions of code. One of my personal favorites was a hashing library I came across that was in heavy use within a past project. The hash functions returned signed integers, but they were being used by functions expecting unsigned values. This caused a couple of pretty interesting bugs in the system. The fix is trivial- just take the absolute value of the hash, but it serves as an example of an unexpected interaction.
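One caveat on the absolute value trick, at least in C++: the most negative integer has no positive counterpart, so std::abs overflows for that single input. Below is a minimal sketch of an alternative, assuming the hash is a 32-bit signed integer and the consumer wants a bucket index. The name bucket_for is made up, and this is not the code from the library in question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: map a signed 32-bit hash to a bucket index.
std::size_t bucket_for(std::int32_t signed_hash, std::size_t bucket_count) {
  assert(bucket_count > 0);
  // Converting to an unsigned type is well defined for every input
  // (the value wraps modulo 2^32), including the most negative one.
  const std::uint32_t unsigned_hash = static_cast<std::uint32_t>(signed_hash);
  return unsigned_hash % bucket_count;
}
```

Converting first and taking the remainder second means the index is never negative, whatever the hash function hands back.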
ClearCase as a sign of failure
April 29, 2006 on 3:37 pm | In programming | No Comments
BitWorking has a good post asking for feedback on whether or not the following hypothesis is true:
ClearCase as a leading indicator of small technology company failure
I would say there are many other indicators of small technology company doom. Personally, I think the best indicator is still having a really nice office with really bad (or no) coffee. It just shows that the company’s priorities are in the wrong place.
While you are there be sure to check out the link to a slightly more sadistic GoogleFight variant: KittenFight, and then perhaps conduct a few of your own:
– Yahoo over Google
– Microsoft over IBM
– Skiing over Snowboarding
The Joy of Systems
April 8, 2006 on 12:50 pm | In programming | 2 Comments
There is something pleasing about systems integration- at least when things go right. I got the first end-to-end prototype of our latest project working today, and I have to say I love watching a SATA drive scream for mercy while you stream data off it as fast as the platters can spin. Of course it goes without saying that seeing all the CPUs of a couple dozen blade centers start to smoke because of the load you just shoved on them is also a beautiful sight. Unfortunately, I am in New York and the blades are in California, so I have to imagine that the temperature of the machine room is rising ever so slowly and that the majority of the racks resemble a Christmas Tree, but I can picture it. I am also pleasantly awaiting the deluge of email from the network administrators at Almaden and Hawthorne wondering why there is so much traffic heading to Almaden and so little coming back.
Distributed Systems are non-trivial, and getting everything working together is a chore sometimes. WF really sticks to the idea of design by contract with its service interfaces, and that is one of the reasons that we can throw stuff together quickly. Also having a ton of hardware doesn’t hurt, but we can always use more. I have to admit that I am annoyed at having to ask for permission to use the larger hardware systems from time to time, but the alternative is that I have to run my *own* clusters, which is not fun either. I have always said that when I had the headcount the first person I would hire would be a personal assistant, but now I am thinking it might be a personal sysadmin. Do they have those?
Job Description: Personal Systems Administrator
Administer and maintain a small-ish cluster (100+ nodes) of Linux and FreeBSD machines for a young researcher who doesn’t play nice with others and has no common sense about “unsolvable” problems. Cluster is fault tolerant, but please note it is expected to have 90% of resources available at any time; the other 10% is yours to play with. Experiments are run at odd hours of the day and on weekends. Must be flexible with regard to hours worked and be willing to write software that makes your life easier rather than repeat menial tasks. Expert Quake skills, knowledge of Python and use of Emacs are a plus. Vi is acceptable, but expect abuse.
The Eight Fallacies of Distributed Computing
January 24, 2006 on 1:03 am | In programming | No Comments
Here is the source, but I will copy and paste for your benefit:
The Eight Fallacies of Distributed Computing
Peter Deutsch
Essentially everyone, when they first build a distributed application, makes the following eight assumptions. All prove to be false in the long run and all cause big trouble and painful learning experiences.
1. The network is reliable
2. Latency is zero
3. Bandwidth is infinite
4. The network is secure
5. Topology doesn’t change
6. There is one administrator
7. Transport cost is zero
8. The network is homogeneous


