Sunday, September 7, 2008

It's New & Shiny


This week I did an initial analysis of the security architecture of Google's new Chrome browser from an operating system standpoint, i.e., how Chrome leverages Windows to apply the principle of least privilege. The idea for the analysis came to me when I saw page 26 of Google's online presentation of their new Chrome browser.

It also turns out the software architecture of Google's new browser uses an approach that is much more common when writing back end network services: a technique known as privilege separation, which page 26 also alluded to.

Privilege separation is simply a technique for applying the principle of least privilege.

The crux of privilege separation is for a process (an .EXE) to create copies of itself, stripping administrative rights as it does so. Those copies then handle inbound requests. In the case of a web server, the initial instance of the executable (the master copy) creates the copies, and it is those copies that do the actual work of handling inbound HTTP requests. The master copy (the parent of all the children) is simply an orchestrator and never directly interfaces with the outside world where HTTP requests originate. Other network services that follow this model include the venerable OpenSSH server.
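
To make the shape of this concrete, here is a minimal C sketch of the pattern as a toy *NIX network service might implement it. This is illustrative only, not any real server's code; the port and the unprivileged user/group ID (65534, conventionally "nobody") are assumptions, and real servers like OpenSSH do considerably more (chroot jails, a monitor protocol between parent and children, etc.):

#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
    /* Master copy: binds the privileged port, then only orchestrates. */
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { 0 };

    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(80);

    signal(SIGCHLD, SIG_IGN);   /* don't accumulate zombie children */

    if (bind(listener, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(listener, 16) < 0) {
        perror("bind/listen");
        return 1;
    }

    for (;;) {
        int client = accept(listener, NULL, NULL);
        if (client < 0)
            continue;

        if (fork() == 0) {
            /* Child copy: shed root *before* touching untrusted input. */
            close(listener);
            if (setgid(65534) != 0 || setuid(65534) != 0)
                _exit(1);   /* refuse to run privileged */

            char buf[1024];
            read(client, buf, sizeof(buf));   /* hostile bytes land here */
            write(client, "HTTP/1.0 200 OK\r\n\r\n", 19);
            _exit(0);
        }
        close(client);   /* the master never parses the request itself */
    }
}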

The advantage of privilege separation is that if malicious input is fed into a copy handling a request, compromising the system becomes much harder, since the copies have had their administrative privileges stripped. The goal of malicious input is to act through whatever is handling the request (a copy of the .EXE), in many cases to modify operating system files and directories to nefarious ends. If a copy can be made to serve that end, you have a system breach and a serious flaw in the software. Over the years, many buffer overflow exploits in sundry network services have been significantly more severe simply because privilege separation was not employed. The difference can easily be a system that is compromised and falls under someone else's remote control vs. a much more likely denial of service scenario. Given the choice, I'll take the latter.

Not to say that privilege separation is a panacea. It is not. The technique simply makes a system much harder to breach. And being a technique, nothing says it can't be used on the client side of the equation (a web browser). So while all other web browsers have a single process (.EXE) that manages all browser windows and the tabs within them, Google's Chrome browser creates one copy of itself for each tab where HTML will be rendered and JavaScript executed. As those copies are made, their administrative rights are stripped making it much harder for malicious web content to compromise the computer system where Chrome is executing. Thus Google's Chrome browser applies the technique of privilege separation and immediately affords a level of insulation and resiliency to malware that heretofore hadn't existed in any other browser under Windows XP with its default security settings.

The principle of least privilege in action. Hallelujah.

This is not as big a deal for Windows Vista users, but it is a very big deal for Windows XP users. That's because Windows Vista by default does not let users silently exercise administrative privileges: when you want to do something like install a new application, User Account Control prompts you for confirmation or an administrative password.

Windows XP, on the other hand, gives users free rein (administrative rights) out of the box, and this is precisely *why* malware has been such a large problem there (much to the delight of anti-virus vendors): people's browsers act as proxies for the installation of malware as they promiscuously connect to foreign computer systems on the Internet with administrative privileges. Unfortunately, very few people understand the ramifications of operating in this manner, and that extends to most people who make their living with technology.

Google's Chrome browser takes advantage of the Win32 security APIs and strips administrative rights as it launches each copy of itself to handle each tab. The magic that allows this under Windows XP is the CreateRestrictedToken call. The net result is that Chrome under Windows XP makes it much harder for your computer system to be compromised, irrespective of stopgap measures like anti-virus software.
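
For the curious, a bare-bones sketch of the Windows flavor of this trick might look like the following. To be clear, this is my illustration, not Chrome's actual sandbox code (which also restricts SIDs and uses job objects and alternate desktops, among other things); error handling is omitted for brevity, and a real program would mark the child somehow so it knows it is the worker and doesn't respawn in turn:

#include <windows.h>

/* Link with advapi32.lib. Error checks omitted for brevity. */
int main(void)
{
    HANDLE procToken, restricted;
    WCHAR self[MAX_PATH];
    STARTUPINFOW si = { sizeof(si) };
    PROCESS_INFORMATION pi;

    GetModuleFileNameW(NULL, self, MAX_PATH);

    OpenProcessToken(GetCurrentProcess(), TOKEN_ALL_ACCESS, &procToken);

    /* DISABLE_MAX_PRIVILEGE strips every privilege from the new
       token except SeChangeNotifyPrivilege. */
    CreateRestrictedToken(procToken, DISABLE_MAX_PRIVILEGE,
                          0, NULL,    /* no SIDs to disable */
                          0, NULL,    /* no privileges named explicitly */
                          0, NULL,    /* no restricting SIDs */
                          &restricted);

    /* Launch a copy of ourselves running under the weakened token. */
    CreateProcessAsUserW(restricted, self, NULL, NULL, NULL, FALSE,
                         0, NULL, NULL, &si, &pi);

    CloseHandle(pi.hProcess);
    CloseHandle(pi.hThread);
    CloseHandle(restricted);
    CloseHandle(procToken);
    return 0;
}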

The level of mitigation is big. How big? Well, a reason Mac OS X has so little malware written for it is that Macintosh users do not run with administrative privileges, making the economics of writing malware for Mac OS X a completely different proposition. Malware authors go for the low hanging fruit, and Windows XP users are still plentiful. Naysayers often claim, "Mac OS X doesn't have the same market share," but that is simply not true. In August of 2007 Apple's market share of laptops hit 17.6% (20% is likely not far off). Apple is now the third biggest seller of laptops behind HP and Dell.

By applying the principle of least privilege, Google's Chrome browser severely mitigates the chances you will contract something like a keyboard logger while innocently visiting your favorite site. Therein lies the danger: increasingly sophisticated and organized criminal groups are breaching the web servers of legitimate businesses to serve up malicious content to unsuspecting users. In many of these cases the malware is a keyboard logger; once installed, absolutely everything you type is transmitted elsewhere in the hope you'll eventually use an online financial account, at which time you unwittingly divulge your username and password despite connecting to a legitimate financial institution (no, this is not phishing). Eventually Electronic Fund Transfers ensue. It's akin to poisoning the water well: as people visit their favorite watering hole (a legitimate web site) they are served far more than water. As a play on phishing, the term pharming was coined to describe these scenarios.

In closing, kudos to Google for leveraging a programmatic mechanism (CreateRestrictedToken) that has been in place for many years. To this day I still do not understand why Microsoft never exposed the power of CreateRestrictedToken through a mass market end user tool, or by simply embellishing the way desktop shortcuts are created, giving end users the ability to strip administrative rights via a simple check box. After all, CreateRestrictedToken has been in every copy of Windows since Windows 2000 was released. As I said, many years.

Why 9 > 10


So I've been resistant to upgrading to Microsoft's Windows Vista for a number of reasons, but probably the biggest is consistently observing Windows XP yield higher Frames Per Second (FPS) with many DirectX titles. Windows Vista introduced DirectX 10 to the world, while Windows XP users have had to be content with the older DirectX 9. I had read that Microsoft changed the video driver model for Vista to mitigate the infamous Blue Screen Of Death (BSOD) that has been the butt of jokes for many years, but I did not investigate further since my plan is to stay on Windows XP for quite some more time.

Here is a shining example of Microsoft being cannon fodder on account of BSOD issues: a very old Sun Microsystems commercial from yesteryear (before Windows 2000) that pays homage to the late Jacques-Yves Cousteau (the narration). It's quite good!

http://www.youtube.com/watch?v=eNqPTOb31S8

It's also testimony to a long standing problem on Microsoft's operating systems based on the Windows NT kernel introduced in the early 90's. In Microsoft's defense, it can't control how video drivers are written by third parties, but nevertheless, less than perfect video drivers have been one of the leading causes of the infamous BSOD.

So curiosity finally got the best of me as I was talking about DirectX performance with a friend, and I decided to investigate why Vista seemingly always loses to XP in the sundry benchmarks I've seen in online articles and the printed word. I started with the following Wikipedia article about the Windows Display Driver Model introduced in Windows Vista:

http://en.wikipedia.org/wiki/Windows_Display_Driver_Model

When the Wikipedia article mentioned that part of a Vista display driver lives in userland the light bulb went on.

User space (aka userland) is where applications live, e.g., your web browser, your email client, your IM client, etc., etc. Kernel space is where the code that makes up the OS lives. Both of these terms are commonly used in systems programming.

So the question arises: when an application asks the OS (operating system) to do something on its behalf and execution transitions from your application to the OS, how does the operating system protect the stability of the system? After all, you as an application programmer might have just passed some errant arguments that could potentially bring down the system… or, on the more sinister side, malicious code could have slipped in through your browser and could be making system calls under the hood.

The answer is… with hardware!

The Intel architecture has a notion of “rings” when it comes to executing code (other processor architectures, I’m sure, have similar semantics; the nomenclature may vary). More on this in a second.

It turns out that when executing code, more bits are at play than the usual 32 bits you sometimes hear about, i.e., “I’m running 32 bit code” versus “I’m running 64 bit code.” On the x86 architecture, memory access is implicitly qualified by additional registers. These registers are called segment selectors and they happen to be 16 bits wide. The top 13 bits of a selector are an index into what’s called a Descriptor Table. 2 to the 32nd power is roughly 4 billion, so it would seem a 32 bit Intel processor is incapable of seeing more. That is not an accurate picture of what is really going on. You see, 32 bits refers to the context of a single application, the limits of its addressability, not that of the processor.

This means that by varying those 13 index bits in an x86 selector register you jump from having only 4 billion memory addresses to:

2^13 (8192) * 2^32 (4 billion)

Which calculates to:

35,184,372,088,832

That’s 2^45: roughly 35 trillion addresses, or 32 terabytes of addressable memory. But wait, there’s more. The next bit (just one) says whether to go to the Local Descriptor Table or the Global Descriptor Table, so you can double that figure again, to 2^46 (64 terabytes). So you see, the “32 bits” people speak of does not really reflect the amount of memory x86 processors are capable of managing.

Back to those rings. The last 2 bits of a selector register have to do with rings.

Applications (your browser) run in ring 3, a.k.a. user space. When an application asks the OS to do something, execution transitions through a controlled gate from the lower privileged ring to the higher privileged ring, e.g., from ring 3 (user space) to ring 0 (kernel space), and back again once the OS is done. Code running in a lower privileged ring cannot simply manipulate those selector bits to transition itself into a higher privileged ring. That’s by design, since the OS cannot trust arguments passed to it. From a security standpoint this is sound: you maintain the integrity of the system, continue to provide basic services to other applications, and you make malice hard at this particular layer. And there are so many layers… this is why the term computer security is so overloaded, has become hackneyed and means different things to different people.
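
You can actually see these bits from an ordinary program. The sketch below assumes GCC-style inline assembly on x86; run as a normal application it will report ring 3:

#include <stdio.h>

int main(void)
{
    unsigned short cs;

    /* Grab the CS segment selector the processor is currently using. */
    __asm__("mov %%cs, %0" : "=r"(cs));

    printf("selector:         0x%04x\n", cs);
    printf("descriptor index: %u\n", cs >> 3);        /* top 13 bits */
    printf("table:            %s\n", (cs & 4) ? "LDT" : "GDT");
    printf("ring:             %u\n", cs & 3);         /* last 2 bits */
    return 0;
}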

But it turns out these ring transitions have an impact on performance! Any time you have a checked boundary, whether metaphorical or literal (customs at the US border), things are slower than if you just let everything through.
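
A quick back-of-the-envelope experiment makes the cost visible. The sketch below assumes a Linux box (so a genuine system call can be forced via syscall(); some C libraries cache getpid()): it times a million do-nothing trips into the kernel against a million do-nothing local calls, and the kernel trips lose by a wide margin.

#define _GNU_SOURCE
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Force a real kernel round trip rather than a cached library answer. */
static long enter_kernel(void)     { return syscall(SYS_getpid); }
static long stay_in_userland(void) { return 42; }

static double time_it(long (*fn)(void), int n)
{
    struct timespec a, b;
    clock_gettime(CLOCK_MONOTONIC, &a);
    for (int i = 0; i < n; ++i)
        fn();
    clock_gettime(CLOCK_MONOTONIC, &b);
    return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
}

int main(void)
{
    const int n = 1000000;
    printf("1M system calls: %.3f s\n", time_it(enter_kernel, n));
    printf("1M local calls:  %.3f s\n", time_it(stay_in_userland, n));
    return 0;
}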

And with Vista’s display drivers partially living in ring 3 (user space), there are likely more ring transitions than with the driver model found under Windows XP and its predecessors, hence the 10% to 15% (or worse) drops in performance under Vista.

So while Windows Vista is more stable and mitigates the blue screen of death problems of its predecessors, you get less bang for your buck as far as that screaming new video card goes (at least versus the guy who has Windows XP).

Monday, May 12, 2008

Practical *Extraction* and Report Language

Since its debut in the 80's, PERL has entrenched itself in many computer systems as its power has found many uses. The advent of web applications in the 90's saw PERL spread beyond its roots as the nascent web development community used it to write CGI scripts.

Nevertheless PERL's forte is clearly communicated in its acronym. Perhaps you're after the extraction of information from a large data set based on some arbitrary criteria. Perhaps the only thing you need is to generate a CSV file that will be fed to Microsoft Excel to generate a chart, a form of a report.

The former was the case when a coworker brought me a log file with thousands of lines. He specifically wanted to filter out duplicate lines, where the criterion for "duplicate" was the email address. Here's a sample:


./app.comp.log:2008-05-01 08:35:53,288 [WorkExecutorWorkerThread-3] ERROR com.mycompany.events.bo.ejb.NewsSubMDBean - onMessage: Failed: <root><com.ftd.events.core.NewsEventVO Status="UNSUBSCRIBE" OriginAppName="BOUNCE" EmailAddress="anemail@nowhere.com" CompanyId="123" OperatorId="HARD BOUNCE" OriginTimestamp="1209600000000"/></root>

./app.comp.log:2008-05-01 08:35:53,221 [WorkExecutorWorkerThread-3] ERROR com.mycompany.events.bo.ejb.NewsSubMDBean - onMessage: Failed: <root><com.ftd.events.core.NewsEventVO Status="UNSUBSCRIBE" OriginAppName="BOUNCE" EmailAddress="anemail@nowhere.com" CompanyId="XYZ" OperatorId="HARD BOUNCE" OriginTimestamp="1209600000000"/></root>


In these two lines the email address is the same, so the second line, and any other line with that address that followed, would be disregarded.

If the goal had been to simply extract unique email addresses, the ubiquitous cut, sort and uniq text utilities on *NIX platforms could have trivially solved the problem. Except the goal was not to extract email addresses: the information around each email address needed preserving, so a cut/sort/uniq solution wouldn't do.
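
For the record, had plain address extraction been the goal, something along these lines would have sufficed (assuming, as in the sample above, that the email address is always the third double-quoted value on the line):

cut -d'"' -f6 unfiltered.txt | sort | uniq

But that throws away everything surrounding the address, which was exactly what needed preserving.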

The solution was to employ PERL to save state. I was able to do this in a one-liner:


cat unfiltered.txt |
perl -lane 'm/EmailAddress=\"(.*)\"\sCompanyId/; if ( $have{$1} ) { ++$noise; } else { $have{$1} = 1; print $_ }' > filtered.txt


-lane is one of my favorite command line switch combinations in PERL. n gives you an implicit while loop that reads stdin line by line, placing each line in the anonymous variable, $_. a auto-splits each line into the @F array (part of the habit; it isn't strictly needed here). l strips the trailing newline on input and adds an implicit newline when you print, which helps shorten the length of your one-liner when sending information back out. e simply means evaluate the expression (code) that follows.

Using a regular expression and a hash allows me to keep track of which email addresses I've already seen. By not printing a line when an email address is already in the hash, I wind up with unique email addresses with surrounding context.

Easy.

Tuesday, April 8, 2008

Ultramon-itoring the Crysis


Several years ago I upgraded one of my CRTs to an LCD. The primary benefit of an LCD over a CRT is simply its increased clarity when working with text. Spend a short time on an LCD looking over text and going back to a CRT is painful (on the eyes).

But my CRT has continued seeing use for one specific reason.

If you've ever played a game that has panoramic views, e.g., first person shooters, a CRT has one big benefit over an LCD - fuzziness. "What's that?" you ask. It turns out that the great ability of LCDs to show razor sharp content isn't so great with real time generated panoramic views - the geometry of a scene is filled with noticeable stair step edges of polygons that make up the models that appear in the view.

To combat this problem, over the years Graphics Processing Units (GPUs) from mainstream companies like nVidia and ATI have become much more powerful. With that power, GPUs have increasingly been tasked with performing anti-aliasing to smooth the edges of an image. The catch in this equation is that your GPU expends far more cycles computationally eliminating the jaggies, and this impacts the maximum frame rate you can achieve. It can turn a playable game on a CRT (with no anti-aliasing) into a sloth (unplayable) on an LCD (with anti-aliasing).

A CRT's inability to show razor sharp text lends itself quite nicely to panoramic views that fill the entire screen, as opposed to a small area where your focus is fixated on a single word. By turning off full scene anti-aliasing while using a CRT, you spare the GPU a tremendous amount of work. It so happens a CRT’s lack of sharpness for text is great for wide vistas - the natural perspective of first person shooters.

Only thing is, when I've wanted to use my old CRT for gaming, I've had the hassle of making like a monkey and going through the GUI motions - right clicking on the Windows desktop, selecting Properties, then making the CRT the main display. Then, when I'm done, going through it all over again, except this time making the LCD the main display.

Yes GUIs are great but not when you need to perform a task repeatedly. Go through this enough times, which I have, and it gets old. Very old. There have been times I don’t even bother firing up a game on account of the hassle factor (but there’s another hassle factor that I’ll get to in a moment, it has to do with the position of application windows on the Windows desktop).

But because of Crysis I’ve been going through the motions as annoying as they are. The game is simply that good. The visuals and realism are stunning. It really ups the ante for first person shooters in a big way.

Before I go further I should point out that I use a very nice little utility program called Ultramon:

http://realtimesoft.com/ultramon/overview/

One of the benefits of Ultramon is a task bar for every extra display you have. Applications sitting in other displays will fill the task bar found on that display versus the task bar of your primary display. This nice touch keeps the task bar on the primary display from getting cluttered. If you work with multiple displays, once you get used to Ultramon in this regard, it's hard living without it.

Another nice Ultramon feature is the ability to set independent wallpapers for each display. Windows XP by default uses the same image on every display when you set your desktop background.

Ultramon also happens to provide an icon in the task bar area that when navigated allows you to quickly set which display is the primary as well as disabling the secondary display. So firing up a game on my CRT, I would use Ultramon’s icon in the task bar to make my CRT the primary display then quickly disable the LCD. Why disable the LCD? Some games like Quake 4 consume a lot of video memory for textures. I found that unless I disabled my secondary display, I could not run it at its maximum display settings (Ultra).

While using Ultramon is faster than right clicking on the Windows desktop and going through the usual motions, even so, the GUI (monkey) way still got old. Very old. Reverting was even more work - I’ll get to that in a moment.

Then I had some inspiration: I recalled that Ultramon provided COM objects with OLE Automation support. This meant I could probably write some VBScript code to automate switching displays and setting things back the way they were. After reading Ultramon's documentation I came up with these two scripts.

gameCRT.vbs
------

Const POS_ALL = &H7

Set sys = CreateObject("UltraMon.System")
Set Sony = sys.Monitors("2")      ' the CRT
Set ViewSonic = sys.Monitors("1") ' the LCD

sys.SavePositions POS_ALL         ' remember where application windows are
Sony.Primary = True               ' make the CRT the primary display
sys.ApplyMonitorChanges
sys.SecondaryDisable              ' turn off the LCD (now the secondary)



enableLCD.vbs
-------

Const POS_ALL = &H7
Set sys = CreateObject("UltraMon.System")

sys.SecondaryEnable               ' bring the LCD back
sys.ApplyMonitorChanges
Set ViewSonic = sys.Monitors("1")

ViewSonic.Primary = True          ' make the LCD the primary again

sys.ApplyMonitorChanges
sys.SecondaryDisable              ' bounce the CRT so the Start menu
sys.SecondaryEnable               ' migrates back to the LCD (see below)
sys.RestorePositions POS_ALL      ' put application windows back


So the start of gameCRT.vbs makes my Sony CRT the primary display; I apply this new state, then I disable the secondary display, which by that point implicitly means my ViewSonic LCD. This bears some qualification. I expected to be able to set properties on my two displays and finish with a single call to ApplyMonitorChanges, but that caused a very annoying problem with a couple of the permutations I came up with. Namely, when I re-enabled the LCD later… the positions of the monitors had become inverted, so if I moused off the left edge of my LCD (the left display), the mouse pointer would appear on the monitor to the right (the CRT).

Getting around this problem by playing with Ultramon’s object model consumed more time than anything else.

The second script, enableLCD.vbs, starts off by enabling the secondary display, which is my ViewSonic on account of the first script having implicitly made it the secondary. Then I set the Primary property, making the ViewSonic LCD the primary display, and again apply my changes with an invocation of ApplyMonitorChanges.

At this point in the script you would expect to be done, but it turns out the Windows Start menu is still displayed on the CRT even though the CRT is now the secondary display. You would think the Start menu would naturally move itself to the primary display, but that wasn't happening. As it turns out, this is not a new problem - I experienced the exact same behavior when doing things interactively.

To solve this problem I disable the secondary display (the CRT now), at which point Windows’ Start menu gravitates to the LCD (where it was before I decided to fire up a game). Finally, I re-enable the secondary display (the CRT). So reverting things takes additional steps; whether you do it interactively or via my VBScript, the steps are logically the same.

That’s why the script has these two calls back to back after the ApplyMonitorChanges invocation:

sys.SecondaryDisable
sys.SecondaryEnable


It's not immediately obvious why I would disable the secondary display and then enable it immediately afterward, but now you know.

Ultramon can also keep track of application window positions as you flip-flop between displays. Notice this statement in the first script:

sys.SavePositions POS_ALL

And the following statement in the second script:

sys.RestorePositions POS_ALL

WinAmp is very, Very, VERY annoying when you change which display is the primary. The lower half of WinAmp, the playlist, detaches itself from the top half of WinAmp and the two windows get tossed to random positions on the desktop. Meaning WinAmp winds up in an altogether different location than where it was running. Then you have to spend time re-juxtaposing WinAmp's windows. Like I said, extremely annoying.

I also happen to run my CRT at a lower resolution than my LCD. This means when the secondary (LCD) is disabled after making the CRT the primary, applications like Firefox snap to the new smaller dimension of my Windows XP desktop. Unfortunately when I re-enable the LCD and make it the primary, applications do not snap back to their original size and location. Which means you wind up having to resize and re-situate all your applications.

It shouldn't be surprising that there are times you don't even bother firing up a game on account of all this. But with scripting and Ultramon being able to save state, all these hassles are eliminated.

Lastly, I created two batch files, gameCRT and lcd, which simply map to:

cscript /nologo gameCRT.vbs

cscript /nologo enableLCD.vbs

I have to give kudos to Microsoft for having architected Windows in such a way that it facilitates the development of scriptable objects by third party vendors. Considering the hassles eliminated, I really didn't care what scripting language was at play - VBScript, PERL, Python, etc., etc. In the end I automated the swapping of displays to more easily get my game on with the CRT whenever I saw fit.

Wednesday, April 2, 2008

Are You A god? Probably. You Just Don't Know It.

What could I possibly be talking about with a subject like that? Readers will find out soon enough. I've been working on a project. Something for Windows XP/2000 and to a lesser degree Windows Vista.

I submitted the program (with an installer) to Download.com and was given a public availability date of April 11th.

Yes, with this blog entry I'm not writing a "small book". What I created goes beyond catering to a technical readership. It has application to anyone who uses a web browser, including your uncle John who may have gotten his first computer last week because it seems some of his coworkers at the factory got one. Or your aunt Mary because she's into cooking and her best friend Sally whom she bowls with every Friday has gotten tons of recipes off the Internet. Now your aunt Mary wants to dive headlong into the Internet and she's been bugging you to go to Best Buy with her to pick a computer. Or, your grandmother, if she's a vanguard for her age group. But, most importantly, you.

I've been trying to think of a catchy domain. More to come.

Monday, March 24, 2008

Just A Bad Memory

Java application servers have become pervasive in the last decade as web based applications have replaced old school Visual Basic (VB) applications. Nowadays many corporate web applications are written in Java, with the code ultimately running on a J2EE application server. Java's pedigree is C++, and, much to Joel Spolsky's chagrin, Java makes things much easier on developers in one area: memory allocation. In C/C++ a programmer is completely responsible for allocating and deallocating memory. Because of this, bugs at every level, from drivers to operating systems to applications, have found life in C and C++ codebases since the inception of those programming languages in the 70's and 80's, respectively.

Java does away with this onus by completely managing memory. Developers are free to use memory liberally with nary a care. Memory is reclaimed by the Java runtime through garbage collection in the background.

But garbage collection has issues of its own. The biggest is that applications can appear to hang as the Java runtime expends more time trying to reclaim memory than executing code. If you have a time sensitive situation, it can result in outright application failure.

Such was the case when some applications at my employer running under Oracle's application server would die after running for an extended period of time. In the logs, failure announced itself as successive entries for full garbage collection within a short span of time, e.g.:


2624761.994: [Full GC ... 13.5346127 secs]
2624787.633: [Full GC ... 13.4446663 secs]
2624824.282: [Full GC ... 13.4713927 secs]


The marker here is Full GC. This example shows full garbage collection taking place three times in less than a minute, each pass eating over 13 seconds. I was asked if successive garbage collection could be monitored, with an alert sent if in fact that was the case.
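
For reference, entries in this format are what you get when the JVM is started with garbage collection logging switched on; with Sun's HotSpot JVM the switches look something like this (flag names vary by JVM and version):

java -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps ...

The leading number is seconds since the JVM started, which is what makes the successive-GC arithmetic below possible.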

Monitoring a single file is easy; a quick PERL hack would be:

open FH, "tail -f log_file |";
while ( <FH> ) {
    if ( m/some_string/ ) {
        # Do something when I've seen "some_string",
        # like send an email.
    }
}

You could nohup such a script and leave it running indefinitely in the background.

The problem with such an approach is an application server tends to house lots of J2EE applications. You would need a script for each application.

This simple solution does not get around the fact that log files are typically rotated away in the middle of the night. In other words, after a day the log file that is being monitored is no longer reflecting transactions on account of having been renamed with a date extension and moved elsewhere. A new log file is created and the monitor would not be attached to this new file.

Such a solution would also be maintenance prone since a monitoring script would be needed for each application. Therefore with new applications, a script would have to follow in tow and over time the monitoring system may not reflect what is actually running inside a J2EE application server. This shouldn't be surprising since it's quite common for people who write monitors to be separate from application developers.

Furthermore different servers have different applications so now you have a set of monitoring scripts that varies depending on what server you're situated at.

Taking all this into consideration I devised a forking PERL script that gets around all these problems. Forking is not something that surfaces often when writing PERL scripts but its use can greatly simplify some problems. The solution I devised was to have a parent process spawn one child for every log file (associated with an application). The log files are enumerated through a regular expression and the file list is then fed into the PERL script. This way I do not need to keep a list of what applications are running where. In the case of Oracle's application server it prefixes application logs with OC4J~OC4J_.

ls | egrep "OC4J~OC4J_[A-Z]+" | egrep -v ":[0-9]+$" |
nohup perl monitorGC.pl &

And finally the PERL script itself:


#!/usr/bin/perl

# Reap children automatically so they don't become zombies.
$SIG{CHLD} = 'IGNORE';

# Variable a bit of a misnomer at this point; length of time
# for the parent to sleep.
$logRotationInterval = 3600; # seconds

# If we see full garbage collection happening more than once in this
# interval, send an alert.
$interval = 60; # seconds

# Log file names arrive on stdin (see the pipeline above).
@files = <>;
chomp(@files);

# Process IDs of children.
my @pids;

while ( 1 ) {
    # Loop forever. After one day spawned children will quit since log
    # files are rotated once a day. This main loop will then spawn new
    # children (see comments below).

    for ( $index=0; $index<=$#files; ++$index ) {
        my $pid = fork();
        $pids[$index] = $pid;
        if ( $pid == 0 ) {
            # Child: tail one log file indefinitely.
            $last = 0;

            open FH, "tail -f $files[$index]|";
            while ( <FH> ) {
                if ( m/\[Full GC/ ) {
                    if ( m/([0-9]+\.[0-9]+) secs]/ ) {
                        if ( $1 > 2 ) { # Only care if Full GC took over 2 seconds
                            # Capture the entry's timestamp into $1.
                            m/^([0-9]+\.[0-9]+):/;
                            open LOGENTRY, ">/tmp/monitorGCevent.txt";
                            print LOGENTRY $files[$index]."\n\n";
                            print LOGENTRY $_;
                            close LOGENTRY;
                            # Two slow Full GCs within $interval seconds? Alert.
                            if ( $1 - $last < $interval ) {
                                `./sendGCMail.sh $files[$index]`;
                            }
                            $last = $1;
                        }
                    }
                }
            }
        }
    }

    # Sleep while children do their work.
    sleep $logRotationInterval;

    # Kill the children; there's no reliable way to have them exit
    # on their own.
    for ( $index=0; $index<=$#files; ++$index ) {
        `kill -9 $pids[$index]`;
    }

    # Kill the tail processes that were spawned.
    `killTails.sh`;
}


So there you have it. A PERL script that is application agnostic that monitors successive garbage collection events and sends out an email when such an event is detected.

Thursday, March 13, 2008

GotoMySshPC


One question that surfaces sporadically when managing Internet facing IT systems is, "What does the rest of the world see?" Occasionally, after a systems deployment, Internet traffic cannot reach some or all of the systems deployed.

This past week for the Nth time in my career this question surfaced. Everyone uses email but few understand the mechanism by which email delivery actually happens. MX records are the key; more on the topic in a moment.

Domain Name Servers (DNS) translate easily read and memorized host names such as www.google.com into TCP/IP addresses, which are not so easy to remember. For example, my Windows XP desktop currently would try to hit one of the following TCP/IP addresses if I were to browse www.google.com:

64.233.167.99
72.14.207.99
64.233.187.99


Quick! Look away! Can you recall any of those IP addresses? If you're like most people, probably not. This particular information was retrieved from my ISP's DNS server. A DNS server therefore is not much different than a phone book.

DNS servers also play the crucial role of facilitating mail delivery. They store different types of information, and the type used for facilitating the delivery of mail is the MX record. MX is short for Mail eXchanger. When you use an email client such as Outlook to send an email, your ISP's mail server (the one Outlook happens to be chatting with) ultimately has to talk to another computer to deliver the email you just wrote. Which computer your ISP's mail server talks to, to deliver that email, is answered by MX records.
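
You can ask for these records yourself with nslookup from any command prompt (the domain here is just an example):

nslookup -type=mx gmail.com

What comes back is a list of hosts willing to accept mail for the domain, each with a preference number; lower numbers are tried first.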

One morning mail delivery for a particular domain of my employer was failing and I was called in to investigate. On a hunch I did an MX record query for the domain in question and noticed its MX records had disappeared, which meant mail delivery to that domain would fail. Further investigation showed that a typo in a DNS configuration file, associated with a publish the day prior, had caused the problem. The resolution was to correct the typo, republish our DNS information and let it propagate across the Internet.

DNS is a hierarchical system. If your local DNS server does not have the information requested, e.g., what does www.crazydeals.com resolve to?, the request propagates upwards, ultimately reaching the root DNS servers at the top of the Internet's hierarchy, which point back down toward the servers that are authoritative for the domain in question. Meanwhile, DNS servers cache the answers they receive, so a change made at the bottom of the hierarchy (like our typo fix) only becomes visible elsewhere as cached entries expire and queries percolate back to the authoritative servers. This information exchange among DNS servers is constantly happening. It usually takes several hours for a local DNS change to spread across the Internet, sometimes longer.

As the day wore on, I was in the office and wanted to check what my home system saw, i.e., was the DNS change that was made to correct the typo making its way across the Internet.

Thus the age old problem of wanting to conveniently reach my home computer while at the office to check what it was seeing on the public Internet had reared its head again. Once while doing contract work I overheard an individual in a cubicle next to me call his wife to ask her to browse pages to see if they were publicly available. Suffice to say, this problem surfaces on a semi-regular basis over the course of one's career.

Several years ago I had set up a LINUX box that I would reach via SSH. SSH (Secure Shell) is a cryptographic protocol that provides a mechanism for issuing commands to a remote computer system. While most people are used to GUI desktops, most back end computer systems are controlled through command line interfaces. If you think the monolithic bank that houses your money along with that of hundreds of thousands of other people manages it all by clicking on icons with a mouse, think again. SSH is commonly the substrate by which remote control/management of many large IT systems happens.

One particular command line program that can be used to query DNS servers is nslookup, and it comes with any contemporary operating system that connects to the Internet. Every single eye-candy laden Windows and Macintosh system ships with this command line tool. Thus if I could reach my Windows XP desktop, I could use this tool to gauge how far along our change had propagated - from at least one vantage point anyway, that of my ISP's DNS servers. Not all DNS servers are public facing, which is why I wanted to see what my home system saw: whether the change had trickled down to the private portions of the DNS hierarchy.

There are commercial services that allow one to readily reach one's home computer. One of the better known services is GotoMyPC. My primary issue with GotoMyPC is cost. It runs about $180 for one PC and it is a service, which means count on paying $180 every year.

Another option would be to have the desktop you find yourself at become part of your home network. This is more general than what GotoMyPC does and is the reverse of what most people do when working remotely. Instead of connecting to your office via a VPN client, whereby, through the powers of indirection, you suddenly join your office's network and can work remotely, you do the same thing in reverse: your office desktop becomes an extension of your home network. To this end there is an excellent piece of open source software called OpenVPN.

However, a VPN solution to the home has two problems. One is that almost all home TCP/IP addresses come out of a DHCP pool, so your TCP/IP address changes whenever the DSL/cable modem sitting in your home has to renegotiate its connection, for instance because connectivity was interrupted, however briefly. So if you were to take note of your TCP/IP address, you might find that the next time you were inclined to connect to your home PC, your home network's TCP/IP address is no longer the same. You would be completely out of luck.

A solution to this particular problem exists in the form of an organization known as DynDNS. They allow you to associate your home TCP/IP address with a world wide hostname/domain. This means after registering a domain and using DynDNS to provide DNS services for said domain, you could use something like mypc.mydomain.com when telling OpenVPN to connect to your home network while you are sitting at the office.

Since DynDNS and OpenVPN are free, they make for a much more attractive proposition than the pricey yearly cost of GotoMyPC. However, you still need to register a domain with a registrar before mypc.mydomain.com (whatever it really happens to be) takes on life. The cost of registering a domain is nominal - about an order of magnitude less per year than GotoMyPC's yearly cost if you shop around.

There is still one issue that GotoMyPC readily gets around that the OpenVPN/DynDNS solution does not. If your home connectivity drops and a new IP address is issued, it can take several hours after that information reaches DynDNS' servers for the new address to propagate across the Internet. And Murphy's Law may be in full force on the very day you really need to reach your home network: if your DSL/cable modem did perchance receive a new TCP/IP address, DNS propagation means it will be several hours before you can connect.

GotoMyPC gets around this by having agent software that chats with their own centralized servers. The agent communicates your home network's current TCP/IP address to centralized servers so DNS is not even involved when trying to reach your home desktop through their service.

If you do go down the DynDNS/OpenVPN route, then like any real business you had better remember to renew the domain; otherwise you will one day find your convenient hostname mypc.mydomain.com no longer works. It is actually a common problem, and various companies, including the likes of Microsoft, have famously forgotten to do this.

Stepping back for a moment, this all boils down to having information readily in hand, i.e. what's the current TCP/IP address affiliated with my DSL/cable modem's connection to the Internet.

So after this past week I decided to solve this problem once and for all so I could always readily reach my home network, without cost and without hassles. No GotoMyPC, no OpenVPN/DynDNS, no registrars, no accounts with anyone.

Earlier I mentioned that I used to have a LINUX box with SSH running. The primary deterrent to its continued use was the fact that the TCP/IP address into my home network would change; eventually I simply abandoned it. Neither OpenVPN nor DynDNS existed at the time.

Once again I would leverage SSH, specifically OpenSSH. While most IT administrators are used to using SSH with *NIX systems or network devices, it turns out OpenSSH can readily be configured to run as a service on Windows XP/Vista/200x systems, thereby affording remote control and issuance of commands. OpenSSH can be downloaded as part of Cygwin:

www.cygwin.com

Cygwin is a layer of software that makes a Windows system appear *NIX-like. This means, among other things, various open source applications such as Apache's web server (1.3.x) can be compiled and run on a Cygwin equipped system with no changes to the *NIX source code whatsoever. But outside of providing programmatic similarities it also provides ports of back end services such as SSH (OpenSSH). Rather than dive into the gory details of setting up SSH on Windows, go here for an excellent how-to.

Cygwin provides all the command line utilities commonly used in *NIX. People often collectively call a software system LINUX, UNIX, etc. but many of the tools employed in those environments have no intrinsic functionality that ties them to any single operating system. For example, many of the GNU command line utilities have been ported to Windows. The executables are stand alone and run on Windows without any dependencies. When installing Cygwin, all of these tools are available, including command interpreters such as the bash shell.

All this means I should be able to write a script that retrieves the world wide TCP/IP address associated with my home network. This information, however, is not stored on any single machine inside my home network. Like most people I have a home router fronting my network connections. Using my home router's web interface I can readily see what my outside TCP/IP address happens to be, so the task is getting at this information through a script.

Most people use their web browsers to fetch web pages, but the HTTP protocol is very simple, and command line utilities exist to do the same. They often form the basis of "heart beating" web applications, i.e., if I can fetch a web page, the web application is still up; if not, send out an email alert. But they have other uses, such as this one. I simply want to log into the router, hit the web page that contains my outside TCP/IP address, then do something with that information. Easy enough:


#!/usr/bin/bash

while true; do
    sleep 600
    # Fetch the router's status page; wget saves it as Status.htm.
    wget --http-user=user --http-password=passwd \
        http://192.168.1.1/Status.htm
    # Pluck the 15 columns where the WAN address sits, then keep only
    # the characters that can make up a TCP/IP address.
    cat Status.htm | cut -c3967-3981 |
        perl -lane 'm/([0-9.]+)/; print $1' > currentIP
    diff currentIP lastIP
    if [ $? -ne 0 ]; then
        # Address changed: mail it out and remember the new one.
        cscript sendIPAddressViaMail.vbs
        cp currentIP lastIP
    fi
    rm Status.htm
done



My home router performs basic access authentication and I can pass the requisite username/password through command line arguments to the wget utility program when I retrieve the web page that contains the TCP/IP address my DSL modem negotiated. In the case of my LinkSys router that happens to be:

http://192.168.1.1/Status.htm

Initially I used wget in an ad hoc manner, and by default it stores the fetched page in a file. I noticed the page returned by the router had no line breaks, so I fired up EMACS and navigated to the column where the TCP/IP address affiliated with my outside connection starts. Noting the column offset, I scripted cut to pluck 15 characters (the maximum possible length of a TCP/IP address string). Finally I filter the string through PERL so that only characters making up a TCP/IP address end up in the final string output to the console.

I store the extracted TCP/IP address in a file named currentIP and compare it, using the diff command, to the last known address (lastIP is initially set up by hand). If there are no differences between the files (determined by checking diff's exit code), my TCP/IP address has not changed and the script goes back to sleep for another ten minutes (600 seconds).

If the files are different, then things get a bit more interesting. How do I communicate this information? Most *NIX administrators use the mailx utility to send email from their scripts. The problem with mailx is that it assumes there is a local mail server running, and running your own mail server is a can of worms unto itself; honestly, I'm not inclined to run a mail server on my Windows desktop. Rather than dive into such issues, I leverage Microsoft Outlook. Knowing that the Microsoft Office applications can be automated with VBScript, I concocted the following script, which is executed through the Windows command line tool cscript:


ESubject = "IP Address change"
SendTo = "mymail@someWebMailAccount.com"
Ebody = "IP Address change"
NewFileName = "D:\cygwin\home\mariop\currentIP"

Set App = CreateObject("Outlook.Application")
Set Itm = App.CreateItem(0)   ' 0 = olMailItem
With Itm
    .Subject = ESubject
    .To = SendTo
    .Body = Ebody
    .Attachments.Add (NewFileName)   ' attach the file holding the IP
    .Send
End With
Set App = Nothing


The script sends a text file that contains the TCP/IP address my DSL modem last negotiated as an attachment to my web based email account. This way I can always just browse my web mail to find out my home network's current TCP/IP address. If my TCP/IP address changes, an email will be sent out within ten minutes.

There was one last hurdle to this solution. In 2002, given the prevalence of VBScript based worms in the years prior, Microsoft changed Outlook so that sending email programmatically was no longer possible without confirmation: Outlook now issues a pop up asking whether or not to allow an email being sent through a script. This is a major fly in the ointment since I need all this to run unattended; after all, I'm not at my home PC. After some googling I came up with some freeware:

http://www.contextmagic.com/express-clickyes/

The latter page shows the pop up that surfaces if email is sent via VBScript.

There you have it, with all this in place I'm assured of always being able to reach my home desktop. No GotoMyPC, no OpenVPN/DynDNS, no registrars, no accounts with anyone.

It turns out the solution I employ can be used in conjunction with OpenVPN eliminating the need for DynDNS and having to register your own domain. However I prefer SSHing since this allows the computer I'm working on to maintain its local context. I can readily switch between what I'm doing at work and what I might want to do at home. Namely, the machine I'm working at doesn't become part of my home network and suddenly local/work resources are unavailable to me.

If you're at all familiar with SSH's abilities you can tunnel various application protocols such as Microsoft's Remote Desktop. This means I can reach my graphical Windows XP desktop as the need arises. Which is exactly what GotoMyPC provides. Except in my case, without the $180 yearly cost.

When I establish an SSH connection to my home system, I do something like this using the SSH binary that is part of Cygwin on my work machine:

ssh -L 3390:localhost:3389 my_home_ip_address

After I've logged onto my home system instead of giving a machine name to Microsoft's Remote Desktop client while I'm sitting at the office, I simply specify:

localhost:3390

And before too long I get a login to my home Windows XP desktop while sitting at the office. The entire conversation between work and home computers is encrypted.

GotoMySshPC.