Tuesday, December 18, 2007

Mod-direction

If levels of indirection in our lives were as prevalent as those found in information technology, life would be pretty tedious and it would be an awful lot of work to get even small things done. Yet levels of indirection are par for the course in technology. The abstractions they engender form a patchwork that gives us the applications we are accustomed to using, everything from desktop applications to the World Wide Web.

For example we speak of 32 bit processors and increasingly 64 bit processors, and it is mostly understood by those in technology that this delimits addressable memory. This means that your 32 bit system (probably, as of this writing) can address 2^32 bytes of memory, or more specifically 4 gigabytes, in the context of a single application. In the case of a 64 bit processor this rises to 2^64, roughly 1.8446744 × 10^19 bytes. That's:

18,446,744,073,709,551,616

So roughly a 4.3 billion-fold (2^32) increase over what a 32 bit processor can address.

But it turns out that other "bits" dictate the bytes that make up the data and code associated with our applications. The Intel x86 architecture, for example, has registers called selectors which comprise 16 bits. Thirteen of those bits are actually an index into a table the processor keeps, and another bit dictates which table to do the lookup against. A single bit has two states, 0 or 1, which means there are two tables. These happen to be called the Global Descriptor Table and the Local Descriptor Table. 2^13 gives 8,192 possibilities per table, which means there are at least 16,384 initial paths before the 32 or 64 bits that people typically speak of take on context - in the form of either code to be executed or data to manipulate. To make things very lucid, for a 64 bit processor this means:

18,446,744,073,709,551,616 x 16,384 logical paths to code and data
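
These figures are easy to sanity check from a shell prompt. Here is a minimal sketch using bc for the arithmetic, since 2^64 overflows the shell's own built-in math:

echo "2^32" | bc          # 4294967296 bytes, the 32 bit address space
echo "2^64" | bc          # 18446744073709551616 bytes, the 64 bit address space
echo "2^64 / 2^32" | bc   # 4294967296, the factor of increase
echo "2^13 * 2" | bc      # 16384 selector paths (GDT plus LDT)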

To quote Sir Isaac Newton, "If I have seen further it is by standing on the shoulders of giants."

Much of what we take for granted in our everyday lives is attributable to "behind the scenes" machinations seemingly akin to the caricatures of mousetrap contraptions that have played themselves out in pop culture from cartoons to board games. From the power grid to the water flowing through our lavatories to the Internet, there is much going on behind the conveniences we summon with the flick of a finger.

A router in a home masks the fact that a user has several machines behind a single TCP/IP address. This many to one relationship is a level of indirection. This indirection, redirection or in the case of malice, misdirection (think botnets), happens at a level that both endpoints are oblivious to, e.g., a web browser and a web server.
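
One way to observe this many to one relationship from a machine sitting behind such a router is to compare the address the machine believes it has with the address the outside world sees (the lookup host below is a stand-in for any of the "what is my IP" services out there):

ifconfig eth0 | grep "inet "            # a private address handed out by the router, e.g. 192.168.x.x
curl http://whatismyip.example.com/     # the single public address every machine in the house shares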

Things get interesting when decision making (indirection) happens at the initial stage of browsing a web site, independent of the indirection happening at your home router. It may turn out that the initial chatter your web browser conducts when going against your favorite web site is really against a load balancing device that makes decisions based on the URLs you are requesting. So when you hit www.my-favorite-site.com/search/query.jsp, the device keys off of "search" in the URL and always sends these page requests to one or more web servers dedicated to searching the knowledge space of that site. To this end, products from f5 are well known in the industry. This is often known as layer 7 load balancing. Layer 7 is the topmost layer of the OSI model of networking and is called the application layer since that layer is closest to the abstractions that make up an application, in the example given, the request to a search page for information.

A home router works at layer 3. All these layers are levels of indirection that build on top of each other to facilitate the applications that are familiar to any contemporary Net user, from IM to web browsers to Bittorrent clients.

Another potential decision point for web servers is the kind of HTTP method a client is using. The HTTP protocol allows a client, such as a web browser, to make different types of requests to a web server. An overwhelming majority of requests made against sites are GET requests issued by web browsers, and the method means what it insinuates: get me a page. The second most common HTTP method is the POST method. This method is usually used with forms, so any time you've entered your name, address and credit card and hit the Buy button, odds are overwhelming that an HTTP POST method was at work.
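
As a rough illustration, here is what the two methods look like when issued by hand with curl (the host name and form fields are placeholders):

# a GET request: just fetch a page
curl http://www.example.com/index.html

# a POST request: submit form data in the request body (curl -d implies POST)
curl -d "name=Jane&address=123+Main+St" http://www.example.com/buy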

When the HTTP protocol was designed in 1989 the world was a simpler place; the issues of phishing, email spam and other rogue forms of misdirection were yet to surface. In 1999, with revision 1.1 of the HTTP protocol, the HTTP TRACE method was added. Having done software development in the past, I find the motivation clear: it was designed as a debugging tool. The method simply has the web server echo a request that was sent to it. Sounds benign enough, except that if your web browser has cookies associated with the path sent to the HTTP TRACE method, the web server will readily echo the cookie values. Again this sounds benign enough, except in the contemporary world of rogue JavaScript (embedded in a malicious HTML email that happens to be spam), the HTTP TRACE method in conjunction with the XMLHttpRequest object can be used to read cookie values and then send them off to someone who just might then go into your bank account and relieve you of your finances. This forms the basis of a Cross Site Tracing attack, a variant of Cross Site Scripting.
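
You can watch the echo happen with nothing more than curl against a server that still honors TRACE; the host and cookie value below are placeholders:

# send a TRACE carrying a cookie; a server honoring TRACE echoes the
# entire request, cookie value included, back in the response body
curl -i -X TRACE -H "Cookie: SESSIONID=secret123" http://www.example.com/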

The XMLHttpRequest object in JavaScript forms the basis of Web 2.0 applications, i.e., applications that are very dynamic in nature and feel like a local desktop application. You could turn off JavaScript but you will find out soon enough that it is a lost cause. Most sites rely on JavaScript to provide the experience that end users are accustomed to.

One problem in the management of IT infrastructure and the applications that live on that infrastructure is time. In time, every computer system that is deployed will degrade and become a legacy system.

Any computer system has two factors working against it.

The hardware technologies employed will become antiquated and may not be up to the task as an organization grows. Even if the system remains up to the task, other factors readily contribute over time, such as the scarcity of replacement parts or, in some cases, the vendor of the original hardware ceasing support or going out of business. Some systems are labeled legacy systems but very much still form the crux of what is happening in the here and now. The Internal Revenue Service grapples with the problem of having a well entrenched legacy system.

With the software residing on a system, a wide range of factors contribute to its slide into legacy status, everything from employee attrition (a waning knowledge base) to business requirements no longer being met by the software initially deployed to the difficulty of finding people with the skill sets to make changes to the system.

The factors and their weight will vary widely, but just as time weighs down on our knees, time will weigh down any computer system from both the hardware side and the software side, making continued use a questionable business proposition, either in terms of the opportunity cost of lost business or the prohibitive cost of the status quo. In the case of the IRS, the government being what it is, it has resources (your tax dollars) that the private sector could not muster; a private firm in the same position would face bankruptcy.

This past week I ran into one of these legacy systems: an Internet facing application server employing JBoss. JBoss is intended to run the Java code that makes up web applications. RedHat acquired JBoss in April of 2006 for $350 million. RedHat is well ensconced in the IT industry and JBoss is still well supported, so what made this system a legacy system? The version of JBoss running on the server in question dated to March of 2003 and the staff that put it into use were no longer with the organization. More to the point, there was little documentation. This is a classic case of attrition making a system difficult to support and why managers should be big fans of knowledge transparency in the form of wikis and of allocating time up front to document systems. My role in this case was strictly that of system administrator of the underlying LINUX OS.

This Internet facing application was flagged by auditors for a Cross Site Scripting vulnerability on account of the web server servicing the HTTP TRACE method. I was tasked with seeing if the system could be reconfigured to turn off servicing HTTP TRACE requests.

As I performed a discovery process I learned that the first tier that responds to HTTP requests is a piece of software complementary to JBoss called Jetty, an open source web server written entirely in Java. After I ferreted out where Jetty's configuration file lived, online investigation revealed that the version of Jetty included with the version of JBoss found on the production system did not have the facility to turn off the HTTP TRACE method. In fact, the version of Jetty employed was one minor revision away from having the ability to turn off HTTP TRACE.

A quick initial assessment of the situation, purely as a function of the application served up as well as the application server itself (JBoss/Jetty), seems to leave only difficult choices. If the application server software is updated, it could very well break the underlying application. As alluded to earlier, part of the problem with legacy systems is knowledge transfer, or lack thereof. Outside of a cursory inspection of this Internet facing web application, such as logging on to check base functionality, little existed in the way of documentation, such as a regression test plan or even the tools to carry out such a regression. Perhaps an upgrade of the underlying application server would go well initially (the software starts) but application functionality would break (what end users perceive). Rewriting the application is not realistic either. The application's entry point was flagged during an audit, so more than likely remediation is expected sooner rather than later. In addition, asking managers to expend resources to rewrite applications impromptu is not likely to garner support (think budgets), especially if the scope of the application is large. Nor is it likely to thrill tech people who may be on time lines with other projects (see point 8).

Indeed this seems like a very thorny problem with choices that entail lots of risk. Unless, that is, you happen to be familiar with the various levels of indirection, how their layers play among themselves and, even more importantly, the tools to manipulate them.

The Apache web server is an amazing piece of software. Despite years of Microsoft giving Internet Information Server (IIS) away for free and Steve Ballmer's hot air in calling LINUX, an operating system that popularly runs Apache, a cancer, it still remains the most popular web server in use on the Internet today. Rather than take my word for it (or not), visit Netcraft, which offers a tool that allows you to see what web server your favorite web site is using. The link provided shows the web server platforms behind the sites that people were most curious about - observe that Microsoft's platform is barely on the radar (12/2007).

This is no coincidence. One of Larry Wall's mantras when he designed his popular PERL programming language was to make the easy things easy while making the difficult things possible. It is a philosophy that shows up consistently among the proponents of open source technologies. It is one thing to be the end user of a site, say Amazon, that may be a complex heterogeneous mix of computing platforms - ignorance, as they say, is bliss - but it is quite another if you must administer the platforms that comprise such a site. The more degrees of freedom that exist to make the system pliable, the greater the ability to adapt to situations as they arise, planned or unplanned... such as when an auditor is complaining.

What makes Apache so powerful specifically are all the modules of extensibility that come with the system 'out of the box'. They are often called mods for short. Being part of the open source community, the Apache web server has engendered an active developer community whose modules afford administrators great flexibility not only in configuration but also in the manipulation of HTTP traffic.

Manipulation of HTTP traffic is one powerful use case for Apache. In this capacity, an Apache web server instance never actually hosts any web pages but is used as a traffic cop, redirecting HTTP traffic based on rule sets and giving the end user the illusion of a cohesive web site when in actuality the site may be an amalgam of different web servers.

The Apache modules mod_proxy and mod_rewrite provide these facilities; extensive use cases and the minutiae of rule set semantics are beyond this write up. Suffice it to say they can be used to solve the thorny problem of turning off the HTTP TRACE method.

The solution is simply to have the original web server that fronts the offending application listen on a non-standard TCP port such as 8080, which is not visible to the outside world on account of a firewall, have Apache listen on the well known port (80), and then, with a mod_proxy/mod_rewrite rule set, direct traffic based on the HTTP method. If a request comes in as an HTTP TRACE, simply deny it.

This is easily done through:

RewriteEngine on
RewriteCond %{REQUEST_METHOD} ^TRACE
RewriteRule .* - [F]

RewriteRule ^/(.*) http://localhost:8080/$1 [R,P]

The RewriteCond statement looks for the TRACE method and the RewriteRule after it causes Apache to return a 403 Forbidden page. The second RewriteRule is a catch all that simply has Apache delegate all other HTTP traffic to the server listening on port 8080 on the same system where Apache resides, which in this case is the offending JBoss/Jetty server that will gladly service HTTP TRACE requests. Apache configured this way filters the TRACE requests, so all the auditor sees is a result that nullifies the previous observation that this particular Internet facing web application services the HTTP TRACE method. And thus, with a deploy of the Apache web server and the addition of four statements in its configuration file: no blown budgets, no interrupting developers.
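
A quick way to confirm the new behavior from the outside, with www.my-favorite-site.com standing in for the real host, is to issue a TRACE and an ordinary GET by hand and compare the responses:

curl -i -X TRACE http://www.my-favorite-site.com/   # now answered by Apache with 403 Forbidden
curl -i http://www.my-favorite-site.com/            # still proxied through to JBoss/Jetty on port 8080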

In the interest of a balanced viewpoint, if you have a contemporary application load balancing device, such as those from f5, the filtering of the HTTP TRACE method can happen there. In this case, however, the application was not fronted by such a device. Even when such a device exists, the politics at play within an organization may make it more work to involve other teams in network changes than to reconfigure software on the single server that ultimately receives the offending requests.

A better decision point, however, is the number of points of potential change. It is more cost effective to manage access control from a central location, such as an f5 device, than to visit multitudes of boxes to make configuration changes if the Internet application is backed by an entire web server farm. Since that was not the case here, an Apache solution was employed.

To quote Arthur C. Clarke, "Any sufficiently advanced technology is indistinguishable from magic." If these tiers of indirection are not on your radar, yep, magic.

Thursday, August 9, 2007

As easy as -p -i -e

Before the advent of the World Wide Web and the decoding of the human genome, PERL had already ensconced itself in the IT world amongst *NIX system administrators. For good reason: PERL provides a more complete feature set than the Bourne Shell and/or sed/awk. Then and now, all of these tools are commonly used by system administrators to automate repetitive tasks.

One of my favorite invocations of PERL is with the command line options -p -i -e. This particular combination allows for the in place editing of files, namely searching for one string and replacing it with another. Like so:

perl -p -i -e 's/original_text/replacement_text/' configuration_file

You can even qualify the -i parameter to back up the file being modified:

perl -p -i.bak -e 's/original_text/replacement_text/' configuration_file

The original file will continue to live on as configuration_file.bak.
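
One caveat: as written, the substitution replaces only the first occurrence of original_text on each line. Appending the g modifier catches every occurrence:

perl -p -i.bak -e 's/original_text/replacement_text/g' configuration_file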

Recently, a coworker was faced with changing a configuration file to point from one database to another on six production instances of Oracle's Application Server. Each instance in turn housed ten applications. This meant he would need to go edit sixty configuration files or use Oracle's web based administration tool and drill into it sixty times. Irrespective of which of these methods you employ, the process is tedious and error prone.

Armed with such knowledge, cobbling together a solution for my coworker's plight did not take long. Logging into any number of machines via a script is one reason the Secure Shell (ssh) exists. Since I administer the boxes the changes needed to be made on, I had the requisite private cryptographic key to log into all of them without being prompted for a password.

This problem beckoned for automation since I could readily traverse all of the machines from a central point and invoke commands. In summary, I wanted to log into each machine, find the configuration files of interest, and do an in place substitution to point to a different database.

Here is the solution for saving my coworker from firing up vi sixty times across six machines:

for i in mach1 mach2 mach3 mach4 mach5 mach6; do
  ssh $i 'cd /oracle/10gR3; find . -name "data-sources.xml" | xargs perl -p -i.bak -e "s/oracle/microsoft/" ';
done


The for loop is a shell construct executed on the system that serves as my jumping point; to reach each system without password prompting I employed ssh-agent. Ultimately, the heart of what I am doing on each machine is this:

cd /oracle/10gR3
find . -name "data-sources.xml" | xargs perl -p -i.bak -e "s/oracle/microsoft/"

I situate myself in the base directory of the Oracle application server, find all the copies of the configuration file in question, data-sources.xml, then use xargs to invoke PERL to do in place substitution against each and every instance of said configuration file. Using find with the -exec option and removing the intermediate xargs invocation works equally well.
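
For reference, the find -exec variant, sketched against the same directory layout, looks like this:

cd /oracle/10gR3
# -exec runs the perl substitution against each matching file in turn
find . -name "data-sources.xml" -exec perl -p -i.bak -e "s/oracle/microsoft/" {} \;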

So now, instead of wasting an hour (or more) editing sixty files, we both can hit the golf course an hour early.

This experience underscores the potency of using several seemingly disparate tools to solve a problem - very much the *NIX way of doing things. Now if only I could do these kinds of things "out of the box" under Windows...

Tuesday, August 7, 2007

An impromptu GUI for strace when monitoring a forking server


I used to have a coworker named Tom, about whom it was joked among us that he had only two possible answers to any technical hurdle posed to him: strace or tcpdump. As it turns out, more often than not, he was right.

strace gives visibility into OS system calls made by a running process. tcpdump allows sniffing of network traffic and is often used to ferret out networking issues. For monitoring an application as it interfaces with a host OS and/or its interactions over a TCP/IP network, these two tools are indispensable.
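
Their basic invocations are short enough to keep in muscle memory; the process id and network particulars below are of course placeholders:

strace -p 1234            # attach to process 1234 and stream every system call it makes
tcpdump -i eth0 port 25   # sniff SMTP traffic passing through interface eth0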

I once deployed a web application that during startup failed on account of a missing file. As best as I could tell, the file in question was present. Count on strace to get to the bottom of things. It turned out to be a simple misconfiguration - the file lay situated in a parent directory of the code instead of a local directory. This was easily divulged when I ran strace and could readily see all file open operations during the start up phase. What I was able to readily infer from its output was much more meaningful to me than Java code complaining that it could not open a file and then immediately terminating (but not divulging the path of the file it was trying to open).
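
In a situation like that, an invocation along these lines (the startup script name is hypothetical) surfaces the offending path immediately:

# follow child processes (-f), show only file open attempts, and flag the failures
strace -f -e trace=open ./start-webapp.sh 2>&1 | grep ENOENT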

Strace is powerful but its output can be quite noisy if the application being monitored has a high level of activity. If perchance monitoring a forking server such as Apache or Postfix is desired, then strace gets really noisy. Making sense of the "spaghetti" that comes back from system calls associated with a parent/child process tree is not fun. While it is possible to have strace latch onto a specific child process, the errant behavior may not be happening in the process being monitored but instead in a non-monitored sibling process. Therefore using strace to gain complete visibility into a forking server, with output that is readily digestible, is problematic.
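
Under Linux, strace can at least split its output per process: given -ff along with -o, each child forked after attaching gets its own output file. That tames the spaghetti somewhat, but it still means grepping through files after the fact rather than watching the processes live, and it does nothing for sibling processes that already exist:

strace -ff -o /tmp/smtpd-trace -p 4321   # writes one file per traced process: /tmp/smtpd-trace.<pid>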

I found myself precisely in this boat as I wanted to monitor the file activity of a Postfix instance running under Solaris. In Solaris, strace's equivalent is truss. After not too long here is the solution I devised:

lsof |
grep TCP |
grep smtpd |
awk '{print $2}' |
sort |
uniq |
perl -lane ' system("/usr/X/bin/xterm \"-sb\" \"-sl\" \"1000\" \"-e\" \"/usr/bin/truss\" \"-p\" \"$_\" &") '


lsof is a tool that will list all open file handles on the operating system, including network file handles. I specifically was looking for processes that had TCP/IP network connections (first filter). Then I looked for smtpd which is the name by which Postfix is listed in the process table (second filter). After which I used awk to pluck all the Postfix process ids which are in the second column of lsof's output (third filter). I sorted these (fourth filter) and removed any duplicates (fifth filter). Finally, this list of process ids was fed into a PERL one liner (sixth filter) that spawns xterms with the -e option. This option launches an application inside of an xterm, in this case truss.

Employing Cygwin's X Server port on my Windows XP desktop, I got an xterm for each Postfix process running under Solaris (see image above), with truss showing me all the system calls for each of them in real time. Beyond the fact that this approach yields information that is readily consumable for spur of the moment diagnostics, another advantage is that when a process terminates, the corresponding xterm housing the truss monitoring that process disappears off the desktop, which is good visual feedback for taking action depending on the context of the situation.

For PERL to correctly interpret my intentions, I had to escape the double quotes. Furthermore, when xterm is launched in this fashion, double quotes are necessary to delimit the command line arguments. So the sample code is harder to read than if one were to interactively fire off an xterm against each and every process like so:

xterm -sb -sl 1000 -e truss -p process_id_in_question

This statement says to fire up truss inside of an xterm against the process id specified after the -p argument with a scroll bar (-sb) and with a 1000 line scroll back buffer (-sl 1000).

Canonically, child processes of forking servers are ephemeral. That is, a child process on a forking server will handle a given number of requests and then terminate. This design philosophy ensures that no one child can live long enough to consume all available system resources, either by accident or malice (such as serving as a proxy for Denial of Service attacks).

Therefore coming up with a script that hard codes process ids is pointless, as their number and ids will always vary over time. This technique instead fetches the process ids on the fly and couples them with the ability of an xterm to house an application (truss in this case), the end result being an ad hoc GUI for stracing/trussing a forking server.