When I was laid off from the company Kristin and I worked at, she still worked there and telling this story would not have been a good idea. Recently, though, Kristin got the axe herself and now, almost 18 months overdue, and with great relish, the story can be told.
We've really been enjoying the Tivo Kristin got me for Christmas. It's programmable, can pause or go back up to 30 minutes while watching live TV, can change the digital cable channels (unlike our now primitive-seeming VCR) and records around 40 hours of TV with no need to change tapes. It's really cool. I'm fond of old movies and Discovery Channel programs. Kristin's tastes run more to Bill Nye the Science Guy and Mystery Science Theater 3000. But we both share the love of finding new and unusual things to watch, often simply for the fun of making fun of them. This is how I found myself watching the 1961 Disney weirdness, Babes in Toyland . Kristin insisted it was pay-back for making her watch The Gnome-Mobile a few weeks earlier, and pay-back it was. Babes was just awful. It was aimed at children and aimed very low indeed. When the main characters finally get to Toyland, half-way through the movie, Annette Funicello, Tommy Sands (Frankie fill-in), and a gaggle of small children in tow come across "The Toymaker" and his assistant, played by Ed Wynn and Tommy Kirk, respectively. In his first scene, Tommy Kirk shows off an invention that makes any toy by punching codes on a panel. After watching the machine magically create a dolly and a sailboat, old Ed remembers that he's the boss and should therefore be the one to inaugurate full-scale production for Christmas. He bangs on buttons, pulls levers, and dances a little dance while Tommy Kirk hovers and flusters nearby. The result, which you see coming from a great distance, is a messy doll-boat-top thing that reminded me of the genetic-hybrid mess frequently created in The Fly movies -- but in a happy, cartoon-like way. The machine soon explodes, of course, showering the lab with toy carnage. In the aftermath, old Ed pulls himself together and informs Tommy that the mess is his fault. And ain't that always the way with management?
Now the Toymaker had his own toy factory and if he wanted to be an ass to Tommy Kirk, then more power to him. This was an American vision of Toyland and we shouldn't forget that we have the right to be assholes in the name of commerce. And here we come to the meat of today's story. Maybe I'm being naive, but don't you think it should be more difficult to get away with such self-aggrandizing, self-serving bullshit when you work in a publicly traded company? Wouldn't you think the shareholders, if no one else, would want to know what you're up to or how you managed to lose almost a billion dollars in six months? I've been fortunate enough to have worked at companies that used "360 Reviews" (aka Full-Circle Reviews) to allow employees and managers to review each other. This usually has the pleasing effect of largely stemming the tide of incompetence that flows through corporate America. Still, some managers never seem to understand that even if you successfully hide your incompetence from your boss, you can't hide it from everyone.
I recently worked for two years at what was once the darling of the personal electronics industry, Handy (names changed to protect my butt from lawsuits). By the time I was laid off, it had become a pit of ass-covering, lying, boss sucking-up weasels who were more interested in trying to prop up their imagined reputations than in actually looking around to see if they had accomplished anything lately. It was a place where those who tried to do their job well were considered suspect. Those who didn't keep their heads down and their mouths shut (like me) didn't stand a chance. Take this story for what it is. It's as much about achieving closure as telling a story that should have seen the light of day much sooner if not for the ongoing threat to what remained of our livelihood (as well as Kristin's sanity). Or, if you just like bad-boss stories, this ought to give you a tickle.
Kristin introduced me to her new co-worker, Brian LaManche, at a summer party we threw in 1997. Of average height, slim and athletic-looking, he wore a straw hat over his thinning blond hair and this complemented his outgoing, friendly demeanor. He was a likeable person. We got on well at the party and a few weeks later he invited me to Handy to talk about a job writing build scripts for their Software Engineering group. We quickly realized the work he had in mind wouldn't take more than six months to complete, so I asked the obvious question of what I'd do after that. Brian wanted to check with his director, Vera, and said he'd call me later. When he did, it was to tell me that Vera had decided that I would just run the scripts after they were written. That didn't sound like much fun, so I turned down the job. Vera later told Kristin that she didn't like to hire "over-qualified" people. I was annoyed, as much because of the stupidity of the decision as because I liked the idea of working at Handy, but didn't pursue it. A few months later Vera left Handy.
In late 1999, I was more actively looking to change companies and talked with Brian again. He wanted someone to automate OS builds for Handy's best-selling Personal HandJob and Professional HandJob. When he brought me in for an interview at their new building, he talked excitedly about how they had tried to script the builds, but had only limited success doing so. He told me how much he regretted not hiring me 18 months earlier, and how great it was going to be. Eventually I interrupted him to ask if he wanted to know what kind of work I'd been doing recently and whether my skills matched what he was looking for. (Y'know, interview me, man.) He was just slightly taken aback, but asked me a few questions about what I'd been doing. A moment later he happily announced that I was exactly what he was looking for. I had the impression he had decided before the interview to hire me. He continued, telling me that all the builds were now completely scripted (even though he had just told me they had only limited success), but that somehow, though these build scripts were great time-saving tools, his build engineers couldn't keep up with the workload (i.e., the scripts weren't saving that much time). He believed they needed some kind of build server to run the scripts. We chatted about what I could do to improve the scripts and the direction we could take things in the long run. It seemed to go well, though something seemed out of place.
A week or so later I came back for a formal interview. His lead build engineer, Dirk Hobart (Hobie), was first up and he was a strange one; a tall bear of a man, intimidating with his size, shoulder-length hair, and impassive face. Twice during the interview he paused, just looking at me or in my general direction like someone had flipped his power switch. He didn't seem to be thinking or shifting the conversation to another topic. He just stopped. I sat it out, not knowing what else to do, and eventually he started talking again. When I mentioned it to Brian, he brushed it off as one of Hobie's eccentricities, which I saw numerous times later. I also met Donald McKenna that day. Dressed in jeans and a T-shirt, with a slightly out-of-style mop of hair, he was introduced as an Engineering manager and was there to find out how well I knew Perl, the language they wanted me to work in. I wouldn't say he was impatient, but he was intense, moving from topic to topic quickly, as if he had somewhere else to be. I passed his quiz with flying colors, I'm happy to say, and he was gone within fifteen minutes.
After the others finished with me, Brian came back into the conference room to talk more about a build server, trying to make a much stronger case than he had in our previous discussion. He said they had reached "the limit of what scripting could do" and needed to move on "to the next logical step, which is a build server." He said this would be the beginning of an entire automation department and even suggested I would manage it; but then he caught himself and continued talking about how great the build server was going to be. I had questions, though. "...have reached the limit of what scripting could do" turned out to be equivalent to "can't keep up with the workload." The workload at that time was 30 to 40 builds a week, divided between three build engineers, which came out to 2 or 3 builds a day per build engineer. If you write a script to perform a task, the idea is that the script runs programs and commands, so you don't have to wait around and perform each step yourself. A lot of software development consists of dozens or hundreds of small tedious tasks, most of which can easily be scripted to be run by your computer; it's the whole point of scripting. So if Brian's build engineers had scripted their builds, they should be able to run a script and then go do something else until it finishes -- like go run scripts on a few other build machines, allowing each build engineer to do 10 or 15 builds a day -- a five-fold increase. If the script could run on its own, you could even write a "wrapper script," a script that ran the other scripts. Each build machine might simply run a single script that ran 10, 20, or 100 builds in a row, running all day and night, with workload never entering into the picture. But if Brian was telling me the build engineers had to sit at a computer for an hour and a half (as it turned out) to run a scripted build...well, how scripted is that?
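For the programmers in the audience, here is a minimal sketch of what I mean by a wrapper script. It is written in Python purely for illustration and is not anything that existed at Handy; the function name and the stand-in "builds" are made up. The point is only that once each build can run unattended, a few lines of code can run any number of them back to back.

```python
import subprocess
import sys


def run_builds(commands):
    """Run each build command in order and record its exit code.

    `commands` is a list of argument lists, one per build. Nothing here
    waits for a person: a failed build is recorded and the next one starts.
    """
    results = []
    for command in commands:
        completed = subprocess.run(command)
        results.append((command, completed.returncode))
    return results


if __name__ == "__main__":
    # Stand-ins for real build scripts: each "build" is just a tiny
    # Python command so the sketch runs anywhere.
    queue = [
        [sys.executable, "-c", "print('building project A')"],
        [sys.executable, "-c", "print('building project B')"],
        [sys.executable, "-c", "import sys; sys.exit(1)"],  # a failing build
    ]
    for command, code in run_builds(queue):
        print("ok" if code == 0 else "FAILED", command[-1])
```

A real wrapper would also want logging and some kind of notification when a build fails, but the loop itself is the whole idea.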
I could see Brian was thrown as I briefly outlined these thoughts. He insisted the builds were scripted, while admitting he didn't know why the build engineers had to stay with the scripts as they ran. I continued to gently push the matter, figuring if I couldn't resolve it there and then I probably didn't want to work at Handy anyway. He finally relented and told me it was Software Engineering who was requesting the build server, so I should please forgive him if some of the details weren't quite right. No problem, I said, trying to salvage the interview. I'm the engineer, and I leave it to him to be the manager. I've worked for managers who aren't programmers and don't really understand what I do, I told him, and it's never been a problem as long as they can be clear about what their business goals are and I am able to satisfy them with solutions. He seemed relieved by this and talked about how he knew exactly what he wanted, and how Software Engineering would be very happy working with me. After a little more discussion and a bit of reluctance on his part, Brian agreed that I could start by establishing a baseline of what was actually scripted (and how much so) and what it would take to create a build server. The problems with Brian's descriptions weren't as obvious to me then as they are now. In hindsight, having interviewed with him before and knowing a fair amount about the company through Kristin, when I was faced with comments or explanations from Brian that didn't quite make sense, I tended to think something like, "Oh, but I know what that means, so I won't worry about what he said." I wanted to work at Handy, you see. Looking back on my tendency to make excuses for things I want to work out, I think I fooled myself into thinking it was better than it was. Anyway, as we set a date for me to start working, I made a mental note to meet Handy's software engineers as soon as possible and find out what they really wanted.
Two weeks after starting at Handy, Brian pulled me aside to tell me I needed to "start showing progress on the build server." When I reminded him of our agreement, he simply told me there'd been a change of plans due to the high priority of the build server. Rolling with the punches, I asked him if we could talk about some of the details we had glossed over in the interview. Agreeing, he found us a conference room where he drew a figure on the whiteboard. He became animated as he talked, his eyes lighting up and his hands moving dramatically. The machines on which the OS was built, called build machines, naturally enough, were Macintoshes. I already knew this from spending a few sessions with the build engineers in their build lab. The build server, however, was to be a Windows NT box. I was to take the build scripts, he told me, and "throw them on the build server." The server would run the first script, and then the next and the next. When it finished running all the scripts it would start with the first script again. Its name would be The Warden and it would "kick out builds day and night."
I tried to walk him gently through his own attempt at logic, hoping he would see some of the important facts that he had missed, without my having to prompt him. What he was saying in essence was, "we're going to take Mac programs and throw them on a Windows box to make them more efficient." When he didn't get it I said something like, "I don't think you can run Mac programs on a Windows box." That stopped him dead in his tracks. The contortions on his face and his widening eyes spoke volumes as it became clear he hadn't thought this through at all. "But the server has to do the builds," he said, staring at the whiteboard. I waited while he mulled this over. "What if we used Timbuktu," he asked; but Timbuktu doesn't work with Perl scripts and I told him so. "Maybe the build server could send the build script to the build machine one line at a time," he continued bravely. "It could wait until the build machine executes each line before sending it the next one, so even though the build machine is executing the steps, really the build server is doing the builds." He looked hopeful, but the Perl interpreter needs to read an entire script before it can run it. I suggested the server, through some simple communication mechanism, could tell the build machine which script to run. This would be much easier to write and test than spanning the process between two machines, and that meant I could complete it sooner. To my surprise, he dismissed the idea immediately, saying, "That can't work. Software Engineering wants a build server that does the builds."
I've worked on build systems before and I know that QA usually likes to get no more than one or two builds a week so they have time to do regression tests and submit bug reports. Engineers, on the other hand, sometimes like to have a computer doing nightly builds or even continuous builds, occurring every time someone checks in some source code. This acts as a sort of early warning system so if an engineer checks in source code with bugs, it's spotted before it gets to QA. But here Brian was talking about generating builds "day and night" to be delivered to QA. There was something fishy about the whole thing. Even putting aside the problem of a Windows box running Mac programs, there was something even more fundamentally weird going on.
In the two weeks I had been at Handy I had spent a couple of afternoons with the build engineers in their "build lab", a large, air-conditioned room with benches of monitors connected to Macintoshes on the shelves above. Handy was founded by, and largely made up of, ex-Apple employees, so it was no surprise that the Handy OS had been developed on a Mac. Hobie showed me the servers, the source code repository, and the build scripts during those sessions, while I watched him perform several builds and asked questions as he did so. Software companies frequently use build scripts to save time, ensure reliability and repeatability, and provide a measure of trust between the software engineers and the build engineers. Bugs occur, mistakes are written into the instructions, build engineers make mistakes performing builds, and that's just normal for the first part of any project. Handy's problem, as far as I could tell, was simply not having anyone available who could write a decent script. In fact, I soon learned the build engineers self-deprecatingly called themselves Build Monkeys, being neither engineers nor experienced programmers. What a build engineer does at different companies can range from simply burning CDs to actually designing the build process and tracking changes to the software and related documentation. At the higher end of the spectrum, this is often called software configuration management (or SCM or CM). Brian mentioned he was trying to get the rest of Software Engineering to recognize his department as an SCM group, but it was clear the build monkeys built software by simply following written instructions. When builds didn't turn out right, they frequently had to defend themselves against charges of not following directions, or of simply being incompetent.
The charges were valid as often as not, but you could never know right away because they didn't have a verification process to determine if a bad build was because of mistakes in execution, or mistakes in the instructions the build monkeys received.
In my first two weeks at Handy, I learned the basic steps the build monkeys had to follow for all builds.
That's a lot of information to throw at you, but all this really describes is the breakdown of a build monkey's job into six general steps. Each step begins with or consists entirely of doing something manually, like typing a command or writing an email. The simple fact that I had identified six manual steps to doing a scripted build directly contradicted what Brian had told me. But now let's take a closer look at Step 3. At most software companies engineers typically create a build script or makefile such that you could type something like "make ProjectX Debug English" to tell the computer to build ProjectX with debugging info turned on, in English. With that level of control covered, the next logical step is to write a script to run the makefile with various combinations of those commands for all the different languages, with debug info turned on or off, and set for different kinds of output files. At this point you have yourself a simple form of automation. Add in the ability for your script to perform the other five steps, and you've earned your raise for the year.
Handy's Mac-oriented culture preferred using a mouse whenever possible and so, instead of typing commands, the Handy developers made a custom menu in MPW. From the custom menu, you selected each parameter (project name, Debug or noDebug, language, etc.), and then selected Build It. This was the exact equivalent of typing "make ProjectX Debug English", with one major exception: You can't easily script selecting menu items. Thus, even though the menu system was user-friendly if you wanted to build part of your project at a time, building a larger project in its entirety could take all day. The European HandJob, for example, had 5 languages, 2 Debug modes, and 4 types of ROM output files. 5 times 2 times 4 equals up to 40 variations of selecting sets of parameters and waiting while it built. Most builds required only two or four combinations, but they were still prone to mistakes and a day full of builds like these could be painfully boring, leading to more mistakes.
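To make the arithmetic concrete, here is a small sketch in Python of what generating every combination looks like once the build can be driven by typed commands instead of menu picks. Only the counts (5 languages, 2 Debug modes, 4 ROM types) come from the story. The language list, the ROM type names, and the exact shape of the command line are assumptions made up for illustration.

```python
from itertools import product

# Hypothetical parameter values; only the counts (5, 2, 4) come from the story.
LANGUAGES = ["English", "French", "German", "Italian", "Spanish"]
DEBUG_MODES = ["Debug", "noDebug"]
ROM_TYPES = ["rom_a", "rom_b", "rom_c", "rom_d"]


def build_commands(project):
    """Return one make-style command line per parameter combination."""
    return [
        f"make {project} {debug} {language} {rom}"
        for language, debug, rom in product(LANGUAGES, DEBUG_MODES, ROM_TYPES)
    ]


if __name__ == "__main__":
    commands = build_commands("ProjectX")
    print(len(commands))  # 5 * 2 * 4 = 40 combinations
    print(commands[0])    # make ProjectX Debug English rom_a
```

Forty menu sessions become forty strings in a list, and a list is something a script can walk through without getting bored or clicking the wrong item.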
Fortunately for the build monkeys, the menu system was controlled by a script written in MPW's own shell script language. To alleviate the pain the build monkeys were feeling, one of the software engineers modified the menu script so it could be called by an external program -- like a build script. Prior to my working at Handy, the build monkeys had written a build script (the build script) that let them select all the parameters at once in a dialog box. Then they simply clicked a button and waited 30 to 60 minutes, depending on the project, before moving to Step 4. This effectively reduced Step 3, which used to consist of many -- let's call them sub-builds -- to a single step. This was a tremendous time-saver for the few projects the script would work with. But Brian was convinced that all projects had been scripted. This was further confused by the fact that Brian and his build monkeys used the term "build" for several related yet separate tasks. The build monkeys called the six steps listed above "doing a build" and they "did builds" all day. Step 3, scripted or not, was the only step where anything was actually built and each of its sub-builds was triggered by selecting Build It from the menu. Their limited vocabulary for the jobs they performed made it enormously difficult to talk with them about what the heck I was supposed to do.
The build monkeys had taken six months or so to create the basic build script they used. After it was working, they made a copy for each new project they wanted to script. It's a truism in software development that change equals risk because each change is an opportunity to create a bug; thus, minimizing change in a working program is considered a good thing. The build monkeys had never done this before and, without the experience of designing a program before you write it, they just dove in and wrote scripts. When they inevitably corrected a problem or just found a better way to do something, they would change the script, often just commenting out a bad line so it wouldn't run and then writing a new line below it, which allowed me to see the different ways they had tried to make the script do something. And because they didn't have a single script, but were customizing copies for each project, instead of making changes in one script, they had to make the same changes in all the scripts they had written. Over time they ran into a problem where the changes worked for only, say, 9 out of 10 scripts. Whatever the problem was with the 10th script, it would take time to find and correct it. As the scripts were patched and glued, they became "brittle" (each new change tending to introduce new bugs in a ripple effect) as well as different from each other. Though they were originally time-savers, the maintenance required to keep the scripts up-to-date was taking more time than was saved by using them. By the time I came to Handy, they were using their scripts for only a fraction of the projects. I think every decent programmer has written crappy programs -- it's how you learn -- but these were non-programmers who had been forced to learn a language and write some scripts. 
Any experienced programmer, myself included, could have written better versions of those scripts in a few days and they would have performed much more of the overall build process and been adaptable to different kinds of projects. They also would have been based on a single library of functions. Then you could make all your changes in one file, yet all of your scripts would inherit those changes instantly, thus avoiding the entropic process they were experiencing. I can't say if it was ever communicated to Brian that the scripts were no longer time-savers (indeed, they had only ever saved a limited amount of time), but much later I wondered if Hobie hadn't said to Brian something like, "We've reached the limit of what we can do," and it mutated in Brian's mind to wide-eyed assertions about the limit of what scripting can do.
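Here is a rough sketch, in Python, of the single-library idea. None of it is Handy's code: the function, the step names, and the project settings are all invented for illustration. The logic lives in one shared function, and each project is reduced to a few lines of data, so a fix made once applies to every project.

```python
# One shared function holds the build logic. In a real setup it would live in
# its own module that every project script imports.
def build_steps(project, languages, debug=True):
    """Return the ordered list of steps for one project (names are invented)."""
    mode = "Debug" if debug else "noDebug"
    steps = [f"sync {project}"]
    steps += [f"make {project} {mode} {language}" for language in languages]
    steps.append(f"publish {project}")
    return steps


# Each project is now just data, not a hand-edited copy of the script.
PROJECTS = {
    "ProjectX": {"languages": ["English"], "debug": True},
    "ProjectY": {"languages": ["English", "French"], "debug": False},
}

if __name__ == "__main__":
    for name, options in PROJECTS.items():
        for step in build_steps(name, **options):
            print(step)
```

Compare that with ten copies of one script: a change to the publish step here is one edit, not ten edits and a hunt for the copy that broke.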
I'd like to say here that I don't mean to knock the build monkeys or their work. They had a really annoying problem and they developed a solution for it, learning new skills along the way. Hooray! Good for them! I was looking forward to helping them improve their scripts and expand their skills, which would have made them look good, which would have made Brian look good, and which ultimately would have made me look good. Unfortunately, though Brian was thoroughly convinced that they had "reached the limit of what scripting could do" and only The Warden could make it better, the sad truth was that they had barely scratched the surface of what scripting could do and Brian was badly deluded about the build monkeys' skills, their achievements, and what course of action would benefit them.
So there I was in a conference room, looking at Brian's diagram and listening to him explain that Software Engineering desperately wanted me to run MacPerl scripts on a Windows box. I knew that wasn't going to work, but Brian was becoming defensive, so I decided to drop it for the moment. I went to the whiteboard and wrote a quick list of the six build steps I had identified, focusing on Step 3. Before I even finished explaining it, Brian simply rejected the whole thing, telling me the builds were scripted and there was no need to talk about them. I persisted, asking if he could explain what the build engineers did for an hour or more if the builds were completely scripted. He said he didn't know, he just knew the builds were scripted. I tried other tactics, coming dangerously close to calling his ideas a bunch of contradictory nonsense, but he kept responding with what appeared to be his only defense: that Software Engineering wanted this build server. During the meeting, I asked several times to speak to the engineer who gave him this design. Each time Brian insisted I didn't need to talk to him, nor even know who he was.
After hearing him insist a few more times that the build server, not the build machines, was supposed to do the builds, I pointed to the diagram and asked, "Then why are the build machines even in the diagram?" His head shot to the whiteboard and his eyes widened...again. He had no idea why his own drawing included build machines. Someone else had obviously drawn it for him. "Could it be," I asked after a painful pause, "that the build machines are called that because they build software?" "No," he said, "but it doesn't matter anyway because this is what Engineering wants and your job is to show progress on the build server." I started to say I needed more information if I was going to design a build server, but he interrupted, saying, "We don't have time for design. You just need to show progress." He left the room quickly after that.
We revisited the issue over the next couple of weeks but got no further. During this period, however, I did produce a couple of utilities for the department. One of these, a program that monitored the Perforce server, caused a stir with Brian and Hobie. Brian saw his group as the guardians and keepers of the source control system and felt no one should be able to set up new projects without his consent, or at least notification. What my monitor demonstrated was that two or three projects a month were added to Perforce, and there were a couple dozen projects they hadn't known about. While Brian saw his role in Software Engineering as centrally important, and to some extent it was or should have been, the rest of the department didn't seem to feel any need to notify him about what they were doing. I think their attitude towards Brian was tied up part and parcel with their view of the build monkeys, which consisted largely of complaints. Whoever in Software Engineering had requested a build server, I was sure he was looking to get better and more timely builds.
Just before the Christmas break, Brian told me about the Nightly Build Script, which automatically built the head revision of the "core OS" each night. The Personal HandJob, Professional HandJob, and the new wireless model, the AirKiss, as well as third-party branded versions and the different language versions, ran on slightly different hardware and had differences in the software as well; yet they were all based on the core OS. The core OS was not very useful on its own; adding language overlays, drivers, and various custom changes for each different device made up the majority of projects at Handy. The Nightly Build Script built the core OS each night and was the one piece of actual automation the build monkeys had written. It did all six steps listed above -- sort of. It always built the head revision and didn't take special instructions, but it did far more than any of their other scripts and, according to Brian, it was broken and needed to be fixed right away.
"Okay," I said, happy to have something real to do, "what's broken?" "It's unreliable," he said. "Okay, sure, fine, but, y'know, for my own curiosity, do you happen to know what's unreliable about it?" "I don't know," he said, "and it doesn't matter; you just need to fix it." Oh dear, here we go again. "If you were a mechanic," I said, "and I brought my car to you and just said, 'It's broken,' what would you fix? The brakes? The engine? A noise that only occurs if you drive over 60 MPH?" I kept at him until he finally suggested I talk to Ernie Lum, the build monkey who wrote and maintained the Nightly Build Script. Ernie had already left for the winter break, so I checked out a copy of the script and ran it to look for problems. Have you guessed yet? It ran just fine.
In early January I met with Ernie, a friendly fellow with a mop of hair that reminded me of an Asian Gilligan. He was a bit stand-offish at first, for which I blame Brian. Brian frequently discussed the build server in staff meetings, talking about how it was "going to be great when it's cranking out builds for us." Ernie in particular often asked what would happen to the build monkeys when the server was doing all the builds. Brian would say, "Oh there will be lots of things you can do," and then shift uncomfortably in his chair as he changed the subject. The build monkeys were clearly worried about this build server and, in a very real sense, I was the enemy -- gaining their trust was not easy. Getting useful information out of them was even harder. As we talked in our meeting that morning, I kept waiting for Ernie to mention problems with the Nightly Build Script, which he did not. When I finally brought it up, Ernie said he didn't know what I was talking about. Somewhat defensively, he showed me the nightly build logs, the network folder where builds were deposited each night, and copies of the emails that were sent. I had already seen the script running myself, and my meeting with Ernie simply confirmed that the Nightly Build Script was doing its job.
I did not look forward to reporting this to Brian, who got quite irate when I did. "Don't listen to Ernie," he whined, "the Nightly Build Script is broken and you need to fix it!" I asked to talk to the person who said it was broken (he would say only that Software Engineering wanted it fixed). He insisted I didn't need to know who that engineer was, either. I insisted that I couldn't fix something if I didn't know what was broken; but it fell on deaf ears. My job was to simply "start showing progress," a phrase that was beginning to get on my nerves. Fighting the urge to quit on the spot, I eventually said something about poking around the code and maybe cleaning it up a bit. "Yes, do that," he said and left the room. Thus began the great emergency of poking around the Nightly Build Script to see if I could clean it up a bit. Looking for another job was also starting to look like an emergency and I started that as well.
To say the Nightly Build Script was badly written was like saying the Titanic ran into a little trouble. I will try to explain this for readers who don't know or care about programming, but the true horror of the loops within loops of ill-thought-out programming doesn't translate well. For starters, there were no comments in the script to explain what Ernie was attempting, and much of the script consisted of running equally uncommented MPW shell scripts with obscure names that shed little light on what they did. His idea of using functions was something to behold. Someone had shown him how to pass variables into a function, but the way he wrote the script everything was a global variable (globals don't need to be passed anywhere). So he'd pass variables into a function, and then auto-create new global variables within the function. When the function returned, it did not return these new values to the rest of the program (returning values is part of good programming), but other functions would use them anyway because they were globally available. I stared at the debugger for hours trying to track down where these variables were created and used.
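If you do care about programming, here is a tiny Python sketch of the alternative to that global-variable habit. It is a made-up example (the function and the label format are invented, and the original script was not written in Python), but it shows the basic contract: values go in as arguments, results come out as return values, and nothing appears out of thin air elsewhere in the program.

```python
def next_build_label(project, last_number):
    """Work out the next build number and its label.

    Both results are returned to the caller; no global variables are
    created or modified along the way.
    """
    number = last_number + 1
    label = f"{project}-build-{number}"
    return number, label


if __name__ == "__main__":
    number, label = next_build_label("ProjectX", 41)
    print(number, label)  # 42 ProjectX-build-42
```

With code written this way, finding where a value came from means reading one function call, not staring at a debugger for hours.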
An especially fun puzzle was trying to understand why some text was read into the program, written out to a file, read in again, and written out again, only to be read in again later. It turned out one of the MPW shell scripts printed a lot of text to the screen (the entire output of the build, actually). Ernie wanted to search that text for error messages; so he made the script capture the output as it was printed. This captured output was one big block of text and he only knew how to search one line of text at a time. So, after capturing the block of text, he wrote the block to a file and then opened it again so he could read it one line at a time. As he looked at each line in the file, he searched it for the word "error" and, if it contained "error," he would keep the line for use in the email later. Normally you would use an array to keep the lines around in memory, but Ernie didn't know how to do that either; so he wrote the error lines to yet another file. It was very satisfying when I later rewrote that code in one line without the need for temporary files. Two and a half weeks into this mind-bending work, I was asking Ernie about some inconsistencies between what the Nightly Build Script seemed to be doing and the results I actually found on the file server. Ernie noticed something odd and, after thinking a moment, announced that the script I was working on was three months and a couple of versions out of date! Grrrr...
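The error-scanning step described above needs neither temporary file: the captured output can be split into lines in memory and filtered in a single expression. What follows is only a minimal sketch of that idea, written in Python rather than in the language of the original script. The function name, the sample output, and the case-insensitive match are all assumptions made for illustration, not a reconstruction of the actual rewrite.

```python
def error_lines(build_output):
    """Return every line of the captured build output that mentions "error"."""
    # One expression: split the block into lines and keep the matching ones.
    # Matching case-insensitively is an assumption made for this sketch.
    return [line for line in build_output.splitlines() if "error" in line.lower()]


# Made-up sample standing in for the text a build prints to the screen.
sample_output = """Compiling main.c
main.c: error: missing semicolon
Linking app
Link error: undefined symbol
Build finished"""

for line in error_lines(sample_output):
    print(line)
```

The list that comes back is the "array to keep the lines around in memory," ready to be dropped into the body of an email.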
When this adventure started, Brian had told me the latest version of the script was always in the Perforce source repository. He also told me which server the Nightly Build Script ran on. (For what it's worth, the Handy build monkeys called just about every computer a "server." In reality, the servers were just Macs that had a dedicated purpose.) I talked with Ernie specifically about what version of the script I was using and which "server" it ran on, yet he never mentioned I might have the wrong version. It turned out Ernie rarely checked his code into Perforce (a major no-no in software development and matter of ongoing conflict between Brian and Ernie) and it had been several versions since he last did so. The only current copy existed on the Nightly Build Server (the Mac that ran the script). However, this was further complicated by the fact that the server I was looking at was no longer the Nightly Build Server! The old Nightly Build Server had developed hard drive problems and for some months Ernie had been running the Nightly Build Script on a different server -- which he also called the Nightly Build Server. He said he forgot to tell me. So, after verifying I now had the right script and was looking at the right server, I checked the script into Perforce and started again to try to find something broken in the Nightly Build Script. I was not happy.
The newer Nightly Build Script was, if anything, a worse piece of crap than the older one. The first thing I addressed was the complete lack of error-checking. When, as occasionally happened, the Perforce server was down or the network was down or Ernie did something wrong or whatever, the Nightly Build Script should have noticed and quit with a polite message. Instead it would barrel on like the Little Engine That Could, trying to build code that wasn't there, trying to copy non-existent files to the network, and eventually spewing a bunch of garbage into an email and sending it to the project team with the subject line: BUILD SUCCESSFUL!!! When I started to understand the scope of the horror I was working with, I told Brian it would take me less time to just rewrite the damned thing from scratch than to try to fix its awful, twisted code. I told him it didn't do that much, but it did it in such a convoluted way that if I tried to make anything more than small changes that had no real effect, I ended up breaking the script for days at a time (the same problem the build monkeys were having). In fact, when I stopped working on it, the only real change I had made was to disallow undeclared global variables (with the use strict pragma). Brian told me we didn't have time to rewrite the script. I repeated that I was saying we could save time by rewriting it from scratch, but he said we couldn't do that. We had to fix this script, he said, because that's what Software Engineering wants. I didn't even argue. I knew it was a crock, but the job market had changed and I wasn't finding the job leads I used to. I just kept working.
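The missing error-checking described above comes down to one rule: stop at the first step that fails and report that failure, instead of carrying on and announcing success. Here is a minimal sketch of the rule in Python (not the language of the original script). Every name in it, including BuildStepError, run_steps, and the two pretend build steps, is hypothetical and exists only for illustration.

```python
class BuildStepError(Exception):
    """Raised by a build step that cannot do its job."""


def run_steps(steps):
    """Run (name, function) pairs in order and stop at the first failure."""
    for name, step in steps:
        try:
            step()
        except BuildStepError as exc:
            # Quit with a polite message instead of barreling on.
            return f"BUILD FAILED at '{name}': {exc}"
    return "BUILD SUCCESSFUL"


def sync_source():
    # Hypothetical step: pretend the source server is unreachable tonight.
    raise BuildStepError("source server is not responding")


def compile_project():
    # Never reached in this example, because the earlier step failed.
    print("compiling...")


status = run_steps([("sync source", sync_source), ("compile", compile_project)])
print(status)
```

The point of the design is that the subject line of the email is derived from what actually happened, so a night with no source code can never be reported as a success.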
By mid-March I put the script into testing. I did clean it up quite a bit, but other than turning off globals, the only real change I made was telling the script to stop and send Ernie and me an email if it ran into problems. I also added a "test mode" to put the results in a different place and only send emails to Ernie and me. This way we could test it and it could spew garbage to its heart's content without anyone having to see it until it was working. Much later, I concluded that its tendency to send garbage emails to teams, while claiming it as a major success, was the basis for its alleged "unreliability." Whenever Ernie made changes, he had to run the script to test it, sending emails full of garbage to the project teams several times a day. By putting an end to that, I suppose I actually did fix the damned thing; but even when I did figure it out, I enjoyed playing dumb with Brian whenever we talked about it. You see, I think he told so many people what a great job I was doing that it got to be a habit, and he occasionally told me I had fixed the script, as if I hadn't heard about it yet. I would ask him what he thought I fixed, but he would just get grumpy as he was forced to admit he had no idea. When I told him that there were no substantive changes, that most of my work was spent trying to get it to work again after I broke it, he would just shake his head and say, "No, Dave, you fixed it. I know you fixed it."
When testing was well under way, Brian pulled me aside to ask how progress was going on The Warden. I politely reminded him that I had been working full-time on the Nightly Build Script for the last couple of months. He told me we really needed to start showing progress on the build server and he was "very disappointed with our progress." He continued, telling me that when I started at Handy, he expected I would "sit down at a desk and start showing progress on the build server right away." We chatted about it a bit, during which he was very clear that he really thought design was a waste of time, and that the build server was such an obvious thing that my lack of experience with Macintoshes, MPW, Jam, CodeWarrior, and various Handy home-grown tools, not to mention my not knowing what exactly they wanted me to do, shouldn't have slowed me down. Despite all our arguments over whether a Windows box could run Mac programs, Hobie and Ernie's reluctance to give me the simplest of information I needed, and strong evidence that the existing build scripts wouldn't work even if I could create his magical nonsense server, he still couldn't understand why I hadn't shown more progress.
Taken a step at a time, Brian understood that every download, compile, link, and analysis had to occur on a Macintosh; but he always concluded that the actual automation should occur on a Windows PC. By this point I was immensely tired of even thinking about it, so I tried a different approach. "Let's turn this around," I said. "If Software Engineering saw progress, what would they see?" Brian loved the question and immediately described his vision of The Warden from the perspective of its users. "First, they would go to the server," he began. "Go where?" I asked, "And how?" "They would go to the server with a web browser," he said. "They'll open a web browser and see a drop-down with all the projects available. You select one and a data sheet for that project appears, showing all the information about that project. On the data sheet is a link to a Build Request Form. You fill out the form and click a button and the server does the build." His eyes had that light again. Maybe he was hoping to impress me sufficiently that, seeing the light, I would go out and deliver his build server unto him. And yet...
"You want a web server???" I cried out, astounded. In the four months I had been at Handy, he had never mentioned any kind of user interface, let alone a web browser; yet he was now implying it was what he wanted all along. Even in his brief description, it was clear it would need a database to store the information, various kinds of security to control access, and lots of screens to request or present the information used to track projects. Forget about the Mac vs. Windows problem a minute. Just this piece of the application he was describing would take a few months to develop, along with a design (whether he liked it or not) and a database administrator to design the schema for me. I can build web servers and I might even have been able to build this, but -- I have to say it again -- four months into the job, I had never once been aware that he expected me to build a web server.
But he had another surprise for me. "No, there is no web server," he said. Huh? Okay, let's take this slowly. "But there's a web browser," I said. Oh yes, he agreed, a web browser was definitely part of it. "And the web browser displays web pages, right?" Again he agreed, still missing my point. "And you click on a web link to get to the web form, right?" He agreed, shifting a bit in his chair and starting to look impatient. "And the web form has to be processed by a web server, right?" "No!" he said, slamming his hands on the table, "this is a build server. A build server! There is no web server."
Okay kids, web browsers talk to web servers. Just trust me on this. And as sure as the sun will rise tomorrow, I'm sure Brian was describing from memory a web application someone else had described to him; but somehow he had gotten it into his head that a single machine should do everything ("One server to bring them all and in the darkness build them..."). The build server, in Brian's mind, was nothing more than a Perl script on a Windows box, yet it was supposed to talk to web browsers and build software, regardless of the fact that the projects had to be built on a Mac. As we talked further, he described the Perl script running in a browser on the developers' workstations (that doesn't work either), but as he continued digging...er...talking, he started describing the build server (the Perl script on the Windows PC) doing builds on the build machines (Macintoshes), which had apparently re-entered the picture. When I asked if there was a possible confusion about where the script was supposed to run, he said the engineers would run the script on the server by running it in a browser on their workstations. Then, after he thought a moment, he told me the build scripts would run on the Windows server but execute on a Mac build machine. (Before you ask, running a program and executing it describe the same thing.) As carefully as I could, I described back to him what I thought I heard him saying: "An engineer will run a Perl script on his machine and it will appear in a web browser. This script will also be running on the Windows build server, in fact it is the build server, and it will run MPW, the compilers, linkers, and MacPerl scripts on a Macintosh." I hoped that made no sense at all and would make him think. But he said, "Yes," triumphantly, "that's what I want you to build. I want the script to be cross-platform." *sigh*
For what it's worth, cross-platform is usually taken to mean different versions of a program running on more than one platform (like MacWord vs. WinWord) that can use the same file format (a Word doc, a spreadsheet, etc.). It does not mean the same program running on multiple platforms simultaneously. Even though he had just verified he wanted a little Perl script to take the place of developing a complex application to perform complex actions, apparently just by claiming it was possible, and he wanted it to magically run in many places at the same time, the more he shared of his half-baked vision the more I could see the outlines of what must actually have been described to Brian by his mysterious Deep Throat. I just wish it didn't have to be at the cost of watching him get further and further from reality.
I believe I know exactly what Software Engineering wanted. In Build Engineering, each development team was given an assigned build day so the build monkeys could manage their workload. The developers didn't work on a weekly schedule and didn't like having to wait for their special day; and the builds were occasionally late, even when they were built correctly. I believe what Software Engineering wanted was simply a self-serve system; that is, they wanted a way to bypass the build monkeys and do builds on their own schedule. Brian probably told his Deep Throat, as he told me, that the builds were scripted. Being, I assume, an experienced software engineer, Deep Throat would have begun envisioning, and talking out loud about, a way to fully automate the builds via a build server, taking the manual steps (and thus the build monkeys) out of the equation entirely. The easiest way to provide a self-serve system would be to set up a web application that passed build requests to the build machines via a Perl script, telling them which script to run, checking their status, writing and sending emails, etc. It needn't have been limited to Macintoshes, either (I would guess this is where Brian got his ideas about cross-platform functionality), since the build server would only be passing information (not doing the builds, thank you very much). Deep Throat probably told Brian it would be easy to do this and shouldn't take much time at all, which is true, assuming Brian would understand how all this would have to work. I can even hear him saying the words, "Any decent Perl programmer could create a server to do this." Frankly, it would have been a great plan except that Brian could only parrot what he could remember of Deep Throat's words and attempt to hold me to them. Another downfall of the plan was its being based on the belief that the existing build scripts did much more than they did.
But the place it really got tangled up was between Brian's refusal to admit when he didn't know something and his obsession with The Warden doing all the work (you have to wonder what he had in mind with that name). The one-machine model, even if everything were running on the same OS, was not nearly as efficient as the multi-machine model. I was blown away to discover Brian never mentioned the web interface (because he didn't think it was a web interface), but I had to laugh as the surprises kept coming. In a supremely ironic turn, a server already existed that did 90% of what Brian wanted!
Hank Mustendamp, who also worked for Brian, had supervised the development of HandySCM, a web application Engineering used to create build requests, replacing the email-based system they were using when I came to Handy. The information needed to do a build came from several different people on a development team and sometimes more than one of them took responsibility for submitting build requests. Consequently, the build monkeys often got incomplete requests or repeated requests for the same build, and following up on these just added to their workload. HandySCM addressed this problem by giving project teams a single place to enter and review information until it was complete. Only then would the request be forwarded to the build monkeys. It didn't build projects or tell build machines to do builds, but HandySCM provided a much better system for tracking builds than email, and allowed teams to check the status of builds without having to go ask a build monkey.
At my next meeting, when I had had time to digest Brian's latest crackpottedness, I reminded him that HandySCM was already in testing and could be modified to run builds far more quickly than developing the same thing from scratch. It was perfect for it and I would only need to modify the build scripts to talk to HandySCM.
"We can't do that," he said, "The Warden has to be a separate server." "Why," I asked. "It just does," he said. "Well I'll have to work on the build scripts anyway, because they don't do..." "Actually," Brian said, "I don't think we really need to use build scripts, anymore." He went on to explain that he had also been thinking since our last meeting. Having concluded that a single Perl script could run simultaneously in a browser, on a Windows "server" and on a Macintosh, he was now certain that we didn't need build scripts at all. "The build server will just do the builds." I smiled and told him I'd get to work on it right away and then went to tell Kristin I'd be quitting that day; but she talked me out of it.
Back in February, Kristin, Brian, and the other QA managers (Brian reported to QA, even though he wasn't a testing manager) had visited our French development office. This office was actually a small software company that Handy had bought and turned into a development unit. While there, Kristin had noticed that they had some talented testers but no QA management experience. Also during this period, Handy was deciding to merge its two main divisions. When it became final, one of the outcomes (that Kristin learned while she was in France) was that her position was redundant and she was losing her job to her counterpart in the other division. No one wanted to fire her, but the positions they suggested she could fill didn't fit her skills or desires. When she returned, she asked me what I thought of living in France. I kind of had Italy in mind, to tell the truth, but no one was going to pay us to live there so we began discussing how we could make France work. One of the first things I did was to call Brian at home, figuring a major obstacle would be his refusing to allow me to transfer (though I considered quitting to make the trip happen). To my utter amazement, Brian agreed immediately, saying he saw no problem with my working on the build server from France.
So when I went to tell Kristin I was quitting, she reminded me that a) Brian didn't have the stomach to fire anyone, let alone fire me while we were in France; b) if the test center was a success, she would be on the fast-track to a directorship and a higher salary; and c) living in Europe was a major check-off item on our lifetime to-do list and someone was offering to pay us to do it. She was right. If Brian wouldn't tell me who was asking for his build server, and remained unmoved by my telling him his ideas would never work, then whoever was pressuring him for it would just have to wait. In the next few weeks, it began to look quite likely that we would be sojourning in France by summer. Kristin's proposal to manage the French testing group was greeted with enthusiasm, and soon it was suggested that she run a European test center. I avoided Brian during this time, while I tried to find something I could call progress. By the end of April Kristin was approved to manage the test center and I was approved to transfer there with her. We took a deep breath and started to learn French for our June departure.
--Dave, 30 April 2003