Tuesday, November 12, 2013

It's *not* rocket science

The healthcare.gov start-up debacle -- about which I'll spare you my political thoughts -- has been of late (hah!) a major topic in the mass media.

Said fiasco has also provided fodder for late-night comedians and, not surprisingly, the Onion: "New, Improved Obamacare Program Released On 35 Floppy Disks." And fodder, too, for tech speculation, as in this from IEEE Spectrum: "The Obamacare Rollout: What Really Happened?"

Rather than become the zillionth-plus-first commenter on the botched roll-out, I decided instead to vent re the problems more generally encountered in software. My opinion is, I shall maintain, an informed one. I have an MS in computer engineering. I once programmed for a living. For many years after I stopped coding (other than, on occasion, recreationally) I managed software- and systems-development organizations, both in the private sector and under contract to several federal agencies (most notably, NASA). Several of those systems were Internet-based, very large, distributed -- or all three.

So what about the state of modern software bemuses (but not amuses) me?

[Figure: Venn diagram. Caption: AND search is the intersection]
Let's begin with search-engine quirks. Indexing the Internet and quickly searching those indices is a hard task. I don't doubt that. I do wonder why, when I specify a Boolean search such as "item-1 AND item-2," search engines sometimes return more hits than a search on just one of those items. As a matter of logic (see nearby Venn diagram), "item-1 AND item-2" should never return more hits than the lesser of "item-1" or "item-2."
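
For the skeptical, here is a minimal sketch of that logic in Python. Everything in it is an assumption made for illustration: the three toy documents, the in-memory index, and the helper names `build_index`, `search`, and `search_and` are hypothetical, and no real search engine is claimed to work this way. It shows only that when AND is computed as a set intersection, the combined hit count cannot exceed the smaller of the two single-term counts.

```python
def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(doc_id)
    return index


def search(index, term):
    """Return the set of document ids that contain the term."""
    return index.get(term.lower(), set())


def search_and(index, term_a, term_b):
    """Boolean AND: only documents that contain both terms."""
    return search(index, term_a) & search(index, term_b)


# Toy corpus, invented for this example.
docs = {
    1: "rocket science",
    2: "computer science",
    3: "rocket engine software",
}
index = build_index(docs)

hits_a = search(index, "rocket")
hits_b = search(index, "science")
hits_both = search_and(index, "rocket", "science")

# The intersection can never be larger than either set on its own.
assert len(hits_both) <= min(len(hits_a), len(hits_b))
```

A real engine may estimate counts or consult different index shards per query, which might explain the anomaly, but it does not make the reported numbers any less wrong.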

I likewise marvel that changing a search's display preference between "sort by relevance" and "sort by date" often changes the number of search hits. And at how the first screen of retrieved items sometimes indicates one number of result pages to follow, but the page count decreases as I page through. On one recent search, page one of the results indicated ten pages to follow, although only six did.

It's not only search engines that demonstrate inexplicable behaviors. Consider an all-about-books service of which I am a member. Through that site I can see the number of reader ratings and reader reviews across all my titles, which is nice, and the aggregate rating. I can see the ratings breakdown and individual reviews for each book. However:
  • When the aggregated review count bumps up, there is no way to go directly to the latest reviews. One must hunt title by title.
  • Within a title, the sort-by-date option for reviews is, at best, 50% reliable. As I type, one title's supposedly date-sorted reviews include the sequence: 8/2, 7/22, 8/3, 7/15, 6/30, 6/27, 7/1 ...
  • The overall rating shown for me as an author includes ratings of books I wrote or co-wrote -- fair enough -- but also of magazine issues in which a story, article, or guest editorial of mine appeared, and of anthologies to which I contributed but one of the many stories. Combining those disparate categories renders the overall rating less than useful.
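
Sorting by date is about as solved as problems get. As a minimal sketch in Python, take the review dates quoted in the list above. The year is an assumption (the site showed me only month and day), and the helper name `is_newest_first` is hypothetical. The check confirms that the sequence as displayed is not in date order, and a single standard-library call puts it right.

```python
from datetime import date

# Month/day pairs copied from the sequence quoted above.
# ASSUMED_YEAR is an assumption; the displayed dates carried no year.
ASSUMED_YEAR = 2013
shown = [(8, 2), (7, 22), (8, 3), (7, 15), (6, 30), (6, 27), (7, 1)]
review_dates = [date(ASSUMED_YEAR, month, day) for month, day in shown]


def is_newest_first(dates):
    """True when each date is no older than the one that follows it."""
    return all(a >= b for a, b in zip(dates, dates[1:]))


# What a working sort-by-date (newest first) would produce.
newest_first = sorted(review_dates, reverse=True)

assert not is_newest_first(review_dates)
assert is_newest_first(newest_first)
```
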
Then there is the mystery of ebook sellers on whose sites different editions of the same book show many of the same reviews, confirming that the editions are linked, but not always all the same reviews or the same number of reviews.

One must wonder why, when accessing web-based email, search sometimes retrieves a filed message and sometimes doesn't. Or why a request to refresh my inbox sometimes gives a warning about resending personal data and sometimes doesn't. Or why recurrent spam (consistent subject line and keywords) sometimes gets caught by spam filters and sometimes doesn't.

As one must wonder about the e-card site that, after the shopper sends an e-card, insists on restarting the selection process from scratch before a second card can be selected -- even a second copy of the first card.

And don't get me started on how, after twelve years of patching, Windows XP and Office XP continue to have security holes and other bugs. Or, say, the recent Adobe breach, the latest news about which indicates 152 million compromised user accounts.

I could go on -- oh, how I could go on! -- but I'll control myself. My point -- yes, there is one -- is that decent (productive, efficient, fault-tolerant, scalable, and secure) software-based systems are hard to develop. They can be harder still to maintain as needs evolve. "Computer science" is in no way a science; it's an engineering discipline (and, to a degree, still an art) -- and an immature discipline at that.

But the path to decent software is not mysterious. As we've known for decades, doing software-based systems right means: well-thought-out, user-vetted requirements; thoughtful, modular design (of the hardware suite, too); careful programming; thorough testing (of many kinds, at every level of integration); a rigorous change-control process; a robust process for problem reporting and repair -- and reviews at all stages.

The software-based systems with which, more and more, we interact every day, and upon which, more and more, we rely every day, all too seldom get built that way ...
