Tag: Are Right A Lot

A Tale As Old As 2001

For the next week or two I’m going to go back through my old drafts and finish them up. That means the stories are at least a year or two old. For this one, I’m curious if Edge finally changed the behavior. Anyone want to try it out?

When you’re debugging a pernicious issue, there’s no greater feeling than Google search auto-completing your first couple of search terms and matching a page that describes your problem to a T. The challenge, of course, is figuring out those magic couple of words.

The team was recently trying to figure out an IE11-only problem (ugh) where our authentication mechanism was failing, but only for a subset of customers, with no obvious commonality. The server would return a Set-Cookie header, but the browser completely ignored it. WTF, Microsoft!

We’d spent an entire day trying to come up with a solution, until finally stumbling into the root cause: underscores in the subdomain. Chrome and Firefox are cool with them, but IE silently refuses to store cookies when they’re present. The details are a fascinating combination of unexpected side effects from a bug fix, misinterpreted web standards, and lingering backwards compatibility. This post captures the story nicely.

My product manager had never been thrilled with the way we’d been handling domain names. While I couldn’t have anticipated our design would lead to this misadventure (and a simple s/_/-/ solved the problem), I probably should have given his critique a closer listen.
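
For the curious, the substitution amounts to something like the sketch below. This is only an illustration, not our actual code: the helper names and the example hostname are made up, and it assumes the subdomain arrives as a plain string.

```python
def has_underscore(hostname):
    """Return True if any label of the hostname contains an underscore."""
    return '_' in hostname


def replace_underscores(hostname):
    """Swap underscores for hyphens in every label of a hostname."""
    return '.'.join(label.replace('_', '-') for label in hostname.split('.'))


# Hypothetical customer subdomain, used only for illustration.
example = 'acme_corp.example.com'
if has_underscore(example):
    example = replace_underscores(example)
print(example)
```

Doing the replacement once, at the point where the subdomain is generated, keeps every downstream URL consistent.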

It’s Been Awhile

Howdy friend. It’s been quiet here for some time now, but as is typical around a new year, I’m renewing my efforts to stay active on this blog (especially since I’ve mostly stopped using social media). This is due in no small part to my now working for AWS Professional Services as a Senior Consultant in the public sector, a role for which improving my writing will be particularly valuable.

My silence should not be interpreted as inactivity, because a heck of a lot has gone down since I last posted:

  • Got promoted to the Director of Engineering for a 20+ person team (this actually happened in late 2017 but I’ve never mentioned it here)
  • Led that team through a painful acquisition process that required reducing the team by about a third
  • Experienced the joy of having a paycheck delayed by two full weeks during the holiday spending season
  • Celebrated my 40th birthday with a trip to Germany and Ireland
  • Was laid off when my employer ran out of money, without warning and with no final paycheck (much more could be said about this, but I’m going to keep it short for now)
  • Dipped my toes into independent consulting for a few months while searching for a new job
  • Was hired by Amazon as a Senior SDE to work on their Last Mile team (the folks that get packages from delivery stations to your doorstep)
  • Transferred to AWS as I mentioned above

Pretty bonkers 18 months, but things are starting to settle down, and I’m eagerly anticipating the new normal of 2020. More to come!

Truth And Consequences

Yesterday I was in an all-day meeting preparing for a large customer demonstration. I ran into a bug that turned out to be a misunderstanding of how JavaScript handles truthiness. Consider the following code:

if (person.address) {
  console.log('Address is ' + person.address);
}
else {
  console.log('No address.');
}

Seems clean enough, right? Not so fast. If person.address is null, no problem. However, if person.address is an empty object, it evaluates to true, and the code fails to do the right thing. To me at least, this is non-intuitive behavior.

I started this blog with a discussion of why I love Python, and once again it behaves more intuitively. Empty dictionary, empty list, None, and empty string all evaluate to False. So the code works in a broader variety of cases:

if person['address']:
    logging.info('Address is ' + person['address'])
else:
    logging.info('No address.')

Isn’t that nice and clean?
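
If you want to convince yourself, here is a quick sketch that runs the same check against each of those "empty" values. The describe_address helper is a name I made up for illustration, and it uses dict.get so that a missing key is treated the same as an empty value (an assumption; the snippet above would raise a KeyError instead).

```python
def describe_address(person):
    """Return a message based on the truthiness of the person's address."""
    address = person.get('address')  # None if the key is missing
    if address:
        return 'Address is ' + str(address)
    return 'No address.'


# None, empty string, empty list, and empty dict are all falsy in Python.
for empty_value in (None, '', [], {}):
    print(repr(empty_value), '->', describe_address({'address': empty_value}))

print(describe_address({'address': '1 Main St'}))
```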

As an aside, most developers have become so accustomed to bracketing punctuation (e.g. braces, semicolons) that they assume there’s no other way. Personally I’ve come to love the lack of noise in Python syntax, and I think you will too.

This Is A Post

A friend of mine suggested to me a few days ago that the recent Apple vulnerability might have been avoided if the (supposed) offending code had been commented. Perhaps, but perhaps not.

Code comments are a tricky business. Everyone knows they’re a “good thing” but that doesn’t mean every comment is a good one. Blindly adding them in quantity can actually make code legibility worse. I don’t go as far as Uncle Bob, however, who considers every comment “a failure to make the code self-explanatory.”

For me, a great rule of thumb is that code itself should be expressive enough to communicate the “what”, and comments should be used to explain the “why”. An example is instructive:

# If user is root and there is no root password, don't do the thing
if user == 'root' and password is None:
    dontDoTheThing()
else:
    doTheThing()

See how the comment doesn’t provide any information beyond what the code says? Pretty unhelpful. What a developer who’s asked to maintain this code needs is context, like the following:

# By default an installation of macOS does not set a root password, thus root
# should never be used as a privileged account unless a password has been set
if user == 'root' and password is None:
    dontDoTheThing()
else:
    doTheThing()

Much better. A comment like that, and maybe Apple doesn’t end up in the headlines.

Phantom Fix

I’ve got good news and bad news. The good news is that the system is working now. The bad news is that we have no idea why.

The above scenario plays out regularly in the life of a software developer, and it’s infuriating. In some sense it’s worse for a problem to disappear without warning than it is for the problem to persist, because without reproducibility it’s nearly impossible to determine the actual cause of the issue.

This happened to me after losing a big chunk of my Saturday night. Essentially the issue fixed itself after several retries. No idea at all as to why, but at least I could finally go to bed.

The Herpes Of Version Control

I love git. Truly. Once I got over the fear of learning something new and dove in, everything else paled in comparison. Maybe it’s just because I was a math major and directed acyclic graphs are cool. Maybe it’s because everything about it makes sense (kinda like Python in that respect). Maybe it’s because it’s the foundation behind open source software development sites like GitHub.

But this post is not about how much I love git, or why you should use it over the alternatives. It’s about one particular feature that drives me crazy at times, and that’s tags. Yes, those little things you use to mark important places along the development tree. They’re quick and easy to make (especially by CI tools that vomit them out daily), and really useful to have around. But the darn things are nearly impossible to get rid of.

It’s not that the implementation is confusing or doesn’t have logical justification (it does). But the distributed nature of git means they spread like wildfire, and are incredibly difficult to delete across all cloned repositories. You think you’ve gotten old ones deleted, and then some poor developer who hasn’t cleaned up his repo (or even worse, the local copy of a repo on your CI server) pushes and they all come back. Argh.

For a guy who is as anal-retentive as they come about keeping his repos tidy, tags are just the worst.

Ain’t No User Here

Here’s a new one. I was debugging a problem with a web server that hadn’t been used in a while. It allowed users to log in, but failed to perform a number of other functions (gotta love an API that happily returns a 200 OK even though it’s clearly not working because there’s no data in the response). No one could think of anything that had changed, but clearly something had.

After some spelunking through logs, I found some DB permissions errors. Obviously some DB calls were working, so that was odd, as I knew credentials were only specified in one place.

Another hour of research later, I discovered that MySQL views by default run as the user who created them (the definer), not the user who queries them. This actually makes sense for use cases where you want to give controlled read access to parts of the DB to a less-privileged user. Logically then, if that user is deleted, the view fails to run. And in this situation that’s exactly what happened.

There are a number of lessons to be learned from this situation:

  1. A user probably should fail to delete if it owns other objects in the database. Otherwise unexpected side-effects occur.
  2. Accurate error messaging is essential to debugging. When trying to run the view, I got a simple “user does not have permission” error, which told me nothing about the underlying problem and cost me another hour or two of research. This goes for APIs especially. Please know when to use a 4XX vs. a 5XX error, and even better, learn the subtle differences between 502, 503, and 504 and when each should be used.
  3. Don’t try to diagnose problems on Friday afternoons before holiday weekends. You’re asking for trouble.
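
On the status code point in lesson 2, here is a rough sketch of how an API might map failure types to responses instead of returning an empty 200 OK. The failure names and the status_for_failure helper are hypothetical; only the HTTPStatus constants come from Python's standard library.

```python
from http import HTTPStatus

# Hypothetical failure categories mapped to the status an API should return.
FAILURE_STATUS = {
    'bad_request': HTTPStatus.BAD_REQUEST,                   # 400: the client sent something invalid
    'forbidden': HTTPStatus.FORBIDDEN,                       # 403: the client may not do this
    'upstream_invalid_response': HTTPStatus.BAD_GATEWAY,     # 502: an upstream server answered with garbage
    'temporarily_unavailable': HTTPStatus.SERVICE_UNAVAILABLE,  # 503: overloaded or down for maintenance
    'upstream_timeout': HTTPStatus.GATEWAY_TIMEOUT,          # 504: an upstream server never answered in time
}


def status_for_failure(kind):
    """Return the HTTP status for a failure kind, defaulting to 500."""
    return FAILURE_STATUS.get(kind, HTTPStatus.INTERNAL_SERVER_ERROR)


print(int(status_for_failure('upstream_timeout')))
```

The point is simply that 4XX tells the caller to fix the request, while 5XX tells them the server (or something behind it) is at fault.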

Do The Right Thing

I love it when a programming language behaves in an unsurprising and helpful way. This is Python in a nutshell.

Consider the following situation I came across yesterday. I needed to pair off two lists. Naturally, Python has a built-in function for this:

x = [1, 2, 3]
y = ['a', 'b', 'c']
list(zip(x, y))
# [(1, 'a'), (2, 'b'), (3, 'c')]

But what happens if the lists are different lengths (as they were in my case)? Python does the logical thing, and stops with the shorter list:

x = [1, 2, 3]
y = ['a', 'b', 'c', 'd', 'e', 'f']
list(zip(x, y))
# [(1, 'a'), (2, 'b'), (3, 'c')]

This is exactly the behavior I needed, no special case handling required. Woot!
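
And if you ever need the opposite behavior, the standard library has that covered too. The sketch below contrasts zip with itertools.zip_longest, which pads the shorter list instead of truncating. I didn't need this in my case; it's here only for comparison.

```python
from itertools import zip_longest

x = [1, 2, 3]
y = ['a', 'b', 'c', 'd', 'e', 'f']

# zip stops at the end of the shorter list.
truncated = list(zip(x, y))

# zip_longest keeps going and fills the gaps with fillvalue.
padded = list(zip_longest(x, y, fillvalue=None))

print(truncated)
print(padded)
```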

Gimme That Foot Gun

The past couple of days I wrote about the dangers of providing too much functionality. In a fit of cognitive dissonance I now want to contradict myself and demand dangerous power when it suits my needs.

I’ve been working the past week to get a particularly gnarly application running in a set of docker containers. There are over a dozen services, plus a Rabbit queue and a database. Many of the services do not handle database connection failures in a robust way. During my testing I wanted a simple way to ensure they waited a bit before trying to connect, as the database container needs a minute or so to seed itself and get ready for connections.

Unfortunately, docker-compose does not have any form of manual startup delay feature. This is by design, as the Docker team (rightly) argues that having services intolerant to connection failure is a bad thing. However, it’s frustrating to not have the power to do the wrong thing in the short term.

Then again, it turns out it wasn’t too tough to augment my compose file with a depends_on clause that includes a health check, which is a more reliable solution anyways.
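
Of course, the real fix is for the services themselves to tolerate a database that isn't ready yet. Here is a minimal sketch of what that could look like, assuming the service is written in Python and its connection attempt raises ConnectionError on failure. The connect_with_retry helper and its parameters are my own invention, not something from the application or from Docker.

```python
import time


def connect_with_retry(connect, attempts=5, initial_delay=1.0, sleep=time.sleep):
    """Call connect() until it succeeds, doubling the wait after each failure.

    connect is any zero-argument callable that raises ConnectionError while
    the database is unavailable (an assumption made for this sketch).
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of retries, let the caller see the failure
            sleep(delay)
            delay *= 2
```

The sleep parameter is injectable so the retry logic can be tested without actually waiting.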