Third Time’s The Charm
Things I get wrong the first time, every time:
- Plugging in a USB device
- Spelling “separate”
brew updatevsbrew upgrade- The flags to
tar(obligatory)
Things I get wrong the first time, every time:
brew update vs brew upgradetar (obligatory)Because I’m a LeBron fan, when it comes to Wordle, I care more about longevity of sustained greatness (i.e. my perfect streak of winning games) than I do about having epic individual performances (i.e. having a lot games where I’ve won in 2 or 3 guesses). That means I need a way to keep my statistics even if I move between devices or need to replace one.
It turns out (at least as of this writing, who knows if the NYT will change it) that statistics are stored in a simple JSON object in browser local storage. On a laptop this storage is easily manipulated using DevTools, and on mobile it’s a bit tougher but still not too bad, at least on Android.
Both of the above, however, rely on GUIs. It got me thinking if there was a scriptable approach. First, I found a Python library (natch) that can speak the Chrome DevTools protocol. Then I had to figure out how to use it to read and write to local storage. That wasn’t too hard (even though the example here is in Node, the technique was portable). Finally I needed to make the library connect to both a laptop-based browser (easy), and my Android phone via adb (not so easy). Luckily I stumbled on the correct magical incantation in this post. Put it all together, and boom, I can now backup my Wordle streak, and easily transfer it between devices, using a script:
import chrome_local_storage
laptop_storage = chrome_local_storage.ChromeLocalStorage(port=9222)
phone_storage = chrome_local_storage.ChromeLocalStorage(port=9223)
wordle_stats = laptop_storage.get('games/wordle', 'nyt-wordle-statistics')
phone_storage.set('games/wordle', 'nyt-wordle-statistics', wordle_stats)
Want to see how I did it? Well, you’re in luck, because I packaged up my implementation and published it on PyPI. The source code is also on Github. Enjoy!
Besides the two resolutions I made for 2022, I’ve decided to try out a meta-resolution: every year from here on out, I will resolve to read the same number of books as years I am old (inspired by the coincidence that I finished 42 books last year, which happened to match my age). I track all my reading on Goodreads, where you can follow along if you’d like.
Given the above challenge, I wanted to determine how much of a time investment was going to be involved, and especially wanted an easy way to break it into daily reading targets that would keep me on pace. To do so, I needed two pieces of data: an average book size in pages, and an average time spent per page. My gut feel for these values was 300 pages and 1 minute, which led to a nifty conclusion: if I let A be my age, and aim to read A pages per day, which takes roughly A minutes, I should be able to easily complete my goal of A books over the course of the year (365 > 300, but I expect I’ll miss days here and there). Plugging in my current age of 43, that means a modest investment of 43 minutes per day is all it takes to achieve what otherwise sounds like a difficult goal. Isn’t that neat?
Neat enough that I wanted to validate my assumptions. For average book size, I downloaded the last 10 years of my reading records from Goodreads (I’ve been doing this a while): 84218 pages divided by 288 books gives an average size of 292 pages. My guess was pretty dang close, cool!
To measure my reading speed, I timed how long it took me to read 10 pages of three representative books: Multipliers (business/engineering non-fiction), Lifting the Veil (religious non-fiction), and The End of Eternity (science fiction). Resultant times were 6.5, 10.5, and 10.5 minutes for 10 pages, respectively, which averages out to 0.9 minutes per page. Once again, my intuition was reasonable.
One final statistic worth pondering: if I can hold to this meta-resolution, how many more books can I expect to read before I shuffle off this mortal coil. Thanks to Google, I know average life expectancy for a male in the United States is 75, so we’ll say I’ve got 32 years left. Thanks to Gauss, I can easily compute a sum from 1 to N with the formula N * (N+1) / 2. The sum of 1 to 75 is thus 75 * 76 / 2 = 2850, and now we need to subtract off years 1 through 42, which sum to 42*43 = 903, for a final result of 2850-903 = 1947 books. My Goodreads backlog is only 99 books long, so I guess I better start adding to it. Any suggestions?
In my previous post I mentioned an issue I had when building a CDK construct. As promised, today I’ll go through the problem I found: a dreaded race condition, which anyone who’s spent much time debugging software knows is a pernicious type of situation where behavior varies depending on the order in which various parts of the system execute, causing intermittent failures.
For background, part of the power of CDK is that it provides a framework for executing raw AWS API calls as part of a larger deployment. This is useful in numerous circumstances. For my construct, it enabled me to output several Managed Blockchain parameters that are available via API call but not from CloudFormation.
Under the hood these API calls are executed in a Lambda function that is created just for this purpose. This function has an IAM role, to which various permission policies are applied. For efficiency, it is only created once during a deployment, and then shared across all the API calls in the stack.
In order to keep my code well-organized, I’ve broken out the API calls in several places: one to gather data on the network member, and another to gather data for each peer node. And as a security best practice I want the permissions to be scoped as narrowly as possible. That means at each point in my construct where I call the function, I attach a policy that allows access only to the specific member or node being queried, via an explicit identifier.
Here’s the problem: IAM is an eventually consistent service, and thus policy updates are not immediately effective. Typical propagation time is only a few seconds, but it can take longer in certain circumstances. For the first custom API call in a CDK stack this is not an issue. The policy and role is created, and then the Lambda is created, the latter taking over a minute to be fully instantiated because it upgrades its dependencies at launch. However, on subsequent calls, because the Lambda is already warmed up, it runs immediately after the preceding policy update, and about half the time said policy is not yet effective, and the function fails due to a permission error.
It’s the “sometimes it works, sometimes it doesn’t” nature of race conditions that make them so difficult to track down. Thankfully I was able to identify and document my experience and pass it along to the CDK team. Anyone want to take a crack at a solution? I described several possible approaches in my write-up, with the “simple retry logic” approach likely being the best.
Back in my math major days in college, I was introduced to the Online Journal of Integer Sequences. It’s exactly what it says on the tin. As part of a class we were encouraged to contribute, which I did.
A few weeks ago I had another idea for a submission, and to my surprise no one else had added it, so once again I had opportunity to contribute a little piece of Internet history.
Here’s a complete list of the sequences I’ve authored over the years:
If the above isn’t enough online notoriety, check out my only published mathematical work, A Probabilistic View of Certain Weighted Fibonacci Sums. I was only a mild contributor, but still got an authorial credit, which is pretty cool.
I was privileged to have access to computers from an early age, from the humble TI-99/4A of my early elementary years to a snappy Pentium in high school (can’t remember the exact model, but it was pretty expensive; perhaps the 133MHz version?) The influence this access had on my life cannot be overstated.
Now that I’m firmly in middle age, and on a career path where I’m regularly evaluating technical talent, I’m reminded of that privilege, and how so many didn’t have it then, and some still don’t have it now. How much untapped potential there must be within these groups!
If we’re going to overcome the lack of diversity in tech, it starts with access; early access, when life-long perceptions are formed. As the saying goes: the best time to plant a tree is 25 years ago, but the second best time is today. Gotta get planting!
In my conversations with fellow engineers, git comes up quite a bit. I find myself regularly giving advice both tactical and strategic on its effective use. Learning it in detail is a force multiplier, but few people do. Part of the problem is that training materials are all over the map.
Which is why I was so pleased to discover Git from the inside out. Without question the best introduction to git I’ve come across. It perfectly balances teaching basic commands while also explaining what’s actually happening. Despite having used git at a fairly advanced level for 10 years, I still learned some new things, for example that each git add creates an immutable blob object that is retained for a while even if you git add the same file again, and even if you never commit it. Also that it’s pretty easy to decode raw git objects should you ever need to; here’s a script I wrote to do just that, if you’re curious.
I’ve said before that abstractions are valuable, but they’re not excuses to avoid learning internals, because critical information lies beneath the surface. At the risk of pretentiously quoting myself:
When things go wrong, the engineer must descend into the particulars, and an inability to minimally reason about, if not fully grasp, what lies beneath an abstraction can prove fatal to the debugging process.
I didn’t write the above with version control in mind, but I surely could have. Engineering organizations are full of developers who run stuck the moment a git command fails. You don’t have to be that developer!
Did you know you can implement a CPU in Minecraft or write arbitrary computer programs using Magic: the Gathering cards? Computing is powerfully weird, and hidden structures and capabilities can arise from all sorts of odd places.
I read a theory a while back that the Internet was, in some meaningful sense, intelligently conscious, though we may never be able to interact with such an intelligence. The analogy was to human consciousness arising from the cells of the brain in a way that no individual cell could ever comprehend it, but despite that it being no less real. The notion of Turing-complete computers being buried inside all manner of languages and tools has a similar vibe to it.
The Morning Paper is wrapping things up, as explained in its final post. A treasure trove of computer science deep dives, it will be missed. But I can certainly relate to the desire to move on, especially when a global pandemic has brought life’s various priorities into sharper focus.
Luckily for all of us there is a rich back catalog of posts that could keep one busy for months. Do check them out. Need a place to start? Here’s my absolute favorite: Applying the Universal Scalability Law to organizations.
I went far too many years into my career before truly trying to understand networking, despite it being an increasingly common source of problems. If I could give younger self some advice, I’d recommend taking some time to learn networking.
In that spirit I present to you How DNS Works, a “fun and colorful” comic that describes in detail the operation of the Domain Name Service. Enjoy!