Month: January 2022

Off To The Races

Off To The Races

In my previous post I mentioned an issue I had when building a CDK construct. As promised, today I’ll go through the problem I found: a dreaded race condition, which anyone who’s spent much time debugging software knows is a pernicious type of situation where behavior varies depending on the order in which various parts of the system execute, causing intermittent failures.

For background, part of the power of CDK is that it provides a framework for executing raw AWS API calls as part of a larger deployment. This is useful in numerous circumstances. For my construct, it enabled me to output several Managed Blockchain parameters that are available via API call but not from CloudFormation.

Under the hood these API calls are executed in a Lambda function that is created just for this purpose. This function has an IAM role, to which various permission policies are applied. For efficiency, it is only created once during a deployment, and then shared across all the API calls in the stack.

In order to keep my code well-organized, I’ve broken out the API calls in several places: one to gather data on the network member, and another to gather data for each peer node. And as a security best practice I want the permissions to be scoped as narrowly as possible. That means at each point in my construct where I call the function, I attach a policy that allows access only to the specific member or node being queried, via an explicit identifier.

Here’s the problem: IAM is an eventually consistent service, and thus policy updates are not immediately effective. Typical propagation time is only a few seconds, but it can take longer in certain circumstances. For the first custom API call in a CDK stack this is not an issue. The policy and role is created, and then the Lambda is created, the latter taking over a minute to be fully instantiated because it upgrades its dependencies at launch. However, on subsequent calls, because the Lambda is already warmed up, it runs immediately after the preceding policy update, and about half the time said policy is not yet effective, and the function fails due to a permission error.

It’s the “sometimes it works, sometimes it doesn’t” nature of race conditions that make them so difficult to track down. Thankfully I was able to identify and document my experience and pass it along to the CDK team. Anyone want to take a crack at a solution? I described several possible approaches in my write-up, with the “simple retry logic” approach likely being the best.

Put A Bow On It

Put A Bow On It

Despite writing a bit about CDK nearly two years ago, it’s taken me some time to get a chance to really lean into it. Having now built out a couple real projects, I can confidently say that like it has its rough edges, like any technology, but overall it’s both powerful and fun.

If you’re so inclined, feel free to check out my most recent creation, a construct for deploying a Hyperledger Fabric network on Amazon Managed Blockchain.

# Easy as pie!
HyperledgerFabricNetwork(
    self, 'MyNetwork',
    network_name='MyNetwork',
    member_name='MyMember',
)

For my next trick post I’ll go through one of the aforementioned rough edges I discovered, and how it can hopefully be fixed.

Resolute Comprehension

Resolute Comprehension

I really like New Year’s resolutions. As a lover of habit, the beginning of a year is perfect time to calibrate a new routine. This year I have two resolutions:

  1. Post on this blog at least once per month
  2. Learn a new programming language

The latter was inspired by this article, which is stupidly long but thoroughly enjoyable. As a non-fan of OOP I found myself nodding along quite frequently. He advocates pretty hard for functional languages; while I’m familiar with the paradigm having used it in Python, I haven’t done much with purer forms. In 2022 I intend to change that, probably by learning Clojure.

Erlang and Go are also on my to-learn shortlist, the former for its first-class support for concurrency, the latter because it’s the new hotness for performant APIs.

In other news, I’m working on publishing my first CDK construct, which I’ll share here when it’s ready. I do wish I didn’t have to write it in TypeScript, but sadly I’m at the mercy of the JSII compiler. Why TS doesn’t have first class support for comprehensions boggles my mind. This is the closest I could get:

Array.from(nodeProps.entries()).map(e => new HyperledgerFabricNode(scope, `Node${e[0]}`, e[1]));

Compare that with a Python equivalent:

[HyperledgerFabricNode(scope, f'Node{i}', p) for i, p in nodeProps.enumerate()]

For shame, TypeScript. For shame.