Tag: Learn And Be Curious

Dig In

Getting to know your professional colleagues at a personal level is risky. I regularly read advice to avoid it. That’s a reasonable strategy to avoid some of the lows of gainful employment, but it also hamstrings the chance to achieve truly beautiful successes, not to mention it forfeits a potent antidote to loneliness.

So yeah, not only am I going to ignore that advice, I’m doubling down on getting better at being a student of other people. To that end, last week I started reading How to Know a Person, from which I extracted the following list of conversation starters:

  • Which of your five senses is strongest?
  • What are you most self-confident about?
  • What’s working really well in your life?
  • What is the “no” you keep postponing?
  • What have you said “yes” to that you no longer really believe in?
  • What forgiveness are you withholding?
  • Tell me about a time you adapted to change.
  • Have you ever been solitary without feeling lonely?
  • Can you be yourself where you are and still fit in?
  • What crossroads are you at?
  • What would you do if you weren’t afraid?
  • If we meet a year from now, what will we be celebrating?
  • If the next 5 years is a chapter in your life, what is that chapter about?
  • What has become clearer to you as you have aged?
  • What is the best way to grow old?
  • If you died tonight, what would you regret not doing?

Full credit to David Brooks here; I’m just repeating his excellent ideas. Keep learning, friends!

Remote Learning

Ohio in the early 90s had few educational options for a middle schooler interested in computers. But when there’s a will (and willing parents, thank you) there’s a way. Somehow I got signed up for a correspondence course in Pascal in 8th grade. Yes, an actual class where I never met in person (and only rarely spoke to the teacher on the phone). Where the majority of exchanges were via the good old-fashioned United States Postal Service. Where code had to be printed out, mailed, marked up, and mailed back (how’s that for slowing down rapid iteration!)

Despite it seeming painful to modern ideas of remote learning, the material was quite useful in my overall development. Up until then I was completely self-taught; reasonably good in BASIC and some rudimentary C. Learning Pascal, however, really opened up a new world. And luckily for you all, I still have a number of my Pascal programs, which I recently uploaded to GitHub for your browsing pleasure. Here’s the good stuff that awaits you:

  • MARKET.PAS – This one’s special for two reasons. First, it’s the oldest of all these files, with a last modified date of Dec 6, 1992, making it the earliest example of code I wrote that I still have in digital form (the absolute oldest being this handwritten BASIC program from 1987). And second, it was my attempt to implement the Stock Market Game, a board game from the 1970s that my mom and I played together when I was a kid. No one else in the family ever wanted to join; it was kinda “our thing” (as was Scrabble).
  • GRADE.PAS – A simple gradebook app for teachers. I believe this was the final project for my correspondence course.
  • CYBER.PAS & CYBORG.PAS – Today you couldn’t pay me enough to get into video game development, but as a youngling I had a thing for trying to build them. This code is a tiny step towards what looks like a side-scrolling shooter involving robots and lasers.
  • KARATE.PAS & KGRAPHIC.PAS – Another game effort, this one a fighter like Mortal Kombat, but with stick figures, because I am terrible at visual art. Pretty sure I got it to a reasonably playable state, though the mechanics were terrible and it required two people because there was no AI to speak of.
  • JDNCRYPT.PAS – Built this encryption tool to protect DIARY.TXT, which I still have (but no, I’m not gonna share it). Basically I reinvented a simple rotation cipher using an insecurely predictable pseudo-random number generator, with an easily bypassed magic parameter kill-switch on the executable. How cute. Rule one of cryptography: never ever write your own.
  • GAME133.PAS – In college a mathy friend of mine and I got really into the Number Jumbler. I wrote this solver to do research into combinations that had no solutions. Two years later when I started my first real job, I was tasked to learn Ada, and as part of that effort I ported this solver.

FYI, in upcoming posts I intend to expand on my personal tech history, including a visual history of my computer setups. Will it be of interest? Maybe! But I’m going to do it regardless.

Resolution Recap

Relaxing on a much-needed holiday has given me time to wrap up a couple books, bringing this year’s reading to a close (I’ve also finally started Alexander Hamilton, but no way I’m finishing it on my return flight; it’s good but long).

Per my meta-resolution, I aimed to read 44 books this year. I’m finishing at 48, though a few only barely qualify. Here’s this year’s 5-star selections:

How did I do in my objective to read more non-male, non-white authors? The goal was 32 books, and I finished with 14 non-male, 15 non-white, and 4 both, for a total of 33. Mission accomplished? Quantitatively yes, but qualitatively, the mission of broadening horizons is never done; this will continue to be a focus area.

What will I aim for next year (besides the obligatory quantity)? For one, I intend to read more history and biographies. Given my job, I also am going to do more reading on politics and government. Should be fun!

Evolution

(Editor’s note: the past two posts, Mother Of Invention and Edge Case, and this one form a trilogy of sorts, all related to a particular project I’ve been digging into).

When I first needed a way to get access to AWS from a non-cloud-based computer, I implemented 3 options: hard-coded IAM user credentials (generally bad), user-based Cognito (okay but not super scalable), and X.509 via IoT (good technology, but cumbersome to set up).

This week I had a similar authentication need within an on-premises cluster, and was happy for the chance to learn the most up-to-date approach: IAM Roles Anywhere. I really appreciate the authors of these two blog posts who captured the step-by-step quite a bit better than the official documentation:

I used my own certificate authority because AWS Private CA is too dang expensive; $400 a month doesn’t grow on trees, ya know? Here’s the bash script to create the root CA:

mkdir -p root-ca/certs    # New Certificates issued are stored here
mkdir -p root-ca/db       # Openssl managed database
mkdir -p root-ca/private  # Private key dir for the CA

chmod 700 root-ca/private
touch root-ca/db/index

# Give our root-ca a unique identifier
openssl rand -hex 16 > root-ca/db/serial

# Create the certificate signing request
openssl req -new -config root-ca.conf -out root-ca.csr -keyout root-ca/private/root-ca.key

# Sign our request
openssl ca -selfsign -config root-ca.conf -in root-ca.csr -out root-ca.crt -extensions ca_ext

# Print out information about the created cert
openssl x509 -in root-ca.crt -text -noout

The resulting CA certificate (root-ca.crt) is what’s used to create the Trust Anchor. Then here’s a script to create a certificate for the process that will be authenticating:

# Provide a name for the output files as a parameter
entity_name="$1"

# Make your private key specific to your end entity
openssl genpkey -out "$entity_name.key" -algorithm RSA -pkeyopt rsa_keygen_bits:2048

# Using your newly generated private key make a certificate signing request
openssl req -new -key "$entity_name.key" -out "$entity_name.csr"

# Print out information about the created request
openssl req -text -noout -verify -in "$entity_name.csr"

# Sign the above request to issue the certificate
openssl ca -config root-ca.conf -in "$entity_name.csr" -out "$entity_name.crt" -extensions client_ext

# Print out information about the created cert
openssl x509 -in "$entity_name.crt" -text -noout

Special thanks also to the creator of iam-rolesanywhere-session, a Python package that makes it easy to create a refreshable boto3 Session with IAM Roles Anywhere. Seriously, could it be easier?

from iam_rolesanywhere_session import IAMRolesAnywhereSession

roles_anywhere_session = IAMRolesAnywhereSession(
    trust_anchor_arn=my_trust_anchor_arn,
    profile_arn=my_profile_arn,
    role_arn=my_role_arn,
    certificate='my_certificate.crt',
    private_key='my_certificate.key',
)

boto3_session = roles_anywhere_session.get_session()
s3_client = boto3_session.client('s3')
print(s3_client.list_buckets())

This was a good reminder that technology marches ever onward, and what made sense yesterday might not be the best approach today. It was also a reminder that, like DNS, TLS and PKI are some of those things that every technologist ought to know (I’ve queued up this book in my Goodreads for a deeper dive). This isn’t the first time I’ve had to write code to create certificates, but it’s now the last, because I’ll have this reference post plus its associated code repository. And so will you.

Edge Case

I was today years old when I learned that an object key in S3 can end with a slash. Why might someone use such a strange key, you ask? Well, I was working today on a static website served by CloudFront that needs to serve a particular JSON document at /foo/bar/ (note the trailing slash). One option was to create the corresponding object at /foo/bar and then use a CloudFront function to remove the trailing slash. But that adds complexity, cost, and a tiny bit of latency. Could there be a better way?

Indeed there was! Create the object with a key of foo/bar/ (trailing slash included) and Bob’s your uncle. Admittedly it’s a bit tricky to create an object with such a key. The console won’t do it, and neither will the aws CLI (at least not without getting fiddly with encoding, and no one’s got time for that). But boto3 to the rescue: it’ll happily do it.
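For the curious, here’s roughly what that boto3 call can look like. Treat it as a minimal sketch rather than the exact code I ran: the helper name and the bucket name are made up for illustration, and it assumes CloudFront maps the request path /foo/bar/ to the S3 key foo/bar/ (leading slash dropped), which is how an S3 origin normally behaves.

```python
import json


def put_json_with_trailing_slash(s3_client, bucket, key, document):
    """Upload `document` as JSON under `key`, which must end with '/'.

    `s3_client` is anything exposing boto3's S3 `put_object` method.
    """
    if not key.endswith("/"):
        raise ValueError("this sketch is only meant for keys ending in a slash")
    # put_object takes the key verbatim, so the trailing slash is preserved.
    return s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(document).encode("utf-8"),
        ContentType="application/json",
    )


# Usage with real credentials (the bucket name is a placeholder):
#   import boto3
#   put_json_with_trailing_slash(
#       boto3.client("s3"), "my-site-bucket", "foo/bar/", {"hello": "world"}
#   )
```

Setting ContentType matters here because there is no file extension for anything downstream to guess from.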

Obligatory bit of additional knowledge: know your slashes.

Know Thyself

It’s inevitable that over time I’m going to repeat myself here (including post titles). When I’m aware of potential similarities, I try to embed links back to those prior posts. A while back I noted an idea of building a thematic map of all my posts, but I wasn’t sure how to go about doing so. Now that I’ve learned some about embeddings, it was time to try my hand at it.

You can find the code I wrote to accomplish all of this on GitHub. I was inspired by the clustering section of the OpenAI cookbook, but took considerable liberties rewriting the code there, as I’m not a huge fan of typical data science code examples (they’re suitable for notebooks, perhaps, but rarely include meaningful names or breakdown into logical functions).

First, I had to actually fetch all the post content. I briefly toyed with the WordPress REST API, but couldn’t figure out how to enable it. No worries, though, RSS to the rescue! Unfortunately it’s XML, and I fiddled a bit with using lxml to parse it, but then stumbled upon feedparser, which abstracted the details. Awesome!
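To give a feel for what feedparser saves you from, here’s a stdlib-only sketch of pulling posts out of an RSS 2.0 document. It is not the code from my repo: parse_posts is a name invented for this example, and it assumes the feed carries the full post body in content:encoded (falling back to description), which is what WordPress feeds usually do.

```python
import xml.etree.ElementTree as ET

# Namespace of the <content:encoded> element used by many RSS 2.0 feeds.
CONTENT_NS = "{http://purl.org/rss/1.0/modules/content/}"


def parse_posts(rss_xml):
    """Return a list of {'title', 'link', 'content'} dicts from RSS 2.0 text."""
    root = ET.fromstring(rss_xml)
    posts = []
    for item in root.iter("item"):
        # Prefer the full body; fall back to the summary if it is missing.
        body = (
            item.findtext(CONTENT_NS + "encoded")
            or item.findtext("description")
            or ""
        )
        posts.append(
            {
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
                "content": body,
            }
        )
    return posts
```

A list of dicts like this drops straight into a pandas DataFrame, one row per post.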

Since it’s the de facto standard for Python data science, I loaded the posts into a pandas DataFrame. I’m still working on my fluency with pandas, numpy, scikit-learn, and matplotlib, amongst other common tools, and I’m grateful for any opportunity to get their power under my fingers.

To compute embeddings for each post, I used the OpenAI API with the text-embedding-ada-002 model. It’s not good to store API keys in code; for local scripts I store all mine in the macOS keychain using keyring. Nice and easy.

Since OpenAI usage costs money, I don’t want to repeatedly call the API with identical inputs if I don’t have to. That’s where cachier comes in (a library I help maintain) so results can be transparently saved to disk for subsequent use.
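If you’re wondering what “transparently saved to disk” boils down to, here’s a hand-rolled sketch of the idea using only the standard library. To be clear, this is not cachier and not how cachier is implemented; cached_embedding, embed_fn, and the cache directory name are all hypothetical, and embed_fn stands in for whatever function actually calls the paid API.

```python
import hashlib
import json
from pathlib import Path


def cached_embedding(text, embed_fn, cache_dir="embedding_cache"):
    """Return embed_fn(text), reusing a result saved on disk when one exists."""
    directory = Path(cache_dir)
    directory.mkdir(parents=True, exist_ok=True)
    # Identical inputs hash to the same file name, so they share one entry.
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    cache_file = directory / f"{digest}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    embedding = embed_fn(text)  # the only place the paid call would happen
    cache_file.write_text(json.dumps(embedding))
    return embedding
```

A real caching library adds the parts this leaves out, such as expiry, locking, and handling arbitrary arguments, which is why I’d rather use a decorator than maintain this myself.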

Once I had the embeddings, I used K-means clustering to group posts into common themes, and then t-SNE to reduce the dimensionality and produce a visualization of the clusters. To produce a summary of the theme of each cluster I took a sample of posts from each and shoved them into GPT-4.
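The clustering and projection step can be sketched in a few lines of scikit-learn. This is an illustration under assumptions, not a copy of my repo: cluster_and_project and the seed parameter are names I made up here, and the perplexity cap is just one reasonable way to satisfy t-SNE’s requirement that perplexity be smaller than the number of samples.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE


def cluster_and_project(embeddings, n_clusters, seed=42):
    """Return (cluster label per row, 2-D coordinates per row)."""
    matrix = np.asarray(embeddings, dtype=float)
    # K-means assigns each post to one of n_clusters themes.
    labels = KMeans(
        n_clusters=n_clusters, n_init=10, random_state=seed
    ).fit_predict(matrix)
    # t-SNE needs perplexity < number of samples, so cap it for small inputs.
    perplexity = min(30, len(matrix) - 1)
    coords = TSNE(
        n_components=2, perplexity=perplexity, init="random", random_state=seed
    ).fit_transform(matrix)
    return labels, coords
```

Plotting coords colored by labels gives the kind of scatter chart this post is built around, and fixing random_state keeps reruns with the same inputs repeatable.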

To start I tried using 2 clusters, which produced the following distribution:

Pretty interesting that there’s a natural grouping going on. Here’s the themes and sample posts:

Blue Posts

The theme of these posts is the author’s personal and professional experiences with technology, education, open-source contributions, ethical considerations, and the impact of travel and diversity on personal growth and the tech industry.

Orange Posts

The theme of these posts revolves around the reflections, experiences, and insights of a software developer navigating the challenges and nuances of the tech industry.

Of course I had to try with a variety of different numbers of clusters, so I reran with 3, 5, and 8 clusters as well (anyone see a pattern there?)

Of those graphs, to my eye the 5 cluster one seemed the best balance between having enough distinct themes without starting to look too arbitrary. Here’s the summarizations for it:

Blue Posts

The theme of these posts is the author’s personal and professional experiences, challenges, and insights related to technology, software development, and working within the tech industry.

Orange Posts

The theme of these posts revolves around the challenges, insights, and anecdotes from the world of software development and engineering management.

Green Posts

The theme of these posts is the multifaceted nature of software development, encompassing the importance of maintaining code quality, the broad skill set required for effective development, and the challenges and responsibilities that come with the profession.

Red Posts

The theme of these posts is the reflection on and sharing of personal experiences, insights, and best practices related to software development, including contributing to communities, understanding abstractions, effective communication, and professional growth within the tech industry.

Purple Posts

The theme of these posts is the author’s personal reflections on their experiences, interests, and philosophies related to their career, hobbies, and life choices.

What’s next? I’d like a quantitative way to evaluate the quality of the theme clustering and summaries produced. There’s a lot of non-determinism in the functions used here, and with some twiddling I bet I can produce improved results. I’ve got some ideas, but will save them for a future post.

School’s In Session

Tonight I kick off a class from Stanford called Ethics, Technology, and Public Policy for Practitioners. It’s been a hot minute since I’ve been involved with formal education (about 10 years actually), but I’m pretty excited. Not just for the learning, but for the people I’ll meet along the way, who appear to be a fantastically variegated bunch based on what I’ve seen on Slack so far.

Here’s the course description from the syllabus:

Our goal is to explore the ethical and social impacts of technological innovation. We will integrate perspectives from computer science, philosophy, and social science to provide learning experiences that robustly and holistically examine the impact of technology on humans and societies.

Basically it’s Jud catnip. If it sounds interesting to you, I think it’s offered periodically. Here’s a link for future reference.

Fix-It-Up Chappie

Over the weekend my daughter’s Chromebook stopped turning on. We’ve gotten our money’s worth, having bought it right before the pandemic (fortunate timing, that), but I suspected the issue was only with the battery, having experienced a similar failure mode with other Chromebooks. Fifty bucks, overnight shipping, and fifteen minutes at the kitchen table, and it’s back up and running. Yay!

I usually enjoy trying to repair things. I don’t always succeed (especially if it’s car-related; I leave that to the professionals after a disastrous attempt to patch a radiator leak in an ’87 Honda Civic back in the summer of 2005), but stuff like electronics or minor carpentry things I can usually figure out (not to mention identifying and working around website bugs). There’s something eminently satisfying about learning something new and immediately applying it to bring a tiny bit of order to the entropy.

You may think you don’t know how. And you may be right. But finish the sentence: you don’t know how yet. More than ever before there’s a wealth of knowledge at your fingertips. Engage your curiosity and give it a try. The risk is (usually) minimal and the rewards many.

Truth At The Intersection

Earlier this year I pledged to read 32 out of my 44 books by authors who are either non-white or non-male, 73% of my total. Juneteenth seems like an excellent day to see how I’m doing, given both its significance to my objective and its being near the middle of the year.

As of today I’ve completed 24 books, ahead of my required pace of 22 by this date. Of those, 4 were written by white women, 2 by non-white women, and 10 by non-white men. That’s 16 in total, or 67%, which means I need to pick up my pace a bit to hit my goal. Of my current 3 books in flight, 2 are by women and 1 was written by a consortium of indigenous folks, so that’ll help things out. And I’ve plenty more qualifying books in my queue.
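Because I can’t help myself, here’s the arithmetic as a quick Python sketch. It uses only the numbers quoted in this post; the share helper and the variable names are invented for the example.

```python
def share(part, whole):
    """Percentage of `whole` that `part` represents, rounded to a whole number."""
    return round(100 * part / whole)


goal_total, goal_qualifying = 44, 32
read_so_far = 24
qualifying_so_far = 4 + 2 + 10  # white women + non-white women + non-white men

remaining_books = goal_total - read_so_far
remaining_qualifying = goal_qualifying - qualifying_so_far

print(share(goal_qualifying, goal_total))            # 73: target for the year
print(share(qualifying_so_far, read_so_far))         # 67: where I am today
print(share(remaining_qualifying, remaining_books))  # 80: needed from here on
```

In other words, 16 of the remaining 20 books need to qualify, which is what “pick up my pace a bit” works out to.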

If you’re curious, you can see what I’m reading any time on my Goodreads page.