In Part 1 we discussed the use cases for formal specifications and we looked at a simple transaction isolation bug in a financial institution.
This is a distributed system transaction orchestration problem.
In this exercise, imagine we are a bank, and we are serving an API request from our banking mobile app to initiate a bank transfer from an external financial institution to the user’s account.
We are building an API endpoint that asks the other financial institution to move money over to us.
The wrinkle here is that internally, the transfer also needs to be synchronized with a third, internal transaction database (our internal source of truth) to formally recognize the balance in the user’s account.
How do we design this system to ensure the design is resilient to failures and outages?
We illustrate the system design using the happy path. Our mobile client calls an API gateway, which we use as a transaction coordinator.
The API gateway makes 2 calls. First, it calls the external financial institution to initiate the transfer. If the transfer is successful, the API then turns around and pings the internal balance store service to note the transaction was a success.
(Note that this is not representative of a real world bank! This is a contrived, simplified example).
Because we need our API to be synchronous, the API coordinator blocks until both calls succeed before returning a success response to the client.
sequenceDiagram
autonumber
OurMobileClient->>+OurAPIGateway: SubmitTransfer
OurAPIGateway->>+ExternalFinancialInstitution: StartTransfer
ExternalFinancialInstitution->>-OurAPIGateway: SUCCESS
OurAPIGateway->>+OurInternalBalanceStore: UpdateUserBalance
OurInternalBalanceStore->>-OurAPIGateway: SUCCESS
OurAPIGateway->>-OurMobileClient: SUCCESS
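The happy path can be sketched as a thin synchronous coordinator that chains the two calls. All class and method names below are hypothetical stand-ins, not a real banking API:

```python
# A minimal sketch of the synchronous coordinator. The service
# classes and method names here are illustrative assumptions.
class StubService:
    """Stands in for the external institution and the balance store."""
    def __init__(self, ok=True):
        self.ok = ok

    def start_transfer(self, amount):
        return self.ok

    def update_user_balance(self, amount):
        return self.ok

def submit_transfer(external, internal, amount):
    """Block until both downstream calls succeed, then report success."""
    if not external.start_transfer(amount):
        return "FAILED"
    if not internal.update_user_balance(amount):
        return "FAILED"
    return "SUCCESS"

result = submit_transfer(StubService(), StubService(), 5)  # "SUCCESS"
```

The coordinator returns only after both calls come back, which is exactly the blocking behavior in the diagram above.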
Of course, we know that errors can crop up in the real world. If the call to either service borks, we need a way to either retry or fail gracefully. Here, we consider the case where the internal balance store fails.
We consult with the team and decide that if the internal call fails for any reason, we will undo the transfer at the external financial institution with a compensating transaction. We will throw this work onto an external queue as soon as the error occurs.
sequenceDiagram
autonumber
OurMobileClient->>+OurAPIGateway: SubmitTransfer
OurAPIGateway->>+ExternalFinancialInstitution: StartTransfer
ExternalFinancialInstitution->>-OurAPIGateway: SUCCESS
OurAPIGateway->>+OurInternalBalanceStore: UpdateUserBalance
OurInternalBalanceStore->>-OurAPIGateway: FAILED
OurAPIGateway--)BackgroundWorker: Enqueue Reversal
Note over OurAPIGateway,BackgroundWorker: Compensating transaction kicks off after a failure
OurAPIGateway->>-OurMobileClient: FAILED
BackgroundWorker->>+ExternalFinancialInstitution: UndoStartTransfer
ExternalFinancialInstitution->>-BackgroundWorker: SUCCESS
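Under the same assumed (hypothetical) interfaces, the failure path might look like this sketch: the coordinator enqueues a reversal and a background worker drains the queue later:

```python
from queue import Queue

# Hypothetical sketch of the failure path. Service classes and
# method names are illustrative, not a real API.
class ExternalStub:
    def __init__(self):
        self.balance = 10

    def start_transfer(self, amount):
        self.balance -= amount
        return True

    def undo_start_transfer(self, amount):
        self.balance += amount

class FailingInternalStub:
    def update_user_balance(self, amount):
        return False  # simulate the internal balance store failing

reversal_queue = Queue()

def submit_transfer(external, internal, amount):
    if not external.start_transfer(amount):
        return "FAILED"
    if not internal.update_user_balance(amount):
        reversal_queue.put(amount)  # hand off the compensating transaction
        return "FAILED"
    return "SUCCESS"

def drain_reversals(external):
    # Background worker: undo each enqueued transfer.
    while not reversal_queue.empty():
        external.undo_start_transfer(reversal_queue.get())

ext = ExternalStub()
result = submit_transfer(ext, FailingInternalStub(), 5)  # "FAILED"
drain_reversals(ext)                                     # balance restored to 10
```

Note the gap this opens up: between the failed response and the worker's reversal, the external balance is still short by the transfer amount.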
But wait! We see that there’s a bug: a race condition when users “button mash” - hitting an error dialog in the mobile client and immediately retrying their request!
sequenceDiagram
autonumber
OurMobileClient->>+OurAPIGateway: SubmitTransfer
OurAPIGateway->>+ExternalFinancialInstitution: StartTransfer
ExternalFinancialInstitution->>-OurAPIGateway: SUCCESS
OurAPIGateway->>+OurInternalBalanceStore: UpdateUserBalance
OurInternalBalanceStore->>-OurAPIGateway: FAILED
OurAPIGateway--)BackgroundWorker: Enqueue Reversal
OurAPIGateway->>-OurMobileClient: FAILED
OurMobileClient->>+OurAPIGateway: SubmitTransfer
OurAPIGateway->>+ExternalFinancialInstitution: StartTransfer
ExternalFinancialInstitution->>-OurAPIGateway: FAILED
OurAPIGateway->>-OurMobileClient: FAILED
Note over OurMobileClient,OurAPIGateway: User's re-submittal fails because there are no funds in account (race condition)
BackgroundWorker->>+ExternalFinancialInstitution: UndoStartTransfer
ExternalFinancialInstitution->>-BackgroundWorker: SUCCESS
This will fail.
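Before writing any TLA+, the failing interleaving can be sketched in a few lines. This is a toy model that mirrors the unconditional subtraction the spec will use, not real service code:

```python
# Toy model of the race: every retry subtracts from the external
# balance before the reversal worker has had a chance to restore it.
external_balance = 10
transfer_amount = 5
reversal_queue = []

for attempt in range(1 + 3):  # the original try plus 3 button mashes
    external_balance -= transfer_amount     # StartTransfer
    reversal_queue.append(transfer_amount)  # internal call fails each time

overdrafted = external_balance < 0  # True: -10 before any reversal runs
```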
OK, let’s try to model this behavior as a formal TLA+ spec. I’ll write out how the spec would look, and we’ll go through it line by line:
variables
    queue = <<>>,
    reversal_in_progress = FALSE,
    transfer_amount = 5,
    button_mash_attempts = 0,
    external_balance = 10,
    internal_balance = 0;

define
    NeverOverdraft == external_balance >= 0
    EventuallyConsistentTransfer == <>[](external_balance + internal_balance = 10)
end define;

\* This models the API endpoint coordinator
fair process BankTransferAction = "BankTransferAction"
begin
    ExternalTransfer:
        external_balance := external_balance - transfer_amount;
    InternalTransfer:
        either
            internal_balance := internal_balance + transfer_amount;
        or
            \* Internal system error!
            \* Enqueue the compensating reversal transaction.
            queue := Append(queue, transfer_amount);
            reversal_in_progress := TRUE;
            \* The user is impatient! Their transfer must go through. They button mash (up to 3 times)...
            UserButtonMash:
                if (button_mash_attempts < 3) then
                    button_mash_attempts := button_mash_attempts + 1;
                    \* Start from the top and do the external transfer
                    goto ExternalTransfer;
                end if;
        end either;
end process;

\* This models an async task runner that will run a reversal
\* compensating transaction. It uses a queue to process work.
fair process ReversalWorker = "ReversalWorker"
    variable balance_to_restore = 0;
begin
    DoReversal:
        while TRUE do
            await queue /= <<>>;
            balance_to_restore := Head(queue);
            queue := Tail(queue);
            external_balance := external_balance + balance_to_restore;
            reversal_in_progress := FALSE;
        end while;
end process;
Whew, ok! That’s a lot. Let’s go through it line by line:
First up, we declare variables and operators:
\* These are global variables
variables
    queue = <<>>,
    reversal_in_progress = FALSE,
    transfer_amount = 5,
    button_mash_attempts = 0,
    external_balance = 10,
    internal_balance = 0;

define
    NeverOverdraft == external_balance >= 0
    EventuallyConsistentTransfer == <>[](external_balance + internal_balance = 10)
end define;
There are two main blocks here: the variables block and the define block. The variables defined here track values that will be used globally throughout the model. The operators in the define block are properties that the model checker will use to make sure invariants and temporal properties hold true throughout the lifecycle of the model.
It’s important to note the properties defined here in the spec:
NeverOverdraft is checked on every state combination, ensuring that there cannot be a scenario where the external financial institution is asked to transfer more money than is in its account.
EventuallyConsistentTransfer is a temporal property that checks whether the system always eventually converges on the condition that external + internal balance equals $10, the starting amount. We are essentially guaranteeing that we cannot unintentionally create or lose any money between our institutions.
Next up, there are two process blocks being defined here, representing the two internal systems whose interactions we are modeling.
The first process is the API coordinator. Inside this coordinator, each action is marked by a label - note the labels ExternalTransfer, InternalTransfer, and UserButtonMash. These correspond with various phases of our system sequence diagram. Let’s walk through the code:
fair process BankTransferAction = "BankTransferAction"
begin
    ExternalTransfer:
        external_balance := external_balance - transfer_amount;
...
This is fairly self-explanatory - the system is set up to first call the external institution and tell them to withdraw the money. For simplicity’s sake, we assume it is always successful. (It obviously won’t be, and we have the perfect tool to model failure scenarios around that!)
InternalTransfer:
    either
        internal_balance := internal_balance + transfer_amount;
    or
        \* Internal system error!
        \* The system will enqueue the compensating reversal transaction.
        queue := Append(queue, transfer_amount);
        reversal_in_progress := TRUE;
        ...
    end either;
The next label is interesting. We use an either...or control structure to tell the model checker that there is branching logic here (in this case, there is a success case and a failure case). Both these branches will be exhaustively explored.
In the successful case, we observe that the internal API is called successfully and the balance is correctly stored. However, the failure case will have us enqueue a compensating transaction (a “reversal”) that will be processed by an asynchronous worker.
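What “exhaustively explored” means in ordinary code: run the step once per branch and keep every outcome. This is a hypothetical Python sketch of the idea, not how TLC works internally:

```python
# Sketch of exploring both arms of the either...or.
def internal_transfer(branch, internal_balance, queue, amount):
    """One step of the InternalTransfer label, for a chosen branch."""
    if branch == "success":
        return internal_balance + amount, list(queue)
    # failure: enqueue the compensating reversal instead
    return internal_balance, list(queue) + [amount]

# The model checker pursues both branches, not just one.
results = [internal_transfer(b, 0, [], 5) for b in ("success", "failure")]
# results[0] == (5, [])   - success: balance updated
# results[1] == (0, [5])  - failure: reversal enqueued
```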
\* The user is impatient! Their transfer must go through.
\* They button mash (up to 3 times)...
UserButtonMash:
    \* await reversal_in_progress = FALSE;
    if (button_mash_attempts < 3) then
        \* But the UI blocks them from re-submitting until the transaction
        \* has finished being reversed/compensated.
        button_mash_attempts := button_mash_attempts + 1;
        goto ExternalTransfer;
    end if;
Ooh, the user, the user. You can always count on the user to do something unexpected. While the system is enqueuing the compensating transaction, our poor user is confused and retries the original transaction (aka “button mashing” the UI button) in hopes that it will go through. Will it succeed?
Note that the way I’ve built the spec, I’m specifying a finite limit to the number of user retries, if only to make sure the program will eventually terminate.
Finally, observe the goto ExternalTransfer statement. This basically tells the model checker to jump to the ExternalTransfer: label - i.e. the top of the process - and re-execute from there all over again.
(Author’s note: I haven’t finished this yet, but thought I’d push this up as a work in progress. Do you see the error? Are your spidey senses tingling here? More to come!)
If you’ve been writing software for any amount of time, you may be familiar with the many tools we have available to ensure the correctness, consistency and debuggability of our systems. They run the gamut from unit / acceptance / integration tests to QA plans, CI/CD automation and the like. System and language tools like type systems, interactive debuggers and profilers abound. Practices like DevOps, TDD/BDD, and even Agile process itself can be argued to have been invented toward the goal of writing correct, robust, easy-to-maintain systems.
Surely these tools are advanced enough after 70-plus years of computing to help! But no - with the rise of distributed computing, the classes of bugs that emerge get ornery and complex, are usually nondeterministic, and are often beyond the reach of ordinary tools.
But what if I told you there was another option from the world of… math?
What if there was a way to guarantee that our systems and algorithms are performant, run correctly, are reliable against race conditions and the like?
Here’s how it works:
You describe your system (or program) in terms of formal logic statements. You assert specific conditions that must hold throughout the program runtime (invariants). You write this in the form of a proof (that lives outside your actual program).
The tool has a “model checker”, which is a glorified BFS algorithm that explores every reachable state of your specification and lets you know if the invariant conditions hold.
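Here’s a toy, hypothetical version of such a checker - a plain BFS over reachable states that reports the first invariant violation (TLC itself is far more sophisticated):

```python
from collections import deque

def check(initial, next_states, invariant):
    """BFS the reachable state space; return the first state that
    violates `invariant`, or None if it holds everywhere."""
    seen = {initial}
    frontier = deque([initial])
    while frontier:
        state = frontier.popleft()
        if not invariant(state):
            return state
        for nxt in next_states(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return None

# Example system: a counter that either increments or resets to zero.
def next_states(n):
    return [n + 1, 0] if n < 4 else [0]

assert check(0, next_states, lambda n: n <= 4) is None  # invariant holds
assert check(0, next_states, lambda n: n <= 3) == 4     # violation found
```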
If they do - congratulations! You’ve verified your system. If they don’t - congratulations! You’ve found a potential bug!
Using the results from the model checker, you can fix the spec to resolve the model checking error. This translates into a real-world fix that you can then roll into your program.
It’s not magical. It’s also a lot of work, and in all fairness, slightly out of the reach of the typical industry programmer. But it’s much more in reach than you think!
I’ll be using a tool called TLA+, writing a sample spec in a derivative syntax called PlusCal. I’ll walk us through a simple example that can be found on the Learn TLA web site.
In the book Designing Data-Intensive Applications, the Transactions chapter illustrates a scenario where read isolation is not correctly implemented in the database, leading to dirty reads - simultaneous queries may be able to read dirty data from complex multi-statement operations - leading to bad outcomes.
Let’s say we are a bank where a user attempts to transfer money between two accounts, and a separate query is being run by an auditor who wants to ensure that the bank software is working correctly and no funny money business is happening:
sequenceDiagram
autonumber
User->>Account1: Add $500
Auditor->>Account1: Query Balance
Auditor->>Account2: Query Balance
User->>Account2: Subtract $500
Alas, our system was implemented a bit naively, and we can see that the application makes two calls to the database, crediting Account1 and debiting Account2 in two separate statements.
Assuming both accounts have initial values of $1000, the User’s transfer completes successfully, moving $500 from Account2 to Account1 and maintaining the correct total (Account1 + Account2 = $2000).
However, the Auditor has had the unfortunate timing to query in between the two user operations and has a different view of the world, seeing that $500 has materialized out of thin air into Account1 (Account1 + Account2 = $2500)!
Let’s model this behavior as a TLA PlusCal algorithm:
variables
    transfer_amount = 500,
    account1 = 1000,
    account2 = 1000;

process User = "user"
begin
    StartUserTransfer:
        account1 := account1 + transfer_amount;
    FinalizeUserTransfer:
        account2 := account2 - transfer_amount;
end process;

process Auditor = "auditor"
begin
    DoAudit:
        assert account1 + account2 = 2000;
end process;
High level explanation - the two process blocks model the two independent activities happening here: the user initiating the transfer and the auditor running the query.
This will blow up! The TLA model checker will compute all possible computation states between the two processes as delineated by the statements inside the StartUserTransfer, FinalizeUserTransfer, and DoAudit labeled statement groups, including when the auditor runs before, during, and after the user’s inter-account transfer.
Just look - the model checker has run and blown up on a series of state transitions that got us to the Very Wrong situation we discussed before. The poor auditor has found that Account1 has $1500 and Account2 has $1000. That’s not good!
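You can reproduce the gist of what the checker found with a toy enumeration of the interleavings (a hypothetical sketch, not TLC):

```python
from itertools import permutations

# Enumerate every interleaving of the user's two steps and the
# auditor's read, and check the auditor's invariant in each.
def audit_ok(schedule):
    account1, account2 = 1000, 1000
    ok = True
    for step in schedule:
        if step == "credit":
            account1 += 500
        elif step == "debit":
            account2 -= 500
        else:  # the auditor's query
            ok = ok and (account1 + account2 == 2000)
    return ok

# Keep the user's program order: credit happens before debit.
schedules = [s for s in permutations(["credit", "debit", "audit"])
             if s.index("credit") < s.index("debit")]
violations = [s for s in schedules if not audit_ok(s)]
# violations == [("credit", "audit", "debit")] - the dirty read
```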
Clearly, this is incorrect. We will need to ensure that the database does not allow other queries to read values from within an in-flight transaction. So here, we say “OK, we’re going to wrap these statements in a TRANSACTION block”. But hold up! We need to move that change in system design into our TLA model.
Recall that the TLA model checker can only test state combinations between each labeled state, meaning that statements grouped inside a label are considered atomic operations. Knowing this, we move both transfers to within the same label to tell the model checker “these two operations happen at the same time, as if they were running in a transaction”.
begin
    DoUserTransfer:
        account1 := account1 + transfer_amount;
        \* Collapse this statement with the one above to make them atomic
        account2 := account2 - transfer_amount;
Run the model checker again - it passes.
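To see why, enumerate the interleavings by hand (a toy sketch, not TLC itself): with both updates fused into one atomic step, no schedule can observe a bad total:

```python
from itertools import permutations

# Both account updates happen in one atomic step, mirroring the
# collapsed DoUserTransfer label.
def audit_ok(schedule):
    account1, account2 = 1000, 1000
    ok = True
    for step in schedule:
        if step == "transfer":  # credit and debit happen together
            account1 += 500
            account2 -= 500
        else:                   # the auditor's query
            ok = ok and (account1 + account2 == 2000)
    return ok

violations = [s for s in permutations(["transfer", "audit"]) if not audit_ok(s)]
# violations == [] - every interleaving satisfies the auditor
```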
This was a fairly high level overview on how to write TLA specs. This is much better explained on the Learn TLA site: please read more there!
For the sake of time, I will direct you to some great resources:
I’d love to run through a real world example of using TLA+ to specify a distributed system, loosely based on a concurrency bug we saw recently at Lyft. Stay tuned!
When you think of a software engineering career path, you may default to the idea that you can climb the career ladder at various product companies and corporations, working directly with stakeholders and leadership to ship products to customers. The types of companies you might consider are early/mid/late-stage startups, established enterprises or Big Tech companies.
What you may not have considered, however, is how a stint in consulting can accelerate your learning curve and teach you lessons that can multiply your effectiveness across any engineering organization you join later in your career.
Now you may have a stereotype of a consultant - maybe of a management consultant that flies out to clients five days a week, works 80-hour weeks, lives out of a suitcase and makes PowerPoint presentations all day. If life as a suit doesn’t seem appetizing, that’s okay. That’s not the consulting I’m talking about!
I spent four years working as a developer at an XP software consultancy and… really loved my time there.
Software consulting is defined by working with a client on a short-to-medium-lived project that has a digital deliverable - a software platform, an updated platform capability, or an MVP to show to first customers.
Consulting may also consist of process deliverables - I’ve been on many a project where the entrenched old guard in some industry realizes that the new upstarts are eating their lunch with software, and that they’d better get moving on innovation. That means teaching folks Agile process, or product management.
Now I can’t claim to know everything about consulting, as my experience is limited to one consultancy in my career. However, I can say that it was a huge springboard for my career because it increased my exposure to people and organizations. The following are some of my learnings and takeaways from my time:
You can always make a tech problem work. It’s the people problems that are the hardest: from stubborn, resistant developers who need to be wooed to your side, to surprise stakeholders appearing right before ship date. The keys to project success are almost never at the execution layer.
As a consultant, you will learn to very quickly read the room and understand where the power structures are. There’s the account manager, who’s stuck their neck out to really get your team in the door. There’s the VP eng, who is somewhat skeptical of your team, but who needs to be shown results. This is no different from inside the walls of a product company, where teams need to know where and through whom the power flows - and properly seek to manage that relationship.
As a consultant, you are an outsider, often met with skepticism if not outright hostility. Finding ways to build trust and rapport with your client partners (read: grumpy engineers or skeptical directors) is super important. To that end, it was important to show my face in the office(s) as much as possible and be seen. Making small talk was key, as was grabbing lunch with the team.
As a consultant, you are always grabbing lunch with people. At a product company, you too will learn that it’s important to build bridges and relationships with the stakeholders and collaborators on your team and outside.
Many of us are hired for our domain expertise or Thought Leadership(tm), which would seem to naturally imply that consultants have a lot of power or sway in what can or should be done. After all, they are expensive!
But wait! That also means consultants are often seen as a threat. After all, who has to maintain the codebase after these consultants build their thing into it? You can never roll in and assume that you have the permission of the entire team to build a new system/introduce a new process/launch a new product the way you think should be done.
Even though we held strongly to our product development principles, we would sometimes bend to the customers’ whims because we understood that one size doesn’t fit all. So if the client balks at writing tests a certain way, or if they really don’t want to name the class that name, or if they have really weird preferences around line breaks and indentation - we let it go.
In consulting, the pace of learning is exhilarating. One month you’re working on a kubernetes migration for a Fortune 500 company, the next month you’re dipping your toes in the latest React library, and the next you’re building an iOS app for a stealth startup (oh, and some coworker keeps talking about OCaml or Haskell or something). It’s easy to get caught up in the temptation to choose the latest shiny for everything you build.
My advice - Choose Boring Technology. More specifically, use the “innovation tokens” mentioned in Dan McKinley’s article and choose one (or maybe two) fun, new things to use. Don’t dump the innovation tax on your clients and customers. This discipline is hard to keep. It will serve you at a product company too, as you learn to weigh the tradeoffs of The New and Shiny against the stability of Boring Tech - and since you’re also responsible there for the long-term maintenance of your systems, the lesson may emerge no matter what.
Billing hourly had the result of forcing me to think about where and what I allocated my hours to, every day. My consultancy had a rule to never bill more than 8 hours / day, and thankfully it was modeled from the top that we really would sign off at the end of the day1. You’re forced to really work on the most important things - pushing your product counterpart to ruthlessly prioritize only the most important things. Then you sign off and you just don’t work at night. No emails.
Quite honestly, this is something I find really hard to do now that I’m at a product company. It’s not easy to put down the computer and not answer emails2.
Gosh, I loved pairing. I know that I’m a weird outlier. But I picked up so many technologies, architecture pointers, vim shortcuts, and other random things from all the pairs I had over the years. I was fortunate to really enjoy my coworkers, and by far, this was my biggest growth multiplier in my technical skills.
Now I’m back in Big Company life, I’m known as the engineer who keeps scheduling time to pair with teammates or folks on different teams. It’s the best way to learn a new domain or system, and to also build trust with the person you’re working with.
As a developer, I naturally shy away from the sales and business development process. As a principal engineer in the consultancy, I was often tasked to go on sales meetings with prospective clients to be the face of engineering and also to vet client systems. I gained newfound appreciation for our partners and business development staff and learned ways to properly put value on the process and art of software development. And even though lots of these BD trips ended up without an agreement, it gave me opportunities to meet staff at other companies and get a little window into how they worked.
So that’s my little spiel about how helpful my consulting background has been now that I’m at a larger company.
By the way - Gerald Weinberg said all this stuff better in Secrets of Consulting. It’s a great read - no matter if you’re in consulting or not!
This must have been more of a high-end software consultancy thing, as we were more of a boutique firm with name recognition. Projects were structured in terms of time & materials, so we never had pressure to work nights and weekends to ship X by Y date (opting instead to cut scope). This let us rest easy at night. ↩
Then there’s the matter of being on call, which is pretty rare in consulting. Geez… I miss that. ↩
I grew up on the stereotypical overachiever fast track. My dad was a Silicon Valley hardware engineer who got me into coding when I was in fifth grade, and my passions kept me scripting, coding and building web sites in middle school and high school. When I graduated from Berkeley with an EECS degree, it was pretty clear I was ready to dive headfirst into the industry.
I joined a fairly large company, filled with smart and friendly people. It was a pretty stable, comfortable place. My coworkers organized lots of social events and there was an obvious deep camaraderie between all.
I was the new grad hire on an older team1. I had a great manager and teammates I could learn a ton from. Totally an ideal place to launch a career.
Except… a year and a half in, I quit. I decided I’d leave the industry for a bit. I was happy at work, but I wasn’t OK.
You see, I had just gone through my first big breakup, one that reverberated deeper than I realized. When that relationship ended, I went through a period of deep soul-searching and realized that I needed some time away.
Around that time, some friends let me know that they were entering a yearlong internship at our Oakland church community. I decided that I’d join them that year. And so it went - I moved out of my comfy Emeryville apartment and into the cramped quarters of our East Oakland community center. It was going to be the start of one of the most transformative experiences of my life.
Instead of daily standups behind big glass vistas of the San Francisco skyline, I woke up to daily meditation and time spent in the urban community garden. Where I took for granted the amenities and services of our big glass skyscraper, I was now the one vacuuming, scrubbing and cleaning the facilities2. Instead of spending most of my day with high-earning tech workers, many days were spent chatting (and sometimes squabbling) with our unhoused friends who lived on the church steps.
I know it’s cliché, but having time to step out of the career hustle was so good for me. It was good for the young man that I was, who needed time to focus on himself and rebuild a grounded identity. It was good for me to spend among friends and trusted community. It was good for me to spend a season focusing my energies outward. It was good for my balance and sense of what was normal to see how folks way, way outside the tech bubble lived, especially in East Oakland as we served in the soup kitchen.
I think that if I had not taken that year off, I would have continued in the hustle - lost deep in the bubble that so many of us in tech ensconce ourselves with.
I fully understand that my time spent in East Oakland that year cannot fully be separated from conversations about gentrification and privilege. After all, I had the financial means to take a year off without worrying about debt. And a year later, I re-joined the industry, easily switching back into my privileged life in tech. To that end, the learning continues.
And yet, that year fundamentally transformed me - it gave me a perspective on life outside of the tech bubble. It gave me friendships that have lasted to this day and sweet memories (and uproarious stories) that will last a lifetime.
At 23 years old, I made a good decision to take a year off. I’d say it was worth it.
Let’s get to the point - I’ve never passed the interview loop at a FAANG1 company, and I’ve tried at least 6 times2 throughout my career - Rejection City!
Now the typical loop at these companies will prioritize mastery over algorithms and data structures. This advantages folks at computer science research institutions, or people who have hours every day to grind leetcode3. Because there are so many applicants passing through the pipeline, there’s a pretty low margin of error for any of these interviews.
These big tech companies are inundated with candidates, and it pays them to be aggressive in how they filter candidates out. This means it’s acceptable to have a very high rate of false negatives, or regrettable rejections (in statistical terms, they optimize for high precision and low recall). It also means that there is an entire cottage industry of cram schools and course materials that would rival anything in the SAT prep world.
It’s perfectly normal to fail out of one of these loops because of nerves, blanking out, or simply bad luck with the type of problem you were given. At this point, I’ve received 6 FAANG rejections so far (and counting). Does it sting? No doubt, especially as I consider myself a fairly competent engineer. (I did have some memorable experiences4 though, and the interviewers I’ve met have all been kind and fair.)
Does my record indicate I’m any less of an engineer? Nope - I know what I’m worth and what I’m capable of. And you know what? I’m OK with it!
That’s what I usually ask students or junior engineers at this point - what’s your superpower? Is it your keen collaborative spirit? Your thorough PR reviews, and responsible custodianship of the health of your systems? Your ability to write a thorough doc or tech spec? Your deep knowledge of important domains of web performance or observability or some deep understanding of the business?
These might not have a chance to shine in your next FAANG interview.
I know. It sucks. It’s their loss they didn’t design their interview loops to let you shine. And if they don’t, take your awesome self and go apply at a different company. The sad thing is that this can be so much better across the entire industry!
I’m a big believer in making interviews work like actual day-to-day coding practices, over demonstrating algorithmic prowess. I’ve been part of some really well designed loops that:
It’s the reason at Lyft we’ve designed our Apprenticeship interview loop to be specifically more collaborative. And each day I hear more and more about people working on well-designed interview loops.
Let’s end with reasons you might have a better career path elsewhere:
It’s a hot market, and people are fighting for talent right and left. Don’t limit yourself to just FAANG (or adjacent). Do your research and find the right company that fits your strengths and skills. Now go forth and interview - good luck and go get ‘em.
And finally - a conversation I came across recently on Twitter (the author is a prolific author in the Ruby community):
🧵This should stop. Such practices are not inclusive unfortunately. Believe it or not but for some folks any form of a "test/quiz" can be triggering. My brain shuts down when somebody tells me to do a test in front of them. This is what trauma does to people. https://t.co/0Q3Aqz3nbh
— solnic (@solnic29a) February 5, 2022
What if there was a radically different way to interview people - in a way that doesn’t trigger test or performance anxiety?
FAANG: Facebook (Meta), Amazon, Apple, Netflix, and Google, but not exclusively limited to this club. Mainly the halo circle of prestigious companies offering top tier comp and staggering stock returns. With Meta’s recent rebranding, now alternatively called “MANGA”. ↩
I’ve applied to at least one FAANG company each time I’m in a job-hunting phase, and never passed hiring committee review – with the exception of a brief internship at Apple, which I’ll confess I didn’t do great at either. But that’s a different blog post altogether. ↩
“Grinding leetcode” - to hit the books like you’re studying for finals. Leetcode users have levels and rankings, and people brag about how quickly they can solve hard problems (and how many hours they spend on the platform). However, consider the time investment required to better oneself on the platform, and the types of people this would exclude. ↩
Once, when interviewing for an internship at Facebook I did manage to learn that Joe Hewitt (of Firebug fame) was working next door and I managed to shuffle out of my room to shake his hand… nice guy. ↩
2014 was my first year as a manager, and it had been brutal. In addition to feeling the (normal) overwhelm of transitioning from technical IC to manager, I was also struggling with performance managing one of my direct reports, who was pushing for a title and compensation bump and running into process and procedural hurdles from the company. I felt trapped and caught in the middle and completely out of my depth, losing sleep night after night wondering how I was going to make this happen.
I worked with senior leadership and our head of HR to work through the logistics of this process (because reasons). This specific case was thorny because things weren’t straightforward on both sides. My report had gone about the process in a way that turned messy, but the company itself hadn’t formally defined a career ladder, so it was kind of on us.
I’d show up multiple days in a row to work with our HR director to push the process forward and keep her apprised of updates in the process. She’d give me input on how I was handling the process, and I’d run that back with engineering leadership to figure out a way forward.
The back and forth was exhausting.
Finally, we got it done. I delivered the news of the promotion and title bump to my direct report, plus the constructive feedback I needed to deliver. I was drained. I walked back to our HR director and let her know the news, expecting a perfunctory acknowledgement.
She thanked me for the news, and on my way out, she stopped me.
“Andrew, you did a good job with this. It wasn’t easy.”
I thanked her for the compliment.
She looked me in the eye. “You’re going to be a CTO one day.”
I nearly laughed in the moment, but thanked her and walked out. What did she know about me? If management was this stressful, no way in hell I wanted to be a CTO.
I thought nothing more of it in the moment, just glad to be done. But in the years to come, I’d go back to that moment in times when I doubted myself. The words “You’re going to be a CTO” weren’t meant to shoehorn me into a specific vision of the future, but to tell me, “I see you have the potential to rise to a level of leadership that you can’t yet see for yourself.”
Truth be told, I didn’t think I was really cut out for leadership. I didn’t think I knew how to handle management, nor handle messy situations well. There was much to critique about how I had handled things. But a few well-placed words at the right time from the right person changed my trajectory and fanned a little ember of self-confidence in years to come.
These days I try to do the same for my mentees and sponsees. I try to have radical candor when giving people feedback. And when I see glimpses of them rising to the occasion, I tell them, I believe in you. You may not know it now, but you will succeed.
What made 2021 so difficult? 2020 was tough enough, but I felt like I ran it all on adrenaline and we survived. 2021 felt like it opened with a glimmer of hope but then it quickly fell apart again.
In some sense, 2021 was a big success. I got promoted at work; I had multiple speaking engagements and was able to move the needle on several important initiatives. I received validation of my work and my leadership. We welcomed our second child at the end of the year.
But in another sense, 2021 was incredibly draining. It was the year the world collectively realized that the pandemic was here to stay, and the psychic toll that took on us was heavy. It required we suffer through the ever-blurring line between personal and work life. It was hellish to figure out how to raise a young kid in these times.
One more: my family learned the news that my mom has late-stage pancreatic cancer. Suddenly, what mattered most came into focus: family came first, and nothing was more important than being close to her for however long we had.
And about our second-born: our first experience with our firstborn broke us - much of it due to our physical distance from family. We knew that the second time around needed to be closer to family - both for their help, but also their encouragement and love.
The conclusion was simple: we immediately made a move from Oakland to San Diego to be near my mom and the rest of our families in Southern California. Fortunately, there was the silver lining in COVID remote work - that such a move was possible without having to risk my job.
However, the increased concerns in my personal life started to nudge into my work life. Working past 6PM wasn’t an option anymore - I was needed at home. Instead, I logged back on later in the evenings to make up the time. I started to decline opportunities I would have jumped at before - conference speaking, or networking events. Side projects and professional reading lay fallow.
At first, it felt really shitty, like I was limiting my career growth due to the pesky realities of personal life. But upon reflection, I realized I was given the gift of focus, and the opportunity to say no.
Much of my early career had been characterized by me saying yes to any opportunity that came my way. The opportunity to jump into engineering management, or the opportunity to lead a big project for a big client, or take a speaking gig or do a conference talk. These things were all well and good. But the cost of saying yes to everything is that you are not in control of your own time, energy and emotional state.
I recently read an article by Steve Magness titled “Own Your Distractions So They Don’t Own You”. In it, the author discusses how our lizard brains fall prey to modern life in the “candy shop”, full of digital distraction. If we live without intentionality, we fritter away our energy and our health, far from our rooted center in healthy relationships.
So back to the work aspect of things. I titled this “Overproduction” because, well, I’ve frankly worked a lot this past year. Much of it has been incredibly fruitful, impactful, and fulfilling. Some of it, if I’m honest, has not been the best use of my time. I have been wondering how things would have turned out if I had been a better delegator, or had used my “no” muscle more.
While I’m on paternity leave, my goal is to reorient myself both personally and professionally. I’ll speak to the latter here: I’m going to do a retrospective for myself on my work life. I need to figure out where I’m going, and what is worth my time, and what isn’t. I have less of it than ever these days, and the time I do have needs to be put to good use. We’ll see where that leads.
Until then, I have a few queued up posts that I’ve been working on that I’ll release weekly. Let me know what you think on Twitter at @andrewhao!
This post originally appeared as a guest article on LeadDev.com titled “How to Expand Your Scope as a Staff Engineer”.
You’ve been a solid senior engineering lead for the past several years at your current company. You’re well respected among your teammates and have a solid track record of shipping impactful products and features. However, you can’t help but shake the feeling that you’re stuck in your career growth and that your prospects are limited where you stand.
Your mandate as a staff engineer is to have a deep impact across multiple teams and the organization, but the road to get there is unclear. Maybe your position on your current team limits the types of projects you can execute, or your manager is too busy to help you grow. Perhaps you’re a new staff hire, and struggling to navigate the landscape of the organization and looking for the most impactful places to operate. Or you may be experiencing the opposite problem: underwater with a flood of small projects that aren’t really large enough to get you where you want to be.
These scenarios all share a common thread - the scope and influence you currently hold is not large enough to tackle the deep, cross-cutting projects you want to lead as your career advances. Let’s discuss how you can get there!
For some of us, the thought of growing our influence may conjure up bad experiences at dysfunctional organizations, where kingdom-building and power games were the norm. For others of us, growing influence feels like a zero-sum game: To grow my scope, I need to be taking it away from someone else. And for some of us, we experience icky feelings of dread that we’ll need to cozy up to people in authority. Building influence or scope feels intimidating, aggressive, or unnatural to our collaborative instincts.
On the other hand, we know that we can’t just sit on our hands. In an ideal world, we want to believe that if we just quietly do the work, we’ll naturally get noticed and people will give us credit. In this ideal world, people would step aside to create opportunities for us when we’re deemed ready. Unfortunately, good work is not always noticed, and your internal ambitions are not always recognized.
The good news is that there is a third way - a way where you can be responsible for your own growth and trajectory, without the power games. Growing your influence can be done by naturally leaning into collaboration – here’s how.
At this point in your career, it is a given that your technical skills are strong. They have served you well up to this point, but they will not (usually) be your primary means of growth in this phase of your career. Instead, it’s your relationships and connections that will serve as the catalyst for your growth.
As a Staff engineer, your responsibility is to understand what’s going on at all levels of the organization, linking leadership strategy to what’s happening on the ground. To that end, you’re going to need relationships and touch points that can give you insight into what other teams are doing. You’re going to need to meet people outside of your circle that can help you see the other parts of the organization that you’re not seeing - and fill out the context you’ve been missing.
Not only that, you’re building out relationships with other teams that you can informally lean on if you need a favor done - or provide help to someone else when they need something from you.
The word network is no doubt loaded with notions of clammy hands, awkward small talk and unwanted inbound LinkedIn messages. Once again - it doesn’t have to be this way! Instead, consider a few updated ideas for the modern, remote work world.
There will be people that are immediately obvious to connect with. For example, you may want to set up recurring 1:1 syncs with leads on adjacent teams in your immediate group. In these syncs, consider filling each other in on your team roadmaps and the common challenges you face. Some of these conversations may be fertile ground for identifying problems that can be solved.
Other people worth connecting with may be peers in adjacent organizations - for example, engineers on platform teams may want to reach out to leads on product teams that are customers. You may want to consider networking with peers who share the same function as you (iOS/Android, frontend, data science, etc). Ask them what challenges they face, and compare notes on any gaps or opportunities you might see to be filled in your respective roles.
Finally, you may want to schedule time with your skip-level manager or a member of the leadership team. Consider asking them questions about the state of the organization, what challenges they face, and what their top priorities are (see Will Larson’s excellent blog post “Staying aligned with authority”).
Many of us aren’t comfortable revealing our career ambitions to others. However, holding back on conversations with your manager or a more senior sponsor along the lines of “I want to grow my scope so I can get a title at the next level” or “I’d like to have a greater role on Project X” will limit their ability to help you. Leaders in organizational authority roles are in the room where decisions are made - and you want them to be aware of your goals so they can position you for that new project or initiative that could help your career break out.
“People who are in the more senior role that you want also have their own goals and career aspirations. While it can be intimidating to ask someone ‘How do I get your job?’, remember that they probably don’t want to hold onto that role forever,” advises Ashley Kasim, a Staff Engineer at Lyft. “Uplifting me is a part of their journey to get to where they want to go too. Now that I’m in my current role, I’m also trying to grow my replacement. It’s about mutual benefit.”
It’s a bit cliche to say, but Dale Carnegie’s advice from How to Win Friends and Influence People still stands today: the best way to build your influence is to freely offer your genuine self. Offer to do a favor for a team that’s feeling crunched. Share your time as a mentor or sponsor for someone who needs it. Celebrate and elevate the wins of teams around you. Make sure you’re really listening to them as they share their wins and their struggles. Remember the details of what they share - from teammates’ names, to the particular challenges they face on specific projects. Do this freely, with no strings attached. Come crunch time, you may be surprised at how easily many will return the favor.
In today’s remote-work environment, it’s important to remember to make personal connections with people. It is too easy to start meetings by diving straight into business while forgetting to connect with the human behind the face on the screen. Personally, I treasure the chance to make small talk. I love learning about people’s vacation plans, or taking a few minutes to rabbit hole in a shared interest, or sharing a laugh over a funny story heard the other day. This breaks the monotony of back-to-back calls and also opens an opportunity for camaraderie, levity and connection. Ultimately, these little actions build trust - the raw currency you need for effective operation at your level.
As you build your network, you’ll also want to identify potential problem spaces that may be opportunities for your growth. Here are a few ideas.
In addition to your 1:1s, position yourself to receive information pushes from different parts of the organization. Join Slack chat rooms of other teams, where you can get a pulse for project status or updates. Add yourself to email distributions where you can receive project updates asynchronously. Attend an all-hands meeting for an entirely different group.
By being everywhere, you may be able to connect the dots on changing product strategy in another group that is upstream from yours, or jump on a new platform that has collaboration potential for you. You are now in a unique position to spot problems and opportunities across the company.
Your job is to synthesize that kind of information and use it to create new innovation opportunities for yourself and your team.
Opportunities can often be found in the seams where one team ends and another begins.
Another way to come up with impactful projects is to do an assessment of your personal strengths and follow them to see if there are opportunities within the company. Maybe you’re a gifted teacher and coach, or you’re a deeply technical data scientist in your team domain. Now take a completely different axis of the business and imagine what it would look like to offer your leadership there.
By now, you no doubt have a large list of potential opportunities or projects to tackle. But wait - you don’t get to tackle them all at once. You need to be strategic about what to advocate for and how to build the case to get the green light.
More likely than not, you’ll have more than enough opportunities to start up new projects, or contribute to high impact opportunities. However, it’s important to make sure those opportunities are the ones that are aligned with the goals of your organization.
Are you on a product team, but you see a glaring need for an infrastructure improvement? Instead of offering to build a grand, universal solution for the whole company, you may consider building out a local proof of concept for your team - then work with the platform team to integrate your work with theirs.
A good question to ask yourself is: if I take on this project, does my organization or group move faster? Some of the reasons for this are pragmatic to your career development - your peers and team leadership are the ones who will be validating your work, and arguably the ones who know you the best. Other reasons are pure Conway’s Law concerns - you will succeed most when you are working with the systems and the teams you’re most familiar with.
However, don’t be afraid to push the boundaries of your org scope. After all, it’s at the seams that opportunities can be found!
When advocating for expanding your scope, you want to create buy-in. I prefer a style of collaboration where we lead with the mindset of solving together. Imagine a situation where you are advocating for a solution that moves into another team’s domain: rather than presenting a finished plan, bring the problem to the other team early and shape the solution with them.
When proposing increased scope, be ready to receive a polite rejection - it’s completely normal, and it gives you the green light to move on to your next project idea. Be aware that if you move into someone else’s scope, you will almost certainly be creating more work for them - so make sure that the thing you offer is a clear win for the other side.
Finally, the tactical part of the picture. Building scope may involve running a few “plays”, or actions, that help you build the scope you are looking for.
In my experience, this is the most common sign of operating at a Staff-plus level - running a multi-team project that ships something complex and important for the larger group. You will want to work with your manager and your network to get on a project that has a surface area at the organization-wide level. These types of projects tend to have multiple phases, require buy-in from teams across the org, and have visible impact to OKRs for the group. As a lead, you will want to be at a level where you are delegating to the team and helping unblock or clarify project timelines, dependencies, and status outward to relevant stakeholders.
It’s important to not be coding in the critical path and end up neglecting your responsibilities around product leadership, technical guidance and helping the team make crucial architecture decisions. That’s why delegation will be your superpower here.
Your manager or a peer in an adjacent org may inform you of a project that is seeking additional help or leadership due to circumstances that are risking its delivery. Consider joining the project as a player-coach - as you help steady the delivery of the project, you’ll be building domain knowledge outside your immediate team and leveling up coworkers on other teams. This domain knowledge - and the relationships you build outside your world - will help you expand your reputation.
Have you built a solution that solves a problem for other teams, such as a machine learning model, a UI library, or an API? You may want to mature this offering into a general solution for other teams to digest and consume. Much has been written on this topic, but in general you will want to try to solicit concrete use cases from one or two early adopters, who you can use to gradually evolve your system from a bespoke solution to an extensible platform.
Is there a skill gap that you see across a functional role? Maybe there’s an opportunity to upskill engineers by leading workshops or starting a community of practice, whether that be around clean coding practices, testability, performance, a new tool, software language, or framework. The best thing is that you don’t have to be an expert to accomplish this - you can pitch this as a way of learning and upskilling together, and you can take the lead in assembling the team.
Perhaps there are ways to improve ways of working by tweaking an agile practice or process that has long stopped working for people. You may notice a gap in how you conduct hiring reviews, or see a need to implement an architecture review process. You may champion new programs, such as starting a hiring pipeline from nontraditional career backgrounds. As is the case with organizational change, make sure you have clear buy-in and support from leadership and other stakeholders before proceeding.
It’s tempting to start from the tactics and think that Staff+ career advancement just means executing and shipping bigger projects. But the reality is that the relational groundwork has to come first - it’s your network, alignment, and trust that make those projects possible.
Take some time to follow some of the prompts listed above and do your own introspective work. Is your manager or sponsor aware of your ambition? With whom do you have relationships within the organization - and where might you need to be? Where might you plug in to receive information flows? How can you be helpful to others around you?
Growing your network, influence and scope is like nurturing a garden - your progress is hidden for a long time while the roots form underground. However, there will come a day when you get to reap the fruits. Take your time, have patience, be kind, and stay strategic. You’ll soon be going places!
This post originally appeared as a guest article on LeadDev.com.
It hasn’t been a good Monday afternoon. A remote team checks something into the Authentication service codebase that breaks the User Profile service owned by your team. By the time the deploy is rolled back, customers haven’t been able to log in to your product for several hours and news outlets have picked up the story. The CEO is on the line, demanding answers. You send hasty apology emails to your customers, then sign off for the day, exhausted.
In the days to come, everybody has an opinion about what could have gone better. Some suggest the teams needed better external documentation. Others suggest that a signoff process should be enforced and project management should get more involved. Another director suggests starting yet another architecture review committee. Although all those sound like good ideas, something nags at you - what’s actually happening here?
When we think of a software system, we often conceptualize it in its dynamic, operationalized form in production. We ask questions like, how many transactions per second can it process under load? Is it meeting SLO targets? But we also need to remember that software systems take a form much like written communication between the developers working on the system. It is just as important for a software system to have several nines of uptime as it is for it to be readable and understandable to the engineer poring over the code, trying to make sense of its shape.
After all, we know that engineers spend more time reading code than writing it. If an engineer makes a change to a system with an incorrect mental model, then defects will emerge. And the team’s mental model is indelibly imprinted in the code, the tests, and the documentation.
Software’s purpose is not just to achieve business goals for the company, but to be easily changeable, elegantly designed, and robustly tested for all current and future members of the team. In other words - one of the primary purposes of software is to guide its readers into constructing a mental model of the world and how it works.
What if we tried to imagine our systems as if they were time capsules - artifacts left behind for future teams and collaborators to sift through and understand? I believe that software systems can be thought of as textual artifacts, messages in a bottle, if you will, to a future teammate or cross-org collaborator, meant to convey the shape and meaning of the system in this current point in time.
To do that, I want to take you on a detour through semiotics, a field of linguistics and communication theory that deconstructs meaning in everything from literature, TV ads, political messaging, and Internet memes. What could that have to do with software development?
Emerging from the work of Ferdinand de Saussure in the early 1900s, semiotics is the study of signs and symbols as they make meaning in cultural communications. Saussure was a linguist interested in how meaning was constructed through language. In Saussure’s model, meaning is constructed by a Signifier, a concrete “thing” in the world, and its corresponding Signified concept (a connoted meaning). Take the example of this image:
Photo by Carlos Quintero on Unsplash
The Signifier is the representation of this rose on your screen, pixel by pixel. By itself, it doesn’t communicate any meaning. Now you, the viewer, see this image and may think to yourself - ah, a rose! and automatically think about the flower (the Signified concept). How did you know that? You have familiarity with this type of flower in your lived experience, having seen roses in flower shops and in gardens around you.
But if you saw this image of a rose on a highway billboard for a jewelry store or in an online Valentine’s Day floral service ad, you may see different layers of meaning. You may understand this rose as signifying the concept of Romance, Love, or Passion, based on your cultural understanding of roses and the role they play in tropes across movies and TV.
But just one second - if you were from a non-Western background (or an intergalactic alien being), this image of a rose may mean nothing to you!
| Signifier (Concrete) | Context | Signified (Concept) |
| --- | --- | --- |
| An image of a rose | On a billboard for a jewelry ad | The concept of “Romance” |
| An image of a rose | In a gardening textbook | The concept of the flower known as the rose |
| An image of a rose | A non-Western context | ? |
Saussure developed the beginnings of a framework for how meaning is communicated and understood through concrete artifacts in the world. This framework of semiotic analysis allows us to deconstruct a message into constituent parts - the actual form of the idea in a concrete form, its metaphorical or connoted meaning, and the role of the receiver as the message is parsed in context.
So what does this all have to do with software development? Semiotics pinpoints the hidden role of cultural assumptions of different viewers in different contexts looking at the same things! Let’s get practical and try to apply some semiotic thinking to highlight how divergent understandings can arise from seemingly innocuous features of our systems.
The first and most obvious place to apply semiotic thinking is in the naming of the concepts made concrete in our software. Go through your system and make a list of the concepts encoded in class names, variable names, functions, and even database tables. You may be surprised at what you observe.
For example…
| Signifier (Concrete) | Context | Signified (Concept) |
| --- | --- | --- |
| The User in the database | The Identity team | A record that maintains core account authentication attributes |
| The User in the database | The Core Product team | A record that maps to the user’s unique presence in the world as they use the core product - for example, a place to store Profile information |
Aha! That’s a key insight - that the perspective of a teammate in the Identity organization leads them to understand the User model slightly differently from a teammate in the Core Product organization.
To tackle this challenge and make these cultural assumptions explicit, draw from disciplines like Domain-Driven Design, where the nuances of language in business contexts are made explicit in code - for example, by giving each team’s meaning of a shared term its own explicitly named model within its bounded context.
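As an illustrative sketch - the class names, fields, and context split below are invented for this example, not taken from any real system - here is one way the two teams’ divergent meanings of “User” could be made explicit as separate bounded-context models in Python:

```python
# Instead of one shared, ambiguous "User" class, each bounded context
# names its own meaning of the concept. (All names here are hypothetical.)
from dataclasses import dataclass


# Identity context: "User" means the credentials used to authenticate.
@dataclass
class AccountCredentials:
    account_id: str
    email: str
    password_hash: str


# Core Product context: "User" means the person's presence in the product.
@dataclass
class MemberProfile:
    account_id: str  # the shared identifier linking the two contexts
    display_name: str
    bio: str


def register(email: str, password_hash: str, display_name: str):
    """Create a record in each context; neither model leaks into the other."""
    creds = AccountCredentials(
        account_id="acct-1", email=email, password_hash=password_hash
    )
    profile = MemberProfile(
        account_id=creds.account_id, display_name=display_name, bio=""
    )
    return creds, profile
```

The point of the split is that the only agreed-upon contract between the two teams is the shared identifier; each team’s assumptions about what a “User” is stay inside its own named model.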
The structure, or form, of our software systems is another concrete feature that communicates meaning. Consider the choices embedded in the architecture, platform, and runtime features of our system.
Many of these choices are not obvious to our new teammates or collaborators, who may have their own cultural or organizational constraints. “Why is feature X built with Y?” they may ask. If you don’t answer these questions up front, they may project their own assumptions incorrectly on your systems.
For example:
| Signifier (Concrete) | Context | Signified (Concept) |
| --- | --- | --- |
| The (undocumented) User Profile API | The Identity team | A protected API endpoint that is meant only for use in manual data migrations. If misused, it will emit invalid events that could corrupt the event store. |
| The (undocumented) User Profile API | The Core Product team | An API endpoint that allows our service to store profile data. |
To solve for this, explicitly write all architectural and structural decisions down.
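One common lightweight way to do this - my suggestion here, not something prescribed by this article - is an Architecture Decision Record (ADR): a short, dated document kept next to the code. A typical sketch, using the hypothetical User Profile API example from above (the number, date, and details are invented):

```markdown
# ADR-014: User Profile API is restricted to manual data migrations

## Status
Accepted (2021-08-10)

## Context
The User Profile API emits events into the event store. Invalid events
written by callers outside the Identity team could corrupt downstream
projections.

## Decision
The endpoint stays protected and undocumented by design; only the
Identity team may call it, and only during manual data migrations.

## Consequences
Product teams needing to store profile data must use the public
profile-write path instead. Revisit this decision if the event store
gains input validation.
```

A future reader who stumbles on the undocumented endpoint now finds the cultural context recorded, rather than projecting their own assumptions onto it.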
Through our crash course in semiotics, we’ve learned to identify structures in our software systems that are open to (mis)interpretation. By thinking ahead about how different teams in different parts of the organization parse and interpret our systems and flows, we can anticipate the ways we might end up with divergent mental models - and develop habits to document the hidden cultural forces that shape our code, our architectural decisions, and our software use cases.
Your software system is a message in a bottle. When a stranger in the future picks it up on some faraway sandy shore, what stories will it tell?
A friend introduced me to a peer from another company last year who was having trouble adjusting to his new role as tech lead on his new team. We sat down at a local coffee shop to talk it through.
“My team just doesn’t know how to do Agile,” the engineer stated. “I propose all these process changes but people are skeptical.”
Having myself been hired by my employer as a team lead, I knew how frustrating that could feel. I pressed on with some more questions about why the process was off.
“They’re pre-assigning stories at the start of the sprint. They estimate with time instead of points. And they’re all working in individual silos,” this engineer sighed.
There was one last question I had to ask - “Is there a specific problem that’s happening as a result of this broken process? Is anything actually wrong?”
There was silence as my conversation partner thought further. “Not really,” he allowed. “They seem to be working out all right.”
“Does your manager think anything’s wrong?” I asked.
“Hm, not now that I think about it. He’s understanding of what I want to accomplish, but the team’s been together for a long time and this is just how they work. People are willing to make the changes, but they’re skeptical of what benefit it would bring,” he acknowledged.
Kent Beck, in an interview on the Software Engineering Daily podcast, describes coming to Facebook in 2011 and having his fundamental assumptions about software engineering challenged (for those who are unfamiliar, Kent Beck is one of the original authors of the Agile Manifesto and among the earliest advocates of XP and TDD):
Interviewer: When you joined Facebook, my understanding is that around that time, Facebook really didn’t have much testing. It’s ironic, because you were the creator of extreme programming. It was highly dependent on the process of writing unit tests and then writing the features. Facebook… was able to be successful, despite the fact that they wrote their features before they wrote their tests.
Kent Beck: The answer that I came to is that… one is, how many of your problems can you test for and how many problems only show up in production? … If you can’t write unit tests for it… The Facebook answer is don’t.
The second part of the answer is tests are a form of feedback and Facebook engineers had many, many other forms of feedback [describes rollout process].
Then the tests that I had written broke almost immediately. They were deleted. That was one of the things that surprised me. If you had a test and it failed, but the site was up, they just delete the test… If you eliminate this noise production, per definition the situation is clearer all of a sudden.
Mr. Beck realized that, despite having literally written the book espousing test-driven development as best practice, his perspective was limited at best, and he needed to recalibrate to the context of his new company.
Many of us carry our belief systems and playbooks with us as we progress in our careers. Who hasn’t watched incoming senior leadership, fresh off success at Company X, attempt to implement that same process at new Company Y, only to see it fizzle out?
Why would that happen? After all, weren’t they hired to replicate their success at their prior position in the new position?
When I joined my current employer two years ago, I had what I thought was a list of unbreakable “rules to work by” and firm conceptions of best practices. In discussions with my to-be manager in the hiring process, I was brought on with a mandate to uplevel the team. And coming from a background in consulting with Very Successful Outcomes™, I had a very specific set of practices and dogmas that should Always Be Followed.
In my first few weeks on the team, I could already see that we were violating every single one of my principles. Work was pre-assigned. Engineers took individual projects. Tests were sparse. The PM asked (in my opinion, pressured!) individual engineers for project dates on a regular basis. Points were individually given and tied to a time scale. Egads!
I could have gone to my manager and had the “Hey, I’m here to change everything up” talk. But I knew it was wiser to wait and observe some more. And sure enough, I was surprised.
I had assumed that it was always advantageous to treat the team as a similarly-skilled set of work executors to be able to build work in the style of XP or Kanban. I felt allergic to the idea that the team might keep individual ownership over a particular project or system. In my mind, that meant a low bus factor.
However, as I observed the team, I realized that ownership and doing the implementation work were not the same. On a team that is gelled and high-performing, an “owner” of a project may also invite other teammates to work with them on it. Ownership meant accountability and leadership, not sole execution.
In another example: in past roles, I had pushed back hard against any marketer or PM who asked me for date updates (“So when’s Project X going to launch?”) or required me, when writing a tech spec, to fill in expected launch time frames. Here, I swallowed my pride and, after much discussion, deliberation (and padding) with the team, came up with dates we could live with.
To my surprise, giving dates didn’t kill me. Dates, at this company, were used as sight lines rather than cudgels. If a project slipped, then it was communicated early and everybody adjusted! I had failed to take into account that my prior experience with deadline-driven development was an unhealthy one, and I needed to recalibrate my experiences with this new team.
There’s a reason so many management gurus espouse going on listening tours for new leaders before starting to execute. They need context, but they also need to see their new teams with fresh eyes.
I’ll leave with this excerpt I found fascinating from an interview with Nick Caldwell (VP Eng Reddit) who talks about making the shift from his time at Microsoft to a small team at Reddit:
Nick: Your natural assumption is to take whatever works in your previous roles and use it as a template… I view process more as a set of tools I carry around with me… you need to spend the first couple weeks just listening to the problems people on the team present to you. Then, you can dig around in your process tool bag, figure out the right tool for the job, and adapt that tool to fit the situation.
What tools, processes, or sacred cows do you bring along with you, and is it time to re-examine those?
In Part 2, we built a bottom-up, Idea Backlog-driven generation engine. But while we now have swarms of ICs working on their own projects, the team still lacks organizing principles and a way to keep moving forward toward the right goalposts. In other words, we still need to solve the same problems that a PM would normally solve:
We thought about what PMs do for teams and broke the role up into four jobs, or “hats”, which four volunteers on the team manage. It looks something like this:
In addition to taking on Project Roles, teammates also rotate through Team Roles that reflect one aspect of product ownership. Let’s go through them one by one:
In one sentence: The Messenger ensures that necessary stakeholders are informed of team progress.
It looks like: Each week, the Messenger posts a team update to a Slack channel describing the achievements of the team that past week. The update will include relevant links to backlog items, tech specs, and experiment results. The update highlights upcoming work and current project blockers.
Messengers are responsible for staying abreast of team members’ work status, aggregating and summarizing it for a larger audience.
Messengers may also represent the team’s work at cross-team meetings, such as product reviews and higher-level syncs.
If the Messenger were not doing their job: Stakeholders would be unaware of the work (and the wins) that the team is accomplishing. The potential for organizational confusion and duplicate work would rise.
In one sentence: The Scrum Master facilitates our planning-oriented Agile ceremonies:
The Scrum Master doesn’t have any explicit decision-making powers, but ensures that these meetings run smoothly. Their laser focus is to help the team develop a beautiful set of backlog items.
It looks like: The Scrum Master works with each teammate to write well-defined user stories. This means that this individual must be well-read on Agile story-writing and understand how to facilitate sprint planning and estimation activities.
(Note - Agile purists will point out that this is not technically the definition of a Scrum Master, but we liked the name and it stuck).
If the Scrum Master were not doing their job: the work necessary to develop a clean work backlog would fall through the cracks, leading to disorganization, uneven story definitions and unclear team metrics.
In one sentence: The Architect makes prioritization decisions for the team.
It looks like: The Architect thinks hard about the overall team roadmap and which initiatives need prioritization (and, conversely, deprioritization), and then makes the calls for the team. The Architect is likely the team member who already operates at a strategic level - most likely a more senior member of the team or someone in management.
This team member, throughout the week, makes prioritization decisions in the backlog, bringing up projects and stories that have immediate impact or urgency.
The Architect is responsible for maintaining relationships with other Product Managers in the organization to stay abreast of strategy and product discussions. This means the Architect must be sure to represent the team in formal product reviews.
On my team, my manager plays the Architect, and it’s clear that their natural relationships with product and engineering leadership make them the most qualified for the role. That doesn’t mean the Architect must be someone in an official management/leadership role, but it certainly makes the job easier.
If the Architect slacked on the job: the team would have no runway for new projects, or would spin their wheels trying to identify the most important work to do. The team might be unaware of broader strategy discussions, or chase after non-impactful projects.
In one sentence: The Dreamer’s job is to facilitate the Idea Generation engine.
It looks like: Our team maintains a separate list of new ideas that we vet together weekly, and the Dreamer’s job is to make sure that list of ideas is prioritized accordingly and well-defined.
In other words, the Dreamer’s job is to make sure the Idea Engine is running smoothly. They are the Scrum Master (the process facilitator) for the intake engine.
The Dreamer constantly reminds the team to input new ideas into the engine, and also facilitates group brainstorms and other idea-generation sessions.
One thing that I’ve noticed is that “idea generation” is not a natural thing unless you are deeply tuned into customer problems. The Dreamers who are most effective are those who can keep the team in tune with real customer pain points.
This might mean collaborating with a UX researcher to summarize findings for the team, or even doing a deep-dive onsite interview with a new customer. It might mean shadowing a support agent for a day to understand issues that customers face, so they can return to the team and broadcast some real challenges that need addressing.
If the Dreamer did not do their job, then the team would run out of impactful projects to implement, or would work on shallow work that doesn’t truly solve a customer problem.
We’ve been running this process for nine months now, and a few findings have emerged:
It’s very possible to operate without a PM! Months of product-owner operation have demonstrated to us that it is possible to run a bottom-up engineering team process. Our overall team output has not wavered much, despite the additional overhead of managing our PM responsibilities.
The team feels empowered to make an impact - the culture of the team is encouraging and positive. Ideas are welcomed and celebrated, and people report feeling empowered to make ideas a reality, in a culture that embraces new, risky, or unproven ideas.
Lacking a seat at the table: Even though our team can operate independently, we often lose out on having a seat at the table when the rest of the PMs gather to formally discuss strategy and have syncs. We sometimes miss the early conversations, leaving us a step behind emerging product strategy. This means we must work hard to develop relationships within the PM org.
Idea Generation is unnatural: We’re still building the Idea Generation muscle into our team memory. For years, most of us have worked by having work handed to us by a PM. When given the opportunity to experiment and try out new things, we need to be in the right mindset to try that out. As a team, we are learning to understand customer pain points better so we can be nudged to experiment and try out new solutions.
Wearing a PM Hat is uncelebrated work: The additional workload of product management is a significant time commitment for our team. Depending on the role, this may take anywhere from 10% to 40% of their time. Many of our engineers (including myself) who have internalized an idea of “real work” to mean “execution” need to recalibrate our expectation of productivity to include the full lifecycle of events - from ideation to execution. As a team, we’re learning to celebrate the journey of building our PM muscles.
Can you really take a PM and split them up into four roles that cleanly - and divide the roles among a distributed team? Not easily. But it can be done. And when it’s done well, it leads to higher engagement and impact on the team.
In our last installment, we saw how the Idea Backlog was a great tool to generate new ideas. Now that we’ve got great, well-defined ideas with clear measurement metrics and juicy opportunity sizes, how do we execute on this work with the team?
The project driver is the individual responsible for the outcome of an experiment or project.
The driver may do things like:
The driver does not:
We conceptualize the lifecycle of a Project from a Project Driver’s perspective into Plan, Build and Measure phases:
In the Plan phase, requirements are still being gathered and relationships are being built. This is the “soft” part of the project - we are trying to convince folks to help us or commit to collaborating with us.
A tech spec may be written and circulated among relevant teammates. Clear experiment hypotheses or measurements for success are defined and documented.
In the Build phase, we actually build the systems and write the code that support our feature. There is a project management angle to this as well - requirements are written as user stories and logged into JIRA or the team’s system of choice.
The team (or teams) build against this backlog.
In the Measure phase, we launch the product and we let it bake, gathering metrics along the way. The product should ideally be launched in a manner that facilitates A/B testing so we can accurately measure the effect of the change.
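One common way to launch in an A/B-friendly manner is deterministic hash-based bucketing, so each user sees a stable variant without having to store assignments anywhere. A minimal sketch (the function and experiment names here are hypothetical illustrations, not any team's actual experimentation infrastructure):

```python
import hashlib

def assign_variant(user_id: str, experiment: str,
                   variants=("control", "treatment")) -> str:
    """Deterministically assign a user to an experiment variant.

    Hashing (experiment, user_id) keeps a user's assignment stable
    across sessions and independent across experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]
```

Because assignment is a pure function of the user and experiment IDs, the analysis job can later recompute each user’s bucket when measuring the effect of the change.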
Once we release our change and enough time has passed, we report our results to relevant stakeholders and mark our hypotheses as Validated (or not). We then turn around, use these learnings for a second iteration of our hypotheses, and return to the Build phase to sharpen our focus.
Note that not all experiments or product launches are destined to live another day. Some should be reverted or axed. Others will live on as future product ideas or learnings to apply to a different domain.
(Astute observers may see that this process is a rebranded version of the Lean Startup Build-Measure-Learn loop.)
In skills-diverse teams, there will be a natural gap between engineers with different amounts of industry experience. We believe that all engineers, even new grads, can be trained to drive projects of increasing scope.
Let’s take the New Grad as an example:
Although this is his first job out of college, our new grad is given a simple project to drive - an A/B testing experiment that adds a new module to a landing page.
Even though this project’s scope and complexity is small, our new grad will be able to pick up some incredible experience from it:
The New Grad will succeed as a Project Driver if:
On the other end of the spectrum, let’s take our Experienced Engineer as another example:
Even though our engineer is highly skilled, her growth accelerates when she is challenged to ideate at a size and scale larger than she’s taken on before - owning big ideas with big outcomes. Here, her project is the development of a new machine learning platform that personalizes the search experience for the product.
The Experienced Engineer will succeed as a Project Driver if:
While everyone has the opportunity to be a project driver, not every team member should be individually driving a project all the time. Projects should be sequenced so that teams do not have more than a handful of active projects at a time (limited WIP).
We encourage engineers to lead for a season; in other seasons they can be the supporting cast for another engineer’s project. This lets all of our team members take a turn in the driver’s seat. Pun intended.
In our next installment, Part 4, we’ll look at how it all comes together by rotating our team through the four “Hats” of Product Management. Tune in!
In Part 1, I told a story about how the departure of our PM led my manager and me to split the PM roles between ourselves, to shield the team from the change. This spread the two of us way too thin, which made us less effective overall. That clearly wasn’t going to work, and burnout lay around the corner.
We decided to make a major change to the team structure with a process that would focus on individual empowerment. What if all our engineers were empowered - and required - to decide what to work on in a truly radical way? What if we asked them all to think like product owners, and take responsibility for team output?
Instead of being fed work from a product manager, our engineers would be responsible for thinking of new features and projects that drive growth and seeing them through to completion through the entire software development cycle. Terrifying? Absolutely.
In this new model, the ball starts with each individual contributor. Armed with a clear vision of the goal (OKR) and the tools to measure them (metrics and analytics), the team is given leeway and authority to work on any project that can drive impact.
We track this process with an Idea Backlog, a prioritized backlog of ideas that are in various stages of maturity. Ideas may range from:
We ask teammates to continuously brainstorm and generate new ideas to keep the creative cycle going. Each week, we check in with the team and see if people have any new ideas for projects and experiments that can move the needle.
People don’t just think of ideas unprompted! One way we help the team think up ideas is to get them consistently in front of customers. For example, by manning a chat widget on one of our landing pages, our team had to talk to several customers a day and learn about what they were having trouble with - those conversations led to product ideas that eventually became experiments!
In another experiment, I ended up piloting a customer-facing product aimed at restaurant owners and other SMBs. I had long phone conversations with restaurant owners and discovered new use cases and concerns that I had never known about. These became ideas as well.
Other ways we do this are by chatting with our UX researchers and reading customer interviews done by others. No matter whether your company is 5 people or 5,000, there are always ways to break through the inertia and get the team in front of customers to build empathy.
Our Idea Backlog is really just a Kanban board of big ideas; rough and unfiltered. But they need to eventually get vetted. We do this in a weekly check-in where we vet the fitness of each idea.
Each idea backlog item moves from “this is a cool idea” verbal discussion into written form as a 1-pager specification, written by the idea’s author. This specification needs to have:
Yes, it’s painstaking work that often lives outside of most engineers’ comfort zones. This means oftentimes setting up meetings with different teams, other PMs, and product leaders. This means writing specs (ugh). This means getting down and dirty with data analysis, writing queries and hunting down data tables that may be poorly documented or hard to understand.
In other words, this isn’t coding, but it’s a foundational part of the exploratory analysis needed to be a product owner. And we have our engineers all go build that muscle.
This is also time-consuming - idea generation should be factored into team sprints. It’s not uncommon to spend a whole day writing queries and digging through data to form a hypothesis, or finalizing the 1-pager spec to be circulated.
At the beginning of each week, we evaluate our roadmap and see if there is any need to “pull” a new idea from our idea backlog into the roadmap. At this point, the team can decide together whether an idea has enough legs, definition, and impact alignment so as to actually become a formal work project.
The Ideas that rise to the top are the ones that have at least one of these traits:
Thoughtfully designed Ideas will also be:
This usually means that the kinds of ideas that get wings are quick tests - for example, we want to test the listing on the App Store. Or we want to test a quick JSON-LD rich snippet change on a web page that might cause our Google search index rank to rise. Or we deploy an alpha prototype to a small group of trusted prerelease customers.
It doesn’t mean we avoid ambitious, multi-month releases, but those are vetted with more deliberation and care, and require much more unblocking and team alignment. My manager counterpart will often have to do roadmap planning and stakeholder communications with upstream/downstream teams to get buy-in for the very big projects.
And finally, the hallmark of a good idea is that there are clear metrics for success - or failure. Here we take a page from the Lean Startup playbook: we often use an Experiment Canvas to think through our experiment design and build metrics that are clearly quantifiable and time-bound.
Features are launched with metrics dashboards built as part of the feature work. Clear metrics are essential to our ideas because they force us to see the outcome of our work through actual data and give us the courage to roll back changes that are ineffective.
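To make “quantifiable and time-bound” concrete, here is a minimal sketch of what such a success criterion might look like in code (the class and field names are hypothetical illustrations, not our actual Experiment Canvas tooling):

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ExperimentMetric:
    """A success criterion that is quantifiable and time-bound."""
    name: str
    baseline: float         # the metric's current value
    target_lift_pct: float  # minimum relative improvement to declare success
    decide_by: date         # the date by which we commit to ship or roll back

    def is_success(self, observed: float, today: date) -> bool:
        # Past the decision window, default to rolling back rather than
        # letting an inconclusive experiment linger.
        if today > self.decide_by:
            return False
        return observed >= self.baseline * (1 + self.target_lift_pct / 100)

# Hypothetical example: "signup conversion must improve at least 5%
# by the decision date, or we roll the change back."
signup = ExperimentMetric("signup_conversion", baseline=0.042,
                          target_lift_pct=5.0, decide_by=date(2020, 6, 1))
```

The point of encoding the deadline alongside the target is that “no decision by the date” defaults to a rollback, which is exactly the courage-to-revert behavior described above.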
Interested in trying out this method? Of course there are limitations, and here are some of them:
The projects that make it out of this cycle get formalized in our roadmap and worked on in upcoming sprints. Better yet - all engineers are responsible for the generation of these ideas, so the team feels an increased ownership over their work.
In Part 3, we’ll discuss how we distribute this work among team members and get the day-to-day work of execution and strategy development rolling across the team!
A former PM colleague once told me, with some amount of jest, “I don’t know why product managers need to exist.”
While it shocked me at first (coming from a product manager, no less), she was emphasizing that the core value of product management is intangible - it’s strategic, it’s relational, and it’s hard to quantify and measure. It’s the stuff that fills the gaps and spaces between a company’s orgs, teams, and processes. A PM’s core function is to be the glue - maintaining alignment and developing strategy and execution plans across different areas of the business.
This post is certainly not about bashing PMs. Let’s get that out of the way - I love PMs, and a good one is worth their weight in gold. Their skill in focusing the team on building the right thing the right way at the right time can be a game-changer for many companies.
But sometimes that role isn’t needed. Companies in certain configurations often have plenty of runway to operate without a formal Product Manager. For example, plenty of founding engineering teams operate without a product owner. In those cases, their engineers develop the skills and product intuition to work like a PM.
Only once an organization scales do the focusing powers of a PM become truly necessary.
However, there are exceptions: engineering teams at all scales can and do operate without PMs. I’ve met world-class engineers at Pinterest and (once upon a time) at Stitch Fix on product teams (often growth) that were formed as 100% engineering teams and - surprise - do not operate with PMs.
You may not need one right now either. To explain what I mean by that, I need to tell a story about my team at Lyft.
When I first joined Lyft, my team operated the same way many standard software teams do. We had a dedicated product manager and a team of engineers, along with design and marketing as shared resources. Our product manager was responsible for the usual things a PM does - prioritization of new projects, creating product specs (including market research and opportunity analysis) and doing all the legwork to communicate between teams.
And it was glorious. All the hard work of communicating with stakeholders all along the org? Done. Prioritization decisions? Instantly made. Deep domain expertise about our product? Batteries included.
Then one day, it all stopped.
Our PM colleague came in one day and announced his departure. Just like that, we were thrown for a loop. What were we to do? A replacement was not coming - instead, the team was to own their output and impact for the business. Gulp.
We improvised. My manager (the team’s engineering manager) and I (the tech lead) decided we’d split PM responsibilities between the two of us. I took on the organizational part of the role and got busy organizing stories and the backlog. He got busy jumping into meetings with leadership and other stakeholders. We held it all together with duct tape and Slack chats and prayed the whole enterprise would last till the end of the year.
The end of the year arrived and quite honestly, we whiffed on our goals. Our team performance against our OKRs was underwhelming. What went wrong?
We looked back at our team output and concluded:
Surprise, surprise. Having lost a PM, we had one less teammate who could be wholly devoted to guiding the team to build the right solutions for the customer. As hard as we worked, we were still the bottlenecks, limited by inexperience and lacking time to deeply think about the product needs of the team.
We needed to empower the team - the entire team - to think like product owners.
In the next post, I’ll talk about how we made some changes to the team - and empowered every engineer to think like a product owner. Read Part 2!
I’ve written a bit on this blog about the highs and lows of my time in engineering management. I’m a team lead nowadays - less managing, more coding, but I still think long and hard about what leadership means and looks like in the tech industry.
Re-reading Andy Grove’s High Output Management recently made me realize that there could be a useful angle for tech leaders who aren’t in direct management but are in leadership roles anyway. Folks in our shoes tend to work closely with first- and second-line managers. Even though Grove’s advice targets that group, plenty of it still applies to the lead role, and understanding the challenges and mindset of your manager compatriots will greatly increase your effectiveness as a lead.
Andy Grove leads with an iconic example of running a breakfast diner, showing how you might measure the output of a business flipping eggs and waffles on the griddle. To him, software production is not so different from an assembly line, and he takes the reader through a tour of his fictional diner business. In the end, he concludes that managers need to develop metrics that let them monitor the output of their own “factories” through three types of indicators: Output Indicators, Quality Indicators, and In-Process Inspections.
What are the indicators your manager could be watching for?
The takeaway for technical leads? Collaborate closely with your management counterpart to develop and contextualize these indicators. Your management counterpart may not always have the day-to-day context of what is happening, so your expertise will set them up for an accurate evaluation of team output and performance. Help explain the progress and status of the team to give the manager confidence that the team is operating to its fullest potential.
This could look like:
Much is written in the book about how a manager’s role is defined by their indirect influence on the output of the organization(s) under them. Given the scarcity of time for most managers, Grove encourages managers to seek out high-leverage activities to maximize their influence.
Oh, and one more key insight. Grove cautions his readers from confusing activity with output. Merely filling your day with low-leverage meetings may not be the best use of your time.
Technical leads have much to learn from this insight as well. What are high-leverage activities for a technical lead? What are low-leverage activities?
| High Leverage | Low Leverage |
|---|---|
| Attending an architecture review board meeting | Working on a “juicy” feature in isolation |
| Leading a backlog grooming session to make sure the work is well-defined for the next iteration or sprint | Working on something just because the tech is cool but with low business value |
| Pairing with another developer to do knowledge transfer | Nitpicking over syntax on someone’s PR |
| Working on external documentation* | Oh, did I mention coding in isolation? |
| Performing an effective, empathetic code review | |
| Discussing tradeoffs and opportunities with other product leaders | |
* Varies by org size and team composition
See that? High leverage activities involve communication and coordination between teams. They often help communicate work streams, set expectations to other stakeholders, or do the plain, boring work of defining work clearly and explicitly.
Low leverage activities oftentimes happen to be the “fun stuff” - a juicy feature that happens to let you refactor your fancy system to use a new pub/sub framework, or try out a new frontend or mobile framework. The “low leverage” part comes when you, the lead, take the work yourself without bringing anyone along with you.
In other words, the leverage from a team lead comes from the strength of the connective tissue they are building within their team and between other teams.
While many of us might view meetings as a necessary evil to minimize, Grove takes a high view of 1:1 meetings. I believe that the effective lead should also do the same.
Since leads often have the most day-to-day context on the team’s work, you are well positioned in 1:1 meetings to place the day-to-day in the context of the strategic for your teammates.
What do I mean by “context of the day’s activities”? If you are doing your job of directing the day’s efforts and helping coordinate the team’s daily tasks, then you have direct visibility into the team’s execution flow. You know who is working on what, and which streams of work remain blocked or undone.
| What to talk about | Why |
|---|---|
| Opinions and feelings about project progress (“How is the project going?”) | You can get a read into your teammates’ heads and address morale or nagging questions |
| Career mentoring and coaching (“How can I help you grow?”) | Your unique view of the project as a peer lets you advise your teammate on how to adapt their skills to seek advancement. Focus on giving actionable feedback with an immediate application. |
| Real-time feedback (“How am I doing?”) | Since you are a peer doing the work alongside your coworker, you can give feedback in near-real-time. The gaffe in a meeting, the code review comment that fell flat, or the bungled feature deployment can all be addressed at your 1:1 without much time passing between incident and resolution. |
The lead doesn’t have direct managerial authority, and in many ways, that is your superpower. Your authority comes in a tactical form, and that oftentimes makes input easier to swallow (than, say, if it came from your boss). You may find that your teammates find it easier to open up to you when they find that you are a safe person to confide in (more on that in a future post).
This also makes you valuable to your managing counterpart, as they can work with you to come up with coaching strategies for new hires or underperforming teammates.
At Lyft, we have a strong culture of inter-team one-on-one meetings. I’ve used these meetings to talk teammates through interpersonal issues, provide a listening ear to discuss the purpose of different strategic initiatives, or simply talk through execution-oriented agenda items like code style and architecture patterns.
By the way, this doesn’t mean I believe managers can’t do these things too! I just mean that tech leads have a natural authority that comes from their tactical involvement in the day-to-day.
Grove encourages managers to convene Operational Review meetings, whose purpose is to review the progress of an initiative or project. They can be held with a cross-team audience or within the team. The purpose of these meetings is to help leadership assess the health of a project or initiative so they can make decisions to keep it on track.
As a tech lead, consider your role in presenting operational reviews to product or technical leadership. While your immediate management counterpart likely already has a solid grasp on the status of your project, you can maximize alignment by communicating status clearly to cross-functional stakeholders.
Imagine a tech lead who is asked to represent the team at a monthly status meeting that convenes leaders in mid- to senior-level management who are interested in the progress on the team’s highly-visible project. Our lead may want to represent the technical execution aspect of the project, ready to answer questions like:
I’m going a little off-script here since this isn’t exactly a Grove takeaway from his book, but from my experience it is just as important for the lead to demonstrate progress to their peers - other leads or managers in other parts of the org. This reduces duplicative work and allows “aha” realizations that one team can use the tech of another team’s work - in other words, it increases leverage.
These peer reviews can take several forms. You may want to convene a cross-functional group of peers (co-leaders at your level) to communicate the status and challenges of your product surface. This can be something as simple as explaining progress on the project, presenting some architectural diagrams and summarizing with some needs and open blockers. Your audience may take time to give their perspective, to offer solutions, or to help problem solve with you.
Alternatively, the tech lead can send out group updates to their peers in a shared Slack channel or email distribution. Peer cohorts can subscribe and glance through the updates periodically.
Example: At Lyft, a group of engineers in our Growth organization gathers in a Friday meeting on a monthly cadence to discuss their product areas. The group represents teams from across nearly the entire product surface. Conversations range from architecture reviews to product presentations with open Q&A and brainstorming. The objective of the meeting is to share information that would otherwise stay siloed, and to find leverage points between the groups where opportunities may have been missed.
Grove talks about the primary responsibility of managers: to make decisions. How should this work? Managers need to provide a venue for input and facilitate discussion; then, after all inputs have been received, the manager can make a decision.
We can apply this to technical leading in two ways. First, it is important for a lead to be a decision-making partner to her management counterpart. Let’s imagine a manager needs to decide whether to make a case to senior leadership to expand headcount on the team. The technical lead can make the tactical case for doing so by explaining how the team’s output is blocked due to a key skill set being missing with specific examples of PRs, code samples, and/or specific backlog items that went unfinished or were underinvested in.
Secondly, the tech lead herself may make decisions. These decisions tend to live at the boundaries of the tactical and the strategic, and may include:
Of course, these decisions do not occur in a vacuum. Like Grove suggests, our lead must be able to convene the group of experts (the team) and have the team give their feedback and opinion.
Tech leads reading Grove’s High Output Management get a cheat sheet into the manager’s mindset, helping their management counterparts get the clearest picture into the performance of the team.
Tech leads can also take on a manager’s mindset, using 1:1s with their teammates to develop each member. They maximize their leverage as technical leaders to build bridges between other groups within their organizations.
Andy Grove: High Output Management. https://www.amazon.com/High-Output-Management-Andrew-Grove/dp/0679762884
(Note: this post has been republished to Medium in the Towards Data Science publication.)
As I’ve been growing my data science and machine learning chops, I’ve thought long and hard about what it means to develop a user-centered model. It seems to me that a lion’s share of ML energies are spent cultivating the data set and performing tricks to get the model architecture right.
After all, ML model development is hard - in addition to developing a data pipeline to arrive at a large data set that’s properly formatted, preprocessed, scaled, and trained on, there are problems of balance, bias, out-of-training performance, and overall model performance.
Developing the right model architecture and data pipeline, though flashy and news-article-worthy, is only one part of the process. And yes, amassing a large data set is fundamentally important to building an effective ML model, but that alone misses some more fundamental challenges in developing ML products:
In my last article and in my PyGotham talk, I argued that the best way to ground the model in the problem domain is to radically keep the user front-and-center in the entire development lifecycle. As it turns out, the field of User-Centered Design has been advocating for this thinking for the past couple of decades.
Central to the discipline of user-centered research are Personas, which are fictitious representations of real users that inform our development work. Developed by programmer Alan Cooper in the 80’s, Personas:
When the whole team internalizes the personas, it helps focus their work. They can now refer to users by name, considering whether each product feature or machine learning capability they develop will impact their customer personally.
Personas, in short, animate our customer and put them first in every product- (and data-) decision that we make.
Imagine a scene playing out in the engineering team at AcmeWidgets.com, where our team is developing a recommendation system for the e-commerce retailer to surface relevant items on product pages. To do this, a crack team of ML engineers and data scientists are gathered to build the system.
The team begins by defining an objective function for their model - in this case, one that maximizes shopping cart value. For example, the model may discover that users have a propensity to splurge on a set of high-ticket-value items in our catalog.
However, things soon go awry. The team notices that their model ends up making low-quality recommendations that sacrifice long-term customer retention for short-term boosts in average order value. The team discovers that their model ends up pushing customers to purchase items that they don’t really want or need, decreasing the long-term loyalty of the customer to the product and the business. In fact, the products that the recommendation engine suggests end up being returned at a far higher rate, incurring operating expenses and costs for the business.
Well, surely that’s just a matter of tweaking the model! The team goes back to the drawing board and this time launches a model with a changed objective function: minimize the return rate of its products. The model is tested and voila, the system is happily chugging along again. Users seem to be happy with their purchases…
…until three months down the line, when they discover that users that interact with this recommendation engine are actually churning and leaving the product at far higher rates than those who do not see the recommendations! As it turns out, these recommendations have upset and angered users so much that they do not even bother returning.
And so the team goes back to the drawing board, dejected and feeling a sense of foreboding of what else they might find in the next iteration of their project.
If we went back and sat down with the team at their project retrospective, we might have heard reflections like:
Of course, many of these things can only be learned from experience! But could there have been a better way forward for the project?
The key issue is that the team was only thinking at the tactical level. They thought - “oh, a recommendation engine should be simple. We’ll just pull in Architecture X for our model, train against objective Y, and ship it when we can attain model performance Z”.
ML practitioners will often tell you that a great ML system blends domain expertise and expert intuition. What this means is that ML models must be designed by team members who have a deep understanding of the business domain, a deep familiarity with the data set, and a deep intuition of the customer’s needs.
How do you get domain expertise? Well, you have to have the right people embedded on the team, solving the right problems, considering the customer at all points in the model development process. How might having user personas have helped us avert this situation?
Imagine that the team, back in the beginning, agreed to consider their customers as living, breathing people in the real world. In fact, their UX colleague did a set of customer interviews that resulted in a set of composite sketches of their customers:
Luke, the office admin
| What | Description |
|---|---|
| Profile | Luke is a 31-year-old male living in Indianapolis who works as an office admin for a small fulfillment business |
| Motivations | Luke needs to keep widgets stocked in the office for the employees to use. Given that the office runs out of widgets regularly, he needs to restock every couple of weeks. He dreads having to log back in to the web site to place another order since he finds it tedious |
| Goals | Seamless, regular order re-fulfillment for the same SKU |
Harmony, the wedding planner
| What | Description |
|---|---|
| Profile | Harmony is a 41-year-old female who lives in Boston and runs an event-planning business that is just getting off the ground. She loves to supply Acme Widgets at her events because they are loved by the guests and provide her business the visibility she needs |
| Motivations | Harmony dreams of building an event-planning empire in her city |
| Goals | Unique, hard-to-find, desirable widgets to make her business stand out |
Tianqi, the Work-From-Home dad
| What | Description |
|---|---|
| Profile | Tianqi is a remote freelancer who lives in Shanghai and enjoys his work-from-home flexibility because it allows him to handle childcare for his family while his partner works full-time |
| Motivations | Tianqi wants an ordered, efficient home life on a shoestring budget |
| Goals | Buy what I want at a budget |
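For teams that want these sketches to be more than documentation, personas can even live in code. Here is a minimal, hypothetical sketch - the names come from the tables above, but the attributes and the cohort-check helper are invented purely for illustration:

```python
from dataclasses import dataclass

# Hypothetical encoding of the personas as first-class objects, so that
# experiment configs and offline tests can refer to them by name.
@dataclass(frozen=True)
class Persona:
    name: str
    goal: str
    price_sensitive: bool
    reorders_same_sku: bool

PERSONAS = [
    Persona("Luke", "seamless recurring re-fulfillment", False, True),
    Persona("Harmony", "unique, hard-to-find widgets", False, False),
    Persona("Tianqi", "buy what I want at a budget", True, False),
]

def cohorts_hurt_by(promo_heavy_ranking: bool) -> list[str]:
    # Toy check: a promotion-heavy ranking likely alienates
    # price-sensitive personas like Tianqi.
    return [p.name for p in PERSONAS if promo_heavy_ranking and p.price_sensitive]
```

Even a toy structure like this makes "which persona does this change hurt?" an answerable question in a code review.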
The reason we might want to consider personas is so we can firsthand understand our customers’ goals and motivations. Harmony wants to discover unique items that pop. Luke just wants to fulfill a recurring order for the office. Tianqi wants the cheapest items, period.
The team, when considering their recommendation algorithm, should be actively referring to Harmony, Luke and Tianqi by name as they develop their model –
TEAMMATE 1: If we choose to use the collaborative filtering model, we have to consider the fact that we have a very small sample size of users who match up to Luke’s use case (power fulfiller). I’m worried that Luke’s cohort is just going to see a lot of garbage recommendations before any meaningful signal emerges, and get turned off by our product.
TEAMMATE 2: Yeah, that’s true. But we know that Harmony’s cohort accounts for over 75% of our sales. There’s lots of opportunity here. What if we went with a bandit approach? That should hopefully minimize the number of bad recommendations that get out in front of our users.
TEAMMATE 1: Not a bad idea. We could learn pretty quickly given the scale that we operate at, and we’d minimize the amount of time we serve bad recommendations to Luke’s cohort.
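The bandit idea the teammates float can be sketched in a few lines. This is a minimal epsilon-greedy sketch, not a production recommender; the item names and reward numbers are invented:

```python
import random

# Epsilon-greedy bandit sketch: mostly exploit the best-known item,
# occasionally explore, so bad recommendations surface as rarely as possible.
def epsilon_greedy(counts, rewards, epsilon=0.1):
    arms = list(counts)
    if random.random() < epsilon:
        return random.choice(arms)  # explore: try a random item
    # exploit: recommend the item with the best average reward so far
    return max(arms, key=lambda a: rewards[a] / counts[a] if counts[a] else 0.0)

counts = {"widget_a": 10, "widget_b": 10}     # impressions served so far
rewards = {"widget_a": 7.0, "widget_b": 2.0}  # e.g. resulting cart value
pick = epsilon_greedy(counts, rewards, epsilon=0.0)  # pure exploitation
```

In practice the epsilon (or a decaying schedule) controls how much recommendation quality you trade for learning speed.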
Or consider another scenario in the development process, where the team has to grapple with negative interactions with the recommender in the wild:
TEAMMATE 1: We’ve received some customer feedback on Twitter that our products are too expensive. Apparently some customers feel like they were duped into purchasing items they didn’t need - the Tianqi cohort. How can we be sure that folks are truly receiving valuable and worthy recommendations?
TEAMMATE 2: Why don’t they feel like their items are valuable?
TEAMMATE 1: For some reason, our recommendation system is creating some sort of buyers’ regret. We might be over-hyping some of our promotions, or we may be pushing some items that have had quality issues.
TEAMMATE 2: Let me start a conversation with Juli, our UX designer, to see if there’s some user research that would confirm or refute this hypothesis. And if there’s some feature we need to re-incorporate into our model to make sure we’re making truly high-quality recommendations, let’s incorporate it. Else if our recommendations aren’t up to snuff, let’s test a new model variant where we don’t show any results at all.

TEAMMATE 1: Got it.
See? That’s the kind of holistic, customer-centric conversation we want to build into the team’s natural modus operandi. And all those things are enabled by customer personas.
Personas are powerful - but they’re not a silver bullet. What personas help us do is imagine our customers as real people, with real goals, motivations, and frustrations. Doing so gives us the vocabulary to discuss them as first-class citizens and to orient our technical solutions around them.
Oh - and by the way, the People & AI Research (PAIR) team at Google has done far more thinking about the process of designing human-centered AI products. Have a read through their best practices guide to learn from their deep experience building ML products.
What do you think? Do you have experience building ML models in customer-centric ways? Reach out and let me know on Twitter at @andrewhao.
Here’s the recording of the talk from PyGotham 2019.
Nighttime at our house is pretty hectic and crazy, and we have sleep battles with our toddler all the time. The question was, just how much were we locking horns? I decided to train a machine learning model to find out.
I discussed how I did this in Part I and Part II of my talk. In the slides, I go over the data pipeline I used to tag and label the data, which if you ask me is pretty interesting. One interesting tool (I won’t talk about it here) is EchoML, which allows you to tag and label different parts of audio files for utterances.
The interesting stuff is what happened after I was able to get the system set up and doing real-time audio detection. Can this dad find some sleep training insights, or will he be forever doomed to tears?
Here’s some real-time sleep data
Cry patterns, measured by minutes per day
I loaded up the data in a Jupyter Notebook (link) and tried to slice the data for some insights.
Note how beyond actual “loudness”, the only variables that correspond with cry patterns are month and day of week. We cannot truly depend on month, since I only had one year of data.
The only interesting finding was that there was a 1% correlation of crying with the day of the week. However, that’s small enough to be within the margin of error for analysis. It could be interesting when subjected to further analysis.
My attempt at a correlation heatmap. Nothing interesting here
Looking for correlations between temperature and humidity and crying - once again, nothing interesting.
At the end of my analysis, I realized that quantifying sleep progress was actually kind of depressing, especially when you see those numbers continue to jump up and down month after month.
So was this project useless? Was I back at square one?
One night Annie and I were staring at the baby monitor during another cryfest, up to our eyeballs in frustration. She looked over at me and said, “I’m going in.”
“What?” I replied, incredulously. “You know we can’t do that. We’ll just reward his crying and his sleep training will be ruined!”
But she went up anyways. It was silent for 20 minutes and all I could hear was the white noise machine.
She came back downstairs, sat down next to me and all she could say was “kids are weird”.
No kidding. “What happened?” I asked.
“I sat down on the floor next to his bed and we just looked at each other. Then I grabbed a blanket and lay down on the floor and pretty soon he lay down too and then he fell asleep.”
“He just wants to be with us, Andrew”
At that moment I had this hot flash of shame. Had I missed something as vitally important as the fact that my son needed us, because I had been so fixated on my way of doing sleep training that I had been holding back comfort and connection?
Of course, it didn’t always work. And yes, “going in” really does reinforce bad sleep habits. But sometimes, that’s just what you do when you need to give your kid (and everyone else in the house) a break.
I’m thankful and glad I embarked on this project. It’s taught me a lot about deep learning, and the process of understanding the internals of TensorFlow models. It’s also shown me how to optimize model training, as well as the importance of labeling data thoroughly and accurately.
However, these days I don’t watch the baby monitor nearly as closely as I used to. I don’t really care for the minute-by-minute information because I understand its limitations.
As I was preparing my talk, I realized that there was an important connection to real-world machine learning.
In the same way that I dismissed my son’s human need for connection for the sake of building my model off of my idea of inputs and data, do we, as ML and engineering practitioners, miss seeing the human needs of the people adjacent to our ML models?
How do we go from here…
There are humans whose inputs are valuable to our models. If we base our models only off of what we measure through quantifiable datasets and pipelines, we run the risk of building systems that fail to serve (or worse - harm) the people they are designed to help.
…to here?
How can we build a consciousness of our customers and the humans that work within? I have a silly idea, but I think it might be a good one: Personas. Read the next blog post, “Integrating Personas in User-Centered ML Model Development”, to learn more.
Five years ago, I first inadvertently found myself in engineering management. My manager sat me down in a room and told me he was leaving, and that he wanted me to take his place. Would I be up for the challenge?
I told him I’d think about it, then went home and started freaking out. He had been so critical to the company, and he was walking off with so much in-house experience, so many relationships, and so much domain knowledge. Was everything going to fall apart without him? Would I be able to fill his shoes? What if I couldn’t?
I came back the next morning and told him I was up for it. A few days later, I was on my own. There was going to be a steep learning curve and plenty of bumps on the road. Sure enough, I made more than a few mistakes.
The company at the time was in a pretty bad spot. Revenue was plateauing, and morale was kind of meh. The organization had gone through a high amount of turnover, and things weren’t looking like they were getting better.
At the time, the project management team had a reputation for being pretty heavy-handed with Project Managing™️ and had clashed with my boss. Hard. I knew where my boss was coming from and agreed with his approach, but his personal style had burned a few bridges.
One of the first things I did as a manager was to start the team doing retrospectives - but apart from the rest of the product delivery organization. I honestly felt that our working relationship was not in a productive place, and I feared that my team would not be able to honestly voice their opinions out in the open. So I told everybody: we would be doing our reflections and retrospectives on our own. Closed doors. It felt like the right thing to do at the time; amid the uncertainty of the transition, I just didn’t trust other people to be able to hear my team out.
The problem with huddling in a defensive posture was that I never learned to trust other people to be present in the room, to give them space to voice their opinions, and to grow as an improving organization. I had also projected my internal beliefs and (mis)understandings onto the rest of the product delivery organization, assuming the worst intentions from them rather than offering them the benefit of the doubt.
Overcome fear by posturing the team with a learning mindset and a framework for honest communication.
Always assume the best intentions from others, especially when relationships haven’t formed yet.
Hiring was (surprise, surprise), one of the most challenging aspects of management. Collectively, our team and I decided on some criteria we wanted for prospective candidates, which kept a pretty high bar. They had to excel at data structures, have a great nose at OO design, understand best practices of software development, and all that jazz.
Of course, these are important skills for a senior software engineer! But once in a while, we’d come up against a candidate that didn’t really have the skills to qualify for the role in our job description but had a fantastic personality or a curiosity to learn.
When trying to make the hiring decision, I oftentimes found myself evaluating the candidate with the mindset of “could this person kick butt tomorrow on my team?” In my mind, this would be the domain of the grizzled software veteran.
But what I wasn’t asking was: “given 6 months on our team, could this person really grow into (the engineer I’m imagining)?”
I did once hire a junior engineer fresh out of boot camp who later went on to become one of our most effective engineers on the team. It was a credit to her teachability, curiosity and collaboration skills that gave her a career trajectory far higher than I originally estimated for her.
Hire for the engineer you may have six months down the line, not necessarily the engineer you see now. Prioritize empathy, curiosity, and teachability - not just career experience.
I had an engineer on my team who was, from my perspective, underperforming by not staying on task for much of the workday. When asked to do work, he’d crank out some code (that was pretty decent), but it would lack tests or would be overlooking some key requirement in the story. He had a reputation that was unbeknownst to him, but obvious to everyone else that worked with him.
For some reason, I kept feeling like I needed to sugarcoat my feedback. In our one-on-ones, I would dance around the topic. “You do good work, but you missed some requirements here and there”. “Good job on delivering X. I hope we can give you a big project Y, but I think that PR stayed out a bit too long. Let’s keep our eye on the ball.”
But it wasn’t working. Fellow engineers would acknowledge his lagging contributions in vague, general terms, not wanting to come off as mean or underhanded. Product owners would turn to me after sprint planning and ask what they could do to help.
The truth of the matter was that the responsibility for his performance was mine, and I needed to give him the critical, crucial feedback that would open his eyes to the reality of his situation.
What I needed to tell him was, “Hey, you’re underperforming and people are noticing. I need you to rein in your web surfing and pitch in with some solid work. I know you’re capable of doing this - let’s focus on testability this week…”. Because if I didn’t give him this feedback now, he’d just end up hurting our team’s morale. I was doing him no favors by shielding him from hard feedback.
Direct, actionable feedback from a place of genuine care is a crucial component of great management (see Kim Scott’s excellent book, Radical Candor).
After about a year and a half in my position, I decided to move back to an individual contributor role - the next four years of my career were happily spent in software consulting at Carbon Five with the most fantastic people I’ve ever worked with. The lessons I’ve learned from my first stint in management were hard lessons to learn, but they’re mistakes I hope I’ll only have to make once.
In a future post I’m going to write about the great experiences I did get to have in software management, and why I think it’s something worth doing.
Several articles have been circulating recently in the Agile community critiquing the current state of the practice (see: “The State of Agile Software in 2018” - Fowler and “Developer Should Abandon Scrum” - Jeffries). At their core, they are really saying the same thing - Agile, as originally intended, was meant to be a living, breathing, freeing process for the team. Compare that to the rigorous imposed frameworks deployed across organizations these days.
Long ago[1], I worked at a company that deployed Scrum across the entire company. This wasn’t some mega-corporate behemoth; it was a mid-stage startup that had been around for a good while. But it was struggling. Teams across the org weren’t working the same way. It was hard to pin down milestones and delivery dates and to coordinate work. Some engineering teams worked in an ad-hoc, Kanban-ish way. Others planned in a more centralized, top-down fashion. Project managers wanted more predictability in our delivery platforms.
And lo, an Agile working group was formed, and they chose to roll out Scrum. And to be honest, the implementation of Scrum we chose wasn’t a bad one. A process was rolled out in which we all used JIRA, had regular planning meetings, standups, retrospectives and all that jazz. But something still wasn’t sitting well with me.
We assigned stories to engineers at our planning meetings. “Oh, Jay is good at the payments system, so he’ll take this story”. Jay mumbles something in agreement. “Esther knows infrastructure items better, so she’ll handle the deployment tasks on this task.” And Esther would do it. But pre-assigning stories would lead to knowledge silos, and creating these false boundaries of what was “acceptable” to work on.
Our project manager - bless her soul - would attempt to answer big questions through Scrum in a way that abused it. We were asked to estimate stories three sprints out so a delivery date could be given to upper management and other stakeholders. Of course, we would always miss these targets. (Once, we were asked to estimate the scope of an entire project by interring the entire team in a room and writing out six months’ worth of stories!)
It was clear that Scrum had brought some wins. Retrospectives really were introducing a feeling of continual improvement and shared accomplishment. Estimating stories was helpful in raising conversations to the fore that would help the team arrive at an accurate estimate. But there were still some pain points - Scrum was still leveraged by our project management structure to organize major efforts toward milestones and deadlines in ways that were pressure-filled and potentially harmful to the team. In short - Agile process was implemented in a top-down manner that emphasized control, predictability and rigidity. And it didn’t feel great[2].
Around the same time, I started attending an XP meetup (hosted by Pivotal Labs SF) and I met a ton of lovely folks who were happy to inform me that, yes, there was a better way to Agile. And in fact, the better way to Agile was to embrace uncertainty and chaos more, to trust teams to do their own thing, and to bake some solid technical practices in with the work, too.
I was sold, and I ascertained that my next gig was going to be at a company that practiced XP. And so when the opportunity arose to work at Carbon Five, my current employer, I jumped.
In my software consulting life at Carbon Five, I work on projects that ramp up quickly in domains fraught with uncertainty and organizations (oftentimes) wrestling with dysfunction. We practice what I call “lower-case xp” in nearly all of our projects. Meaning we don’t force our clients to do ALL the things in by-the-book XP, but we keep the important things. Those things are:
Frequent pair-programming: On stories and tasks that are fraught with unknowns, or for a team with knowledge silos, there’s really nothing better than pairing through and sharing knowledge. This has been the number one way that I know how to level up developers on a team. My teams don’t typically pair 100% of the time - the ratio has been closer to 40% to 60%.
Pull-based work streams: Since we pair program so much, the entire team tends to be very balanced in terms of their knowledge and familiarity of the system. This means nearly anyone on the team should be able to take on any task in the backlog - or find a pair partner who could. This eliminates the need to pre-assign work at the beginning of each iteration.
Cutting scope, pushing deadlines, but never overworking: The team never works more than 8-hour days. We put in focused 8 hours of work, then we’re off. If a project expands in scope due to unforeseen circumstances, technical debt, scope creep or whatnot, then our product owner has a choice to cut scope to meet the deadline, or move the deadline itself.
Short iterations: I have really enjoyed one-week iterations. Two is fine. Anything longer than that, and it feels like a slog.
Craftsmanship practices - batteries included: XP isn’t particularly prescriptive, but it does recommend development practices like CI/CD, and test-driven development. I find that folks who buy into XP also tend to be a certain type of programmer that values software design and testability[3].
Continual improvement: This is arguably the one thread that, no matter what Agile religion you follow, you should always prioritize. In XP, this would be our reflection (or retrospective), where the team has a space to reflect on what went well and what could be better, then make commitments to improve upon those things. There’s nothing more empowering than a team that continually improves, iteration after iteration.
There are lots of ways to Do Agile. My intention isn’t to bash Scrum, although I’ve seen it abused. The fact of the matter is that doing Scrum can lead to lots of wins, because the principles it sits on are still solid.
My preferred method of working is with a lightweight lowercase-xp style of work, that promotes collaboration, bakes in best development practices, and embraces uncertainty.
What about you?
[1] It wasn’t that long ago. [2] Some would say this was a mis-execution of Scrum. I don’t disagree - I actually like the core tenets of Scrum and believe it works well when it’s understood to not be a silver bullet. [3] This is purely my observation, and is not meant to be a generalization!
The last time we met, we had a lively discussion about the ins and outs and joys and terrors of parenting. I talked about how I started building a Raspberry Pi project with a USB mic and wrote a simple parsing script that measured the mean amplitude of recordings of the current state of the nursery. And when that guy wailed, he really WAILED.
Well, that naive approach only got us so far - the system would still trip up on random loud noises in the house. Music, doors closing or opening, or loud conversation would all cause the system to think the kid was crying, but no - it was just ambient noise.
Let’s start our dive into machine learning!
I had already gotten pretty far with Udacity’s Machine Learning course and had vague recollections of AI theory floating around from the cobwebs of undergrad CS courses past. So I had some background in AI and machine learning. But training neural networks was a completely new thing to me.
Luckily, I came across the Simple Audio Recognition Tutorial example right there on the TF homepage. This was exactly what I was looking for. The objective was: given a training set of audio clips of crying babies and “empty rooms”, classify an audio clip as one or the other.
I had oversimplified in my mind what made a great training data set. After all, I figured that I just had to record my kid crying a bunch, then get a few minutes of quiet room sounds, and then we were good, right?
Wrong.
Training data must be exactly matched. Sample rates must be consistent, and audio samples must either be highly randomized or at least normalized to the same rates. To wit, here’s how I gathered my sample data:
I normalized each clip with sox, trimming it to 5 seconds and resampling it to 22050 Hz. I also applied a few amplification filters to overcome the weak pickup on the mic:

$ sox FILENAME FILENAME_OUTPUT trim 0 5 vol 45 dB rate 22050

I then placed each of these samples in the folders corresponding to their label: `crying` and `silence`.
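To give a flavor of how that normalization step might be automated, here is a hypothetical batch version of the sox command above; the directory layout and function names are my own assumptions, not the original script:

```python
import subprocess
from pathlib import Path

# Build the same sox invocation as above: trim to 5 s, boost gain, resample.
def sox_command(src: str, dst: str) -> list:
    return ["sox", src, dst, "trim", "0", "5", "vol", "45", "dB", "rate", "22050"]

# Hypothetical batch driver: normalize every .wav in raw_dir into out_dir.
def normalize_all(raw_dir: str, out_dir: str) -> None:
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for wav in sorted(Path(raw_dir).glob("*.wav")):
        subprocess.run(sox_command(str(wav), str(Path(out_dir) / wav.name)),
                       check=True)
```

Keeping the normalization in one script makes it easy to guarantee every sample shares the same duration and sample rate, which is exactly what the training pipeline demands.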
I adapted the train.py script nearly verbatim from the TF docs. We’ll dissect it here, beginning with the command to begin training:

$ python app/train.py --data_url= --data_dir=./data --wanted_words=silence,crying --sample_rate=22050 --clip_duration_ms=5000 --how_many_training_steps=1000,200 --learning_rate=0.001,0.0001 --train_dir=./training
- `--wanted_words=silence,crying`: specifies which labels should be considered for training purposes.
- `--sample_rate`: the sample rate of the audio files provided.
- `--clip_duration_ms`: the duration of each training clip in milliseconds.
- `--how_many_training_steps`: a comma-separated list of numbers that specify the number of steps per phase.
- `--learning_rate`: the rate at which the system adjusts its current learnings to match new inputs. A higher learning rate means the system can change faster to learn new inputs; a lower rate ensures the stability of the system’s learning. We specify a higher rate for the first phase and lower it in the latter phase as our precision increases.

Let’s run the script!
$ python app/train.py --data_url= --data_dir=./data --wanted_words=silence,crying --sample_rate=22050 --clip_duration_ms=5000 --how_many_training_steps=1000,200 --train_dir=./training
2018-08-27 22:15:33.195894: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Tensor("Placeholder:0", shape=(), dtype=string)
INFO:tensorflow:Training from step: 1
INFO:tensorflow:Step #1: rate 0.001000, accuracy 26.0%, cross entropy 2.674081
INFO:tensorflow:Step #2: rate 0.001000, accuracy 23.0%, cross entropy 1.593786
INFO:tensorflow:Step #3: rate 0.001000, accuracy 64.0%, cross entropy 1.067298
INFO:tensorflow:Step #4: rate 0.001000, accuracy 73.0%, cross entropy 0.843605
...
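Those two comma-separated flags, `--how_many_training_steps` and `--learning_rate`, pair up positionally into training phases. A simplified sketch of that pairing (illustrative code, not `train.py`’s actual internals):

```python
def training_phases(how_many_training_steps, learning_rate):
    # Pair the comma-separated step counts with their learning rates,
    # yielding one (steps, rate) tuple per training phase.
    steps = [int(s) for s in how_many_training_steps.split(",")]
    rates = [float(r) for r in learning_rate.split(",")]
    if len(steps) != len(rates):
        raise ValueError("step list and learning-rate list must match in length")
    return list(zip(steps, rates))

# With the flags above: 1000 steps at 0.001, then 200 more at 0.0001.
phases = training_phases("1000,200", "0.001,0.0001")
```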
The output at each step reveals the current state of the neural network as it trains. Accuracy reflects the correctness of the model against the validation set: during each step of training, samples from the validation set are run through the model and checked against their specified labels.
Cross entropy is, as I understand it, a measure of how far the model’s predicted probabilities diverge from the actual labels (lower is better).
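For the curious, the cross entropy of a single sample boils down to the negative log of the probability the model assigned to the true label. A minimal sketch with illustrative names, not the TF implementation:

```python
import math

def cross_entropy(predicted_probs, true_label):
    # Negative log of the probability assigned to the correct label.
    # A confident, correct prediction scores near 0; worse guesses score higher.
    return -math.log(predicted_probs[true_label])

# A confident prediction on a crying clip scores low...
confident = cross_entropy({"silence": 0.05, "crying": 0.95}, "crying")
# ...while a coin-flip prediction scores higher.
unsure = cross_entropy({"silence": 0.5, "crying": 0.5}, "crying")
```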
Training proceeds for 1200 total steps; on a 2013-era MacBook Pro, this took approximately 6 hours. I had 350 clips of crying and 500 clips of silence. (Too much or not enough? This tired parent says “too much”.)
One more thing: every few hundred steps during training, we would get this sort of output:

```
INFO:tensorflow:Confusion Matrix:
[[ 9  0  0  0]
 [ 0  0  0  0]
 [ 0  0 55  0]
 [ 0  0  1 30]]
```
What’s a confusion matrix? According to this helpful article, it’s another way to visualize the accuracy of a machine learning model.
Here’s how to read this confusion matrix. Given the following labels:
```
# conv_labels.txt
_silence_
_unknown_
silence
crying
```
(Where did these come from? More on that later…)
Imagine these labels running left-to-right across the columns and top-to-bottom down the rows. The columns represent the predicted labels, so the first column represents the samples that were predicted to be `_silence_`.
The rows represent the actual labels. So if we take the top-left number, `9`, that means that in 9 runs of the model where the prediction was `_silence_`, the actual label was `_silence_`. Moving one cell down, that cell represents the number of samples where the predicted result was `_silence_` but the actual label was `_unknown_`. Fortunately for our model, there are `0` results in this cell. So on and so forth. The ideal confusion matrix has a “diagonal line” running from top-left to bottom-right, with `0`s everywhere else, because all predictions would equal actuals.
tl;dr: Confusion matrices are a way to visualize and report the accuracy of a machine learning model. You want a clear and convincing diagonal line in the matrix.
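To make that diagonal-line intuition concrete, here’s a small sketch that scores a confusion matrix by its diagonal (illustrative code, not part of the TF tooling):

```python
def confusion_accuracy(matrix):
    # Fraction of all samples on the diagonal, i.e. predictions
    # that matched their actual label.
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

# The matrix from the training log above: 94 of its 95 samples
# sit on the diagonal.
m = [[9, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 55, 0],
     [0, 0, 1, 30]]
```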
Oh, and here are the results from the training data. I’m using `tensorboard` to visualize the training steps:
Accuracy modeling. Note how quickly the model jumps to be fairly accurate.
Note how quickly cross entropy dives.
We could use these graphs to tune our models if we really cared. In this case, I say it’s good enough (accuracy is up to 99% by the end).
OK, but enough already. We have a trained model and, like Chekhov’s Gun, that means we’ve gotta use it!
Where’s that model? Oh, it needs a few more steps before it can emerge. At this point, TensorFlow has developed a neural network, but the neuron graph (is that the right term?) is not yet in a state usable by applications. To that end, we need to dump the model into a binary format that TensorFlow applications can consume in the future.
Once again, I claim no smarts in all this, but instead point to the TensorFlow script that does this, in `app/freeze.py`:

```shell
$ python app/freeze.py --start_checkpoint=./training/conv.ckpt-1200 --output_file=./graph.pb --clip_duration_ms=5000 --sample_rate=22050 --wanted_words=silence,crying --data_dir=./data
```
What did we specify here?
We pointed `--start_checkpoint` at the checkpoint from step 1200, saved the `--output_file` to `graph.pb`, and noted that each audio sample should be sampled at 22050 Hz and be 5 seconds long. We then specified that the labels we want to classify are `silence` and `crying`. Finally, the data set from the prior run can be found in the `./data` dir.
When we run this script, we get a `graph.pb` protobuf binary file that we can then ship to various TensorFlow programs.
Now here’s the fun part!
Onboard a Raspberry Pi, we are now going to classify live samples from the nursery:
Every 1 minute (with cron), we record audio samples from the system mic in the baby’s nursery. We then massage, crop and downsample it into a WAV file. We then point a script at this WAV file and run the TF graph on it. Running the graph will return a list of labels and their probabilities. We choose the first label with the highest probability, and ship it off to a timeseries API, in this case powered by Keen.io.
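The last step of that pipeline, choosing the winning label, boils down to something like this (a sketch with made-up names; the real script reads the graph’s output tensor):

```python
def top_label(predictions):
    # Given {label: probability} scores from the TF graph, pick the
    # highest-probability label to ship to the timeseries API.
    label, prob = max(predictions.items(), key=lambda kv: kv[1])
    return label, prob

# For a minute where the model is fairly sure the baby is crying:
winner = top_label({"crying": 0.87, "silence": 0.09, "_unknown_": 0.04})
```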
Voila:
A fun graph displaying the timeseries data for this little dude’s episodes. On the top was the original RMS volume graph, and the bottom is the result of the trained TF model. Note how much easier to read and understand the latter graph is.
Much of this code has been adapted from Google’s TensorFlow Audio Recognition tutorial.
My scripts have been collected on this GitHub repository: https://github.com/andrewhao/babblefish
And my repository with audio sampling and archiving: https://github.com/andrewhao/miserymeter
Wow, that was a quick dive through TensorFlow. Note that I didn’t get too deep into the theory of convolutional neural networks, which may be a topic of discussion for another time. Instead, we talked a little bit about the mechanics of building and training a TF model with audio data and a finite set of classes. It was fairly straightforward to then get this script loaded up on a Raspberry Pi and have a dashboard that could finally quantify the pain and suffering of baby… and parent.
The months that went into developing this app were fully and knowingly an escape from the very real stressors of parent life. I want to acknowledge the love, the grit and the patience of my wife in this very trying time. There is so much more to say about babies other than their crying and fussing - life these days is filled with laughter and giggles and joy, too - but these are words that I’ll save for another blog entry on a different blog for another time.