
What does a tech lead do?

This was written internally at Etsy. I was encouraged to post on my own personal blog so people could share. These are my opinions, and not “Etsy official” in any way.

Motivation for writing this

For the past 5 months, I have been the tech lead on the Search Experience team at Etsy. Our engineering manager had a good philosophy for splitting work between managers and tech leads. The engineering manager is responsible for getting a project to the point where an engineer starts working on it. The tech lead makes sure everything happens after that. Accordingly, this is intended to document the mindset that helps drive “everything after that.”

Having a tech lead has helped our team work smoothly. We’ve generated sizable Gross Merchandise Sales (GMS) wins. We release our projects on a predictable schedule, with little drama. I’ve seen this structure succeed in the past, both at Etsy and at previous companies.

You can learn how to be a tech lead. You can be good at it. Somebody should do it. It might as well be you. This advice sounds a little strange since:

  • It’s a role at many companies, but not always an official title
  • Not every team has them
  • The work is hard, and can be unrecognized
  • You don’t need to be considered a tech lead to do anything this document recommends

But teams run more efficiently and spread knowledge more quickly when there is a single person setting the technical direction of a team.

Who is this meant for?

An engineer who is leading a project of 2-7 people, either officially or unofficially. This isn’t meant for larger teams, or leading a team of teams. In my experience, 8-10 people is an inflection point where communication overhead explodes. At this point, more time needs to be spent on process and organization.

What’s the mindset of a tech lead?

This is a series of principles that led to good results for Search Experience, or are necessary to do the job. I’m documenting what works well in my experience.

More responsibility → Less time writing code

When I was fresh out of college, I worked at a computer vision research lab. I thought the most important thing was to write lots of code. This worked well. My boss was happy, and I was slowly given more responsibility. But then the recession hit military subcontractors, and the company went under. Life comes at you fast!

So I joined BigCo, and started at the bottom of the totem pole again. I focused on writing a lot of code, and learned to do it on large teams. This worked well. I slowly gained responsibility, and was finally given the task of running a small project. Until this point, I had been successful by focusing on writing lots of code. So I was going to write lots of code, right?

Wrong. After 2 weeks, my manager pulled me aside, and said, “Nobody on your team has anything to do because you haven’t organized the backlog of tasks in three days. Why were you coding all morning? You need to make sure your team is running smoothly before you do anything else.”

Okay, point taken.

So I made daily calendar reminders to focus on doing this extra prep work for the team. When I did this work, we moved faster as a three person unit. But I could see on my code stats where I started focusing more on the team. There was a noticeable dip. And I felt guilty, even when I expected this! Commits and lines of code are very easy ways to measure productivity, but when you’re a tech lead, your first priority is the team’s holistic productivity. And you just need to fight the guilt. You’ll still experience it. You just need to recognize the feeling and work through it.

Help others first

It sounds nice to say that you should unblock your team before moving yourself forward, but what does this mean in practice?

First, if you have work, but someone needs your help, then you should help them first. As a senior engineer, your time is leveraged: spending 30 minutes of your time may save days of someone else’s. Those numbers sound skewed, but this is the same principle behind the idea that bugs get dramatically more expensive to fix the later they are discovered. It’s cheaper to do things than redo things. You get a chance to save your teammates from having to rediscover things that are already known, or spare them from writing something that’s already written. Some exploration is good. But there’s always a threshold, and you should encourage teammates to set deadlines based on the task. Once they pass that deadline, asking for help is the best move. This could also help with catching bugs that will become push problems or production problems before they are even written.

Same for code reviews. If you have technical work to do, but you have a code review waiting, you should do the code review first. Waiting on someone to review your code is brutal, especially if the reviewing round-trip is really long. If you sit on it, the engineer will context switch to a new task. It’s best to do reviews when their memory of the code is fresh. They’re going to have faster and better answers to your questions, and will be able to quickly tweak their pull request to be submission-ready.

It’s also important to encourage large changes to be split into multiple pull requests. When discussing projects up-front, make sure to recommend how to split it up. For instance, “The first one will add the API, the second one will send the data to the client, and the third one will use the data to render the new component.” This allows you to examine each change in detail, without needing to spend hours reviewing and re-reviewing an enormous pull request. If you believe a change is too risky to submit all at once because it’s so large that you can’t understand all of its consequences, it’s OK to request that it be split up. You should be confident that changes won’t take down the site.

Even with this attitude, you won’t review all pull requests quickly. It’s impossible. For instance, most of my team isn’t in my timezone. I get reviews outside of work hours, and I don’t hop on my computer to review them until I get into work at the crack of 10.

I personally view code reviews and questions as interruptible. If I have a code review from our team, I will stop what I am doing and review it. This is not for everybody, because it’s yet another interruption type, and honestly, it’s exhausting to be interrupted all day. Dealing with interruptions has gotten easier for me over time, but I’ve gotten feedback from several people that it hasn’t for them. You will never be good at it. I’m not. It’s impossible. You will just become better at managing your time out of pure necessity.

Much of your time will be spent helping junior engineers

A prototypical senior engineer is self-directed. You can throw them an unbounded problem, and they will organize. They have an instinct for when they need to build consensus. They break down technical work into chunks, and figure out what questions need to be answered. They will rarely surprise you in a negative way.

However, not everybody is a senior engineer. Your team will have a mix of junior and senior engineers. That’s good! Junior engineers are an investment, and every senior engineer is in that position because people invested in them. There’s no magical algorithm that dictates how to split time between engineers on your team. But I’ve noticed that the more junior a person is, the more time I spend with them.

There’s a corollary here. Make sure that new engineers are aware that they have this option. Make it clear that it is normal to contact you, and that there is no penalty for doing so. I remember being scared to ask senior engineers questions when I was a junior engineer, so I always try hard to be friendly when they ask their first few questions. Go over in-person if they are at the office, and make sure that their question has been fully answered. Check in on them if they disappear for a day or two. Draw a picture of what you’re talking about, and offer them the paper after you’re done talking.

The buck stops here

My manager once told me that leaders take responsibility for problems that don’t have a clear owner. In my experience, this means that you become responsible for lots of unsexy, and often thankless, work to move the team forward.

The question, “What are things that should be easy, but are hard?”, is a good heuristic for where to spend time. For instance, when Search Experience was a new team, rolling out new features was painful. We never tire-kicked features the same way, we didn’t know what groups we should test with, we’d (unpleasantly) surprise our data scientist, and sometimes we’d forget to enable stuff for employees when testing. So I wrote a document that explained, step-by-step, how we should guide features from conception to A/B testing to the decision to launch them or disable them. Then our data scientist added tons of information about when to involve her during this process. And now rolling out features is much easier, because we have a playbook for what to do.

This can be confusing with an engineering manager and/or product manager in the picture, since they should also be default-responsible for making sure things get done. But this isn’t as much of a problem as it sounds. Imagine a pop fly in baseball, where a ball falls between three people. It’s bad if everyone stands still and watches it hit the ground. It’s better if all of you run into each other trying to catch it (since the odds of catching it are better than nobody trying). It’s best if the three of you have a system for dealing with unexpected issues. Regular 1:1s and status updates are a great way to address this, especially in the beginning.

Being an ally

Read Toria Gibbs’ and Ian Malpass’ great post, “Being an Effective Ally to Women and Non-Binary People”, and take it to heart. You’re default-responsible for engineering on your team. And that means it’s up to you to make sure that all of your team members, including those from underrepresented groups, have an ally in you.

“What does being a tech lead have to do with being an ally?” is a fair question.

First, you are the point person within your team. You will be involved in most or all technical discussions, and you will be driving many of them. Make sure members of underrepresented groups have an opportunity to speak. If they haven’t gotten the chance yet, ask them questions like, “Are we missing any options?” or “You’ve done a lot of work on X, how do you think we should approach this?”. If you are reiterating someone’s point, always credit them: “I agree with Alice that X is the right way to go.”

You will also be the point person for external teams. Use that opportunity to amplify underrepresented groups by highlighting their work. If your time is taken up by tech leading, then other people are doing most of the coding on the team. When you give code pointers, mention who wrote it. If someone else has a stronger understanding of a part of the code, defer technical discussions to them, or include them in the conversation. Make sure the right names end up in visible places! For instance, Etsy’s A/B testing framework shows the name of the person who created the experiment. So I always encourage our engineers to make their own experiments, allowing the names to be visible to all of our resident A/B test snoopers (there are dozens of us). If someone contributes to a design, list them as co-authors. You never know how long a document will live.

Take advantage of the role for spreading knowledge

When a team has a tech lead, they end up acting as a central hub of activity. They’ll talk about designs and review code for each of the projects on the team.

If you read all the code your team sends out in pull requests, you will learn at an accelerated rate. You will quickly develop a deep understanding of your team’s codebase. You will see techniques that work. You can ask questions about things that are unclear. If you are also doing code reviews outside of your team, you will learn about new technologies, libraries, and techniques from other developers. This enables you to more effectively support your team with what you have learned from across the company.

In this small team, Alice is the tech lead, and Bob is working directly with Carol. All other projects are 1 person efforts. Alice is in a position where she can learn quickly from all engineers, and spread information through the team.

Since you are in this position, you are able to quickly define and spread best practices through the team. A good resource that offers some suggestions for code reviews is this presentation by former Etsy employee Amy Ciavolino. It is a good team-oriented style. Feel free to adapt parts to your own style. If you’ve worked with me, you’ll notice this sometimes differs from what I do. For instance, if I have “What do you think?” feedback, I prefer to have in-person/Slack/Vidyo conversations. This often ends in brainstorming, and creating a third approach that’s better than what either of us envisioned. But this presentation is a great start, and a strong guideline.

Day-to-day work

As I mentioned above, much of the work of a tech lead is interrupt-driven. This is good for the team, but it adds challenges to scheduling your own time. On a light day, I’ll spend maybe an hour doing tech lead work. But on a heavy day, I’ll get about an hour of time that’s not eaten up by interruptions.

Accordingly, it’s difficult to estimate what day you will finish something. I worked out a system with our engineering manager that worked well. I only took on projects that were either small, non-blocking, or didn’t have a deadline. This is going to work well with teams trying to have a minimal amount of process. This will be a major adjustment on teams that are hyper-organized with estimation.

You need to fight the guilt that comes with this. Your job isn’t to crank out the most code. Your job is to make the people on your team look good. If something important needs to be done, and you don’t have time to do it, you should delegate it. This will help the whole team move forward.

When I’m deciding what to do, I do things in roughly this priority:

Inner loop:

  1. Answer any Slack pings
  2. Help anybody who needs it in my team’s channel
  3. Do any pending code reviews
  4. Make sure everybody on the team has enough work for the rest of the week
  5. Do any process / organizational work
  6. Project work

Once a day:

  1. Check performance graphs. Investigate (or delegate) major regressions in anything it looks like we might have affected.
  2. Check all A/B experiments. For new experiments, look for bucketing errors, performance problems (or unexpected gains, which are more likely to be bugs), etc.

Once a week:

  1. Look through the bug backlog, make sure a major bug isn’t slipping through the cracks.

What this means for engineering managers

Many teams don’t have tech leads, but every team needs tech leadership in order to effectively function. This is a call-to-action for engineering managers to examine the dynamics of their teams. Who on your team is performing this work? Are they being rewarded for it? In particular, look for members of underrepresented groups, who may be penalized for writing less code due to unconscious bias.

Imagine a team of engineers. The duties listed above are probably in one of these categories:

  1. A designated tech lead handles the work. If your team falls into this category, then great! Make sure that the engineer or engineers performing these duties are recognized.

  2. Someone’s taking responsibility for it, on top of their existing work. This can be a blessing or a curse for engineers, based on how the engineering manager perceives leadership work. It’s possible that their work is appreciated. But it’s also possible that people are only witnessing their coding output drop, without recognizing the work to move the team forward. If you’re on a team where #2 is mostly true (tech lead is not formalized, and some engineer is taking responsibility for moving the team forward, at the expense of their own IC work), ask yourself this: are they being judged just for the work they do? Or are they being rewarded for all the transitive work that they enable?

  3. A few people do them, but they often get neglected. Work still gets done in this category, but there are systematic blockers. If nobody owns code reviews, it will take a long time for code to be reviewed. If nobody owns code quality, your codebase will become a Swiss cheese of undeleted, broken flags.

  4. Nobody is taking responsibility for them. In this category, some things just won’t get done at all. For instance, if nobody is default-responsible for being an ally for underrepresented groups, then it’s likely that this will just be dropped on the floor. This kind of thing is fractal: if we drop the ball on the group level, we’ve dropped the ball on both the individual, and company-wide, levels.

In conclusion

There is value in having a designated tech lead for your team. They will create and promote best practices, be a point-person within your team, and remove engineering roadblocks. Also, this work is likely already being done by somebody, so it’s important to officially recognize people who are taking this responsibility.

There is also lots of value in officially taking on this role. It allows you to leverage your time to move the organization forward, and enables you to influence engineering throughout the entire team.

If you’re taking on this work, and you’re not officially a tech lead, you should talk with your manager about it. If you’d like to move towards becoming a tech lead, talk to your manager (or tech lead, if you have one!) about any responsibilities you can take on.

Thanks to Katie Sylor-Miller, Rachana Kumar, and Toria Gibbs for providing great feedback on drafts of this, and to everyone who proofread my writing.

My friends trolled each other with my Discord bot, and how we fixed it

My last post describes a Discord bot, named “crbot,” that I wrote for my college friends. Its name is short for “call-and-response bot.” It remembers simple commands that are taught by users. I patterned it after some basic functionality in a bot we have at Etsy. crbot has become a useful part of my friends’ channel. It’s been taught over 400 commands, and two of my friends have submitted patches.

But there were problems. We also used crbot to troll each other. Bug fixes were needed to curb bad behavior. The situation is better now, but I’ve had a nagging question: “should I have seen this coming, or could I have only fixed these problems by being reactive?” I couldn’t answer this without my friends’ perspective. So I asked them!

I was hoping for a small discussion. However, it blew up. With the bot as a staging ground, the subtext of the conversation was about our friendships. Many of us have known each other for a decade. Some, far longer. We’ve seen each other through major life events, like home purchases, children, and weddings. But I don’t remember any conversation where we’ve discussed, at length, how we perceive each other’s actions. And we only resorted to personal attacks once or twice. Go us!

To answer “Could I have seen this coming?”, we’re going to look at this in three parts:

  1. What happened? A story about how the bot was used and abused, and how it changed over time.
  2. What did my friends think? All of their insights from our discussion.
  3. Lessons learned. Could I have seen this coming?

I think it’s worth adding a disclaimer. These discussions have an implicit “within my group of friends” clause. The fixes work because my friends and I have real-life karma to burn when we mess with each other. Not because they’re some panacea.

What happened?

First bot version: ?learn, without ?unlearn

Channel #general
// Teach a bot a command.
Katie: ?learn brent https://www.example.com/brent.gif
crbot: Learned about brent
// Later, it's used.
Chris: my meeting went well today
Jake: ?brent
crbot: https://www.example.com/brent.gif
Discord unfurler: [gif of a kid giving a thumbsup]

Sidebar: Where did the word “unfurler” come from? Is it named because it “un-f”s the url? Or because it’s actually sailing related? The people need to know.

At launch, my friends’ initial reaction was mixed. Some immediately got it, and taught it inside jokes. Others said, “I don’t understand what this does, or why we’d want this.” One cleverly pointed it to a URL of a precipitation map, only to discover that Discord’s unfurler wasn’t re-caching the image within a useful timespan. By now, everyone has latched onto the bot’s ability to quickly recall gifs. Good enough for me.

At launch, crbot could learn new commands. But it could not forget them. This introduced a land grab, where my friends claimed as many names as possible. Scarcity encouraged usage, which was good for the bot. However, my friends took this opportunity to ?learn ridiculous or insulting things for each other’s names.
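The launch behavior can be sketched roughly as follows. This is a minimal illustration, not crbot's actual code: the `CommandStore` class, its method names, and the "I already know about" reply are all hypothetical, and it assumes a name that has been taught cannot be re-taught, which is what made the land grab possible.

```python
class CommandStore:
    """Hypothetical learn-only store: the first claim on a name wins."""

    def __init__(self):
        self._responses = {}

    def learn(self, name, response):
        # Assumption: a taken name cannot be overwritten by a later ?learn.
        if name in self._responses:
            return f"I already know about {name}"
        self._responses[name] = response
        return f"Learned about {name}"

    def call(self, name):
        # Unknown commands get no reply at all.
        return self._responses.get(name)


store = CommandStore()
print(store.learn("brent", "https://www.example.com/brent.gif"))
print(store.learn("brent", "something insulting"))
print(store.call("brent"))
```

With no way to remove an entry, whoever teaches a name first owns it for good.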

It got a little nasty. One message insulted someone’s intelligence. Another referenced the fact that someone is adopted. I’m speaking generically because the phrasing was over-the-top. There wasn’t a legitimate defense against this. You could claim all the commands that you might find insulting, if you were somehow able to foresee every way that you could be offended. For instance, people tried squatting on placeholders on the ?learns of their actual names and online handles. However, the bot is case-sensitive, so they’d need to protect all capitalized variations of their names.
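To get a feel for why squatting on your own name doesn't scale against case-sensitive lookups, here is a small sketch that enumerates every capitalization of a name. The helper `case_variants` is hypothetical and only illustrates the arithmetic: each letter doubles the number of names to protect.

```python
from itertools import product


def case_variants(name):
    """Return every upper/lower-case spelling of name."""
    # Each character contributes its lower and upper forms (one form for digits).
    choices = [{ch.lower(), ch.upper()} for ch in name]
    return {"".join(combo) for combo in product(*choices)}


variants = case_variants("brent")
print(len(variants))  # 2 ** 5 = 32 names to squat on
```

A five-letter name already needs 32 placeholder commands, and that is before nicknames and online handles.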

If you’ve read “A Group Is Its Own Worst Enemy” by Clay Shirky, none of this is a surprise. Technology amplified our group’s negative interactions, and there was no moderation to fight back. The joy from trolling on crbot outweighed any reputation hit they took to their actual friendships, especially if they learned the commands in private. However, most of the things that I found really abusive were done in broad daylight.

global ?unlearn

To combat this, Ryan submitted the first user-contributed patch, to add ?unlearn. Now, users could make the bot forget responses.

Channel #general
// Show that the bot has not forgotten about dre
Jake: ?dre
crbot: I know who Dr. Dre is!
// Delete the ?dre command
Jake: ?unlearn dre
crbot: Forgot about dre
// The bot has forgotten about dre
Jake: ?dre
// No response

This helped a little. There was no longer an incentive to be the first to ?learn something. Now, you needed to be the last to learn it. This incentivizes high-quality ?learns. Your crummy commands are going to be replaced by better ones.

There was an adjustment period, where we figured out acceptable use. For instance, I replaced ?popcorn with a better gif, which touched off an argument about ?unlearn etiquette. There was a long sidebar about the life choices of people who post shitty gifs, when better ones exist. We settled on some guidelines, like “the person to ?unlearn a command should be the person who ?learned it.” We don’t always follow these rules. But it’s a good start.

?unlearn introduced a second problem. ?unlearn could be executed in a private channel. This introduced an attack where popular commands could be replaced in a direct message with crbot.

Direct Message with crbot
Attacker: ?unlearn ping
crbot: Forgot about ping
Attacker: ?learn ping Fuck you, ping yourself.
crbot: Learned about ping
Later, in #general
Jake: hey! there's a new release of the bot
Jake: ?ping
crbot: Fuck you, ping yourself.
Jake: :(
Jake: ?unlearn ping
crbot: Forgot about ping
Jake: ?learn ping pong
crbot: Learned about ping
Jake: ?ping
crbot: pong

As a design decision for crbot, I don’t log anything. Basically, I don’t want to respond to “who did X” bookkeeping questions, and I don’t want to write a system for others to do this. I don’t know who ?learned what, and I don’t care. This anonymity created a problem where there is no accountability for private ?unlearns. To this day, I still don’t know who did these. Nobody ever stepped forward. I would claim that I didn’t take part, but that’s what everybody else says, too.

Public-only ?unlearn

We had a group discussion about ?unlearn, where I proposed that ?learn and ?unlearn could only be executed in public channels. My idea was that a public record would force everybody to properly balance the social forces at work. Andrew and Bryce argued that only ?unlearn should be prevented from being executed in private. This would force our real-life karma to be tied to removing someone else’s work. But ?learn should be allowed in private channels, since Easter eggs are fun. Plus, the command list is so large, that a new command will never be found, without being used publicly by the person who created it.

So, Ryan tweaked his ?unlearn implementation so it could only be executed in public channels. Now that a month has gone by, it has elegantly balanced ?unlearn and ?learn within our group of friends. The social forces at work have prevented further abuses of the system.
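The rule we landed on can be sketched like this. It is only an illustration under assumptions, not Ryan's patch: the function name, the `is_private` flag, and the refusal message are hypothetical, and it assumes the bot can tell a direct message from a public channel.

```python
def handle_unlearn(responses, name, is_private):
    """Hypothetical ?unlearn handler; responses maps command name -> reply."""
    if is_private:
        # Assumption: refuse in direct messages, so every removal has witnesses.
        return "?unlearn only works in public channels"
    if responses.pop(name, None) is None:
        return None  # Unknown command: stay silent.
    return f"Forgot about {name}"


responses = {"ping": "pong"}
print(handle_unlearn(responses, "ping", is_private=True))
print(handle_unlearn(responses, "ping", is_private=False))
```

Note that ?learn is left alone, so Easter eggs can still be taught in private.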

Hobbit bomb

One of our friends is often called a hobbit. I don’t know the details. Something about his feet.

Anyways, this led to ?hobbitbomb, which pastes a URL of his picture. 48 times in one message. So, typing ?hobbitbomb once causes the Discord unfurler to inline the same image 48 times. The effect is that it takes up a massive amount of vertical screen real estate; it takes a long time to scroll past the bomb. It was used 7 times across a month (I used it a few of those times), and then effectively abandoned.

My friends’ reactions fell into 2 camps.

  1. So what?
  2. This makes Discord unusable. Also, this isn’t funny.

At one point, somebody decided that they’d had enough, and they ?unlearned ?hobbitbomb. The original poster, not to be deterred, created ?hobbitbombreturns, ?hobbitbombforever, and ?hobbitbombandrobin, all of which were duplicates of the original. A good meme has a healthy immune system.

Then, there was a lengthy detente, where the capability still existed to ?hobbitbomb, but nobody was using it. Finally, the command was brought up as a major source of frustration during our lengthy conversation on trolling in our channel. My friends settled on a social outcome: they limited the bomb size to 4 images (since 4 hobbits form a fellowship). It still exists, it still requires scrolling, but it’s not extreme.

What did my friends think?

?unlearn abuse

Multiple patches were needed to balance ?learn and ?unlearn. Despite that, some people in the channel didn’t think that ?learn abuse was noteworthy. Their viewpoint was interesting to me. This required actual code fixes, so it must have been the biggest problem we faced. Right?

But when looking at individual instances, the problems caused by ?unlearn were minor. “I’m kind of annoyed right now,” or “I need to claim my username, so that nobody learns something weird for it.” This happened in low volume over time. For me, it added up to being the worst abuse of crbot. For other people, this was just something mildly annoying to deal with over time.

Hobbit bomb, and my own blind spots

Before the discussion, I hadn’t given ?hobbitbomb a second thought. “So what?” was my official position, and I wasn’t alone in having it. Scrolling seemed like a minor problem. But other people were seriously impacted. One friend felt it was the only abuse in the channel, and had to be reminded of all the ?unlearn abuse.

Before the bot, we had 2 tiers of users: moderators, and an owner. But when I added the bot to the channel, I created another implicit power position as the bot’s maintainer. I can change the bot to prevent behaviors from happening again. I can reject my friends’ patches if I don’t like them. I can still do private ?unlearn myself, since I have access to the Redis database where the commands are stored. And I can just shut off the bot someday, for any reason I want. This cuts both ways – I’m held in check, because the bot can be banned.

Anyways, the most interesting part of our discussion was finding out that I had a blind spot in how I handled this situation. I never thought that ?hobbitbomb was a problem, so I didn’t even file a bug ticket. I had been treating social problems like technical bugs, but this one hadn’t risen to the level of reporting yet. I needed to disconnect myself from my own feelings, and implement fixes based on my users’ complaints. As Chris put it, “the issue is its potential and how different people react to its use.”

Otherwise, you end up like Twitter, which had a long and storied harassment problem that has reportedly cost the company potential buyers. In my experience with crbot, users have great suggestions for fixing problems, and I’ve seen great user suggestions for Twitter. For instance, “I only want to be contacted by people who have accounts that are verified with a phone number. And when I block an account, I never want to see another account associated with that phone number.”

Technical solution, or compromise?

Since technical and social problems are related, I offered to fix ?hobbitbomb technically; I’d limit crbot’s output to a small number of URLs per response. ?hobbitbomb might be the only command that has multiple URLs per response, so it would have little impact. One of my friends pointed out that part of its utility is how annoying it is. So this would have the dual-impact of reducing pain, and reducing utility.
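A rough sketch of what that technical fix might look like, assuming URLs can be spotted with a simple regular expression. The function name, the pattern, and the default cap of 4 are illustrative assumptions, not something crbot implements.

```python
import re

# Assumption: anything starting with http:// or https:// up to whitespace is a URL.
URL_PATTERN = re.compile(r"https?://\S+")


def limit_urls(response, max_urls=4):
    """Drop every URL after the first max_urls, keeping the other text."""
    seen = 0

    def keep_or_drop(match):
        nonlocal seen
        seen += 1
        return match.group(0) if seen <= max_urls else ""

    trimmed = URL_PATTERN.sub(keep_or_drop, response)
    # Collapse the whitespace left behind by removed URLs.
    return " ".join(trimmed.split())


bomb = " ".join(["https://www.example.com/hobbit.png"] * 48)
print(limit_urls(bomb).count("https://"))  # 4
```

This only caps how many images the unfurler is asked to render; it does nothing about a single very tall image.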

My friends rejected this offer, and decided to work towards a compromise. This was interesting to me; the core problem is still latent in the project. I may still implement my fix. If the bot were exposed to the public, I’d have to implement it, given how the Discord unfurler works. But on the other hand, I can think of a dozen ways to troll somebody with the bot, and I haven’t even finished my second cup of coffee. Plus, the premise of this whole chatroom is that we are friends. We have the option to create house rules, which might not be available in public forums.

During the discussion, we reduced the ?hobbitbomb payload from 48 to 4 images. This is enough to clear a screen, but doesn’t force people to scroll through multiple pages of hobbits. I don’t think that everybody was happy with this, since the ?hobbitbomb payload still exists. But both camps accepted it, and the great ?hobbitbomb war of 2017 was finally put to bed.

Social forces of friendship

Most of the problems with the bot were fixed with fairly light technical solutions, or house rules. For instance, public-only ?unlearn was the last time we saw ?unlearn abused in any capacity, even though there are still plenty of ways to cause mischief. And we have a few house rules; for instance, “don’t ?unlearn commands you didn’t ?learn.”

As Chris pointed out, this implies that everybody in the group assigns some weight to the combination of “we care about each other” and “we care about how we are perceived by each other.” This adds a hefty balancing force to our channel. It also means that all of my fixes for this channel are basically exclusive to this channel. There’s no way that this bot could be added to a public Discord channel. It would turn into a spammy white supremacist in 3 seconds.

Could I have seen this coming?

Or put a better way, “If I had to implement just one anti-trolling solution, before any of my friends ever used the bot, what would I implement?”

I imagined tons of problems that never arose. Nobody mass-unlearned all messages. Nobody mass-replaced all the messages. Nobody did a dictionary attack to ?learn everything. Nobody tried spamming the ?list functionality to get it blocked from using GitHub Gists. Nobody managed to break the parser, or found a way to get the bot to /leave the channel (not that they didn’t try). I didn’t need to add the ability to undo everything that a specific user had done to the bot. Once my friends saw the utility of crbot, there was little risk in the bot being ruined.

I did foresee spam as a problem. But I would have guessed that it’d be somebody repeating the same message, over and over again, to make the chat unusable. I never expected ?hobbitbomb, one message that was so large that some of my friends thought it broke scrolling. I’m not even sure this is fixable; even if I limited the number of images in a response to one, I imagine that one skinny + tall image can be equally annoying. I’m at the mercy of the unfurler here. Also, my traditional image of spam is something that comes in mass volume, not something that has a massive volume.

So, back to ?learn without ?unlearn. I should have seen that one coming. My idea was that this created scarcity, so people would be encouraged to use the bot. I didn’t imagine that people would use the opportunity to ?learn things that were abusive. Plus, the implementations of ?learn and ?unlearn are quite similar, so I could have quickly gotten ?unlearn out the door, even if I still wanted to launch the bot with just ?learn. Launching without ?unlearn was too aggressive. Even with social pressures at work, we needed to have the ability to undo.

When reviewing the ?unlearn patch, I never guessed that private ?unlearn would be abused like it was. Honestly, a lot of this surprised me. This wasn’t even the general public; these were all problems that were surfaced by people I’ve known for a decade. If I can’t predict what they’re going to do, then it feels like there’s no hope to figure this out ahead of time, even if you have mental models like “private vs. public,” or “what is the capacity of people to tolerate spam?”

So my key takeaways from this project are pretty simple.

  • Discuss bug fixes with impacted users. They have great opinions on your fixes, and will suggest better ideas than you had. Especially if the people are technical.
  • Treat all user complaints like technical bug reports. Not just the ones you agree with. That doesn’t mean that all reports are important. But they deserve to have estimates for severity, scope of impact, and difficulty of the fix.
  • Plan on devoting post-launch time to fixing social problems with technical fixes. Because you will, whether you plan on it or not.
  • Every action needs to be undone. The most basic of moderation tools. Not even limiting the bot to my own friends obviated this.
  • Balance public-only and public+private. Balance privacy and utility. When something involves your personal information, it should be default-private. When your actions interact with other users, it should be attributed to you.
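
To make the "every action needs to be undone" and "attributed to you" takeaways concrete, here is a minimal sketch in Go. It is not how crbot works; the store type, its methods, and the reply wording are all hypothetical, invented for illustration. The idea is that every ?learn and ?unlearn records who did it and what was there before, so the most recent change can always be rolled back.

```go
package main

import "fmt"

// change is a hypothetical record of one edit to the bot's memory.
type change struct {
	Author      string // Who made the change, so replies can name them.
	Call        string // The command that was changed.
	Previous    string // The response before the change, if there was one.
	HadPrevious bool
}

// store is a hypothetical in-memory command store with an undo history.
type store struct {
	responses map[string]string
	history   []change
}

func newStore() *store {
	return &store{responses: map[string]string{}}
}

// record saves the current state of a call before it is modified.
func (s *store) record(author, call string) {
	prev, ok := s.responses[call]
	s.history = append(s.history, change{
		Author: author, Call: call, Previous: prev, HadPrevious: ok,
	})
}

// learn teaches a response and returns a public, attributed reply.
func (s *store) learn(author, call, response string) string {
	s.record(author, call)
	s.responses[call] = response
	return fmt.Sprintf("%s taught me ?%s", author, call)
}

// unlearn forgets a response and returns a public, attributed reply.
func (s *store) unlearn(author, call string) string {
	s.record(author, call)
	delete(s.responses, call)
	return fmt.Sprintf("%s made me forget ?%s", author, call)
}

// undo rolls back the most recent change. It reports false if there is
// nothing left to undo.
func (s *store) undo() bool {
	if len(s.history) == 0 {
		return false
	}
	last := s.history[len(s.history)-1]
	s.history = s.history[:len(s.history)-1]
	if last.HadPrevious {
		s.responses[last.Call] = last.Previous
	} else {
		delete(s.responses, last.Call)
	}
	return true
}

func main() {
	s := newStore()
	fmt.Println(s.learn("alice", "later", "see you tomorrow"))
	fmt.Println(s.unlearn("bob", "later"))
	s.undo() // Roll back bob's unlearn.
	fmt.Println(s.responses["later"])
}
```

Because the attributed reply is the return value, the caller has nothing else to post, which is one way to keep destructive commands public by default.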

Thanks to Andrew, Brad, Bryce, Chris, Drew, Eric, Katie, and Ryan for sharing their thoughts!

Writing a Discord bot, and techniques for writing effective small programs

My old college friends and I used a Google Hangout to keep in touch. Topics were a mix of “dear lazychat” software engineering questions, political discussion, and references to old jokes. Occasionally, out of disdain for Hangouts, we discussed switching chat programs. A few friends wanted to use “Discord,” and the rest of us ignored them. It was a good system.

But then one day, Google announced they were “sunsetting” (read: murdering) the old Hangouts application, in favor of two replacement applications. But Google’s messaging was odd. These Hangouts applications were targeted to Enterprises? And why two? We didn’t take a lot of time to figure this out, but the writing on the wall was clear: at some point, we would need to move our Hangout.

After the news dropped, my Discord-advocating friends set up a new server and invited us. We jumped ship within the hour.

It turns out that they were right, and we should have switched months ago. Discord is fun. It’s basically Slack for consumers. I mean, there are differences. I can’t add a partyparrot emoji, and that’s almost a dealbreaker[0]. But if you squint, it’s basically Slack, but marketed to gamers.

As we settled in to our new digs, I found I missed some social aspects of Etsy’s Slack culture. Etsy has bots that add functionality to Slack. One of my favorites is irccat. It’s designed to “cat” external information into your IRC (now Slack) channel. It has an “everything but the kitchen sink” design; you can fetch server status, weather, stock feeds, or a readout of the food trucks that are sitting in a nearby vacant lot. A whole bunch of things.

But one of my favorite features is simple text responses. For instance, it has been taught to bearshrug:

Me: ?bearshrug
irccat: ʅʕ•ᴥ•ʔʃ

Or remember URLs, which Slack can unfurl into a preview:

Me: hey team!
Me: ?morning
irccat: https://bitlog.com//wp-content/uploads/2017/03/IMG_0457.jpg

Lots of little routines build up around it. When a push train is going out to prod, the driver will sometimes ?choochoo. When I leave for the day, I ?micdrop or ?later. It makes Etsy a little more fun.

A week or two ago, I awoke from a nap with the thought, “I want irccat for Discord. I wonder if they have an API.” Yes, Discord has an API. Plus, there is a decent Golang library, Discordgo, which I ended up using.

And away I go!

Side project organization

So, yeah, that age-old question, “How much effort should I put into my side project?”

The answer is always, “It’s your side project! You decide!” And that’s unhelpful. Most of my side projects are throwaway programs, and I write them to throw away. The Discord bot is different; if my friends liked it, I might be tweaking it for years. Or if they hated it, I might throw away the work. So I decided to “grow it.” Write everything on a need-to-have basis.

I get good results when I grow programs, so I’m documenting my ideas around this, and how it sets me up for future success without spending a lot of time on it.

I want to be 100% clear that there’s nothing new here. Agile may call this “simple design.” Or maybe I’m practicing “Worse is Better” or YAGNI. I’ve read stuff written by language designers, Lisp programmers, and rocket scientists about growing their solutions. So here’s my continuation, after standing on all these shoulders.

Growing a newborn program

Most of my programs don’t live for more than a day or two. Hell, some never leave a spreadsheet. Since I spend most of my time writing small programs, it makes sense to have rules in place for doing this effectively.

Writing code in blocks makes it easy to structure your programs

By this, I mean that my code looks roughly like this:

// A leading comment, that describes what a block should do.
something, err := anotherObject.getSomething()
if err != nil {
    // Handle error, or maybe return.
}
log.Printf("Acquired something: %d", something.id)
something.doAnotherThing()

Start the block with a comment, and write the code for the comment. The comment is optional; feel free to omit it. There aren’t hard-and-fast rules here; many things are just obvious. But I often regret skipping the comments, as measured by the number that I add back when refactoring.

Blocks are useful, because the comments give a nice pseudocode skeleton of what the program does. You can then decide whether each block is correct based on its comment. It’s an easy way to fractally reason about your program: Does the high level make sense? Do the details make sense? Yay, the program works!
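
For a more concrete picture, here is a small, runnable sketch of a function written block by block. The loadResponses helper and its "call response" line format are made up for illustration; they are not part of crbot. Reading only the comments should give you the pseudocode, and each block can be checked against its own comment.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// loadResponses is a hypothetical helper that turns lines of the form
// "call response text" into a lookup table.
func loadResponses(input string) (map[string]string, error) {
	responses := make(map[string]string)

	// Walk the input one line at a time.
	scanner := bufio.NewScanner(strings.NewReader(input))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())

		// Skip blank lines.
		if line == "" {
			continue
		}

		// Split the line at the first space, or give up.
		call, response, found := strings.Cut(line, " ")
		if !found {
			return nil, fmt.Errorf("malformed line: %q", line)
		}

		// Remember the response for this call.
		responses[call] = response
	}

	// Surface any read error from the scanner.
	if err := scanner.Err(); err != nil {
		return nil, err
	}
	return responses, nil
}

func main() {
	responses, err := loadResponses("bearshrug ʅʕ•ᴥ•ʔʃ\nlater see you tomorrow\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(responses["bearshrug"])
}
```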

For instance, if you took the hello-world version of my chatbot, and turned it into crappy skeletal pseudocode, it would look like this:

main:
    ConnectToDiscord() or die
    PingDiscord() or die
    AddAHandler(handler) or die
    WaitForever() or wait for a signal to kill me

handler:
    ReadMessage() or log and return
    IsMessage("?Help") or return
    ReplyWithHelpMessage()

There’s a lot of hand-waving in this pseudocode. But you could implement a chatbot in any language that supported callbacks and had a callback-based Discord library, using this structure.
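
As a rough illustration of that structure, here is a runnable Go sketch that swaps the real Discord connection for an in-memory stand-in. The session and message types, their methods, and the help text are all hypothetical; a real bot would get the equivalents from a library such as Discordgo, and would block waiting for a signal instead of simulating a conversation.

```go
package main

import (
	"fmt"
	"log"
	"strings"
)

// message and session are hypothetical stand-ins for the types that a
// callback-based chat library would provide.
type message struct {
	Author  string
	Content string
}

type session struct {
	handlers []func(*session, *message)
	sent     []string // Everything the bot has said, for inspection.
}

// addHandler registers a callback to run on every delivered message.
func (s *session) addHandler(h func(*session, *message)) {
	s.handlers = append(s.handlers, h)
}

// send records and prints a reply instead of talking to a real server.
func (s *session) send(text string) {
	s.sent = append(s.sent, text)
	fmt.Println("bot:", text)
}

// deliver simulates the library invoking callbacks for an incoming message.
func (s *session) deliver(m *message) {
	for _, h := range s.handlers {
		h(s, m)
	}
}

const helpText = "I only know ?help so far."

func handler(s *session, m *message) {
	// Read the message, or log and return.
	if m == nil {
		log.Print("received an empty message")
		return
	}

	// Ignore anything that isn't a request for help.
	if strings.TrimSpace(m.Content) != "?help" {
		return
	}

	// Reply with the help message.
	s.send(helpText)
}

func main() {
	// Stand in for connecting to the chat service.
	s := &session{}

	// Register the handler.
	s.addHandler(handler)

	// Simulate a short conversation instead of waiting forever.
	s.deliver(&message{Author: "me", Content: "hey team!"})
	s.deliver(&message{Author: "me", Content: "?help"})
}
```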

Divide your code into phases

In my first job out of school, I worked at a computer vision research lab. This was surprisingly similar to school. We had short-term prototype contracts, so code was often thrown away forever. It wasn’t until I got a job at Google later that I started working on codebases that I had to maintain for multiple years in a row.

At the research lab, I learned what “researchy code” was – complicated, multithreaded computer code emulating papers that are dense enough to prevent a layperson from implementing them, but omit enough that a practicing expert can’t implement them either. No modularization. No separation of concerns. Threads updating mutable state everywhere. Not a good place to be.

So, my boss had the insight that we should divide these things at the API level, and have uniform ways to access this information. Not groundbreaking stuff, but this cleverly managed a few problems. Basically, the underlying code could be as “researchy” as the researcher wanted. However, they were bound by the API. So once you modularized it, you could actually build stable programs with unstable components. And once you have a bunch of DLLs with well-defined inputs and outputs, you can string them together into data-processing pipelines very easily. One single policy turned our spaghetti code nightmare into the pasta aisle at the supermarket; the spaghetti’s all there, but it’s packaged up nicely.

I took this lesson forward. When writing small programs, I like to code the steps of the program into the skeleton of the application. For instance, my “real” handler looked like this, after stripping out all the crap:

command, err := parseCommand(...)
if err != nil {
    info(err)
    return
}
switch command.Type {
case Type_Help:
    sendHelp(...)
case Type_Learn:
    sendLearn(...)
case Type_Custom:
    sendCustom(...)
case Type_List:
    sendList(...)
}

Dividing my work into “parse” and “send” phases limits the damage; I can’t write send() functions that touch implementation details of parse(), so I’m setting myself up for a future where I can refactor these into interfaces that make sense, and make testing easier.
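
Here is a minimal sketch of what that split might look like as a complete program. It is a simplified illustration rather than the real crbot code: the command struct, the lowercase type names, the single send function, and the reply strings are assumptions made for this example, and ?list is left out. The point is that parseCommand only ever sees text, and send only ever sees a parsed command.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

type commandType int

const (
	typeHelp commandType = iota
	typeLearn
	typeCustom
)

// command is the only thing that crosses from the parse phase to the send phase.
type command struct {
	Type     commandType
	Call     string
	Response string
}

var errNotCommand = errors.New("not a command")

// parseCommand is the parse phase. It only looks at the message text.
func parseCommand(content string) (*command, error) {
	if !strings.HasPrefix(content, "?") {
		return nil, errNotCommand
	}
	fields := strings.Fields(content[1:])
	if len(fields) == 0 {
		return nil, errNotCommand
	}
	switch fields[0] {
	case "help":
		return &command{Type: typeHelp}, nil
	case "learn":
		if len(fields) < 3 {
			return nil, errors.New("usage: ?learn call response")
		}
		// Note: joining the fields collapses runs of whitespace in the response.
		return &command{Type: typeLearn, Call: fields[1], Response: strings.Join(fields[2:], " ")}, nil
	default:
		return &command{Type: typeCustom, Call: fields[0]}, nil
	}
}

// send is the send phase. It never sees the raw message text.
func send(cmd *command, learned map[string]string) string {
	switch cmd.Type {
	case typeHelp:
		return "Try ?learn call response, then ?call"
	case typeLearn:
		learned[cmd.Call] = cmd.Response
		return "Learned " + cmd.Call
	case typeCustom:
		return learned[cmd.Call] // Empty if the call was never learned.
	}
	return ""
}

func main() {
	learned := map[string]string{}
	for _, line := range []string{"?learn later see you tomorrow", "?later", "just chatting"} {
		cmd, err := parseCommand(line)
		if err != nil {
			fmt.Println("ignored:", err)
			continue
		}
		fmt.Println(send(cmd, learned))
	}
}
```

Since the parse phase has no side effects, it can be tested with plain strings, and the send phase can be tested by constructing command values by hand.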

Avoid optimizations

Fresh out of college, I over-optimized every program I wrote, and blindly followed trends that I had read about recently. I’d optimize for performance, or overuse design patterns, or abuse SOLID principles, or throw every feature in C++ at a problem. I’m guilty of all of these. Lock me up. Without much industry experience, I just didn’t understand how to tactically use languages, libraries, and design techniques.

So I’ve started making a big list of optimizations that I don’t pursue for throwaway personal programs.

  • Don’t make it a “good” program. It’s fine if it takes 9 minutes to run. It’s fine if it’s a 70 line bash script. Writing it in Chrome’s Javascript debugger is fine. Hell, you’d be shocked how much velocity you can have in Google Sheets.
  • Writing tests vs tracking test cases. Once you’ve written enough tests in your life, you can crank out tests for new projects. But if your project is literally throwaway, there’s a break-even point for hand-testing vs automated testing. Track your manual test cases in something like a Google Doc, and if you’re passing that break-even point, you’ll have a list of test cases ready.
  • Make it straightforward, not elegant. My code is never elegant on the first try. I’m fine with that. Writing elegant code requires extra refactoring and extra time. And each new feature could require extra changes to resimplify.
  • Don’t overthink. Just write obvious code. You don’t need to look something up if you can guess it. For instance, variable names: my variable name for redis.Client is redisClient. I’m never going to forget that, and it’s never going to collide with anything. Good abbreviations require project-wide consistency, and for a 1000 line project, it’s hard to get away with a lot of abbreviated names.
  • Don’t make it pretty. For instance, my line length constraints are “not too much.” So if I look at something and say, “that’s a lot!” I keep it. But I refactor if I say, “That’s too much!”

Release

Once I tested the code, and got the bot running, I invited it into our new Discord channel. Everyone reacted differently: some still don’t understand the bot, and others immediately started customizing it. Naturally, my coder friends tried to break it. One tried having it infinitely give itself commands; another fed it malformed commands to see if it would break. Two of my friends have filed bugs against me, and one is planning on adding a feature. My friends have actually adopted it as a member of the channel. I love the feeling of having my software used, even just by a few people.

There have also been some unexpected usages. Somebody tried to link to snowfall images that are updated on the remote server. Unfortunately, Discord’s unfurler caches them, so this approach didn’t work like we wanted it to. Bummer. My program almost came full circle; my call-and-response bot would have been used to cat information into the channel, just like its predecessor, irccat.

So yeah, my chatbot is alive, and now comes the task of turning it from a small weekend project into More Serious code. Which has already started! Click here to follow me on Twitter to get these updates.

Links

Github project: https://github.com/jakevoytko/crbot

Version of code in the post: https://github.com/jakevoytko/crbot/commit/8ceaeaf1ec34a45e91eff49907db1585d5d22f53

[0] For people who do not know me well: I am serious. I cannot be more serious.