Category Archives: industry

Finish something every day

When you write code in an engineering organization, you will do the following:

Type the code out.
Test some of it. Hell, maybe you’ll test all of it.
Get someone to review the code.
Push it to source control.

These items aren’t discrete or ordered. Test-driven development and pair programming are practices that reorder or merge these items. But these should happen for most changes.

Sometimes, you’re given a large task. You have a question at this point: should I break it up? Should I write the whole thing at once? In my experience, the best tradeoff is to finish something every day, even if it’s small. Write it. Test it. Send it for review.

This introduces a lot of tradeoffs. It’s not always possible. It makes some assumptions about your work environment. We will discuss all of these below.

Benefits

Small changes are better for you

Let’s say that you’re a full stack developer at a web shop. You are assigned a new task: add a new form to collect and store some data, and then display it on another part of the site. The designer whipped up mocks. The product manager wrote out the expected behavior. Now it’s your turn.

You see all of your tasks: Modify a form to collect the data. Include it on the request to the server. Write it into the ORM. Modify the database to store it. Think about the default values and whether we can backfill the data from another location. Read the data from the new location. Render it into the view.

There are a few obvious ways to approach the code:

Write all of it at once.
Write the database code, write the write path, write the read path.

There’s a less obvious way to approach the code:

Write the database code, write the read path (with stubbed data), and write the save path.

There may be other alternatives that depend on the details of your project. But look at what happened: The project is now decomposed into smaller tasks. Even better, we see that the ordering of two of the tasks doesn’t matter. The data layer code blocks everything else. But the other two tasks are independent of each other. You can pick the most convenient task ordering. They could even be done at the same time. This is the first insight of decomposing tasks: some work becomes parallelizable.
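Here is a minimal sketch of that decomposition as a dependency graph, using Python's standard `graphlib` module. The task names are hypothetical labels for the three chunks above, not names from a real project. Each batch it prints contains tasks that do not block each other.

```python
from graphlib import TopologicalSorter

# Hypothetical task names. Each task maps to the set of tasks it depends on.
TASKS = {
    "data_layer": set(),
    "read_path": {"data_layer"},   # can start from stubbed data once the model exists
    "save_path": {"data_layer"},
}

def parallel_batches(tasks):
    """Group tasks into batches; tasks in the same batch can be done in parallel."""
    sorter = TopologicalSorter(tasks)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        # Everything returned here has all of its dependencies finished.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches

print(parallel_batches(TASKS))
# [['data_layer'], ['read_path', 'save_path']]
```

The second batch has two entries. That is the parallelizable part: either order works, or two people can take one each.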

Parallelization is where the magic happens. This means that you’ve converted a one-developer project into a two-developer project. This is a great way to grow as an engineer. It lets you practice for project lead or tech lead positions. This also helps you practice for artificial deadline pressure. In an “emergency,” you could make the following proposal to your team: “we can get this out faster if we add a second engineer to this. I can add the model code today. Someone can work on the view tomorrow while I work on the save path.”

It’s also good practice to go through a full “write, test, review, deploy” cycle frequently. Practice makes perfect. You will become more capable as you push more and more code. Your “small” changes will become larger and larger. It also becomes insurance against seniority. As you get more responsibilities, you will probably suffer from the Swiss cheese calendars that plague many senior employees. It’ll be your job to help people and maintain relationships around the company. People often need help at awkward times on your calendar. If you are in the habit of producing small changes, it’s a little easier to write code. You can still finish something if you have two hours between meetings.

Interestingly, you will discover failure cases as you parallelize work. These failure cases aren’t always obvious. What could go wrong? Some tasks are just too small. Every “write, test, review, deploy” cycle has overhead. Sometimes the overhead is huge compared to the size of the change. You will also notice that saturating a project with the maximum number of engineers doesn’t work as well as it sounds. If someone’s schedule slips, other engineers will be blocked. This is okay in the occasional “emergencies” where shipping a feature ASAP is the most important thing in the world. But you burn intangible resources (goodwill, happiness, team cohesion) by perpetually oversubscribing projects. You will learn to find a sustainable headcount for a project.

There are selfish reasons to ship all the time. Shipping is a form of advertisement. People see that you’re constantly “done” with something because you’re always asking for a review. But this is a double-edged sword. You’re always going to be asking for code reviews. The reviews should be worth the time of the reviewer. Make them large enough to be interesting. If you’re distracting them with adding a single line, you’re doing the team a disservice. This is why I’ve found “a day of work” to be a good tradeoff.

Better for the codebase

I’m going to tell you a horror story. Remember the above example: adding a UI feature to a web app? I’m going to work on that change. And I’m going to do the whole thing in a single pull request. I swoop my cape over my face and disappear into my evil lair for over a week.

I send you the code review out of nowhere. You look at it. Thousands of lines added across a few dozen files: tests, database configurations, view templates. This is going to take forever to review. You skim the files. You eventually get to the database file. You see that something is wrong: I should have added another table for the new data. Instead, I wedged it into an existing record. And this change was foundational. The write path depends on this mistake. The read path depends on this mistake. The UI on either side depends on this. The tests assert this. Everything depends on this mistake. Fixing this is expensive. It’s closer to a rewrite than a refactoring.

But we’re in the “website economic model” of development. Our sprint process puts downward pressure on how much time this task should take. I shipped a functional version of the project. It’s now your job to argue that we should throw away a working version in favor of a different working version.

This puts you in a difficult spot. The team estimated this task would be completed in under 1 sprint. But now we’re more than halfway to the deadline, and the change is wrong. Fixing it will take it past the end of the sprint. I’m incentivized to push back against your feedback. I may not. But let’s remember: this is a horror story. I’m going to push back. Bringing this up will also invite discussions with product or management stakeholders to negotiate whether there’s a cheaper fix that avoids a rewrite.

Furthermore, it took you forever to review the entire change. You need to do the entire review a second time after my rewrite, and maybe a third time if another round of revisions is necessary. That could add up to hours of reviewing that you’re not dedicating to your own work.

All of this leaves you with two bad options: rubber stamping a bad change (with some perfunctory “I was here” feedback to show that you reviewed it), or reducing your own velocity and your team’s velocity to argue for a gut renovation because of less-tangible long-term improvements.

Ok, let’s end the horror story. What if I had split my task into day-long chunks? My first task would be to write the data layer. So I’d write the database changes and any ORM changes. I’d send them to you for review. You’d look at my changes and say, “Hey, let’s move these fields into a separate table instead of wedging this into the HugeTable. We used to follow that pattern, but we’ve been regretting it lately for $these_reasons.” And it’s totally cool – I don’t push back on this. I take the few hours to make a change, you approve the changes, and I move on.

What was different? I incorporated you earlier into the process. You weren’t sacrificing anybody’s time or velocity. You made me better at my job by giving me feedback early. The codebase turned out better. Nobody’s feelings were hurt. This means that I improved the entire team’s engineering outcome by splitting my changes into small chunks. It was easy to review. It wasn’t difficult for me to fix my mistakes.

Why wouldn’t I split my changes into smaller chunks?

When does shipping every day work? When does it fail?

“Finish something every day” makes a lot of assumptions.

There is an obvious assumption: something worthwhile can be finished in less than a day. This isn’t always true. I’ve heard of legacy enterprise environments where the test suite takes days to run. I’ve also worked in mobile robotics environments where “write, compile, test” cycles took 30 minutes. In those situations, it can be impossible to finish something every day. There is a different optimal cadence that balances the enormous overhead with parallelization.

“Finish something every day” also assumes that the work can be decomposed. Some tasks are inherently large. Designing a large software system is a potentially unbounded problem. Fixing a performance regression can involve lots of experimentation and redesigning, and is unlikely to take only one day. Don’t kill yourself trying to finish these in a day. But it can be interesting to ask yourself, “can I do something quickly each morning and spend the rest of the day working on the design?”

Another assumption is that your teammates review code quickly. Quick reviews are essential. This system is painful when your code reviews languish. Changes start depending on each other. Fixes have to be rebased across all of them. Yes, the tools support it. But managing 5 dependent pull requests is hard. Fixes in the first pull request need to be merged into all the others. If your teammates review them out of order, fixing all of them becomes a nightmare.

If I may be so bold: if you’re getting slow code reviews, you should bring it up with your team. Do you do retrospectives? Bring it up there. Otherwise, find a way to get buy-in from your team’s leaders. You should explain the benefits that they will receive from fast code reviews: “Fast code reviews make it feasible to make smaller changes, because our work doesn’t pile up. Our implementation velocity improves because we’re submitting changes faster. We all know that smaller changes are easier to review. It’ll lead to better engineering outcomes because we’ll provide feedback earlier in the process.” Whatever you think people care about. Ask your team to agree on a short SLA, like half a day.

You can model the behavior you want. You should review others’ code at the SLA that you want. If you want your code reviewed within a couple of hours, review people’s code within a couple of hours. This works well if you can provide good feedback. If you constantly find bugs in their code, and offer improvements that they view as improvements, they’ll welcome your reviews and perspective as critical for the team’s success. If you nitpick their coding style and never find substantial problems, don’t bother. The goal is to add value. When you’re viewed as critical for the team’s success, then it’s easier to argue that “we will all be more successful if we review code quicker.”

I take this to an extreme. When I get a code review, I drop everything and review it. My job as a staff engineer is to move the organization forward. So I do everything possible to unblock others. If this doesn’t work for you, find a sustainable heuristic. “Before I start something, I look to see if anyone has a pending code review that I can help with.” Find a way to provide helpful code reviews quickly.

Finish something every day

Try to finish something every day. You will get better at making small changes, and your definition of “small” will keep getting bigger. You will get better at decomposing tasks. This is the first step towards creating parallelizable projects. Additionally, you will get exposure for continually being “done.”

It helps your reviewers. Smaller tasks are much easier to review than larger tasks. They won’t have to give large reviews. They also won’t have to feel bad about asking for a complete rewrite.

It will also help the codebase. If reviewers can give feedback early, they can help you avoid problems before you’ve written too much code to turn back.

In practice, “finish something every day” really means “find the smallest amount of work that makes sense compared to the per-change overhead.” In many environments, this can be “every day,” but it won’t be universal.

Please consider donating to Black Girls Code. When I was growing up,
I had access to high school classes about programming and a computer
in my bedroom which allowed me to hone these skills. I'd like
everyone to have this kind of access to opportunities.

https://www.blackgirlscode.com/

I donated $500, but please consider donating even if it's $5.

DuckDuckGo is good enough for regular use

Google recently launched a desktop redesign. The favicon and URL breadcrumbs were turned into a header for organic search results. Ads had the same design, but were identified using the string “Ad” instead of the favicon. This design wasn’t new. Google’s mobile web search has served this design since May 2019. But users and regulators complained that the desktop version blurred the distinction between ads and organic results. Google reverted the change a few weeks later, citing the backlash.

I experienced change aversion when I tried the redesign. Change aversion is a simple idea: users react negatively to new experiences, but they stop caring as new experiences become normal. Anyways, looking at the Google redesign gave me change aversion. I knew that I wouldn’t care about it within a few days. But I decided to put it to good use: I would try DuckDuckGo. If it was time for Google to experiment, then it was time for me to experiment. I had wanted to try it for a while. This finally gave me the activation energy to switch.

DuckDuckGo’s premise is simple. They do not collect or share personal information. They log searches, but they promise that these logs are not linked to personally identifiable information. Their search engine results seemingly come from Bing, but they claim to have their own crawler and hundreds of other sources on top of that. They do customize the results a little: geo-searches like bars near my location give me results from my home city of New York. But search results aren’t personalized. I’ve always wondered how good the results would be.

Anyways, here are the guidelines that I set for my experiment:

I would switch all of my browser’s default search engines to DuckDuckGo across all of my devices.
I would use DuckDuckGo for at least a month. This would give me enough time to learn some of its strengths and weaknesses.
I would not use any DuckDuckGo poweruser features unless I could guess that they existed. I wanted to understand the out-of-the-box experience on the site.
I could use the !g operator to search Google if DuckDuckGo failed. Some will point out that this violates the previous rule. But as soon as a discussion changes to DuckDuckGo usage, people can’t WAIT to talk about how often they use !g or g!. Do you need an example? I discussed it in this paragraph and tried to blame it on other people. I’m serious: people can’t talk about DuckDuckGo without talking about !g. It’s the law. So I know about it and I will use it.

I haven’t tried a new search engine since I tried Bing in 2009. It was time to find out how good DuckDuckGo is in 2020. What was the biggest difference that I found?

Google is the king of low-intent searches

Google has a structured understanding of many domains. This is a difficult moat for other search engines to cross. This is evident when comparing low-intent searches. These are searches with an ambiguous purpose. The subject is broad and it’s not clear what the user wanted. The user might not even “want” anything except to kill five minutes before a meeting.

Let’s try a low-intent search. Type harry potter into Google. In response, Google throws everything at the wall to see what sticks. In addition to the organic links, Google serves me:

A panel on the right with a ton of metadata. This includes oddly-specific structured data like “Sport: Quidditch”.
A list of five of the seven books in the series.
Fantasy books from five related searches.
A news panel containing three articles about Harry Potter actors.
The harry potter Google Maps search, centered on the New York area.
A “People also ask” panel with four questions.
A link to three Harry Potter-related YouTube videos.
Three recent tweets from @HarryPotterFilm.
A panel with 7 “Fantasy book series” results.
A panel with 7 “Kids book series” results.
8 other search strings related to harry potter.

This makes sense: what did I want when I searched for harry potter? Google can’t know. So Google returns information from many domains to attempt to satisfy the query. Google returns so much information that something will be close enough. This is a huge competitive advantage. They can serve good results for bad searches by covering as many domains as possible.

This is a departure from how search used to work. When I was in grade school, I was taught how to craft search queries. Someone herded us into a library and explained how to pick effective keywords, quote text, use operators like AND or OR, etc. These days are dead. None of this matters on Google. If you want to know showtimes for “Harry Potter and the Cursed Child,” a search for harry potter will get you close enough.

In comparison, DuckDuckGo’s results for harry potter are relaxing. It serves a small knowledge panel to the right and three recent news articles at the top, some organic links, and nothing else. It’s much easier to scan this page. It’s a more relaxed vibe. But if I actually wanted something, it likely wouldn’t be on this page. You can make the argument that I got what I deserved: I didn’t clearly communicate what I wanted, and therefore I didn’t get it. But Google has trained everyone that broad queries are effective. It feels like magic. It’s not. It’s the result of years of developing a structured understanding of the world and crafting ways to surface the structure. And it’s something that potential competitors will need to come to terms with.

I don’t personally miss most of Google’s result panels. Especially the panels that highlight information snippets. It’s easy to find these. Searching microsoft word justify text provides me a snippet from Microsoft Office’s support page explaining what to click or type to justify text. I’ve learned not to trust information in these panels without reading the source they came from. Google seems to cite this information uncritically. I’ve found enough oversimplified knowledge panel answers that I’ve stopped reading most of them. Recently, I was chatting with a Googler who works on these. I asked them if I was wrong to feel this way. And they replied, “I trust them, but I’ve read enough bug reports and user feedback that I don’t blame you.” So my position is wrong, but not very wrong. I’ll take that.

Some of Google’s panels are great. I miss them. I haven’t found anything better than Google’s stock panel for quickly looking at after-hours stock movements. Searching Google for goog stock will show you this panel. I miss you buddy. I hope you’re doing well.

Ultimately, it stresses me out when Google returns many panels in a search. I’m sure that each is a marginal gain for Google. But I don’t like how Google feels as a result. I’m continually glad to see just 10 links on DuckDuckGo, even if this means that I’m not getting what I wanted. This has been training me to craft more specific searches.

DuckDuckGo is good enough

Let’s move away from Google’s competitive advantages. How does DuckDuckGo perform for most of my search traffic? DuckDuckGo does a good job. I haven’t found a reason to switch back to Google.

I combed through my browser’s history of DuckDuckGo searches. I compared it to my Google search history. When I fell back to Google, I often didn’t find what I wanted on Google either.

Most of my searches relate to my job, which means that most of my searches are technical queries. DuckDuckGo serves good results for my searches. I’ll admit that I’m a paranoid searcher: I reformat error strings, remove identifiers that are unique to my code, and remove quotes before searching. I’m not sure how well DuckDuckGo would handle copy/pasted error strings with lots of quotes and unique identifiers. This means that I don’t know if DuckDuckGo handles all technical searches well. But it does a good job for me.
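For the curious, here is a rough sketch of that cleanup in Python. I do this by hand, so the function name and the idea of passing in a list of private identifiers are assumptions for illustration, not a tool that I actually use.

```python
import re

def normalize_error(message, private_identifiers=()):
    """Drop code-specific identifiers and quote characters from an error string."""
    cleaned = message
    for identifier in private_identifiers:
        # Plain substring replacement; this would also hit longer words
        # that happen to contain the identifier.
        cleaned = cleaned.replace(identifier, " ")
    # Remove quote characters so the engine does not treat text as an exact phrase.
    cleaned = re.sub(r"[\"'`]", " ", cleaned)
    # Collapse the leftover runs of whitespace.
    return " ".join(cleaned.split())

query = normalize_error(
    "AttributeError: 'UserRecord' object has no attribute 'nickname'",
    private_identifiers=["UserRecord", "nickname"],
)
print(query)
# AttributeError: object has no attribute
```

What remains is the generic part of the message, which is the part that other people are likely to have posted about.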

There are many domains where Google outperforms DuckDuckGo. Product search and local search are some examples. I recently made a window plug. It was much easier to find which big-box hardware stores had the materials I needed with Google. I also recently bought a pair of ANC headphones. I got much better comparison information starting at Google. Google also shines with sparse results like rare programming error messages. If you’re a programmer, you know what I’m talking about: imagine a Google search page with three results. One is a page in Chinese that has the English error string, one is a forum post that gives you the first hint that you need to solve the problem, and one is the error string in the original source code on GitHub. DuckDuckGo often returns nothing for these kinds of searches.

Even though Google is better for some specific domains, I am confident that DuckDuckGo can find what I need. When it doesn’t, Google often doesn’t help either.

Sample of times when both Google and DuckDuckGo failed me

I tried to write a protobuf compiler plugin using the official PHP protocol buffer bindings. I now believe that writing a protobuf compiler plugin in PHP is impossible due to several arbitrary facts, but I needed to piece this information together myself. My searches sprawled over Google and DuckDuckGo across several days before I concluded that it could not be done and that I could not find a workaround. This isn’t DuckDuckGo or Google’s fault. Some things just don’t have answers online.
I often fell back to Google for gif searches. It turns out that I’m bad at finding gifs. Sometimes I get exactly what I want, like searching for gritty turning around. But I had a lot of trouble finding a string that gave me this. Eventually I found it by remembering a Twitter user that had posted it and scanning their “Media” posts.
I tried to find a very specific CS:GO clip that I had seen on Reddit years ago. I found it via a combination of Reddit search and skimming the bottom of Reddit threads for video links.
What is australian licorice? Is it a marketing gimmick? Stores sell it. It’s tasty. But I can’t find an explanation anywhere.

If you’re thinking of switching to DuckDuckGo because of the Google redesign, I’ll save you the trouble: DuckDuckGo’s inline ads are formatted similarly to the Google redesign that got reverted. If anything, DuckDuckGo’s ads are harder to spot because DuckDuckGo’s (Ad) icon is on the right, while Google’s was on the left where my eyes naturally skim.

It turns out that I care about privacy, but I still use Google Analytics on my blog. I haven’t been thinking about digital privacy for long enough to have a consistent and principled opinion. Sorry about that.

Let’s go back to the original selling point of DuckDuckGo: they don’t track you.

I have been reading my DuckDuckGo searches in my browser history for this post. It’s wonderful that all of these searches remained private. Some of them should remain private for stupid reasons. I don’t want anyone to know that I searched for what is the value of a human life because it makes me sound like a killer robot. Other searches are much more sensitive. One is the name of a medication I’m on. Others are searches about pains and fears that I have. DuckDuckGo allows me to perform these searches without building a profile of me. I’m sure that advertisers pick up the scent as soon as I click a link. But I appreciate the delay. I didn’t think about the traces I left online when I searched on Google. But now that I know I have the choice, I’m actively comforted by reviewing my DuckDuckGo search history and reading everything that they didn’t track.

I also noticed that many searches show trends. I knew that this was true in theory. But it’s different when you see it in your own search results. A month ago, many of my searches related to vacation planning. But now they don’t anymore. The coronavirus scrapped my plans. But there are many life events that could have also caused this: health reasons, family problems, etc. These are things that ad networks could piece together as I visit sites. It’s possible to imagine even darker versions of this – imagine the months of searches that relate to a pregnancy with a miscarriage. Many companies could profit from a couple going through that process, if they showed the right ads in the right places at the right time. There is a lot of trend information that you just want to keep to yourself.

What happens moving forward?

I will continue using DuckDuckGo. I don’t see a reason to switch back to Google. I’m going to continue to fall back using !g when I need to. I’m going to try to avoid talking about the fallback (but let’s be honest, I just did it again).

I still use lots of Google products. I’m not in the process of porting away from any of them. I still use Chrome in addition to Firefox and mobile Safari. Google Docs still holds a place in my heart. Etsy is hosted on GCP and uses Google Apps. Google Photos is still the best place for me to store and share my photos.

I liked the exercise of reading a month of my search history. You should do it, too. It became clear that I broadcast lots of information by having these very personal conversations with search engines. I’d like to understand more about the digital traces I leave online.

I don’t want to turn into a digital hermit. But I would like to become more deliberate about the traces that I leave around the internet. Even as a developer, I’m not sure what will happen if I disable third-party cookies across the internet. But I’d like to start reading more about digital privacy to understand what tradeoffs I am making.

Disclaimer: I worked at Google from 2010-2015, but did not work on search.

www.bitlog.com

Jake Voytko's personal log of bits
