
Let's Talk: You Don't Need That Data v1

"Let's Talk" is a series where I work in the open on a concept I have for a conference-type talk. This current talk is focused on convincing people we should collect less data

I mentioned a little while ago on Mastodon that I've been wanting to put together a conference talk, but

  1. I've never done one before and have no idea where to start
  2. I hadn't had a topic that I felt passionate about

As you can see in the post I linked, I at least have solved issue 2, and I've decided the best way to solve issue 1 is to iterate in public. What could go wrong there?

Right now, I am thinking the talk will be called You Don't Need That Data.

There are a few different angles I want to cover, and they all feel interrelated.

Content of the talk

  • If you are lucky, your work environment has moved from "data driven" to "data informed", but either way you still need data before you can act, even if the action needed is common sense
    • I have an anecdote about a former boss who now works for a major healthcare company in the US. They had months of meetings about where to put the "chat with us" button on their website. These meetings involved multiple directors and senior directors, all of whom were too afraid to make a decision without the data first
      • They could have run an A/B test at launch and then finalized something
      • They could have looked at how things are implemented across the internet and leveraged Jakob's Law to place it somewhere users would have expected it to be
      • They could have used COMMON FUCKING SENSE AND JUST PUT IT IN THE LOWER RIGHT HAND CORNER
      • They could have been brave and asked if their users even wanted chat, and if where they were putting it met their users' expectations for that interaction model
    • Any of those choices would have moved them forward, but instead they were more comfortable wasting time and money waiting for the data to tell them what to do so they could make the "safe" decision
  • We have a tendency to overcollect data on our users, thinking that we need to know every little thing that is happening and every little piece of demographics, and that we will somehow use those pieces of information to improve either user experience or business operations
    • Gonna go out on a limb here and say that 99.9999999999999999999999% of the time that is bullshit
    • Adding more noise to a dataset makes it harder to extract anything meaningful
    • In the EU there are strict laws around what you can and cannot collect, so you are better off collecting less unless you really need it
    • In the US it is now becoming a risk to businesses and users to have certain types of data stored
      • That information will (not could, not might... let's be honest with ourselves, it will) be used to target at-risk and minority groups that have made the mistake of self-identifying with your company
    • Anecdote: I run some pretty strict ad blocking on my devices. The number of links from emails that break the second I click them because they are run through click trackers is mind-boggling. If I am landing on a specific page with UTM params in a hot state, shouldn't that give you the data you need? Why are you leaking info to a 3rd party?
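
To make that last point concrete, here is a minimal sketch of reading the UTM parameters straight off the landing URL on your own server, with no third-party click tracker in the middle. It assumes Python and the commonly used `utm_*` parameter names; the `extract_utm` helper and the example URL are made up for illustration, not taken from any real setup.

```python
from urllib.parse import urlsplit, parse_qs

# Commonly used UTM parameter names; anything else in the query string is ignored.
UTM_KEYS = ("utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content")


def extract_utm(landing_url: str) -> dict[str, str]:
    """Return only the UTM parameters present on a landing URL (hypothetical helper)."""
    query = parse_qs(urlsplit(landing_url).query)
    # parse_qs maps each key to a list of values; keep the first value per key.
    return {key: query[key][0] for key in UTM_KEYS if key in query}


# Made-up URL, for illustration only.
url = "https://example.com/sale?utm_source=newsletter&utm_medium=email&utm_campaign=spring&ref=abc"
print(extract_utm(url))
```

The campaign attribution is already sitting in the request you receive, so logging it first-party is enough to answer "which email sent this person here?" without routing the click through anyone else.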

How it Ends

The takeaways I want people to have are
  1. Trust your gut - don't wait for data to tell you what to do. You probably know which option is going to win out. Get stuff out in prod and test
    • And if the issue is that your dev to prod cycle takes ~9+ months for simple updates and changes, then fix the fucking operating model.
    • If you can't fix the operating model - RUN. It is clearly broken by design, and those keeping it broken don't want it fixed because they have some incentive to retain top-down control over every micro-decision. It also shows there is zero trust for the doers in the company
  2. Collect less - odds are you aren't using the majority of the data you have now and most of it is just noise.
    • Start with an audit of your data warehouse/lake/iceberg/whatever we are calling it this month. See what is actually used for reports and what is collected for the sake of collecting it. See what is putting you at risk if there is a breach. See how much you'd save in storage and what the performance improvements would be if you got rid of all the excess
    • Every time something new gets added and people want to track every last bit, ask what it is going to be used for and how the success of tracking that information is being measured. Count how many times the answer is "we always track this"
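
The audit step could start as small as the sketch below. It assumes Python, and that you can get two things out of your warehouse: the set of fields you collect and, per report, the set of fields it reads. Every field name, report name, and the `find_unused_fields` helper are hypothetical placeholders.

```python
def find_unused_fields(collected: set[str], report_fields: dict[str, set[str]]) -> set[str]:
    """Return the collected fields that no report reads (hypothetical helper)."""
    used = set().union(*report_fields.values())
    return collected - used


# Hypothetical inventory of what is collected.
collected = {"email", "page_url", "birth_date", "gender", "device_id"}

# Hypothetical mapping of each report to the fields it actually reads.
report_fields = {
    "weekly_traffic": {"page_url"},
    "newsletter_signups": {"email", "page_url"},
}

# Fields that would hurt in a breach (again, an assumption for this example).
sensitive = {"email", "birth_date", "gender"}

unused = find_unused_fields(collected, report_fields)
risky_unused = unused & sensitive

print("Collected but never used:", sorted(unused))
print("Unused AND sensitive:", sorted(risky_unused))
```

The second list is the one to start with: data nobody uses that also puts you at risk in a breach is the easiest thing to argue for deleting.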

The whole inspiration for this came from a conversation with my wife. We were joking about the inevitable dystopian future where you go to the bathroom at work and need to badge in at the toilet paper roll for a metered amount of TP sheets, and how after either using too much TP or taking too much time you'd be reported to your manager for not being efficient enough with your bowel movement. As we were riffing on how this situation could get worse and worse, she mentioned lowering the height of the stall walls and doors so you could have eye contact with people as they enter, and then said that was a great line to use to shoot down the "we need to collect data on <insert bad idea>" argument. She said:

You wouldn't need to collect data to know lowering the height of bathroom stalls is a bad idea, no matter how much faster it made people use the bathroom. Why are we wasting time collecting data to make this decision?

That's the "spaghetti against the wall" draft of this talk that has been bouncing around in my head. I don't know if it is interesting, if it makes sense, or if anyone would want to hear any of this.

I'm going to keep on refining it and hopefully people will want to give me feedback. Feel free to hit me up on Mastodon with comments since I haven't figured out how to turn comments on here on Ghost yet, and I am not going to put my email out there to help limit spam.

Thanks!!!