Saturday, May 4, 2013

Lamenting the rise of the bio-brogrammers

A couple years ago the "brogramming" meme spread through Silicon Valley and the broader tech world. One manifestation would feature a hipster-filtered photo of the eponymous brogrammer in Wayfarer sunglasses, popped collar, drink or dumbbell in hand, tapping on his laptop in some incongruous setting. A former colleague of mine featured herself in some of the funniest ones I can remember. Less-tasteful versions included scantily clad women in supporting roles.

The meme started innocently enough, of course: as an expression that the ranks of software engineers aren't just populated by pale, odorous introverts with their glasses held together by tape. But then we thought about the meme and the attitudes that underlie it.

Today, it's de rigueur for tech companies to openly denounce brogramming and its sexist, exclusionary undertones. It's not only the right thing to do but also a matter of survival: it's simply too hard to find top talent, and we cannot risk alienating wide swaths of the pool. The fall of brogramming was, of course, just one step in an ongoing journey, which continues to hit bumps in the road.

~~~

Over the last few days, we saw echoes of this attitude in the genomics and bioinformatics community, when a Twitter meme erupted featuring maternal insults phrased in terms of genome size. Most of these were hilarious and innocent. Some were clever but also sexist and degrading - conflating C-value with bra cups, giggling at those TATAs, or interpreting the .bed and .bam file formats as verbs, to call out some unfortunate examples. A few were just horrible.


I'm sure a vast majority of the participants would disavow the more offensive examples, and that they generally bear no ill will toward women in science. Similarly, the problem with 'brogramming' wasn't any intended overt insult to women, nor any direct claim of their inferiority; rather, the problem was the subconscious attitude, which expresses itself in subtle ways, insinuating to others and especially to women: you are not one of us.



In the case of "yo mamma's genome so fat...", the truly offensive examples are a small minority, but they were enabled and encouraged when the meme was spread by PIs, much-followed thought leaders, and at least one trade rag. You set, and they - the bio-brogrammers - spiked.

I submit that the meme could have started, 'Your genome is so big...'. Consider the following tautology: either this is at least as funny, or else it is less funny. If the former, then why include an unnecessary Ebonics reference to a woman? If it's not as funny, then why not?

~~~


To conclude on a lighter note, I can recall, around the age of 14 or so, downloading a big text file of 'yo momma' insults, one per line. I wrote a Visual Basic program to select and display a random insult from the text file. However, the insults had several different categories: some referred to weight, of course, and others intelligence, odor, and so on; and another feature of my program was to select the desired category. (Alas, genome size was not among them.)

My first try was to select a random insult from the whole list; if it matched the selected category, accept and display it; otherwise, reject and sample again. I like to think of this as my first Monte Carlo sampling algorithm. However, I noticed that it would sometimes take a long time to find an insult from the requested category, especially for categories with relatively few insults. I needed a better algorithm, so I started looking into data structures. A map from each category to its array of insults solves that problem nicely. But what if the data are organized in this fashion and the user selects an insult from 'any category'? I could first choose a random category, then a random insult from the associated array. However, this would over-represent the insults in the smaller categories, since each one has fewer siblings to compete with. You have to sample the category with probability proportional to the length of the associated array. Bingo - instant, uniformly random 'yo momma' insults, at your fingertips.
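For the curious, here is a minimal Python sketch of both approaches. It is not the original Visual Basic program, which is long gone; the function names, the flat list of (category, insult) pairs, and the placeholder strings are all assumptions made up for illustration.

```python
import random
from collections import defaultdict

# Hypothetical stand-in data: (category, insult) pairs with placeholder text.
INSULTS = [
    ("weight", "weight insult 1"),
    ("weight", "weight insult 2"),
    ("weight", "weight insult 3"),
    ("intelligence", "intelligence insult 1"),
    ("odor", "odor insult 1"),
]


def rejection_sample(insults, category, rng=random):
    """First attempt: draw from the whole list until the category matches.

    Slow for rare categories, and loops forever if the category is absent.
    """
    while True:
        cat, text = rng.choice(insults)
        if cat == category:
            return text


def build_index(insults):
    """Group insults into a {category: [insult, ...]} map."""
    by_category = defaultdict(list)
    for cat, text in insults:
        by_category[cat].append(text)
    return dict(by_category)


def sample_from_category(index, category, rng=random):
    """With the map in hand, one draw suffices for a chosen category."""
    return rng.choice(index[category])


def sample_any(index, rng=random):
    """Uniform draw over all insults, given only the per-category arrays.

    The category is picked with probability proportional to its size;
    picking categories uniformly would favor insults in small categories.
    """
    categories = list(index)
    weights = [len(index[c]) for c in categories]
    category = rng.choices(categories, weights=weights, k=1)[0]
    return rng.choice(index[category])
```

Calling `sample_any(build_index(INSULTS))` many times should return each of the five placeholder insults about equally often, even though 'weight' holds three of them and the other categories one apiece.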

Fifteen years later, with two CS degrees well under my belt and the third impending, while standing at a whiteboard at the Googleplex, I was asked how to sample uniformly from an unevenly sharded dataset. I nailed that interview - but I didn't tell this story, because it was about my silly and immature 14-year-old self giggling at 'yo momma' jokes.