Wednesday, February 27, 2013

A true story about Big Science

Once, I decided to consult the literature for details about how to perform a certain selection test using PAML. I turned to my officemate Matt, and asked if he knew of any papers using it. He suggested three relevant papers, which indeed described details of that test, at least in their supplements. I was an author on two of those papers!

Sunday, February 24, 2013

My thoughts on the immortality of television sets

There's a new GB&E manuscript sensationally blasting a certain widely-reported claim of the 2012 ENCODE Consortium paper, namely that the data generated in that project "enabled us to assign biochemical functions for 80% of the genome." I'm one of 400+ authors on that paper, but I was a bit player - not at all involved in the consortium machinations that resulted in that particular wording, which has proven quite controversial, and has already been discussed/clarified by other authors big and small.

The first author of the new criticism, Dan Graur, is an authority on molecular evolution and authored a popular textbook on that topic (one I own!). The manuscript stridently argues that ENCODE erred in using a definition of "functional element" in the human genome based on certain reproducible biochemical activities, rather than a definition based on natural selection and evolutionary conservation. Interestingly, while the consortium was mostly focused on high-throughput experimental assays to identify the biochemical activities, my modest contributions to ENCODE were entirely based on examining evolutionary evidence, through sequence-level comparative genomics. So, a few comments by a former rogue evolutionary ENCODE-insider:

Tuesday, February 19, 2013

assert-type: concise runtime type assertions for Node.js

I recently published my first npm package: assert-type, a library to help with writing concise runtime type assertions in Node.js programs.

Background: An OCaml hacker's year with Node.js

The new DNAnexus platform uses Node.js for several back-end components, so I've had to write a fair amount of JavaScript in the year since I joined. Considering I wrote the majority of my grad school code in OCaml, a language found at the opposite end of Steve Yegge's liberal/conservative axis, this has been quite a large adjustment. Indeed, I frequently find myself encountering certain kinds of silly runtime bugs, and writing especially tedious kinds of unit tests, that are both largely obviated in a language like OCaml.

So, I still count myself a hardcore conservative. But there's certainly a lot I've enjoyed about Node.js. When requirements evolve, as they always do, JavaScript and Node's "module system" (those are air quotes) will usually offer quick hacks instead of the careful refactoring that might be demanded by a type-safe language. This incurs technical debt, but a lot of times that's a fine tradeoff, especially at a startup. More generally, Node's rapid code/test/deploy cycle is a lot of fun, without all the build process and binary dependency headaches. The vibrancy of the developer community is amazing, as is the speed at which the runtime itself is improving. (There was a period a few years ago when I feared OCaml was dying out entirely, but there's some real momentum building now.)

Sunday, February 10, 2013

Testing OCaml projects on Travis CI

Update (Oct 2013): Anil Madhavapeddy has fleshed this out further.

This evening I spent some time getting unit tests for my OCaml projects to run on Travis CI, a free service for continuous integration on public GitHub projects. Although Travis has no built-in OCaml environment, it's straightforward to hijack its C environment to install OCaml and OPAM, then build an OCaml project and run its tests.

1. Perform the initial setup to get Travis CI watching your GitHub repo (up to and including step two of that guide).

2. Add a .travis.yml file to the root of your repo, with these contents:

language: c
script: bash -ex travis-ci.sh

3. Fill in travis-ci.sh, also in the repo root, with something like this:

# OPAM version to install
export OPAM_VERSION=0.9.1
# OPAM packages needed to build tests
export OPAM_PACKAGES='ocamlfind ounit'

# install ocaml from apt
sudo apt-get update -qq
sudo apt-get install -qq ocaml

# install opam
curl -L https://github.com/OCamlPro/opam/archive/${OPAM_VERSION}.tar.gz | tar xz -C /tmp
pushd /tmp/opam-${OPAM_VERSION}
./configure
make
sudo make install
opam init
eval `opam config -env`
popd

# install packages from opam
opam install -q -y ${OPAM_PACKAGES}

# compile & run tests (here assuming OASIS DevFiles)
./configure --enable-tests
make test

4. Add and commit these two new files, and push to GitHub. Travis CI will then execute the tests.

Working examples: ForkWork, yajl-ocaml

Installing OCaml and OPAM adds less than two minutes of overhead, leaving plenty of room for your tests within the stated 15-20 minute time limit for open-source builds. I'm sure the above steps could be used as the basis for an eventual OCaml+OPAM environment built into Travis CI.
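If you want to check the install overhead on your own builds, one option is to wrap individual steps in a small timing helper. This is only a sketch: `timed` is a hypothetical function name of my own, not something provided by Travis CI or OPAM, and it reports whole seconds via `date +%s`.

```shell
# Hypothetical helper (not part of Travis CI or OPAM): run a command and
# report how many wall-clock seconds it took, preserving its exit status.
timed() {
  local label
  local start
  local status
  label=$1
  shift
  start=$(date +%s)
  status=0
  # "|| status=$?" records a failure instead of letting "bash -e" abort
  # before the timing line is printed.
  "$@" || status=$?
  echo "[${label}] took $(( $(date +%s) - start ))s" >&2
  # A non-zero return here still stops a "bash -e" script at the call site.
  return "$status"
}

# Harmless demonstration; prints something like "[demo] took 1s" to stderr.
timed demo sleep 1
```

In travis-ci.sh, the slow steps could then be wrapped individually, for example `timed apt-ocaml sudo apt-get install -qq ocaml` or `timed opam-packages opam install -q -y ${OPAM_PACKAGES}`, and the timings will show up in the Travis build log.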

Sunday, February 3, 2013

Apartment hunting in Mountain View

Welcome to my fourth try at blogging; the last fell stagnant in 2006. In fact, I meant to start this one a year ago when I first moved here to Silicon Valley, but, better late than never...

I recently sent a friend some advice on apartment hunting in Mountain View, shortly after completing my second housing search here. First things first: it's not cheap. As of this writing, a decent 1br in this area will run $1500-$2000/mo, and a 2br will go for $2000-$2500. It's a bit more expensive in nearby Palo Alto, and a helluva lot worse in San Francisco!



A lot of high-density apartment housing was developed in Mountain View in the 60's and 70's, presumably during the initial rise of Silicon Valley. As a result, most of the stock is of that vintage. Units get renovated from time to time of course, but this only helps so much. Some warning signs to look for in an apparently nice unit: ungrounded (two-prong) electrical outlets, gravity wall heaters that make a lot of noise warming up and cooling down, lack of a kitchen exhaust fan, several layered coats of paint (usually detectable around doorjambs and windowsills), superficial bathroom renovations involving acrylic slapped on the existing tile and tub, and adjacency to noise sources such as Caltrain/Central Expwy and freeways (101/85).