A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.



Internet voting, security, and transparency at E-VOTE-ID

7 minute read


Last week, I attended my first voting conference: E-VOTE-ID. I’ve presented at statistics conferences before but never an interdisciplinary one like E-VOTE-ID. It brought together people working on electronic voting issues from a whole range of disciplines: legal studies, sociology, cryptography and security, voting systems developers, former election officials, and one statistician. This guy!

Open Source Licenses Explained

3 minute read


I ALWAYS forget to put a license on my work until someone reminds me. I’ve learned over and over that it’s important, but I think the reason why it hasn’t stuck is that I was never taught why it’s important.

Why There Isn’t More Evidence That Pesticides Disrupt Your Hormones

9 minute read


I’ve really gotten on the crunchy bandwagon this year – buying high quality grassfed meats, organic produce, paraben-free beauty products, and swapping out plastic food storage containers for glass ones. Up until recently, I was skeptical about the evidence that these choices really make a difference for your health.

Questions to Consider about Your Product from a VC Partner

4 minute read


I participated in my first hackathon two weekends ago. Most of the time I use code to do data analysis, not to write apps or websites. For me, it was more of a fun learning experience and I got to see what kinds of work are expected and rewarded.

Why do people lose interest in academic careers during grad school?

6 minute read


Fewer grad students are on the job market for faculty positions – is it because they realize that there are fewer jobs or because they are genuinely more interested in other career paths? Roach and Sauermann studied interest in academic careers in a way that has never been done before: longitudinal surveys of current graduate students. By giving people the survey twice, once in their first or second year of the PhD and then again three years later, they are able to measure changes in interest. Previously, people have only looked at cross-sectional data and compared two groups at different points in their PhD.

How Well Did I Follow Pedagogy Guidelines at R Bootcamp 2017?

7 minute read


This week I had the privilege of participating in two workshops: I was a participant at a train-the-trainer workshop to become a Software Carpentry instructor and an instructor at the R Bootcamp put on by the Statistics Department and D-Lab. It was a unique opportunity to spend two days learning how to teach one of these bootcamps, and then to put my skills to the test a few days later.

Embedding Python plotly figures in markup

2 minute read


A lightweight markup language is a simple, human-readable language for formatting text. It’s easy to read and compatible with most text editors. Documents written in lightweight markup are usually then converted to formats that are harder for people, but easier for computers, to read, like HTML. The most common ones that I’ve heard of people using are Markdown, R Markdown, and reStructuredText. I imagine that most people who do data analysis/exploratory visualization/data science use a markup language more often than they write in raw HTML.

Which logistic regression method in Python should I use?

6 minute read


This question is related to my last blog post about what people consider when choosing which Python package to use. Say I want to use some statistical method. I have a few options. I could code it up from scratch myself, knowing that this might have undetected bugs and be pretty slow. I could Google what I’m looking for and use the first thing I find; similarly, there are no guarantees. Or, I could do my research, find all the packages that seem to offer what I’m looking for, and decide which looks best based on how thoroughly they’ve documented and tested their code.

What’s important when vetting open source packages?

6 minute read


I’m in the early stages of creating several Python packages right now (shameless self plug – see permute, cryptorandom, and pscore_match). I want people to actually use them when they’re ready. They have potential for wide use, but they have narrow functionality compared to big packages like numpy or scipy. I could imagine that somebody looking to do a particular task in Python, like propensity score matching, would do a Google search and stumble upon my package.



Simple Random Sampling: Not So Simple


We propose several best practices for researchers using PRNGs for simulations, including the wide adoption of hash function based PRNGs.
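
To make "hash function based PRNG" concrete, here is a minimal sketch that runs SHA-256 in counter mode using only the Python standard library. The class name `HashPRNG` and its methods are hypothetical names chosen for illustration; this is not the implementation from the paper or from any particular package, and it has not been vetted for production use.

```python
import hashlib


class HashPRNG:
    """Illustrative counter-mode generator built on SHA-256 (hypothetical design)."""

    def __init__(self, seed: int):
        self.seed = str(seed).encode("utf-8")
        self.counter = 0

    def next_uint256(self) -> int:
        # Hash the seed together with a counter; each call advances the counter,
        # so the state never cycles back onto itself the way a small LCG can.
        message = self.seed + b"," + str(self.counter).encode("utf-8")
        self.counter += 1
        return int.from_bytes(hashlib.sha256(message).digest(), "big")

    def random(self) -> float:
        # Keep the top 53 bits so the result is an exactly representable
        # float in the half-open interval [0, 1).
        return (self.next_uint256() >> (256 - 53)) / float(1 << 53)

    def randbelow(self, n: int) -> int:
        # Draw an integer in [0, n) by rejection sampling, which avoids the
        # bias introduced by taking a remainder. Assumes 0 < n <= 2**256.
        if n <= 0:
            raise ValueError("n must be positive")
        k = (n - 1).bit_length()
        if k == 0:
            return 0
        while True:
            candidate = self.next_uint256() >> (256 - k)
            if candidate < n:
                return candidate
```

Two generators built from the same seed produce the same stream, so `HashPRNG(12345)` can be recreated later to reproduce a simulation, and `randbelow(n)` can serve as the index draw when sampling from a list of `n` items.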


Talk 1 on Relevant Topic in Your Field

Published in UC San Francisco, Department of Testing, 2012

This is a description of your talk, which is a markdown file that can be all markdown-ified like any other post. Yay markdown!