Nonresponse bias: The authors sent the survey to 30,000 first- and second-year doctoral students and ended up with 854 people who responded to both the first and second surveys. They lost people at the following points:
30% of people who received the first survey responded
Of those people, only 80% provided a permanent email address for follow-up
Among the people who received the second survey, only 40% responded
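As a back-of-envelope check (my arithmetic, using the rounded rates above): 30% of 30,000 is about 9,000 first-survey respondents, and 80% of those is about 7,200 email addresses. Working backwards from the 854 two-wave respondents, a 40% response rate implies only about 2,100 people actually received the second survey, so I assume many of those addresses never got the follow-up.

```python
# Back-of-envelope attrition funnel using the quoted (rounded) rates.
invited = 30_000
wave1 = invited * 0.30          # ~9,000 responded to the first survey
emails = wave1 * 0.80           # ~7,200 provided a permanent email address
final = 854                     # reported respondents to both surveys

# If 40% of those who *received* the second survey responded, then
# roughly final / 0.40 people must have received it -- far fewer than
# the ~7,200 who provided an email address.
received_wave2 = final / 0.40

print(f"wave-1 respondents:        {wave1:,.0f}")
print(f"provided email:            {emails:,.0f}")
print(f"implied wave-2 recipients: {received_wave2:,.0f}")
print(f"overall retention:         {final / invited:.1%}")  # ~2.8%
```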
They note that, conditional on having done the first survey, people were more likely to respond to the second survey if they were US citizens and if they were second-year students at the time of the first survey. They say that “controlling for these factors, we do not find significant differences with respect to career interests”. What does that mean?
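My best guess (an assumption on my part; the paper doesn’t spell it out) is that they fit a response-propensity model: regress an indicator for answering the second survey on citizenship and cohort year plus first-survey career interest, and check whether the career-interest coefficient is insignificant. A minimal sketch of that check, with made-up column names:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: one row per first-survey respondent who provided
# an email address. All column names here are made up for illustration.
df = pd.read_csv("wave1_respondents.csv")

# responded_wave2 is 1 if the person answered the second survey, else 0.
# If academic_interest still predicts response after controlling for
# citizenship and cohort, the second wave is biased with respect to the
# very outcome the paper cares about.
model = smf.logit(
    "responded_wave2 ~ us_citizen + second_year + academic_interest",
    data=df,
).fit()
print(model.summary())  # the academic_interest coefficient is the check
```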
They entirely disregard nonresponse to the first survey, and I could imagine that biasing the results. For instance, who is on top of responding to emails? Students conscientious enough to answer an unsolicited survey may differ systematically in their career interests from those who ignore it.
Arbitrary categories for the response variable: Any definition of categories is somewhat artificial. The survey instrument has five categories ranging from 1 (very uninterested or dissatisfied) to 5 (very interested or satisfied), with 3 denoting no preference. The authors dichotomized the responses into “interested” (4 or 5) and “not interested” (1 through 3). Would the results have changed if they had dichotomized differently, or if they had dealt with the survey scale directly?
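One way to probe this (a sketch with hypothetical variable names, not the authors’ code) is to re-run the comparison under every possible cut point and see whether the conclusions move:

```python
import pandas as pd
from scipy.stats import chi2_contingency

df = pd.read_csv("survey.csv")  # hypothetical file; 'interest' is the 1-5 item

# Sensitivity check: dichotomize at each possible cut point and see
# whether the association with, say, cohort year changes qualitatively.
for cut in (2, 3, 4, 5):  # "interested" means interest >= cut
    interested = (df["interest"] >= cut).astype(int)
    table = pd.crosstab(interested, df["second_year"])
    chi2, p, _, _ = chi2_contingency(table)
    print(f"cut at >= {cut}: chi2 = {chi2:.2f}, p = {p:.3f}")
```

If the sign or significance of the association flips across cut points, the dichotomization is doing real work; if not, the choice is harmless.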
Nonparametric analyses? They run a bunch of t tests in the section called “Nonparametric Analyses”, but the t test is very much a parametric test. Its assumptions aren’t quite satisfied here, yet the comparison of means is still interesting and illustrative.
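For contrast, here is what a genuinely nonparametric version of one of those comparisons would look like next to the t test (a sketch with made-up column names; the Mann-Whitney U test is the usual rank-based analogue):

```python
import pandas as pd
from scipy.stats import ttest_ind, mannwhitneyu

df = pd.read_csv("survey.csv")  # hypothetical file and column names
lost = df.loc[df["lost_interest"] == 1, "research_enjoyment"]
kept = df.loc[df["lost_interest"] == 0, "research_enjoyment"]

# What they ran: a two-sample t test, which compares means and is
# parametric -- it leans on approximate normality of the group means.
t, p_t = ttest_ind(lost, kept)

# A nonparametric analogue: Mann-Whitney U compares rank distributions
# and makes no normality assumption about the 1-5 responses.
u, p_u = mannwhitneyu(lost, kept)

print(f"t test:       t = {t:.2f}, p = {p_t:.3f}")
print(f"Mann-Whitney: U = {u:.0f}, p = {p_u:.3f}")
```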
Complex regression models: They fit a complicated parametric, multi-category regression to assess the joint effect of various factors on career preference. The structure of the model is odd: it posits that the predictors are linearly related to the 5-point Likert-scale response, and it puts predictors measured as percentages on the same scale as the 5-point Likert items. I have no doubt that these models are measuring correlations in the data, but I don’t trust inferences about the magnitudes of the coefficients.
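An alternative that respects the measurement scales (a sketch under assumed column names, using statsmodels’ ordered-logit model rather than whatever the authors actually fit): treat the 5-point response as ordinal, and standardize the predictors so percent-scaled and Likert-scaled variables are at least on comparable footing.

```python
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

df = pd.read_csv("survey.csv")  # hypothetical file and column names

# Standardize predictors so coefficients are per standard deviation,
# instead of mixing raw percents with raw 1-5 Likert units.
predictors = df[["pct_time_on_research", "advisor_encouragement"]]
X = (predictors - predictors.mean()) / predictors.std()

# An ordered logit treats the 1-5 response as ordinal rather than
# pretending the gaps between categories are equal-interval.
model = OrderedModel(df["academic_interest"], X, distr="logit")
result = model.fit(method="bfgs", disp=False)
print(result.summary())
```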
Take-aways
In my mind, the missing piece in this paper is the breakdown by gender/race/university. I’d like to know how different groups change their attitudes over the course of their PhD.
The results seem straightforward otherwise: those who lose interest in an academic career realize that they don’t enjoy research as much as those who remain interested do.
In my field, that realization seems to stem from the fact that there’s a disconnect between the kind of research that we want to do and the kind of research that is rewarded and recognized.
If you follow me on social media, you might’ve seen that I’ve been traveling a ton this past year, and most of it has been related to my grad school work. In my five years as a PhD student, I’ve visited five states and five countries for conferences and other events. As someone who didn’t travel much as a kid, I’ve been loving these opportunities!
Last week, I attended my first voting conference: E-VOTE-ID. I’ve presented at statistics conferences before but never at an interdisciplinary one like E-VOTE-ID. It brought together people working on electronic voting from a whole range of backgrounds: legal scholars, sociologists, cryptography and security researchers, voting-system developers, former election officials, and one statistician. This guy!