Chris’s main conclusion was that basic job came out looking way better than basic income. Additionally, a major purpose of his post was to encourage other people to play around with the math as well rather than just bloviating. Since I’m a big basic income proponent and have some quibbles with how he came to conclude that basic income doesn’t look too good, I will follow his lead and play around with the math.
I don’t know Chris Stucchio and I don’t know if he was inherently biased for basic income or basic job, but I’m definitely inherently biased for basic income, so take this whole post with a grain of salt. However, to give it some semblance of fairness, I’m going to write this whole thing without doing any math. I’m going to make what I think are reasonable changes to Chris’s assumptions and see what that tells me. Maybe it will say basic income sucks, and then I will be sad, but I will still publish those results. You’ll just have to trust that I’m telling the truth, I suppose.
In Chris’s model, basic income is paid to everyone. It is also possible to have a system like progressive income tax, where it gradually phases out; in fact, fellow Rutgers alumnus Milton Friedman proposed to implement basic income through a negative income tax. So let’s imagine some system like that and reduce the costs by 50% right off the bat.
direct_costs = num_adults * basic_income / 2
Chris correctly noted that there are incentives for more work and less work in basic income. He thinks it’s more likely that the negative incentive will be more prominent. I think it’s more fair to just call it a wash, since it’s very unclear. So I deleted that part of his model. I doubt this has a big impact on anything anyway.
At this point, I want to add an effect that has been neglected. Chris treated the number of disabled adults as a constant, but that is likely not true. So let’s conservatively say 2 million people currently on disability would start working if they got a basic income, likely at some not-so-great wage.
undisabled = 2e6
undisabled_hourly_wage = uniform(0, 10).rvs()
undisabled_cost_benefit = -1 * undisabled * (40*52*undisabled_hourly_wage)
Chris included the “JK Rowling effect”, the odds that someone not forced to work a shitty job could create a great achievement that would have a significant positive economic impact, like JK Rowling writing Harry Potter while on welfare. I think there should be an additional effect for less spectacular events. With a basic income, many people would be free to pursue new career paths and start small businesses (or even bring existing careers and businesses out from under the table, as people on welfare often cannot work without facing penalties). How big is this effect? Fuck if I know. But I want to include something. Fuck, let’s just say that basic income improves average productivity by something between 0 and 20%. The average hourly wage in the US is about $25/hr and I don’t know if the average wage for increased productivity should be higher or lower, so let’s pick it from between $10 and $30.
avg_hourly_wage = uniform(10, 20).rvs()  # scipy's uniform(loc, scale) spans [10, 30)
productivity_multiplier = uniform(0.0, 0.2).rvs()
productivity_cost_benefit = (-1 * labor_force * (40*52*avg_hourly_wage) *
                             productivity_multiplier)
Now let’s move to basic job. Most of Chris’s assumptions seem good enough. I’ll make one change – the value of work from people who currently aren’t working. Chris says it’s worth somewhere between $0/hr and $7.25/hr, as otherwise they’d probably be working a minimum wage or higher job. Sounds reasonable enough, but there are also people who bring negative value to the table. These people would be forced to work, likely in some boring job they hate. So I’m doing this:
basic_job_hourly_productivity = uniform(-7.25, 14.5).rvs()  # spans [-7.25, 7.25)
I could definitely quibble more, but somebody could quibble with my changes too, so I don’t want to go too crazy. The above changes seem reasonable enough to me. So here’s my modified code. Now I’m going to try to run it. This will be interesting not only to see the results, but to see if I could make these changes without introducing a syntax error!
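For readers who want to tinker, here is a minimal, self-contained sketch of the basic-income side of the model with my changes folded in. The population and wage figures are rough placeholders I picked for illustration, not Chris's exact inputs, so treat it as a scaffold rather than the real model:

```python
# A minimal sketch of the modified basic-income cost model.
# Population figures are rough 2013-era US numbers chosen for illustration.
from scipy.stats import uniform

num_adults = 227e6    # approximate US adult population
labor_force = 154e6   # approximate US labor force
basic_income = 7.25 * 40 * 52  # minimum wage, full time, for a year

def basic_income_net_cost():
    # Phased-out (negative-income-tax-style) payments: half the universal cost
    direct_costs = num_adults * basic_income / 2
    # Some people currently on disability start working at a low wage
    undisabled = 2e6
    undisabled_hourly_wage = uniform(0, 10).rvs()  # $0-$10/hr
    undisabled_cost_benefit = -1 * undisabled * (40 * 52 * undisabled_hourly_wage)
    # Broad productivity gains; note scipy's uniform(loc, scale) spans
    # [loc, loc + scale), so uniform(10, 20) gives the intended $10-$30/hr
    avg_hourly_wage = uniform(10, 20).rvs()
    productivity_multiplier = uniform(0.0, 0.2).rvs()  # 0-20% boost
    productivity_cost_benefit = (-1 * labor_force * (40 * 52 * avg_hourly_wage) *
                                 productivity_multiplier)
    return direct_costs + undisabled_cost_benefit + productivity_cost_benefit

samples = [basic_income_net_cost() for _ in range(1000)]
```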
Lower is better on these plots, so it looks like basic income wins! At least, if you agree with my completely unbiased assessment…
Update: Chris posted a follow-up article that I basically entirely agree with.
Otherwise, please move along.
Here’s the plan:
Doing this requires a lot of play-by-play game data that allows me to look specifically at jump shots (no dunks). One source of this data I found was BasketballValue.com. They have data from 2005-2012; however, the first season is in a different format so I am ignoring it. So I have 6 seasons of data, from the 2006-2007 season up through last year. They do not (yet?) have data for this year, so all of the analysis below does not include the 2012-2013 season.
So I did the analysis as I described above (data and source code are on GitHub). First, I looked at the entire league, pooling together all players. The average FG% for a jump shot from 2006-2012 was 36%. The average FG% for a player who did not have any jump shots blocked in a game was also 36%. But what about players who did have a jump shot blocked?
This is no statistical fluke. These numbers are from tens of thousands of shots by thousands of players in thousands of games. The sample size is large.
However, for individual players, the sample size is small. Steph Curry has only taken 289 jump shots in games where he had a jump shot blocked. And from those shots, he actually made a higher percentage (34% vs 29%) after he had a jump shot blocked. This goes against the leaguewide trend, but it’s not horribly rare. Below you can see a table and scatter plot for every player who has taken more than a handful of shots in these situations.
I would not read much into the value for individual players, since the sample sizes are relatively small, but the leaguewide trend is clear. NBA players shoot worse after they have a jump shot blocked.
Why? Maybe they are overcompensating. Maybe they are scared. Maybe they play differently later in the game than earlier (due to the score, for instance), regardless of if a shot is blocked or not. Who knows? Certainly not me. If anyone wants to investigate further, you can just take my code and start hacking.
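If you'd rather not dig through my actual code, the core bookkeeping is simple. Here is a stripped-down sketch (the tuple format here is hypothetical, not the BasketballValue.com schema) of splitting each player's jump shots into before-first-block and after-first-block buckets within each game:

```python
from collections import defaultdict

def fg_before_after(shots):
    """shots: chronological list of (player, game_id, made, was_blocked) tuples.
    Returns two dicts mapping player -> FG% before/after their first blocked
    jump shot in a game. The blocked shot itself counts in the 'before' bucket."""
    before = defaultdict(lambda: [0, 0])  # player -> [makes, attempts]
    after = defaultdict(lambda: [0, 0])
    blocked_yet = set()  # (player, game_id) pairs where a block has occurred
    for player, game_id, made, was_blocked in shots:
        bucket = after if (player, game_id) in blocked_yet else before
        bucket[player][0] += int(made)
        bucket[player][1] += 1
        if was_blocked:
            blocked_yet.add((player, game_id))
    pct = lambda d: {p: makes / att for p, (makes, att) in d.items() if att}
    return pct(before), pct(after)
```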
[Table: for each player, FG% and FGA before a blocked jump shot, after a blocked jump shot, and overall, plus the FG% difference. Full numbers are in the data and code on GitHub.]
One of my main scientific goals is the application of mathematical models to find interesting insights into biological systems. This is a really broad goal, as depending on the area, there may be very different ways to gain insight. Here, I want to discuss one example, an interesting paper by Sriram and coworkers that was published in PLOS Computational Biology last year entitled “Modeling cortisol dynamics in the neuro-endocrine axis distinguishes normal, depression, and post-traumatic stress disorder (PTSD) in humans”.
From the title of this paper alone it is already clear that an interesting application of a model is their primary goal. Their hypothesis (based on a prior hypothesis in the literature) is that differences in cortisol profiles between different types of stress can be explained by the responsiveness of the hypothalamic-pituitary-adrenal (HPA) axis, a key player in the body’s response to stress. They built a model of the HPA axis not dissimilar to a model that I previously studied, albeit with very different goals (if you trace the citation history back, both my paper and Sriram’s paper are based on this paper).
But this isn’t about me. Let’s get back to the topic at hand.
From a purely mathematical perspective, the primary novelty in Sriram’s model is the inclusion of an additional degradation term in every equation. So instead of just having a first order degradation term in each equation, they also added a Michaelis-Menten degradation term meant to model enzymatic degradation.
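Schematically (using generic symbols here, since the paper's notation differs per equation), each state variable $x$ then loses mass through two parallel routes:

```latex
\frac{dx}{dt} = \text{production}(\cdot)
  \;-\; \underbrace{k_{d}\,x}_{\text{first order}}
  \;-\; \underbrace{\frac{V_{\max}\,x}{K_{m} + x}}_{\text{Michaelis--Menten}}
```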
They fit this model to three different datasets: PTSD, depressed, and normal. One concern, which they mention, is that they are heavily data-limited and thus have only 3 cortisol profiles for each case. That of course makes you wonder about how generalizable and predictive this is, since with that little data you can’t cross-validate, but it is certainly enough for an interesting preliminary study. They use the different fits of the model to conclude that the feedback properties of the HPA axis (i.e., model parameters) are different under the different types of chronic stress, as they hypothesized.
In other words, the model also allows them to look at how different types of stress look in the parameter space, rather than just by looking at a somewhat arbitrary high-level marker like cortisol levels which may not reveal the full picture of what’s really going on. The model also allows them to explore bifurcations, transitions between different types of stress, and various interesting things like that.
However, I am a bit concerned by this passage from the Methods section:
Although more parameters could be different between the three groups, according to the hypothesis, only two kinetic parameters, namely k_{stress} and K_{i}, are considered to be significantly different in the three pathological cases. Therefore, the model calibration was performed simultaneously for the three time series, allowing k_{stress} and K_{i} to differ for all the three cases, and forcing the remaining 18 parameters to be the same.
If their hypothesis is that everything is driven by those two parameters, and thus they only allow those two parameters to vary when they’re fitting their three different cases, and then they observe different values for those two parameters in those three cases, that’s not really strong support of their hypothesis, is it? They never discuss if other combinations of parameters could capture the same results when allowed to vary and fit to the same data. Maybe they could have achieved similar results with some other parameters. But we don’t know, because they only tested the ones that they a priori hypothesized to be important.
Another interesting aspect of this paper relates to biological rhythms. Well-known are circadian rhythms, which lead to a clear 24 hour pattern in the output of the HPA axis. Less well-known are ultradian rhythms, a term basically referring to any rhythm faster than 24 hours, which in the context of the HPA axis is apparent in roughly hourly oscillations in HPA axis output. This paper says that their model can reproduce both circadian and ultradian rhythms in a single model, given appropriate parametrization. However, their simulations don’t actually show this, as the parametrizations they reached had only circadian rhythms. Therefore, it is not clear to me if there are actually reasonable parameter values that give rise to reasonable dual rhythms.
The authors note that it is the addition of the Michaelis-Menten degradation terms that allows for the production of both circadian and ultradian rhythms. What seems less clear to me are the precise physiological processes meant to be represented by these terms and whether there is sufficient data/evidence to include those terms (and their numerous parameters) rather than, say, adding an explicit delay. As their sensitivity analysis (Figure 7) found, some parameters related to degradation have extremely low sensitivities, for instance V_{S5}, which governs the enzymatic degradation of cortisol. The parameter governing the linear degradation of cortisol, K_{d3}, has a much higher sensitivity. Looking at the parameter values in Table 1, K_{d3} is a couple of orders of magnitude larger than V_{S5}, so it doesn't seem surprising that when these factors are used as coefficients for linear combinations of terms, the former turns out to be far more sensitive.
All in all, I really like the conceptual idea behind this paper: using models to assess more fundamental underlying parameters that are difficult to measure directly. However, I'm not sure how much these results contribute towards supporting the hypothesis that it is the feedback properties of the HPA axis that produce different outputs in response to different stressors. Even so, I found the paper to be interesting and suggestive of model-based approaches to stratification that may be useful in a variety of different domains.
Some of the content in this post was based on discussions with my friend Pantelis Mavroudis.
Sriram, K., Rodriguez-Fernandez, M., & Doyle, F. (2012). Modeling Cortisol Dynamics in the Neuro-endocrine Axis Distinguishes Normal, Depression, and Post-traumatic Stress Disorder (PTSD) in Humans PLoS Computational Biology, 8 (2) DOI: 10.1371/journal.pcbi.1002379
…or did he?
Let’s think about how the NFL measures yardage. They take the difference between where the ball was before the play and where the ball is after the play, and then they round to the nearest integer. So what happens if you rush for half a yard? It’ll get recorded as either 0 yards or 1 yard. Spread out over an entire season, and this kind of rounding error can have a big impact.
So here’s the idea: let’s calculate the odds that Adrian Peterson actually outrushed Eric Dickerson. To do this, I made some assumptions and then ran a bunch of simulations.
The main assumption was that the length of every rushing attempt could fall anywhere between -0.5 and +0.5 yards of the reported total, with uniform probability. I think that makes sense, because a carry reported as 6 yards could just as easily be 5.7 yards or 6.4 yards or whatever. There are obviously some caveats to that, but I think it’s good enough for some quick estimates.
So, based on that assumption, I took the actual rushing totals and added a random error for each carry to come up with one realization of what true unrounded yardage total could have led to the total in the record books. I repeated this a lot of times, for both Peterson and Dickerson. In other words, I calculated the distributions of real rushing totals that could, through accumulated rounding errors, end up reported as 2097 yards for Peterson and 2105 yards for Dickerson. Here’s what it looks like:
Clearly, these two distributions overlap significantly. If they didn’t overlap, one player’s true rushing total would always come out higher than the other’s. Since they do overlap, it is possible that Peterson actually outrushed Dickerson.
From these simulations, it was straightforward to assign probabilities to these possibilities by testing which player had more simulated years as the overall rushing champ. I found that 85% of the time, Dickerson came out on top. This means that…
There is approximately a 15% chance that Adrian Peterson actually broke Dickerson’s record, but it was not noticed due to errors accumulated by rounding the lengths of rushes to integer values.
Nine yards what? Indeed.
For completeness, here is the MATLAB code I used to run the simulations, generate the plot, and estimate the probabilities.
% Simulations
N = 100000; % Number of random seasons to simulate
yp = zeros(1, N) + 2097; % Peterson's total yards for each random season
yd = zeros(1, N) + 2105; % Dickerson's total yards for each random season
for i=1:N
% For each player in each simulated season, add a random error (between
% -0.5 and 0.5) for each carry
yp(i) = yp(i) + sum(rand(348, 1) - 0.5);
yd(i) = yd(i) + sum(rand(379, 1) - 0.5);
end
% By the central limit theorem, the sum of many uniform errors is approximately normal
x = linspace(2070, 2130, 500);
figure;
h = plot(x, normpdf(x, mean(yp), std(yp)), x, normpdf(x, mean(yd), std(yd)));
legend('Adrian Peterson', 'Eric Dickerson', 'Location', 'Northwest');
xlabel('Total Yards');
ylabel('Probability Density');
xlim([2070, 2130]);
% Line styling
set(h(1), 'Color', [122, 16, 228]/255, 'LineWidth', 3);
set(h(2), 'Color', [0 0 1], 'LineWidth', 3);
sum(yp > yd)/N % Probability that Peterson's total is higher than Dickerson's
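Since each season total is just the recorded total plus a sum of hundreds of independent uniform(-0.5, 0.5) errors, you can also get the answer analytically from the central limit theorem, without simulating at all. Here is a quick cross-check in Python (independent of the MATLAB simulation above):

```python
# Analytic cross-check: each per-carry error is uniform(-0.5, 0.5) with
# variance 1/12, so the gap (Dickerson - Peterson) is approximately normal
# with mean 8 yards and variance (348 + 379)/12.
from math import sqrt
from statistics import NormalDist

carries_peterson, carries_dickerson = 348, 379
mean_gap = 2105 - 2097
sd_gap = sqrt((carries_peterson + carries_dickerson) / 12.0)
p_peterson_higher = NormalDist(mean_gap, sd_gap).cdf(0.0)
print(round(p_peterson_higher, 2))  # about 0.15, matching the simulation
```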
That being said, I was pretty shitty at cooking back then. Here is a list of some of the key things that I’ve learned since then, in no particular order:
There is a universal cheat code for making delicious vegetables: roast them. Put the oven at like 450, chop up the vegetables into relatively small pieces, put some oil, salt, and pepper on them, stick them on a baking sheet, and then roast them until the outsides are crispy (maybe flipping them over once or twice as they cook). If you want to spice it up, add some garlic or put on some cheese or lemon juice after they’re done. I’ve done this for broccoli, Brussels sprouts, string beans, sweet potatoes, carrots, parsnips, tomatoes, asparagus, and probably some others I’ve forgotten. It always works. I cannot overstate how much tastier roasted vegetables are than steamed vegetables. It’s just completely and utterly different. The only downside is that you have to wash the baking sheet afterwards, which can get kind of messy, but that’s a small price to pay.
Speaking of vegetables, sweet potatoes are essential. Cheap, nutritious, great tasting, and they last for a long time without going bad. If you don’t want to go through the trouble of cutting one up and roasting it, just stick it in the microwave for 5-10 minutes. It’s not as good as roasting, but it’s pretty damn good and very easy.
There is huge variability in cheese quality, and it’s largely independent of price. Sure, the cheapest of the cheap is pretty shitty, but beyond that, there’s no telling how good a cheese will be. So I have a shortcut for you: Cabot Cheese. They sell that brand at all the grocery stores here, it’s not particularly expensive (it’s one of the cheapest beyond the real bottom-of-the-barrel stuff), and it tastes fantastic. Far better than the vast majority of more expensive cheeses. This one is my favorite.
You can get groceries delivered most places. I use Peapod, but there are plenty of other competitors. This is an absolutely ridiculous time saver. The time I spend grocery shopping in an entire year is probably about equal to the time a typical person spends every week. There are obvious downsides, like less flexibility, service fees, less choice when picking out fresh ingredients, etc. But how much is your time worth to you?
Buy meat in bulk and freeze it. It’s way cheaper that way, often half the price of a more reasonable sized package. It’s easier to work with smaller packages, but you can just buy freezer bags and separate the huge packs of meat into manageable portions.
Cabbage is really cheap and healthy, and it basically soaks up the flavor of whatever you put it with. So you can do stuff like this.
Whole milk is delicious. I used to think that I didn’t like milk, so I stopped drinking it for a while. But when I tried some whole milk that was left over from a recipe, I realized that it’s awesome. I think the reason I thought I didn’t like milk is because my parents would buy skim or 1%.
Everyone knows that you can buy lunchmeat from the deli counter at a grocery store. It’s great, but it’s usually expensive. What I didn’t realize until more recently was that you can also buy prepackaged lunchmeat for much, much cheaper. It’s lower quality, but if you put it on some good bread with good cheese (see above) and other toppings, it’s perfectly fine.
A very cheap and easy recipe for making incredibly delicious chicken thighs.
The key to making great burgers: there really isn’t one, so don’t worry about it. I take ground beef straight out of the package, coat it in salt, and fry it. Delicious. No need to add weird ingredients, mix things together, form perfectly shaped patties, etc.
A mixture of kielbasa, beans, tomatoes, and pretty much any other vegetables/leftovers/whatever you have is a decent meal, and it can easily be made in mass quantities. Leftovers are convenient.
Here is a calculator that will compare the odds of your single vote swinging the 2012 US presidential election with the odds of you dying on the way to your polling place.
So, how is this estimate made? Well, just think about what rare confluence of events would have to occur for your vote to swing the election.
Nate Silver currently claims that there is an 0.4% chance that my home state of New Jersey will be the “tipping point” state in the 2012 election. What does that mean? The tipping point state is the state that provides the decisive electoral vote. An example: imagine Romney wins the election and manages to carry New Jersey. Did New Jersey really matter?
Probably not.
New Jersey is so liberal that, if Romney won New Jersey, there is roughly a 99.6% chance that he would have already had enough electoral votes from states he won by larger margins, so swapping the results in New Jersey wouldn’t have changed anything. There is only a 0.4% chance that New Jersey would be that bellwether between winning and losing.
But even in a landslide election there is a tipping point state, a state that lies at the center of the electoral vote distribution. So what are the odds that New Jersey will be the tipping point state and the election will be decided by only the tipping point state? Far lower than 0.4%.
We all remember the tipping point state in 2000: Florida, decided by a margin of 537 votes. We don’t all remember New Mexico in 2000, which actually had a smaller margin of 366 votes but didn’t get much publicity because it was not the tipping point state – no matter which way New Mexico voted, the results were the same. But still, New Mexico in 2000 was the closest result ever in a US presidential election.
But even 366 is far greater than 1. If a single extra New Mexican decided to vote that day, then instead Gore would have won New Mexico by 365 or 367. Big deal. What are the odds that a state will be the tipping point state and also have a single vote decide its result?
I won’t bore you with the details (read: I don’t want to bore myself with the details), but this question has been investigated by Gelman et al. They found that, for the 1992 election, there was roughly a 1 in 10 million chance that New Jersey would decide the entire national election by a single vote. After doing some more reading, I noticed a more recent paper by Gelman et al. (with the et al. now including Nate Silver) studying the 2008 election, and it makes things look even worse. A single New Jersey voter in 2008 only had roughly a 1 in 150 million chance of deciding the election. Given that the 2012 election is probably going to be similar to the 2008 election, I’ll use this value.
However, those 1 in 150 million odds don’t necessarily factor in what actually happens in close elections like the 2000 presidential election. “It is true that the outcome of that election came down to a handful of voters; but their names were Kennedy, O’Connor, Rehnquist, Scalia and Thomas. And it was only the votes they cast while wearing their robes that mattered, not the ones they may have cast in their home precincts.” So even if your vote is truly the deciding vote in the election, the inherent error in counting ballots will likely make it such that a somewhat arbitrary legal process will decide the election. And in that case, does it really matter what your vote was?
However, that line of reasoning is somewhat convincingly refuted by yet another paper by Gelman et al. (see: the appendix). Basically, just as a vote can swing an election from 50-50, you can also imagine a vote moving the outcome from inside to outside the range of closeness needed to instigate an arbitrary legal resolution to the election. So I’m going to stick with the 1 in 150 million estimate.
I bike to get around. There is about one bicycle death per 10 million miles. Biking to my polling place adds 4 miles beyond my normal daily commute. That means that the odds of me dying in an accident on the way to vote are roughly 4 in 10 million – 60 times the odds of my vote swinging the election.
Repeat: I am roughly 60 times more likely to die on my way to the polling booth than I am to cast a meaningful vote in the Presidential election. Similar odds probably hold true for you. Try the calculator above to find out.
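The arithmetic behind that comparison is trivial enough to fit in a few lines, using the figures quoted above:

```python
# Odds comparison using the numbers quoted in the post.
p_swing_election = 1 / 150e6     # one NJ vote deciding the election (Gelman et al.)
bike_deaths_per_mile = 1 / 10e6  # rough fatality rate per mile cycled
extra_miles_to_vote = 4
p_die_en_route = bike_deaths_per_mile * extra_miles_to_vote
print(round(p_die_en_route / p_swing_election))  # -> 60
```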
There is only one thing we say to the God of Death: not today. So I’ll skip that 4 mile death ride and let the rest of you guys decide which neocon wins the presidency.
The death rates for various modes of transportation come from a few different sources. Let me know if you find a better source.
The data on the odds of swinging a presidential election come from “What is the probability your vote will make a difference?” by Andrew Gelman, Nate Silver, Aaron Edlin. These estimates are from the 2008 election, but they are the most recent ones available (AFAIK) and they are probably pretty similar to the 2012 values.
In the meantime, I read the abstract, all that was available at the time. I noticed the affiliation of one of the authors: Rutgers, my undergraduate alma mater and current graduate school! Awesome, I’m all for school spirit! Except, the affiliation didn’t actually say “Rutgers”, it said “Butgers”. A humorous typo… or something more sinister?
A simple OCR error could mistake Rutgers for Butgers. However, after waiting for my scanned copy of the original paper, I found that it was not an OCR error, it was a typo in the original publication.
I emailed Nature and told them to fix the typo. No reply, of course. The conspiracy runs deeper.
Conspiracy? Oh, yes. I wondered how common this type of typo is. So I googled Butgers, and it seems to be most commonly used as a derogatory term by our athletic rivals. Maybe they got bored by always beating us in sports, and they decided to focus their efforts on mocking us in our own academic publications?
Let’s test that hypothesis by doing a Google Scholar search for Butgers. There are only 326 results, most of them from the pre-digital era where OCR might serve as a plausible excuse. Look through that list of papers. Notice the topics. The journals. Key phrases will jump out at you:
Really? Really? Am I supposed to believe that, out of all the thousands of papers published by Rutgers University, the vast majority of the ones with “Rutgers” replaced by “Butgers” are about sanitation, sewage, and waste?
I don’t know who’s behind this, but whoever he is, he has a great sense of humor.