The task of writing a summary of the person that is Anshuman, or Ansh, is a daunting one. I met Ansh through the Internet, and even allowing for the limitations that imposes, Ansh is still a bit of an enigma. As he outlines in this dialogue, Ansh practices the philosophy of keeping his image small and is somewhat aligned with the ideas of post-rationalism. I respect this about him, among many other traits he possesses. As will be evident to everyone, Ansh is extremely well-read, intellectually honest, and thinks deeply about how he consumes information as well as how that information should change the way he perceives the world.
Just prior to uploading this, I asked Ansh if there was anywhere I could direct people, hoping I could help him capture some value from the time and effort he put into this conversation with me. His response: “I have nothing to plug or anything like that, so no need to include anything extra when you publish.” From what I can tell, Ansh dedicated many, many hours to this project simply out of good faith and intellectual curiosity. I can’t thank him enough for that. I learned a lot from the time he gave me.
And now, because I have said so little about him, possibly the best way I can give you a conception of Ansh is to let his own words do the talking. The following is the very last paragraph of this piece…
“I realized I’ve opened up several highly controversial cans of worms here, the contents of which I personally feel ill-equipped to explore much further, but future technological progress occupies a nontrivial amount of my headspace these days. The future is such a fascinating place, and I’m very glad that we’re all travelling to it together.”
If you enjoy this piece, please do reach out and say so — especially to Ansh. Feel free to direct any criticism my way, though.
Thank you so much for joining me here. I’m really intrigued to see where this goes, but I have no doubt it is going to be a great discussion. For me, at the very least!
I have been wanting to set up this conversation for a while, as I know we share some common interests, though we haven’t dived too deeply into them yet. However, today is the day.
To generalise, I would say that we are both consumers of, and participants in, the online “rationalist” community, broadly speaking. This community might be typified by blogs such as LessWrong and SlateStarCodex, among others, and tends to be oriented around a number of different disciplines and areas of study, such as probability and decision theory, cognitive and computer science, as well as evolutionary biology and various realms of philosophy. Again, a generalisation, but to some degree this means we probably share similar values and ways of dissecting the evidence of the world.
Would you agree with what I have mostly said so far?
Building on that, I thought it might be cool to start off with a little bit about your background and how you think it has influenced your worldview.
So, let me put this to you: How do you make sense of the world? Why do you think you see the world that way? And what makes you think that the way you see the world is, at least partially, correct?
The word that describes me most fundamentally is curious. I won’t pretend to be curious about every topic under the sun, but the ideas that catch my fancy can quickly spiral into obsessions. It’s a character trait that turned me into a prodigious reader from an early age, consuming literature of widely differing genres and ages, rapidly seeking to absorb as much knowledge as was humanly possible. It’s a drive that has only grown stronger as I’ve aged, even as the task has exploded in complexity and scope: the younger me may have been optimistic about the possibility of satisfying my curiosity, but an additional decade or so has drilled into me the complete futility of such a goal. Still, I muddle on, though now my knowledge acquisition pipeline has diversified and consists of reading blog posts and news and journal articles in addition to books, as well as listening to podcasts, taking online and in-person courses, watching videos, etc. My formal schooling has generally focused on Computer Science and Statistics and I’m due to graduate in a few months with a degree in Statistics and Data Science. I want to be clear in asserting that my schooling has been only one component of my learning, however, and my extracurricular interests and pursuits have been broad though likely more shallow in depth.
I would struggle to define a purpose for such knowledge acquisition. It’s undoubtedly instrumentally valuable (in work, school, and ordinary conversation, even), but I would be lying if I pretended even a small percentage of my curiosity was motivated by the possibility of this knowledge being useful. It’s always a delight when it is, but I’ve come to accept my curiosity as a brute fact, an intrinsic value that I seek to optimize for its own sake.
In the process of trying to absorb as much as I can, I’ve begun to think of myself as a collector of different ways of viewing the world. Isaiah Berlin wrote of hedgehogs and foxes, and I would humbly suggest that I’m a hedgehog whose one big idea is foxiness. Scott Page has written a book on the virtues of mental and computational models, models that allow us to categorize and process the information and data we collect throughout our daily lives, and it’s this kind of foxiness that appeals to me, likely due to my quantitative background and style of thinking. There are some models that I tend to rely on more than others, perhaps due to quirks of my psychology or cultural and educational background, but I’d like to think that I’m always open to at least considering alien ones that offer novel insights.
In this lifelong process of gathering viewpoints and frameworks, some problems naturally arose. How could I be certain in the accuracy of any of these models? How could I know which would be the “right” one for any given situation? Both of these problems, and the quest to discover solutions to them, led me to the rationality community, typified by LessWrong and SlateStarCodex and thinkers like Eliezer Yudkowsky and Scott Alexander. Eliezer’s definition of rationality, though perhaps unoriginal, allowed me to conceptualize my difficulties. I was trying to become more epistemically rational in trying to ensure that my models, though imperfect, could approximate truth, and I was also trying to be instrumentally rational, in ensuring that I could find the right map to reflect the territory of a given situation and better achieve my goals. The rationalist project of trying to find and defeat human biases that stand in the way of becoming more rational appealed to me rather immediately, though over time I’ve become more enamored by the post-rationalist project exemplified by SlateStarCodex, one that is more attuned with the difficulty of the rationalist quest and attempts to synthesize its aims with a sense of meta-rationality, or an understanding of when “irrational” behavior might arise and might be valuable.
There are many reasons why these movements and their associated figures appealed to me: a quantitative background, a preference for consequentialist ethical frameworks, an introverted social disposition, etc. There are failure modes inherent to all of these philosophies, many of which I am keenly aware of, but I’ve resigned myself to the task of doing the best that I can. To answer your final question, I’m not sure at all that my way of seeing the world is correct: a proper application of Aumann’s Agreement Theorem would have me recognize that I should conditionalize on the fact that I was born as myself, and that any disagreements between me and others are irrational, given that our priors are in fact shared. I take this as a sign that disagreements, though persistent, are always opportunities for learning. They give me the chance to integrate another point of view into my own thinking. Most of these attempts will fail, unsupported by the proper background or disposition, but sometimes they stick. And I do my best to take ideas seriously, to update my beliefs in the light of new evidence, to dispassionately accept the possibility that I might be wrong. And it’s with this that I have a smidge of confidence in my own (meta) rationality.
That was wonderful, Ansh. Thank you. In many ways it felt as though I was hearing a retelling of my own experience; only you deserve credit for describing it all much more poetically than I possibly could.
To start with, I deeply empathise with the curiosity element that is core to your being. This is a trait that I share and led me also to become an avid reader at a young age. Given this, it would be negligent of me not to ask what some of your favourite books are or which ones you believe have been most formative to the current version of yourself? I know this is a tough question, however, so you can limit it to ones you’ve read this year if that makes it easier. I’ll let you decide how to categorise it and make it a computationally tractable problem.
I also greatly appreciate the Isaiah Berlin reference and your phrasing of being “a hedgehog whose one big idea is foxiness.” Again, I have to say this is all too similar to myself. I often think of Tyler Cowen’s comment where he says that being a generalist is “Underrated, but, of course, I am a generalist. But a generalist is also a specialist. I sometimes say that I specialise in being a generalist, and it’s a speciality too. It’s not different from specialising, you’re just specialising in different things.” I would have to agree with both parts of this: being a generalist is underrated, and it is a speciality of its own. Presumably you would too?
In regards to your generality, and specifically with the following quote in mind “my knowledge acquisition pipeline has diversified and consists of reading blog posts and news and journal articles in addition to books, as well as listening to podcasts, taking online and in-person courses, watching videos, etc.” I wanted to ask about how you see the interplay between these wide sources of knowledge you acquire and the skills you develop (or aim to). I recognise that you aren’t predominantly motivated by the instrumental value of knowledge, as you stated, but the point I am getting at is: How do you perceive the application of knowledge (let’s call that a skill) and how that reflects on your grasp of the knowledge that precedes it?
Or, coming at this from a slightly different angle: In your generalist ambitions, what stops you from being a dilettante? What proactive or preventive measures are you trying to make use of in order to stop you from being someone who only holds a superficial and ineffective splash of knowledge from a multitude of disciplines? Is this something that even concerns you, or is knowledge for knowledge’s sake good enough?
Admittedly, I ask out of selfish motivation — it is something I fear. Of the failure modes associated with being a generalist, I would think that overconfidence (in relation to actual knowledge) across a vast number of fields is one of the more probable. I understand that being motivated by curiosity, rather than instrumental value, is one thing — but how obliged do you feel to stress-test that knowledge and ensure it is reliable?
Using my own example, this past year I have been exploring a topic that you know much more about: computer science (predominantly programming and discrete mathematics). Now, I am far, far away from possessing anything that remotely resembles knowledge on these topics; that needs to be clear from the start. However, something I have tried to do whilst undertaking this learning process is to ensure I take assessments or work on projects, in order to apply what I have been learning as well as get some explicit feedback on how I am going. I don’t think doing this is necessary when it comes to “gathering viewpoints and frameworks,” though. What I mean by this, is after consuming various material — such as lectures or books — I felt like I could (at least in part) “think programmatically” and had incorporated words such as “algorithm,” “recursion” and “data structure” into my mental (and spoken) vocabulary. In your sense of becoming a generalist, is this sufficient? Could I have stopped there?
To me, the answer (in this case) is a clear: No. The issue is that the framework that I have constructed still has a certain flakiness to it. The remedy I have in mind is the development of strong, “real world” skills.
Continuing with the example, a mind that can write a Quine program, has very likely acquired and deeply internalised much of the programming knowledge required to “think programmatically.” While writing a Quine is not necessary or sufficient for thinking programmatically, I would still say that anyone that can write a Quine is much-more-likely-than-not to have a robust, accessible and non-superficial knowledge of programming. This concept seems extremely important to me when it comes to being able to appreciate and draw on the differing frameworks and viewpoints that we both wish to collect. Personally, I want to know that I am not simply climbing to the point of Peak Ignorance on the Dunning-Kruger curve, and then moving onto the next curiosity-driven interest in my quest for general knowledge.
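For readers who have not met the term, a quine is a program that prints its own source code. Here is a minimal sketch in Python, offered only as an illustration: the names `QUINE` and `run_and_capture` are made up for this example, and the two-line program held in the string is a well-known construction rather than anything original. The sketch runs the quine and checks that what it prints is exactly its own text.

```python
import contextlib
import io

# A classic two-line Python quine, stored as a string so it can be checked.
# The program keeps a template of itself in `s` and prints the template
# filled in with its own repr.
QUINE = "s = 's = %r\\nprint(s %% s)'\nprint(s % s)"


def run_and_capture(source: str) -> str:
    """Execute Python source in a fresh namespace and return what it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


output = run_and_capture(QUINE)

# print() appends a newline, so the output is the source plus "\n".
print(output == QUINE + "\n")
```

The trick is that the program never reads its own file; it rebuilds its text from a template, which is why writing one tends to require a firm grip on strings, escaping and evaluation order.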
What do you think about that?
To summarise, I think the idea that I am pushing back on here is the one where you can read a popular science book on X and then claim that you can think like an X-ist, be it in evolutionary biology, economics or computer science. It seems to me that the possession of various skills is somewhat of a safeguard against this. While the possession of any single skill is only a datum, the combination of many skills is a good proxy for how deeply someone knows that particular framework, including the meta-knowledge it entails, such as when it is (and is not) the correct lens to view the world through.
Please do tell me if your view diverges from this at all.
In comparing our journeys, it’s also interesting, though not surprising, that you mention mental models. I am yet to read Scott Page’s book; my own introduction to the world of mental models was through Shane Parrish and Farnam Street. From there, it was a small hop, skip and jump to the rationalist community, which I, too, immediately fell in love with. As a whole, I appreciated what the rationalist (and post-rationalist) community stood for, and I was also drawn to the slightly more rigorous and quantifiable approach that it utilises.
Now, with all that covered, I would like to dive a little more deeply into some of your ideas. We clearly have many shared traits — at least, at this level of analysis — so let’s see if things diverge at all when we examine them a little more closely and we dive into the weeds. What I would love to do, is to ask you to elaborate a little more on the following passage:
“I’m not sure at all that my way of seeing the world is correct: a proper application of Aumann’s Agreement Theorem would have me recognize that I should conditionalize on the fact that I was born as myself, and that any disagreements between me and others are irrational, given that our priors are in fact shared. I take this as a sign that disagreements, though persistent, are always opportunities for learning.”
This is an extremely interesting few sentences for me, and I would like to better understand the thought processes behind it. I also think it will be extremely valuable for anyone else reading this. So, with that in mind, would you mind possibly even starting with an introduction to Aumann’s Agreement Theorem and why this is an important concept…
Thanks for the kind words and, once again, for the opportunity to explicate my own ideology and thinking patterns. Already the experience feels like an exegesis of some of the things I wanted to express about myself but never found the chance to.
For foundational books, here is an attempt:
- Thinking, Fast and Slow (Daniel Kahneman)
The replication crisis may not have been entirely kind to Kahneman and Tversky’s findings on cognitive biases, but reading this in high school crystallized my understanding of common “failure modes” in human reasoning. Over time I’ve started to question whether these biases and fallacies are really examples of “irrationality”, but it’s impossible for me to ignore that I still remain highly skeptical of my own and others’ behavior and reasoning. Kahneman’s book helped lay the foundations for such an attitude.
- The Big Picture (Sean Carroll)
Carroll’s “poetic naturalism” remains the best way for me to reconcile my belief in the reductionary and explanatory power of science with the possibility of higher-level constructs like morality and consciousness. Concepts like free will, consciousness, and moral systems exist, but they are emergent and sometimes human-derived from fundamental physical phenomena: they are maps of the territory, and thus are useful, but not ontologically primitive.
- The Elephant in the Brain (Robin Hanson and Kevin Simler)
If Kahneman and Tversky were responsible for an initially skeptical attitude towards human reasoning, Hanson’s famous “X is not about Y (but really about Z)” allowed me to grok that common explanations of behavior are often motivated by social desirability bias, and sometimes our true motivations are less pure than we’d like them to be. Once Hanson and Simler have opened your mind to the possible widespread prevalence of social signalling and crony beliefs, it’s hard to look back.
- Antifragile (Nassim Nicholas Taleb)
Again, another piece in the skeptical framework and mindset I’ve tried to cultivate. Taleb’s style of discourse is nothing to admire, but his writing undoubtedly carries wisdom. It’s important to know when you are in Mediocristan and when you are in Extremistan and that sometimes your uncertainty is unbounded. Taleb viciously critiques thinking that’s common on Wall Street and in the Ivory Tower and implores his readers to look for motivated people with “skin in the game” in order to glean what truly rational behavior looks like.
- Inadequate Equilibria (Eliezer Yudkowsky)
When does an efficient market fail? And is there a case for foregoing epistemic modesty and trusting your own inside view over the outside view of “the experts” and society? Yudkowsky’s project in this book is to lay out how one can better assess their own meta-rationality and figure out where an outside view is likely to be wrong, which is an essential component of any reasonable contrarian’s toolkit.
- Knowledge and Decisions (Thomas Sowell)
How should knowledge be aggregated and distributed to decision-makers for optimal outcomes? Building on Hayek’s famous essay, “The Use of Knowledge in Society,” Sowell lays out the difficulty of central planning in certain environments and how it may be wiser to rely on local knowledge and tradition in those situations. Perhaps overly political, but this to me is Sowell’s magnum opus, and firmly lays out the limitations of a “high modernist” attitude that tries to impose order on systems that are much too chaotic to predict and control.
And, while not books, the archives of Eliezer Yudkowsky’s writings on LessWrong (I think collected and edited into Rationality: From AI to Zombies) and Scott Alexander’s writings at slatestarcodex.com were, of course, foundational to my thinking today, specifically Yudkowsky’s explanation of Bayesian epistemology and Alexander’s frequent attempts at explaining political and social behavior (in, e.g., “I Can Tolerate Anything Except the Outgroup” and “Conflict Vs. Mistake”).
Your concerns about the (possible) failure modes of being a generalist are extremely well-founded and I’m in basic agreement with your points. I found this blog post to be a similar critical analysis of such a phenomenon: it seems exceedingly common in the modern intellectual sphere to both aspire towards becoming a generalist and to increasingly consume content from such generalists (e.g. Tyler Cowen, Sam Harris, Scott Alexander). It’s definitely a possibility that deep specialization is no longer as prized as it once was, and it’s completely possible that this is to the detriment of one’s own intellectual flourishing and, perhaps alarmingly, to societal progress.
I don’t have any strong rebuttals to such arguments, but I’ll still attempt to justify my own intellectual explorations. As you noted, because of the intrinsic satisfaction I gain from widespread exploration of ideas and concepts across multiple fields, I would find it personally very difficult to curtail such habits. I try to ensure that I gain as much as I can from specialization in my academic and professional pursuits: there I focus much more on becoming a “subject matter” expert and less on trying to coagulate knowledge from disparate fields and thinkers. In other words, my schooling and work are focused on “depth”, whereas my more casual intellectual interests are more focused on “breadth”. There is inevitably some crossover and I won’t pretend that, for example, my casual understanding of economics and politics doesn’t bleed into my career and schoolwork, but I try to make clear to myself and others that I can only seriously vouch for my experience in the fields of computer and data science. I try to practice “keeping my identity small”, in the words of Paul Graham, and would never claim to be able to think like an X-ist until and unless I had spent much more time and engaged much more thoroughly with established research in the field of X.
At the same time, I think there is something to be gained from recombination of ideas from different disciplines and thinkers. The reason I like to think of myself as a hedgehog who practices foxiness is because I assume that collecting different ways of thinking, even if I can’t claim that my own thinking is representative of an expert in the field, can both be fruitful and is stimulating in and of itself. So I’ll say “my own approximation of what an expert economist would say about something like this” instead of “as someone who can think like an economist”, or otherwise attempt to convey a general air of epistemic modesty.
This actually flows nicely into a discussion of Aumann’s agreement theorem. Scott Aaronson’s explanation here of common knowledge and the applications of Aumann’s theorem is the best I’ve found, but I’ll attempt an imperfect reconstruction. Put simply, rational agents who share common priors cannot agree to disagree once their opinions are common knowledge. Aaronson walks through an example of rational agents starting from common priors, or common beliefs about the state of the world, and gradually exchanging evidence until they converge upon a shared conclusion. This “updating” process follows a random and therefore unpredictable walk: one cannot predict the final end state of their beliefs, or even any intermediate state until new evidence has been received, but one can expect eventual convergence with other rational agents given that all knowledge between them is “common” or shared.
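The convergence described above can be sketched with a toy version of the back-and-forth. This is a minimal sketch under stated assumptions, not a reproduction of Aaronson’s own example: it assumes a uniform common prior over nine states, two hypothetical agents who each know only which block of their own partition contains the true state, and a protocol in which they take turns announcing their current probability for an event. The function names (`posterior`, `cell_of`, `refine`, `dialogue`) and the specific partitions are choices made purely for illustration.

```python
from fractions import Fraction


def posterior(event, info):
    """P(event | info) under a uniform common prior over states."""
    return Fraction(len(event & info), len(info))


def cell_of(partition, state):
    """Return the block of the partition that contains the given state."""
    return next(cell for cell in partition if state in cell)


def refine(listener, speaker, event):
    """Split each listener block by what the speaker would announce in each state."""
    refined = []
    for cell in listener:
        groups = {}
        for state in cell:
            said = posterior(event, cell_of(speaker, state))
            groups.setdefault(said, set()).add(state)
        refined.extend(frozenset(group) for group in groups.values())
    return refined


def dialogue(event, part_a, part_b, true_state, max_turns=20):
    """Agents alternate announcing posteriors until nobody learns anything new."""
    parts = [list(part_a), list(part_b)]
    announcements = []
    quiet_turns = 0  # consecutive turns in which the listener learned nothing
    for turn in range(max_turns):
        speaker, listener = turn % 2, 1 - turn % 2
        announcements.append(posterior(event, cell_of(parts[speaker], true_state)))
        updated = refine(parts[listener], parts[speaker], event)
        # Refinement only ever splits blocks, so an equal count means no change.
        quiet_turns = quiet_turns + 1 if len(updated) == len(parts[listener]) else 0
        parts[listener] = updated
        if quiet_turns == 2:
            break
    return announcements


# Hypothetical setup: states 1..9, event {3, 4}, true state 1.
event = frozenset({3, 4})
agent_a = [frozenset({1, 2, 3}), frozenset({4, 5, 6}), frozenset({7, 8, 9})]
agent_b = [frozenset({1, 2, 3, 4}), frozenset({5, 6, 7, 8}), frozenset({9})]

announcements = dialogue(event, agent_a, agent_b, true_state=1)
print([str(p) for p in announcements])
```

In this run the announcements go 1/3, 1/2, 1/3, 1/3, 1/3: the two agents start out disagreeing and, purely by hearing each other’s numbers and asking what the other must have seen to say that, end up at the same answer.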
Aaronson explains some of the implications: “For example, rational agents with common priors, and common knowledge of each other’s rationality, should never engage in speculative trade (e.g., buying and selling stocks, assuming that they don’t need cash, they’re not earning a commission, etc.). Why? Basically because, if I try to sell you a stock for (say) $50, then you should reason that the very fact that I’m offering it means I must have information you don’t that it’s worth less than $50, so then you update accordingly and you don’t want it either,” and “… we get a clear picture of what rational disagreements should look like: they should follow unbiased random walks, until sooner or later they terminate in common knowledge of complete agreement.” Of course, disagreements rarely look like this in practice, so some of the starting assumptions of the theorem must fail, e.g. people are irrational, or people start from different priors.
The second assumption is an interesting one. To outsource to Aaronson again: “… there’s a paper by Tyler Cowen and Robin Hanson called “Are Disagreements Honest?”—one of the most worldview-destabilizing papers I’ve ever read—that calls [throwing out the second assumption] into question. What it says, basically, is this: if you’re really a thoroughgoing Bayesian rationalist, then your prior ought to allow for the possibility that you are the other person. Or to put it another way: “you being born as you,” rather than as someone else, should be treated as just one more contingent fact that you observe and then conditionalize on! And likewise, the other person should condition on the observation that they’re them and not you. In this way, absolutely everything that makes you different from someone else can be understood as “differing information,” so we’re right back to the situation covered by Aumann’s Theorem.” In other words, the idea of rational disagreement cannot be rescued by discarding the assumption of common priors, since rational agents should also have shared priors!
Why is Aumann’s theorem important? I personally look at it as an idealized standard to hold myself to: I try to take disagreement seriously, to practice epistemic modesty, and to really practice the idea of updating my beliefs in response to new evidence. But inevitably, and perhaps too frequently, I have to trust my inside view of my own meta-rationality and conclude that sometimes convergence of beliefs is not possible. It’s a neat idea, but humans are perhaps the wrong species for it to apply to.
Firstly, in relation to the books you mentioned: What a stunning list! I haven’t yet touched The Big Picture (though I am now keen to), but the others I have either read partially or in their entirety. I must admit, however, that I am slightly envious that you came to Kahneman and Tversky’s work during high school; it was many a year later for myself. I also really like how you mentioned that the replication crisis has called aspects of their work into question, but even with this considered, the ideas outlined in The Elephant in the Brain seem to reinforce the need for some cautious scepticism of others and, most importantly, of ourselves, irrespective of particular (mostly academic) details. It’s also interesting that you mention books by Taleb and Sowell. I see a lot of similarities between these two (for example, Sowell captured a lot of Taleb’s idea of “skin in the game” when he matter-of-factly said: “It is hard to imagine a more stupid or more dangerous way of making decisions than by putting those decisions in the hands of people who pay no price for being wrong.”). Of the two, I have been significantly influenced by (and prefer) Sowell’s work, in particular. Though, I must say, not always due to being in full agreement with him.
In relation to the other writings you reference, again, I must (in true confirmation-bias style) commend your taste. I have referenced Yudkowsky a number of times on this site already, and family and friends are no doubt sick of me talking about things of his I have read. Regardless, many of his essays either provided me with a Hansonian “viewquake” or precisely put into words ideas that I was previously incapable of crystallising. In addition to this, I would estimate that “I Can Tolerate Anything Except the Outgroup” is my most recommended and shared article/essay of all-time; I pass it on to just about anyone I think will come close to appreciating it.
Next on the agenda: Thank you for sharing that piece on some concerns of intellectual generality. I thoroughly enjoyed it and would recommend that everyone reading this also read it. I would quickly like to highlight some of the points and passages I most appreciated, even if, for the most part, they are criticisms which could apply quite strongly to myself:
“In economic terms, content disaggregation enabled by digital platforms ought to create efficiencies through intellectual hyper-specialization. Instead, we have the endless hellscape of the casual polymath. A newsletter about venture capital will find time to opine on herd immunity. The tech blog you visit to learn about data science is also your source of financial strategies for early retirement. The Twitter account you followed to understand politics now seems more focused on their mindfulness practice. We have maxed out variety of interests within people, at the cost of diversity across them.”
“If you saw JFK in 1947, you might have thought “wow, he’s rich, his father was the Chairman of the SEC, and he’s a member of the US House of Representatives, what an impressive guy!” A decade later, you could have added “Pulitzer Prize winning author” to that list. But this reasoning is totally backwards. JFK was only able to become a politician because of his wealth. In fact, his father only became SEC Chairman after extensive political donations to FDR. And obviously, his book was ghost-written by his speechwriter. So you’re justified in being impressed by exactly one accomplishment, and everything else ought to be discounted.”
“Rather than as a Renaissance Man, Leonardo would be better regarded as an exceptional painter with various hobbies.”
There were many well-reasoned ideas in that piece, and I found it very eye-opening to read such a critical analysis of something I gravitate towards. Following on from that, I also tremendously enjoyed the piece you shared on common knowledge and Aumann’s agreement theorem by Aaronson. Again, thank you, it provided me with a lot of clarity (it is terrifically well explained) and also helped me see the error in the following assumption, which the author himself admits he previously made:
“When I first learned about this stuff 12 years ago, it seemed obvious to me that a lot of it could be dismissed as irrelevant to the real world for reasons of complexity. I.e., sure, it might apply to ideal reasoners with unlimited time and computational power, but as soon as you impose realistic constraints, this whole Aumannian house of cards should collapse. … So one could conjecture that agreement, in general, requires a lot of communication. So then I sat down and tried to prove that as a theorem. And you know what I found? That my intuition here wasn’t even close to correct!”
Your point that Aumann’s theorem is important “as an idealized standard to hold myself to” and that you “try to take disagreement seriously, to practice epistemic modesty, and to really practice the idea of updating my beliefs in response to new evidence” is well taken. It is something I also aspire to, but all too commonly fall short of. To return to Aaronson one final time, he conveys the idea of epistemic humility in a precise and pithy manner when he states:
“… what licenses you to privilege an observation just because it’s your eyes that made it, or a thought just because it happened to occur in your head? Like, if you’re objectively smarter or more observant than everyone else around you, fine, but to whatever extent you agree that you aren’t, your opinion gets no special epistemic protection just because it’s yours.”
Moving on from all this appreciation and agreement (as well as the theorem of it), I would next like to ask you about a few topics that you might be able to discuss simultaneously: 1) statistics in science, and 2) causal inference. Both of these, as you know, are very important topics for anyone who is attempting to live anything approximating a well-informed, rational life. Would you care to address these topics, either separately or in conjunction, and attempt to convey how you think about them?
To possibly provide you with some direction, I would mention that those who claim to be pro-science, are for Enlightenment ideals, identify as rational skeptics, etc. often tend to be evangelical in their support of science, possibly believing, or disbelieving, based on little other than the p-value of a paper. What are some of the errors that can be made here, or things we should possibly be concerned about? As you know, these possible failure modes are still important, even if, in general, we recognise science as a method of truth-seeking.
Following on from that, how is causal inference related to truth-seeking; or, for the uninitiated, what even is it?
I would feel uncomfortable if I didn’t extensively caveat my discussion of statistics and causal inference below, so I’ll take a brief moment to do that here. I only have undergraduate training in the topics discussed and limited practical experience, so much of what I say has been gathered second-hand from books, blogs, podcasts, and research papers I’ve happened upon. Much of what I say will likely be incomplete, but I’ll do my best to try and distill what I’ve picked up into some important takeaways. In other words:
Epistemic Status: My best guess
I like the idea of tackling statistics and causal inference together. To start with some definitions, statistics is the broad field of interpreting and analyzing data, whereas causal inference more specifically attempts to tease out causal effects: what happens to Y when I intervene on X? It’s a very natural question that follows statistical inquiries, but many of the standard statistical tools are actually inadequate to deal with causal questions. Simple linear regression only gives us associations between variables and we all know that correlation does not imply causation. Students are taught to “control for confounders” when building regression models and often slide into causal language about said models even when doing so is unjustifiable. Even in the analysis of experimental studies, causal models and effects are implicitly assumed but not concretely formulated, often leading to inappropriate inferences. I think Judea Pearl (one of the main figures in the field of causal inference) is onto something when he writes that “ … human intuition is organized around causal, not statistical, relations.” Here’s an example illustrating the difference between the two that will likely appeal to the portion of the audience following along who have a background in fitness and nutrition.
A little background first. The “calories-in-calories-out” (CICO) model, though simplistic, approximates the process of weight gain or loss in humans quite well. Put simply, eating consistently over your body’s energy requirements (or caloric maintenance) will lead to weight gain and eating consistently under your caloric maintenance will lead to weight loss. We’ll note that the weight gained is likely to be a combination of several different tissues, principally muscle and fat tissue. The partitioning-ratio (p-ratio) is the proportion of an energy surplus (in the context of weight gain) or deficit (in the context of weight loss) that is realized in terms of muscle gain or loss. Previous research has found an inverse correlation between an individual’s body fat percentage and their p-ratio, suggesting that leaner individuals gain proportionally more muscle in an energy surplus than fatter individuals. This has led to advice suggesting that individuals seeking to maximize muscle gain while gaining weight should first attempt to reduce their body fat percentage before doing so.
Put bluntly, I think most of this previous research has confused a statistical inquiry for a causal one. Even though an inverse correlation between body fat percentage and the p-ratio was observed across individuals, the relevant question is a causal one: if an individual gains weight from a lower body fat percentage compared to a higher one, what difference in p-ratios will be observed? We can predict that leaner individuals will have higher p-ratios than fatter ones, but we have no idea if this is due to their genetics or better habits enabling them to maintain both a lower body fat percentage and a better p-ratio. We haven’t answered the question of what causal effect body fat percentage has on one’s p-ratio, which is the relevant effect for fitness enthusiasts who are trying to understand the necessity of losing body fat before attempting to maximize muscle gains. More recent analysis of this question (by Dr. Eric Trexler, for example) suggests that an individual’s p-ratio is unlikely to be substantially affected by their body-fat percentage (outside of certain extreme values) and is instead largely determined by the magnitude of their energy surplus and their genetics.
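As a toy illustration of this point, here is a small simulation. It is only a sketch under invented assumptions, not a model of real physiology: a hidden "genetics" variable drives both body fat percentage and p-ratio, while body fat itself has no causal effect on p-ratio. Every number and name in it is hypothetical.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate_person(rng, forced_body_fat=None):
    """Return (body_fat, p_ratio) for one hypothetical individual."""
    genetics = rng.gauss(0, 1)                      # hidden common cause
    body_fat = 20 - 4 * genetics + rng.gauss(0, 2)  # "good" genetics -> leaner
    if forced_body_fat is not None:
        # Intervention: set body fat directly, regardless of genetics.
        body_fat = forced_body_fat
    # Assumption of this toy world: p-ratio depends on genetics only.
    p_ratio = 0.5 + 0.1 * genetics + rng.gauss(0, 0.05)
    return body_fat, p_ratio

rng = random.Random(0)

# Observational data: just watch people as they are.
observed = [simulate_person(rng) for _ in range(5000)]
obs_corr = pearson([bf for bf, _ in observed], [p for _, p in observed])

# Interventional data: force everyone lean, then force everyone fat.
lean = [simulate_person(rng, forced_body_fat=12)[1] for _ in range(5000)]
fat = [simulate_person(rng, forced_body_fat=28)[1] for _ in range(5000)]
causal_effect = sum(lean) / len(lean) - sum(fat) / len(fat)

print(f"observational correlation: {obs_corr:.2f}")
print(f"p-ratio difference under intervention: {causal_effect:.3f}")
```

In this made-up world the observational correlation is strongly negative, yet forcing people to be leaner changes their average p-ratio by essentially nothing. The association is real; the causal effect is not.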
To come back to the question of science and statistics, some readers will likely be familiar with terms like the “replication crisis” in science, referring to a phenomenon largely centered around social psychology research failing to replicate. What this means is that a study that found a potential effect of an experimental intervention on a variable of interest, when repeated by a different team of researchers, fails to produce the same results. The methodological practices of a number of scientific fields have come under severe scrutiny after this realization and many cherished findings are now being discarded as suspect and likely nonexistent. An even more damning recent critique, applied specifically to psychology research but likely relevant to a number of fields, is one that asserts the presence of a Generalizability Crisis. The author (in my opinion) convincingly argues that many psychologists have insufficient rationale to make strong generalizable claims from the specific statistical methods they employ in their research, dramatically reducing the relevance of their field.
There are other reasons to suspect that the process by which scientists find significant results is highly error-prone. This post lays out how the field of parapsychology (the field that studies phenomena like ESP, telepathy, precognition, etc.) is a useful control group for standard scientific processes. If parapsychologists find that their null hypotheses, e.g. a hypothesis asserting that ESP is not real, can be rejected using standard scientific methods, then that leads us to conclude that positive results in other fields are highly suspect if they rely on those same methods. This article provides an amusing deep dive into the rigor with which parapsychology is now performed, demonstrating again how null hypotheses almost everyone believes to be certainly true can be rejected in favor of outlandish alternative explanations. If someone is interested in better understanding the weaknesses of modern scientific inquiry, I’d highly recommend Science Fictions by Stuart Ritchie, where Ritchie identifies misaligned incentives in academia and provides numerous anecdotes of both well-meaning and malicious scientists causing serious harm because of poor methodological practices. There’s also a Saturday Morning Breakfast Cereal comic that nicely summarizes the book here.
There are some epistemic hygiene tips I can provide to better guard oneself against poor scientific practice. The first step is to Beware the Man of One Study and recognize that one study provides very little evidence of any particular claim. Some readers will likely be familiar with hierarchies of evidence, where anecdotes and expert opinion provide less evidence than randomized-controlled trials and meta-analyses. I would thus advocate for a Bayesian epistemological attitude, where one assigns a prior probability to a hypothesis (preferably before encountering any research on it) and then gradually updates their belief in response to weak evidence (in the case of anecdote or a single observational study) or makes a substantial credence shift when they find strong evidence, like a systematic review or meta-analysis.
In case those last few sentences sound like gobbledygook, I highly recommend this tutorial on Bayes’ rule and Bayesian epistemology. For other actionable takeaways, I’ll quote this (brilliant) article on What’s Wrong With Social Science:
- Read the papers you cite. Or at least make your grad students do it for you. It doesn’t need to be exhaustive: the abstract, a quick look at the descriptive stats, a good look at the table with the main regression results, and then a skim of the conclusions. Maybe a glance at the methodology if they’re doing something unusual. It won’t take more than a couple of minutes. And you owe it not only to SCIENCE!, but also to yourself: the ability to discriminate between what is real and what is not is rather useful if you want to produce good research.
- When doing peer review, reject claims that are likely to be false. The base replication rate for studies with p>.001 is below 50%. When reviewing a paper whose central claim has a p-value above that, you should recommend against publication unless the paper is exceptional (good methodology, high prior likelihood, etc.) If we’re going to have publication bias, at least let that be a bias for true positives. Remember to subtract another 10 percentage points for interaction effects. You don’t need to be complicit in the publication of false claims.
- Stop assuming good faith. I’m not saying every academic interaction should be hostile and adversarial, but the good guys are behaving like dodos right now and the predators are running wild.
The article also has a number of takeaways for institutions and other methodology tips, e.g. to increase sample sizes and decrease significance thresholds (reducing the false-positive rate of scientific findings greatly), move towards open data, and more dramatic reforms like doing away with the scientific journal system entirely, the cases for which I personally find quite compelling.
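One way to see why stricter thresholds and larger samples help is a back-of-the-envelope calculation of how often a "significant" finding reflects a real effect. The sketch below is not taken from the article; it uses the standard positive-predictive-value formula, and the prior and power values are assumptions chosen purely for illustration.

```python
def positive_predictive_value(prior, power, alpha):
    """Share of statistically significant results that are true effects.

    prior: fraction of tested hypotheses that are actually true (assumed)
    power: probability that a true effect reaches significance
    alpha: significance threshold, i.e. the false-positive rate under the null
    """
    true_positives = prior * power
    false_positives = (1 - prior) * alpha
    return true_positives / (true_positives + false_positives)

# Hypothetical field: 1 in 10 tested hypotheses is true, studies have 50% power.
baseline = positive_predictive_value(prior=0.1, power=0.5, alpha=0.05)

# Stricter threshold. Holding power fixed is a simplification: at the same
# sample size, lowering alpha would also lower power.
stricter = positive_predictive_value(prior=0.1, power=0.5, alpha=0.005)

# Stricter threshold plus larger samples (modelled here as higher power).
bigger_n = positive_predictive_value(prior=0.1, power=0.9, alpha=0.005)

for label, value in [("baseline", baseline),
                     ("stricter alpha", stricter),
                     ("stricter alpha + more power", bigger_n)]:
    print(f"{label}: {value:.2f}")
```

Under these assumed numbers, roughly half of the significant findings at the conventional threshold are false positives, and tightening the threshold removes most of them.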
So there’s my rather disjointed reply on causal inference and statistics. If I could sum up some key takeaways:
- There is a difference between causal and statistical relations. Humans are more often interested in the former.
- Modern scientific practice is quite error-prone.
- Consider adopting an explicitly Bayesian epistemological attitude where you update your beliefs in proportion to the strength of the evidence you observe.
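The last takeaway can be sketched in a few lines using the odds form of Bayes' rule. The prior and the likelihood ratios below are made-up numbers meant only to contrast weak and strong evidence; they are not estimates for any real study type.

```python
def update(prior, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.

    likelihood_ratio = P(evidence | hypothesis) / P(evidence | not hypothesis)
    """
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

prior = 0.20  # credence before seeing any research (assumed)

# Weak evidence, e.g. an anecdote: barely more likely if the claim is true.
after_anecdote = update(prior, likelihood_ratio=1.2)

# Strong evidence, e.g. a good meta-analysis: much more likely if true.
after_meta_analysis = update(prior, likelihood_ratio=10.0)

print(f"after anecdote: {after_anecdote:.2f}")
print(f"after meta-analysis: {after_meta_analysis:.2f}")
```

The anecdote nudges the credence slightly, while the strong evidence moves it substantially. The hard part in practice is choosing the likelihood ratio, which this sketch simply assumes.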
Thank you for that, Ansh. I think you have just laid out some very important ideas, in a very digestible manner. This is by no means an easy task, especially when the subject matter shares a border with some rather technical topics. You seem to have quite a gift for conveying such ideas and I don’t think it is beyond the realm of possibility for you to become quite the populariser of important notions, like the ones above — I suspect many would benefit from reading more of your work! I know I would, at least. I must also credit you on your epistemic humility; though, it is not surprising at this point.
I also want to thank you for the additional resources you have supplied. I am yet to read Science Fictions, but I have had it recommended to me on good authority previously, and now you have added further weight to that claim — alas, the constant battle of decaying time and an increasing booklist. Regardless, I thoroughly enjoyed the comic version you sent through, and am in large agreement with the ideas it contains. Relating to the other articles and papers, many I was familiar with and it highlighted to me that we really do share some deep-seated values, even though our geography, cultural background and educational history, among many other factors, differ substantially. This was reassuring to me. No doubt one slicing up of that data could lead to the conclusion that we simply exist within the same echo-chamber, but personally I don’t think that theory captures the most truth-value possible.
Following on from that, I think most of what you referenced are great starting points regarding the general topics we are discussing at large here. I would add that I definitely think the quote from Judea Pearl hints at something quite fundamental. With that in mind, I would say, though, that his book The Book of Why: The New Science of Cause and Effect is probably not an excellent starting point for these topics. Whilst it was written for a wider audience (in comparison to Pearl’s academic books on causality), it is my belief at least, that it is still excessively technical for most and a non-trivial amount of background knowledge is required to extract something truly meaningful from it. If someone reading this is a relative novice to what we are talking about here, I would suggest they first work their way through the resources you have so kindly linked to (among others, possibly), and then consider The Book of Why following that. Would you agree, or do you have your own advice?
Relating to your discussion on statistics and causal inference, I will try to summarise in order to check my own comprehension and for the benefit of readers. Feel free to comment on my recapitulation.
In essence, (classical) statistics provides us with methods to detect patterns or associations. Causal inference, on the other hand, is what allows us to uncover what causes those associations; what inscribes such patterns into nature. While statistics and causal inference may often look the same, especially to a layperson, it is important to understand that they are not. Importantly, statistics followed by post-hoc speculation about the causal model is not equivalent to properly executed causal inference, even though the two may look very similar in execution.
This relates to what you said about weight change, calorie intake and p-ratio (a well chosen example, I must say). Am I correct in understanding you as saying the following: Just because leaner individuals are found to have superior p-ratios, doesn’t mean someone will improve their p-ratio by getting leaner? This is because, like you said, it is a causal question and mostly what we have to be guided by is associative data.
In response to that, one might argue, “Well, if being leaner is associated with a better p-ratio, then it couldn’t hurt!”. To this, I’m not sure I agree. A simplistic analogy could be as follows: You observe that wealthy people tend to have lots of expensive assets. Using associative-reasoning, one could conclude that by buying an expensive car or house, they, too, can become wealthy. However, causal-reasoning tells us that the expensive assets are likely to be an effect of prior wealth, and one cannot become categorically wealthy by targeting the effect (akin to treating the symptom and not the cause). The same line of reasoning also applies to the p-ratio debate: The superior p-ratio in leaner individuals could be an effect of an upstream cause (such as genetics), and targeting it as a method of intervention is possibly redundant — it could, however, even have negative instrumental value. To blend the two topics together, if one is aiming to become wealthy (or muscular), then buying assets that are expensive (or dieting), may actually take one further from their goal. It is the incorrect move for that individual to make; even if they made it with the best of intentions. An incorrect causal model is not just non-optimal, but quite possibly a hindrance.
Moving on to the discussion of the replication and generalisability crises, could we possibly take a more concrete example? I understand that this will move us onto some less stable epistemological ground, but I would be interested in hearing your opinion on the matter; feel free to say as much or as little as you please. For instance, take the famous “Stanford Prison Experiment.” Some time ago, before social psychology drew the level of scrutiny it has received in recent years, the work of Philip Zimbardo was held up as (somewhat conclusive) evidence that human nature was an inherently dark and tangled web of self-serving and mostly deplorable behaviours. In more recent years, however, the issues of social psychology (among other fields) have led to the conclusions of similar experiments being questioned or discredited. This fits the narrative of many current public intellectuals (just as the prior conclusions fitted those of others, previously), and I have heard many, in essence, say the following:
“The Stanford Prison Experiment was methodologically flawed for the following X, Y and Z reasons. Therefore, we can see that humans are not inherently evil.”
They may even go as far as to claim that humans are inherently good (which is not the actual point I am trying to debate here). While my portrayal is no doubt reductive, what I believe I have seen numerous times, however, is the revelation of the flaws in a study being used as evidence to support the alternative conclusion. What I would like to ask you is, how do you think we interpret this? What does the Stanford Prison Experiment — with all its criticisms — say, if anything, about human nature?
Personally — and I am open to revision of this opinion — I feel as though the Stanford Prison Experiment could only be used as evidence of undesirable elements of human nature, or interpreted as “neutral”/non-evidence either way (due to the ambiguity caused by being discredited). To me it seems like a false-move to deem a study as sufficiently methodologically flawed, and then use that as evidence of the alternate conclusion. Whilst I understand that, in a Bayesian sense, absence of evidence actually is evidence of absence, do you think that we can point to defective data and use that to privilege the alternative hypothesis? Or, can we only use the (revealed to be flawed) evidence to weaken the conclusions of the hypothesis that it previously appeared to support?
Finally, I think your tips on epistemic hygiene and quotes from What’s Wrong with Social Science are extremely insightful. I am certainly in support of the call “to increase sample sizes and decrease significance thresholds (reducing the false-positive rate of scientific findings greatly), move towards open data, and more dramatic reforms like doing away with the scientific journal system entirely.” Relating to this last point in particular, I am reminded of the Eliezer Yudkowsky quote from Scientific Evidence, Legal Evidence, Rational Evidence:
“Perhaps future generations, acting on the theory that science is the public, reproducible knowledge of humankind, will only label as “scientific” papers published in an open-access journal. If you charge for access to the knowledge, is it part of the knowledge of humankind? Can we fully trust a result if people must pay to criticize it?”
Thanks again for the kind words! I definitely strive to learn how to distill important insights from complex topics so it’s reassuring to hear that I’m making progress on this front.
Yes, I’d definitely agree that The Book of Why is likely an inappropriate starting point for many trying to learn about causal inference. I think it’s also fair to say that Pearl and Mackenzie portray a somewhat narrow perspective on the field of causal inference and often unfairly malign other researchers and contributors in the area. I would suggest that interested readers also check out Andrew Gelman’s review of the book as well as Aronow and Sävje’s review here. Pearl is undoubtedly a giant of the field and one of its most significant contributors, but there are some other perspectives that I think he neglects or grants insufficient credit to, so I feel compelled to provide those alternative viewpoints with a platform as well.
Addressing your summary of my causal inference primer, I think you’ve done an excellent job summing up the thesis I laid out. You nailed the distinction between association and causation and how confusing one type of reasoning for the other can lead to trouble. It is also much easier to find associations and effects of causes rather than causes of effects, so we often lean towards studying what we can measure and quantify even if it provides us little actionable insight.
On the topic of Stanford Prison Experiment and replicability/generalizability, I generally agree with what you’ve said thus far. I mainly view these crises as wake-up calls: they told us that our knowledge of the world was much more limited (at least within these fields) than we’d been led to believe, and that to move forward, we’d need to have more rigorous methodological practices. I have personally been influenced by Thomas Kuhn’s ideas about scientific paradigms (I’ll also recommend some excellent (though fairly unrelated) articles on Kuhn’s work by Lou Keep here and here), and to me the field of social psychology appears to be mostly pre-paradigmatic. There are bits of data and observations that we have (the quality of which is actually quite suspect!), but there is no cohesive framework underlying our approach to analyzing this data and integrating it into our current knowledge base. Studies are carried out and ad hoc hypotheses are justified or debunked without any deep theoretical underpinning. I don’t mean to attach an excessively negative valence to this observation and only intend to convey my impression that the field of social psychology is markedly different from some of the natural sciences, where “normal science” (using Kuhn’s terminology) is occurring. To me, it’s unsurprising that the study of human psychology is much harder to get right than physics, for example, so I don’t think we should be overly disheartened at our lack of progress on this front.
As you point out, while finding evidence that a particular study failed to replicate does provide some Bayesian evidence against the result the original study found, one should be careful not to over-update on this fact and should instead take it in stride, recognizing that even “true” hypotheses may fail to replicate. I think the framing you lay out here, “… can we only use the (revealed to be flawed) evidence to weaken the conclusions of the hypothesis that it previously appeared to support?”, does a good job of reflecting this approach. Ultimately, though, I think this still boils down to an increased level of credence in an alternative hypothesis, just perhaps not as much of an increase as some are inclined to posit.
Basically, instead of saying that: “The Stanford Prison Experiment was methodologically flawed for the following X, Y and Z reasons. Therefore, we can see that humans are not inherently evil.” I would personally say: “The Stanford Prison Experiment was methodologically flawed for the following X, Y and Z reasons. Therefore, I don’t believe as strongly as I did before that humans are evil.”
One other epistemological approach I find intriguing, related to the standard “rationalist” Bayesian reasoning I think both you and I are fond of, is using the principle of Jeffrey Conditionalization (section A here). To me it seems like a quite elegant solution to the problem of uncertain evidence. When we get a scientific study that provides some evidence in favor of a hypothesis, perhaps we should use base rates of replicability/general trust-worthiness to assign a level of belief in the quality of that evidence and then perform a Jeffrey update to more accurately adjust our level of belief. Practically this does seem a bit tricky to implement, but I think the general principle is sound and worth keeping in mind.
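A minimal sketch of what such a Jeffrey update could look like follows. Here E stands for "the study's finding is genuine", and `credence_in_e` is the trust placed in that, for instance from base rates of replicability. All probabilities are invented for illustration, and the function names are hypothetical.

```python
def bayes_posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """P(H | E) via ordinary Bayes' rule, treating E as certain."""
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1 - prior_h)
    return p_e_given_h * prior_h / p_e

def jeffrey_update(prior_h, p_e_given_h, p_e_given_not_h, credence_in_e):
    """Jeffrey conditionalization on uncertain evidence E.

    new P(H) = P(H | E) * q + P(H | not E) * (1 - q),
    where q is the new credence that E actually holds.
    """
    h_if_e = bayes_posterior(prior_h, p_e_given_h, p_e_given_not_h)
    h_if_not_e = bayes_posterior(prior_h, 1 - p_e_given_h, 1 - p_e_given_not_h)
    return credence_in_e * h_if_e + (1 - credence_in_e) * h_if_not_e

prior_h = 0.30          # credence in the hypothesis beforehand (assumed)
p_e_given_h = 0.80      # chance of a genuine finding if H is true (assumed)
p_e_given_not_h = 0.20  # chance of a genuine finding if H is false (assumed)

full_trust = jeffrey_update(prior_h, p_e_given_h, p_e_given_not_h, 1.0)
partial_trust = jeffrey_update(prior_h, p_e_given_h, p_e_given_not_h, 0.6)

print(f"taking the study at face value: {full_trust:.2f}")
print(f"trusting the study at 60%: {partial_trust:.2f}")
```

With full trust the update reduces to ordinary Bayesian conditioning; with partial trust the credence moves in the same direction, but less far.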
Thanks for sharing those reviews of The Book of Why, Ansh. Again, I think this is a behaviour that exemplifies your approach to information consumption and processing. You don’t just read the book, but also reviews, critiques and interpretations of it by people who possess domain-relevant knowledge, such as statisticians and data scientists in this instance. I am mostly ignorant of Gelman’s work and ideas (though I’m aware he carries quite the respected reputation), so thank you for passing that along. I definitely found his review the more digestible of the two, based on my level of familiarity with the field.
There were two things, in particular, that I liked about Gelman’s post. The first was his conception of the division of labour that is occurring in the modelling of phenomena. While possibly this is something that seems obvious to someone such as yourself, I think I would have been in a much better position to read Pearl’s book had I had this simple little idea already in my mind. The second — which builds upon the first — is captured in the following quote:
“If you think you’re working with a purely qualitative model, it turns out that, no, you’re actually making lots of data-based quantitative decisions about which effects and interactions you decide are real and which ones you decide are not there. And if you think you’re working with a purely quantitative model, no, you’re really making lots of assumptions (causal or otherwise) about how your data connect to reality.”
Whilst reading the book, I had the recurring sensation that Pearl and Mackenzie were approaching (at least) a two-part problem in a uni-dimensional way. This was as close as I could get to formulating a concern with the content; I couldn’t verbalise my concern explicitly. Upon reading Gelman’s post, I recognised immediately what I hadn’t quite been able to put my finger on. He captured my concerns perfectly, whilst also picking out many more technical issues that I was entirely ignorant of, might I add. However, even though I had these doubts and issues with the thesis, I was also doubting my doubts. This was because, as Gelman also touches on, Pearl and Mackenzie claim to be critiquing the currently established wisdom. As a mostly-naive reader, it was difficult to disentangle how much I could justifiably be sceptical of, or whether I was simply a part of the masses raised on non-causal statistics and looking to preserve my current worldview.
I think your review of my comments and your general summary of the replication issue, and how we should think about it, are all very apt. I think, ultimately, you are correct. A failed replication should add weight to a competing hypothesis. Personally, I was likely insufficiently updating when this occurred, due to my interpretation that others were excessive in doing so. Possibly I would now phrase this as: Given a failed replication, the confidence that is then deducted from the hypothesis should not all then be added to alternative options.
Possibly I am being unnecessarily pedantic, or I am simply wrong from a technical standpoint, but it feels to me that we shouldn’t simply reduce the probability of one hypothesis and then re-normalise the probability distribution. Experimental evidence indicating that what we believed to be our most probable hypothesis is probably wrong should count for something in and of itself.
What I am hinting at here is some kind of meta-confidence rating. Not only should we be able to distinguish confidence between competing theories, but attempt to have some gauge on how likely, in the absolute sense, our most probable hypothesis is to be actually true. I have no doubt this is already an established concept, in one regard or another, I am just unaware of its formal appearance. If you have any wisdom to share, or criticisms to deliver, in this regard, then have at it.
Finally, you summarised my own view well when you said: “To me, it’s unsurprising that the study of human psychology is much harder to get right than physics, for example, so I don’t think we should be overly disheartened at our lack of progress on this front.” The only thing I would add is that, while many take great pleasure in arguing that the social sciences are not real sciences, I think that they are either heavily discounting, or completely ignoring, the potential value of social science.
While the expected utility of physics is likely much higher, due to the “consistency” of its knowledge (a higher probability of “truth” contained in each finding), I think that the social sciences have a well deserved place in our broader, investigative mechanisms. Whilst the social sciences tend not to do what many consider “real” science, I personally see their role as something akin to playing the lottery. Conversely, the harder sciences are more like working a job. Both have an expected pay-off — the probability of reward, multiplied by the magnitude of it — and should be recognised for the unique strategic value they can provide an individual in the quest for improved financial well-being.
While this is not a perfect analogy, as by most counts the expected utility of playing the lottery is personal loss (though a well-funded government), I think there are some parallels. While a wage is much, much more consistent in its contributions, winning the lottery — should it actually occur — is, at least, an order of magnitude more rewarding in any one instance. What I mean by that, is this: If we do, in fact, gain actual knowledge of human-nature — through the social or “soft” sciences — then I think this is very likely to be instrumentally more valuable than enhancing our knowledge of the physical world. Now, there are many caveats here, and I don’t hold that belief unwaveringly; but the stock I place in it is not trivial. I would be curious to hear your thoughts on all of that.
I’ll comment briefly on your assessment of my information consumption strategy. This is essentially my guard against epistemic learned helplessness. Often I feel confused about which direction and how strongly to update given new information, so I’m always very careful to consider a number of different viewpoints before I make any significant shift to my beliefs. I also find intellectual disagreements utterly fascinating, especially when they’re between people I respect highly. I might even say that my biggest intellectual interest is trying to disentangle complex disagreements, in the vein of Julia Galef or John Nerst. There’s nothing quite like getting to the root of a nontrivial disagreement and realizing how each person arrived at their particular viewpoint, what facts might lead them to change their mind, and what values they hold that are integral to their beliefs. It’s also easy to find disagreements about virtually any topic of interest, so it’s a strategy I can employ whenever I learn about something new.
I felt similarly about Gelman’s blog post. The recognition of the fact that separating the qualitative and quantitative is a doomed endeavor, and his and Pearl’s agreement on this point, was a significant takeaway for me as well, and it got me to better appreciate that I’m always doing a mixture of both in my professional and academic work.
On this point: “Not only should we be able to distinguish confidence between competing theories, but attempt to have some gauge on how likely, in the absolute sense, our most probable hypothesis is to be actually true. I have no doubt this is already an established concept, in one regard or another, I am just unaware of its formal appearance.” This is interesting, and I think it has some relation to the concept of realizability (I’m linking to a post describing this concept in the context of AI alignment, since that’s where I first heard about it. There are likely better sources that are more relevant). Put simply, realizability is the assumption that our prior distribution contains the “real world”, or that the correct hypothesis is contained in our space of competing hypotheses. I think a strict Bayesian would reply to your concern by stating that you simply multiply the relative likelihoods of competing theories by the prior odds you place on them to arrive at your posterior odds, which represent your absolute credence in any hypothesis under consideration. Your worry seems to be that your conception of the hypothesis space is limited and that you may be leaving out reasonable hypotheses and/or the “correct” hypothesis (violating realizability). I’m not sure if there is a technical solution to this, but my own personal advice is to just accept that your credences are likely flexible anyway and that adding new hypotheses seems like a fine way to update them, even if the relative likelihood between previous hypotheses hasn’t changed. I hope that makes sense.
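To make that last suggestion slightly more concrete, here is a toy sketch of widening the hypothesis space with a catch-all "something else" option. The priors and likelihoods are invented, and assigning a likelihood to a catch-all is itself a judgment call, so treat this as an illustration of the bookkeeping rather than a recipe.

```python
def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

def posterior(priors, likelihoods):
    """Posterior over a fixed hypothesis set: prior * likelihood, normalized."""
    return normalize({h: priors[h] * likelihoods[h] for h in priors})

# Closed hypothesis space: assumes the truth is either H1 or H2.
priors = {"H1": 0.6, "H2": 0.4}
likelihoods = {"H1": 0.1, "H2": 0.3}  # P(data | hypothesis), assumed
closed = posterior(priors, likelihoods)

# Widened space: reserve 20% of prior mass for hypotheses not yet considered,
# shrinking H1 and H2 proportionally so their ratio stays 3:2.
open_priors = {"H1": 0.48, "H2": 0.32, "something else": 0.20}
open_likelihoods = {"H1": 0.1, "H2": 0.3, "something else": 0.2}  # assumed
opened = posterior(open_priors, open_likelihoods)

for name, table in [("closed", closed), ("opened", opened)]:
    print(name, {h: round(p, 3) for h, p in table.items()})
```

The relative odds between H1 and H2 are identical in both tables, but the absolute credence in the leading hypothesis is lower once some probability is set aside for possibilities outside the original list.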
I find your analysis of the relative value of social and natural sciences fascinating. It reminded me of this reddit post, where the author contends that: “… the human sciences aren’t really “hard”- in fact, they are further advanced than any of the other sciences. It’s just that our standards for what counts as “real” non-trivial knowledge and advances in the human sciences is impossibly demanding compared with other areas.” I think there’s definitely some merit to this hypothesis, as well as the one you advanced. I particularly like the way you frame some areas of scientific research as akin to buying lottery tickets (perhaps a comparison to starting a company might work better here, since that likely has positive expected value, unlike a lottery ticket). I’m not sure I would personally differentiate the social and natural sciences in this regard, however, and I might instead take each field, and likely even specific areas of study within each field, separately. For instance, research into machine learning methods is likely to provide a dependable and significant social and epistemic return, whereas research into graph theory is unlikely to provide much of either. I might agree that reaching generalizable conclusions in the social sciences is more difficult than in the natural ones, though I’m not entirely sure if those conclusions provide more social value. I think my main worry is that human individual differences are so ubiquitous that expecting to arrive at fundamental and tractable laws of human behavior is a doomed endeavor. We might only be able to arrive at heavily contextual conclusions that have limited applicability outside of their epistemic domain.
Another related line of thinking that this has provoked in me is the possibility that human behavior becomes simpler/more predictable down the line because of increasing technological capabilities. We might bypass the need for significant progress in the field of psychology by improving recommendation systems and machine learning algorithms to the point where we don’t need to understand the fundamental causes of human behavior, so long as we can predict/steer humans in the directions we want them to go. Or perhaps progress in genomics advances to the point where we have highly accurate polygenic scores for basically every single human trait and embryo selection/editing can lead to idealized humans that are perfect along every dimension we can conceive of.
I realized I’ve opened up several highly controversial cans of worms here, the contents of which I personally feel ill-equipped to explore much further, but future technological progress occupies a nontrivial amount of my headspace these days. The future is such a fascinating place, and I’m very glad that we’re all travelling to it together.