
This is such a great example of bullshit studies.

Yeah, I'm gonna go out on a limb and say this is a marketing campaign. It's an insult to science to call this a "study".

From the pdf:

> The fieldwork of this study was carried out at remarkable speed, from the 23rd to 24th of June 2021

Data was collected in 1 day. Is this a joke? No it's not!

> The population sampled included Software Engineers aged 18+ living in the United Kingdom. In total we surveyed 258 such people

Survation, the market research agency that did the actual fieldwork, interviewed only 258 people. I don't see any definition or criteria for what that means: years of experience, field, cultural background, or any of the many other very important aspects that anyone would wonder about.

I really don't understand how they can do a half-assed survey of only 258 people, and then have the gall to claim "83% of developers".

What's really sad about this is that there is indeed an increased workload that I've heard people say they've experienced, caused by the side effects of the COVID response most governments took, from working from home for extended periods to mental health issues.

This kind of click-bait bullshit marketing campaign is pretty harmful to the very real issue of burnout, because it muddies the waters and dilutes the issue.



Your criticism of the sample size is unfounded. You absolutely can have meaningful results with only 258 people provided you have good sampling. If you want to criticize their sampling, then by all means do it, but that's not what you've done here.


A lot of the COVID19 stuff only has a population size (NOT EVEN sample size) of like, 15. Ex: J&J's blood clot issue.

People make conclusions off of tinier sample sizes. There are plenty of scientific studies with just 50 or 100 people, but you need to keep the small numbers in mind.

As long as the sampling was done correctly, a small sample size won't matter much!


People make conclusions based on tiny sample sizes when the results fit their biases.

Most studies are worthless due to small sample sizes and confounding variables, and most “conclusions” are based on cherry-picked outliers. The fact that it’s common to do this doesn’t make it responsible or correct.


have you taken intro statistics? all small sample sizes do is require your effect to be more pronounced to still be statistically significant.


Ah, then I guess that's why so few "statistically significant" studies end up being contradicted by further research...


You make it sound like those studies were bad because the sample size was too small. In reality bad study design or having an unrepresentative sample are more common issues.


I think they go hand-in-hand. People naively apply intro-level statistics to messy real-world problems and then think they need an orders of magnitude smaller sample than they would need to even hint at the possibility of an interesting effect.


Neither a bad experiment nor an unrepresentative sample will be helped by a larger sample size, though.


True, though no sample is ever truly representative along every dimension. Even when a lot of effort is put into making a sample as representative as possible, there will be variance in all the potential ways the sample can diverge from the population; this variance seems to be consistently underestimated.


I agree with you broadly, though would the J&J sample size not be about 21.4 million? 15 is just the number of adverse events right?


When we get into "total population" statistics, we kinda leave the realm of statistics lol.

There are 15 documented adverse events, but we don't really know how many adverse events there were in total. If we simply study those 15 adverse events, then the population of the study is only those 15 events (with a hell-of-a-lot of selection bias).

----------

But I'm mostly thinking about how one person I know thinks: "Did you hear about these 15 cases and even one of them died! Clearly J&J is unsafe to take!!!!"

For that kind of conclusion, it's like they've focused their mind entirely on 15 cases, as if their "population of study" was 15. It's obviously bad logic, but maybe I wasn't very clear in my original post that I was referencing some bad logic there.

---------

But there's plenty of 50-person or 200-person studies that ARE properly done, and the science is solid. Especially when we're looking at say Phase 1 / Phase 2 style trials of safety and/or efficacy (the earlier stage trials before a Phase 3 trial is conducted).

It's not a big enough sample to be "certain" of the results, but it's a big enough sample to "provide evidence", possibly in conjunction with other studies.

----------

In many cases, having 5 or 6 studies of 50 random people each and performing a meta-study across them (especially if all of those studies had high-quality sampling) would lead to a more robust conclusion than one 1000-person study done slightly incorrectly (e.g. a voluntary telephone survey or some other low-quality method).


Sorry but I disagree. The key point is that over a very large population, millions, only 15 people got it. It's just wrong to say it was computed over only 15 people.

This study referenced up above had the ~285 or whatever people who answered the survey, not that much info. ~285/350 is not that much data.

Anecdotally and based on my own experience I'd agree that the burnout is high, probably over 50%.


Would you agree that we can conclude that the incidence of burnout is high amongst software developers? The "real" level may be 83% or 93% or 43% but it seems fair to conclude that it is indeed some larger than desirable number that merits a larger and more robust study.

That is the value of a study of this size. The ability to quickly determine if there is any "there" there. The headline is sensationalistic but such is the nature of publish or perish science today.


> When we get into "total population" statistics, we kinda leave the realm of statistics lol.

Not even close to accurate.


When there's an election or census over some particular figure, there's no "+/-" or "standard deviation".

If the election count is say, 48% for A and 50% for B and 2% other, that's that. You don't have error bars, you don't have Student's t-tests, you don't have random variables, you don't have "confidence intervals". You assume you have the ground truth. You have a precise number assumed to be without error.

It's actually a fundamentally different kind of math than what is typically taught in statistics classes (standard error vs. standard deviation, R-values, random variables, etc.).

When it comes to "total population" style information gathering, the questions of reliability are about whether or not you did, in fact, grab the total population... and nothing really "mathematical" (such as: what is my 95% confidence interval?).

---------

When people say 600,000 died of COVID19 in the USA, there's no error bar, standard deviation, or random variable on that number. It is what it is: the ground truth according to a particular source.

And when you have a 2nd source of information with slightly different numbers (ex: State databases vs US level databases), you have a non-mathematical problem for how to resolve the apparent contradiction.

There's no "metastudy" over other surveys. There's no "resample again", there's no reproducibility. These statistical concepts and tests simply do not apply. If you do in fact decide to conduct a 2nd census, you will find that the numbers will differ in practice (ex: recounting the same set of ballots a 2nd time). But there's no underlying mathematical model for why this happens, as is the case in typical population statistics (student t-test assumes a standard error over a standard deviation of a hypothetical sample, etc. etc.)


Elections are not statistical tests. Number of people dead are not statistical tests. You said "When we get into "total population" statistics, we kinda leave the realm of statistics lol." That is false. There are many questions that statistics is best suited to answer even if an entire population has been surveyed. Often the term population is poorly defined, and is much wider than the simple base case.

In the case of J&J blood clots, there are billions of additional people who could take the J&J vaccine. 15 of the 21 million people who took the J&J vaccine got blood clots. But if you want to know what percentage of "people who take the J&J vaccine" get blood clots, that's not the same thing. The population includes an infinite number of people.


Yes that is correct. Perhaps not as a rigorous experiment, but at a high level yes.


>There are plenty of scientific studies with just 50 or 100 people, but you need to keep the small numbers in mind.

The converse also needs to be held in mind as well. Very large studies are more likely to display statistically significant findings. Your p-value is really just a test of your sample size, so as the study grows your p-value tends to shrink. Get a large enough sample and you'll almost always find statistical significance. Just referencing a p-value isn't enough, it needs to be read in the context of the study size (whether large or small).

Edit: For those downvoting, here is a much better and more eloquent explanation:

https://twitter.com/daniela_witten/status/131218095960997478...


This is a bad explanation of p-values. From the start:

> If you’re testing whether mean=0 and actually the truth is that mean=0.000000001, and if you have a large enough sample size, then YOU WILL GET A TINY P-VALUE.

Think about this for a second. Does it make sense that the results of my statistical test would change if I measured in teragrams vs. micrograms? No, it doesn't, because what matters is not the numerical difference between the null hypothesis and the sample mean, it's the difference relative to the standard deviation.

If the effect size is 0.000000001 and the standard deviation is 0.0000000000000001 then this is an enormous, important effect. If the standard deviation is 0.1, then this is a really small and unimportant effect. It can still be observed correctly with a sufficient sample size though because you've stated it's a real effect to begin with.
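As a quick sanity check (my own toy example, with made-up numbers), rescaling the measurements by a constant leaves the two-sample t-test p-value unchanged, because the test only sees the effect relative to the spread:

    import random
    from scipy.stats import ttest_ind

    # Two groups with a real difference in means, same spread
    random.seed(1)
    a = [random.gauss(0.0, 1.0) for _ in range(100)]
    b = [random.gauss(0.3, 1.0) for _ in range(100)]

    # Rescaling both groups (e.g. changing units) does not change the p-value
    p_original = ttest_ind(a, b)[1]
    p_rescaled = ttest_ind([x * 1e-9 for x in a], [x * 1e-9 for x in b])[1]
    print(p_original, p_rescaled)  # identical up to floating-point noise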

You will absolutely not get a tiny p value just because your sample size is large. That is wrong. You will be able to detect very small effects, but if there is no difference, you are more likely to draw the correct conclusion with 1,000,000 samples than with 100 samples.

The point the author here is trying to convey, but isn't getting across clearly, is that a large sample size can detect an effect that is practically useless and call it significant. p-values tell you whether, given your data, there seems to be a difference between the two datasets. They make no claims about whether the difference is material.

Example Python code to show that p-values are not likelier to be small just because the sample size is large, given no actual difference:

    import random
    from scipy.stats import ttest_ind

    # Both groups are drawn from the same distribution, so the null hypothesis
    # is true. Count how often the test reports p <= 0.05 (false positives).
    num_tests = 10000
    res = []
    for test in range(num_tests):
        n = 100
        a = [random.random() for _ in range(n)]
        b = [random.random() for _ in range(n)]
        res.append(ttest_ind(a, b)[1] <= 0.05)

    print(f"{num_tests} tests with n=100 per group: {sum(res) / num_tests}")

    # Repeat with a 1000x larger sample; the false-positive rate stays around 5%.
    num_tests = 1000
    res = []
    for test in range(num_tests):
        n = 100000
        a = [random.random() for _ in range(n)]
        b = [random.random() for _ in range(n)]
        res.append(ttest_ind(a, b)[1] <= 0.05)

    print(f"{num_tests} tests with n=100,000 per group: {sum(res) / num_tests}")


This is a good explanation, thank you.

>You will be able to detect very small effects, but if there is no difference, you are more likely to draw the correct conclusion with 1,000,000 samples than with 100 samples.

I believe that this is the tweet author's point, where they stated that in the real world a null hypothesis never exactly holds true. So you can detect even the tiniest variance with a large enough sample.

>given no actual difference

I think the error here is that there is a priori knowledge of no difference, whereas the author is stating that in real-life scenarios there will always be some difference, even if too minute to be of practical significance. We can fabricate "no difference" in simulations, but in real experimental design there will almost always be variance, even if it's just an artifact of the measurement process rather than being caused by the independent variable(s). Whether or not the differences are statistically significant depends on the experimental design, including sample size.

So while I understand your valid point I think the author's claim was more about the practical application of statistics rather than the mathematical precision as it relates with simulated examples. But I could be misinterpreting.


If you're performing a rigorous experiment, then you have a control group, and your null hypothesis is that the experimental group will be the same. In actuality you will find that many null hypotheses hold true and the experimental design has no effect on the outcome, at all. Of course in some fields, like psych, most everything has some effect. But it's absolutely not correct to blame p-values for helping you distinguish between no effect and astonishingly small effects. They're functioning correctly. That is a secondary problem to solve, usually by stating a minimum effect size.

A far greater problem is false discovery rate issues, where you test 20 different things at once and by chance identify one of them as significant even though the true effect size is 0. This is another area where increasing your sample size can help avoid problems, but even then you need to acknowledge your tools are imperfect.
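A rough sketch of that failure mode (my own simulation, with made-up numbers): run 20 independent tests where the true effect is zero and see how often at least one of them comes back "significant" at alpha = 0.05:

    import random
    from scipy.stats import ttest_ind

    random.seed(0)
    trials = 1000          # simulated "experiments"
    num_hypotheses = 20    # independent things tested per experiment
    n = 50                 # samples per group
    any_false_positive = 0
    for _ in range(trials):
        pvals = []
        for _ in range(num_hypotheses):
            a = [random.random() for _ in range(n)]
            b = [random.random() for _ in range(n)]
            pvals.append(ttest_ind(a, b)[1])
        if min(pvals) <= 0.05:
            any_false_positive += 1

    # Expect roughly 1 - 0.95**20 ~= 0.64: most experiments report at least one
    # spurious "discovery" unless you correct for multiple comparisons.
    print(any_false_positive / trials)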


>In actuality you will find that many null hypotheses hold true

I'm assuming you mean within the confines of the experiment, correct? I agree. The tweet author was alluding to the fact that "IRL" the null hypothesis is almost never true at the population level, meaning that if you grab a large enough sample you will detect very, very small differences. (This was her Lucky Charms ~ blood type example in the tweet.) I also agree with that. I don't think the two claims are mutually exclusive, and the fact they can coexist is (I believe) precisely her point about sample size.


I mean there are many real world examples where the impact of A has no effect on B, obviously. The number I'm thinking of, has, truly, no effect on the time since you last blinked. No sample size will change that.

Lucky Charms does probably relate to blood type in some impossibly small way. It makes sense that a biological trait has some relation to dietary consumption. I don't think any sort of non-garbage tier journal with peer review would publish it, but good on p-values for helping us detect them though. Not bad on sample size for making it possible to discern this effect size with a high degree of confidence.

It's worth noting that we also have many other tools to help us. For example you can test, given an expected effect size and a sample size, what the probability is of getting a statistically significant result, or a non-significant result, or a significant result that erroneously goes in the wrong direction. Or what the range of likely true mean effect size is given a significant sample difference.
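If it helps, here's a minimal sketch of that kind of power calculation using statsmodels (the effect size and numbers are made up purely for illustration):

    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()

    # Probability of detecting a small effect (Cohen's d = 0.2)
    # with 100 subjects per group at alpha = 0.05
    print(analysis.power(effect_size=0.2, nobs1=100, alpha=0.05))        # ~0.29

    # Sample size per group needed to reach 80% power for the same effect
    print(analysis.solve_power(effect_size=0.2, power=0.8, alpha=0.05))  # ~394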

We want large samples. They enhance confidence in findings. The author's premise seems to be that it's better not to know that small rocks exist if we're only looking for big rocks, but it fails to mention that the tools to find small rocks also help us identify big rocks with more clarity.


The size of the sample you need depends on how rare the event being studied is. For something that happens 80% of the time, 300 is a very reasonable number. Something that happens 1/1000 or 999/1000 of the time would need a much larger sample size.
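As a rough back-of-the-envelope illustration (my own numbers, not from the study), the normal-approximation margin of error is about 1.96 * sqrt(p(1-p)/n), which is tight for an 80% phenomenon at n=300 but hopeless for a 1-in-1000 one:

    import math

    def margin_of_error(p, n, z=1.96):
        # 95% normal-approximation margin of error for a sample proportion
        return z * math.sqrt(p * (1 - p) / n)

    print(margin_of_error(0.8, 300))    # ~0.045 -> small relative to 80%
    print(margin_of_error(0.001, 300))  # ~0.0036 -> several times the rate itself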


Yes, I agree. But that's to the same point. The p-value is still a measure of whether your sample size is large enough to capture a small-probability event in the context of assessing the null hypothesis.


Normal election polling uses a 1k sample size as a minimum, and 3k is preferred.


Yes. 130 people is sufficient for a good sample.


If we use a 95% confidence interval and we have a sample size of 258 then the margin of error is around 4.5%. If the sample size is 130, then the margin of error will be around 6.5%. That's a little high, but probably still good enough.
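Those figures line up with the usual normal-approximation formula, taking p = 0.83 from the survey (a quick check of my own, not from the article):

    import math

    p = 0.83  # reported burnout proportion
    for n in (258, 130):
        moe = 1.96 * math.sqrt(p * (1 - p) / n)
        print(n, round(moe, 3))  # 258 -> ~0.046, 130 -> ~0.065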


The sample size you need also depends on variance and the presence of confounding variables. It’s impossible to say whether n is “big enough” in the abstract; it depends completely on what you are measuring and how you measure it.


We are sampling a Bernoulli distribution for measuring burn out. A software developer could be burnt out/ not burnt out.

The variance of a Bernoulli distribution is P(1-P), which is (.83)(.17), since 83% of the respondents answered that they were burnt out.


But why are they "burned out"? Is it because they're a software developer or because they live in an advanced post-industrial civilization in the midst of a global pandemic? Could they feel burned out today but not tomorrow? Has the sun come out in the UK at any time in the last two weeks? Do all the participants even agree on the definition of "burnout"? Were there five other similar studies done that we don't know about that were discarded because they didn't produce the desired click-baity conclusion?

I could go on and on.


It's important that the 258 were uniformly drawn from the population of software engineers in the UK.

It's not important whether the sun came out in the last two weeks or not, unless we were trying to measure the weather affecting burn out.

It's true different people might have a different interpretation of burn out, but it should be pretty similar to other people. I'm going to assume all people who took the study understand the idea behind burn out.

The point of the poll isn't to answer why but to measure the percentage of burn out with some degree of accuracy. The margin of error in this poll was 4.5%. That's how all yes/no polling works. We don't know that 83% of people are burnt out, but we are reasonably confident that 78% to 88% of people are burnt out.


Sorry, but what part of "they only surveyed software developers in the United Kingdom" gives you the impression that their sampling might actually be good?

Worth noting that the study itself doesn't seem to mention their sampling methodology.


Studies are never meant to be an ultimate answer to everything. And just because they're not the ultimate answer, doesn't mean that they are therefore worthless, bad or "bullshit".


It actually does mean that in most cases. Having false confidence in a hypothesis based on non-rigorous studies is much worse (and in some cases, a lot more dangerous) than withholding judgment until the data (all the data, not some cherry-picked slice of it) really is conclusive.


No. You perform studies to validate observations or theories and, if you are able to validate them, offer potential factors as contributing causes. A study where the data does not support the conclusion is worthless, and can in fact be harmful.


I wouldn't be so quick to discard it. From personal experience, the past 14+ months of working from home have been very stressful.

I didn't think I was burnt out until I started interviewing (due to work "stress", which turned out to be burnout in disguise). One recruiter informed me they have seen a spike: 70% of interviewees canceled scheduled interviews, post-screening. I almost canceled mine due to being overwhelmed/burnt out from preparations + work, and my thesis is that there are a lot of burned-out developers out there. Unfortunately, I didn't hear what the cancellation-rate baseline was before work-from-home.

If there are tech recruiters/HR folk/hiring managers reading this, I'd love to know the numbers you've been seeing for interview cancellations.


There should be a study on studies being posted to HN. We must be at like 99% of them that get critiqued for being a bad study in one way or another. I would love to see some examples where everyone accepts the study.


To be extremely fair to hn: almost all such studies about human behavior, especially those relying on polling, are garbage.

Often, the statistical methods are inappropriate at best, or downright deceiving at worst; there is nominal, if any, understanding about what "controlling for a variable" means, among many other such things. Not to mention that often the methods themselves are highly questionable in their reproducibility, statistics aside.

(Another classic is to point out flaws in generalizations for mice studies and such, but that is, imo, independent of the study's methods. It's a fair critique, but I don't think it's what you're interested in measuring w.r.t. "bad" studies.)


"Data was collected in 1 day. Is this a joke? No it's not!"

What makes you think data collected over 1 day is worse than the same data collected over multiple days?

"I really don't understand how they can do a half-assed data survey on 258 people only, and then have the gall to claim "83% of DEVELOPERS""

It's called extrapolation.

No study ever interviews everyone. They all have to extrapolate from a subset of the entire population.

Obviously the more participants there are the more confidence you can have in the conclusion, but whether 200, 2000, or 20000 participants would make for a "good" study is completely arbitrary.


> They all have to extrapolate from a subset of the entire population.

This only really works if you have a representative subset of the population you're extrapolating to. Is the distribution of experience roughly equivalent to the general population of "developers"? Age distribution? Company size? Number of children? Home life? These kinds of demographics can have a big impact, and I see no discussion of how they were accounted for.

Heck, even the general work ethic culture difference between the UK vs the rest of the world could be a huge confounding factor that I know wasn't really accounted for.


Getting a perfectly representative subset of the population means surveying the whole population of the world. Also, accounting for every imaginable variable is always impossible, because we cannot ethically grow people in an isolated environment.

Sure, a meta-analysis of several studies reveals more than one single study, but that doesn't make a single study bad or worthless - or as said in the parent - "a bullshit study".


There are plenty of ways to account for not having a representative sample distribution, I'm just saying that I didn't see any kind of discussion about demographics or confounding variables in the actual study.

This was a pilot study, a way to open an avenue of discussion, and to be sure I think the study warrants more investigation. They even say so: "In future we would like to conduct larger studies over multiple countries."


"This only really works if you have a representative subset of he population"

Of course. Never said otherwise.


In the context of your comment it is implied that you believe the "study" in question does have a representative subset.

While you may not intend that, most people reading your comment would believe you think the study is fine.


"In the context of your comment it is implied that you believe the "study" in question does have a representative subset."

No. I was replying to the specific objections the parent poster had.

I never said or implied anything about whether I thought it was a good study or not.


Clearly, several people (including me) think that you did imply it - so, regardless of whether or not you intended to do so, that's what you communicated in your post.

In particular, the way in which you mentioned extrapolation in the context of the study (and the conversation in general) carried the implication that the extrapolation was valid.


"Clearly, several people (including me) think that you did imply it"

Wow. Several people on the internet think something.

Is that a large enough sample size?


Nit: the 258 number makes this an inference about the broader population, not an extrapolation, if it's a random sample.

It is, however, an extrapolation to make claims about the entire dev world when looking just at UK devs.


You always want at least a week of data collection, to control for weekly effects. People are often happier on a Friday than a Monday, for instance.


I'm not sure how much of a difference the day of the week would make for people who are really burnt out.


The question is precisely whether they are "really" burnt out or just had a long day (or couple of days) at work and are feeling burnt out at that particular moment. We all experience something akin to burnout at some point or another, but it makes a big difference whether it is "I felt burnt out for a day or two after a tough sprint or major production incident" or "I've been working 80h per week for months". With the former you might be fine after a relaxing weekend, whereas the latter may take months to recover from and can seriously jeopardize your long-term physical and mental health.


True burnout, probably not, but it's always better to avoid problems like this by collecting data for at least a week.


it wouldn't, it would make a difference for the people who aren't burnt out


Another study finds out that 83% of population are old ladies with a cat that happily respond to surveys by post.


Having a poll in the field for two days is pretty common, and n=258 is definitely enough for a representative sample.


- This was not a poll.

- Being common doesn't mean it's good or acceptable. In this case, it doesn't account for the effects of day of the week, or what sort of issues the person was having on that day specifically.

- n=258 is enough if it was sampled properly, and they provide no details to show that it was, so it wasn't. They provided nothing more than the engineers they surveyed were in the UK. Nothing on experience, age group distribution, family-size, years of work, company size, work change.

- As it stands, this provides 0 valuable information, and I can only assume it's intended for buzz.


> "It doesn't account for... what sort of issues the person was having on that day specifically"

I'm not clear on what you are suggesting here. Are you saying they should contact people multiple times to make sure they weren't having a bad day the first time around?

2-3 days is common because any longer than that and the survey results will no longer be reflective of a specific point in time.

Also, if you check their crosstabs they have breakdowns for gender and age.


Thanks for taking the time to read the details. I am always skeptical when I see "Study finds...".


The thing is, everything in this world is now gauged by engagement, not quality. The fact that this is getting more reactions on HN gives it even more exposure, and actively reinforces this behavior even more.

If you have a really good study, so thorough that people have nothing to comment on, it will not rise to the top of pretty much any platform, because comments are valued more highly than upvotes on most platforms. So you want to engineer some clickbait in there so that your article gets madly commented upon; being incomplete is perhaps the easiest way to do this. (I don't know about HN, but it certainly appears Facebook and Twitter work that way.)


> everything in this world is now gauged by engagement

Nope, not true; only algorithm-driven websites. Just because this post or comment made it to the top of HN doesn't mean people think it's valuable; it just means, like you said, that it has buzz around it. And it should, because it's bullshit and should be exposed for the bullshit it is. Is that enough to get people to be more skeptical about what they read? I don't know, probably not.


This is such a great example of bullshit studies.

Lots of social science seems to be confirmation of stuff that people have been suspecting. Not to take away from such things: It's important to verify with data.

However, the really interesting question: Where does the burnout come from? Is it possible to specifically characterize the source? By starting to break it down like that, we start toward finding a solution.


If they asked me to do this survey, their mail would end up in my spam box. I’m not burned out.


Sadly lots of today’s “research” meets this low threshold. Lots of politicized headlines use low quality research to push narratives.

#depressing


I recently came across a US Government-funded study [1] that used anecdata from 12 people (just 12 tired souls!) to put its seal on the claim that e-ink screens are better for human eyes than emissive displays.

Edit: In reality, however, the opposite is true. Only in some scenarios (like reading in direct sunlight) does one type of screen outdo the other. In the most usual case, indoor reading, the emissive display is better for the eyes.

[1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5929099/


Yeah I can do a survey too :)


I’m guilty of posting before reading the article. I’m glad you pointed this out. Sounds like an awful study.

There’s true burnout and then there’s feelings of stress. They aren’t the same thing at all. I know because I’ve experienced both. Burnout cuts you down to your soul. I haven’t programmed professionally in 2 years and the thought of doing so makes me nauseous. That’s burnout.


Let me be 100% honest here: if we cannot clearly pin down the psychological and medical definition of burnout (with accurate diagnosis), it is just fine to discuss it in a non-scientific context. But the moment you call this a study and no one here is willing to ask hard questions, I become extremely disappointed in HN and the intellectual curiosity that usually drives discussions here. Why do I feel like there is some political angle to this? (labor rights)


It would take a lot of mental gymnastics to twist this into a political statement. If we "cannot define clearly the psychological and medical definition of a burn-out", why are we even studying it in the first place? That would be like experimenting without a control group.


Looking back on this comment, I want to apologize for trying to say what is "real" burnout and what isn't.

Thinking about it more, they can all be symptoms of the same thing, to varying degrees. Apologies to anyone I may have upset by minimizing their experience.


Your arguments are specious; perhaps you're simply angry that you didn't understand an industry you've worked in for a while? Perhaps you're in management, or in the 20% who aren't burnt out, and you're embarrassed by proxy?

Psychological experiments and surveys are often done in only one or two days. Why would a wider window of time help when taking a snapshot of a current situation?

Hm, is 258 a small sample size? Let's put in some effort instead of just screaming "bullshit" a bunch. There are approximately 408,000 people in the entire population [0], and plugging everything into a calculator, we get two results: First, they should have tried to poll around 400 people (and perhaps they did!), but more importantly, the 95% confidence interval is only around ±6%.

Let me repeat that more clearly: By typical polling standards (95% confidence interval, random sample from filtered population), we expect that 77-89% of UK software developers are actually burnt out. That is still three out of every four people in the industry!
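For the curious, here's roughly how those two numbers fall out, using the standard formulas (a quick sketch of my own; the ~408,000 population figure is from the Statista link below):

    import math

    def required_n(moe, p=0.5, z=1.96, population=None):
        # sample size for a given margin of error, with an optional
        # finite-population correction
        n0 = z ** 2 * p * (1 - p) / moe ** 2
        if population is not None:
            n0 = n0 / (1 + (n0 - 1) / population)
        return math.ceil(n0)

    print(required_n(0.05, population=408_000))  # ~384 respondents for +/-5%
    print(1.96 * math.sqrt(0.25 / 258))          # ~0.061, i.e. about +/-6% at n=258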

Not sure what to say. Take a stats class?

[0] https://www.statista.com/statistics/318818/numbers-of-progra...


All of those calculations assume a random sample which this was not:

"To reduce self-selection bias, invitations to complete surveys were sent out to members of Survation’s panel and the surveys were then conducted by online interview. This approach is far more rigorous than say, running a poll on a website in which anyone can choose to take part."

Are "members of Survation's panel" a random subset of developers? How many invitations were sent out to get 238 responses (i.e. what was the response rate)? Are developers with burnout more likely to respond to a survey about burnout? Was the sample population demographically similar to the overall population of developers?


It's a lot, and I'm not surprised. Recent research in my country, the Netherlands (from a reputable institute), finds that 1 out of 6 people report feelings of burnout and ranks the IT industry in the top 3 sectors. This survey articulates it in quite a broad way, however, just asking if respondents have 'feelings of burnout', which most of us have once in a while.

That doesn't mean they are burned out clinically speaking, or even if they have symptoms of burnout. It does point to a large, systematic problem in the industry (and society in general).


Your comment starts with a disingenuous sentence, and continues to show a level of bitterness I've seen before in people who believe themselves to be more knowledgeable/experienced than their surroundings (or they actually are), but never got a chance (or so they believe).

I didn't claim to know everything there is to know about statistics and research in general, but I think I know enough to tell that this is not a study, but a marketing campaign for whatever this usehaystack thing is.

> First, they should have tried to poll around 400 people (and perhaps they did!),

If you had read the article, you would have known that that's not what they did.

> By typical polling standards (95% confidence interval, random sample from filtered population), we expect that 77-89% of UK software developers are actually burnt out. That is still three out of every four people in the industry!

Do I understand correctly that you've worked on this? That certainly explains the hostility, but nevertheless.

> we expect that 77-89% of UK software developers are actually burnt out.

That's not what the link/study says.

> That is still three out of every four people in the industry!

This is absolutely wrong, without a shred of a doubt.

I recommend that you read: How to lie with Statistics by Darrell Huff.


I didn't produce this survey. I just know how to compute confidence intervals. I'm not going to regurgitate the Wikipedia article on confidence intervals at you; it's clear that you don't really want to learn how statistics work, and you'd rather feel smug because you surely know better than to consider a survey of your fellow people.


Just read the other replies. I'm not going to regurgitate other comments to you. You're clearly way more annoyed by this than average. I don't feel like this is worth engaging with anymore.



