Paradoxic Fandom

Q: What’s the difference between Star Wars fans and Star Trek fans?
A: Star Trek fans don’t hate 90% of their movies!

It’s funny because it’s true!

As a life-long Star Wars fan, this joke resonated with me: my position of finding a lot to enjoy in literally every Star Wars film now seems highly unusual. It also raises an interesting question. What does it mean to be a fan of something you are mostly angry about? How does that happen?

I’ve been thinking about this for a few years now.

Here’s what I’ve figured out.

That’s No Moon

This phenomenon is much bigger than Star Wars, if such a thing is possible.

As an area of work and personal curiosity, free-to-play mobile games show a particularly stark example of what I’m calling ‘Paradoxic Fandom’. Studying the games that have found greatest long-term financial success, I noticed that almost all their player communities tended to have the same repeating refrains:

  • Every update makes the game worse
  • The game is dying
  • The developers are out of touch with players
  • The developers only care about ‘x’ players (x is either new players, or the biggest spenders)

Where I work, it was particularly noticeable that we didn’t get much feedback like that until we came up with our first truly successful long-term game.

I’ve seen Paradoxic Fandom elsewhere too.

  • The dominant feedback on most social media platforms (Facebook, Twitter etc) is that all changes to that platform are for the worse… but people keep using them.
  • The ‘Bad webcomics wiki’ seems like the go-to place for fans of a webcomic to complain about how much they hate it.
  • In Future Shock (2014), a documentary about the long-running UK anthology comic 2000AD, one of the historic editors draws a distinction between ‘readers’ and ‘fans’; he implies that the fans were the most difficult to deal with.
  • I keep in touch with developments in LEGO through the blog From Bricks To Bothans; it became apparent that main writer Ace Kim (since 2002!) has a similar love-hate relationship with LEGO (sample: a review of the ‘Ultimate Collector Series’ LEGO Star Destroyer, ending with “Things like this re-affirms my decision to stop collecting LEGO”)

It’s not just statistical regression

There are two phenomena that look a bit like what I’m talking about, but are meaningfully different: the Sophomore Slump, and its close relative the Sports Illustrated cover jinx.

The Sophomore Slump occurs if someone (or a group of people) performs worse when they have less to prove. Having found success on their initial effort, they may try less hard for the follow-up, be it students in their sophomore year, or bands on their second album.

The Sports Illustrated cover jinx is when an athlete performs exceptionally well, gets featured on the cover of the magazine, then has a disappointing performance immediately after. It’s possible that some athletes find the additional scrutiny difficult to deal with, but this seems much more likely to be a simple case of regression to the mean: in anything where there’s a fairly strong random component to performance, an outlier is most often followed by a more average result.
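To make the regression-to-the-mean argument concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than real sports data: the function name `cover_jinx`, the idea that each season's result is a fixed 'skill' plus an equally sized dose of random luck, and the numbers of athletes and cover stars.

```python
import random


def cover_jinx(num_athletes=1000, top_n=10, seed=1):
    """Return (average cover-season result, average next-season result)
    for the athletes who did best in the first season.

    Assumed model (hypothetical): result = skill + luck, both drawn
    from a standard normal distribution; luck is re-rolled each season.
    """
    rng = random.Random(seed)
    skill = [rng.gauss(0, 1) for _ in range(num_athletes)]
    season1 = [s + rng.gauss(0, 1) for s in skill]
    season2 = [s + rng.gauss(0, 1) for s in skill]

    # The "cover stars" are simply the top performers of season one.
    stars = sorted(range(num_athletes), key=lambda i: season1[i], reverse=True)[:top_n]

    before = sum(season1[i] for i in stars) / top_n
    after = sum(season2[i] for i in stars) / top_n
    return before, after


if __name__ == "__main__":
    before, after = cover_jinx()
    print(f"cover season: {before:.2f}, following season: {after:.2f}")
```

Under these assumptions the cover stars still beat the average the following season, just by less than in the season that got them on the cover. Nobody in the model feels any extra pressure; the drop comes purely from the luck being re-rolled.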

This can apply to almost anything. For example, I found out the excellent line “I’ve come here to chew bubblegum and kick ass… and I’m all out of bubblegum” was from the film They Live (1988), but when I eventually watched it, literally nothing in it was as good as that line. Regression to the mean!

But Paradoxic Fandom is an ongoing, sustained effect, so it’s not just a statistical regression/slump/jinx.

Paradoxic, not yet toxic

When consumers begin to harass or threaten people they perceive as damaging the thing they love, they have crossed a line into what gets termed ‘Toxic Fandom’. Here, I’m interested in the wider phenomenon, which can easily give rise to Toxic Fandom but is meaningfully distinct. So I’m calling it Paradoxic Fandom, and the key defining features are:

  • The product/service is ongoing over months, years or even decades
  • Fans continue to consume and engage with the product
  • … but they report finding the product/service consistently disappointing, getting worse over time
  • … and they believe this is because the creators are out of touch with the fans

So what is going on here?

I think this actually raises four related questions:

1) Why do companies do things badly?
Why can’t video games harness everything they learned and get better with every iteration? Why, when there is clearly money to be made in satisfying a large audience, does capitalism fail to deliver?

2) Why do consumers misjudge things?
If criticism is unwarranted or short-sighted, why does that happen?

3) Why do people remain fans/consumers of things they seemingly hate?

4) Even when responses are mixed, why does criticism dominate the discourse?

I’ve been thinking about this ever since the borderline-allergic reaction of some ‘fans’ to The Last Jedi (2017). Here’s what I think is behind each of those questions.

1. Why do companies do things badly?

What about the money?
If something is a financial success, the budget for follow-ups is likely to be bigger, which in theory should help. But while money helps with execution, in artistic endeavours it’s very clear that money cannot buy ‘quality’ (whatever that is) – if it could, movie and game studios wouldn’t lose the most money on the big-budget failures, but that’s exactly what happens.

The second death star: bigger budget, worse results.

You can’t please all of the people all of the time
The follow-up thing will certainly have similarities to the original thing, but also differences; the people who loved the first thing will have liked different things about it, so inevitably some will be disappointed.

More money, more problems
Companies generally try to make more money – like LEGO looking to grow their business, or a free-to-play game looking to maximise profits. Much as capitalism is built on the lucky fact that in the right environment, competitive self-interest can produce great results for everyone, the end-result is always some kind of compromise between buyer and seller.

One simple factor is that a company will try to make more money out of their customers. Most benevolently this could be by making further content, but there are less benevolent ways too (most simply, releasing a DVD with two different slip-cases to try to get fans to buy two copies).

But that can only go so far. Generally, the best and most scalable way to make more money is to find more customers. It’s quite possible that having attracted all the customers you can with your existing product, you’ll need to make some changes to attract larger numbers, which brings us back to not being able to please all the people all of the time.

2. Why do consumers misjudge things?

I should make one thing very clear: when people complain about things, it is often worth listening, and very productive action can be taken as a result. But sometimes, as consumers, we do misjudge things, and that criticism is less useful. Why does that happen?

One thing I’ve found is that making things – really, almost anything – is always more difficult than you expect.

I see a connection between the Gell-Mann Amnesia effect (read a newspaper article about something you know and see how wrong it is; go on to assume everything else you read is just fine) and the Dunning-Kruger effect (non-experts have a sense of illusory superiority, because they don’t know enough to know better).

In whatever area you work or have expertise, you should be able to identify how wrong most people are about it. Most obviously I find this in any question that begins with “Why don’t they just…?”

A personal example: I want a new mobile phone, and battery life is much more important to me than weight or how thin it is – but I can’t find such a phone on the market. Why don’t they just make a version of a great phone that’s much thicker in order to have a bigger battery?

But if I try to imagine the sorts of things I’d know if I designed phones, I can imagine that there might be complicated cooling issues with a thick battery; or perhaps they make trial-concept versions of potential new designs and have people use them for a few weeks, and discover that even if you think you’d be okay with the weight/thickness trade-off, it’s ultimately too annoying. I can even more easily imagine there’s a reason that I can’t conceive of at all!

Knowing the answer to these questions about one’s own areas of expertise, one should really extend it to other areas. When you start to ask “Why don’t they just…” you should remember Gell-Mann’s newspaper experience and the Dunning-Kruger effect, and consider if perhaps you just don’t have the expertise to spot the problems with your idea.

Another example, from an area of personal expertise: in anything involving code, new updates bring new bugs. The frequent response is “Why don’t they just test it properly, and make sure all the bugs are fixed before they put out the update?” The problem with this is that, on average, it takes longer to find each incremental bug than the last (things that occur one time in ten take longer to find than those that go wrong every time; things that only happen under very particular circumstances only turn up if you test extremely large combinations of actions, etc). As such, to truly guarantee all bugs are fixed would take so long that the customer waiting for their bug-free update will have moved on long before it arrives. In the real world, ‘the perfect is the enemy of the good’, and compromises have to be made to get anything done.
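The ‘one time in ten’ point can be put in rough numbers. This is a back-of-envelope sketch, assuming (unrealistically) that test sessions are independent and that a given bug shows up in any one session with a fixed probability `p`. The function name `sessions_needed` and the 95% confidence target are invented for illustration.

```python
import math


def sessions_needed(p, confidence=0.95):
    """Smallest number of independent test sessions needed to see a bug
    at least once with the given confidence, if the bug appears in any
    single session with probability p (an assumed, simplified model).
    """
    if not 0 < p <= 1:
        raise ValueError("p must be in the range (0, 1]")
    if p == 1:
        return 1  # a bug that happens every time is found immediately
    # Chance of missing the bug n times in a row is (1 - p) ** n;
    # solve (1 - p) ** n <= 1 - confidence for n.
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))


if __name__ == "__main__":
    for p in (1.0, 0.1, 0.01, 0.001):
        print(f"bug rate {p}: {sessions_needed(p)} sessions")
```

Under those assumptions, a bug that appears one time in ten needs about 29 sessions to be caught with 95% confidence, and a one-in-a-hundred bug needs about 299. Each rarer bug costs roughly ten times more testing than the last, and real bugs that depend on particular combinations of actions are likely to be worse still.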

Follow-ups like “Why don’t they just hire more/better testers?” and “Why don’t they let players/customers test it first?” run aground in a very similar way. (Don’t forget though, there might actually be reasonable ways to improve things a bit! Customers are great at identifying problems, but it’s on you to figure out the solutions.)

So we see why professional efforts aren’t improved as simply as consumers often assume. I think there are four other reasons consumers can be disappointed by updates: expectation failures, familiarity contempt, confirmation bias, and in the longer term, simply ageing. Here’s how those break down:

Expectation failure
A combination of factors tends to make us inordinately sensitive to our expectations not being met. For example, I use the internet on my train commute, and one time the train went into a tunnel, so the internet cut off, and I immediately felt annoyed. I realised this was a ridiculous response: I’ve made the journey hundreds of times, I’ve literally made a spreadsheet to identify where in the journey internet access drops. But in that brief moment, my expectation of continued internet access failed, and therefore in that moment I was annoyed.

In creative media this seems most obvious in movies, where the most significant factor in how much someone enjoyed a film seems to lie less in its objective qualities than in how it compared to their expectations.

As noted above, a follow-up thing must be at least a little different to the original thing. Expectations are based on the original thing, so some expectations can’t be met – so some disappointment is inevitable.

Familiarity Contempt

‘Familiarity breeds contempt’ sums this aspect up well.

The IMDb ‘goofs’ section for 1977’s Star Wars Episode 4 is one of the longest of any film. But this is not because it was a terribly made film. It’s because so many people love it so much, it has received orders of magnitude more scrutiny than most other films.

I think this applies to the LEGO example, and especially to successful mobile games. These superfans are so close to the material they see every flaw. Something that seems perfectly functional to a casual player could be perceived by the superfan as being riddled with bugs.

So this is another effect: the people that engage the most with a thing will also be the most knowledgeable about its flaws.

In a fan screening, the completely missable moment this stormtrooper bumps his head generates a wave of laughter, as everyone knows where to look for it! It’s great.

Confirmation bias and performative criticism
Reality is complex and nuanced, so I think we make it more digestible by applying confirmation bias. If there’s something/someone we think is good, we are more likely to overlook or discount their flaws; if we judge something to be bad, every fresh piece of evidence is another chance to spot something wrong with it. It takes active, conscious effort to try to maintain a balanced view.

As a thought experiment, when was the last time you changed your mind about something? I find this to be alarmingly rare in myself. How likely is it that my immediate judgement of something was wrong and should have been corrected, once further evidence emerged? If I’m very optimistic, maybe not that often – but certainly much more than actually seems to happen.

I think this applies to many artistic endeavours. If for any reason your judgement on the creator of something has soured, you are likely to apply confirmation bias to their subsequent works and seek out their flaws to confirm your belief.

With art taking many forms, this produces an interesting corollary: the sooner you experience a follow-up work to something, the more likely you are to apply confirmation bias and continue to enjoy it – or hate it. If you are enjoying a TV series, I think a new episode is more likely to benefit from positive confirmation bias than a new series, which in turn is much more likely to maintain positive bias than a revival of the series many years or decades later.

Let the hate flow through you!

Related to this, I noticed that many criticisms of The Last Jedi in particular were bad-faith interpretations of plot, applying a level of scrutiny no screenplay could survive – an example of negative confirmation bias. It seems as if there are incentives to enact a kind of performative criticism once you flip into negative confirmation bias. So as an experiment, I wondered what it would be like to apply this to that sacred text, Star Wars Episode IV: A New Hope (as it was called in the re-release; just ‘Star Wars’ in the original release, which is the one post-release change that interestingly escapes criticism).

Please put on your flame-proof goggles and hold your nose as I have at it.

There’s so much wrong with this film it’s hard to know where to start, but let’s just go from the beginning.

Darth Vader is a terrible villain and an idiot. His “big entrance” is to come in after a bunch of cannon fodder have already done all the fighting, and then he does some kind of useless interrogation on someone that’s already been subdued. More importantly, he’s chased this ship to Tatooine, where the rebels presumably have a contact, and when the Death Star plans are obviously sent down to the planet in an escape ship, what does he do, as the guy with the ability to sense things with the Force? Just send down a bunch of useless stormtroopers to walk around asking if anyone has seen any droids lately?! What exactly is Vader doing while that’s going on – did he have to leave in a hurry to get to the big exposition meeting in the Death Star?!

And then later on, the Falcon, the very ship that escaped Tatooine – obviously with the plans –  gets caught by the Death Star; Vader is right there standing in front of it, he senses ‘something’… and then walks off?! Oh it’s fine, we’ll just send in the guys with the scanning machine instead! But wait, what’s with this scanning machine you have to physically take into a ship to operate? At the start when the droids take the escape pod, the Imperials scan it remotely for life forms, so why can’t they do that in the Death Star?

Okay, how about the heroes. Well, Luke is the most pathetic magical orphan character ever conceived. There’s literally nothing likeable about him. All we know is he apparently wants to get off the planet to… join the academy (what, the Imperial academy? What kind of ideal is this?!), but all he does is slouch around and whine about his chores. His step-parents are then apparently roasted alive (weird tonal imbalance, BTW) and he’s sad for about two seconds. People talk about how he’s a great pilot, but the first time we see him in a vehicle he’s in a land speeder keeping an eye on the scanner – while C-3PO drives?! Whatever happened to “Show, don’t tell”?

Oh, but he’s magic! Obi-Wan teaches him to ‘stretch out with his feelings’ and suddenly he can hit a target no computer can hit, in a spaceship he’s never flown before! And also, by the way, despite never having fired a gun, he’s actually a crack shot, able to shoot out door controls from across a room, and out-shoot trained soldiers in multiple encounters! Also: he’s given an incredible weapon in the form of a light sabre… and then literally never uses it. Has Lucas never heard of Chekhov’s Gun? Why didn’t Luke use his light sabre to escape the trash compactor?

Oh, but the film has such a great back-story, right? Obi-Wan Kenobi has been apparently waiting for Luke to grow up so he can give him his father’s light sabre, and train him in the force. But, er, what was he waiting for exactly? Could have started a little earlier maybe? If Darth Vader had actually come down to sort things out Luke would have been killed before they even met! As it is, Obi-Wan literally gets in about 2 minutes of training before dying! (Another tragic death which Luke shrugs off in less than a minute, before running off to man the ship’s gun-turrets, which, by the way, is yet another thing he’s never done before that he’s apparently great at).

Alright, how about the Death Star. This whole thing is one giant plot hole. So apparently you can fly around the galaxy at faster-than-light speed in a station the size of a small moon? Why even build a hugely expensive space station then? Just stick a light-speed engine on a moon and fly it straight into any planet you don’t like! Oh, but I guess if that’s possible maybe the planet could just fly out of the way with their own giant engine!?

And how about that Death Star security? Literally any random droid can unlock doors and operate machinery from any random access port? Except tractor beams, which can only be disabled from a lever on a vertiginous ledge for some reason? And doing any of this doesn’t notify anyone anywhere apparently? And there’s so little CCTV that a bunch of random idiots can run around and you don’t even know where they are? They literally escape a dead end by jumping through a waste pipe and nobody can figure out to just throw a grenade down there?!

So probably the most sensible character is Princess Leia. She spends the whole time being rude to everyone she meets, but she’s at least smart enough to figure out they are being tracked when they leave the Death Star – but what does she do about it? Try to find and remove the tracker, or maybe go to another system and switch ships? No, let’s just go straight to home base, it’s fine, we’ll have at least half an hour for our techs to figure out a weakness in this giant planet-destroying space station that can be exploited by, er, about 30 small ships! I’m sure we can do that before it blows us all up. Wow! Good thing the plot wants this to succeed or this rebellion would be over!

And then finally, the big climax: the trench run. First, what even is this trench? And the secret weakness – literally one torpedo in the wrong place on the outside of the station blows the whole thing up? What kind of design is this exactly?

Given this lucky gift, what do the rebels do? Take it in turns to fly really far away from the exhaust port, and then spend ages flying along this trench to reach it, so the enemy can take them out one by one?! If you have the element of surprise, why not go straight to the exhaust port? Or why not all go at once? And even if there was some reason for the long run-up, the day is apparently saved when the Millennium Falcon comes in at the last second and shoots the bad guys at the end of the trench – er, literally any of the other X-Wings could have done that on any of the previous runs?

And…. breathe.

So, what just happened there? Wasn’t that far too long? Why yes, yes it was. Writing it was easy, fun, and made me feel smart, so I wanted to keep going. I can actually now understand why someone might read/write things like this over a prolonged period of time, rather than positively engaging with something they enjoyed. If it seems hard to imagine, I recommend giving it a go with something you know well yourself!

It’s not them, it’s you: fandom vs. ageing

On a time scale of multiple years or even decades, a newly significant effect enters: the viewer/consumer/player themselves has significantly changed. I suspect this is where Star Wars suffers the most. The main saga films are (I think) most enjoyed by children; when a new trilogy comes out 16 years after the last, those children have grown up!

I remember many Star Wars fans who were disappointed by the prequel trilogy nonetheless really enjoying the 2003 animated TV series. This had a lot of over-the-top action that would never stand scrutiny in live-action form, but I suspect the animated format bypasses a lot of the adult reality-check apparatus, and allowed these folk to be childishly delighted in the way they originally were.

I particularly remember Mace Windu taking down an army of battle droids in a way that I am confident would not have been received positively in live-action form.

This became particularly evident with Disney’s more recent Star Wars sequel trilogy. I’ve seen many young fans citing the prequel-trilogy as a superior era (echoing – yet reversing – the response to the prequels at the time, when they were reviled by fans of the original trilogy). This is also evident from the comments endorsing the Scene 38 Reimagined video – a fan video replacing the somewhat feeble 1977 Obi-Wan vs Vader lightsaber battle with something more like the prequel or sequel trilogies (and which would be utterly out of place in Episode 4).

So there are many reasons that account for strong criticism from fans, but don’t get the wrong idea: that should never be taken as an excuse to disregard all criticism! Criticism from consumers is often valid and useful for companies to heed (especially that familiarity/mistakes one) – you just have to take these effects into account.

3. Why do people remain fans of things they seemingly hate?

See what I did there

So, companies disappoint people because money can’t buy quality in artistic areas; and also because you can’t please all the people all of the time but companies want to continue and grow. Fans will be disappointed by new works due to expectation failure, confirmation bias (earlier disappointment drives fresh disappointment), familiarity contempt, and ageing; their analysis of the flaws may well be completely valid and useful feedback, or may be flawed due to the Dunning-Kruger effect.

The obvious conclusion is that people would simply stop consuming things they hate and find something new. But evidently, in many cases, that doesn’t happen. Why?

First, I think it’s a clue that everything affected by Paradoxic Fandom is in the arts or services; products don’t seem to have the same issue.

My friend John Broughton referred me to the 1970 book “Exit, Voice and Loyalty” by Albert Hirschman. To paraphrase significantly, “exit” means a customer stops using the service, and “voice” is what happens when “exit” is not possible: the customer voices their feedback on what they don’t like in the hope it will improve. I think this explains why products get off the hook – exit is fairly easy, you just buy something else.

In creative products, there is no easy exit. If you don’t like the sequel to a game or film you love, there is no alternative version you can switch to. In services, switching is either impossible (you can’t take everyone you interact with on a social network with you) or has a high enough cost that it’s better to stay. In the games-as-a-service model, it’s amplified further: when an online game updates, there’s no way to carry on playing the old version you liked better. To keep playing you have to accept the changes.

A fascinating exception that proves the rule: Runescape found a way to work around this. They released ‘Old School’ Runescape, recreating the older version that most fans first fell in love with – while continuing to develop the newer version. They now continue to develop the ‘Old School’ version, but all changes in it must pass a majority vote by the players.

So if you love a thing, and it changes in a way you don’t like, and you can’t switch… well of course you’ll be vocal about it! This is perfectly reasonable!

However, as “Why don’t they just…” is almost always unfounded, and as cultural products must appeal to more than just one person, very often that feedback won’t (or even can’t) be heeded. At this point, things can get pretty rancorous.

4. Why does criticism dominate the discourse?

From Paradoxic Fandom to Toxic Fandom

SamSykesSwears summed up the stages of toxic fandom (I think referencing the pattern of abusive relationships) in this tweet:

  1. I love this
  2. I own this
  3. I can control this
  4. I can’t control this
  5. I hate this
  6. I must destroy this

For example, the changes George Lucas has made with each edition of the original Star Wars trilogy particularly provoke “I can’t control this”. The clue to the irrationality here is how in many reviews, literally every single change is reviled. It seems improbable that literally every change the original creator made to their creation before it came out was good, and every one made after was bad. This looks far more like a near-religious adherence to some sort of holy text.

I think this was insightfully extended in a response from EricVBailey:

  • I can’t destroy this
  • I am even more mad now
  • I can harass those who are part of it though
  • I found a whole community of people doing this
  • I have found my validation
  • I love this


Self-selection and feedback loops

Even given all the above, it seems odd that a casual glance at much online discourse tends towards negativity, and especially the more toxic end of it. I think two things are at work here.

One is self-selection. This is easily seen in Amazon reviews of products: the majority of reviews seem to be people who only just got it (so have nothing useful to add), or who have had some terrible problem. This is because both of those moments are cues to leave a review. Using something and having it work just fine does not prompt you to go write a review. Similarly, playing a mobile game or watching a movie and simply enjoying it does not motivate you to review / talk about it online as much as hating it does.

My armored walker was destroyed by primitive weapons… would give 0 stars if I could

So self-selection skews what people tend to write about things.
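A quick bit of arithmetic shows how strong this skew can be. The sketch below is hypothetical throughout: the function name `negative_share_of_reviews` and the example rates (10% of users unhappy, 30% of unhappy users posting a review, 2% of happy users posting one) are made-up numbers chosen only to illustrate the effect.

```python
def negative_share_of_reviews(unhappy_fraction, p_review_unhappy, p_review_happy):
    """Fraction of all posted reviews that are negative, given what share
    of users is unhappy and how likely each group is to post a review.
    All inputs are assumed values, not measured data.
    """
    negative = unhappy_fraction * p_review_unhappy
    positive = (1 - unhappy_fraction) * p_review_happy
    return negative / (negative + positive)


if __name__ == "__main__":
    share = negative_share_of_reviews(0.10, 0.30, 0.02)
    print(f"{share:.1%} of reviews are negative")
```

With those invented rates, the unhappy 10% of users write 62.5% of the reviews. Anyone judging the product by its reviews alone would assume most people hate it.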

The other factor is feedback loops.

There are a few feedback loops online that end up fomenting more toxic discourse. One that has been well-covered (I thought particularly well by Tom Scott’s Royal Institution lecture) is that algorithms optimising for people to spend time on a platform will tend to find success by showing people more extreme and click-provoking content, which is often negative. So YouTube will naturally take you from “10 Things You Missed In The Last Jedi” to “27 Last Jedi moments that made no sense” to “237 reasons I hate The Last Jedi and You Should Too”.

There’s also a very natural social feedback loop that can work in tandem with this. I think for many people, it’s quite scary to make comments you think others will disagree with. If you found problems with something everyone loves, you’re less likely to shout about it; but if drama-optimising algorithms are showing more people that agree with you, that will embolden you to speak out more (see also: politicians making racist remarks emboldening racists).

My colleague Chris Hohbein saw this play out dramatically in the No Man’s Sky community. After that game’s launch, the community was incredibly toxic (mostly due to the game failing to meet their expectations); as updates to the game improved things, the hate diminished, and positive discussion flipped over to dominate the discourse instead.

Perhaps this doesn’t sound like that strong an effect. As a person who watches the first midnight showings of Star Wars films and feels moved to make comment on them in public right after, I can testify that the feeling of not knowing which way everyone else is going to go does make it feel a bit scary! That said, it’s Star Wars, I really should have figured out the pattern by now…

What I told you was true… from a certain point of view
Paradoxic Fandom certainly doesn’t apply to everything, and some of the above noted effects operate in reverse – for example, as in No Man’s Sky, confirmation bias can be positive; positive online conversation can beget more of the same.

But there are some particularly interesting counter-examples. For example, from that joke at the start, why is Star Trek different? And what about the ever-expanding Marvel Cinematic Universe? Are they doing things right in a way that others don’t?

In the case of Marvel, I’ve seen the argument that they play it “too safe”. In most Star Wars films, someone significant dies or loses a limb; Marvel films are frequently a battle to retain the status quo, and good guys and bad guys alike usually survive. So they entertain in the moment, but don’t do anything that might upset anyone. Over the last 10 years and 23 films, there have been a handful of notable exceptions to that, which seemed acceptable. Do they just have the right frequency of ‘not rocking the boat’?

For Star Trek, each TV series will also tend to avoid disrupting its own status quo, but it’s unclear to me why subsequent series in different settings don’t seem to get as much hate as other franchise follow-ups*. This one is a mystery to me. Maybe I should become a Star Trek fan.

*Edit: I am reliably informed that I’m guilty of my own familiarity / contempt bias, and that actually Paradoxic Fandom is alive and well among Star Trek fandom. Prior to writing this, I had taken a brief look over various YouTube trailer comments, Reddits and forums to compare the Star Trek vs. Star Wars communities, and this seemed to confirm my hunch (or bias?), but this was hardly a rigorous study, and multiple people have now informed me of my error!

Conclusion: What do we do about it?

As a company / content producer

The biggest impacts on us as consumers are when things are better or worse than we expected. To the extent that you can, you want to exceed expectations. In practice this is tricky – if you hold your best bits or biggest surprises back from the marketing, perhaps fewer people will want the product. If you announce updates to an online game in advance and always under-play things, fans will spot the pattern and then be disappointed any time you fail to overdeliver. If you hold everything back, fans will worry nothing is happening at all.

To any extent possible, you can also be honest about things – educate consumers about the challenges you face. If you’re sure you’re right and they’re wrong, can you explain why clearly? Doing that – without making it an attack – can help both producer and consumer get closer to the more useful aspects of feedback.

More generally, I think it’s helpful to be aware of the above effects. Feedback from fans is extremely useful to help learn and improve; discounting it entirely is unwise. Knowing what drives it in different forms can help you unpick what’s most useful.

As a consumer / fan

I think a couple of Star Wars analogies help a lot here. As an engaged fan, do you choose the light side, or the dark side?

Luke: Is the dark side stronger?

Yoda: No. Quicker, easier, more seductive. […] A Jedi uses the Force for knowledge and defense. Never for attack.

You can use your deep knowledge of the material to engage in performative criticism (attack), or for knowledge and defence: what good-faith analysis can explain plot-holes? What practical aspects of production led to these compromises being made? What can we learn more generally about the craft from these decisions?

Luke: What’s in there?

Yoda: Only what you take with you

If you go into a new experience ready to attack, you will find material to attack. Going into anything new with an open mind, searching for elements to admire as well as those that justify criticism, is a more enlightening and fun way to go about things.

As an individual trying to find entertainment, I do think this is how you win – not by focussing on what you hate, but what you love. I distinctly recall spending the first 15 minutes watching The Force Awakens alternating between thinking “that’s not very Star Wars” and “that’s just a rip off of this earlier bit of Star Wars”, before realising no film could ever walk that line. With that in mind, I was able to enjoy the rest (of that film and the sequel trilogy as a whole) – while still finding a lot to criticise too! I find it highly rewarding to go in looking to enjoy everything I can, while also gaining enjoyment from engaging my more critical faculties later on.

There was just one more thing…

There’s a very important generalisation of Paradoxic Fandom. Consider these more generic forms of the regular mobile game complaints:

  • Every change makes things worse
  • This group/activity is failing and will die out
  • Those making the changes are out of touch with participants
  • Those making the changes don’t care about ordinary people

Does that sound familiar?

This is almost a textbook description of Populism in politics! The incumbent government is portrayed as out of touch with the people, the country is said to be failing, and the government is accused of caring only about themselves/the elite/the rich.

The country you live in has that same crucial quality we saw from “Exit, Voice and Loyalty”: it’s something you can’t easily change, and all the above listed effects can apply to politics just as they can to a game or multimedia franchise.

This means something! I just need a few more years to think about that.

Tim Mannveille tweets as @metatim, and sometimes writes about Star Wars on NothingAboutPotatoes

Categories
game

Empirically the best radio stations in the world.

If you just want the answer to that link-bait headline question, scroll down to the picture of the map with every country scratched off. Of course, methodology is critical, and you should really conduct your own study, so stick around for the details of quite how to do that.

But perhaps you’re not convinced of the need to do that, so first, as is traditional before the introduction of a ridiculous solution, let me sell you the problem.

If you listen to music in an office, the advent of Spotify is a disaster.

Superficially, of course, Spotify is clearly better than a regular music radio station, in that it has no ads, and most importantly you can put on exactly what you want to hear.

Oh yes, listening to what you actually want to hear – clearly the best of all things to listen to. Or is it???

There are two big problems with this in an office setting.

One is the burden of choice. You want perhaps 8 hours of music in a day, 5 days a week, without much repetition. That’s actually a lot of work to put together, and when trying this we perhaps better understand why people get paid to make radio happen.

The second problem is consensus. When there were five of us in the start-up I worked for, Stubble & Glasses, we discovered the only artist we all liked was Paul Simon. By the end of the first day, we knew we were going to need another office music strategy. And then we hired a bunch more people – some of whom, weirdly, didn’t like Paul Simon.

The obvious solution is also terrible

With radio, you accept that you’re not going to enjoy everything. So perhaps your office needs a communal Spotify playlist, which everyone can contribute to, and which also solves the tremendous burden of choice problem, since you’ve divided up that labour. And you like some of what you get to hear, and some you have to tolerate, but it all sort of evens out in the end.

Except that it doesn’t.

Because when you pick songs for a communal list, your individual incentives are not aligned with the group incentives. As a diverse group, your collective enjoyment will be higher if people pick the blandest, most widely-appealing artists they like. But as an individual, you want to listen to music that really speaks to you (which is usually not bland, and often not widely-appealing), and you also don’t want to “waste” your part of the list by choosing something that somebody else might add anyway. So you’ll be tempted to add the least bland, most personal, and least universally popular music you know.

And so will everyone else.

Excerpt from the office Skype chat while trying out a shared Spotify playlist strategy

And you’ll hate it, and eventually each other.

What about Last.fm or Pandora, then?

Also terrible in this context. Both services are brilliantly designed for an individual, but in much the same way as the above, what’s good for one is usually bad for the group.

Unless you know how to make large physical love/ban buttons for the whole office to enjoy, but even then the service is going to get very confused if you have any diversity in music tastes.

But radio is so 19th Century!

I know, this seems insane. With all the developments in technology and music services, there just has to be a way that the internet makes the concept of radio better.

Well, there is. And I’m going to stop phrasing minor problems in a melodramatic fashion and using subheads in a faux conversational style so I can tell you about it.

What if I told you that you can listen to radio from anywhere in the world

You would probably shrug, because you already knew that. TuneIn radio and others have facilitated this for a while now. And with 70,000 radio stations, it doesn’t seem as if you’re making the burden-of-choice problem any better. How are you supposed to find anything good?

Here’s how.

Radio Station World Tour: The Plan

  • Listen to at least one radio station from every country in the world, using a scratch-off map
  • Keep notes on all the stations you listen to
  • Now simply rotate through your favourite discoveries!

Radio Station World Tour: The Details

I will immediately confess that this solution is actually insane. But it’s insane in a good way. Because as soon as you start, you run into definition problems. Delicious, arbitrary definition problems.

When we tried this at Stubble & Glasses, here’s how we solved them.

  • How long counts as a listen? 4 hours of listening per country. Long enough for you to get a good feel for the station, short enough that you can survive even the least palatable options.
  • What counts as a country’s radio station? If it’s listed as being in the country on TuneIn radio, it counts, especially if it’s wrong. (We’re not convinced that A-net radio really broadcasts from Antarctica, for example. But it definitely counts.)
  • What counts as a country on the scratch-off map? Everything that is named on the scratch-off map, especially if it’s wrong. (South Sudan isn’t marked. Obscure islands in the Arctic are named, and therefore count.)

Inevitably, definition problems resist even these seemingly simplistic answers, so if you really want to do this, check the Nitty Gritty section below.

What actually happens if you attempt this

There are 193 members of the United Nations. If you used our 4-hour minimum listen rule, you could in theory get through 10 per working week, and in this way you could be done inside of 4.5 months.

In practice, if you use the map as your guide to what counts as a country, and if you split up large countries on the basis that that would be too easy (see Nitty Gritty below), and if you accept that I’m massively overblowing this by saying it solves office radio and you only manage to listen to 3 world radio stations a week (as we aimed to do), you’re now up to 228 listens and a 1.5 year project.
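For anyone who wants to check those estimates, here is a rough sketch of the arithmetic in Python. The 8-hour day, 5-day week and 4-hour listen come from the figures above; the 52-week year and the function name are my own assumptions for illustration.

```python
HOURS_PER_DAY = 8      # office listening hours per day, as assumed earlier
DAYS_PER_WEEK = 5
HOURS_PER_LISTEN = 4   # our minimum listen rule
WEEKS_PER_YEAR = 52    # assumption: no allowance for holidays


def weeks_needed(listens, listens_per_week):
    """Return how many working weeks a tour of `listens` stations takes."""
    return listens / listens_per_week


# Theoretical maximum pace: fill every office hour with world radio.
max_listens_per_week = (HOURS_PER_DAY * DAYS_PER_WEEK) // HOURS_PER_LISTEN

# 193 UN members at the maximum pace.
ideal_weeks = weeks_needed(193, max_listens_per_week)
ideal_months = ideal_weeks * 12 / WEEKS_PER_YEAR

# 228 listens (the map-based count) at the 3-per-week pace we aimed for.
real_weeks = weeks_needed(228, 3)
real_years = real_weeks / WEEKS_PER_YEAR

print(f"Ideal: {ideal_weeks:.1f} weeks, about {ideal_months:.1f} months")
print(f"Realistic: {real_weeks:.0f} weeks, about {real_years:.1f} years")
```

Changing the pace from 10 listens a week to 3 is what does most of the damage; the extra 35 listens from the map matter much less.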

Oh, but when you get towards the end, you’re going to discover all the problems with your seemingly simple and elegant definitions, as you search for radio stations for Arctic islands with populations under 1,000, and okay, you can come up with something to get around that (see Nitty Gritty), but this is going to slow you down.

So much, in fact, that the business folded when we were 1.6 years into the project and 93% done. So I took the map with me and finished it mostly alone.

Our listening progress. Note the inflection point around March 2013 when it gets much harder.

The Nitty Gritty 1 – Peculiarities of the Luckies Scratch Map

The Luckies Scratch Map we used appears to use the Gall stereographic projection of the globe. This projection attempts to compromise between accurately representing the relative areas of the countries and accurately representing their shape, which sounds like a good idea, except of course that this means it achieves neither. As usual, the problem of area is more noticeable, and although it’s not as extreme as the Mercator projection, the areas of territories towards the North and South poles are significantly overstated.

Perhaps this is okay if you’re using the map to more conventionally record your travels, as getting to Antarctica or some of the isles of the far North is a pretty big deal, and the bigness of that deal is then represented by the amount of scratching you get to do (especially for Antarctica). But in the case of a radio tour, it’s unsettling.

There are also some covered rectangles below the scratch map which you are invited to remove if you’re visiting certain specific countries. It turns out these reveal a range of oddly-chosen pieces of trivia, only those for Iceland and Fiji being particularly useful if you were planning to go there.

The final rectangle is to be scratched off if you are visiting “Luckies Island”, which you may notice does not exist, but is nonetheless shown on the map in the Mediterranean. The inclusion of this kind of Mountweazel may be intended as a copyright trap of some kind, and was quite fun to discover, but it will be highly unnerving for pedants or completists.

In our case, we resolved that we could scratch off Luckies Island only once we had completed the rest of the known world. We could then finally scratch off the mysterious rectangle under “Visiting Luckies Island?”, which turned out to be a bit anticlimactic.

Of course, with all that said, it’s still an excellent product that made this entire enterprise that much more visceral and compelling, so you should probably click on this affiliate link and go buy it.

The Nitty Gritty 2 – Large countries

Some countries are much larger than others. (Actually some countries are 38.8 million times bigger than others, if we happen to choose the largest and smallest). So listening to just 4 hours of radio and then scratching off the entirety of the US or Russia seems disproportionate.

So we made an additional rule: if a country covers an area larger than one latitude/longitude grid area, then one must conduct one listen for each longitudinal band it covers, ideally to a station that originates from within that very band.

This was a mistake. Don’t do this. The challenge is ridiculous enough already, and if you worry about land masses you should really also worry about population density, and possibly representativeness-of-music, and the whole thing gets rapidly out of hand.

I mean, Antarctica, right. According to TuneIn, the only station (as I mentioned above) is A-net, which may or may not really be in Antarctica, and in practice just plays a selection of lovely acoustic guitar folk and suchlike on a loop less than 4 hours long. Now, although it has what sounds like an acoustic version of the Inspector Gadget theme (even better than this version; actually ‘Topsy’, a guitar duet by Duck Baker and Jamie Findlay, which I can’t find streamable on the internet but here’s another version), and acknowledging that the general mood of the station is ideal for crying to when your business is closing and you’re the last person left in the office still doing this, we have to face the fact that Antarctica does cover all 24 longitudinal regions, which according to the above rule means 24 listens of 4 hours each, which is 96 hours or 12 full business days of listening. Which is just silly. (We did it anyway, because you’ve got to stick with the rules you create, or where are you really).
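The cost of that rule is easy to sketch. The snippet below assumes the grid uses 15-degree longitude bands (which is what 24 bands around the globe implies), and the function names are made up for illustration rather than anything we actually used.

```python
import math

BAND_WIDTH_DEG = 15    # assumption: 360 degrees / 24 bands
HOURS_PER_LISTEN = 4
HOURS_PER_DAY = 8


def bands_covered(west_deg, east_deg):
    """Count longitude bands touched by a territory spanning west..east.

    Longitudes are in degrees from -180 to 180, with west < east, so
    territories crossing the antimeridian are not handled in this sketch.
    """
    first = math.floor((west_deg + 180) / BAND_WIDTH_DEG)
    # Nudge the eastern edge inward so an extent ending exactly on a
    # band boundary does not count the next band too.
    last = math.floor((east_deg + 180 - 1e-9) / BAND_WIDTH_DEG)
    return last - first + 1


def listening_cost(west_deg, east_deg):
    """Return (listens, hours, business days) under the one-per-band rule."""
    listens = bands_covered(west_deg, east_deg)
    hours = listens * HOURS_PER_LISTEN
    return listens, hours, hours / HOURS_PER_DAY


# Antarctica wraps the whole globe.
print(listening_cost(-180, 180))
```

A territory that fits inside a single band still costs one listen, so the rule only ever adds work.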

The Nitty Gritty 3 – Countries with no radio available

The next problem is countries/regions with no detectable radio station, or in some cases any human population at all.

Rarely, there may exist some distinctive music that originates from the area in question, or at least some musician, so listening to that seems very reasonable. But having resolved to conduct a world tour on the basis of a scratch-off map, it doesn’t seem satisfactory to leave anywhere unscratched, even if there is no reasonable connection to any music. As such, we came up with the following order of preference for music selection:

  1. A TuneIn radio station classified as belonging to that country/region
  2. Otherwise, a Last.fm radio station based on the country’s name, or a musical style specifically identified with that country
  3. Failing that, music from a specific artist from that country
  4. When that fails, just go with any music with any incredibly tenuous connection to the country, or just the name of the country, or the geographic location, or maybe just the weather there

For that last resort, one can just search Spotify using various related keywords and in this way construct a 4-hour playlist of tenuously related music, and so ultimately justify scratching off every part of the world map.
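That order of preference is essentially a fallback chain. As a rough sketch (the function, the source labels and the example catalogue are hypothetical placeholders, and nothing here actually calls TuneIn, Last.fm or Spotify), it might look like this:

```python
# Sources in order of preference, mirroring the numbered list above.
PREFERENCE_ORDER = [
    "tunein_station",    # 1. TuneIn station classified under the region
    "lastfm_station",    # 2. Last.fm station for the name or local style
    "local_artist",      # 3. music from a specific artist from there
    "tenuous_playlist",  # 4. anything with a tenuous connection
]


def pick_music(region, catalogue):
    """Return (source, choice) for the best available option for a region.

    `catalogue` maps region -> {source: choice}; it is a hand-built dict
    for illustration, not data fetched from any real service.
    """
    options = catalogue.get(region, {})
    for source in PREFERENCE_ORDER:
        if options.get(source):
            return source, options[source]
    # The last resort always succeeds: build a keyword playlist by hand.
    return "tenuous_playlist", f"keyword playlist for {region}"


catalogue = {
    "France": {"tunein_station": "Fip"},
    "Antarctica": {"tunein_station": "A-net"},
    "Some Arctic Island": {},  # hypothetical region with nothing listed
}

print(pick_music("France", catalogue))
print(pick_music("Some Arctic Island", catalogue))
```

The point of the final fallback is that the function can never come back empty-handed, which is exactly what you need if every part of the map has to be scratched off.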

The best radio stations, empirically speaking

Having spent 2 years listening to radio stations from every country in the world, or in some cases music tenuously related to that country, I can now authoritatively list the Top 5 best* radio stations.

France – Fip (TuneIn / website / Wikipedia)
The station that started our entire tour. Tom Hensby (who you may know as one of the Three Englishmen) introduced us to this station, with its eclectic mix of genres and musical oddities, alongside a legally mandated portion of French music (also eclectic and odd), all introduced in French by presenters with fabulously sultry tones.

I was most impressed by the inclusion of this orchestral cover of an Amon Tobin track, which was not at all easy to get hold of at the time they played it:

To get an idea of their range, you might also find silly musical numbers from 50s musicals, or a lovely cover of Aquarela do Brasil.

We reasoned that if one such incredible radio station was accessible via the internet, surely others could also be found, and this was the main motivation to listen to music from every country. Fip set the bar against which all others would be compared, and a little sadly it turns out that Fip was impossible to beat, but four other stations came very close.

Turkey – Radyo Babylon (TuneIn / website)
Original note in our spreadsheet: “It’s the new Fip!”
Overview: As eclectic and consistent as Fip, but more of an emphasis on songs with lyrics and a somewhat less soothing overall effect.

Example songs: Dengue Fever – Cannibal Courtship (stick around for the theremin-driven chorus at 1’15”):

Scott Matthew – No Place Called Hell

Megapuss – Duck people

Slovenia – Mars FM 95.9 (TuneIn / website)
Original note in our spreadsheet:
“I dunno, I think I might love this station – French rap followed by some Antony Hegarty? (5 hours of listening later…) This is an incredible radio station. So incredibly varied and weird and also great. Very little (any?) talk or adverts. Praised by at least 3 people in the office. And yes, 10. I went there.”
Overview: Also eclectic and consistently good like Fip, just not quite as many jump-out-of-your-chair-what-are-we-listening-to moments of amazement.

Example songs: Lollobrigida – Sex on TV, Sex on the radio

Public Enemy – I Shall Not Be Moved

Blind Arvella Gay – You Are My Dear (along with many others, here)

Russia – Ralph Radio (TuneIn / website)
Original note in our spreadsheet: “This is actually… very good! Probably need more of a listen to be sure, but in it’s doing very well in terms of an eclectic mix with some random Russian thrown in there.”
Overview: The Russian language works for this station in much the same way as French does for Fip; unintelligible (to us), but pleasing on the ear. Also consistently good and highly varied, but a slightly stronger Western / pop influence, more likely to produce some songs we already loved.

Example songs: Fool’s Garden – Lemon Tree:

Galun – Kiberprostranstvo (trip-hop constructed out of the human voice?)

Presidents of the USA – Peaches

Mauritius – Radio Plus 88.6 (TuneIn / website)
Original note in our spreadsheet: “Super French Partytime in the office, great for a Friday”
Overview: The one stand-out radio station we enjoyed that wasn’t earnestly eclectic and wilfully obscure; a mix of French and/or Bollywood tunes (and occasionally Western) with a consistently upbeat vibe.

Example songs: Dhat Teri Ki – Gori Tere Pyaar Mein

Honey Singh – Lungi Dance

Tropical Family – Turn Me On

Honourable mention: A-net radio (TuneIn / website)
As mentioned above, this was the only station listed on TuneIn as being based in Antarctica, and while it’s quite beautiful to imagine someone out there broadcasting a lot of chilled out acoustic guitar, it doesn’t seem plausible. But you should definitely check out their super 90s website and judge for yourself.

As an excellent and highly personal example track, I heard Isaac Guillory’s “Thanksgiving Eve” on A-net and selected it for my father’s memorial service that was being held the next day:

It’s also fun to read the comments on Guillory’s “Somewhere in your heart” to hear from other A-net fans.

A-Net’s playlist isn’t that long, so you should definitely check it out (either on TuneIn or their website) when you want to remember someone special, or if you’re closing up shop on the last day of a business that was outlasted by a project to listen to radio from every country in the world.

I’d say it’s ideal for either of those occasions, and possibly more.

The slightly sad thing is that once the joy of accomplishment wears off, you realise it just looks like a normal map now.

*“Best” based on a very limited sampling of radio stations available in each country. Ratings are also entirely subjective. As is everything, really.

Part-way through, Ben Pindar wrote about the radio world tour for our company blog, which has since gone away. I’ve put up a copy of his original post here.

Epilogue [added 12th January 2014] – what do you mean “Best”?

After posting this, I’ve seen some comments that made me realise my implied position on the definition of “best” might not be quite clear.

Typically, I think people do assume that an article proclaiming the “10 best” of anything will not actually provide the objective ultimate truth of the matter. It’s unlikely that the reviewer will have sampled the full range of contenders, and it’s certain that taste is subjective and an actual “10 best” list can never be compiled in a perfectly objective way that everyone will agree with.

In the case of our radio quest, we’ve clearly gone quite a bit further than usual, in that we made a reasonable sampling of hundreds of radio stations and combined the opinions of a few people, not just one. This still doesn’t give the objective “best stations” list, for the following reasons:

  • Not all radio stations are available on TuneIn in the first place
  • We only listened to a small number of stations per country, sometimes just one
  • We only listened to 4 hours of radio, when in fact programming can vary radically by time of day and day of week
  • The fact we were (sort of) judging as a panel meant that more varied shows were more likely to meet our collective acclaim
  • As mentioned, everything is subjective anyway!

With that said, in case it’s not clear, I do think there is something very powerful in this methodology:

  • TuneIn lists the genre of radio stations you’re browsing, and we tended to avoid generic pop / Top 40 radio shows (as these were mostly very similar). This filtered out a significant portion of radio stations we can be pretty sure wouldn’t rate highly
  • We actively chose radio stations listed as either local or “varied” in genre, hoping that these would have the best chance of being stand-out interesting
  • If a radio station was dull, we would switch to another one instead where possible. This increased our odds of finding stations that were consistently great
  • We did this hundreds of times!

So while I think it’s impossible to compile a “true” list of the best radio stations, this method does produce a shortlist of stations that I expect to be much more rewarding than the average “top 5” list.

And if you don’t find it to your taste, then you can of course conduct your own global search in much the same way. Good luck!

Tim Mannveille tweets as @metatim, and likes to overcomplicate things while on holiday as well.

Categories
participant

Learning to Cheat Without Breaking the Rules, Part 3: Schooner or Later

[After a 2 year gap, this finally concludes the story begun in part 1 and part 2 …]

When I was a kid, I couldn’t lie, I was terrified of breaking rules, and I was determined to be perfectly honest and perfectly trustworthy, not realising the two are incompatible. Years passed, and then in October 2011 I played Schooner or Later, and I flagrantly cheated and betrayed a complete stranger, much to my own surprise. Perhaps even more surprising, I found that I was actually okay with this.

Right after that I started a series of posts on the subject of lying and cheating, how both are important and justified in some situations, and how games helped me learn to do those things when necessary. But I stalled on the final post, which was intended to examine the specifics of my betrayal in Schooner or Later. This is hardly satisfactory, especially considering I actually interviewed one of the main creators of the game, Josh Hadley, for the purposes of that write-up.

So, it’s time to face up to my past: this is the final part in the series, describing what happened that night, alongside commentary from Josh, which I think endorses my actions.

Of course, I may be wrong about that.

So, long ago, on the 13th October 2011, there was a Sandpit event at the National Maritime Museum, with a bunch of maritime-related games.

One of these was a variant of Perudo (called Filthy Lying Liar’s Dice, by Gareth Briggs), in which everyone rolls their dice (which in this case corresponded to parts of a pirate ship), and then plays a kind of cumulative variant of Cheat. Losing a round means you lose a die, reducing your ability to understand the state of the game. Thanks to my training in lying-related games (as detailed in part 1), I did pretty well, making it to the final showdown, albeit with a single die.

Playing the game with strangers, I noticed something I hadn’t seen before when playing similar things with friends: despite everyone bluffing / lying and generally trying to harm everyone else’s chances of winning, a sort of camaraderie developed, and by the end we felt somehow unified. If you administered some sort of trust test, I imagine we would all trust one another much more than we had at the start, despite playing a game that hinged on misleading one another. Perhaps this is related to the fact that you can only trust someone with a secret if you know they can lie if asked about it.

Anyway, at the end of the evening, most of the players (about 35 of us) were funnelled into the final game: Schooner or Later (SoL) by The Haberdashery, in particular by Josh Hadley and Casey Middaugh. In part 2 of this series, I argued that moderated games played for fun were the best type to let players explore cheating. Various aspects of the game design and the running of SoL meant an entire spectrum of cheating was available, and even implicitly encouraged. By my count, we saw 7 levels of cheating that night, each more flagrant than the last – and when I asked him about it later, Josh said he was happy with all of these bar one.

Schooner or Later – the basics

At a base level, SoL is a trading game. Players travel between three ports – England, India, China – buying and selling commodities and trying to turn the greatest profit within the set time limit. But that’s just the framework: the actual execution of the game encouraged cheating of every type.

Josh on cheating

When I asked Josh his thoughts on cheating, he answered immediately: “I try to make games where cheating is an interesting and viable approach – it’s something I consciously try to involve in every game that I game.”

He dates this back to when his family used to play Monopoly (which he characterised as “the worst game ever inflicted on humanity”), where the only way they could make it tolerable was to introduce a house rule: “anything is permitted so long as you don’t get caught out” – just like the “Extreme Cheat” I mentioned in part 1.

(Incidentally, the history of Monopoly’s development is quite illuminating).

Even against this backdrop, Josh thought SoL was special: “This game encouraged cheating, almost as a design principle.”

He also thought the theme was appropriate, and was what encouraged him to add in ‘cheating’ elements: as part of the East India Trading Company, “everyone was a gigantic bastard”.

I think that’s very interesting. How do you make a game in which everyone is supposed to be some kind of oversized bastard, given that it will be played by players who may not personally be any kind of bastard at all? Here’s how.

SoL Cheating, Level 1: Opium smuggling

The primary twist on basic trading in SoL is a mechanic that feels illicit, but which you are expected to use (like lying in Werewolf). Players can choose to ‘smuggle’ opium, manifested in-game as balloons, from India to China. Two moderators took the role of coast guards trying to prevent the opium trade by popping the balloons of players they could catch.

The layout was excellent for this: there were three smuggling routes to choose from, and two coast guards, so you could usually find a way.

Because of the sneaking necessary, and the compelling stand-in for real violence of balloon popping, choosing to smuggle had the feel of being illicit – but the rewards for success were so substantial that it was hard to resist. Besides, when playing for fun, you want to try everything out. Perhaps at the start of my personal journey – before I had played Extreme Cheat – I might have tried to go ‘legit’ and trade only the regular goods. But as a result I wouldn’t have stood a chance of winning, and I certainly wouldn’t have had as much fun.

Josh noted that there is supposed to be a tension between the opium and standard economic game, with an emergent strategy on when to switch – but he was also curious: as a player, do you think about what you’re doing, in an ethical sense?

With opium trading explicitly written into the rules and briefed, I’m not sure many players spent that much time worrying about this. However, that ambiguity is certainly present in the many further layers of cheating that the game’s design encouraged. For example…

SoL Cheating, Level 2: Cotton dumping

The second twist was somewhat surreal. England wanted to trade cotton (represented by cotton wool balls), and any time you made a trade of goods there, they would insist that you take a handful of cotton wool and try to sell it to India or China. Furthermore, any cotton you were left with at the end of the game would count against your score.

But India and China didn’t want the cotton – they certainly wouldn’t give you money for it, at best they might grudgingly accept it. The cotton wool balls also took up room in your hands, making it harder to get on with the real business of trading tea and pepper, or smuggling opium.

This means you’re incentivised to “lose” your cotton en route – but the rules don’t say anything about that. Can you just drop it? Do you have to hide it? Give it away to non-players? Having a quirk like this built into the rules is an effective way of making the players aware that they should be creative.

Josh confessed that this broke the rules for a Sandpit game, in that the game design encouraged littering (“I’m fairly sure there’s still wads of cotton wool bunched up behind exhibits in the National Maritime Museum”), and while they paid lip service towards telling people not to do it, they knew it was inevitable.

As well as meeting his goal of encouraging ‘cheating’, Josh also noted that this is actually a historic truth – the majority of the captains did pitch their cotton overboard because they had the same incentives as us! (I’ve not found a good citation for this, sadly).

SoL Cheating, Level 3: Haggling

Haggling was not mentioned in the verbal briefing, but explicitly encouraged in the instructions. Any player not noticing this would quickly cotton on (ho ho) once they heard the haggles going on at the various ports.

What are the rules of haggling? Only what you can propose, and what the other will agree to. There was a little flexibility in the price, but it seemed you would do a lot better with a more creative approach – for example, pointing out that making it a round 100 would mean they could hand over a single 100-worth token instead of counting out the 90 in the asking price. In this way, the moderators (as the operators of the ports) were actively giving the players cues on how to play, and proving that the game was flexible.

SoL Cheating, Level 4: Co-operation

The fourth twist (in my somewhat arbitrary ordering) was co-operation. The results of haggling suggested that greater quantities of a good yielded bigger discounts. This created an incentive for players to work together, pooling their resources to get a bigger pay-off. Of course, where co-operation is possible, so too is defection.

Why I’m a big cheater, and that’s okay

Having enjoyed the “cheating” of dumping cotton wool (I bestowed it upon a friend who wasn’t playing), smuggling opium (surprisingly frightening but very rewarding), and haggling, I was getting the feeling that the boundaries of the game were open to question.

When I realised both I and another player at port in China had 47 gold and we would each be offered 6 tea in exchange, I proposed we pool our resources to better haggle for more. With this plan agreed spontaneously in front of the port moderator, we were offered 13 tea rather than 12. We pointed out that 14 would be much fairer as then we could divide it by two.

“Not my problem,” came the reply. “You’ll have to work it out between you.” We looked doubtful, not sure how to resolve this fairly.

“Put your hands out,” said the moderator, and I did. He put the 13 tea bags in my hands.

Then he looked me in the eye and said: “Run!”

This was the magic moment for me. When you’re playing for fun, and taking your cue from the moderators as to what’s allowed, how do you react to a direct prompt like that? How would I?

I said “I wouldn’t do that, I’m an honourable trader,” and turned to my compatriot: “Here’s 3 tea for you. Bye!” and to my own surprise, I made off with the other 10. “That’s not honourable!” my former compatriot replied, “Come back here!”

We both made port at England, at which point my former compatriot attempted to seek restitution from the authority there. Upon understanding the situation, England’s port moderator simply said “I’m sorry, this is clearly an internal matter. I can’t help. I’ll give you 95 for your 10 tea.” Another endorsement – I felt like I’d been bad (which I had), but somehow within the spirit of the game, if not within the letter of the rules.

As I wrote in part 1, I started out in life so determined to be a ‘good’ person, I wouldn’t cheat even in games where cheating was the point (like “Cheat”). The fact that I cheated another player so brazenly here really shocked me, and that’s why I started writing this series of posts: to understand what sequence of events – actually, games – brought me to this place.

But given that I also interviewed one of the key designers of the game, and that I put off writing this particular moment up for 2 years, I’m suspicious of my own motives. These posts could all be read as one big attempt to excuse my decision that day, and talking to Josh may just have been a subconscious attempt to have my actions endorsed.

Fortunately for me, that is exactly what happened.

Josh’s comment was “That’s a great story – I’m really pleased by that!” – and he reaffirmed that he had hoped players would think about the ethics of what they were doing, and question their decisions in just this sort of way.

He also particularly praised the Haberdashery crew for creating the environment that made this work: “We could only give the players freedom if we also gave the crew freedom. So long as you have a good crew – and we had an amazing crew – you have a huge amount of leniency to encourage and permit that kind of behaviour.” Crucially, he pointed out, the Haberdashery crew know how to “manipulate rules without necessarily breaking the game.”

But cheating in SoL didn’t stop there.

SoL Cheating, Level 5: Syndicates

Other players went much further. Beyond short-term co-operation, a few players actually formed syndicates: they pooled their resources, ran interference to facilitate smuggling, and declared a collective score at the end rather than an individual one.

I had the impression that the players that did this already knew (and trusted) one another outside of the game, but I didn’t feel hard done by: it felt like a logical extension to the game. Josh in particular was really pleased by this development – it’s something he had hoped would emerge, and this was the first run of the game where it actually happened (perhaps because the scale was so large, with 35 players).

SoL Cheating, Level 6: Role-play

As another clue that this was no ordinary trading game, one of the moderators running trade at China began to act as if she was succumbing to an opium addiction. I heard from Josh that a player took advantage of this by withholding opium in order to secure a ridiculous bulk discount on tea, allowing them to actually buy All the Tea in China. When the player presented this haul to England, they received what Josh described as a “frankly insulting amount of money”, but nonetheless profited from their wiliness.

Josh considered this another example of the Haberdashery moderators knowing just how much they could bend the rules: they had been specifically briefed that they were empowered to deliver whatever they thought was a good experience, so long as it didn’t break the game. In particular, he admired the way the moderator at India (Ruth) could appear out of control while actually being fully in control, and the moderator at England (Nick) could give the appearance of being inflexible while actually bending the rules as much as anyone. This extreme event briefly impeded normal trading of tea, but this was swiftly fixed.

SoL Cheating, Level 7: Stealing!

The final level of cheating was achieved by a player who flat-out stole some pepper from India when the moderators were distracted. I’m not sure Sandpit players would usually stoop to that sort of behaviour, but in the context of everything else that was going on it must have seemed reasonable.

But this was where Josh drew the line – crucially, with this type of cheating, the moderators were no longer in control of the game. When this happens, “it stops being a game, and becomes a free-for-all”. Fortunately, this didn’t generate a substantial advantage, unlike the syndicate of three players running their opium smuggling ring who ultimately “won” the game.

Epilogue(s)

One player decided to “corner the market” in cotton wool, and had been collecting it at every available opportunity, storing it in his hat.

He was duly congratulated as the “true winner” and received a round of applause. After all, he had played for fun, and he had certainly succeeded at that.

While the scores were being collected, I overheard one player lamenting how it could be a perfectly reasonable game if it weren’t for all this “cheating”, and that people should just trade according to the rules. Most unfortunately, this turned out to be the very player I had betrayed earlier. I introduced myself, and apologised, but tried to defend myself by saying “Of course, once you saw how the game could be played, I’m sure you went ahead and did similar things yourself,” to which he replied “No, I traded entirely by the rules.”

This was difficult. I felt as if I had to justify my decision somehow, but rather than try to explain the entire argument put forth in this series of posts (which I was only vaguely perceiving), I simply passed the buck: “Well, you saw what happened – I was encouraged to do it. He just shoved the tea in my hand and told me to run. At least I gave you some.” At this he grudgingly admitted that that was true. Only at this point does the role of the moderator end: when their judgements are used by players to justify their decisions.

Finally, one pair of players had to leave before the final scores were counted up – and they happened to have been playing the Liar’s Dice variant with me earlier that evening. For whatever reason, they chose to bequeath their accumulated in-game wealth to me.

I like to think that they felt they could trust me, even though I’d been proved a liar.

Of course, I may be wrong about that.

Tim Mannveille tweets as @metatim, and has previously managed to blog about a sandpit event much more quickly than this.

Categories
analysis

It Takes Two to Tango

Not a real headline: may contain traces of Photoshop. The cat is called Wadsworth.

[This post was originally published in March 2013 on the blog of Stubble & Glasses, which sadly no longer exists.]

“Average [English] man has 9 sexual partners in lifetime, women have 4”
The Telegraph, 15th December 2011

I see a headline like the above every so often, and when confronted with such a significant finding (or any finding, or really just any number) the first thing I do is apply a sense-check. In this case, the result clearly makes no sense.

It takes two to tango

Imagine all men are on one team, all women on another, and the two teams are playing a very long, inadvisable, and poorly-thought-through game to see who can get the highest average number of sexual partners from the other team. The big problem is this: any time one team ‘scores a point’, shall we say, the other team necessarily gets a point too.

Since we also know there are roughly the same number of women as men, the game is always going to result in a tie: with each team fielding the same number of players and each team getting the same number of points, the average score has to come out the same. Cracked helpfully have some diagrams demonstrating this just in case you’re not sure.
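If the diagrams don’t convince you, the tie can also be checked with a quick simulation. This is only a sketch in Python: the function name, team sizes and number of pairings are all made up for illustration, and pairings are drawn at random, which is certainly not how real life works. The point is just that every new pairing adds one to each team’s total, so the totals always match.

```python
import random


def average_partners(num_men, num_women, num_pairings, seed=0):
    """Simulate random heterosexual pairings and return both teams' scores.

    All inputs are hypothetical; this is not based on any survey data.
    """
    rng = random.Random(seed)
    men_partners = [set() for _ in range(num_men)]
    women_partners = [set() for _ in range(num_women)]

    for _ in range(num_pairings):
        m = rng.randrange(num_men)
        w = rng.randrange(num_women)
        # One pairing scores a point for both teams at once.
        men_partners[m].add(w)
        women_partners[w].add(m)

    men_total = sum(len(p) for p in men_partners)
    women_total = sum(len(p) for p in women_partners)
    return men_total / num_men, women_total / num_women, men_total, women_total


men_avg, women_avg, men_total, women_total = average_partners(1000, 1000, 5000)
print(men_total == women_total)  # the totals always match
print(men_avg == women_avg)      # so equal team sizes give equal averages
```

With equal teams the averages match exactly. They only drift apart if the team sizes differ, which is why it matters that there are roughly as many women as men.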

So, the sense-check failed. Let’s understand this discrepancy, while keeping things completely Safe For Work.

Fix 1: Confirm figures at source

The Telegraph provided a citation. There, we find that the methodology is fairly standard, and that more specifically the figures are actually Men 9.3, Women 4.7. This means the male team is claiming a 98% higher score than the female (rather than the more dramatic original figure of 9 to 4, or 125% more), but this is still a long way beyond the expected 0% difference.

Fix 2: Check assumptions

There’s one major assumption in the sense-check which you probably noticed already: that the figures quoted are for heterosexual pairings only. Fortunately (for the purposes of this analysis) according to the original study, they are.

Fix 3: Estimate loopholes

There’s a major loophole with the argument that the game should result in a tie: this is technically only true if we’re considering the entire viable population of the world. This particular survey only applies to England, so we should consider what we might call ‘away games’.

In the absence of data, let’s just make a rough guess and see where it gets us. What if 1 in 10 of the men’s scores came from away games, and only 1 in 20 for the women? (Of course, this is going to bias the figures the other way for the rest of the world, and the real figure might even be reversed – we’re just trying to get a feel for how much of the discrepancy this might explain).

Well, applying those adjustments would leave the men with an 87% higher score, at 8.4 to 4.5.

Fix 4: Allow for subjectivity

There is no referee in this game, which means there’s a certain amount of… interpretation as to what really counts. This could prove a highly diverting side-track if you were discussing this in the pub, but for now let’s just imagine men might overstate the case 5% of the time, and women might understate 5% of the time. This takes the score to 8.0 : 4.7, with the men down to a 70% claimed lead.
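For anyone who wants to follow the arithmetic so far, here is a rough sketch in Python. It assumes each guess is applied as a simple multiplier (drop 1 in 10 of the men’s score and 1 in 20 of the women’s for away games, then shave 5% off the men and add 5% to the women for subjectivity). The variable and function names are mine, and this is one plausible reading of the adjustments rather than a rigorous model.

```python
def percent_lead(men, women):
    """How much higher the men's score is than the women's, in percent."""
    return (men / women - 1) * 100


# Headline figures versus the figures in the cited source (Fix 1).
headline_lead = percent_lead(9, 4)
source_lead = percent_lead(9.3, 4.7)

# Fix 3: remove the guessed share of 'away games' from each score.
men_home = 9.3 * (1 - 1 / 10)
women_home = 4.7 * (1 - 1 / 20)

# Fix 4: guessed 5% overstatement by men, 5% understatement by women.
men_adjusted = men_home * 0.95
women_adjusted = women_home * 1.05

steps = [
    ("Headline", 9, 4),
    ("Source figures", 9.3, 4.7),
    ("After away games", men_home, women_home),
    ("After subjectivity", men_adjusted, women_adjusted),
]
for label, men, women in steps:
    print(f"{label}: {men:.1f} to {women:.1f}, lead {percent_lead(men, women):.0f}%")
```

Rounded to one decimal place, these steps reproduce the scores and percentage leads quoted above.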

Fix 5: Misreporting due to societal pressure

Having corrected for any major discrepancies arising from the methodology, we’re left with one major problem: people lie. So, perhaps men are tempted to exaggerate and women to down-play their actual scores when it comes to number of sexual partners.

They say you should divide a man’s claim in this area by three, and multiply a woman’s by two – this overcorrects for the data we’re considering, but perhaps that’s because people are slightly more honest in a survey than in the casual conversation this rule of thumb is probably meant to apply to.

Fortunately, a widely reported study was carried out on exactly this effect. Men and women were asked this same question, but in different settings. Some were told a researcher may look at their answers, raising fear of social judgement. Others were hooked up to a (fake) polygraph machine, creating pressure to be more honest. (If you’re interested, check out the original paper here).

Women reported 2.6 partners when worried someone would look at the answer, but 4.4 under a fake polygraph. Men reported 3.7, but this went up to 4.0 under the fake polygraph. Ah-ha!

This is interesting because it suggests both men and women down-play their actual figure, but most of the discrepancy is coming from the women. If we apply these corrections to our estimated figures so far, we have men at 8.4, women at 8.0 – much closer.

Unfortunately the study was small (with just 200 participants answering this question), so while these particular results are suggestive, the researchers concede that they are not statistically significant. (As with any emotionally charged research subject, this didn’t stop the media reporting on the result as an established fact).

If you want to get technical about it

As well as being small, that study was only conducted on college students aged 18-25 in the US, who one would frankly hope behave somewhat differently to the general population of England.

Even in the original more robust English data set, there are some fascinatingly subtle problems. Sexual behaviour will change over decades (some of which is covered in the study), and the extent to which people lie about it will also presumably vary significantly by age. In combination with the fact that men and women have different life expectancies, and that cohorts by age group are not actually equal, this introduces some additional distortions to our assumption that we should see a tie – although a few quick calculations suggest these effects are likely to be smaller in magnitude than those we estimated above.

Okay but what’s the actual answer

This excellent paper goes beyond aggregated data to study the distribution of responses, and convincingly finds an explanation. It turns out that the discrepancy is driven primarily by those claiming over 20 sexual partners, because these rare-but-average-biasing individuals evidently round their score (which they may have difficulty remembering accurately) in the direction they deem to be more in keeping with society’s expectations – so men round up and women round down.
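To get a feel for how a handful of people can move the average, here is a toy illustration in Python. Everything in it is invented: the list of ‘true’ scores, the idea that rounding goes to the nearest multiple of 10, and the function name. Only the threshold of 20 comes from the finding described above. Both teams start from the same true scores, and only those above the threshold round in their preferred direction.

```python
import math

# Hypothetical 'true' scores for 100 people, shared by both teams.
TRUE_COUNTS = [2] * 50 + [5] * 30 + [9] * 15 + [23, 27, 34, 41, 58]


def reported(true_count, direction, threshold=20, step=10):
    """Return the figure a person reports (illustrative assumption only).

    At or below the threshold the true figure is reported; above it,
    the figure is rounded up or down to a multiple of `step`.
    """
    if true_count <= threshold:
        return true_count
    if direction == "up":
        return math.ceil(true_count / step) * step
    return (true_count // step) * step


men_reported = [reported(c, "up") for c in TRUE_COUNTS]
women_reported = [reported(c, "down") for c in TRUE_COUNTS]

true_mean = sum(TRUE_COUNTS) / len(TRUE_COUNTS)
men_mean = sum(men_reported) / len(men_reported)
women_mean = sum(women_reported) / len(women_reported)
print(true_mean, men_mean, women_mean)
```

Only 5 of the 100 made-up people change their answer, yet the two reported averages end up on opposite sides of the true one. The size of the gap here means nothing, since the inputs are invented.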

In Conclusion

If you skipped straight here, you should know that you missed some fun stuff where we talked about some subtle issues with the parity assumption, and you probably didn’t notice that that compelling-looking chart was actually not statistically significant. But if you just want the two-line answer, it’s this:

The overall average number of heterosexual partners has to be almost identical for men and women. The discrepancy found in studies like this arises primarily due to people with >20 partners rounding the figure they report, which they probably can’t remember exactly, up (for men) or down (for women), in response to perceived societal pressure.

More practically, always sense-check your data, especially if it’s self-reported and on a sensitive subject. And if you can’t make sense of the data, ask an analyst.

– Tim Mannveille