Categories
game

Advanced Pass the Parcel, the prototype

What is this?

I’m developing a party game called Advanced Pass the Parcel. These are my thoughts on the first playtests, and where I’m thinking of taking it next.

It’s mostly for my own benefit, but other people might be interested. Maybe that’s you?

The parcel… and the cards

Inspiration

In Season 3 Episode 14 of excellent Australian animation Bluey (a fun show for kids but even better for adults), the children play a game of Pass the Parcel according to the standard modern rules:

  • Every layer contains a small prize
  • The adult operating the music carefully ensures each child gets a prize (a secret rule that the players don’t know)

Lucky’s dad objects to this and runs the more traditional version:

  • There is only a single prize at the very end – but a much bigger one
  • The music stops at random

This has some dramatic consequences, but it also got me thinking about changing the rules for Pass the Parcel in other ways. I liked the idea of running a really weird version of the game with kids that had all sorts of chaotic shenanigans deployed layer by layer – swapping places, surprise rubber spiders, minigames, dance breaks, ability to control the music, general mayhem.

Unfortunately (or perhaps fortunately?) I don’t currently have access to a kids party where I would be able to do that. Fortunately, Clare pointed out that the general idea could work just as well with adults, and I realised she was completely correct!

Basic design

As a game, Pass the Parcel in both the above forms is notable for the complete absence of player agency! According to most definitions, this means it doesn’t even qualify as a game. That’s fine for the kids version, but I don’t think that’s something adults could as easily accept.

Immediately I thought of the most obvious mechanic to fix it: you can play a card after the parcel stops (an ‘Interrupt’) to move it further around the circle. If you got to open a layer it would naturally reward you with more cards. This certainly gives players agency and just might admit a bit of strategy.

There were a lot of other rules and changes I wanted to try, which gave me the second core idea: each layer also reveals a card with some sort of new rule! I love games that do this, but there aren’t many, and it brings some of that feeling of chaos / unpredictability I was originally imagining.

Interrupt design

As a quick way to test the idea I planned to use playing cards as the interrupts. Large numbers would be a bit boring, and I felt there had to be some restriction on what you could play to add some strategy and stop players from just playing all their cards in a single round. I settled on using only Aces (for 1) up to 5 as a range, but I couldn’t decide if an interrupt should be lower than the one played before it (so the parcel would sort-of decelerate), or be +1/-1 from the previous interrupt (for less predictability). I decided to try both by starting with one rule and changing it later on with one of the ‘new rule’ cards several layers deep into the parcel.

Imagined example play

To be very clear about it, this meant a game might play out as follows:

  • Music starts and the parcel is passed around
  • The music stops, so the parcel stops in one player’s hands
  • Anyone can play an interrupt (they have five seconds to decide to do so)
  • If they do (for example, by playing a 4), the parcel moves that many places around the ring – probably to end up at the person who played the card
  • Anyone can then play another interrupt if it is less than 4 (or under the +1/-1 rules, if it is a 3 or a 5)
  • This keeps going with more interrupts, until the parcel stops and after 5 seconds no more interrupts are played
  • The person who ends up with the parcel opens up a layer, gains more interrupt cards, reads out the new rule, and that round is now over
  • Music restarts to begin another round, until eventually someone opens the final layer and wins
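The two candidate interrupt rules and the parcel movement can be sketched in a few lines of Python. This is only an illustration and not something used at the playtests: the function names (`can_follow`, `move_parcel`), the seat numbering and the 8-player ring are all made up for the example.

```python
def can_follow(prev, card, rule="lower"):
    """Is `card` a legal interrupt after `prev`? (`prev` is None for the first interrupt of a round.)"""
    if card not in range(1, 6):        # only Ace (1) up to 5 are used as interrupts
        return False
    if prev is None:                   # first interrupt of the round: any value is fine
        return True
    if rule == "lower":                # each interrupt must be lower, so the parcel 'decelerates'
        return card < prev
    if rule == "adjacent":             # the +1/-1 variant
        return abs(card - prev) == 1
    raise ValueError(f"unknown rule: {rule}")


def move_parcel(holder, card, num_players, direction=1):
    """Seat index of the parcel after it moves `card` places around the ring."""
    return (holder + direction * card) % num_players


# Hypothetical round: 8 players in seats 0-7, music stops with the parcel at seat 2.
print(move_parcel(2, 4, 8))            # someone plays a 4 -> 6
print(can_follow(4, 3, "lower"))       # True
print(can_follow(4, 5, "lower"))       # False
print(can_follow(4, 5, "adjacent"))    # True
```

Under the "adjacent" rule only a 2 can follow an Ace and only a 4 can follow a 5, so the ends of the range are the natural places for a run of interrupts to stall.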

Layer design

Gaining one interrupt card for opening a layer would create a net drain on cards in play; gaining three cards would create a very strong rich-get-richer feedback loop, since someone gaining so many cards would very likely get the parcel again. So it kind-of had to be two interrupt cards you gain in each layer.

Physically, the tradition of wrapping paper and tape seemed quite wasteful; fortunately we had a lot of furoshiki cloths of various sizes that were lightweight, quick to wrap and unwrap, and easy to re-use.

I realised there would be a bit of strategy around judging when the final layer was reached, so I planned to have some fun with that. First I had one layer reveal two parcels inside, the new rule being that the parcels moved simultaneously in opposite directions. One of those then ended before the other (with a ‘jackpot’ of 4 interrupt cards), and the remaining parcel was instead a large envelope – which would look quite final after all that furoshiki. However, in that envelope was a handkerchief wrapped around a smaller envelope, to create two further final layers!

Parcel Passing Protocol

A classic problem in the original game is that the music can stop mid-pass. Two children both have their hands on the parcel and will tend to fight over it.

With adults I assumed we could instead go with a much less ambiguous approach: the parcel could be lightly thrown from one person to the next!

This was a mistake.

New rules

Here are a few examples of the rules I tried adding to the different layers:

  • Parcel now reverses direction
  • All players draw 1 new interrupt card
  • Give these interrupt cards to the players either side of you
  • Read this out loud: “The next layer has a spider in it. Not including me, whoever is first to volunteer gets to open it.”

I also tried some rule cards that instead you got to keep and play at any time:

  • Shout ‘Objection!’ to cancel an interrupt, then discard this
  • Declare ‘Technically it’s my birthday’, the player with the most cards must give you two, then discard this card.

The 5 seconds to play an interrupt was quite long, but felt necessary when people were new to the game. So I also added a rule card that reduced this to 3 seconds a few layers deep.

What about the prize?

As you might have noticed, games played by adults routinely do not have any prizes at all. People play Monopoly, Catan, Hearts or Waldschattenspiel purely for the honour of victory. But having the final layer of a parcel contain nothing other than perhaps a bit of paper saying ‘you win!’ felt far too anticlimactic.

Given the direction the game was headed, the answer seemed obvious: the final layer has a blank card, on which the winner gets to write a new rule for the next game.

Expected behaviour

Under these rules, I expected the following:

  • Players would generally play an interrupt only when it brought the parcel to them
  • Players would hoard cards a little bit to ensure they had interrupts to play when some people had run out, and when the crucial final layer approached
  • Perhaps emergently some players would co-operate. Player A might play an interrupt so that player B got to open the parcel; player B might later return the favour (with the help of the additional cards) so that player A might get to open a layer in turn.

To make the prize a bit more exciting, and with the idea of co-operation in mind, I changed the final prize to be two blank cards – one for the winner to write, and a second they could choose to give to another player.

Playtest 1

I was very lucky to be able to test the game out three times in relatively quick succession with three different groups of people.

Here’s what I learned from the first group, 17 people who work in video game development (so were more game-literate than the average person):

  • Players were excited by the simple joy of passing the parcel and playing interrupts! It may have helped that some players were lightly inebriated…
  • Interrupts having to be lower than the last one played in a round immediately felt far too restrictive, and we quickly switched to a ‘same or lower’ rule. This led to a much more exciting round finish when an Ace could be trumped by another Ace (and possibly another, and another…)
  • We needed a way to adjudicate bad behaviour (holding on to the parcel instead of passing it, throwing it badly, deliberately not catching it and picking it up slowly and so on). I quickly opted for players voting on which player misbehaved, and penalising that player by having them discard one interrupt card
  • Players were not very strategic with their interrupts, playing them early and often – even when it brought no immediate advantage to themselves. Probably inebriation played a part here too, but it did feel like doing a thing that triggers everyone else to do a thing (pass the parcel the number of times you indicated) is inherently quite fun?
  • Rather than co-operation, players instead formed enmities! They would play interrupts to deny certain players the parcel – players who previously denied them the parcel, or players who were doing too well, or just players where it seemed funny to deny them the parcel for some reason
  • The interrupt that reads “The next layer contains a spider. First player to volunteer gets to open it” is very silly and out of keeping with the rest, but still proved very fun, suggesting a whole other possible way the game could lean. (Also note, it is a trick – there is no spider of any kind)
  • In a large group and a loud party setting, most players can’t/won’t read the new rules loud enough, so a moderator (me) had to do that for them
  • Complicated new rules just do not work in a loud party setting
  • We later needed a way to punish bad behaviour even when a player had no cards left to discard. We concluded they would be excluded from the next round
  • The rich-get-richer problem was actually pretty bad, with one player getting so many cards they would just play them in bulk (e.g. four 3’s for a total interrupt of 12)
  • The ‘+1/-1’ interrupt rule led to extremely long combos that drained almost all cards from the group, suggesting a lack of strategy, or possibly a failure to guess how many more layers were left?
  • The drama of approaching what may or may not be the final layer was tremendous!
  • With 17 people starting with 3 cards each, and 10 layers of parcel with about 2 cards in each, the game went on for about 30 minutes – probably too long
  • The winners wrote their new rules, and while these were mostly good, there was one that was possibly a bit broken and needed a second draft. Not sure how to deal with that…

In general despite some poor throwing discipline and long running-time, it seemed to go very well. One player afterwards reported “I haven’t laughed that much in ages” – a pretty good sign!

Playtest 2

In the next playtest, I once again found about 17 people willing to play, and once again some of them were slightly tipsy.

As a warm-up I had the group simply pass the parcel in a single circuit, and the relative ease with which they managed (or rather did not manage) to do that convinced me the throwing method was not a good plan. Instead we went with regular passing of the parcel from one person to the next, with the rule being that if the parcel stopped when two players had contact with it, whoever was furthest around the circle (i.e. was just in the process of receiving the parcel) should get it, which proved suitably unambiguous.

I made sure the new rules in each layer were simpler. I also added more new-rule cards that would ensure cards were distributed or replenished to cut down on the rich-get-richer problem.

Finally, rather than using whatever party music was playing, I specifically chose Meute’s track “Rushing Back” because it had a nice beat for passing to (136 BPM, pass every other beat so 68 BPM), followed by Edvard Grieg’s “Hall of the Mountain King” (inspired by a video game it would kind of be a spoiler to name).

Here’s what I learned this time around, with a group that seemed to have a wider range of game literacy:

  • Once again the simple joy of passing and interrupting was much greater than I expected! Interrupts just seem incredibly dramatic
  • This group quickly cottoned on to co-operation as a strategy, and enmity formation was much lower. Depends on the group I guess?
  • Given the absence of any agency during the passing phase I tend to assume the gaps between stops should be quite short. But based on how the mood was feeling and perhaps how much people enjoyed the music, it seemed clear that longer periods (with the parcel making 3 or more circuits) actually felt just fine?
  • Changing the core rule of which interrupt can follow which added more confusion than I think is fun for a few people. For a group that includes non-gamers and/or people who had a few drinks, it’s probably best to keep that rule consistent
  • The accelerando of Hall of the Mountain King worked just as well as I hoped, and I was able to time the whole game such that the final stop aligned exactly with the end of the song. So good!!
  • Once again the game lasted about half an hour, which actually felt fine in this case, perhaps because of the more suitable music
  • The nature of the occasion meant no further games would be played, and the appetite to write new rules was not very strong

This now gave me a lot of excitement about the general concept, since two quite different groups of people seemed to have a lot of fun with it.

Playtest 3

I updated the rules again and printed out large-font new rule cards for better legibility than the hand-written ones I started with. Then I tested it out with 14 friends who are much more experienced with my kind of games, and indeed games of all sorts. I repeated the music choices from Playtest 2 since they seemed to work so well. Here’s how that went:

  • The first few rounds were much calmer, with fewer interrupts played, and in particular far fewer ‘frivolous’ interrupts (those that don’t advantage the player)
  • The vibe was initially more serious, and the relatively low amount of player input into the whole process started to feel a bit odd…
  • … but as some new rules kicked in and a few interrupts surprised people (in particular a father playing an interrupt denying his own child the parcel!), I sensed the vibe picking up as people got into the rhythm of it all
  • The climactic ending of Hall of the Mountain King was once again great, although the winner then immediately began to unwrap the final layer, and I didn’t even realise that was wrong – such is the finality of the song. Players pointed out the error and we quickly played out the ending properly (allowing interrupts to take place).
  • Richard B felt that playing an Ace to take the parcel from the player next to you was such a distinct move it should have its own name; he called it ‘batsy’. I like it!
  • Once again there wasn’t a huge appetite to write new rules, although unsurprisingly a lot of this group were very interested in discussing what they could be

Tarim raised that the ‘same or lower’ rule meant that the higher numbers got very little play indeed. This was more obvious in this playtest with the more tactical play. Players had ‘solved’ this problem in the first two playtests by playing their highest cards as soon as possible and without regard for personal benefit!

My concern about the alternative ‘+1/-1’ rule is that it can encourage very long runs of interrupts draining ‘too many’ cards. That said, it does still have a nice slowing effect as players reach the lowest (or highest) numbers available, since only a single number can interrupt from there…

Where next?

In general the game seemed a lot more fun than I would have guessed, so it seems worth developing further. I see two obvious but very different paths to try.

Path 1: Fix what seems broken: less luck, more agency, more strategy

Even with a more strategic group playing, it became clear that the game was barely more strategic, at least in outcome, than just stopping the parcel randomly and opening it. But in a game with many players, where only one will get to open a layer, how much strategy is even possible? Most of it will surely cancel out.

The ‘rich get richer’ problem is also quite annoyingly baked into the concept. Even with every other layer carrying a rule that to some extent redistributed cards or allowed new ones to be drawn, that dynamic does feel a bit antithetical to strategy.

My working theory for a solution to both problems is as follows:

  • Each card has an interrupt value AND a special effect, players choose which one to play
  • Players can only play one card per round. Having more cards gives you more choice, but not as much overwhelming power
  • When you open a layer, it contains three cards – choose one for yourself, give one each to the players either side of you
  • Special card effects could follow a ‘tableau’ style of play – you place them in front of you to grant some sort of mild buff that stays with you for the rest of the game. In this way, even though you don’t have much control over a single round, you get to be strategic over the course of the game. Even if you don’t win, it would probably feel better than just playing an interrupt to gain the parcel and then immediately have that play ‘wasted’ when someone plays another interrupt.

Possible example card effects:

  • Interrupt: Play this to reverse the direction of the last interrupt
  • Interrupt: If 3 interrupts have been played this round but you have not touched the parcel since the music stopped, draw 2 cards
  • Tableau: You can add 1 to any interrupt, but must discard a card whenever you unwrap a layer
  • Tableau: When a +2 interrupt is played, once per round you can draw 2 cards and discard 2
  • Tableau: Whenever a player’s hand reaches 6 cards, they must give you 1 and another player of their choice 1

The dual-function of cards (interrupt or effect) mitigates the weakness of high numbers under an ‘interrupts must be lower than the last’ paradigm. But I also like the idea of including a small number of ‘Joker’ interrupts which can be played after an Ace and pass the parcel directly to you – but can be followed by an interrupt of any value.

That all feels quite promising to me, but leaves a lot of unanswered implementation questions, and also just would not work with a large group.

Path 2: Build on what works! Frivolity, group activity, spontaneous alliances / enmities

Leaning in to the fact that apparently just playing cards and having the group do things in response seems fun, perhaps don’t worry about strategy and just have fun with it?! This leans in a direction I’m actually very interested in – pushing away from games and into something I’m currently calling ‘co-ordinated activity’, which I think is a vastly underexplored space.

Perhaps keep the ‘1 card per round’ restriction to stop the game running on too long and avoid the rich-get-richer problem a little bit?

In general have rules that are a lot more varied and surprising, or give the players a different kind of agency. Some example rule cards you might unwrap alongside your interrupts:

  • Ask a quiz question. First player to get it right (excluding you) draws 2 cards
  • Pass the parcel, each player adds one word to the story as they pass it. Parcel stops at the end of a sentence or when it has completed a circuit. You choose who to award a bonus card based on their contribution.
  • Everyone High-5 one player next to you. Successful high-5s swap places.

Maybe each of these is great for very different groups – or maybe the right answer really is some kind of strange combination of the two? The experiments continue…

  • Transmission ends

Tim now posts on Bluesky as @metatim.bsky.social and has previously written about making games about sandwich making and blindfold roleplaying


Weird Family Fortunes

Introduction

Family Fortunes (Family Feud in the US) is a brilliantly subversive quiz concept.

They set things up by conducting a large survey asking people all sorts of questions that have multiple answers. On the show, teams (families) then compete to guess the most common answers to those questions. So unlike a conventional quiz where you are rewarded for knowing obscure things, instead you are rewarded for being as similar as possible to the average person! Or at least being good at guessing what an average person would think.

I ran a ‘Weird Family Fortunes’ survey/quiz that compressed this idea into a single step. Players completed a survey of Family Fortunes style questions – but needed to anticipate what other people’s answers might be as they did so. Initially this was quite easy, and then it started to get a bit weird.

There were 3 sections, with 8 questions in each, and 38 people sent in responses – if you were one of them, thanks for that! Here are the results.

Section A: Majority Wins

In this section the rule was similar to Family Fortunes: ‘majority wins’. Players get a point if their answer matches the most commonly given response out of everyone playing.

For example, if 10 people answered ‘Pinball’ and 7 people answered ‘Snooker’, the 10 people who chose Pinball would each get a point, since theirs was the most commonly given response.

I grouped together responses that I consider equivalent, e.g. I would combine ‘Football’ with ‘Soccer’ when considering the totals. This sounds superficially simple, but gets into some quite difficult judgement calls later on, as you’ll see.
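For anyone who wants to score their own version, here is a rough Python sketch of the ‘majority wins’ rule with grouping. It is not the spreadsheet I actually used: the function name `majority_winners`, the `aliases` table and the sample names are invented for illustration, and it assumes that if several answers tie for most common, all of them score.

```python
from collections import Counter


def majority_winners(responses, aliases=None):
    """Return the set of players whose (grouped) answer was the most common."""
    aliases = aliases or {}

    def group(answer):
        # Normalise case/whitespace, then merge answers judged equivalent.
        key = answer.strip().lower()
        return aliases.get(key, key)

    grouped = {player: group(answer) for player, answer in responses.items()}
    counts = Counter(grouped.values())
    top = max(counts.values())
    # Assumption: every answer tied for the top count earns a point.
    return {player for player, answer in grouped.items() if counts[answer] == top}


# Made-up responses; the alias table is how 'Soccer' gets merged into 'Football'.
responses = {"Ann": "Football", "Bob": "soccer", "Cat": "Pinball", "Dev": "Snooker"}
print(sorted(majority_winners(responses, aliases={"soccer": "football"})))
# ['Ann', 'Bob']
```

The judgement calls all live in the `aliases` table: the counting itself is mechanical once you have decided which answers belong together.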

A1: What is your favourite colour?

Inspired a little bit by Monty Python and the Holy Grail, this seemed like an easy place to start. Perhaps informed by the first response in that film (but also just possibly what is most popular in general), the winner by quite a long way was ‘blue’.

Some of the responses suggested to me that people either had not realised the goal was to guess what the majority of all players would guess (‘Pastel Pink’ and ‘Aquamarine’ were two ambitious responses), or simply did not care.

I also think the kind of people I sent this to are generally very smart and original, and I often run games that allow them to flex those strengths. In this round at least, this is pretty much the opposite of what you should do, so perhaps it was just force of habit that led to the particularly unlikely response of ‘radio wave’.

A2: What is the best animated Disney film?

Given the demographic of respondents, perhaps ‘The Lion King’ was the inevitable winner. But I was very interested that ‘Toy Story’ came second, given that it was – at the time – a Pixar film, not a Disney one. Years after Toy Story came out, Disney bought Pixar, and the argument could be made that it has now become a Disney film. In a traditional quiz that would probably be debatable, but here it doesn’t matter! If everyone makes the same mistake, it becomes valid. Albeit second place in this case.

A3: Which is the least awful social network?

This one is a bit of a popularity contest, since I think on average most people only use a small number of social networks. Still I was surprised to see Instagram take a convincing lead. Good to know!

A4: How do you distinguish the file name of the final version of a document?

As a subject of personal interest, I find that the version of a document one thinks is final turns out not to be final in the vast majority of cases. This creates the problem of distinguishing the document you first thought was final from the one that is actually final (if there ever even is such a thing).

The results here suggest that other people don’t have that problem, or other people don’t think other other people have that problem, or they don’t care, or think other people don’t care. In other words, these results don’t actually tell us very much.

Still I was disappointed that only 3 of the responses reflected my experience – there is no final so name accordingly – and actually one of them was me, since I was the first respondent to the quiz!

A5: What is your favourite video game?

The big problem here was categorisation. ‘Zelda’ and ‘Mario’ do not uniquely refer to a specific game, and it seems too much of a generalisation to group all games in the Zelda or Mario series as a single category. Then there is ‘Mario Kart’, which is also a series with many iterations – but ‘Mario Kart’ was what ‘Super Mario Kart’ was generally called (just as Star Wars: Episode 1 – The Phantom Menace was generally known as Episode 1), so probably stands. As noted, in general this quiz isn’t about me judging what is and is not a valid answer, but when it comes to joining up responses I did have to make that call.

There was a surprising breadth of answers here, but perhaps this is more a sign of how huge and varied the topic of games is… and is a sign that a lot of the respondents also work in a video game studio.

A6: Who is the best female character in the Marvel Cinematic Universe?

This question got at the problem of general knowledge (the average casual or non-MCU fan probably could only name Black Widow) vs. strong contenders from elsewhere if someone is more familiar with the material. Of course, this is compounded by people guessing what others will guess, making Black Widow a very likely winner.

A7: Who is the best droid in Star Wars?

Not only is R2-D2 clearly an excellent droid, they will be the obvious choice for the large number of people who haven’t seen the sequel trilogy or other Star Wars spin-off material. In a straightforward poll, I hope a few more might support the excellent L3-37 or BB-8, but when it comes to guessing which will win, R2-D2 is hard to resist.

‘Han Solo’ might look like a rookie error, but makes sense if you consider some additional context. To say more would be to spoil that other context.

A8: Which year was the best?

Another tacit test of the players’ demographics, 1999 and 2000 are the years that most of my peers were in their 1st or 2nd year of university – with maximum freedom and probably the lowest amount of responsibility. The internet was also gaining traction and was exciting and new. Well, I’m guessing that’s what people thought anyway.

Section B: Runner-up wins

For this section, the rules changed: runner-up wins. That means players got 1 point for each answer where their response was the second most commonly given. Of course, players needed to bear in mind that everyone else is trying to guess which will be the second most commonly given as well.

For example, if the question was “What is the best letter?” and 10 people answer ‘a’, 7 people answer ‘b’, and 2 people answer ‘c’, the people that answered ‘b’ will get a point since that was the second most commonly given response.
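The same example can be run through a small Python sketch of the ‘runner-up wins’ rule. The function name `runner_up_winners` and the player labels are hypothetical, and the sketch assumes that tied answers share a rank, so the runner-up is whichever answer (or answers) has the second-highest distinct count.

```python
from collections import Counter


def runner_up_winners(responses):
    """Return the players whose answer had the second-highest distinct count."""
    counts = Counter(responses.values())
    distinct = sorted(set(counts.values()), reverse=True)
    if len(distinct) < 2:
        return set()                   # everyone tied: there is no runner-up at all
    second = distinct[1]
    # Assumption: answers tied on the second-highest count all earn a point.
    return {player for player, answer in responses.items() if counts[answer] == second}


# The example above: 10 people answer 'a', 7 answer 'b', 2 answer 'c'.
responses = {f"p{i}": "a" for i in range(10)}
responses.update({f"q{i}": "b" for i in range(7)})
responses.update({f"r{i}": "c" for i in range(2)})
winners = runner_up_winners(responses)
print(len(winners), {responses[p] for p in winners})   # 7 {'b'}
```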

I originally considered giving people the chance to answer each question twice: once with their ‘honest’ answer, and then again where they are trying to guess what other people guessed. But I liked the elegance of compressing that all into one question, even though it makes it very difficult and probably quite luck-based!

This was made even more difficult since it was hard to tell how many respondents would truly understand what was being asked, as that could influence the responses. Let’s see what happened.

B1: What is your favourite mode of transport? (runner-up wins)

I made the call to separate ‘bike’ from ‘ebike’ and ‘quad bike’, and also ‘tube’ from ‘train’.

Now I think about it, I can actually see that helicopter and zeppelin are pretty excellent forms of transport, but it’s not too surprising that not many people thought of these. Or thought that other people might think of it not quite as much as the most popular thing.

B2: Which tree is the best? (runner-up wins)

At this point I slightly regret not asking the ‘majority wins’ version as well as the ‘runner-up wins’, since I’m curious how similar they would end up being. Oak seems an obvious front-runner, but well done to the 3 people who all agreed on Birch making it the runner-up.

B3: What is your favourite potato-based food? (runner-up wins)

Somewhat contentiously I separated ‘chips’ from ‘fries’, as in my experience they are very different forms of potato, and I have been to establishments offering both as separate options on the menu. In this case I’m very uncertain if the people who won by guessing ‘chips’ were under-thinking it or thinking it just the right amount.

B4: Pick a word that begins with ‘w’ (runner-up wins)

This could have been very hard, but I deliberately included two possible words in the question itself, creating what I thought might have been a dilemma, as surely one of those would be the runner-up. Sure enough, these showed up close to the top – but out of nowhere, water was the most popular, and I’m very unsure why.

B5: You are invited to a thing you don’t want to do, but technically you are available. What do you do? (runner-up wins)

I’m very interested to know the ‘majority wins’ answer to this question, but as a person who frequently invites people to things, perhaps people would not have answered honestly in any case?

This created some tricky grouping problems, in particular I decided to separate the general (‘make an excuse’) from the specific (‘feign illness’), and nuances of timing (‘decline’ vs. ‘cancel later’; ‘go anyway’ vs ‘go for a bit and see’). I am mysteriously pleased that the successful runner-up was the perhaps surprising ‘go anyway’.

B6: What did Tim eat just before writing this question? (runner-up wins)

For those that want to know, the reality was that I ate a banana just before writing this question. But this is not about accuracy!

Given that you most likely needed to at least come up with something that a few other people would guess, I was surprised at the long tail of very specific responses – I had thought generic answers like ‘a snack’ or ‘breakfast’ would have been the way to go. But in the absence of examples, I can imagine it was not clear that general categories might be a good way to go. It was quite pleasing to end up with a 6-way tie between 12 people!

B7: Where is alien life most likely to be found in our solar system? (runner-up wins)

In this particular case it felt fairest to group up specific responses (e.g. locations on Earth) to a single major planet, but that did end up making Earth the most popular. Perhaps if I do something like this again it would be better to specify (where possible) what kind of grouping I will do?

B8: If you had to kill a vampire but you aren’t sure which vampire rules are in play, how would you first try to do it? (runner-up wins)

With an unlikely tie for first between sunlight and a stake, the successful runner-up was the garlic grouping. My personal favourite was the 3 people who chose not to kill at all, in a category joining ‘Don’t’, with ‘run away’ and ‘fall in love with them’.

Section C: Weird

This section grouped the weirdest majority and runner-up questions with a few that had rules all of their own…

C1: What is the lowest unique whole number that someone will guess in answer to this question? (For example, if two people guess ‘1’, one person guesses ‘2’, and one person guesses ‘3’ then the person who guessed ‘2’ will get the point, as that was the lowest unique response).

I thought ‘Whole number’ was a well-defined term, excluding negative numbers, but I failed to specify this. The fact that several very intelligent people opted to give negative responses made me realise I should have specified that restriction up front (and not just implied it by the example).

Under the intended rules – which is what I’m counting here – the winner was the person who selected ‘2’, funnily enough the winner in the example given! This meant 1 or even 0 could have won.

If we broadened the definition to negative numbers, I find it pleasing that two people both opted for the largest negative number possible in the restrictions of the format, meaning the winner (under those rules, which we are not actually using) was instead the person who opted for a merely very large negative number.
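As a sketch of how the ‘lowest unique number’ rule plays out under both readings, here is a short Python version. The function name `lowest_unique`, the `allow_negative` switch and every guess in the sample data are invented for illustration (the real responses are not reproduced here).

```python
from collections import Counter


def lowest_unique(guesses, allow_negative=False):
    """Return (number, player) for the lowest guess made by exactly one player, or None."""
    counts = Counter(guesses.values())
    candidates = [
        (number, player)
        for player, number in guesses.items()
        if counts[number] == 1 and (allow_negative or number >= 0)
    ]
    return min(candidates) if candidates else None


# The example from the question: two people guess 1, one guesses 2, one guesses 3.
guesses = {"Ann": 1, "Bob": 1, "Cat": 2, "Dev": 3}
print(lowest_unique(guesses))                        # (2, 'Cat')

# Hypothetical negative guesses: two players collide on the same extreme value.
guesses.update({"Eve": -999, "Fay": -999, "Gus": -500})
print(lowest_unique(guesses))                        # (2, 'Cat')
print(lowest_unique(guesses, allow_negative=True))   # (-500, 'Gus')
```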

C2: What came before the Big Bang? (Majority wins)

In grouping terms, I considered the winner, ‘Nothing’ to be distinct from ‘there is no “before” ’, since the latter implies time itself did not exist / was not meaningfully defined. I separated out ‘The Big Crunch’ from ‘A Big Crunch’, because while superficially similar I felt like the use of “A” much more heavily implied a cyclic behaviour.

Respect to the one person with the very reasonable response of ‘I don’t know’!

C3: What is the opposite of the thing most people will answer to this question? (Any answer that can be interpreted as opposite to the answer most commonly given to this question gets a point)

I asked this without any idea how people might answer, or how possible it would be for me to judge. As it turns out, variations on ‘nothing’ clearly took the majority, making ‘everything’ and ‘something’ both winners as opposites.

You could argue that the response that simply repeated the question (and perhaps the one that said ‘asking the same question again’) should win, in the sense that they are the opposite of all of the others which are all ‘answers’. But I don’t think a response should dictate how I group things up, so that does not win, but is still a good effort.

I like that one person gave 42 and another -42. I also particularly like the surreal ‘you should put salt on it’, which is possibly trying to do something clever I haven’t been able to figure out.

C4: What is the average of the numbers submitted in answer to this question? Give your answer to the nearest whole number. (Closest to average wins)

With just one person opting for a very large negative number and 4 people going for very large positive numbers, the largest positive number ends up winning!

C5: Pick a whole number from 1 to 8. (Runner up wins)

I wondered if the earlier questions might somehow prime people’s responses here, but I can’t see a clear pattern. Perhaps the similarity of ‘runner-up’ with ‘2’ as a concept made it the most popular choice, allowing 5 to take the true runner-up prize.

C6: What would be a good question to ask in this survey? (Best 3 answers as judged by Tim will win, originality will be highly rated)

After struggling to come up with questions, I figured I should turn the problem back on respondents. I got a very widely varying set of responses, which were as follows:

  • How do you feel? (Majority wins)
  • What would be a good question to ask in this survey? (Best 3 answers as judged by Tim will win, originality will be highly rated)
  • Did anyone else’s brain start hurting around C1?
  • Pick any whole number between 1 and 50 to stand on and a second such number to place a trap on. You score a point if nobody has placed a trap on your number.
  • If you had to be on an island with only one type of bird, which would it be and why?
  • What do you consider to be the greatest virtue?
  • Number of peanuts in this pack of peanuts. Points at peanut pack.
  • Thumbs being incredibly useful, why do we only have 2?
  • How many bees would we need to attach to Tim to be able to fly him to the moon?
  • Why doesn’t glue stick to the inside of the bottle?
  • Why did the chicken cross the road? (runner up wins)
  • If Tim was a phone, which phone would he be? Bonus points, what colour/case would he be?
  • If you wanted to get a Guinness world record, what would be your best shot?
  • If you could choose someone to narrate your life, who would you want to serve as the voiceover?
  • What do you get when you cross a rhetorical question and a paltry attempt at being funny in a quiz answer?
  • What is Tim’s favourite colour?
  • If asked to define yellow, how many people would not use the opposite of a not a banana to explain it? (Wooden spoon wins)
  • If everyone were to answer the best year of their life minus the worst year of their life, what do you think the average of all answers would be?
  • How much wood would a woodchuck chuck if a woodchuck could chuck wood?
  • What colour are the clouds in the sky right now?
  • How many bison dollars is to a single US dollar?
  • Best burger chain
  • What’s your favourite question? (Majority wins)
  • Favorite question in this survey, runner up wins
  • Why does Tim write question that we don’t understand and have no relevant or real answer
  • What is the best number?
  • favourite integer (runner up)
  • what’s your favourite thing about Tim?
  • You have a pet dinosaur and you don’t want to give it neither a human, a dog/cat nor a scientific name. What would you name it? (Majority wins)
  • Name something you find in a bakery that can also be used as a term of endearment
  • Pick a number. If the total of everyone’s numbers is odd, evens win. If the total of everyone’s numbers is even, odds win.
  • Do you think that this format of quiz will be popular enough to repeat (wholly or in part) for next time?
  • Assuming the location of each person responding to this survey forms the vertex of a polygon, how many KitKats would it take to trace the perimeter? (One point for answers within one standard deviation of the mean)
  • What would be a good question to ask in this survey?
  • Who, What, Why?
  • How many distinct named colours do you think there are?
  • Probably something like what you just asked, so that you could effectively outsource creating the next survey for your next event.

I said originality would be highly rated, but another (less important) criterion was how well a question might actually work in this context – something I now have a much better feel for. My top 3 were:

  • Pick any whole number between 1 and 50 to stand on and a second such number to place a trap on. You score a point if nobody has placed a trap on your number.
  • If everyone were to answer the best year of their life minus the worst year of their life, what do you think the average of all answers would be?
  • Assuming the location of each person responding to this survey forms the vertex of a polygon, how many KitKats would it take to trace the perimeter? (One point for answers within one standard deviation of the mean)

Also, my respect and bafflement go to the complicated question about yellow that then reveals that ‘wooden spoon wins’. I am very curious what the responses would have been to that!

C7: Rock, paper, or scissors? (Those who select the option that beats the majority win)

My favourite question! This ended up being a little tricky to resolve, with a two-way tie between Scissors and Paper.

Ultimately though the answer is quite clear: the majority is a tie between Scissors and Paper, so we simply consider the following:

  • Scissors ties with Scissors and beats Paper
  • Paper ties with Paper but loses to Scissors
  • Rock beats Scissors but loses to Paper

A tie and a win is clearly the best result, so ‘Scissors’ is the overall winner.
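The tie-break reasoning above can be sketched in Python: find every option tied for the majority, score each possible choice against all of them (win +1, tie 0, loss -1), and take the best total. The tallies below are hypothetical, since only the Scissors/Paper tie is known, and the function names are made up for this sketch.

```python
# Each key beats its value.
BEATS = {"rock": "scissors", "scissors": "paper", "paper": "rock"}

def outcome(choice, other):
    """+1 if choice beats other, 0 for a tie, -1 for a loss."""
    if choice == other:
        return 0
    return 1 if BEATS[choice] == other else -1

def winning_choices(counts):
    """Choices that do best against every option tied for the majority."""
    top = max(counts.values())
    majority = [c for c, n in counts.items() if n == top]
    scores = {c: sum(outcome(c, m) for m in majority) for c in BEATS}
    best = max(scores.values())
    return sorted(c for c, s in scores.items() if s == best)

# Hypothetical tallies with Scissors and Paper tied for the majority.
tied = {"rock": 3, "paper": 5, "scissors": 5}
print(winning_choices(tied))  # ['scissors']
```

With a single clear majority the same function falls back to the ordinary rule, so a Rock majority would hand the point to Paper.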

C8: When considering all responses to these questions, including this one, which whole number is most commonly answered? (Closest to the actual most commonly answered whole number across all questions wins)

This is perhaps the ultimate test of figuring out what other people will choose. While ‘1’ feels like a pretty safe choice, the winner was ‘5’ – and it would not have been if 3 fewer people had chosen it in response to this question!

For reference, the actual distribution of numerical responses to all questions is as follows (excluding the very long tail of numbers with only one occurrence):

Conclusion

Did this work? Did it make any sense?

In retrospect the ‘Runner up’ round felt a bit too confusing and random. It would have been neater and also more enlightening to have people give a straightforward response and then go on to guess about the runner up. One reason I chose not to do that was that it would make the survey a lot longer – or the same length with far fewer questions in play. The second reason though was that I liked how mind-boggling it might be!

The grouping problem got pretty difficult. If I did this sort of thing again, I would definitely try to anticipate how grouping might work and specify guidelines for it within the question itself.

The responses to this quiz went on to inform an actual ‘Family Fortunes’ style quiz game I ran in person, specifically because I wanted to do something that was like a quiz but a lot weirder, and I liked how that worked a lot – that will be written up elsewhere.

Thanks again to everyone who participated!

Leaderboard

As promised, here are the top 10 scorers. There were quite a lot of ties though, so some of these tables get quite long.

In section A, a nice warm-up with 8 ‘majority wins’ questions, 5 people managed to get 6 of them right – well done!

In section B, there were 8 ‘runner-up’ questions. Figuring out what will come 2nd out of the options submitted by people who are all trying to guess what will come 2nd is very difficult, but somehow Jordan got an incredible 6 of the 8 correct, a strong lead against the 6-way tie for 2nd with 3 points. Amazing!

Section C had a lot of weird questions, but also a lot where just by design very few players were likely to win the point (with the exception of the rock/paper/scissors one), so it’s especially impressive that Phil got 5 out of a possible 8.

I said I wouldn’t reveal who gave which answer, but I will say Phil did answer 3 Section C questions tactically and this did help him win at least one additional point… quite how he might have done that is left as an exercise for the reader.

Adding those all together, Jordan takes the lead with 13 out of 24!

Categories
Story

Consensus or Death

The Message

Voice of Fire by Barnett Newman, 1967

On Wednesday, nobody died. It took a while for people to figure this out.

On Thursday, the moon disappeared. As far as scientists could determine, in its place was a very small object of equivalent mass. This caused quite an uproar, so when the news about nobody dying came out it didn’t get much attention.

On Friday the moon reappeared/re-grew, and by this point some countries had figured out that the period when nobody died on Wednesday was exactly 00:00-23:59 UTC, which seemed pretty weird.

Then the message came.

All humans over the age of 23 years, 11 months and 6 days received it simultaneously, even if they were asleep. None could recall if they heard it as words, saw it as images, or experienced it some other way – but everyone agreed on the content. After claiming credit for the miracles just witnessed, the message set out the following ultimatum:

  1. In 64 days, everyone who receives this message will vote
  2. The choice is between red and blue
  3. There is no inherent meaning in the choice
  4. If more than 62% of voters make the same choice, regardless of whether it is red or blue, humanity will be welcomed into the galactic community of sentient beings and, gradually, be allowed to share in highly advanced scientific knowledge
  5. If fewer than 62% of voters make the same choice, humanity will be instantly eliminated. The rest of life on earth will carry on.

The message was understood clearly by all, including speakers of languages lacking a word for ‘blue’, the blind, and the cognitively impaired. In news reports the message quickly came to be called the Consensus Ultimatum.

Some individuals – mostly scientists – received a bespoke version of the message, further explaining that in the period leading up to the vote Earth would be protected from certain specific threats; threats that these individuals were best qualified to comprehend. Pooling this knowledge yielded some well-known threats (solar flares, errant asteroids, novel contagious diseases, miscellaneous climate disasters), as well as other threats not yet well understood (most notably: water decay, localised vacuum collapse, catastrophic crust failure, and all categories of crab singularity). Should humanity pass the test, these protections would continue.

Initial Responses

A lot of people thought this would be easy, on the basis that a consensus should quickly become clear, and this would create a feedback loop ultimately driving well over 62% of people to vote for it. Some of the more cynical thought this level of consensus may be hard to achieve; a few judged it impossible. But it was almost universally agreed that an active effort should be made to try to reach consensus.

A very small minority preferred that humanity fail the test and be eliminated.

First Steps

A commitment to vote a particular colour quickly began to trend on social media. Within different language and cultural bubbles, different colours tended to win. Once a winner became clear, a strong social-proof effect meant that a bubble would quickly favour one colour with a consensus of ~80%-90%.

In the US, the social media trends generally aligned with political leaning: left-leaning voters sided blue, right-leaning red. In the UK, these were reversed in accord with their local colour associations. Many other countries followed this pattern.

Within a couple of days, government recommendations – or in some cases mandates – began to come out. After some deliberation, the US Democratic president announced that American citizens should choose red. They expected fellow Democrats to take this lead regardless of the political undertones, and assumed Republicans would be happy to go with the colour their bubble had already selected. They also anticipated that China would favour red.

Indeed, one day later the Chinese government announced that their citizens would vote red. The cultural associations made it a clear favourite. Other nations made similar announcements, most – but not all – favouring red.

Many nations began to roll out surveys to establish current voting inclinations. Seven days after the message was received, humanity’s collective best guess was that about 65% would vote red, 12% would vote blue, and 23% had not yet heard any guidance. Early modelling suggested the unknown portion would fall out roughly 2/3 for red; this gave a first estimate of an 80% consensus for red.
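The arithmetic behind that first estimate is simple enough to sketch in a few lines of Python. The figures are the ones quoted above; the function and parameter names are invented for illustration.

```python
def estimate_consensus(committed_red, unheard, unheard_red_share):
    """Percentage expected to vote red once the unheard group has split."""
    return committed_red + unheard * unheard_red_share

# 65% already for red, 23% without guidance, roughly 2/3 of whom go red.
estimate = estimate_consensus(committed_red=65, unheard=23, unheard_red_share=2 / 3)
print(f"{estimate:.1f}%")  # 80.3%, which rounds to the 80% quoted above
```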

Collectively there was a feeling of cautious optimism. Some countries began outreach programs to spread voting guidance to isolated populations, and there were several international surveying efforts on top of the quickly established national programs.

Non-Governmental Movements

Various crowdsourced efforts began to gain traction, most notably:

  • Trends in modifying social media avatars to indicate intended vote
  • Raising money to supply guidance to populations being overlooked by official routes
  • Crowdfunding for ad campaigns to swing regions looking least likely to align with the consensus

Several competing DAOs formed with the goal of storing everyone’s publicly-declared vote on a blockchain. The immutability and public nature of the blockchain made this superficially appealing, but arguments about vote-changing rights and a failure to get close to the coverage and reliability of standard survey techniques prevented these solutions from reaching the mainstream.

Some billionaires took the ultimatum as their cue to save humanity. Bill Gates initiated production and distribution of extremely cheap solar/hand-cranked radios, so anyone out of reach from traditional media could be updated on the consensus as the deadline approached – but debates on the editorial control of the broadcast limited the effort to a subset of the target countries. Elon Musk announced a plan to put 8 billion LEDs on the Moon and give every human on earth a digital switch to set one of them to red or blue, so the consensus would be immediately apparent; people said it couldn’t be done, and they turned out to be correct.

Message Interpretations

It did not take long for people to question the true nature of the Consensus Ultimatum. A few theories began to gain traction.

The character test: the belief that the vote would not matter as much as humanity’s response to it. This might explain the rather long period of 64 days between the announcement and the vote. There was not much agreement on what behaviour was more likely to pass the ‘true’ test at this stage.

The prank: the message was just some kind of joke by a highly advanced being or civilisation. They may or may not follow through on the ultimatum; other beings/civilisations may find out about the prank and intercede.

The fix: this theory held that a galactic – or even wider – vote was due to be held, of whose consequences humanity had no conception. The Consensus Ultimatum was a form of lobbying by some part of the galactic community that had analysed humanity and concluded this would push the vote a certain way, without violating some rules of interference. The general conclusion was that this amounted to a variant of Pascal’s Wager, and humanity should take the ultimatum at face value anyway.

The denial: popular among those too young to receive the message (particularly those on TikTok), but also among some adults, this group held that it was all an elaborate hoax that had got out of hand, or perhaps a conspiracy by a shadowy but earth-bound elite to exert control via unclear means.

Anti-consensus: people who suspected that it was a trap, and the consequences might be reversed; that humanity would be eliminated if it reached too high a consensus. Perhaps the galactic community had bad experiences with populations that were too easily aligned to a single cause, and this test identified them for elimination before they caused trouble. Voting against the dominant consensus was the best way to survive.

Moderate consensus: these people held a similar belief, but thought there was likely an upper limit. A consensus above 62% was fine, but if it was too high – over 90% perhaps – this would be taken as a bad sign and humanity would also be eliminated.

Consensus begins to weaken

Just one week after the message, the idea of voting against the emerging consensus for red began to spread among certain groups. This included some believers in the anti-consensus or moderate consensus ideas, but also those for whom the general appeal of rebellion was particularly strong. A natural inclination to distrust the government extended to distrusting this apparently higher, presumably alien authority.

A significant minority in the US in particular announced plans to go against consensus; when the ‘aliens’ came to eliminate humanity, they would be ready to defend themselves with their own arsenal. The idea that a culture capable of suspending death and shrinking the moon could be resisted with ballistic weapons did not seem plausible to anyone else. However, among those who favoured this line of thinking, conspiracy theories regarding those earlier ‘miracles’ began to gain traction; a friend of a friend knew someone who died on the supposedly deathless day; photos of the moon purportedly from the day of its disappearance were also widely shared. This group banded together under the banner ‘True Blue’, in defiance of the emerging consensus for red.

Various other conspiracy theories also began to take hold among this group. Core components of these theories generally took the following forms, without regard for self-consistency:

  • Globally, blue is a more popular colour than red, so would win naturally despite any government messaging
  • The push for red was designed to offset the natural inclination towards blue, pushing humanity towards non-consensus and elimination. The elite would escape by travelling to the moon / mars, and then return to inherit the earth.
  • The elite had secret knowledge that the message was a lie, and that red voters would be eliminated while blue voters would survive. The push for red was a plot to thin the herd.
  • Alternatively, the elite had secret knowledge that the message was more of a misdirection, and whoever voted for the minority colour would be eliminated. The push for a red majority was a gambit to eliminate the non-compliant blue-voting population. The best way to fight back was to ensure the majority was blue instead of red.

Three weeks after the message, the last major country finally made their decision: Russia came out in favour of blue, for unclear reasons. With a population under 2% of the global total, this was not perceived as too much of a threat to consensus. That said, consensus certainly had weakened; estimates suggested the results would be:

  • 70% Red
  • 22% Blue
  • 8% Unknown, but perhaps 70% of these for Red

This gave an estimated consensus of ~75.6% for red.
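As a rough check on how much slack this left, the sketch below works out the expected figure alongside the two extremes for the unknown 8% (all blue, or all red) and compares the worst case with the 62% threshold. The numbers come from the estimates above; the names are illustrative only.

```python
THRESHOLD = 62.0  # consensus level demanded by the message, in percent

def red_range(red, unknown, unknown_red_share):
    """Return (worst, expected, best) red percentages."""
    worst = red                                   # every unknown voter picks blue
    expected = red + unknown * unknown_red_share  # the modelled split
    best = red + unknown                          # every unknown voter picks red
    return worst, expected, best

worst, expected, best = red_range(red=70, unknown=8, unknown_red_share=0.70)
print(f"worst {worst:.1f}%, expected {expected:.1f}%, best {best:.1f}%")
print(f"worst-case margin over the threshold: {worst - THRESHOLD:.1f} points")
```

Even if every unknown voter chose blue, red would still sit 8 points above the threshold on these figures, so any real danger had to come from committed voters changing their minds.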

At this point it is worth noting some other large-scale behaviours that deviate from the no-message paradigm, which we term ‘distractions’.

Distractions

Interest in space tourism picked up as some began to investigate the best method to get themselves and their loved ones off planet for the time of the vote, despite this not allowing them to escape their fate in any way. Interest in more traditional but equally ineffective bunkers also boomed.

Scams proliferated almost immediately, such as:

  • Fake bunker investments
  • Elimination insurance
  • Spoof government vote-registration links for phishing purposes
  • Incentive programs offering cash in return for a red vote (with an up-front fee)
  • Gurus who claimed to be receiving follow-up messages and charged for their insights

Progress towards mitigating climate change stalled on multiple fronts as many people considered it moot; it would either be solved, or humanity would be eliminated.

Several new research fields were initiated to investigate the poorly understood threats some had learned of from the message. The obvious approach was to identify the areas of expertise shared by those informed of each threat, and then use that to infer the nature of the threat itself. This proved almost entirely fruitless; for example, those notified of the crab singularities consisted largely of marine biologists, computer scientists, and a professor of experimental literature. The one exception was water decay, which was then accidentally initiated in an experiment. The water sample and the apparatus used to create it disappeared; the experiment was indefinitely postponed.

The overall economic impact was chaotic. Work resignations rose as the vote drew nearer, and there was a marked increase in more expensive recreational pursuits; many expected that humanity would be eliminated when the time of the vote came (or would perhaps be ushered into some sort of golden age), so they might as well enjoy themselves. Consumer saving (especially pensions) began to fall off while house sales and borrowing accelerated. Stocks and other assets were sold off at an increased pace, leading to market crashes. Irrational panic buying affected various goods in different regions – often fuel, but in some cases ammunition, food, e-scooters, and contraception. Some governments accelerated borrowing, others lending, all depending on levels of optimism for consensus and different approaches to risk mitigation.

Many countries with weak consensus proposed emergency laws/elections, and in extreme cases were subject to coups – all with the stated aim of improving the consensus. These political upsets interacted with the unprecedented consumer behaviour to drive novel effects not previously known to human economists, such as predator-prey public/private feedback loops, organised crime microbooms, and, inevitably, non-transitive Gödelian asset violations.

The Turn

As the weeks rolled on, the idea that some nations might be able to profit from the vote began to gain traction.

Most benignly, there was a subtle dilemma regarding surveys. Some advantage could theoretically be gained by being less public about survey results: less clarity on voting intentions could spur other nations or organisations to make greater efforts to ensure consensus. For example, if global consensus was estimated at 75% ± 2%, little further effort might be made; if instead it was 75% ± 10%, the motivations would be stronger. Since the direction of consensus (red) was already obvious worldwide, a more exact measure of it might actually do more harm than good.
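To put numbers on that example, the sketch below compares the pessimistic end of each estimate with the 62% threshold from the message. It assumes the ± figures are simple symmetric margins, which the story does not actually specify, and the function name is made up.

```python
THRESHOLD = 62  # percent, from the message

def worst_case_margin(estimate, uncertainty):
    """Points of headroom above the threshold at the low end of the estimate."""
    return (estimate - uncertainty) - THRESHOLD

for uncertainty in (2, 10):
    margin = worst_case_margin(75, uncertainty)
    print(f"75% ± {uncertainty}%: worst-case margin of {margin} points")
```

The headroom shrinks from 11 points to 3, which is the kind of gap that might plausibly scare other nations into working harder.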

Less benignly, that reasoning could be extended further: actively underreporting intended votes for red, at least to a certain extent, could lead to better consensus efforts by others.

We refer to the nations with the strongest leverage in these regards as the Quiet.

At the same time, it became clear that any sufficiently large nation with strong control over its media could use this strategically: they could declare that they would only issue guidance for the current consensus if certain international agreements were changed. It was widely assumed that this was why, three weeks out from the deadline, China (representing 18% of humanity) announced it was suspending guidance and would advise citizens how to vote nearer the time. This move was swiftly mirrored by smaller nations with similarly strong media controls. No demands were made; the official reasoning was that they considered it prudent to decide their guidance when greater clarity on consensus emerged. We refer to this group of nations as the Undecided.

(Of note, while India similarly accounted for around 18% of the population, it was considered to lack the survey accuracy or media control to gain significant leverage as part of the Quiet or the Undecided).

The strategy of the Undecided incentivised the strategy of the Quiet and vice versa, resulting in a Nash equilibrium of global consensus uncertainty.

Final weeks

At an individual level, alignment with each colour calcified. Communities that had previously shown flexibility while a dominant colour was emerging now associated the colours with distinct tribal identities. Switching to achieve consensus was becoming ever more important, yet doing so had become a threat to group identity.

The economic effects began to accelerate. Both products and services widely saw a 15%-20% decline in the commencement of any projects due to start after the date of the vote. On top of this, voluntary unemployment was starting to climb above 10% in many nations, causing food/service shortages and significant supply-chain issues. Alongside the tension regarding tribal colour identities and the debates on the Quiet/Undecided strategies, some countries drew close to civil war, but in most cases avoided it.

Among the more powerful nations, serious consideration of more aggressive interventions began shortly after the emergence of the Quiet/Undecided dynamic. In the anticipated negotiations between data access from the Quiet and citizen voting mandates from the Undecided, a stick could be as good as a carrot. Notably, a very simple method could be used to tip the balance: the mass elimination of nations or peoples expected to vote against the desired consensus. With the fate of humanity on the line, some saw this as a moral imperative.

Three types of aggressive intervention were considered.

  • Most obviously, proven technologies or processes, such as nuclear weapons and state blackmail.
  • Speculative research programmes that could potentially make rapid progress with accelerated funding, such as weaponised drone swarms, large-scale sonic weapons, and targeted viruses. However, the timescale limited meaningful progress on these fronts. Attempts to weaponise water decay were naturally fruitless.
  • We will cover the third intervention category in the next section.

Endgame

Two weeks before the day of the vote, the Undecided made their move as a collective: they would direct their citizens to vote red (so exceeding 62% consensus globally) on the condition that the Quiet first reveal their survey data, and – more significantly – that after the vote any newly revealed technologies should be preferentially shared with the Undecided. China’s unique position of an extremely large population and a tightly controlled media meant they could push for concessions most credibly.

There were two clear problems with such a request.

Firstly, if there was a chance these demands could precipitate a consensus failure, all would be moot, as failing to reach the consensus would mean the elimination of all humans.

Secondly, there was no obvious way to enforce the sharing of new ideas. The widely anticipated method of knowledge dispersal would be a variant on the message: just as those with the knowledge to comprehend them were advised of specific threat protections, presumably those best able to understand the new knowledge would receive it in much the same way.

This drawback was anticipated by the Undecided.

There were layers of proposals to gather this information, ranging from the clearly unacceptable to the rather mild. For example, at the time of the message, many humans were undertaking various forms of brain activity monitoring, almost all of which subsequently showed distinct patterns as the message was being received. One proposal therefore required the continuous surveillance of brain activity of all scientists, followed by the prompt sharing of any information received. As a step down from this, all scientific journals could be required to run paper submissions through a panel composed of top scientists from the Undecided ahead of publication. At the mildest end of the spectrum there was a proposal for an annual, global conference for all new knowledge gained, to be held in Beijing.

The stakes of this negotiation were also lowered thanks to a graduated approach to messaging. The Undecided had earmarked each of their component nations (or individual regions in the case of China) for separate, timed broadcasts on vote guidance that could be issued one way or another, depending on the progress of negotiations. This could play out against similarly incremental concessions from the Quiet.

Negotiations began, with all aspects of the above on the table.

More traditional points of negotiation such as trade and energy agreements were less important, as their existing complexity could not be addressed under the tight deadline of the vote.

The threat of more drastic interventions formed a more subtle part of the negotiation process. The prospect of mass population elimination receded as the ‘character-test’ hypothesis gained general acceptance: it seemed likely that mass warfare would disqualify humanity from the test, regardless of the vote. However, a plausible threat of less deadly interventions could still drive an advantage at the negotiation table.

In the final days, various unusual hazards emerged in the most prominent negotiating countries, and it was widely assumed these could be read as threats; this is the third category of aggressive intervention. The most notable examples of this were as follows.

Disappearance in Australia

A prominent minister on the Australian National Security Committee disappeared overnight, along with their close family. More significantly, their digital presence was erased at the same time: Facebook accounts deleted, Wikipedia entries (and even their edit histories) removed, personal websites deregistered, records removed from Ancestry.com, and historic online articles from many large news sites disappeared. Less obviously, the minister’s pre-emptive obituaries held on standby by many national news organisations also vanished, and even their remarks in the Australian parliament’s Hansard unaccountably disappeared, leaving almost no digital trace they had ever existed.

Demolition in China

In China, the broadcasting system in two residential tower blocks in different cities unexpectedly issued an evacuation order; fifteen minutes after the evacuations, the buildings collapsed without apparent cause. This coincided precisely with the unexpected collapse of three large tower blocks in a ghost city, which had been scheduled for demolition in several months’ time – but before any explosives had been planted.

Isolation in the US

In the United States, a relatively small town (population 8,000) found itself cut off overnight in more ways than one. Seemingly natural incidents blocked all roads in and out (bridge collapse; fallen powerlines; localised flooding due to a burst water main). Telephone, electricity and internet access were also cut off. The mayor and deputy mayor both fell ill with flu-like symptoms, as did the chief of police and half a dozen other local leaders. It was only after contact was re-established 24 hours later that further, more troubling aspects emerged. An unusually large number of people who had planned to visit the town that day had changed their plans for a wide range of reasons, and most of those who stuck to their plans ended up turning back on encountering the obstacles. Many friends and relatives of the town’s population received normal-seeming text messages or saw plausible social media posts from them that they had never actually sent – and often these spoof messages served to dissuade or delay planned visits to the town.

While nobody claimed responsibility for these incidents publicly, subsequent analysis of the ongoing negotiations between the Quiet and the Undecided suggests that these were indeed instigated deliberately and were meant to demonstrate capabilities which could be unleashed at a wider scale; as such they served to boost their respective nations’ negotiation position.

One way or another, progress was made, with both sides offering concessions in the days leading up to the deadline. With just one day to go, global consensus seemed to be converging at around 64%, albeit with an error margin of roughly ± 2%.

Final Problems

A few patterns of behaviour complicated the attempt to converge on consensus through negotiation.

Several independent studies had found that people with a voting intention of red were 57% more inclined to change their vote if they thought it would help consensus, compared to those with a voting intention of blue. Consequently, there were some calls to alter the global consensus to blue, as this could be more easily achieved. While this failed to catch on, some ongoing surveys did show an increased rate of switching in this direction, possibly by individuals hoping to signal their willingness to change should they be called upon to do so.

Relatedly, subconscious bias studies suggested that as many as 4% of people were lying about their intended vote. The lead hypotheses were that red voters might pretend to be blue to spur greater consensus efforts/negotiation; particularly committed blue voters (especially those subscribing to the Moderate and Anti-Consensus theories) might report red to reduce those efforts.

Subconscious bias studies were also used to assess the likely voting behaviour of those declaring themselves unsure, but these were unfortunately not representative. These studies were often conducted on student populations, or in countries where most of the ‘unsure’ were natural consensus/red voters who were choosing to be cautious and await greater certainty ahead of the vote. In contrast, the unsure population globally were largely in places where consensus communication was simply poorer. These people were much more likely to choose blue than red, simply because blue was in general a preferred colour among most human populations. As such, estimates of how the unsure would vote were too optimistic.

Finally, both the Quiet and the Undecided undertook clandestine activity to try to tip the balance. Various methods were deployed to try to encourage citizens of the Undecided to vote red (e.g. airborne leaflet drops, social media sock-puppets, illegal radio broadcasts). Meanwhile the major organisations running the surveys among the Quiet saw attacks of many kinds: data-theft (primarily by phishing), but also tampering with stored data, undermining their ability to report survey results correctly. Some organisations detected these attacks and began counter-measures, with decoy data-sets and only a trusted few knowing which could truly be relied on. In the last days of negotiation, some of those trusted few fell mysteriously ill. These effects led both sides to reach inaccurate conclusions on the level of consensus in the crucial final days of negotiation.

On the last day, negotiations finally closed at an estimated consensus of 63.5%.

Results

The vote was conducted in the normal manner, using the same approach as the message. The final tally was 61.8% red, 38.2% blue.

The consensus threshold of 62% was not achieved.

Following our usual protocol, all human memories and all digital and physical evidence of the prior 64 days have been overwritten with our baseline forecast of natural events for the time period.

In the highly unlikely event of humans attaining Type III habitat expansion, a revisit should proceed as standard. Otherwise, we do not see any grounds to recommend a reduction to the standard revisit period of 212 solar cycles.

  • Report ends.

Tim Mannveille tweets as @metatim and doesn’t often write things like this, but you might appreciate Paradoxic Fandom. Thanks to Josie, Matt P, Ben H and Ian H from Hutch for their feedback on this story.

Categories
analysis

Paradoxic Fandom

Q: What’s the difference between Star Wars fans and Star Trek fans?
A: Star Trek fans don’t hate 90% of their movies!

It’s funny because it’s true!

As a life-long Star Wars fan, this joke resonated with me: my position of finding a lot to enjoy in literally every Star Wars film now seems highly unusual. It also raises an interesting question. What does it mean to be a fan of something you are mostly angry about? How does that happen?

I’ve been thinking about this for a few years now.

Here’s what I’ve figured out.

That’s No Moon

This phenomenon is much bigger than Star Wars, if such a thing is possible.

As an area of work and personal curiosity, free-to-play mobile games show a particularly stark example of what I’m calling ‘Paradoxic Fandom’. Studying the games that have found greatest long-term financial success, I noticed that almost all their player communities tended to have the same repeating refrains:

  • Every update makes the game worse
  • The game is dying
  • The developers are out of touch with players
  • The developers only care about ‘x’ players (x is either new players, or the biggest spenders)

Where I work, it was particularly noticeable that we didn’t get much feedback like that until we came up with our first truly successful long-term game.

I’ve seen Paradoxic Fandom elsewhere too.

  • The dominant feedback on most social media platforms (Facebook, Twitter etc) is that all changes to that platform are for the worse… but people keep using them.
  • The ‘Bad webcomics wiki’ seems like the go-to place for fans of a webcomic to complain about how much they hate it.
  • In Future Shock (2014), a documentary about the long-running UK anthology comic 2000AD, one of the historic editors draws a distinction between ‘readers’ and ‘fans’; he implies that the fans were the most difficult to deal with.
  • I keep in touch with developments in LEGO through the blog From Bricks To Bothans; it became apparent that main writer Ace Kim (since 2002!) has a similar love-hate relationship with LEGO (sample: a review of the ‘Ultimate Collector Series’ LEGO Star Destroyer, ending with “Things like this re-affirms my decision to stop collecting LEGO”)

It’s not just statistical regression

There are two phenomena that look a bit like what I’m talking about, but are meaningfully different: the Sophomore Slump, and its close relative the Sports Illustrated cover jinx.

The Sophomore Slump occurs if someone (or a group of people) performs worse when they have less to prove. Having found success on their initial effort, they may try less hard for the follow-up, be it students in their sophomore year, or bands on their second album.

The Sports Illustrated cover jinx is when an athlete performs exceptionally well, gets featured on the cover of the magazine, then has a disappointing performance immediately after. It’s possible that some athletes find the additional scrutiny difficult to deal with, but this seems much more likely to be a simple case of regression to the mean: in anything where there’s a fairly strong random component to performance, an outlier is most often followed by a more average result.

This can apply to almost anything. For example, I found out the excellent line “I’ve come here to chew bubblegum and kick ass… and I’m all out of bubblegum” was from the film They Live (1988), but when I eventually watched it, literally nothing in it was as good as that line. Regression to the mean!
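As a rough illustration of regression to the mean, here is a toy simulation. It assumes each athlete's performance is a fixed skill plus fresh random luck each season (both drawn from a standard normal distribution); the function name and all the numbers are made up for the sketch. The 'cover stars' are simply the top 1% of performers in the first season.

```python
import random

def simulate_cover_jinx(n_athletes=10_000, top_fraction=0.01, seed=0):
    """Toy model (an assumption): performance = fixed skill + random luck."""
    rng = random.Random(seed)
    skills = [rng.gauss(0, 1) for _ in range(n_athletes)]
    # Both seasons share the same skill but draw fresh luck.
    season1 = [s + rng.gauss(0, 1) for s in skills]
    season2 = [s + rng.gauss(0, 1) for s in skills]
    # The "cover stars" are the top performers in season 1.
    n_top = max(1, int(n_athletes * top_fraction))
    top = sorted(range(n_athletes), key=lambda i: season1[i], reverse=True)[:n_top]
    avg1 = sum(season1[i] for i in top) / n_top
    avg2 = sum(season2[i] for i in top) / n_top
    return avg1, avg2

before, after = simulate_cover_jinx()
print(f"cover stars, season 1 average: {before:.2f}")
print(f"cover stars, season 2 average: {after:.2f}")
```

In this model the cover stars are still better than average in the second season, but noticeably worse than their own first season, even though nothing about them has changed. No jinx is required.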

But Paradoxic Fandom is an ongoing, sustained effect, so it’s not just a statistical regression/slump/jinx.

Paradoxic, not yet toxic

When consumers begin to harass or threaten people they perceive as damaging the thing they love, they have crossed a line into what gets termed ‘Toxic Fandom’. Here, I’m interested in the wider phenomenon, which can easily give rise to Toxic Fandom but is meaningfully distinct. So I’m calling it Paradoxic Fandom, and the key defining features are:

  • The product/service is ongoing over months, years or even decades
  • Fans continue to consume and engage with the product
  • … but they report finding the product/service consistently disappointing, getting worse over time
  • … and they believe this is because the creators are out of touch with the fans

So what is going on here?

I think this actually raises four related questions:

1) Why do companies do things badly?
Why can’t video games harness everything they learned and get better with every iteration? Why, when there is clearly money to be made in satisfying a large audience, does capitalism fail to deliver?

2) Why do consumers misjudge things?
If criticism is unwarranted or short-sighted, why does that happen?

3) Why do people remain fans/consumers of things they seemingly hate?

4) Even when responses are mixed, why does criticism dominate the discourse?

I’ve been thinking about this ever since the borderline-allergic reaction of some ‘fans’ to The Last Jedi (2017). Here’s what I think is behind each of those questions.

1. Why do companies do things badly?

What about the money?
If something is a financial success, the budget for follow-ups is likely to be bigger, which in theory should help. But while money helps with execution, in artistic endeavours it’s very clear that money cannot buy ‘quality’ (whatever that is) – if it could, movie and game studios wouldn’t lose the most money on the big-budget failures, but that’s exactly what happens.

The second Death Star: bigger budget, worse results.

You can’t please all of the people all of the time
The follow-up thing will certainly have similarities to the original thing, but also differences; the people who loved the first thing will have liked different things about it, so inevitably some will be disappointed.

More money, more problems
Companies generally try to make more money – like LEGO looking to grow their business, or a free-to-play game looking to maximise profits. Much as capitalism is built on the lucky fact that in the right environment, competitive self-interest can produce great results for everyone, the end-result is always some kind of compromise between buyer and seller.

One simple factor is that a company will try to make more money out of their customers. Most benevolently this could be by making further content, but there are less benevolent ways too (most simply, releasing a DVD with two different slip-cases to try to get fans to buy two copies).

But that can only go so far. Generally, the best and most scalable way to make more money is to find more customers. It’s quite possible that having attracted all the customers you can with your existing product, you’ll need to make some changes to attract larger numbers, which brings us back to not being able to please all the people all of the time.

2. Why do consumers misjudge things?

I should make one thing very clear: when people complain about things, it is often worth listening, and very productive action can be taken as a result. But sometimes, as consumers, we do misjudge things, and that criticism is less useful. Why does that happen?

One thing I’ve found is that making things – really, almost anything – is always more difficult than you expect.

I see a connection between the Gell-Mann Amnesia effect (read a newspaper article about something you know and see how wrong it is; go on to assume everything else you read is just fine) and the Dunning-Kruger effect (non-experts have a sense of illusory superiority, because they don’t know enough to know better).

In whatever area you work or have expertise, you should be able to identify how wrong most people are about it. Most obviously I find this in any question that begins with “Why don’t they just…?”

A personal example: I want a new mobile phone, and battery life is much more important to me than weight or how thin it is – but I can’t find such a phone on the market. Why don’t they just make a version of a great phone that’s much thicker in order to have a bigger battery?

But if I try to imagine the sorts of things I’d know if I designed phones, I can imagine that there might be complicated cooling issues with a thick battery; or perhaps they make trial-concept versions of potential new designs and have people use them for a few weeks, and discover that even if you think you’d be okay with the weight/thickness trade-off, it’s ultimately too annoying. I can even more easily imagine there’s a reason that I can’t conceive of at all!

Knowing the answer to these questions about one’s own areas of expertise, one should really extend it to other areas. When you start to ask “Why don’t they just…” you should remember Gell-Mann’s newspaper experience and the Dunning-Kruger effect, and consider if perhaps you just don’t have the expertise to spot the problems with your idea.

Another example, from an area of personal expertise: in anything involving code, new updates bring new bugs. The frequent response is “Why don’t they just test it properly, and make sure all the bugs are fixed before they put out the update”. The problem with this is that, on average, it takes longer to find each incremental bug than the last (things that occur one time in ten take longer to find than those that go wrong every time; things that only happen under very particular circumstances only turn up if you test extremely large combinations of actions, etc). As such, to truly guarantee all bugs are fixed would take so long that the customer waiting for their bug-free update will have moved on long before it arrives. In the real world, ‘the perfect is the enemy of the good’, and compromises have to be made to get anything done.
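To put rough numbers on that, here is a minimal sketch. It assumes each test run independently triggers a given bug with a fixed probability, so the expected number of runs needed to see it once is one divided by that probability (the mean of a geometric distribution). The list of trigger probabilities is hypothetical.

```python
def expected_runs_to_find(trigger_probability):
    """Expected test runs until a bug first shows up (mean of a geometric distribution)."""
    return 1 / trigger_probability

# Hypothetical bugs, from "goes wrong every time" to "needs a very particular combination".
bug_trigger_probabilities = [1.0, 0.1, 0.001, 0.00001]

cumulative = 0.0
for p in bug_trigger_probabilities:
    runs = expected_runs_to_find(p)
    cumulative += runs
    print(f"trigger chance {p}: about {runs:,.0f} runs to find it, "
          f"{cumulative:,.0f} runs in total so far")
```

Under these assumptions almost all of the testing effort goes on the rarest bug, and there is always the possibility of an even rarer one behind it.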

Follow ups like “Why don’t they just hire more/better testers”, “Why don’t they let players/customers test it first” run aground in a very similar way. (Don’t forget though, there might actually be reasonable ways to improve things a bit! Customers are great at identifying problems, but it’s on you to figure out the solutions).

So we see why professional efforts aren’t improved as simply as consumers often assume. I think there are four other reasons consumers can be disappointed by updates: expectation failures, familiarity contempt, confirmation bias, and in the longer term, simply ageing. Here’s how those break down:

Expectation failure
A combination of factors tends to make us inordinately sensitive to our expectations not being met. For example, I use the internet on my train commute, and one time the train went into a tunnel, so the internet cut off, and I immediately felt annoyed. I realised this was a ridiculous response: I’ve made the journey hundreds of times, I’ve literally made a spreadsheet to identify where in the journey internet access drops. But in that brief moment, my expectation of continued internet access failed, and therefore in that moment I was annoyed.

In creative media this seems most obvious in movies, where the most significant factor in how much someone enjoyed a film seems to lie less in its objective qualities than in how it compared to their expectations.

As noted above, a follow-up thing must be at least a little different to the original thing. Expectations are based on the original thing, so some expectations can’t be met – so some disappointment is inevitable.

Familiarity Contempt

‘Familiarity breeds contempt’ sums this aspect up well.

The IMDb ‘goofs’ section for 1977’s Star Wars Episode 4 is one of the longest of any film. But this is not because it was a terribly made film. It’s because so many people love it so much, it has received orders of magnitude more scrutiny than most other films.

I think this applies to the LEGO example, and especially to successful mobile games. These superfans are so close to the material they see every flaw. Something that seems perfectly functional to a casual player could be perceived by the superfan as being riddled with bugs.

So this is another effect: the people that engage the most with a thing will also be the most knowledgeable about its flaws.

In a fan screening, the completely missable moment this stormtrooper bumps his head generates a wave of laughter, as everyone knows where to look for it! It’s great.

Confirmation bias and performative criticism
Reality is complex and nuanced, so I think we make it more digestible by applying confirmation bias. If there’s something/someone we think is good, we are more likely to overlook or discount their flaws; if we judge something to be bad, every fresh piece of evidence is another chance to spot something wrong with it. It takes active, conscious effort to try to maintain a balanced view.

As a thought experiment, when was the last time you changed your mind about something? I find this to be alarmingly rare in myself. How likely is it that my immediate judgement of something was wrong and should have been corrected, once further evidence emerged? If I’m very optimistic, maybe not that often – but certainly much more than actually seems to happen.

I think this applies to many artistic endeavours. If for any reason your judgement on the creator of something has soured, you are likely to apply confirmation bias to their subsequent works and seek out their flaws to confirm your belief.

With art taking many forms, this produces an interesting corollary: the sooner you experience a follow-up work to something, the more likely you are to apply confirmation bias and continue to enjoy it – or hate it. If you are enjoying a TV series, I think a new episode is more likely to benefit from positive confirmation bias than a new series, which in turn is much more likely to maintain positive bias than a revival of the series many years or decades later.

Let the hate flow through you!

Related to this, I noticed that many criticisms of The Last Jedi in particular were bad-faith interpretations of plot, applying a level of scrutiny no screenplay could survive – an example of negative confirmation bias. It seems as if there are incentives to enact a kind of performative criticism once you flip into negative confirmation bias. So as an experiment, I wondered what it would be like to apply this to that sacred text, Star Wars Episode IV: A New Hope (as it was called in the re-release; just ‘Star Wars’ in the original release, which is the one post-release change that interestingly escapes criticism).

Please put on your flame-proof goggles and hold your nose as I have at it.

There’s so much wrong with this film it’s hard to know where to start, but let’s just go from the beginning.

Darth Vader is a terrible villain and an idiot. His “big entrance” is to come in after a bunch of cannon fodder have already done all the fighting, and then he does some kind of useless interrogation on someone that’s already been subdued. More importantly, he’s chased this ship to Tatooine, where the rebels presumably have a contact, and when the Death Star plans are obviously sent down to the planet in an escape ship, what does he do, as the guy with the ability to sense things with the Force? Just send down a bunch of useless stormtroopers to walk around asking if anyone has seen any droids lately?! What exactly is Vader doing while that’s going on – did he have to leave in a hurry to get to the big exposition meeting in the Death Star?!

And then later on, the Falcon, the very ship that escaped Tatooine – obviously with the plans –  gets caught by the Death Star; Vader is right there standing in front of it, he senses ‘something’… and then walks off?! Oh it’s fine, we’ll just send in the guys with the scanning machine instead! But wait, what’s with this scanning machine you have to physically take into a ship to operate? At the start when the droids take the escape pod, the Imperials scan it remotely for life forms, so why can’t they do that in the Death Star?

Okay, how about the heroes. Well, Luke is the most pathetic magical orphan character ever conceived. There’s literally nothing likeable about him. All we know is he apparently wants to get off the planet to… join the academy (what, the Imperial academy? What kind of ideal is this?!), but all he does is slouch around and whine about his chores. His step-parents are then apparently roasted alive (weird tonal imbalance, BTW) and he’s sad for about two seconds. People talk about how he’s a great pilot, but the first time we see him in a vehicle he’s in a land speeder keeping an eye on the scanner – while C-3PO drives?! Whatever happened to “Show, don’t tell”?

Oh, but he’s magic! Obi-Wan teaches him to ‘stretch out with his feelings’ and suddenly he can hit a target no computer can hit, in a spaceship he’s never flown before! And also, by the way, despite never having fired a gun, he’s actually a crack shot, able to shoot out door controls from across a room, and out-shoot trained soldiers in multiple encounters! Also: he’s given an incredible weapon in the form of a light sabre… and then literally never uses it. Has Lucas never heard of Chekhov’s Gun? Why didn’t Luke use his light sabre to escape the trash compactor?

Oh, but the film has such a great back-story, right? Obi-Wan Kenobi has been apparently waiting for Luke to grow up so he can give him his father’s light sabre, and train him in the force. But, er, what was he waiting for exactly? Could have started a little earlier maybe? If Darth Vader had actually come down to sort things out Luke would have been killed before they even met! As it is, Obi-Wan literally gets in about 2 minutes of training before dying! (Another tragic death which Luke shrugs off in less than a minute, before running off to man the ship’s gun-turrets, which, by the way, is yet another thing he’s never done before that he’s apparently great at).

Alright, how about the Death Star. This whole thing is one giant plot hole. So apparently you can fly around the galaxy at faster-than-light speed in a station the size of a small moon? Why even build a hugely expensive space station then? Just stick a light-speed engine on a moon and fly it straight into any planet you don’t like! Oh, but I guess if that’s possible maybe the planet could just fly out of the way with their own giant engine!?

And how about that Death Star security? Literally any random droid can unlock doors and operate machinery from any random access port? Except tractor beams, which can only be disabled from a lever on a vertiginous ledge for some reason? And doing any of this doesn’t notify anyone anywhere apparently? And there’s so little CCTV that a bunch of random idiots can run around and you don’t even know where they are? They literally escape a dead end by jumping through a waste pipe and nobody can figure out to just throw a grenade down there?!

So probably the most sensible character is Princess Leia. She spends the whole time being rude to everyone she meets, but she’s at least smart enough to figure out they are being tracked when they leave the Death Star – but what does she do about it? Try to find and remove the tracker, or maybe go to another system and switch ships? No, let’s just go straight to home base, it’s fine, we’ll have at least half an hour for our techs to figure out a weakness in this giant planet-destroying space station that can be exploited by, er, about 30 small ships! I’m sure we can do that before it blows us all up. Wow! Good thing the plot wants this to succeed or this rebellion would be over!

And then finally, the big climax: the trench run. First, what even is this trench? And the secret weakness – literally one torpedo in the wrong place on the outside of the station blows the whole thing up? What kind of design is this exactly?

Given this lucky gift, what do the rebels do? Take it in turns to fly really far away from the exhaust port, and then spend ages flying along this trench to reach it, so the enemy can take them out one by one?! If you have the element of surprise, why not go straight to the exhaust port? Or why not all go at once? And even if there was some reason for the long run-up, the day is apparently saved when the Millennium Falcon comes in at the last second and shoots the bad guys at the end of the trench – er, literally any of the other X-Wings could have done that on any of the previous runs?

And…. breathe.

So, what just happened there? Wasn’t that far too long? Why yes, yes it was. Writing it was easy, fun, and made me feel smart, so I wanted to keep going. I can actually now understand why someone might read/write things like this over a prolonged period of time, rather than positively engaging with something they enjoyed. If it seems hard to imagine, I recommend giving it a go with something you know well yourself!

(Edit: I’ve seen cases of people reading this post and then arguing against the points raised in the screed above. This is the opposite of the point! Almost none of those arguments ever occurred to me before – I only thought of them when I considered the film specifically with an intent to pick it apart. I could argue strongly against them myself! Rather, the point is instead to consider a film you love, and then try to find the problems with it. Going through this exercise was very revealing for me as I noted above. It’s also interesting to examine how you’d argue against your criticisms; if you find yourself extrapolating beyond what is explicitly shown in the film (e.g. the mechanics of different life-form scanners), or drawing on other material not in the film itself (e.g. Anakin’s history with Tatooine), do consider that these approaches could also benefit films/games/whatever that you are less inclined to extend the benefit of the doubt to – T.M. 13th Jan 2022).

It’s not them, it’s you: fandom vs. ageing

On a time scale of multiple years or even decades, a newly significant effect enters: the viewer/consumer/player themselves has significantly changed. I suspect this is where Star Wars suffers the most. The main saga films are (I think) most enjoyed by children; when a new trilogy comes out 16 years after the last, those children have grown up!

I remember many Star Wars fans who were disappointed by the prequel trilogy nonetheless really enjoying the 2003 animated TV series. This had a lot of over-the-top action that would never stand scrutiny in live-action form, but I suspect the animated format bypasses a lot of the adult reality-check apparatus, and allowed these folk to be childishly delighted in the way they originally were.

I particularly remember Mace Windu taking down an army of battle droids in a way that I am confident would not have been received positively in live-action form.

This became particularly evident with Disney’s more recent Star Wars sequel trilogy. I’ve seen many young fans citing the prequel-trilogy as a superior era (echoing – yet reversing – the response to the prequels at the time, when they were reviled by fans of the original trilogy). This is also evident from the comments endorsing the Scene 38 Reimagined video – a fan video replacing the somewhat feeble 1977 Obi-Wan vs Vader lightsaber battle with something more like the prequel or sequel trilogies (and which would be utterly out of place in Episode 4).

So there are many reasons that account for strong criticism from fans, but don’t get the wrong idea: that should never be taken as an excuse to disregard all criticism! Criticism from consumers is often valid and useful for companies to heed (especially that familiarity/mistakes one) – you just have to take these effects into account.

3. Why do people remain fans of things they seemingly hate?

See what I did there

So, companies disappoint people because money can’t buy quality in artistic areas; and also because you can’t please all the people all of the time but companies want to continue and grow. Fans will be disappointed by new works due to expectation failure, confirmation bias (earlier disappointment drives fresh disappointment), familiarity contempt, and ageing; their analysis of the flaws may well be completely valid and useful feedback, or may be flawed due to the Dunning-Kruger effect.

The obvious conclusion is that people would simply stop consuming things they hate and find something new. But evidently, in many cases, that doesn’t happen. Why?

First, I think it’s a clue that everything affected by Paradoxic Fandom is in the arts or services; products don’t seem to have the same issue.

My friend John Broughton referred me to the 1970 book “Exit, Voice and Loyalty” by Albert Hirschman. To paraphrase significantly, “exit” means a customer stops using the service, and “voice” is what happens when “exit” is not possible: the customer voices their feedback on what they don’t like in the hope it will improve. I think this explains why products get off the hook – exit is fairly easy, you just buy something else.

In creative products, there is no easy exit. If you don’t like the sequel to a game or film you love, there is no alternative version you can switch to. In services, switching is either impossible (you can’t take everyone you interact with on a social network with you) or has a high enough cost that it’s better to stay. In the games-as-a-service model, it’s amplified further: when an online game updates, there’s no way to carry on playing the old version you liked better. To keep playing you have to accept the changes.

A fascinating exception that proves the rule: Runescape found a way to work around this. They released ‘Old School’ Runescape, recreating the older version that most fans first fell in love with – while continuing to develop the newer version. They now continue to develop the ‘Old School’ version, but all changes in it must pass a majority vote by the players.

So if you love a thing, and it changes in a way you don’t like, and you can’t switch… well of course you’ll be vocal about it! This is perfectly reasonable!

However, as “Why don’t they just…” is almost always unfounded, and as cultural products must appeal to more than just one person, very often that feedback won’t (or even can’t) be heeded. At this point, things can get pretty rancorous.

4. Why does criticism dominate the discourse?

From Paradoxic Fandom to Toxic Fandom

SamSykesSwears summed up the stages of toxic fandom (I think referencing the pattern of abusive relationships) in this tweet:

  1. I love this
  2. I own this
  3. I can control this
  4. I can’t control this
  5. I hate this
  6. I must destroy this

For example, the changes George Lucas has made with each edition of the original Star Wars trilogy particularly provoke “I can’t control this”. The clue to the irrationality here is how in many reviews, literally every single change is reviled. It seems improbable that literally every change the original creator would make to their creation before it came out was good, and every one after is bad. This looks far more like a near-religious adherence to some sort of holy text.

I think this was insightfully extended in a response from EricVBailey:

  • I can’t destroy this
  • I am even more mad now
  • I can harass those who are part of it though
  • I found a whole community of people doing this
  • I have found my validation
  • I love this


Self-selection and feedback loops

Even given all the above, it seems odd that so much online discourse, at a casual glance, tends towards negativity, and especially the more toxic end of it. I think two things are at work here.

One is self-selection. This is easily seen in Amazon reviews of products: the majority of reviews seem to be people who only just got it (so have nothing useful to add), or who have had some terrible problem. This is because both of those moments are cues to leave a review. Using something and having it work just fine does not prompt you to go write a review. Similarly, playing a mobile game or watching a movie and simply enjoying it does not motivate you to review / talk about it online as much as hating it does.

My armored walker was destroyed by primitive weapons… would give 0 stars if I could

So self-selection skews what people tend to write about things.
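
As a rough illustration of that skew, here is a minimal Python sketch. The split between happy and unhappy buyers and the chances of each writing a review are invented numbers, picked only to show the mechanism; they are not measured data.

```python
# Hypothetical numbers for illustration only; none of this is real data.
# Each entry: experience -> (share of all buyers, chance that buyer writes a review)
BUYERS = {
    "works fine": (0.90, 0.02),  # nothing prompts a review
    "went wrong": (0.10, 0.30),  # a problem is a strong cue to review
}


def review_mix(buyers):
    """Return each experience's share of the reviews that actually get written."""
    written = {name: share * chance for name, (share, chance) in buyers.items()}
    total = sum(written.values())
    return {name: count / total for name, count in written.items()}


if __name__ == "__main__":
    for name, share in review_mix(BUYERS).items():
        print(f"{name}: {share:.1%} of reviews")
```

With these made-up inputs, the unhappy buyers are only a tenth of customers but write well over half of the reviews, which is the skew described above.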

The other factor is feedback loops.

There are a few feedback loops online that end up fomenting more toxic discourse. One that has been well-covered (I thought particularly well in Tom Scott’s Royal Institution lecture) is that algorithms optimising for people to spend time on a platform will tend to find success by showing people more extreme and click-provoking content, which is often negative. So YouTube will naturally take you from “10 Things You Missed In The Last Jedi” to “27 Last Jedi moments that made no sense” to “237 reasons I hate The Last Jedi and You Should Too”.

There’s also a very natural social feedback loop that can work in tandem with this. I think for many people, it’s quite scary to make comments you think others will disagree with. If you find problems with something everyone loves, you’re less likely to shout about it; but if drama-optimising algorithms are showing you more people who agree with you, that will embolden you to speak out more (see also: politicians making racist remarks emboldening racists).

My colleague Chris Hohbein saw this play out dramatically in the No Man’s Sky community. After that game’s launch, the community was incredibly toxic (mostly due to the game failing to meet their expectations); as updates to the game improved things, the hate diminished, and positive discussion flipped over to dominate the discourse instead.

Perhaps this doesn’t sound like that strong an effect. As a person who watches the first midnight showings of Star Wars films and feels moved to comment on them in public right after, I can testify that not knowing which way everyone else is going to go does make it feel a bit scary! That said, it’s Star Wars, I really should have figured out the pattern by now…

What I told you was true… from a certain point of view

Paradoxic Fandom certainly doesn’t apply to everything, and some of the above-noted effects operate in reverse – for example, as in No Man’s Sky, confirmation bias can be positive; positive online conversation can beget more of the same.

But there are some particularly interesting counter-examples. For example, from that joke at the start, why is Star Trek different? And what about the ever-expanding Marvel Cinematic Universe? Are they doing things right in a way that others don’t?

In the case of Marvel, I’ve seen the argument that they play it “too safe”. In most Star Wars films, someone significant dies or loses a limb; Marvel films are frequently a battle to retain the status quo, and good guys and bad guys alike usually survive. So they entertain in the moment, but don’t do anything that might upset anyone. Over the last 10 years and 23 films, there have been a handful of notable exceptions to that, which fans seemed to accept. Do they just rock the boat at the right frequency**?

For Star Trek, each TV series will also tend to avoid disrupting its own status quo, but it’s unclear to me why subsequent series in different settings don’t seem to get as much hate as other franchise follow-ups*. This one is a mystery to me. Maybe I should become a Star Trek fan.

*Edit: I am reliably informed that I’m guilty of my own familiarity / contempt bias, and that actually Paradoxic Fandom is alive and well among Star Trek fandom. Prior to writing this, I had taken a brief look over various YouTube trailer comments, subreddits and forums to compare the Star Trek vs. Star Wars communities, and this seemed to confirm my hunch (or bias?), but this was hardly a rigorous study, and multiple people have now informed me of my error!

**Edit: This was written in July 2020. Since then the MCU launched Phase 4, which has been much less successful across various metrics, and is a whole other story. Still, it’s impressive they avoided negative feedback loops for so long and across so many films.

Conclusion: What do we do about it?

As a company / content producer

The biggest impacts on us as consumers are when things are better or worse than we expected. To the extent that you can, you want to exceed expectations. In practice this is tricky – if you hold your best bits or biggest surprises back from the marketing, perhaps fewer people will want the product. If you announce updates to an online game in advance and always under-play things, fans will spot the pattern and then be disappointed any time you fail to overdeliver. If you hold everything back, fans will worry nothing is happening at all.

To any extent possible, you can also be honest about things – educate consumers about the challenges you face. If you’re sure you’re right and they’re wrong, can you explain why clearly? Doing that – without making it an attack – can help both producer and consumer get closer to the more useful aspects of feedback.

More generally, I think it’s helpful to be aware of the above effects. Feedback from fans is extremely useful to help learn and improve; discounting it entirely is unwise. Knowing what drives it in different forms can help you unpick what’s most useful.

As a consumer / fan

I think a couple of Star Wars analogies help a lot here. As an engaged fan, do you choose the light side, or the dark side?

Luke: Is the dark side stronger?

Yoda: No. Quicker, easier, more seductive. […] A Jedi uses the Force for knowledge and defense. Never for attack.

You can use your deep knowledge of the material to engage in performative criticism (attack), or for knowledge and defence: what good-faith analysis can explain plot-holes? What practical aspects of production led to these compromises being made? What can we learn more generally about the craft from these decisions?

Luke: What’s in there?

Yoda: Only what you take with you.

If you go into a new experience ready to attack, you will find material to attack. Going into anything new with an open mind, searching for elements to admire as well as those that justify criticism, is a more enlightening and fun way to go about things.

As an individual trying to find entertainment, I do think this is how you win – not by focussing on what you hate, but on what you love. I distinctly recall spending the first 15 minutes watching The Force Awakens alternating between thinking “that’s not very Star Wars” and “that’s just a rip-off of this earlier bit of Star Wars”, before realising no film could ever walk that line. With that in mind, I was able to enjoy the rest (of that film and the sequel trilogy as a whole) – while still finding a lot to criticise too! I find it highly rewarding to go in looking to enjoy everything I can, while also gaining enjoyment from engaging my more critical faculties later on.

There was just one more thing…

There’s a very important generalisation of Paradoxic Fandom. Consider these more generic forms of the regular mobile game complaints:

  • Every change makes things worse
  • This group/activity is failing and will die out
  • Those making the changes are out of touch with participants
  • Those making the changes don’t care about ordinary people

Does that sound familiar?

This is almost a textbook description of populism in politics! The incumbent government is portrayed as out of touch with the people, the country as failing, and those in power as caring only about themselves, the elite or the rich.

The country you live in has that same crucial quality we saw from “Exit, Voice and Loyalty”: it’s something you can’t easily leave, and all the above-listed effects can apply to politics just as they can to a game or multimedia franchise.

This means something! I just need a few more years to think about that.

Tim Mannveille tweets as @metatim, and sometimes writes about Star Wars on NothingAboutPotatoes