Tuesday, August 31, 2010

Running Your Own "NOVA" - Why Battle Points and Single Undefeated Winners Aren't Necessarily Good Things

OK,

So I'm going to write a series of articles for the number of people out there who've asked about running their own NOVA at RTTs and major GTs, but who have questions about various things ... some of these come up more often than others, so sharing my thoughts on them might be helpful to all.



First off, for this article, a couple of what I consider myths ...


1) Battle Points (BP) and Swiss Pairing (SP) are superior to Opposite Seed (within w/l bracket) Pairing (OSP) and Rating Points (RP). Let's screw with you guys and use acronyms from here on out.

There's a phenomenon that in an open invite tournament you cannot address effectively - Baby Seals.

In the first rounds of a tournament, depending on the size, the odds are high that a couple of things will happen across a number of players:

A) Skilled players will randomly draw Unskilled players (aka Baby Seals) and score very powerful wins off them. In either the NOVA's RP system or the standard BP system, this invariably results in them being a high seed in Round 2.

B) Skilled players will randomly draw other Skilled players and while one will win, both will score very low in the RP or BP approaches.

The unavoidable nature of these issues brings about the question of what you want your tournament to achieve. Do you want it to advance the largest number of skilled players as far as possible, or do you want it to have players bludgeon each other every step along the way?

Also, how much do you value a loss vs. scale of wins?

Some people, those who propose the "traditional" BP and SP approaches, suggest that quality of win is far more important than simply "barely winning" a bunch of games that go to tiebreaker. The NOVA system advocates undefeated as the "best" unbiased determiner. Why do these conflict and where should the emphasis lie?


Let's look at things for a second here:

Presuming A and B above are true, which system is fairest?

For a system that rewards you for SCALE of victory, A and B yield two results:
A - The player who drew the "baby seal" is greatly rewarded for his large win.
B - The player who drew the tough first rounder is greatly punished for his marginal win.

Does this make any sense?

For a system that rewards you for simple victory, A and B yield two results:
A - The player who drew the "baby seal" advances
B - The player who drew the tough first rounder advances

Hrm, this seems to be an easy one ... win/loss should trump scale of win, b/c scale of win places instant long-term value on how easy your first draw was or wasn't, and that's entirely random.


What about pairing, then?
In the NOVA system, the person who clubbed the baby seal gets a high seed, and plays someone with a much lower seed ... which could be B in the example above, but could also be someone weak who simply barely beat another weak player. In that latter example, the person who clubbed the Baby Seal gets by rule ... ANOTHER BABY SEAL.

What about the traditional swiss approach, where the baby seal clubber in the above example plays the person just below him in rating? Is that superior? Doesn't it prevent the baby seal clubber from being rewarded too greatly for his big win?

Well, maybe. Let's talk this one out a little.

If the person who wins against a weak opponent in Round 1 faces another weak opponent, and crushes him (again), is your system failing? Someone who can beat 2 baby seals advances (we haven't proven he can beat a tough opponent yet, but we haven't proven he can't either), and 2 baby seals are now knocked down to the win/loss brackets they'll be more comfortable in.

What happens if someone at the bottom of the rankings who faces the baby seal clubber actually is there b/c of situation B from a while ago above (he faced another tough opponent and barely won)? Well, the baby seal clubber now gets his litmus test, doesn't he? Either he "belongs" at a high seed / in the winners' bracket, or he doesn't.

The catch-all here is that in a w/l situation (no ties) where you pare it down to a single undefeated finisher, invariably anyone who drew lucky rounds earlier on will face a tough opponent and be knocked out. Until then, he'll at the WORST provide you the service of knocking unworthy winners out of the winners' bracket to the spots they belong in. If you go with traditional swiss pairing, the LOWEST RATED winners constantly face each other (as well as the highest). While the worst result of OSP is someone who drew lucky early getting a couple of easy rounds and knocking weak players out, the worst result of traditional swiss is that numerous weak players CONTINUE TO ADVANCE by beating other weak players. Ooops.
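To make the difference concrete, here is a minimal sketch of the two pairing methods within a single w/l bracket. The function names and the seed labels S1-S8 are made up for illustration, and it assumes the bracket has an even number of players and is already sorted best seed first.

```python
from typing import List, Tuple


def pair_opposite_seed(bracket: List[str]) -> List[Tuple[str, str]]:
    """Pair top seed vs. bottom seed, second vs. second-from-bottom, etc.

    `bracket` holds the players of ONE win/loss bracket, best seed first.
    Assumes an even number of players (odd brackets need a pair-down).
    """
    half = len(bracket) // 2
    return [(bracket[i], bracket[-1 - i]) for i in range(half)]


def pair_adjacent(bracket: List[str]) -> List[Tuple[str, str]]:
    """Pair 1 vs. 2, 3 vs. 4, ... (the 'traditional' approach described above)."""
    return [(bracket[i], bracket[i + 1]) for i in range(0, len(bracket) - 1, 2)]


# Eight 1-0 players after Round 1, ordered by seed (hypothetical labels).
one_and_oh = ["S1", "S2", "S3", "S4", "S5", "S6", "S7", "S8"]
osp_pairs = pair_opposite_seed(one_and_oh)
swiss_pairs = pair_adjacent(one_and_oh)
print(osp_pairs)    # S1 meets S8, S2 meets S7, ...
print(swiss_pairs)  # S1 meets S2, ..., S7 meets S8
```

Note how the adjacent method puts S7 against S8, so one of the two lowest-rated winners is guaranteed to advance, while OSP makes each of them get past a high seed first.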

This is one that can be argued both ways ... but the fact remains that any argument oriented around "luck" pairings applies to any system equivalently, and I would contend it applies to the NOVA's approach LESS materially ... simply b/c it is designed to limit the risk (as much as it can) of high seeds knocking each other out or down early on.

Now we get to the point and purpose of a tournament system ...
Is it to have "balanced" matches in every round? Well, if you've been following along, you can't really accomplish that ... any number of variables early on WILL cause unfair matches in the first couple of rounds NO MATTER WHAT SYSTEM YOU USE. This is amplified when the field grows larger. No matter your system, your first couple of rounds should always be considered a "weeding out" process, where people firmly establish themselves in the brackets to which they belong. This is why for the NOVA Open next year, we'll be clearing w/l records and maintaining seeds for the 2nd day of games, giving people a fresh start within their "proper" division.

So, what should be the purpose then? The later you get in a tournament, the more ACCURATE your seedings are. This is practically a fact ... the more games people have won, the more likely they each are to be talented. As a result, your goal should be to keep the highest seeds away from each other AS LONG AS POSSIBLE, b/c people who maintain their high seeds through multiple rounds generally tend to be the better players. Those who drew unluckily against them earlier may indeed have been just below them or near them in skill, but it's irrelevant ... they will be to a degree vindicated when they personally go through only losing once or so, and when their conquerors go on to the very final top table matches.

Any tournament should seek to pit the VERY BEST players in each w/l bracket against each other only in the FINAL round. OSP tends to accomplish this BETTER than traditional swiss, and that is why numerous events use it. It is true that you should ideally have a comparable statistical set to go off, and should seed people properly from the first round, but barring that it is never "ideal" to pair the higher rankings against each other early, b/c it emphasizes the flaws of early random pairing, instead of ignoring them (you can't effectively "de-emphasize" them).



Rambled a bit ... next important subject ... "How do I do this in a tournament locally when I don't have the time and resources for enough rounds to pare it down to one????"

This answer is much simpler - reward all your undefeated players equally as Tournament Aces or Best Generals.

STOP. Why not? Why is it so critical to have only ONE competitive "best" when numerous candidates have yet to be beaten? This is where traditional BP systems are the worst. This is where those lucky early seal clubbings come into their own. Now among, say, 4 undefeated players ... the people who CRUSHED THEIR ENEMIES ZEE WORST are the ones who actually win. Uh oh. What does this do?

A) It discourages close games
B) It encourages "cheating" the scores and giving opponents you "like" max points, especially late, in order to advance them over people you don't like
C) It encourages tanking people on soft scores when those are components of it
D) It encourages beating the snot out of people mercilessly, even when you already know that you've got them "beat" (and this is probably the worst part - it encourages over the top power gaming instead of tight competitive friendly gaming to win, and no more)

This is what a Best Overall is for, my friends. This is why we have Renaissance Man. Best Overall and Best General become contentious because they are just de facto #1 and #2 in a system that includes soft scores as part of it, and rewards people for massacring opponents to score maximum points instead of simply winning out.

Simplify it, actually prove your "equal love for all in the hobby" statements aren't just bs lip service. You will know based upon attendance how many undefeated finishers you'll have ... it's not complicated at all.

Let's pick a random number ... 20 over 3 rounds, let's say

20 ---> 10 ---> 5 ---> 2-3 (#1 vs. #4, #2 vs #3, #5 2-0 vs. #6 1-1) undefeated after 3 rounds; have prize support for 3 "Tournament Aces" or Best Generals ready and if you only get 2, either give the guy who beat #5 a prize also, or split #3's prize among #'s 1 and 2 or the Best Overall/Renaissance Man.
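If you want to run that arithmetic for your own attendance, a rough sketch follows. It assumes no ties, and that when a bracket of undefeated players is odd, the leftover player is paired down and may win or lose (which is where the "2-3" comes from). The function name is just for illustration.

```python
from typing import List, Tuple


def undefeated_range(players: int, rounds: int) -> List[Tuple[int, int]]:
    """Return (fewest, most) possible undefeated players after each round.

    Assumes no ties. When the undefeated count is odd, one player is
    paired down against a lower bracket: if he loses we get the low
    figure, if he wins we get the high figure.
    """
    lo = hi = players
    result = []
    for _ in range(rounds):
        lo = lo // 2        # paired-down player loses
        hi = (hi + 1) // 2  # paired-down player wins
        result.append((lo, hi))
    return result


progression = undefeated_range(20, 3)
print(progression)  # [(10, 10), (5, 5), (2, 3)]
```

Have prize support ready for the high figure in the last pair, and decide ahead of time what you'll do with the spare prize if you land on the low one.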

It's not complicated; all it takes is a little forethought and a little respect for your players. Do not be shallow and hand your Best General prize to ONE PERSON while numerous other undefeated players have to sit there angrily gnashing their teeth b/c they didn't get easier draws on the way up or slaughter their foes more mercilessly.

Next, give your Best Overall far more emphasis on appearance and sportsmanship. Why is this important? If the Best Overall is MOSTLY comprised of competitive score, it will almost invariably go to ONE OF THOSE UNDEFEATED PLAYERS. You fall right back into the same problem - the other ones feel punished b/c they didn't have "prettier" armies or because their opponents tanked their sports score. By making competitive finish only ~33% of the composition of the Renaissance Man / Best Overall score, you firmly separate it from Best Generals finishers, and ensure that none of them will feel cheated by the result. It's true that one of them could still win Best Overall, but only by virtue of a truly gorgeous army and phenomenal sports scores. The award itself is mathematically divorced enough that it prevents that rightly angry feeling of being cheated simply b/c there weren't enough games for the undefeated players to go after each other.
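As a rough illustration of that ~33% weighting, here is a small sketch. The only number taken from above is the one-third competitive share; splitting the rest evenly between appearance and sportsmanship, scoring everything on a 0-to-1 scale, and the function name are all assumptions for the example.

```python
def best_overall(competitive: float, appearance: float, sportsmanship: float,
                 competitive_weight: float = 1 / 3) -> float:
    """Blend three scores (each on a 0.0-1.0 scale) into one Best Overall score.

    Competitive finish gets `competitive_weight` (~33% by default); the
    remainder is split evenly between appearance and sportsmanship,
    which is an assumption made for this sketch.
    """
    soft_weight = (1 - competitive_weight) / 2
    return (competitive_weight * competitive
            + soft_weight * (appearance + sportsmanship))


# An undefeated player with middling soft scores ...
undefeated_plain = best_overall(1.0, 0.5, 0.5)
# ... versus a one-loss player (2 wins of 3) with a gorgeous army and great sports scores.
one_loss_pretty = best_overall(2 / 3, 0.95, 0.95)
print(undefeated_plain < one_loss_pretty)  # True
```

With weights like these, going undefeated alone doesn't carry Best Overall, which is the separation from Best General that you're after.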

So reward all. This should be inclusive, all-rewarding, all-fair as a hobby anyway. If you as a TO cannot provide an event that will give them a clear Best General, do not invest all of your prize support into a single Best General award. Fair, simple, etc.

Think outside the box, on everything ... it'll get you somewhere. The number of times I've had people go "WELL WHAT IF I DON'T HAVE ENOUGH ROUNDS" is huge. The number of times I've replied with "Um, give multiple best generals?" and been answered with "Oh ... duh." is almost equal.



DIG IN!

21 comments:

  1. mike it's a great format... one danny, bobby and myself discussed at length coming home from the NovaOpen (had 12 hrs worth of traffic to do so :P )

    if you look here : http://www.battleforsalvation.com/tournament-rules/

    you will see we have already implemented a very similar format to our upcoming tournament.

    day 1 rewards players

    day 2 will consist of 50% of the day 1 field of players coming back and, with wiped records, playing in 3 tournament brackets seeded from the previous day's games.

  2. That's the way to do it.

    By the way - can you get whoever is formally running the BFS to contact me? I need to get them on-line with next year's NOVA Invitational. Using our format for a 48+ player tournament = you get to send some people to compete for big cashy money prizes.

  3. sure mike no problem. Bobby is the man on this. I know he wanted to get in touch with you about a few things anyway. I'll send him a text. Btw any chance of getting a link here to the tourney site? http://www.battleforsalvation.com/


    ed

  4. Well said Mike,

    I had not understood the relationship between the differing tournament formats and your explanation made a lot of sense to me.

    I have not competed in many years simply because I have been the 'baby seal' in two tournaments which turned me off. The live-feed coverage and the after-action reports have me wanting to enter next year.

    I'll be the guy wearing the fur coat and barking...

  5. Dude, That is an awesome exposition of the tournament scene scoring system and some of the problems I have encountered.

    I would love to go to the NOVA Open in 2011, and I'm saving up for it now even though I don't know when or where it will be.

  6. What I was saying on YTTH, Mike, is that in closed group environments like my country there are no such great differences between baby seals and good players. In fact we are all pretty terrible, but that is not the point. When you have a constant level of quality among players, you are in the hands of the statistical probability of win/loss in both swiss pairing and opposite pairing. I have run 8 or 9 tournaments (small ones, but still tournaments) and in only 1 of them did a player go 5-0 in 5 rounds. While I did not use a W/L format, in math everything is possible even if not probable, so you could have no 4-0 player in day 2 (even if the possibility in a 48 player environment is around 0.01%). But do you risk that in a 16 player environment? Fewer players = actually higher chances of having no one at 4-0, due to playing higher seed players in the last rounds in the opposite pairing system.

    And in fact having someone win day 2 even with 1 loss in day one is not necessarily a bad thing. It speaks to a constant quality among the players. But I agree that day 2 is an elimination system....

    Now I'm thinking about the tournament system in team sports (groups -> tiebreak if necessary -> winners in elimination). Just imagined this while writing the comment.

  7. I always found it fascinating how Swiss style has been adopted by a lot of 40k/Fantasy tournaments - considering that Swiss is all about playing someone with the same record, and often is based on pairing the high end of the bracket with the low. That Mike is having to defend that idea as a unique concept is just crazy.

    For my tournament, everyone gets 6 rounds, but round 5 and 6 will have the built in semi finals/finals for the 4 undefeated players (there will be 64 spots). I might have to push to 7 rounds if the field expands, since that gets messier and might involve byes like what Mike did, but that's neither here nor there.

    Win Loss with high-low pairing within the bracket is the way to go.

  8. Tautological argument, Mike.

    Consider what you said yourself - high seeds tend to get more baby seals. Thus, they tend to win just as hard in round 2 of your system.

    Thus the people who win hard in round 1 (and thus get easy seeds), tend to keep getting easy seeds, and thus keep getting good win records. While a player in situation B, who might be every bit as good as a player in A, tends to keep getting tough matchups, and thus keeps getting lousy point records.


    Randomness is really the only fair way to do a swiss system (which allows losses).

    BTW, for swiss, top versus bottom occurs with Top 8. As Top 8 is single elimination, rather than a record game, everything you said comes into play. Since, essentially, you ran a single elimination event, what you are saying makes sense, but for the swiss system, it's just wrong, and if you're switching to swiss, randomize pairings in the 1-0 bracket, 2-0 bracket, etc. (Ladder system is easiest way to do this)

  9. Not quite.

    That argument avoids the notion that in the early random pairings, it is as likely that someone beats face on a baby seal as plays against a competitive peer, resulting in players at low seeds who don't "belong" at low seeds. Leaving them not litmus tested enables the possibility that they are actually just baby seals, and barely edge out other baby seals all day long.

    You actually WANT "baby seals" to be smashed as SOON AS POSSIBLE, and knocked into brackets where they will play competitively with peers, instead of hopefully gaining win after win and then seeing them all dashed in a terrible one-sided match late, which happens actually in numerous BP systems - take a big tournament semi-recently where a very average Khornate CSM army did this to get to the final round, only to be mercilessly slaughtered by a TWC heavy optimized wolf list run by a GT champ.

    You want those later rounds to be the close ones, you want the baby seals smashed down to their proper place early. The BEST way to accomplish this is opposite seeding, b/c it does two things reliably:

    1) Quickly identifies and eliminates weak players from the pure win bracket
    2) Ensures "false positive" high seeds and "false positive" low seeds have appropriate opportunities to show that (false highs lose their top spots by barely beating or average beating low seeds, and/or false lows get a chance to climb while knocking out higher seeds)

    The whole issue presumes it's bad for a higher seed to crush a low seed. It's not, as long as it's within the same w/l bracket. You want the last 8 to be the top 8, not the top 2, and the middle 10-11, and the lower 18-19, and the bottom 29-30. Traditional same-seed pairing of 1/2, 3/4, etc. accomplishes this latter result more often than not, resulting in slaughtering / non-close matches late (when they should be their closest).

    This applies not just to the top 8, but through the entire series of brackets in the tournament. You want the best players to advance. If the top two face each other earlier, you end up with way too hard a player beating face in the 1-loss bracket, and that's a negative consequence that a lot of people don't think of.

  10. Having to defend the seed system used for decades by truly competitive events.

    Poor guy. It's like you re-invented fire or something.

  11. Uh, I did.

    No, seriously, I INVENTED FIRE.

    Gore's got nothing on me.

  12. @Mike

    I agreed that if you have a single-elimination, this is the case, but with swiss, this really doesn't happen. Unless your 'baby seal' ratio is truly abysmal, you'll filter them out reasonably quickly in any case. Take a simple system where you have 60 people, 20 competitors and 40 baby seals. Baby seals always lose to competitors.

    In round one, you'll get this (randomly, since each competitor has a 1/3rd chance to face another, and 2/3rds chance to draw a baby seal, and each draw of a competitor eliminates two competitors from the pool):
    6 competitors face each other - 3 win
    14 competitors face baby seals - 14 win
    Remaining 26 baby seals face each other - 13 win.

    Ratio is now 17 competitors, 13 baby seals at 1-0.

    Random draws,
    10 competitors face each other
    7 face baby seals
    6 baby seals face each other

    In the 2-0 bracket, there are now:
    12 competitors
    3 baby seals

    There's not going to be any baby seals at 3-0.

    So pure randomness has worked just fine at knocking baby seals out of the X-0 bracket, without any need to have anyone do a 'DREAMCRUSHER' on another army.

    Meanwhile, what about the competitors who faced each other? Well, a first round loss usually means they're out, in swiss, but if they lose round 2 or higher, they can fight their way up to go X-1 and have a shot at the Top 8.

    People try to manipulate randomness, but really, it works just fine, without need to create arbitrary points systems and rule sets and make people play DREAMCRUSHER on baby seals.

  13. As GreyICE said, your gripe with Swiss is unfounded if you have the appropriate number of rounds. The chances of multiple bad players all advancing and never playing anyone good until the final round in a large field are infinitesimal. Yes, you might have the guy who lucks out in the first couple of rounds, but sooner or later he gets knocked out before reaching the true top tiers.

    I'm with you 100% on Battle Points though. They are an abomination and simply used by TO's who don't know how to create a decent tie breaker system.

  14. Kevin,

    The bigger issue is trying to avoid top tier players knocking each other out. Can't avoid it entirely, but this most accomplishes it.

    The faster the mid tier makes it to the mid, the low tier makes it to the low, and the high tier survives to the end/high, the more balanced the rounds become and the more appropriate at achieving the goal of advancing the best the tournament becomes.

    I'm not so much opposed to GreyICE's commentary, I just don't think it's entirely complete. Randomizing matches puts things OUT of control; your results are extremely variable as a result, and players may get knocked down to brackets they don't belong in earlier than they should (in general, if you're a "3-1" player but you get knocked down there at "1-1" you may slaughter people the rest of the day fruitlessly, which can be frustrating for both the slaughterer and the slaughterees).

    That's the general gist of it ... you want #9 to get knocked to #9 in Round 4, not Rounds 1-3. You can't help that in Round 1, but you can help it in Rounds 2-3. Healthier for the tournament in general, and most rapidly improves competition and fairness at ALL bracket levels throughout the day.

    6 and half dozen in a sense, easily, but therein lies the thought pattern.



    BTW, Hulksmash suggests you as a good person to contact about West Coast Events. We are running an Invitational next year, and I'd like to get in touch with a couple of more tightly-run / fair / open / bigger draw West Coast TO's to give them an opportunity to get on board with it.

  15. One way you can get around getting top ranked players paired early is seeding tournaments or pre-rankings that offer early round byes. This is commonplace in Magic: The Gathering Grand Prix and Pro Tour events.

    Even so, in our RTT's I've seen top players clash in say, round 2, and then watched the guy who lost climb back into it by round 4 and at least have a shot at winning.

    I'd love to discuss. My email: markemn@yahoo.com

  16. In a system where you can climb back into it for the Best General component, that becomes a possibility. I'm not a fan of that - there aren't enough rounds for that to be a really reliably fair thing. That's for LEAGUES, not TOURNAMENTS, I would say.

    That's why we have the Renaissance Man track, and why we'll be splitting down to distinct mini-tournaments based off your Day 1 W/L record next year - so that people have something to compete for, but you don't risk the integrity of the results in a short-game system.

    Remember that if someone is good but knocked down to a lower bracket, he's climbing back up (if he can) by massacring worse players ... which isn't very fun for them, or fair for people he's climbing back up to meet - who are having to stay where they are by repeatedly beating people who haven't lost yet. You know?

    Will e-mail you :)

  17. Mike, I'll happily discuss too (intrusioncountermeasurelectron@gmail.com). But basically, in swiss, your tiebreaker is based on your opponents' match win record.

    So if you go win-win-win-loss-win to go 4-1, then you've beaten a random, a player who went 1-0, a 2-0, a 3-0, and a 3-1.

    The random can do anything (lets say he scrubs out, so does 0-3, then goes to get something to eat). The 1-0 guy beats another opponent, then loses the rest, to go a 2-3, not bad. The 2-0 loses his remainder, and also goes 2-3. The 3-0 wins one of his last 2, he's 4-1 too. The 4-0 who beat you wins his last, so he's 5-0. And the 3-1 you beat is obviously 3-2.

    So that's
    0-3 (treated as 0-5 here)
    2-3
    2-3
    5-0
    3-2

    The person who lost in round 1 beats, in order:
    0-1, 1-1, 2-1, and 3-1.

    The guy who beat him in round 1 goes 2-3. 0-1 is obviously 0-2, he rallies a little, but ends 1-4. The next ends 2-3, the third ends 3-2, and the 4th ends 3-2

    2-3
    1-4
    2-3
    3-2
    3-2

    So the match win records? 1st guy has 12-13. Funny thing? Turns out he was kicking a lot of baby seals on the way up (check out his opponents, they were really scrubby until the later rounds).

    Person 2? 11-14. He loses out.

    Want to see where it gets interesting? This guy loses in round 3.

    His opponent match wins, in order:
    3-2
    4-1 (he was their only loss)
    5-0
    2-3
    3-2

    His opponent match win record is 17-8. So he beats both? Bug? No. He beat in round 2 someone who ONLY lost to him. He lost to a perfect 5-0 (tournament Ace). No one he beat lost out or was totally uncompetitive.

    So opponent match win allows for, if a 5-0 beats you in round 3, for you to get right back into the running in the finals.

    P.S. Quick lowdown on draws - basically, they'd work pretty easy here. Run rounds 1-2 on Friday night (possibly just round 1). Byes get to sit out 1-2 rounds. Base it on past performance at other Nova-qualified events or something (possible: if you won a Nova qualified event some time in the past year, automatic 2 byes, finalist at least once, semifinalist at least twice, 2 byes, semifinalist or finalist once - 1 bye).

    Anyway, I'd be happy to give you some insight, I've been to a LOT of magic events, and seen good systems, bad systems, the entire works.

  18. My numbers don't bear out the multiple baby seal rounds for people with high battle points, as I've told you privately.
    The baby seal hypothesis can work both ways, and the pools don't consist of such simple units of measure. In a field of 128 players, matching #1 against #64 in round two is statistically much more likely to give #1 a significant advantage in the round than matching him against #2. Arguing otherwise is disingenuous.
    I'm fine with the way you match people - any system that is consistent for all players is acceptable, but acting as if your system is mathematically more challenging to the #1 seed depends on a fair number of assumptions that I do not agree are correct. And we've had plenty of discussion to recognize that an actually conclusive data set to prove or disprove your hypotheses does not exist, and is likely essentially impossible to generate.
    I prefer the chance for a player to recover from a very bad game 1 or 2 and get back into the race; you prefer not to have it that way. You pair against top players playing early; I pair towards that possibility. Both are fine, and the top players are likely to play each other over the course of the event. I don't even consider the difference to be anything important - we're both looking to get the "best" result, and have different theories (both based in math and actual data) on how to get there. I doubt that we'll agree after year 20 unless we can get the same field of players into the same tournament groups 9 times or so...
    Like I've said, assuming we can avoid being right on top of each other next year (and it looks like we'll get that worked out), I plan to come to the NOVA next year.

  19. Hey Mike,

    It's a bit complicated, eh?

    Since the first round is the most random with the greatest chance of a baby seal clubbing, perhaps scores should be reduced for round one?

    I wanted to share my modular mission system, found here: http://chaosgerbil.wordpress.com/2010/08/31/chaosgerbils-modular-mission-system-for-warhammer-40000/

    It uses battle points, but you can always translate that to a simple W/L/D model. Please let me know what you think.

  20. Gerbil and others ... I've chatted out Wolf's comments w/ him privately, will repost maybe at some point.

    Re: the Missions, I don't have a strong drive yet to change those. While we may "nod" to the KP crowd, even that is not necessarily ensured.

    Remember to note that the missions themselves were playtested over - by now - close to 1,500 games. Our perception, therefore, of their impact upon balance/etc. is extremely well-founded; they achieve close, hard fought matches where skilled generals who can think well in advance benefit the most.

    More importantly, they balance across codices b/c of the readily "drawable" nature of each individual mission - even the VP to a degree (certainly more so than KP) - which enables players whose codex / list aren't as benefitted by an individual mission to position for the 2ndary or tertiary and wait to force a late draw on the primary that isn't beneficial to them.

    These things are well played out ... the issue is more on the lines of scoring / awarding prizes / battle points vs. undefeated / the place for "quality" of win in a limited-game format.

    Not the missions used.
