Quote:How would it not be credible? Methinks you just want him to out himself so the pitchforks come out.
And no...he wants to continue his experiment to see if he gets into the top 100. They could likely already figure out who it is from the statistics provided, so sharing what he did may already have jeopardized the project.
The Tribunal system already assumes a certain number of spam punishes in every case, and has measures to discount their impact. Doing a project like this doesn't prove spam punishers exist--we know they do, which is why the system assumes a baseline number of spam punishers in every case.
Quote:I just judged 20 cases and every single one was a punish. I tried to be fair and almost pardoned a few based on the first games, but every single time they changed my mind.
Will I be disregarded? I don't do the Tribunal often because, quite frankly, it's an unbearable, thankless task. I wish there were some sort of reward, like an icon or a banner or something. Instead you get a rating nobody sees. At least if it displayed on your profile in game or earned you a border or ribbon, you could proudly show that you did something so selfless and painful.
Quote:Actually it's the opposite.
Assuming it's true, spam punish should show HIGHER accuracy than overall. Because they are not taking an account and assuming it will punish. They are taking a case and assuming some will spam punish.
Very troubling indeed.
If the overall system accuracy is still extremely high, and the false positive rate is extremely low... why is this troubling?
This is not to say that we'll just let spam punishers keep voting, but we have been testing different solutions to try to make the system more robust against abusers. When we're done experimenting with alternative solutions, we may just ban spam punishers--they really aren't that difficult to detect.
Quote:It's troubling not because of the effect, but because of the explanations that have been provided. Previously (and maybe I misremember), things were said like "a vote weight being 0" for a spam punisher. This implied to me that the way to deal with a spam punisher was to set that particular person's weight to 0.
Your new explanation is "we deal with it by making majority not just majority". So, for example, if you felt 50%+ is all that is required to punish, then you just push it up to 60%+. This makes the assumption that 10% of the voters are spam punishers.
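To make the threshold idea concrete, here is a minimal sketch of how a punish threshold could be raised to absorb an assumed baseline of spam punishers. This is purely illustrative and not how the Tribunal is actually implemented: the function names, the always-punish model, and the numbers are all assumptions.

```python
def adjusted_threshold(base_threshold: float, spam_fraction: float) -> float:
    """Punish share needed so that honest voters alone still exceed base_threshold.

    Hypothetical model: spam_fraction of the voters on a case always vote
    punish, and the remaining voters judge the case on its merits.
    """
    if not 0.0 <= spam_fraction < 1.0:
        raise ValueError("spam_fraction must be in [0, 1)")
    # Spam votes count fully toward punish; the honest remainder must
    # still clear the original bar on its own.
    return spam_fraction + (1.0 - spam_fraction) * base_threshold


def is_punished(punish_votes: int, total_votes: int, threshold: float) -> bool:
    """Return True when the observed punish share is strictly above the threshold."""
    if total_votes <= 0:
        raise ValueError("total_votes must be positive")
    return punish_votes / total_votes > threshold


# Assumed numbers: a 50% base threshold and a 10% spam-punisher baseline.
threshold = adjusted_threshold(0.5, 0.1)
verdict_close_case = is_punished(52, 100, threshold)
verdict_clear_case = is_punished(58, 100, threshold)
```

Under this particular model, a 10% baseline moves a 50% threshold to roughly 55%. The 60% figure in the quote corresponds to the simpler reading where the assumed spam share is just added on top of the original threshold. Either way, the point is the same: the bar is raised per case, rather than any individual voter's weight being changed.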
This is also why we could easily revert things if we saw it perform worse than what we had before.
We've actually changed our stances on issues a lot; for example, we used to give permabanned players a second chance. We used to tell players it was OK to make a 2nd account when they are banned. We used to never reveal reasons for why players were banned, and assumed players knew what behaviors were inappropriate.
Experiments are tried all the time and things change.
Quote:That is quite fair, and I think completely understandable from your point of view.
On the flip side, I still feel a bit betrayed as a regular defender of the Tribunal.
I think an argument can be made that when you take on a style like crowdsourcing, there are a lot of hidden pitfalls when it comes to the crowd that go beyond just looking at statistics. For example, Wikipedia could mitigate its losses (and potentially make some money) by putting a single ad on each of its pages. Yet they cannot do this, not just because the founder is against it, but because it would upset many of their users (users being the subset who regularly update articles). Whether this constant opaque experimenting would be a similar pitfall for the majority of users, I honestly have no clue. Speaking personally, it is.
We're trying to keep players updated as much as possible though--the recent "Updates to the Tribunal" thread is a summary of a handful of experiments we tried in recent months that we felt were successful enough to be long-term additions to the system.
We're also being more transparent with data than ever before--with the launch of Reform Cards we intentionally allowed the data to be scraped and analyzed. We do this because we believe in the system and the data, and want players to dig in and help us develop new ways to improve the Tribunal or help us find flaws in the system. But in a world where things change so rapidly, it's nearly impossible for me to keep up player communications and still lead my teams--but we'll keep trying our best.