
Scaling Augur Part 2, A Roadmap

A special thanks to Vitalik Buterin for discussion of many of the ideas in this post.
P.S. there's a tldr at the bottom
Basic Scaling Commentary

First off, we need a way to limit the number of events people are required to report on, which is fairly easy to do.  However, we also need a way to select which events these should be (not so easy).  We could let whoever pays the highest fee list an event (i.e. an auction); however, I think a better solution is to use the top n events corresponding to the n (maybe 10000) most popular markets by volume.  The reason for this is that we don’t want valuable reporting time spent on spam markets; this maintains the highest security where it’s needed the most.  Additionally, we can randomly select people to report on events based on volume (depending on how this plays out when testing, there’s a good chance we won’t need this top n style system at all).

We’ll need a sorted data structure to keep markets in order of liquidity. This solves the “spam the system with clearly indeterminate events/markets” problem as they won’t be in the most popular markets, and some people will happily report indeterminate to nab the bond the event creator put up (but we shouldn’t require reporters to report on markets like these).  Events beyond this core group would not be required to be reported on, and for the most part, would be either spam or really unsuccessful markets (i.e. ones with almost no trading activity).  


We also need a way to limit the number of people reporting on the core events (and randomization happens to solve this too).  Having too many people report on a given event would make on-chain consensus infeasible (and it’s required to be theoretically possible in case our bonded consensus system discussed in part one of scaling augur gets challenged).  One interesting aspect of off-chain bonded consensus is how to set the new rep vector.  Having the bonded operator pay for it with, say, 3k reporters would cost ~$1.50 at current ether prices, so not too bad.  Although to prevent sybil attacks here, the proper production way to do it would probably be lazy-eval rep redistribution, meaning if consensus has been done and rep needs to be redistributed, the first time you try to send rep (or get your balance) it could update your rep to whatever it actually should be.  So you could start off with 5000 rep, never report, and you’d “have” 5000 rep up until you tried to send it or get the balance, and all of a sudden, like Schrodinger’s cat, it’s gone (dead)!  People could ping the network to ding dead accounts (although they wouldn’t have any good reason to do so).
*Note*: for fancy sorted data structures in eth, we can use something like an order statistic tree.
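As a rough illustration only (not the on-chain implementation), maintaining markets sorted by volume so the top n can be pulled out might look like the sketch below; a real contract would use an order statistic tree so updates and rank queries stay logarithmic:

    import bisect

    class TopMarkets:
        """Toy sketch: keep markets sorted by volume so the top n can be queried.
        An on-chain version would use an order statistic tree, not a flat list."""
        def __init__(self):
            self.sorted = []      # list of (volume, market_id), kept sorted ascending
            self.volume = {}      # market_id -> current volume

        def update(self, market_id, new_volume):
            old = self.volume.get(market_id)
            if old is not None:
                self.sorted.remove((old, market_id))   # O(n) here; O(log n) in a tree
            self.volume[market_id] = new_volume
            bisect.insort(self.sorted, (new_volume, market_id))

        def top(self, n):
            # highest-volume markets first
            return [m for _, m in self.sorted[-n:]][::-1]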
Old Way of Scaling (skip unless you're interested/heard of this before)

To limit people reporting on this “core set” of events we could have two groups of reporters: “core reporters” and “reporters.”  There’d be a rep balances data structure and then another one for core reporters.  Regular reporter data would be kept on a per-market basis.  There’d be a maximum number of core reporters, and they would be able to report on events in the “core events” above and events beyond the core ones.  Regular reporters couldn’t report on the core events, but could report on anything else.  For events beyond the core set, whoever reported on them would simply get a portion of the trading fees; full consensus would only be done on the core set.  (We’re working on making it easier for people to report on more events, but that’s a topic for another post.)  There needs to be a way to “petition to be admitted into the core reporters” if you are in the top, say, 3000 reporters - it should just be a function call that checks if your rep is > the rep of the lowest person, then puts you in wherever you fit in the order statistic tree and bumps the lowest person out.  If someone ever gets down to 0 rep, bump them out as well.

Take the top x markets by volume (these are the most important to be reported correctly; if cousin Billy’s minor league baseball game gets reported wrong, it’s not the end of the world.  If something with $10M does, it’s a huge huge problem).  Then require the reporters to report on this minimum set — we’re shooting for something in the neighborhood of 2 hours of reporting per 2 months, so maybe around 200 or so markets (we can always modify this if it turns out to be wrong).  Rep redistribution would be based off this minimum set.  Beyond the minimum set you can report on whatever you like and get trading fees from those that you report on.  If you look at most financial markets and wagering markets (so things like the Chicago Mercantile Exchange, Intrade, etc.) you’ll notice it’s a handful of markets in each that account for the vast majority of the volume.  The most popular markets are where the system is needed the most (because a collusion attack makes more sense if $100M is on the line, so it needs the full power/incentives of the reporters to make sure that doesn’t happen).  Now if you get something with fairly high volume that doesn’t make it into the minimum set, it still maintains most of the security properties we want, because more reporters will report on it to earn the trading fees than on something with lower volume.
This results in a large incentive to outsource/pool, which is part of why we’re not doing it; additionally, while it scales, it doesn’t do so nearly as well as the new way.
New Way of Scaling

To limit people reporting on this “core set” of events we could randomly select people to report on events.  For events beyond the core set, whoever reported on them would simply get a portion of the trading fees, and full consensus would only be done on the core set.  (We’re working on making it easier for people to report on more events, but that’s a topic for another post.)  Note that this core set may not be needed at all after further testing of randomization algos.

One problem that appears: what if someone sybils a market by making a bunch of rep accounts and submitting a bunch of reports, causing consensus to be quite expensive?  The solution is to simply have each reporter pay the fee to compute their own report in consensus; if they don’t, their report isn’t counted.  This prevents someone from simply making a ton of accounts and sticking the network with the fees.

How do we randomly select these people?

A New New New Consensus Algorithm
Randomly Selecting What to Report On

This had been discussed before, but had lots of problems due to the difficulty of randomly selecting people to report based on their rep and then weighting their reports by rep… it didn’t really work.  I had always been trying to get random sampling with weighted ballots and concluded there wasn’t really a way to do it that both scaled & didn’t have a ton of incentive problems.

Vitalik brought up that instead of weighting reports by rep, we could only use rep to select people to report (and calculate trading fees), but once selected they get one vote (i.e. on a given market, if I have 10 rep and you have 1, I have a 10x higher chance of getting selected to report on it).  If I’m a whale reporter, I should split up my accounts to report (they’ll each get selected to report on fewer events, and as a whole I’ll have to report less since more accounts means many will get selected to report on the same event).

Selecting people (a nice simple way to randomly select people to report provided by Vitalik):

    We let:

  • V = market volume
  • D = a constant set so that each reporter would report n times on avg (i.e. n = 50)
  • R = total rep

Then when the reporting period ends, reporters are allowed to report if sha3(voter addr + eventid + eventExpirationBlockHash) < reporter REP balance * D * V / R

Note: eventExpirationBlockHash is used to limit precomputing attacks

We now have two options for what to do with this data:

  1. Take the reporters with the lowest submitted hash values divided by balances (to allow for weighting of rep).
  2. Set D using D = n events / all volume * 2^256 so that each reporter on average needs to make n reports.  In the rare scenario that a market has almost no volume, it’s best to check for this with an if statement and set things up so that n events * market volume / total volume is always >= 1 (so statistically speaking, the market should get reported on).
This seems like a pain from a user perspective (“You mean I have to find the hashes of my addr + a bunch of events?”), but in reality it isn’t a big deal: the UI would do all that automatically.
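For concreteness, here’s a sketch of the eligibility check the UI would run for each (reporter, event) pair, using option 2’s choice of D (hashlib’s sha3_256 stands in for Ethereum’s Keccak-256, and the byte packing of the arguments is illustrative):

    import hashlib

    TWO_256 = 2 ** 256

    def eligible_to_report(addr, event_id, expiration_block_hash,
                           rep_balance, total_rep,
                           market_volume, total_volume, n=50):
        # addr, event_id, expiration_block_hash are bytes
        # D is chosen so a reporter makes ~n reports per period on average
        D = n * TWO_256 // total_volume
        h = int.from_bytes(
            hashlib.sha3_256(addr + event_id + expiration_block_hash).digest(), 'big')
        # sha3(addr + eventid + expiration block hash) < rep * D * V / R
        return h < rep_balance * D * market_volume // total_rep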

Outcomes would be calculated the same way (weighted averages for binaries and weighted medians for scalars/categoricals).
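For reference, the weighted median mentioned above is just the report value at which the cumulative weight crosses half the total; a minimal sketch:

    def weighted_median(values, weights):
        # sort reports by value, then walk until half the total weight is covered
        pairs = sorted(zip(values, weights))
        half = sum(weights) / 2.0
        cum = 0.0
        for v, w in pairs:
            cum += w
            if cum >= half:
                return v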

In the rare possibility that no one gets randomly selected to report on a market in a given period, on the last day, we can change the sha3 threshold.  Statistically speaking, this is nigh impossible, but good to check for just in case.  

Penalizing Those Who Reported Incorrectly
Loss and profit function invariants provided by Vitalik.  The reasoning behind them is you want penalties to be worse near contentious decisions, and this maintains that invariant (like the old system had).

We let

  p = proportion who reported correctly

  gL = the loss function for wrong reporters

  fP = the profit function for correct reporters

  g(p) is the new rep function for wrong reporters

  f(p) is the new rep function for correct reporters

Invariants:

  gL(1) = fP(1) = 0

  gL(0.5) = 1

  fP(0.5) = 1

  gL(p) * (1-p) = fP(p) * p

  g(.5) = 0

  f(1) = g(1) = 1

  f(.5) = 2

  p*f(p) + (1-p)*g(p) = 1

We can penalize linearly:

  If we have gL(p) = 2 - 2p  — this is a loss function

  1 - loss = new rep weight for a user, so g(p) = 2p - 1 and the new rep is (2p - 1) * oldRep

  If we solve for fP(p) we get fP(p) = 2 * (1-p)^2 / p — this is a profit function

  1 + profit = new rep weight for a user, so f(p) = 1 + 2*(1-p)^2/p and the new rep is (1 + 2*(1-p)^2/p) * oldRep

Or we can penalize quadratically (a bit harsher: we penalized quadratically in the clustering algo):

  We make a parabola through (.5, 1), (1, 0), (1.5, 1) to get gL(p), or 4p^2 - 8p + 4

  4p^2 - 8p + 4 — initial loss function

  New rep for wrong reporters is (-4p^2 + 8p - 3)*oldRep

  Solve for fP(p) and we get

  -4(p-1)^3/p — initial profit function

  New rep for correct reporters is (1 - 4(p-1)^3/p)*oldRep
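As a quick sanity check (nothing Augur-specific, just arithmetic), both the linear and quadratic loss/profit pairs above satisfy the invariants:

    def check(gL, fP, samples=(0.5, 0.6, 0.75, 0.9, 0.999)):
        assert abs(gL(1.0)) < 1e-9 and abs(fP(1.0)) < 1e-9          # gL(1) = fP(1) = 0
        assert abs(gL(0.5) - 1) < 1e-9 and abs(fP(0.5) - 1) < 1e-9  # gL(.5) = fP(.5) = 1
        for p in samples:
            # rep taken from liars equals rep handed to truth-tellers
            assert abs(gL(p) * (1 - p) - fP(p) * p) < 1e-9

    check(lambda p: 2 - 2*p,          lambda p: 2 * (1 - p)**2 / p)     # linear
    check(lambda p: 4*p**2 - 8*p + 4, lambda p: -4 * (p - 1)**3 / p)    # quadratic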

There’s an alternate way to punish people which is to compute the variance for each event from the reports given, then penalize people proportionally to how much variance they added.  This is similar to how we did things in clustering, except we took an indirect measurement of variance (distance).  We’ll determine which has better properties via testing, although I doubt there is much significant difference. 

From our clustering algo, penalization functions would be similar, but instead of d it’s v:

Normalized rep functions:

       1 / (1 + d[i]) ^ 2 - my favorite, penalizes liars quadratically away from the truth

       1 - x/maximum(x) - penalizes liars harshly

       1/(value + epsilon)

       * x, d[i], and value all denote the distance (in this new version, the variance) a reporter is from the score

Then normalize to add to one, and that's the fraction of "new rep" a reporter has.  To get the current rep, taking into account both old and new rep, a person's reputation is updated to .8*oldRep + .2*newRep.  Then they're normalized again.

People would be punished on a per event basis here.
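A minimal sketch of that per-event update, using the 1/(1 + d)^2 weighting and the .8/.2 blend described above (d is whichever distance/variance measure we end up using):

    def update_rep(old_rep, distances):
        # weight each reporter by 1 / (1 + d)^2, then normalize to get "new rep"
        weights = [1.0 / (1.0 + d)**2 for d in distances]
        total = sum(weights)
        new_rep = [w / total for w in weights]
        # blend: 80% old rep, 20% new rep, then renormalize
        blended = [0.8*o + 0.2*n for o, n in zip(old_rep, new_rep)]
        s = sum(blended)
        return [b / s for b in blended]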

Penalizing Those Who Didn't Report / Didn't Report Enough

We also need to penalize people depending on whether they reported or not.  We should only do this once in a while (say every two months).  We could do this lazy-eval style as well so people can’t spam the network with cost by making a ton of reporting accounts (this time, accounts that didn't self-report would need to be pinged).  If, statistically speaking, there’s a > 99.999% chance (or higher, if we make this even less frequent/make the % threshold a bit higher) that you didn’t report as much as you should’ve, you’d be penalized proportionally.  The new rep would be (number of reports made / reports expected) * rep (and we’d only do this for up to 20% of rep like usual).
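A rough sketch of that penalty per account (the 20% cap comes from the paragraph above; the helper name is hypothetical):

    def penalize_non_reporting(rep, reports_made, reports_expected, cap=0.2):
        # only applied when it's statistically near-certain the reporter under-reported
        ratio = min(reports_made / reports_expected, 1.0)
        return rep * max(ratio, 1.0 - cap)   # never dock more than 20% in one pass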

To calculate the approximate number of events you should have to report on per account:

sha3(addr+eventid) < addr rep * n events * 2**256 * market volume / (total rep * total volume)

which is equivalent to sha3(addr+eventid) < rep % * n events * 2**256 * market volume / total volume; the right-hand side divided by 2**256 is the probability of reporting on any given event

For example, at 100% rep and 60 events w/ current market @ 1 volume and total volume 150

  we get 1 * 60 * 2**256 * 1/150 as what our hash has to be below

  D = n events * 2**256 / total volume

  if 20 volume, 20/150 * 60 = ~8 reporters on that market, not really enough.  We need about 30 as a minimum for a very low liquidity market.  

So there are two options: get rid of the division by volume (not practical or good for security) or increase the numerator (market volume):

  • raising it to a power > 1 (i.e. ^2 or ^(3/2)) gives odd results where it’s way too big or too small
  • a better way might be to weight things quadratically, i.e. y = -2.8x^2 + 5.6x + 0.2, where x is the fraction of total volume a market has.  So if a market has 20% of all volume it gets a weighting of 1.208 --- this would replace the market volume / total volume part of the equation
  • we could also try taking a power < 1, i.e. sqrt or something similar: it'd keep a minimum number of reporters even as volume got small, and give larger numbers with more volume

Rep % * n events * 2**256 * y(f) for each market, all added together (if > 1*2**256 then the probability is 1), is the number of events one should report on in an average period.  Unfortunately, the simplest way to calculate this involves summing over all events across all reporters.  Actually, upon running the numbers, this is super cheap (~$0.00088 / 1000 events at current ether prices per reporter).
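Putting the pieces together, a sketch of that per-reporter sum using the quadratic volume weighting from the list above (names and the cap-at-1 handling are illustrative):

    def expected_reports(rep_fraction, market_volumes, n=50):
        total = sum(market_volumes)
        expected = 0.0
        for v in market_volumes:
            f = v / total
            y = -2.8*f**2 + 5.6*f + 0.2              # quadratic weighting, y(0.2) = 1.208
            expected += min(rep_fraction * n * y, 1.0)   # per-event probability capped at 1
        return expected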

Commentary on Randomized Reporting (skip unless interested in differences between old/new algo)

Similar to the old consensus algorithm, if a bunch of people report wrong in a fairly random distribution then outcomes are not affected and they’ll lose rep.  80% of people could vote unrealistically on random events and this consensus algorithm would do fine.

The one thing this misses is that, by not cross-referencing ballots, you lose the property where reporting incorrectly on one event makes the rest of your reports count for less in that same consensus period (since redistribution is done prior to outcome calculation in the old version).

However, this raises the question: was there ever an instance where that actually affected the results in the old algo?

And the answer seems to be no… in every single instance/test case I can come up with, when rep redistribution is done and the liars lose rep, they don’t lose enough that period to push the decision outcome very much, and certainly not enough of a delta to change the outcome (which requires a 65% threshold).

To illustrate, let’s say we have a report matrix where in 2 events 4 people tell the truth and 2 lie, and in 2 events it’s 50-50 on each outcome.  The old algo looks at that and sees that the people lied on the first 2 events, docks their rep that period, does redistribution, then calculates the outcome.  It moves it from a naive (pre-redistribution) .5 outcome (i.e. indeterminate) to… ~.53, still indeterminate.  And also quite far from the 65% certainty threshold… it’d need to move it to >.65 or <.35 to have the actual results of the outcome change to either 1 or 0; instead, it gets caught by the threshold parameter.
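To make the ~.53 figure concrete, here’s a toy version of that example (the 20% rep dock for the two liars is purely an assumption for illustration):

    # 6 reporters; reporters 4 and 5 lied on the first two events and were docked 20% of rep
    rep = [1, 1, 1, 1, 0.8, 0.8]
    # one of the 50-50 events: three reporters say 1, three (including both liars) say 0
    reports = [1, 1, 1, 0, 0, 0]
    outcome = sum(r*w for r, w in zip(reports, rep)) / sum(rep)
    print(outcome)   # ~0.536: still indeterminate, nowhere near the .65 threshold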

Now we take randomization and do a simple weighted average for them all… the first 2 events get reported right and the second 2 get .5 outcomes … which is exactly the same as the old algo.  

This case is the extreme case, as far as I know, which should mean that for other cases doing redistribution prior to calculating outcomes doesn’t do a whole lot.

The only instance it actually does anything is if you take >.5 as 1 and <.5 as 0 cutoffs, which was never how any iteration of any consensus algo did this for the system (we always had a bit of a higher threshold).

The reason clustering worked better was because the false positive rate was lower than pca consensus and it had a nice harsh way of directly & proportionally punishing liars (similar to pca).

The key reason you’d really want to cross-reference is that it allows you to punish liars up front across all events, taking into account how much variance/distance (in the case of clustering) they added to the system.  But per above, even with this taken into account it doesn’t a) change the actual outcomes in practice, nor does it b) substantially alter the punishment of liars (i.e. we can punish them just as harshly with a randomized system).

3 Levels of Consensus

1: traditional consensus (off & on chain --- off chain bonded consensus may be irrelevant after these new changes, needs testing to be sure we won't need it).

2: this level was proposed on reddit by /u/vbuterin --- essentially anyone can pay a bond to have an event put up for reporting on again, but this time by all reporters.  If the second round returns the same result, the bond is lost.  If the previous decision is overruled, then whoever posted the bond would get double the bond back.  To not have to deal with conversion issues, it’s simplest to keep the bond as a rep bond.  The rep to reward the bonded challenger would come from the people who reported wrong in level 1.  The benefit of doing this is that it essentially allows us to have a security model very similar to the original, but only when necessary (allowing us to do things cheaper, faster, and with less work most of the time).

The easiest way to set the bond would be to make it a flat cost, say 100 rep.  There can’t be an extremely low minimum (i.e. 5 cents) because then you could spam the system by paying to put a bunch of events up for re-adjudication cheaply and repeatedly.  100 is large enough not to get spammed, and small enough that people could create a dominant assurance contract to put something up for re-adjudication if no one with >100 rep is willing to step up.

We can’t simply pay out 2x though, since there theoretically may not be that much rep which reported on an event, so the easiest way to code things is probably to a) return the bond and b) reward the bonded challenger with whatever rep would normally be taken from the liars, up to 2x the bond; beyond that, the people who originally reported whatever the actual truth was would get the rest.
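A sketch of that payout logic under one reading of the rule above (the cap of double the bond for the challenger and the names are illustrative, and everything is denominated in rep):

    def settle_round_two(bond, overturned, rep_taken_from_liars):
        if not overturned:
            # second round agreed with the first: challenger forfeits the bond
            return {'challenger': 0, 'bond_forfeited': bond}
        # decision overturned: return the bond, then top it up from the liars' rep
        # so the challenger's total comes to at most double the bond
        extra = min(bond, rep_taken_from_liars)
        return {'challenger': bond + extra,
                'original_truth_tellers': rep_taken_from_liars - extra}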

3: this level was originally proposed, albeit indirectly, in a post on p+epsilon attacks on the Eth blog (also by Vitalik), and has been in the roadmap a while, but hasn’t been implemented yet.  Essentially, we could allow anyone to pay some amount significantly greater than the bond amount to force a branching event, splitting rep into two classes.*  In one class the reported outcome for whatever event was the cause of dispute is said to be right, and rep would be redistributed accordingly.  In the other class/branch, the other outcome would be considered right and rep would be redistributed accordingly.  Whichever outcome was truly the correct one would determine which branch had rep that actually held value.  This would be akin to a Bitcoin hard fork scenario.  The winning fork, of course, would be the one with the most voluminous markets.  Market makers should then be able to move their market to whichever fork they want; if they fail to act, it'd go to the most voluminous fork by default after a couple weeks.

* Note: traditionally branching would be done to split into two subjects, say for scaling reasons or for language differences (i.e. people don’t feel like translating their markets from Chinese -> English every time, so they just make a Chinese branch).  And normally rep is just copied into both branches (you can choose to hold both, or sell off one).

These levels of consensus provide multiple lines of defense in case something goes wrong (i.e. if someone tried to make a voting pool, and they had >65% rep and reported wrong, we’d get to level 2; if people still hadn’t left the pool and reported wrong again, we’d get to level 3, a branch would happen, and the market would sell off the rep in the branch with the wrong outcomes and buy the rep in the branch with the correct ones).  An average rep holder/reporter trying to play it safe would simply hold on to both until the dust settles (just as if Bitcoin forked due to XT / core, you’d have two sets of Bitcoin).

A Bonus Scaling Note / Idea

What if we made the soft expiry system a little softer?  Currently markets are allowed to trade up until the two-month reporting period.  Since the odds for the winning outcome will tend to 1 after an event happens, it’s not a big deal that we don’t resolve more frequently.

An interesting idea would be to only resolve markets that had odds <99% for one of the outcomes.  So, come reporting time, if a given market has odds >99% for one outcome, anyone who wants to exit that market can, and there isn’t much of a reason to spend reporters’ time reporting on such markets.  The incentive to report is still there if, for whatever reason, the market doesn’t go near 1 on its own (which is why this sort of system without reporting wouldn’t work --- there’d be little incentive to sell your losing shares at the end if you knew the market had no mechanism for resolving without your aid).

Note that this is not auto-resolving.  Auto-resolving would be if we checked the current price and said if (price > .99) then doPayouts();.  Instead, this is simply looking at a market and seeing if anyone who wants to can close their position/exit at a full payout - epsilon, and if so, there’s no reason to officially resolve the market.
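In code, the check is just a look at current prices rather than an automatic payout (the 99% threshold is the one suggested above; this is a sketch, not the contract logic):

    def needs_formal_reporting(outcome_prices, threshold=0.99):
        # a market only needs reporters if no outcome is already trading above the
        # threshold; otherwise anyone can exit at ~full payout minus epsilon
        return max(outcome_prices) <= threshold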

This also has the benefit of making the system scale much better.  If there are 10000 events every two months and only say 5% of them are contentious outcomes (i.e. Bush-Gore, Santorum-Romney Iowa caucus, etc.) where the reporters need to step in, then that’s only 500 events to be randomly selected to report on.  We should probably still have an option to pay to resolve in case something somehow goes wrong here.  This also doesn't work for scalar markets (although it does for categorical).  

A Note About Reporting Requirements (also a TLDR of everything above)
Essentially, we say there’s a random chance you get selected to report on events, proportional to volume and how much rep you have.  Let’s say we have 10000 events every two months; only about 5% of these markets won’t be at >99% odds for one outcome at the end, so 10000*.05 = 500 events across a few thousand reporters actually need to be reported on.  This means we can have each reporter report on, say, 50 events on average and still have a lot of reports per event (50 events * ~2000 reporters / 500 events equals around 200 reports per event for level 1 of consensus).  These are much better numbers than the old system or the old way of scaling things, and that’s if there are 10000 events every two months!  With branching the system can scale even further.
Fee Structure Notes
For initial liquidity we want a minimum fee of ~$50 to ensure profitability for reporters.
Fee to make a new branch in level 3... currently thinking ~$1000, should be very rare and used quite sparingly.  New branches outside of fork scenarios could be cheaper, maybe a couple hundred dollars, or could even be triggered by a vote of reporters.
We need to implement the new market fee structure described by evand, making fees at the edges less insane.  
Whoever “closes” the market should get a small portion of fees to simply cover the gas cost to close the market and do payouts. (or lazily eval this)
Appendix Q&A / Random Todo

Q: What prevents you from making a bunch of accounts to play the reporting eligibility RNG game over and over?

A: The simple solution is to add the block hash of the expiring block of the event to the sha3 reporting calc and make people commit ahead of time what address/rep they want to use when reporting.

Q: Can we do lazy eval claiming of trading fees?

A: Yes:

      # lazy-eval fee claim (sketch; helper names are hypothetical)
      if addr_was_in_consensus_period(addr, period):
          payout = period_fees[period] * rep_owned[(addr, period)] / total_rep[period]
          send_cash(addr, payout)
  • We should consider making the second half of the reporting period a few days to a week long every two months, to allow more trading.  The reason is that you cannot trade rep after submitting your plaintext ballot and salt, so limiting this portion of the period is best for liquidity.  Since you’ve already created your ballot in the first part of the period, you can simply log in to an Augur client and it’ll auto-submit your plaintext report + salt when the time comes.

  • Need to make sure we properly close out markets where none of the outcomes happen (i.e. “Hillary or Bush?” and “GDP>3% growth 2017?”).
  • Add def moveMarket(market, newBranch, marketAuthor) in case of a fork
  • Use consistent 1 and 2 fixed point numbers as min and max for close market, make market, make event, buy/sell shares, and consensus on binary events - really, just use base 64 fixed point everywhere
  • Make sure there are no "while loop out of gas" issues
  • if an outcome is .5 due to the catch parameter, push it back once (as a .5 outcome); if it’s the same at the next consensus, no more push-backs, that’s the final outcome
  • if an event gets pushed back due to the 65% threshold, make it so people can still buy / sell
  • We still need to make support for generic subcurrencies as opposed to cash.  This simply involves generalizing our current play cash functions to allow the market creator to set a subcurrency and use that cash function whenever specified.  Also need to add regular ether support for buying/selling.