Binary Rating Systems in Digital Products: A View Through the Lens of Fairness and Game Theory

The concept of fairness in the field of cryptography is something that I find fascinating. The idea of “creating a process that uses a system of incentives and/or disincentives to ensure fair outcomes for participants who don’t trust each other” seems to me nothing less than extraordinary.

In the book “Mastering the Lightning Network” (2022), the authors illustrate this concept with a very simple example: a mother has two children, both of them fanatical about French fries. Every time she prepares them and serves each child a portion, problems come up, as both children always complain about the portion they received. The authors suggest that a “fairness protocol” to solve this problem would be one where, instead of the mother serving the portions, one child divides the portions and the other chooses which one to keep. In this way, not only is the need for a third party acting as an authority (the mother) eliminated, but the system itself makes the result as fair as possible through incentives and disincentives.
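To make that incentive explicit, here is a tiny sketch (the candidate splits are my own illustrative assumptions) of why the dividing child’s best move is an even split:

```python
# A toy model of the "one divides, the other chooses" protocol.
# The candidate splits below are illustrative assumptions only.

def chooser_takes(split: float) -> float:
    """The chooser keeps the larger of the two portions of a split in [0, 1]."""
    return max(split, 1 - split)

def divider_keeps(split: float) -> float:
    """The divider is left with whatever the chooser did not take."""
    return 1 - chooser_takes(split)

# Any unequal split leaves the divider the smaller piece, so splitting
# as evenly as possible maximizes what they keep.
candidate_splits = [0.5, 0.6, 0.7, 0.9]
best = max(candidate_splits, key=divider_keeps)
print(best)  # 0.5 -> the incentive alone produces the fair outcome
```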

“In any type of trade, fairness is essential. Since trade is founded on trust, a participant expects that he gets what he wants in exchange for the traded items.
A technology or protocol that does not discriminate against the honest and correctly participating members is said to be fair. Transacting on a fair protocol ensures that honest parties receive the goods for their payment while dishonest parties get penalized for their misbehavior.” (Fairness in Blockchain, B. Sriram, 2019)

As a product designer, I have found myself wondering whether certain product features we interact with daily could be improved with concepts like this in mind. Here I present a brief exercise in which I analyze a binary rating system commonly found on platforms we use every day and look for a way to optimize its results.

Many thanks to Sebastián Martínez and Ana M. Smith for feedback and review.


Rating systems in digital products help us draw conclusions about the quality of a product, decide whether something is right for us, or express our satisfaction or dissatisfaction with particular content.

Whether the system is binary (like / don’t like) or uses a numerical score (a rating from 1 to 5), it is usually very useful but, without a doubt, far from perfect. These systems often produce averages that are already out of date or severely punish reputations over failures that could be considered minor.

Since these rating systems fall within what we commonly call gamification, we can analyze them from the perspective of Game Theory in order to better observe their dynamics and the positive and negative aspects they may present.

What is Game Theory?

Game Theory “studies situations in which both the actions carried out by individuals and the results that come out of them depend on the actions that others can carry out” (Juan Carlos Aguado).

Since these situations, known as interdependence, depend on what others might do, they give rise to different strategies, and it therefore becomes possible to try to determine which actions the different participants might take in search of the best results.

In our case, the users and the platforms act as the players, and the rating is the game that will produce the different outcomes.

The system we will analyze

In this case, we are going to analyze a binary rating system taken from Binance: how a user is rated after a P2P sale.
We will look at the rating dynamics of the current system and ask whether there are alternatives that could achieve a fairer outcome.

Dynamics of a P2P transaction and its consequent rating

In a P2P sale, the transaction takes place as follows: the buyer transfers their fiat money directly to the seller’s bank account using the details provided and notifies the seller. After confirming that the transfer has gone through, the seller releases the tokens to the buyer.

This is how the “happy path” of a sale would end, but that may not always be the case. The seller may not receive the agreed amount of fiat money, or they may not release the tokens. In both cases, the participants could resolve the conflict by communicating with each other through chat, but if there is no resolution, the next step would be to open an appeal and ask the platform to settle the dispute.
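As a simplified model of that flow (the state names and transitions below are my own assumptions, not Binance’s actual implementation), the sale can be sketched as a small state machine:

```python
from enum import Enum, auto

# Simplified model of the P2P sale flow; states and events are
# illustrative assumptions, not Binance's actual implementation.
class SaleState(Enum):
    AWAITING_FIAT = auto()   # buyer has not yet sent the fiat transfer
    FIAT_SENT = auto()       # buyer marked the transfer as done
    COMPLETED = auto()       # seller confirmed receipt and released the tokens
    IN_DISPUTE = auto()      # appeal raised, the platform arbitrates

def next_state(state: SaleState, event: str) -> SaleState:
    """Advance the sale given an event; unknown events leave the state unchanged."""
    transitions = {
        (SaleState.AWAITING_FIAT, "buyer_marks_paid"): SaleState.FIAT_SENT,
        (SaleState.FIAT_SENT, "seller_releases_tokens"): SaleState.COMPLETED,
        (SaleState.AWAITING_FIAT, "appeal"): SaleState.IN_DISPUTE,
        (SaleState.FIAT_SENT, "appeal"): SaleState.IN_DISPUTE,
    }
    return transitions.get((state, event), state)

# Happy path: the buyer pays, the seller releases, and both are invited to rate.
state = SaleState.AWAITING_FIAT
for event in ("buyer_marks_paid", "seller_releases_tokens"):
    state = next_state(state, event)
print(state)  # SaleState.COMPLETED
```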

In this case, we will focus on a transaction where the sale is completed successfully and both participants are invited to rate each other.

The rating game

At the end of the sale, the participants are invited to rate each other’s performance positively or negatively.
Although the transaction may have been successful, there are still reasons why a participant could be rated negatively, such as the other party being unhappy with the execution times or a case of disrespectful behavior in the chat.
Given that both the perceived delay and what one or the other might consider offensive in the communication are relative perceptions, a negative vote may make no sense to the participant who receives it, unfairly affecting, in their judgment, their reputation on the platform.

Let’s lay this system out as a very simple payoff matrix to better observe the results of the vote.

Considering that a positive vote (P) is worth more than a negative vote (N)

We could say that:
P = 9
N = 3
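As a rough sketch of that matrix in code (using the illustrative values above), each participant’s score depends only on the vote the other casts:

```python
# Baseline rating game: your score depends only on the vote you receive.
# Illustrative values from above: P = 9, N = 3.
P, N = 9, 3

def payoff(my_vote: str, their_vote: str) -> int:
    """Score a participant ends up with; their own vote has no effect on it."""
    return P if their_vote == "P" else N

# Payoff matrix from one participant's perspective
# (rows: their own vote, columns: the counterpart's vote).
for my_vote in ("P", "N"):
    row = [payoff(my_vote, their_vote) for their_vote in ("P", "N")]
    print(my_vote, row)
# P [9, 3]
# N [9, 3]  -> identical rows: no strategic interdependence
```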

In this matrix, we can see that one of the main problems of this rating system is that the vote one casts has no consequence on the vote one receives. In other words, there is no room for strategy, and therefore participants have no interest in or motivation for thinking their vote through beforehand.
The final result, in the long term, will be an average of ratings per participant with a margin of error that could be questioned.

Incentive and compensation system

What would happen, then, if, in order to achieve a fairer outcome for each participant, we altered the previous system by introducing the possibility that the rating one gives also has consequences for one’s own final result?

As in the example from the introduction of the two children competing for the best portion of French fries, knowing that what the other decides will directly affect us makes us think harder before making a decision.

Let’s add to the values P and N a new value Q = 6, which will be obtained by whoever casts a P vote but receives an N in response. The values, then, would be ordered as follows:
P > Q > N

And therefore the previous matrix would now be the following:
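A minimal sketch of that modified matrix, using the same illustrative values (P = 9, N = 3) plus Q = 6:

```python
# Modified rating game: casting a P cushions you if you receive an N.
# Illustrative values: P = 9, Q = 6, N = 3.
P, Q, N = 9, 6, 3

def payoff(my_vote: str, their_vote: str) -> int:
    """Score a participant ends up with, now depending on both votes."""
    if their_vote == "P":
        return P                          # a received P is always worth 9
    return Q if my_vote == "P" else N     # received N: 6 if you voted P, 3 if you voted N

for my_vote in ("P", "N"):
    row = [payoff(my_vote, their_vote) for their_vote in ("P", "N")]
    print(my_vote, row)
# P [9, 6]
# N [9, 3]  -> the rows now differ: your own vote matters
```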

What we are looking for with this new proposal is that if, for example, P1 acted badly and admits it, they have the possibility of reducing the impact of the N rating they will probably receive from P2 and, in turn, of compensating P2 by guaranteeing them a P vote.

However, we can see that if a participant votes N, the scores they could receive are 3 or 9, and if they vote P, they could receive 6 or 9. It is worth asking why, if voting P never yields a worse score, anyone would vote the opposite.

A participant always faces the possibility of receiving anything from the lowest score (3) to the highest (9). They can only assume, with more or less confidence, what the other participant will vote; they cannot be certain. What they are certain about is how they want to rate the other’s performance, while also knowing that the other side has an incentive to vote P rather than N, which increases the probability of voting N and still receiving a P.

On the other side, the situation is the same, so both participants must weigh the decision and appeal to their good judgment. The outcome will not be known until the voting is over.
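To make that reasoning concrete, here is a small sketch of the expected score for each choice, assuming a hypothetical probability p that the counterpart votes P:

```python
# Expected score for each possible vote, given an assumed probability p
# that the counterpart votes P. Values P = 9, Q = 6, N = 3 as above.
P, Q, N = 9, 6, 3

def expected_score(my_vote: str, p: float) -> float:
    if my_vote == "P":
        return p * P + (1 - p) * Q   # receive P with probability p, otherwise a cushioned N
    return p * P + (1 - p) * N       # receive P with probability p, otherwise a plain N

for p in (0.25, 0.5, 0.75):
    print(p, expected_score("P", p), expected_score("N", p))
# 0.25 6.75 4.5
# 0.5 7.5 6.0
# 0.75 8.25 7.5
# Whatever p is, voting P never lowers one's own expected score;
# voting N only makes sense as a deliberate, honest penalty.
```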

Of course, we can never rule out a case in which a participant, let’s say P1, acts improperly and, knowing that they will receive an N rating, also chooses to vote N, unfairly punishing P2. But this will tend to resolve itself.

In the short term, a score of 3 will be unfair for P2 but fair for P1. In the long run, however, P2 will have opportunities to improve their average, while P1 will not unless they change course. This irresponsible action on P1’s part indicates that they do not care about their reputation on the platform, and in this way the system itself will push them out unless they change.
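As a rough illustration of this long-run argument (the behavior probabilities below are assumptions, not data), a quick simulation of many trades shows how the averages diverge:

```python
import random

# Hypothetical long-run comparison: an honest participant vs. a spiteful one.
# The probabilities are illustrative assumptions only.
P, Q, N = 9, 6, 3
random.seed(0)

def score(my_vote: str, their_vote: str) -> int:
    if their_vote == "P":
        return P
    return Q if my_vote == "P" else N

def average_score(p_casts_p: float, p_receives_p: float, trades: int = 10_000) -> float:
    """Average rating after many trades, given how often a participant casts and receives P."""
    total = 0
    for _ in range(trades):
        my_vote = "P" if random.random() < p_casts_p else "N"
        their_vote = "P" if random.random() < p_receives_p else "N"
        total += score(my_vote, their_vote)
    return total / trades

# P2 keeps behaving well, so future counterparts mostly rate them P.
print("P2 average:", round(average_score(p_casts_p=0.95, p_receives_p=0.9), 2))
# P1 keeps voting N spitefully and keeps earning N in return.
print("P1 average:", round(average_score(p_casts_p=0.1, p_receives_p=0.2), 2))
```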

With this, we want to make it clear that the proposal does not achieve a perfect system, but one that reduces both the weight and the probability of imprudent N votes in order to obtain fairer averages.

Conclusion

Just as it is possible to improve a simple aspect of a product, such as a rating system, the concept of fairness could also be considered at other stages and even as a base philosophy when designing the product itself. A system that by its own design minimizes or even fully eliminates erroneous results takes work and worry away from the user, greatly improving confidence in the product and, of course, their experience with it.