The Free TON community held a contest in which participants had a unique opportunity to swap places with the judges and evaluate those who usually evaluate the contestants — the Free TON jury.
Free TON is distinguished by its active use of competitive procedures to develop the community and the project as a whole. Contests are designed to distribute tokens among winners, improve blockchain tools, and create interesting and useful products for the community. The community jury evaluates contest entries by awarding points or rejecting a work, thereby ranking entries and determining the winners.
Contests with a large number of participants (about a hundred or more) and a creative focus (essays, heroes, videos) suddenly drew the attention of many Free TON community members to the jury. Some of the judges’ decisions caused a wave of heated debate, which eventually led to a request to review the jury’s activity.
Analytics & Support Subgovernance took the initiative to study and evaluate the work of the juries. The Subgovernance held a contest in which it invited participants to develop metrics of juror efficiency, either unified or specific to a Subgovernance or professional area. Based on these metrics, the efficiency of several or all Free TON juries was to be analyzed.
We present to your attention the main results and conclusions of the contest.
Work № 31, which took first place, implements several key tasks:
- collection of statistical data related to Jury activity;
- conversion of the data into indicators of Jury activity;
- development of a graphical interface that provides a simple and clear interaction with users.
The interactive interface model is the most important and strongest aspect of the work. Moreover, it is a unique proposal among all the ideas implemented by the contestants.
In general, the work is presented in the form of interactive panels (Dashboards), where the data is divided into three main sections: Indicators, Judgment, Jury member.
On the first page, information is available both for the community as a whole and for each Subgovernance separately. Users decide which data group to display and set the desired parameter. In addition to statistical data on Subgovernances, contests, applications and voting, the overall performance indicators of the jury are presented here. According to the author’s idea, the result is displayed graphically in the form of an overall jury rating (TOP-10 or the entire list) based on several parameters.
The second page contains general statistics of contests and the main parameters of judging. All data can be filtered by the project as a whole, by each Subgovernance and by a separate contest. As an additional study of judging, a cross-tabulation is offered, which presents the percentage of coincidences in marks between different members of the jury in specific contests or generally. The author believes that such data allow establishing the facts of synchronous voting and hidden agreements of judges.
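The cross-tabulation described above can be sketched in a few lines. This is a minimal illustration with made-up juror names and marks, not real contest data: for each pair of jurors, it computes the percentage of commonly evaluated submissions that received identical marks.

```python
from itertools import combinations

# Hypothetical data: marks[juror][submission] = mark given.
marks = {
    "juror_a": {"s1": 7, "s2": 5, "s3": 9},
    "juror_b": {"s1": 7, "s2": 5, "s3": 8},
    "juror_c": {"s1": 3, "s2": 5, "s3": 9},
}

def coincidence_pct(a: dict, b: dict) -> float:
    """Share of commonly evaluated submissions scored identically."""
    common = a.keys() & b.keys()
    if not common:
        return 0.0
    same = sum(1 for s in common if a[s] == b[s])
    return 100.0 * same / len(common)

for x, y in combinations(marks, 2):
    print(x, y, round(coincidence_pct(marks[x], marks[y]), 1))
# juror_a juror_b 66.7
# juror_a juror_c 66.7
# juror_b juror_c 33.3
```

Consistently high coincidence percentages between the same pair of jurors across many contests would be the kind of signal the author treats as evidence of synchronous voting.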
The third page shows the personal statistics of each Jury member. There is detailed information about personal activity available on the Free TON forum and in Telegram.
There is also a separate section dedicated to the Jury activity. It is presented through the following indicators:
- percentage of voting;
- percentage of vote completeness;
- number of cast votes;
- number of «abstain» and «reject» votes;
- average mark given by a Jury member;
- the amount of Jury rewards.
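Most of the indicators in the list above reduce to simple counting over a juror's vote records. The sketch below is an illustrative assumption about how such records might be shaped (a mark, the strings "abstain"/"reject", or None for a missed vote); the real dashboard's data model is not described in the source.

```python
# Hypothetical vote records for one juror across five submissions.
votes = {"s1": 8, "s2": "reject", "s3": None, "s4": "abstain", "s5": 6}

total = len(votes)
cast = [v for v in votes.values() if v is not None]
numeric = [v for v in cast if isinstance(v, int)]

voting_pct = 100.0 * len(cast) / total                 # percentage of voting
abstains = sum(1 for v in cast if v == "abstain")      # «abstain» count
rejects = sum(1 for v in cast if v == "reject")        # «reject» count
avg_mark = sum(numeric) / len(numeric) if numeric else 0.0

print(voting_pct, abstains, rejects, avg_mark)  # 80.0 1 1 7.0
```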
The author also added a visualization that compares each member’s average marks with the overall average voting results.
Thus, the work is a promising tool for collecting, analyzing and displaying data on judging in Free TON. Although it does not yet contain fully developed solutions for evaluating Jury efficiency, it has the potential to integrate new approaches to the analysis and evaluation of judging. In addition, good visualization lowers the entry threshold into the project for all interested parties and increases the degree of interaction with the community.
Work № 29, which took second place, identifies the main problems that reduce Jury efficiency, as well as options for solving them. In addition, the author developed criteria for the analysis of judging and demonstrated their use by studying the voting in the Free TON Virtual Hero contest.
The author believes that the main obstacle to the effective implementation of judicial tasks is an incorrectly drawn up Contest Proposal, which does not specify the evaluation methodology. Therefore, a special system of criteria should be developed for each contest, which guides the jury in the tasks and priorities of the contest.
The following are also mentioned as factors reducing efficiency: the lack of influence of the «reject» mark on a submission’s average score, poor-quality (uninformative) Jury comments, and the possibility for judges to skip studying the submissions and simply copy the marks given by other judges. Appropriate solutions are proposed:
- take into account the «reject» option in the calculation of the average score;
- oblige the jury to write detailed comments;
- hide the points until the end of the voting;
- provide the community with more information about the time and sequence of voting for each Jury member;
- introduce a system of fines for judges up to exclusion from the jury.
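The first proposal in the list, counting «reject» in the average, can be illustrated with a small sketch. The assumption that a reject counts as zero is mine for illustration; the source does not specify the exact formula.

```python
def average_score(marks, reject_as_zero=True):
    """Average a submission's marks; a «reject» optionally counts as 0.
    «abstain» votes are excluded either way."""
    values = []
    for m in marks:
        if m == "reject":
            if reject_as_zero:
                values.append(0)
        elif m != "abstain":
            values.append(m)
    return sum(values) / len(values) if values else 0.0

marks = [9, 7, "reject", "abstain"]
print(average_score(marks, reject_as_zero=False))           # 8.0
print(round(average_score(marks, reject_as_zero=True), 2))  # 5.33
```

Under the current rules (first line) the reject is invisible; under the proposed rule (second line) it pulls the submission's average down.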
The author uses the following two criteria for evaluating effectiveness:
- The percentage of work completed by the judge, i.e. the ratio of evaluated works to the total number of submissions.
- Deviation of the score given by the judge from the average score of other judges.
The second indicator is key; for more detailed analysis it is used in variations that account for deviations toward underestimation (shown in red in the table) or overestimation (shown in green), both across all works and separately for prize-winning ones.
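The two criteria can be sketched as follows. The data and juror names are invented for illustration; a positive mean deviation indicates overestimation relative to colleagues, a negative one underestimation.

```python
# Hypothetical data: scores[juror][submission] = mark;
# a missing key means the judge did not evaluate that work.
scores = {
    "j1": {"s1": 8, "s2": 6, "s3": 4},
    "j2": {"s1": 7, "s2": 5},
    "j3": {"s1": 9, "s2": 7, "s3": 5},
}
all_subs = {"s1", "s2", "s3"}

def completion_pct(juror: str) -> float:
    """Criterion 1: share of submissions the judge evaluated."""
    return 100.0 * len(scores[juror]) / len(all_subs)

def mean_deviation(juror: str) -> float:
    """Criterion 2: average deviation of the judge's mark from the
    mean of the other judges' marks on the same submission."""
    devs = []
    for sub, mark in scores[juror].items():
        others = [scores[j][sub] for j in scores if j != juror and sub in scores[j]]
        if others:
            devs.append(mark - sum(others) / len(others))
    return sum(devs) / len(devs) if devs else 0.0

print(round(completion_pct("j2"), 1))   # 66.7
print(round(mean_deviation("j1"), 3))   # -0.333
```

Here j1's slightly negative deviation suggests a mildly stricter judge than the colleagues; restricting the same computation to prize-winning submissions gives the variation the author uses for a finer analysis.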
The work demonstrates a consistent and reasoned approach to analysis. First, a general algorithm for using the criteria is described, and then it is applied to the results of the Virtual Hero competition, showing how unscrupulous judges can be identified.
Submission № 18 takes a critical, step-by-step approach to developing and applying criteria for evaluating Jury activity.
Initially, the author tests five main voting indicators using the example of the Essay Contest data analysis. Then he discusses the analytical potential of these metrics in relation to the evaluation of the jury’s work and forms a set of indicators for the final report of the Jury activity in Subgovernances.
The obtained results allow conclusions about a Juror’s influence on voting, their rigor in evaluating submissions, and their effect on the distribution of grades and the formation of the rating. In addition, conclusions about independence and versatility of assessment can be drawn from analyzing the uniqueness and argumentation of the judges’ comments. At the same time, the author critically evaluates his own results and notes that they only characterize the involvement of judges in voting and do not allow a direct evaluation of their effectiveness.
The main problems of voting, which were described by many contestants, relate to several aspects of Jury activity.
Firstly, insufficient participation and involvement of the jury: ignoring voting altogether, voting at the last moment, giving random points, or copying colleagues’ points after only a superficial study of the submissions.
Secondly, the excessive subjectivity of the jurors. It manifests itself in judges’ personal preferences expressed without reference to the community’s interests and the contest requirements. In addition, the level of subjectivity increases when jury members have to judge contests outside their qualifications and professional experience.
Thirdly, the lack of transparency in the evaluation. This aspect is closely related to the problem of subjectivity described above. The approach to scoring is left to the judge’s discretion and is not public information, which provides grounds for speculation and accusations of jury incompetence. Carelessness and inconsistency of assessment are inevitable when there are no clear criteria and guidelines.
Suggestions for Improving Judging
The authors of some works not only analyzed data on current Jury activities and identified problematic aspects, but also put forward proposals for improving the process.
There are proposals to introduce qualification levels for the jury and a differentiated approach to judges’ remuneration, depending on their level as well as on the number of evaluated submissions. It is also proposed to apply penalty points and sanctions against the jury for non-participation in voting and unfair work (corresponding examples are well described in work № 17).
In addition, special attention is paid to the jury’s comments on the points awarded. Contestants express the need to improve the quality of comments, in particular to increase their uniqueness, informativeness, argumentation and ethics.
Perhaps the most sensible suggestion is to use scoring tables that form the final grade.
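A scoring table of this kind usually means a fixed set of criteria with weights that combine into the final grade. The criteria names and weights below are purely illustrative assumptions, not taken from any contest proposal:

```python
# Hypothetical scoring table: criterion -> weight (weights sum to 1.0).
criteria = {
    "relevance": 0.4,
    "quality": 0.4,
    "originality": 0.2,
}

def final_grade(per_criterion: dict) -> float:
    """Weighted sum of per-criterion marks on a 1-10 scale."""
    return sum(criteria[c] * mark for c, mark in per_criterion.items())

print(round(final_grade({"relevance": 8, "quality": 6, "originality": 10}), 2))  # 7.6
```

Because every per-criterion mark is published alongside the weights, the final grade becomes reproducible by anyone, which is exactly what shifts attention away from free-form comments.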
The Problem of Jury’s Comments
The Jury’s comments on the works are considered not only in the contest papers but are also actively discussed in chats and on the community forum. Therefore, we will outline several important points.
Why do comments get so much attention?
Obviously, comments are an interesting piece of communication between the judges and the community as such, but more important is the fact that comments are the only clue to understanding the scores given by the jury. Nothing else can clarify the situation. It is not surprising that ever more demands are being made on comments, demands that judges cannot fulfill, especially in contests with a large number of submissions.
How can we remove focus from comments?
By offering a transparent, open, criteria-based grading system. In that case, commenting either disappears completely or remains as an addition, as live feedback in which it is quite acceptable to express a subjective impression.
Efficiency Indicator and Community Opinion
As a result of the contest, we received materials presenting various approaches and criteria for evaluating the jury’s activity. However, not all of them belong in a Jurors Efficiency Indicator. Effectiveness should comprise clearly measurable criteria and be based on requirements for judges. If the Efficiency Indicator is to be used to reward or penalize jury members, these requirements should not be difficult to fulfill; the current stage of the Free TON project’s development and the real capabilities of judges should be taken into account.
While many metrics should not be included in the Efficiency Indicator, they remain an important and telling demonstration of Jury activity. With their help, we can reveal interesting facts about the quality of the jury’s work, draw the community’s attention to it, and influence the reputation of judges. Thus, the activity of jury members can be regulated both by an automated Efficiency Indicator and by public opinion.
We will continue to inform you about the progress of the Jury Efficiency issue and will soon present the opinions of significant people in the community on the problems and prospects of the Free TON judging system.