Algorithm Description
We form an estimate of the skill of each member in each event. This estimate has several free parameters, which we set via leave-one-out cross-validation. The resulting skills are then used to assemble a team with a high predicted capability while satisfying preferences.
Estimating Skills for Members on Events
There are four steps to estimating skills. In the first step, we compute a skill observation based on event data from past ranks or scores from competitions or practice tests. In the second step, we adjust skill observations to factor in the difference in difficulty between practice tests and/or competitions. In the third step, we use leave-one-out cross-validation to set the free parameters that are necessary to calculate adjusted skill observations. In the fourth step, we calculate the skill estimates to be displayed to the user.
Step 1: Skill Observations
We process the ranks from past competition data and the scores of practice test data to create skill observations.
For each member \(m\), event \(e\), and competition \(c\), let the skill observation be:
$$s_{m e c}=1-\frac{r_{m e c}}{t_{c}}$$
where \(r_{m e c}\) is the observed rank member \(m\) got in event \(e\) for competition \(c\) and \(t_c\) is the total number of teams in competition \(c\).
Similarly, for each member \(m\), event \(e\), and practice test \(p\), let the skill observation be:
$$s_{m e p}=\frac{g_{m e p}}{t_{e p}}$$
where \(g_{m e p}\) is the grade in points that \(m\) got for event \(e\) for practice test \(p\) and \(t_{e p}\) is the maximum number of points a member could get for practice test \(p\) in event \(e\).
Let \(X_{m e}=\{x_1,x_2...x_n\}\) be the set of competitions \(c\) and practice tests \(p\) where member \(m\) participated in event \(e\) and let \(X=\{x_1,x_2...x_n\}\) be the set of all competitions \(c\) and practice tests \(p\).
Let \(s_{m e x}\) be the normalized skill for event \(e\) and member \(m\) in either a competition \(c\) or practice test \(p\).
Step 2: Skill Observation Adjustments
To factor in the difference of difficulty between practice tests and/or competitions when estimating skill, we use free parameters to adjust skill observations.
Each competition \(c\) or practice test \(p\) has a weight \( 0 < w_x < 1 \) and a bias \( -1 < b_x <1 \).
The final adjusted observed skill is:
\(f_{m e x} = w_x \cdot s_{m e x} + b_x \)
Step 3: Leave-One-Out Cross-Validation
We use leave-one-out cross-validation to estimate the difficulty weights and biases of each competition and practice test.
- If \(|X_{m e}\text{ } \backslash\{x_k\}| =0\), then \(\widehat{f_{m e x_k}}=a \), where \(0< a <1\) is the initial adjusted observed skill with no experience.
- Otherwise, for each instance \(x_k\) left out, we calculate predictions of all observed skills \(\widehat{s_{m e x_k}}\) by averaging the adjusted observed skills of the other competitions and practice tests and converting the average to the estimated skill observation.
$$ \widehat{f_{m e x_k}} = \frac{1}{|X_{m e}\text{ } \backslash\{x_k\}|} \sum_{x \in X_{m e}\text{ } \backslash\{x_k\}} f_{m e x}$$
$$ \widehat{s_{m e x_k}} = \frac{\widehat{f_{m e x_k}} - b_{x_k}}{w_{x_k}}$$
-
Let \(\mathcal{M}_{x} \) be the set of all member-event pairs \((m, e)\) for a competition or practice test instance \(x\). For any instance \(x_k\), the loss \(L\) is:
$$ L_{x_k} = \frac{1}{|\mathcal{M}_{x_k}|} \sum_{(m, e) \in \mathcal{M}_{x_k}} (s_{m e x_k} - \widehat{s_{m e x_k}})^2$$
-
The total loss \(L\) that we are trying to minimize is:
$$L = \frac{1}{|X|} \sum_{x_k \in X} L_{x_k}$$
-
We minimize the total loss with respect to \({\{w_x, b_x\}}_{x \in X} \) and \(a\) using SciPy's optimize.minimize function.
Step 4: Calculating Outputs
Using the weights and biases calculated from the previous step, we compute the skill estimates which are displayed.
- Competition or practice tests instances without overlapping member-event pairs \((m, e)\) are given the average weight (\(w_{avg}\)) and bias (\(b_{avg}\)) of the other instances that do have overlapping pairs. If there are no instances with overlapping pairs, \(w_{avg}\) is set to \(0.5\) and \(b_{avg}\) is set to \(0\).
- We then predict the next competition or practice tests' adjusted skill observation from the other competition or practice test adjusted skill observations:
$$\widehat{f_{m e x_{n+1}}}= \frac{1}{|X_{m e}|} \sum_{x \in X} f_{m e x}$$
Or if there are no other data on an \((m, e)\) pair, then we let \(\widehat{f_{m e x_{n+1}}}=a\).
$$\widehat{s_{m e x_{n+1}}}=\frac{\widehat{f_{m e x_{n+1}}} - b_{avg}}{w_{avg}}$$
- We then attempt to add an experience factor that gives members with more experience better estimated skill and prevents members who were carried during one or two competitions to get relatively high skill scores. Let \( 0 < P < 1 \) be a parameter that a user can specify, which indicates the slope of the linear increase in skill relative to experience. Let \(E_{m e}\) represent the number of competitions or practice tests member \(m\) participated in event \(e\) in total in the past.
$$\widehat{(s_{m e x_{n+1}})'}=\widehat{s_{m e x_{n+1}}}+P\cdot E_{m e}$$
- Let \(S\) represent the set of all \(\widehat{(s_{m e x_{n+1}})'}\) for \(\forall (m, e) \in \mathcal{M}_{x} \forall x \in X \) We then calculate \(s_{U} = \max{S}\) and \(s_{u} = \min{S}\) and scale every skill from \(0\) and \(1\):
$$\widehat{(s_{m e x_{n+1}})''}=\frac{\widehat{(s_{m e x_{n+1}})'}-s_{u}}{s_{U}-s_{u}}$$
\(\widehat{(s_{m e x_{n+1}})''}\) represents the skill estimates for any next competition/practice test for a member \(m\) in an event \(e\).
Generating Teams
There are multiple steps to generating teams. In the first step, we define restrictions for adding a member to a particular event on a team. In the second step, we define the team score function which describes how good a team is in terms of both preferences and skill. In the third step, we generate teams that maximize the team score.
Step 1: Team Validity Constraints
Define \( v(T, m, e) \) to check if adding member \( m \) to event \( e \) in team \( T \) is valid:
-
Event Capacity: Let \(T_e\) be the set of members on event \(e\) for team \(T\) and let \(E_3\) represent set of events with a maximum of 3 members (instead of the usual 2). Each event \( e \) can have at most \( k \) members, where \( k = 2 \) for standard events and \( k = 3 \) for events that explicitly allow three members.
\[
|T_e| < \begin{cases}
2 & \text{if } e \notin E_3, \\
3 & \text{if } e \in E_3
\end{cases}
\]
-
Unique Assignment: A member \( m \) cannot already be assigned to event \( e \).
\[
m \notin T_e
\]
-
Maximum Events per Member: Each member \( m \) can only participate in a limited number of events out of all the available events \(E\), defined as \( o_{\Gamma} \).
\[
|\{e \in E : m \in T_e\}| < o_{\Gamma}
\]
-
Maximum Build Events per Member: A member \( m \) can only participate in at most \( o_B \) events in the set of all build events \( B \).
\[
|\{e \in B : m \in T_e\}| < o_B
\]
-
Conflict-Free Scheduling: A member \( m \) cannot be assigned to events \( e_1 \) and \( e_2 \) that conflict in the competition schedule. Let \(C_e\) be the set of all events that conflict with event \(e\).
\[
\forall e' \in C_e, m \notin T_{e'}.
\]
-
Max Team Size: There cannot be more than \(o_M \) members on a team \(T\). Let \(A_T\) represent the set of unique members on a team \(T\).
\[
|A_T| \le o_M
\]
-
Senior Constraint: Let \(M\) represent the set of all the members \(m\) on team \(T\). Let \(q_m\) be a binary indicator bit for whether a member \(m\) is a senior (\(1\)) or not (\(0\)). There can be at most 7 seniors across all events in the team.
\[
|\{m \in M : q_m = 1, m \in T_e \}| \leq 7.
\]
Validity Check: To check if assigning \( m \) to \( e \) on team \(T\) is valid, evaluate:
\[
v(T, m, e) = \text{True if all constraints above are satisfied, False otherwise.}
\]
Step 2: Team Scoring Function
Define the function \(h(T)\) to score how good a team \(T\) is, as a combination of preference scores, skill scores, partner preferences, and required members on specific events.
-
Preference Score:
Let \(d\) represent the maximum number of preferred events any member listed.
Let \(p_{me}\) represent the preference rank member \(m\) listed for event \(e\). If member \(m\) didn't list \(e\) at all when listing preferences then \(p_{m e}=d+1\).
$$h_P(T)=\sum_{e \in E} \sum_{m \in T_e} \frac{d - p_{m e} + 1}{d}$$
-
Skill Score:
$$h_S(T)=\sum_{e \in E} \sum_{m \in T_e} (\widehat{s_{m e x_{n+1}}})'' $$
-
Partner Preferences Bonus:
Let \(Y\) represent the set of user-specified member-member-event triples \((m_1, m_2, e)\), which means members \(m_1\) and \(m_2\) want to be partners for event \(e\). If event \(e\) for members \(m_1\) and \(m_2\) is not specified originally, then it is equivalent to making the triples \((m_1, m_2, e) \forall e \in E \).
$$h_M(T)=\sum_{e \in E} \sum_{(m_1, m_2, e) \in Y} I(m_1 \in T_e \land m_2 \in T_e)$$
(\(I\) is the indicator function, which returns \(0\) if the boolean expression is \(False\) and \(1\) if it is \(True\))
-
Required Pairs Bonus:
Let \(K\) represent the set of user-specified required member-event pairs \((m, e)\) that require member \(m\) to be on event \(e\).
$$h_K(T)=\sum_{(m,e) \in K} \begin{cases}
10000 & \text{if } m \in T(e), \\
0 & \text{otherwise}
\end{cases}
$$
-
Final Team Score:
Let \(0 < w < 1\) represent the parameter that can be changed by the user. If \(w\) is closer to \(1\), then the algorithm values skill more over preferences and if \(w\) is closer to \(0\), then it values preferences more over skill.
$$h(T) = w \cdot h_S(T) + (1-w) \cdot (h_P(T)+h_M(T)) + h_K(T)$$
Step 3: Optimization Problem
-
Find Optimal Event Addition:
Define a function \(j_E(T, m, z)\), which finds an optimal event addition \(e\) to put member \(m\) on for a team \(T\) and a boolean parameter \(z\) which represents whether the event addition is necessary or not.
Let \(T_m\) represent the set of events member \(m\) is assigned to for team \(T\).
Let \(o_{\gamma}\) represent the minimum number of events any member \(m \in M\) can be assigned to at once.
- For a member \(m \in M\), iterate over all \(e \in E\)
- if \(v(T, m, e)=True\), create \(T'\), which has member \(m\) added to \(T_e\). Then, define \(\Delta h = h(T') - h(T)\).
- if \(v(T, m, e)=False\), iterate over all members \(m_o \in T_e, m_o \ne m \):
- If \(|T_m| > o_{\gamma} \): Define a \(T'\), which has \(m_o\) removed from \(T_e\). If \(f(T', m, e)=True\), then define a \(T''\), which has member \(m\) added to \(T'_e\).
- If \(z=False\), then consider replacing \(m_o\) with \(m\) in \(T_e\) as valid only if \(\Delta h = h(T'')-h(T') \ge 0\)
- Otherwise, always consider replacing \(m_o\) with \(m\) in \(T(e)\) as valid (even when \(\Delta h = h(T'')-h(T') < 0\))
The output of \(j_E(T, m, z)=T^*\), where \(T^*\) is the team which maximizes \(\Delta h\) by adding an event to a member. If there are no valid event additions for member \(m\), it just returns the original team \(T\).
-
Find Optimal Member Addition:
Define a function \(j_M(T, M_j)\) for finding an optimal member addition \(m\) out of a set of possible members \(M_j\) to a team \(T\).
-
For a team \(T\), iterate over all members not already on the team (\(m \in M_j \backslash A_T\)).
- Let \(T^@\) be a copy of \(T\)
-
Repeat \(o_{\gamma} \) times:
- \(T^@=j_E(T^@, m, True)\)
Repeat \(o_{\Gamma} - o_{\gamma}\) times:
- If possible, \(T^@=j_E(T^@, m, False)\)
The output of \(j_M(T, M_j)=T^\#\), where \(T^\#\) is the team which maximizes \(h(\ldots)\) by adding a member.
-
Make a Team:
Now, we generate a team using the two functions defined above.
Let \(\mathcal{R}\) represent the set of all members required to be on the team (including all the members in \(K\))
-
Start with empty \(T\)
-
using \(j_M(T, \mathcal{R})\) exactly \(|\mathcal{R}|\) times, fill the events with required members (people that need to be on the team \(T\))
-
using \(j_M(T, M)\) exactly \(o_M-|R(T)|\) times, fill the events with any other available members
-
In the case where some events are empty (due to it being a new event and members neither have experience nor prefer to fill the event) we randomly assign a member who can fill it without breaking any constraints.
If you have ideas on how to improve the algorithm, please contact us!