Live marking during the competition

Published on 23 September 2020, 13:58

In our previous two blog posts, we wrote about automated testing and our streaming setup. Now, in the final part of this three-part series, we will show you how we combined those two pieces to provide a real-time ranking for the Speed Challenge at ICTskills2020.

[Image: Infrastructure overview, explained in detail below]

Accessing the competitors' source code

The first challenge was how to access the competitors' source code, which we obviously needed to run the tests against. Our solution was to build a private network connecting all competitor clients and the marking server. Every 3 seconds, the marking server rsyncs the task files from the competitors to the server, over SSH with public key authentication. Every task contains a src folder, and competitors are only allowed to change files within that folder, so rsync only has to synchronize the src folders. And because of the way rsync works (it only copies changed files to the server), there was very little network traffic, and we could keep the short 3-second interval for near real-time marking.
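The sync step described above can be sketched as a small wrapper around rsync. The hostnames, SSH key path, and directory layout below are assumptions for illustration, not our actual configuration:

```python
import subprocess

# Hypothetical hostnames of the competitor VMs in the private network.
COMPETITORS = ["competitor-01", "competitor-02"]

def build_rsync_command(host: str) -> list[str]:
    """Build the rsync call that copies only a competitor's src folders."""
    return [
        "rsync",
        "-az",       # archive mode preserves modification times; -z compresses
        "--delete",  # remove files on the server that the competitor deleted
        "-e", "ssh -i /etc/marking/id_ed25519",  # public key auth, no passwords
        # Filter rules: traverse all directories, copy everything below any
        # src folder, and exclude every other file.
        "--include=*/", "--include=*/src/***", "--exclude=*",
        f"{host}:/home/competitor/tasks/",
        f"/srv/marking/{host}/tasks/",
    ]

def sync_all() -> None:
    """One iteration of the 3-second sync loop."""
    for host in COMPETITORS:
        subprocess.run(build_rsync_command(host), check=True)
```

Archive mode (`-a`) is what keeps the file modification timestamps intact on the server, which becomes important for the incremental test runs described further below.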

Running the tests

Although competitors had the tests for all tasks and were able to run them on their own, all tests were also run on the marking server, and only those results were stored and displayed on the scoreboard. This prevented competitors from changing the unit tests.

The tests were executed in a dedicated Docker container per competitor. The Docker image was built in advance with the same environment and tools that the competitors had installed on their VMs (Ubuntu, Node.js 14, PHP 7.4, ...) to ensure everything worked the same way on the client VM and on the server. As the image already contained all tests, only the src folder had to be mounted as a volume into the container. And because each container removed itself after one test run, everything was reset to the initial state for the next run. As you can see, Docker helped us a lot here: it made managing and parallelizing the tests very easy.
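As a rough sketch, launching one disposable test container could look like the following. The image name, mount paths, and entrypoint are hypothetical placeholders; the real image contained the full competition toolchain plus all tests:

```python
def build_docker_command(competitor: str,
                         image: str = "marking-env:latest") -> list[str]:
    """Build a `docker run` call for one test run of one competitor."""
    return [
        "docker", "run",
        "--rm",               # container removes itself after the test run,
                              # resetting everything to the initial state
        "--network", "none",  # the tests need no network access
        # Mount only the synchronized src folders, read-only; the tests
        # themselves are already baked into the image.
        "-v", f"/srv/marking/{competitor}/tasks:/tests/src:ro",
        image,
        "run-tests",          # hypothetical entrypoint that emits results
    ]
```

Because each run is a fresh container over a read-only mount, runs cannot interfere with each other, and running many competitors in parallel is just a matter of starting several containers at once.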

Incremental tests

Running the tests for all tasks took about 30-40 seconds per competitor. Since the competitors were tested in parallel, this was also roughly the total time for all competitors, not just one. Although this timeframe would certainly have been okay, we still wanted to improve it further and deliver results as close to real time as possible.

We were able to achieve this by running incremental tests, meaning we only ran tests for tasks that had been modified since the last test run. As rsync also synchronizes the file modification timestamps when given the correct flags, we already knew which files were changed and when. The only things left to do were to map the modification timestamps to the individual tasks and to allow passing a list of tasks to test into the Docker container.

As our tests for PHP, JavaScript, and regular expressions took only a few milliseconds, we got the delay down to ~4 seconds (3 of which come from the files only being synchronized every 3 seconds). Only the end-to-end tests with Cypress for the HTML/CSS tasks took longer, about 15 seconds, because Cypress and then the Chrome browser have to boot first. But since those tests were now only executed when a competitor had changed something in one of those two tasks, we could definitely live with that; we estimate that ~80% of all test runs completed in less than 4 seconds.

Storing the test results

The last part was to store the results and display/update them on the website and on the stream.

For that, we used an independently hosted, publicly available server. We didn't want to store the results on the marking server at the competition site, as accessing it from the public internet could have introduced security risks. For example, if its IP address had been known (because the website refreshes the scoreboard from it), it would have been exposed to DDoS attacks that could have affected the whole competition and live-stream. Instead, the marking server called an API provided by the independently hosted server to store all test runs in its database, so data only flowed from our internal network to the public server and never the other way around.

Additionally, a WebSocket server was running on that public server, and all users of our website, as well as the live-stream overlays, were connected to it. Whenever a competitor's score changed, the server pushed the new ranking to all connected clients, which could update it in real time without any additional requests.
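The push mechanism boils down to broadcasting one message to every open connection. A minimal sketch, assuming a simple in-memory registry of connections and a message shape of our own invention:

```python
import asyncio
import json

# Registry of open WebSocket connections (website users and stream overlays).
connected: set = set()

async def broadcast_ranking(ranking: list[dict]) -> None:
    """Push the new ranking to every connected client concurrently."""
    message = json.dumps({"type": "ranking", "data": ranking})
    if connected:
        await asyncio.gather(*(client.send(message) for client in connected))
```

Each client would register itself in `connected` on connect and be removed on disconnect; the scoreboard API handler would call `broadcast_ranking` whenever a stored test run changes the ranking.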

To make sure the website would always be available, even with a large number of live-stream viewers, it was served from a global CDN that cached everything on its edge servers, so only very few requests actually hit our public server. Because of that, the chance of our website going down was very small, and it really didn't go down. Even if it had, the CDN would have served the last available version until the server was reachable again, so our users would still have been able to watch the live-stream.

Going further

The Speed Challenge was the first time that we had real-time automated marking that was also displayed on the public website. We are looking to expand the real-time automated marking to the whole competition next time, as it also provides us with valuable insights. For example, we could generate live performance graphs for every competitor and get a more detailed view of their skills in each part (like how fast they are at a specific JS task or how quickly they can debug errors). Although that data could also be displayed publicly, we will probably restrict the public display to a single part of the competition, like the Speed Challenge, and use the real-time information internally for the other parts. That's for two reasons: we want to be able to make adjustments in case an automated test doesn't work as expected and awards points incorrectly, and we still want a medal ceremony where the candidates are excited and don't already know their ranking in advance.