It is now week 4 of the Open and I have judged over 2000 videos. Two thirds of those are invalid. As I wrote in my last post, I am putting my effort into making the Open as fair to everyone as I possibly can in the short amount of time I have.
I have been super busy with things other than the Open for the last two weeks, but whenever I had some spare time, I voted a couple of videos invalid. Still, I realized I needed to make this even easier for myself.
Automatic rejection of videos
After I started integrating YouTube scraping into my application, I had the option to invalidate a whole lot of videos, and to this day I still don't understand why HQ didn't implement this from the beginning of the Open. Yes, it might take a couple of hundred thousand API requests, but HQ is charging a fee for every single contestant, and I have been able to extract info from the videos for free during the last few weeks. There are several things I am looking for, and here's the rundown of how “easy” each one is.
Is the video even retrievable?
The simplest thing to look for is whether the video can be retrieved from YouTube's API. If I get an error, it can be due to a couple of things, some of them being:
- The athlete simply entered a bogus id on YouTube to make it look like a video
- The video is private (probably 150-200 videos are marked as this)
- The video is not embeddable
Or the video cannot be played because:
- The video is using a source other than YouTube or Vimeo (the player on the website cannot handle anything else)
Out of 15809 submitted videos, 1626 (about 10.3%) cannot be played.
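As a minimal sketch of the checks above: the function names, the reason strings and the list of hosts are my own, made up for illustration. The response shape is assumed to follow the YouTube Data API v3 videos listing (an "items" list whose entries carry "status.privacyStatus" and "status.embeddable"). The API call itself is left out, and Vimeo videos are only checked by host here.

```python
from urllib.parse import urlparse

# Hosts the site player is assumed to handle (illustrative list).
SUPPORTED_HOSTS = {
    "youtube.com": "youtube",
    "youtu.be": "youtube",
    "vimeo.com": "vimeo",
}


def video_source(url):
    """Return 'youtube', 'vimeo', or None for any other host."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return SUPPORTED_HOSTS.get(host)


def playback_problem(url, api_response=None):
    """Return a reason string if the video cannot be played, else None.

    api_response is the already-parsed JSON answer for a YouTube video;
    a missing or empty answer is treated as "not retrievable".
    """
    source = video_source(url)
    if source is None:
        return "unsupported source"
    if source == "youtube":
        items = (api_response or {}).get("items", [])
        if not items:
            # Bogus id, deleted video, or anything else the API will not return.
            return "not retrievable"
        status = items[0].get("status", {})
        if status.get("privacyStatus") == "private":
            return "private"
        if status.get("embeddable") is False:
            return "not embeddable"
    return None
```

Any submission where playback_problem returns a reason would be a candidate for automatic rejection.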
Is the video's upload date correct?
This one is pretty simple to look for as well.
- For 14.1, the upload time has to be on or after the 28th of Feb
- For 14.2, the upload time has to be on or after the 7th of Mar
- For 14.3, the upload time has to be on or after the 14th of Mar
- For 14.4, the upload time has to be on or after the 21st of Mar
If a video breaks one of these rules, it was uploaded before the workout was announced, so it cannot be a recording of that workout.
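The date rule can be sketched in a few lines. The year 2014 is an assumption (the list only gives day and month), the function name is made up, and the timestamp is assumed to be an ISO 8601 string whose date part is compared as-is, without any time zone adjustment.

```python
from datetime import date

# Earliest allowed upload date per workout. The year is an assumption.
EARLIEST_UPLOAD = {
    "14.1": date(2014, 2, 28),
    "14.2": date(2014, 3, 7),
    "14.3": date(2014, 3, 14),
    "14.4": date(2014, 3, 21),
}


def uploaded_too_early(workout, uploaded_at):
    """Return True if the upload timestamp is before the workout's earliest date.

    uploaded_at is an ISO 8601 string such as "2014-03-01T12:34:56.000Z";
    only the leading YYYY-MM-DD part is used.
    """
    upload_day = date.fromisoformat(uploaded_at[:10])
    return upload_day < EARLIEST_UPLOAD[workout]
```

A video uploaded late in the evening of the announcement day in one time zone may carry a different date in another, so a real check would probably want some slack around the boundary.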
Is the video's length correct?
This one is a little harder to check, as we need to take some logic into consideration. Luckily, 3 of the current 4 workouts are AMRAPs with a set time.
- For 14.1, the video has to be at least 10 minutes long
- For 14.3, the video has to be at least 8 minutes long
- For 14.4, the video has to be at least 14 minutes long
- For 14.2, if the score is between 41 and 88 reps, the video has to be at least 6 minutes long
- For 14.2, if the score is between 89 and 144 reps, the video has to be at least 9 minutes long
- For 14.2, if the score is between 145 and 208 reps, the video has to be at least 12 minutes long
- For 14.2, if the score is between 209 and 280 reps, the video has to be at least 15 minutes long
- For 14.2, if the score is between 281 and 360 reps, the video has to be at least 18 minutes long
This introduced some errors, so I modified the checks above so that they would not tag submissions with a score of 30 or lower. I did this to “accept” videos that are short recordings of people doing a limited number of reps due to missing equipment, injuries or any other reason. Those kinds of videos should be accepted, right?
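Put together, the length rules and the low-score exemption could look roughly like this. It is a sketch with made-up names, not the code my application runs. It assumes the duration is available in seconds and that a 14.2 score outside the listed ranges is simply not tagged.

```python
# Scores at or below this are never tagged (the exemption described above).
SCORE_EXEMPT_UP_TO = 30

# Workouts with one fixed minimum length, in minutes.
FIXED_MINUTES = {"14.1": 10, "14.3": 8, "14.4": 14}

# 14.2: (lowest score, highest score, minimum minutes).
ROUNDS_14_2 = [
    (41, 88, 6),
    (89, 144, 9),
    (145, 208, 12),
    (209, 280, 15),
    (281, 360, 18),
]


def minimum_minutes(workout, score):
    """Return the required video length in minutes, or None if no rule applies."""
    if score <= SCORE_EXEMPT_UP_TO:
        return None
    if workout in FIXED_MINUTES:
        return FIXED_MINUTES[workout]
    if workout == "14.2":
        for low, high, minutes in ROUNDS_14_2:
            if low <= score <= high:
                return minutes
    return None


def is_too_short(workout, score, duration_seconds):
    """Return True if the video is shorter than the rule for this score demands."""
    minutes = minimum_minutes(workout, score)
    return minutes is not None and duration_seconds < minutes * 60
```

For example, is_too_short("14.3", 100, 479) flags a 14.3 video that is one second short of 8 minutes, while any submission with a score of 30 or lower passes regardless of length.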
Submitting the judging of the video
I have manually submitted judgings for quite a lot of the videos tagged by the rules above, but I also looked into a clever way to automate this. CrossFit HQ built the website in an old-school fashion, with invalidations submitted via raw HTTP requests, so I could write a scraper that checks whether I have already submitted a judging, the same way I fetch data from the submissions. I do have to log in to submit a judging, but thanks to the open community around the programming language I use, I quickly found a way to imitate a user and “log in”. Exactly how this is done can be seen in this gist from GitHub (the code is ugly, but it works).
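To give a rough idea of the general pattern without repeating the gist, here is a sketch in Python using only the standard library. Every URL and form field name below is hypothetical and does not describe the real site; the point is only that a cookie-keeping client can post a login form and then post a judging with the same session.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Hypothetical endpoints, for illustration only.
LOGIN_URL = "https://example.invalid/login"
JUDGE_URL = "https://example.invalid/judge"


def make_session():
    """Return an opener that keeps cookies between requests, like a browser."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def form_post(url, fields):
    """Build (but do not send) a form-encoded POST request."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def login_request(username, password):
    # Field names are assumptions; a real form may use other names or a token.
    return form_post(LOGIN_URL, {"username": username, "password": password})


def invalidate_request(submission_id, reason):
    # Field names are assumptions as well.
    return form_post(
        JUDGE_URL,
        {"submission": submission_id, "verdict": "invalid", "reason": reason},
    )
```

Sending would then be opener.open(login_request(...)) followed by opener.open(invalidate_request(...)) on the same opener, so the session cookie from the login is reused for the judging.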