How I used data analytics to pass my exam

Feb 1, 2017

This is a story about how I got accepted for master’s degree studies at Warsaw School of Economics (SGH), passing its test without seriously studying economics beforehand.

Exams, exams, exams

Some background: SGH is one of the top universities for economics in Poland, as well as in the CEE region. To get into master’s degree studies, you either need a bachelor’s degree from SGH (then it depends on your grades) or have to pass a qualifying test (then it depends on your test score).
As a computer science graduate, I only had the second option. If you prepare for the qualifying test (knowledge of economics and a foreign language), you can study from:

Existing notes.
Books, on your own.
A preparation course run by SGH itself (I recommend it).
A few sample tests (past exams are not published at all).

I bought a book (and read about 30 pages of it which, as you might guess, is not even a full chapter), took the weekend course (four months before the exam, as I didn’t know there were two dates, and I took no notes) and, most importantly, used Wikipedia. However, my biggest issue was the lack of sample tests, as they are the best way for me to learn when studying alone.

Solution

At that time I’d just graduated with a bachelor’s degree from PJATK in Warsaw, I had an overblown ego, and I was sure there had to be a better way to prepare! The qualifying exam is run twice a year (once a year since 2016), and even if the questions are not made public by the university itself, students surely share them by word of mouth, among friends or in Facebook groups. We live in an era of wide Internet access: if I encounter a hard question, I will most likely google it right away (OK, maybe not everything), and if I look up an educational topic, I will most likely use Wikipedia at some point… and Wikipedia has a visit counter. Chewie, we’re home!

A small code block later (as everything is usually more interesting than just studying), I was getting all articles related to economics from Polish Wikipedia (for example, articles related to economic policy). A moment later, I was getting views per day for each article! If you dig deeper, you can get hourly counts after playing with the database files a bit. Cool! But what is the purpose?

Views per day for the article about Nash equilibrium. One small inconvenience: exams are not held on the same date each year.
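For the curious, the two steps above (listing the articles in a category, then asking for daily views of each one) can be sketched roughly like this. It is not the script I originally wrote: it assumes the public MediaWiki API and the Wikimedia Pageviews REST API, which only has data from about mid-2015, so older years would still need the raw dump files. The category name, the function names and the User-Agent value are placeholders of mine.

```python
import json
import urllib.parse
import urllib.request
from datetime import date, datetime

WIKI_API = "https://pl.wikipedia.org/w/api.php"
PAGEVIEWS_API = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"
# Wikimedia asks API clients to identify themselves; this value is a placeholder.
HEADERS = {"User-Agent": "exam-trends-sketch/0.1 (contact: you@example.com)"}


def category_members_url(category, limit=500):
    """URL listing the articles that sit directly in one category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmnamespace": 0,  # articles only, no subcategories or files
        "cmlimit": limit,  # continuation for bigger categories is left out
        "format": "json",
    }
    return WIKI_API + "?" + urllib.parse.urlencode(params)


def pageviews_url(article, start, end, project="pl.wikipedia.org"):
    """URL for daily, human-only views of one article between two dates."""
    title = urllib.parse.quote(article.replace(" ", "_"), safe="")
    return "/".join([
        PAGEVIEWS_API, project, "all-access", "user", title, "daily",
        start.strftime("%Y%m%d"), end.strftime("%Y%m%d"),
    ])


def parse_titles(payload):
    """Pull article titles out of a categorymembers response."""
    return [page["title"] for page in payload["query"]["categorymembers"]]


def parse_daily_views(payload):
    """Turn a Pageviews response into {date: views}."""
    return {
        datetime.strptime(item["timestamp"], "%Y%m%d%H").date(): item["views"]
        for item in payload["items"]
    }


def fetch_json(url):
    """Download one URL and decode the JSON body (needs network access)."""
    request = urllib.request.Request(url, headers=HEADERS)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def views_for_category(category, start, end):
    """Daily views for every article in a category: {title: {date: views}}."""
    titles = parse_titles(fetch_json(category_members_url(category)))
    return {
        title: parse_daily_views(fetch_json(pageviews_url(title, start, end)))
        for title in titles
    }
```

Calling something like views_for_category("Kategoria:Polityka gospodarcza", date(2016, 6, 1), date(2016, 7, 31)) would then give one time series per article, assuming that category exists under that name.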

Results

Optimistic version: we know which phrases people study before exams (previous years’ notes, etc.), what they check just after the exam (most likely it was on the test itself), or in the next couple of days (checking correct answers, talking with friends, etc.). If we observe a sudden spike in the popularity of articles that turn out to be on the exam just before it takes place, the exam has most likely been compromised and leaked somewhere online. In my case, I focused on articles that suddenly gained popularity after the exam but not before: potentially unexpected topics.
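The "after but not before" idea can be written down in a few lines. This is only a sketch under my own assumptions: the window of 3 days, the 28-day baseline and the threshold of 2x are arbitrary picks, the function names are made up, and the exam day has to be supplied by hand for each year since the dates move around.

```python
from datetime import date, timedelta
from statistics import mean


def window_mean(views, start, days):
    """Average daily views over `days` consecutive days starting at `start`."""
    return mean(views.get(start + timedelta(days=i), 0) for i in range(days))


def exam_profile(views, exam_day, window=3, baseline_days=28):
    """Return (before, after) view levels relative to a quieter baseline."""
    # Baseline: a stretch that ends right where the pre-exam window starts.
    baseline_start = exam_day - timedelta(days=window + baseline_days)
    baseline = window_mean(views, baseline_start, baseline_days) or 1
    before = window_mean(views, exam_day - timedelta(days=window), window)
    # The exam day itself counts as "after": people google on the way home.
    after = window_mean(views, exam_day, window)
    return before / baseline, after / baseline


def surprise_topic(views, exam_day, threshold=2.0):
    """True when an article spikes after the exam but was quiet before it."""
    before_ratio, after_ratio = exam_profile(views, exam_day)
    return after_ratio >= threshold and before_ratio < threshold
```

An article with ratios like (1.0, 5.0) would be a candidate "unexpected topic", while one that is high on both sides of the exam is more likely something everybody already knew to study.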

Realistic version: as you might expect, Wikipedia is not only for SGH students. For example, in 2014, just before the qualifying test itself, there was a nation-wide knowledge competition about economics. As you can guess, the results in that case were useless, so any such anomaly is very problematic. Also, as you go back in time, the results are less and less significant (due to less widespread internet access, fewer smartphones and smaller FB groups).

Results from 2014, sorted by a home-made measure of deviation from the expected views. The last title is the name of a nation-wide competition.
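One possible way to compute such a ranking is sketched below. To be clear, this is a stand-in and not the exact home-made formula behind the table: it assumes "expected" means the average of a baseline period and measures the deviation in baseline standard deviations. The function names and the input shape are my own.

```python
from statistics import mean, pstdev


def deviation(baseline, observed):
    """How many baseline standard deviations `observed` sits above the mean."""
    expected = mean(baseline)
    spread = pstdev(baseline) or 1.0  # flat baseline: avoid dividing by zero
    return (observed - expected) / spread


def rank_by_deviation(articles):
    """Sort articles from most to least anomalous.

    `articles` maps title -> (baseline daily views, views on the day of interest).
    """
    scores = {
        title: deviation(baseline, observed)
        for title, (baseline, observed) in articles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Dividing by the spread matters: 30 extra views are a big deal for an article that normally gets 10 a day and barely noticeable for one that gets 100 a day with large swings. It does nothing about the competition problem, though, since an outside event produces a perfectly genuine spike.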

So, can I stop learning?

For now… no. The above work can be treated as a curiosity with some other potential uses: detecting leaked exams, or tracking trends in topics over the years for any kind of ‘mass’ exam (the qualifying exam for SGH is taken by 2000+ people). In my case, I don’t recall any topics or articles on my exam that I learned thanks to this way of studying. But mixing passion with learning was a perfect solution for me: I was learning economics while doing interesting stuff! Also, you don’t expect to get the same questions each year, so overall trends might be more useful than single articles from Wikipedia.

Last words

I’m very curious whether, with more and more smartphones (googling questions just after the exam) and Facebook groups for sharing who-remembers-what from the test, this method will become more and more accurate over time. Maybe even to the point that universities will start to ‘mask’ important exams with other tests or end-of-term examinations to generate noise in the analysis? Also, it would be very interesting to see trends over the years in what people learn before exams (like Brand24, but for education), or just to monitor for too many on-target searches in the hours before the test itself to watch for potential leaks. For sure, the future of Big Data is still very interesting as we uncover more and more interesting patterns, and even though I didn’t go for a Big Data specialization, I’m very happy at E-Business.

And now I’m going back to old-school learning for my exams tomorrow. Let me know in the comments below if you have any other possible applications for such a Wikipedia-visit-monitoring strategy!

About me

I’m an SGH student with an E-Business specialization. I passed the qualifying test with a score of 69 out of 100, when 69 was the minimum to pass. I build Android apps as my day-to-day job, such as SGH Daily, Pola, Kanarek or 12Hours. I love startups and am currently involved in Yawn and Icelytics.