I use Telegram a lot for chatting. Since it is possible to export the chat history of every conversation, I wanted to create a word cloud featuring the most common words exchanged with my chat partners.
I use several tools for this. telegram-history-dump, which itself uses telegram-cli, exports the chat history. Then I use Matlab for some postprocessing (or preprocessing, depending on how you view it) to bring the dumped output into a form that is manageable for me. Lastly, I use R to generate a word cloud like the one at the top of this post.
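The counting step behind such a word cloud can be sketched roughly as follows. This is a minimal Python illustration, not the R or Matlab code from the post; the function name `word_frequencies`, the stop-word list, and the sample messages are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop words; a real analysis would use a fuller,
# language-specific list.
STOP_WORDS = {"the", "a", "is", "and", "to", "i", "at"}

def word_frequencies(messages, stop_words=STOP_WORDS):
    """Count how often each word occurs across a list of chat messages."""
    counts = Counter()
    for message in messages:
        # Lowercase, then extract runs of word characters (Unicode-aware).
        words = re.findall(r"\w+", message.lower())
        counts.update(w for w in words if w not in stop_words)
    return counts

# Made-up messages standing in for an exported chat history.
sample = ["See you at the station", "The station is closed", "See you tomorrow"]
freq = word_frequencies(sample)
print(freq.most_common(3))
```

The resulting word-to-count mapping is the kind of input a word cloud renderer scales its font sizes by.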
Continue reading “Analyzing Telegram chats with R and Matlab”
Actually, it is just every finisher of the Tour de France between 1903 and 2017. Gathering all the data was quite a pain and had to be done manually to some extent, so it is entirely possible that some errors were made. I will give a detailed description of how the data set was generated. All scripts used and the final data can be found on GitHub. The picture above shows that the Tour is getting shorter and faster.
Continue reading “Every cyclist of the Tour de France in a single CSV file”
Getting the data – again
Similar to this post, I gathered my data again. This time, however, I bulk-exported everything from Polar instead of Garmin (there is an app called SyncMyTracks that synchronizes different services).
Continue reading “Running distance and pace distribution with R”
I struggled to find a compact data set containing the outcome of every soccer game ever played in the Bundesliga, so I created it myself; it is available here on GitHub. The CSV file can also be downloaded directly here: fullBundesligaMatchHistory.
Continue reading “Gathering every Bundesliga game ever played in a single CSV file”