
LDA, Probabilistic Topic Modelling and RMarkdown

This week, I started off by learning about RMarkdown, which lets you combine code and text in a single document. Pretty cool! You can analyse data in R or Python and then use the results as variables in your text. The document can then be compiled to various formats, including PDF, DOCX, and HTML. This article, too, was written in RMarkdown, which made it easy to upload. Find the PDF version here:

Probabilistic topic modelling is a method for categorising a given set of documents, that is, a collection of texts. Such a set could be a collection of news articles, each of which would then be assigned a topic. A topic is a group of words that tend to appear together in the same documents. An example of a topic could be “parliament”, “government”, “spending”, which together form a politics topic.
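
To make the idea above a bit more concrete, here is a minimal sketch of LDA in plain Python, fitted with a toy collapsed Gibbs sampler. It is an illustration under stated assumptions, not a production implementation: the function name `fit_lda`, the hyperparameter values `alpha` and `beta`, the number of iterations, and the four tiny example documents are all made up for this example. Note that LDA does not hard-assign one topic per document. It estimates a mixture of topics for each document, and the dominant topic can then be read off as the document's category.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA (hypothetical helper, illustration only).

    docs is a list of token lists. Returns per-document topic proportions
    and the most frequent words assigned to each topic.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]         # words in doc d assigned to topic k
    topic_word = [Counter() for _ in range(num_topics)]  # times word w is assigned to topic k
    topic_total = [0] * num_topics                       # total words assigned to topic k

    # Start from a random topic assignment for every word occurrence.
    assignments = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment of this word from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Weight of each topic: how much the document already uses it,
                # times how much the topic already uses this word.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed topic proportions per document; each row sums to 1.
    doc_mix = [
        [(n + alpha) / (len(doc) + num_topics * alpha) for n in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    # Up to three most frequent words per topic, ignoring words with zero count.
    top_words = [
        [w for w, c in counts.most_common() if c > 0][:3] for counts in topic_word
    ]
    return doc_mix, top_words


# Made-up mini corpus: two politics-like and two sports-like "articles".
docs = [
    "parliament government spending government budget".split(),
    "government parliament vote spending".split(),
    "football match goal team".split(),
    "team goal match season football".split(),
]

doc_mix, top_words = fit_lda(docs)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
for d, mix in enumerate(doc_mix):
    print(f"document {d}: {[round(p, 2) for p in mix]}")
```

The sampler is random, and on a corpus this small the topics it finds are not guaranteed to split cleanly into politics and sports. In practice you would use an established library rather than hand-rolling the sampler.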