Aneesha Bakharia
1 min read · Sep 21, 2017


Great that you have LDA up and running. In my experience NMF works better on smaller datasets, while LDA scales better to larger ones. To get phrases included, there are three things you can try: 1. Instead of a bag-of-words matrix, build the matrix from bigrams and trigrams. 2. Identify phrases first and include them as single terms in the bag-of-words matrix; see my RAKE algorithm implementation (https://github.com/aneesha/RAKE). 3. Try this add-on to LDA, which labels topics using phrases (https://github.com/xiaohan2012/chowmein). I will be publishing a blog post soon on labeling topics derived from topic models.
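To make the first suggestion concrete, here is a minimal sketch of a document-term count matrix whose columns include bigrams and trigrams as well as single words. It uses only the Python standard library, and the helper names `ngram_tokens` and `build_count_matrix` and the two sample documents are made up for illustration. In practice you would more likely pass `ngram_range=(1, 3)` to scikit-learn's `CountVectorizer`, which does the same job with proper tokenization.

```python
from collections import Counter


def ngram_tokens(text, n_values=(1, 2, 3)):
    """Return the n-grams of a text as space-joined strings (hypothetical helper)."""
    # Naive tokenization: lowercase and split on whitespace.
    words = text.lower().split()
    tokens = []
    for n in n_values:
        for i in range(len(words) - n + 1):
            tokens.append(" ".join(words[i:i + n]))
    return tokens


def build_count_matrix(docs, n_values=(1, 2, 3)):
    """Build a document-term count matrix whose terms include n-grams."""
    counts = [Counter(ngram_tokens(doc, n_values)) for doc in docs]
    # The vocabulary is every distinct n-gram seen in any document.
    vocab = sorted(set().union(*counts))
    # A Counter returns 0 for terms a document does not contain.
    matrix = [[doc_counts[term] for term in vocab] for doc_counts in counts]
    return vocab, matrix


# Illustrative documents, not from any real dataset.
docs = [
    "topic models find latent topics",
    "topic models label latent topics with phrases",
]
vocab, matrix = build_count_matrix(docs)
```

Each row of `matrix` is one document and each column one term from `vocab`, so phrases such as "topic models" get their own column and can show up among a topic's top terms once the matrix is passed to NMF or LDA.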


Written by Aneesha Bakharia

Data Science, Topic Modelling, Deep Learning, Algorithm Usability and Interpretation, Learning Analytics, Electronics — Brisbane, Australia
