Intertopic Distance Map & Top-30 Most Salient Terms


Open the Graph

The Intertopic Distance Map represents each topic as a circle whose area reflects the topic’s prevalence in the document. For the digital humanities items, the size of a circle represents the share of items in the whole list that are assigned to that topic. The position of each circle is determined by computing the similarity between the word probability distributions found by the Latent Dirichlet Allocation (LDA) algorithm and projecting these inter-topic distances onto a two-dimensional space.
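As a rough illustration of the distance step, the sketch below computes pairwise distances between topic-word distributions. It assumes the Jensen-Shannon distance as the similarity measure, which is a common choice for this kind of map but is not confirmed for this graph. The topic distributions and all function names are hypothetical, and the final projection onto two dimensions (for example with multidimensional scaling) is not shown.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_distance(p, q):
    """Jensen-Shannon distance: the square root of the Jensen-Shannon divergence.

    With base-2 logarithms the result lies between 0 (identical) and 1.
    """
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(divergence, 0.0))

def distance_matrix(topic_term):
    """Pairwise distances between the rows of a topic-by-term probability table."""
    n = len(topic_term)
    return [[js_distance(topic_term[i], topic_term[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical topic-word distributions over a four-word vocabulary (assumed data).
topics = [
    [0.70, 0.20, 0.05, 0.05],  # topic 0
    [0.60, 0.30, 0.05, 0.05],  # topic 1, similar to topic 0
    [0.05, 0.05, 0.20, 0.70],  # topic 2, very different from both
]

distances = distance_matrix(topics)
for row in distances:
    print(["%.3f" % d for d in row])
```

Topics with similar word distributions get a small distance and would be drawn close together on the map, while topics that share few high-probability words end up far apart.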

The Top-30 Most Salient Terms bar chart shows the 30 most relevant terms for each topic, along with the estimated frequency of each term both in the whole document and within the given topic. Hover over a word in the bar chart to see which topics include that word in their top-30 lists. While you hover, the circles of topics that include the word remain visible in the Intertopic Distance Map, and the circles of topics that do not include it disappear.
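The two rankings behind such a chart can be sketched as follows. This assumes the chart uses the saliency measure of Chuang et al. (2012) for the overall list and the relevance measure of Sievert and Shirley (2014) for the per-topic lists, which this page does not confirm. The vocabulary, probabilities, function names, and the weight `lam` are all hypothetical.

```python
import math

def term_marginals(topic_term, topic_weights):
    """Overall term probability: P(w) = sum over topics of P(t) * P(w | t)."""
    vocab_size = len(topic_term[0])
    return [sum(pt * row[w] for pt, row in zip(topic_weights, topic_term))
            for w in range(vocab_size)]

def saliency(topic_term, topic_weights):
    """Saliency(w) = P(w) * sum_t P(t | w) * log(P(t | w) / P(t))."""
    scores = []
    for w, pw in enumerate(term_marginals(topic_term, topic_weights)):
        distinctiveness = 0.0
        for pt, row in zip(topic_weights, topic_term):
            p_t_given_w = pt * row[w] / pw  # Bayes' rule
            if p_t_given_w > 0:
                distinctiveness += p_t_given_w * math.log(p_t_given_w / pt)
        scores.append(pw * distinctiveness)
    return scores

def relevance(topic_term, topic_weights, topic, lam=0.5):
    """Relevance(w, t) = lam * log P(w | t) + (1 - lam) * log(P(w | t) / P(w)).

    Assumes every P(w | t) is strictly positive; lam = 0.5 is an arbitrary choice.
    """
    p_w = term_marginals(topic_term, topic_weights)
    return [lam * math.log(phi) + (1 - lam) * math.log(phi / pw)
            for phi, pw in zip(topic_term[topic], p_w)]

# Hypothetical data: two equally prevalent topics over a four-word vocabulary.
vocab = ["archive", "text", "network", "the"]
topic_term = [
    [0.50, 0.30, 0.05, 0.15],  # P(w | topic 0)
    [0.05, 0.10, 0.70, 0.15],  # P(w | topic 1)
]
topic_weights = [0.5, 0.5]     # P(t)

salient = saliency(topic_term, topic_weights)
relevant = relevance(topic_term, topic_weights, topic=0)
ranking = sorted(range(len(vocab)), key=lambda w: relevant[w], reverse=True)
print([vocab[w] for w in ranking])
```

A word that is equally likely under every topic, like "the" in this toy data, carries no information about which topic generated it and so receives a saliency of zero, however frequent it is.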

The Intertopic Distance Map shows that most topics are clearly separated from one another; some circles overlap, but not many given the number of topics.