What is SegmentR?

SegmentR is the collective name for the techniques SHARE uses to find latent groups in data. It has two main aspects:

  • Topic modelling
    • Finding latent groups in a corpus of text using Latent Dirichlet Allocation.
    • E.g. What are the topics being discussed in these tweets?
  • Cluster analysis
    • Finding latent groups in any data set using distance metrics.
    • E.g. What are the different types of customer present in our database?
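
As a rough illustration of distance-based clustering, here is a minimal sketch in base R. It does not use SegmentR (which has no clustering functions yet); the toy customer data and column names are made up for the example, and hierarchical clustering is just one of several methods that could be used.

```r
# Toy customer data (hypothetical values, for illustration only).
customers <- data.frame(
  annual_spend     = c(120, 150, 130, 900, 950, 870),
  visits_per_month = c(1, 2, 1, 8, 9, 7)
)

# Scale each column so that no single variable dominates the distances.
scaled <- scale(customers)

# Pairwise Euclidean distances between customers.
distances <- dist(scaled, method = "euclidean")

# Agglomerative hierarchical clustering, then cut the tree into two groups.
tree   <- hclust(distances, method = "average")
groups <- cutree(tree, k = 2)

# Number of customers assigned to each group.
table(groups)
```

The choice of distance metric, linkage method and number of groups all shape the result, so in practice they would be chosen to suit the data rather than fixed as above.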

So far, this R package only contains functions to help with the topic modelling workflow.
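
To give a feel for what topic modelling with Latent Dirichlet Allocation involves, here is a minimal sketch that uses the tm and topicmodels packages directly. This is an assumption made for illustration: it is not SegmentR's own interface, and the four example documents are invented.

```r
library(tm)           # assumed installed; provides VCorpus() and DocumentTermMatrix()
library(topicmodels)  # assumed installed; provides LDA(), terms() and topics()

# A tiny made-up corpus with two obvious themes.
docs <- c(
  "football match goal striker league",
  "league striker scores goal in football match",
  "election vote parliament minister policy",
  "minister announces policy before election vote"
)

# Build a document-term matrix: one row per document, one column per term.
corpus <- VCorpus(VectorSource(docs))
dtm    <- DocumentTermMatrix(corpus)

# Fit an LDA model with two topics; the seed makes the fit reproducible.
lda_fit <- LDA(dtm, k = 2, control = list(seed = 1))

# Top three terms per topic, and the most likely topic for each document.
terms(lda_fit, 3)
topics(lda_fit)
```

With a corpus this small the fitted topics are not meaningful; the point is only the shape of the workflow: text, then a document-term matrix, then a fitted model whose topics can be inspected.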

Installing the package

Install the package using the install_github function from the devtools package. You'll need to supply an auth_token, i.e. a GitHub personal access token that has access to the repository.

devtools::install_github(repo = "Avery-Island/SegmentR",
                         auth_token = "let_me_in_please")