SegmentR is the collective name for the techniques SAMY Data Science uses to find latent groups in data. It has two main aspects:

  • Topic modelling
    • Finding latent groups in a corpus of text using Latent Dirichlet Allocation.
    • E.g. What are the topics being discussed in these tweets?
  • Cluster analysis
    • Finding latent groups in any data set using distance metrics.
    • E.g. What are the different types of customer present in our database?
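As a rough illustration of the cluster analysis idea, here is a minimal sketch in base R. It is not part of SegmentR: the customer data is simulated, and the column names, the choice of Euclidean distance, average linkage, and two groups are all assumptions made for the example.

```r
# Simulated (hypothetical) customer data: two clearly separated customer types.
set.seed(42)
customers <- rbind(
  cbind(spend = rnorm(20, mean = 10, sd = 1),
        visits = rnorm(20, mean = 2, sd = 0.5)),
  cbind(spend = rnorm(20, mean = 50, sd = 1),
        visits = rnorm(20, mean = 10, sd = 0.5))
)

# Scale the features so each contributes comparably to the distance.
scaled <- scale(customers)

# Compute pairwise Euclidean distances, then cluster hierarchically.
distances <- dist(scaled, method = "euclidean")
tree <- hclust(distances, method = "average")

# Cut the tree into two groups (k = 2 is an assumption for this toy data).
groups <- cutree(tree, k = 2)
table(groups)
```

On real data the number of groups is rarely known in advance, so it is usually worth inspecting the dendrogram with plot(tree) before choosing k.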

So far, this R package contains functions for the topic modelling workflow only; cluster analysis is not yet covered.
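For readers unfamiliar with Latent Dirichlet Allocation, the sketch below shows the general shape of a topic modelling run. It deliberately does not use SegmentR's own functions, which are not documented here. Instead it assumes the tm and topicmodels packages are installed, and the toy documents, the choice of two topics, and the seed are all hypothetical.

```r
library(tm)           # assumption: used here only to build a document-term matrix
library(topicmodels)  # assumption: provides the LDA() fitting function

# Hypothetical toy corpus standing in for a set of tweets.
docs <- c(
  "the football match ended with a late goal",
  "the team scored another goal in the football final",
  "fans cheered as the football team won the match",
  "the new phone has a faster processor and better camera",
  "this laptop processor is faster than the old phone",
  "camera quality on the new laptop and phone is impressive"
)

# Build a document-term matrix, dropping punctuation and common stopwords.
corpus <- VCorpus(VectorSource(docs))
dtm <- DocumentTermMatrix(
  corpus,
  control = list(removePunctuation = TRUE, stopwords = TRUE)
)

# Fit an LDA model with two topics (k = 2 is an assumption for this toy data).
fit <- LDA(dtm, k = 2, control = list(seed = 1))

# Inspect the top terms per topic and the most likely topic per document.
terms(fit, 3)
topics(fit)
```

With so few documents the fitted topics are not guaranteed to be meaningful; the point is only to show the steps of building a document-term matrix, fitting the model, and reading off terms and assignments.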

Installation

Install the package with the install_github() function from the devtools package. You'll need to supply an auth_token, which is a GitHub personal access token that can read the repository.

# "let_me_in_please" is a placeholder: substitute your own GitHub personal access token.
devtools::install_github(repo = "Avery-Island/SegmentR",
                         auth_token = "let_me_in_please")

SAMY Data Science