Text variables

DSS has various ways to handle text variables:

  • Count vectorization
  • TF/IDF vectorization
  • Hashing trick (producing sparse matrices)
  • Hashing trick + Truncated SVD (producing smaller dense matrices for algorithms that do not support sparse matrices)
  • Custom
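
To make the first four options concrete, here is a minimal sketch using scikit-learn. It is an illustration of the general techniques, not DSS's own implementation: the sample documents, the 1024 hash buckets, and the 2 SVD components are arbitrary values chosen for this example, and the settings DSS actually applies may differ.

```python
from scipy.sparse import issparse
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import (
    CountVectorizer,
    HashingVectorizer,
    TfidfVectorizer,
)

# Toy corpus, invented for this example.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats make good pets",
]

# Count vectorization: one column per vocabulary term, values are raw counts.
counts = CountVectorizer().fit_transform(docs)

# TF/IDF vectorization: counts reweighted so that terms appearing in many
# documents contribute less.
tfidf = TfidfVectorizer().fit_transform(docs)

# Hashing trick: terms are hashed into a fixed number of columns, so no
# vocabulary has to be stored. The output is a sparse matrix.
hashed = HashingVectorizer(n_features=2**10).transform(docs)

# Hashing trick + Truncated SVD: the sparse hashed matrix is projected onto
# a few components, giving a small dense array.
svd = TruncatedSVD(n_components=2, random_state=0)
dense = svd.fit_transform(hashed)

print("counts:", counts.shape, "sparse:", issparse(counts))
print("tfidf: ", tfidf.shape, "sparse:", issparse(tfidf))
print("hashed:", hashed.shape, "sparse:", issparse(hashed))
print("dense: ", dense.shape, "sparse:", issparse(dense))
```

The first three results are sparse matrices, while the SVD output is an ordinary dense array, which is why that variant suits algorithms that do not support sparse input.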

For the specific case of deep learning, see text features in deep-learning models.