This tokenizer uses stringi::stri_split_boundaries() to tokenize a
character vector. To be used with explain.character().
Usage:

    default_tokenize(text)

Arguments:

    text: text to tokenize, as a character vector
Value:

    A character vector of tokens.

Examples:
data('train_sentences')
default_tokenize(train_sentences$text[1])
#> [1] "although" "the" "internet" "as" "level"
#> [6] "topology" "has" "been" "extensively" "studied"
#> [11] "over" "the" "past" "few" "years"
#> [16] "little" "is" "known" "about" "the"
#> [21] "details" "of" "the" "as" "taxonomy"