Package: sbo 0.5.0

sbo: Text Prediction via Stupid Back-Off N-Gram Models
Utilities for training and evaluating text predictors based on Stupid Back-Off N-gram models (Brants et al., 2007, <https://www.aclweb.org/anthology/D07-1090/>).
Authors: Valerio Gherardi
sbo_0.5.0.tar.gz
sbo_0.5.0.zip (r-4.7) | sbo_0.5.0.zip (r-4.6) | sbo_0.5.0.zip (r-4.5)
sbo_0.5.0.tgz (r-4.6-x86_64) | sbo_0.5.0.tgz (r-4.6-arm64) | sbo_0.5.0.tgz (r-4.5-x86_64) | sbo_0.5.0.tgz (r-4.5-arm64)
sbo_0.5.0.tar.gz (r-4.7-arm64) | sbo_0.5.0.tar.gz (r-4.7-x86_64) | sbo_0.5.0.tar.gz (r-4.6-arm64) | sbo_0.5.0.tar.gz (r-4.6-x86_64)
sbo_0.5.0.tgz (r-4.6-emscripten)
manual.pdf | manual.html
card.svg | card.png
sbo/json (API)
NEWS
| # Install 'sbo' in R: |
| install.packages('sbo', repos = c('https://vgherard.r-universe.dev', 'https://cloud.r-project.org')) |
Bug tracker: https://github.com/vgherard/sbo/issues
Pkgdown/docs site: https://vgherard.github.io
- twitter_dict - Top 1000 dictionary from Twitter training set
- twitter_freqs - K-gram frequencies from Twitter training set
- twitter_predtable - Next-word prediction tables from 3-gram model trained on Twitter training set
- twitter_test - Twitter test set
- twitter_train - Twitter training set
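The bundled datasets make it possible to try next-word prediction without training a model first. The snippet below is a minimal sketch, assuming that `sbo_predictor()` accepts the bundled `twitter_predtable` object and that `predict()` takes a character input, as the help topics `sbo_predictor.sbo_predtable` and `predict.sbo_predictor` suggest. The input string `"i love"` is an arbitrary illustration, and the `requireNamespace()` guard is only there so the snippet does nothing when the package is not installed.

```r
# Minimal sketch: next-word prediction from the bundled prediction table.
# Assumption: sbo_predictor() can be built directly from twitter_predtable.
if (requireNamespace("sbo", quietly = TRUE)) {
  library(sbo)

  # Build an in-memory predictor from the precomputed 3-gram prediction table
  p <- sbo_predictor(twitter_predtable)

  # Ask for the top next-word candidates after an arbitrary input
  preds <- predict(p, "i love")
  print(preds)
}
```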
natural-language-processing, ngram-models, predictive-text, sbo, cpp
Last updated from: 75374a5bf5 (on v0.5.0). Checks: 12 ERROR, 1 OK. Indexed: yes.
| Target | Result | Time |
|---|---|---|
| linux-devel-arm64 | ERROR | 206 |
| linux-devel-x86_64 | ERROR | 203 |
| source / vignettes | ERROR | 258 |
| linux-release-arm64 | ERROR | 184 |
| linux-release-x86_64 | ERROR | 178 |
| macos-release-arm64 | ERROR | 117 |
| macos-release-x86_64 | ERROR | 289 |
| macos-oldrel-arm64 | ERROR | 113 |
| macos-oldrel-x86_64 | ERROR | 215 |
| windows-devel | ERROR | 205 |
| windows-release | ERROR | 172 |
| windows-oldrel | ERROR | 144 |
| wasm-release | OK | 133 |
Exports: as_sbo_dictionary, babble, dictionary, eval_sbo_predictor, kgram_freqs, kgram_freqs_fast, predictor, predtable, preprocess, prune, sbo_dictionary, sbo_kgram_freqs, sbo_kgram_freqs_fast, sbo_predictor, sbo_predtable, tokenize_sentences, word_coverage
Dependencies: brio, callr, cli, cpp11, crayon, desc, diffobj, dplyr, evaluate, fs, generics, glue, jsonlite, lifecycle, magrittr, pillar, pkgbuild, pkgconfig, pkgload, praise, processx, ps, purrr, R6, Rcpp, rlang, rprojroot, stringi, stringr, testthat, tibble, tidyr, tidyselect, utf8, vctrs, waldo, withr
Readme and manuals
Help Manual
| Help page | Topics |
|---|---|
| Coerce to dictionary | as_sbo_dictionary as_sbo_dictionary.character |
| Babble! | babble |
| Evaluate Stupid Back-off next-word predictions | eval_sbo_predictor |
| k-gram frequency tables | kgram_freqs kgram_freqs_fast sbo_kgram_freqs sbo_kgram_freqs_fast |
| Plot method for word_coverage objects | plot.word_coverage |
| Predict method for k-gram frequency tables | predict.sbo_kgram_freqs |
| Predict method for Stupid Back-off text predictor | predict.sbo_predictor |
| Preprocess text corpus | preprocess |
| Prune k-gram objects | prune prune.sbo_kgram_freqs prune.sbo_predtable |
| Dictionaries | dictionary sbo_dictionary |
| Stupid Back-off text predictions | predictor predtable sbo_predictions sbo_predictor sbo_predictor.character sbo_predictor.sbo_kgram_freqs sbo_predictor.sbo_predtable sbo_predtable sbo_predtable.character sbo_predtable.sbo_kgram_freqs |
| Sentence tokenizer | tokenize_sentences |
| Top 1000 dictionary from Twitter training set | twitter_dict |
| k-gram frequencies from Twitter training set | twitter_freqs |
| Next-word prediction tables from 3-gram model trained on Twitter training set | twitter_predtable |
| Twitter test set | twitter_test |
| Twitter training set | twitter_train |
| Word coverage fraction | word_coverage word_coverage.character word_coverage.sbo_dictionary word_coverage.sbo_kgram_freqs word_coverage.sbo_predictions |
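
For readers unfamiliar with the scoring rule behind the "Stupid Back-off text predictions" pages, the following base-R sketch illustrates the general idea from Brants et al. (2007): score a word by its relative k-gram frequency when the full k-gram was observed, otherwise drop the oldest context word and multiply the shorter-context score by a penalty `lambda` (0.4 in the paper). This is not the package's implementation. The helper names `count_kgrams`, `get_count` and `sbo_score` are hypothetical, and the six-word corpus is a toy example.

```r
# Count all k-grams (k = 1..N) of a token vector; keys are space-joined words.
count_kgrams <- function(tokens, N) {
  keys <- character(0)
  for (k in seq_len(min(N, length(tokens)))) {
    starts <- seq_len(length(tokens) - k + 1)
    keys <- c(keys, vapply(
      starts,
      function(i) paste(tokens[i:(i + k - 1)], collapse = " "),
      character(1)
    ))
  }
  tab <- table(keys)
  setNames(as.integer(tab), names(tab))
}

# Look up a count, returning 0 for unseen k-grams.
get_count <- function(counts, key) {
  if (key %in% names(counts)) counts[[key]] else 0
}

# Stupid Back-Off score of `word` given a character vector `context`.
sbo_score <- function(word, context, counts, total, lambda = 0.4) {
  if (length(context) == 0) {
    # Base case: unigram relative frequency
    return(get_count(counts, word) / total)
  }
  ctx_key <- paste(context, collapse = " ")
  full <- get_count(counts, paste(ctx_key, word))
  if (full > 0) {
    return(full / get_count(counts, ctx_key))
  }
  # Back off: drop the oldest context word and apply the penalty
  lambda * sbo_score(word, context[-1], counts, total, lambda)
}

tokens <- c("the", "cat", "sat", "on", "the", "mat")
counts <- count_kgrams(tokens, N = 3)
total <- length(tokens)

sbo_score("mat", c("on", "the"), counts, total)  # trigram "on the mat" was seen
sbo_score("cat", c("on", "the"), counts, total)  # backs off to bigram "the cat"
```

Because back-off scores are not renormalised, they are not probabilities; they are only meant for ranking candidate next words.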