Package: sbo 0.5.0

Valerio Gherardi

sbo: Text Prediction via Stupid Back-Off N-Gram Models

Utilities for training and evaluating text predictors based on Stupid Back-Off N-gram models (Brants et al., 2007, <https://www.aclweb.org/anthology/D07-1090/>).
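
The scoring rule behind Stupid Back-Off can be sketched in a few lines of base R. This is only an illustration of the rule described by Brants et al. (2007), not the package's own implementation: the toy corpus and the helper names `kgram_counts`, `get_count` and `sbo_score` are hypothetical, and the back-off penalty of 0.4 is the value Brants et al. used in their experiments. The sketch assumes the context holds at most N - 1 tokens.

```r
# Toy corpus; splitting on single spaces is an illustrative simplification.
tokens <- strsplit("the cat sat on the mat the cat ran", " ")[[1]]

# Count all k-grams of order k, keyed by space-joined strings.
kgram_counts <- function(tokens, k) {
  n <- max(length(tokens) - k + 1, 0)
  grams <- vapply(seq_len(n),
                  function(i) paste(tokens[i:(i + k - 1)], collapse = " "),
                  character(1))
  table(grams)
}

N <- 3  # maximum k-gram order (hypothetical choice for this sketch)
counts <- lapply(seq_len(N), function(k) kgram_counts(tokens, k))

# Look up the count of a k-gram, returning 0 if it was never observed.
get_count <- function(gram_tokens) {
  tab <- counts[[length(gram_tokens)]]
  key <- paste(gram_tokens, collapse = " ")
  if (key %in% names(tab)) as.numeric(tab[key]) else 0
}

# Stupid Back-Off score of `word` given `context` (character vector,
# at most N - 1 tokens). Scores are not normalized probabilities.
sbo_score <- function(word, context, lambda = 0.4) {
  if (length(context) == 0) {
    # Base case: unigram relative frequency.
    return(get_count(word) / length(tokens))
  }
  num <- get_count(c(context, word))
  if (num > 0) {
    # The full k-gram was observed: use its relative frequency.
    return(num / get_count(context))
  }
  # Otherwise drop the oldest context word and apply the penalty.
  lambda * sbo_score(word, context[-1], lambda)
}

sbo_score("sat", c("the", "cat"))  # observed trigram: 1 / 2 = 0.5
sbo_score("mat", c("sat", "the"))  # backs off once: 0.4 * (1 / 3)
```

Because the penalty is a fixed constant rather than a context-dependent discount, the scores for a given context do not sum to one; they are only meant for ranking candidate next words.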

Authors: Valerio Gherardi

sbo_0.5.0.tar.gz
sbo_0.5.0.zip (r-4.5), sbo_0.5.0.zip (r-4.4), sbo_0.5.0.zip (r-4.3)
sbo_0.5.0.tgz (r-4.4-arm64), sbo_0.5.0.tgz (r-4.4-x86_64), sbo_0.5.0.tgz (r-4.3-arm64), sbo_0.5.0.tgz (r-4.3-x86_64)
sbo_0.5.0.tar.gz (r-4.5-noble), sbo_0.5.0.tar.gz (r-4.4-noble)
sbo_0.5.0.tgz (r-4.4-emscripten), sbo_0.5.0.tgz (r-4.3-emscripten)
sbo.pdf | sbo.html
sbo/json (API)
NEWS

# Install 'sbo' in R:
install.packages('sbo', repos = c('https://vgherard.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/vgherard/sbo/issues

Uses libs:
  • c++ – GNU Standard C++ Library v3

natural-language-processing, ngram-models, predictive-text, sbo

17 exports, 10 stars, 1.39 score, 40 dependencies, 126 downloads

Last updated 4 years ago from: 75374a5bf5 (on v0.5.0)

Exports: as_sbo_dictionary, babble, dictionary, eval_sbo_predictor, kgram_freqs, kgram_freqs_fast, predictor, predtable, preprocess, prune, sbo_dictionary, sbo_kgram_freqs, sbo_kgram_freqs_fast, sbo_predictor, sbo_predtable, tokenize_sentences, word_coverage

Dependencies: brio, callr, cli, cpp11, crayon, desc, diffobj, digest, dplyr, evaluate, fansi, fs, generics, glue, jsonlite, lifecycle, magrittr, pillar, pkgbuild, pkgconfig, pkgload, praise, processx, ps, purrr, R6, Rcpp, rematch2, rlang, rprojroot, stringi, stringr, testthat, tibble, tidyr, tidyselect, utf8, vctrs, waldo, withr

Text prediction via N-gram Stupid Back-off models

Rendered from sbo.Rmd using knitr::rmarkdown on Jul 10 2024.

Last update: 2020-12-05
Started: 2020-08-01

Readme and manuals

Help Manual

Help page | Topics
Coerce to dictionary | as_sbo_dictionary, as_sbo_dictionary.character
Babble! | babble
Evaluate Stupid Back-off next-word predictions | eval_sbo_predictor
k-gram frequency tables | kgram_freqs, kgram_freqs_fast, sbo_kgram_freqs, sbo_kgram_freqs_fast
Plot method for word_coverage objects | plot.word_coverage
Predict method for k-gram frequency tables | predict.sbo_kgram_freqs
Predict method for Stupid Back-off text predictor | predict.sbo_predictor
Preprocess text corpus | preprocess
Prune k-gram objects | prune, prune.sbo_kgram_freqs, prune.sbo_predtable
Dictionaries | dictionary, sbo_dictionary
Stupid Back-off text predictions | predictor, predtable, sbo_predictions, sbo_predictor, sbo_predictor.character, sbo_predictor.sbo_kgram_freqs, sbo_predictor.sbo_predtable, sbo_predtable, sbo_predtable.character, sbo_predtable.sbo_kgram_freqs
Sentence tokenizer | tokenize_sentences
Top 1000 dictionary from Twitter training set | twitter_dict
k-gram frequencies from Twitter training set | twitter_freqs
Next-word prediction tables from 3-gram model trained on Twitter training set | twitter_predtable
Twitter test set | twitter_test
Twitter training set | twitter_train
Word coverage fraction | word_coverage, word_coverage.character, word_coverage.sbo_dictionary, word_coverage.sbo_kgram_freqs, word_coverage.sbo_predictions