                 Changes in version 0.5.0 (2020-12-05)                  

API and UI changes

  - The former kgram_freqs class is now called sbo_kgram_freqs. The
    constructor kgram_freqs() is still available as an alias for
    sbo_kgram_freqs().

  - The former sbo_preds class is now replaced by two classes:

      - `sbo_predictor`: for interactive use
      - `sbo_predtable`: for storing text predictors out of memory (e.g.
        with `save()` to a file)

  - sbo_predictor and sbo_predtable objects are obtained by the
    homonymous constructors, which are now S3 generics accepting
    character input, as well as sbo_kgram_freqs and sbo_predtable (for
    the sbo_predictor() constructor) class objects. In particular, these
    allow training a text predictor directly, without storing the
    intermediate sbo_dictionary and kgram_freqs objects.

  - The behaviour of the dict argument in kgram_freqs() and
    kgram_freqs_fast() has changed: it now accepts an sbo_dictionary,
    a character or a formula (see also 'New features').

  - The sbo_predictor implementation dramatically improves the speed of
    predict() (by a factor of 10). A single call to predict() now
    allocates a few kB of RAM, whereas it previously allocated a few MB
    (cf. issue #10).

  - Metadata of sbo_kgram_freqs and sbo_pred* objects is now stored via
    attributes (#11).

New features

  - New S3 class sbo_dictionary.
  - New S3 class word_coverage with generic constructors and a
    preconfigured plot() method.
  - Dictionaries in kgram_freqs() and sbo_pred*() can now also be built
    to cover a fixed target fraction of the training corpus.
  - Added prune() generic function for reducing the k-gram order of
    kgram_freqs and sbo_predtable objects.
  - Added summary() methods for sbo_kgram_freqs and sbo_pred* objects;
    correspondingly, the output of print() has been simplified
    considerably (#5).
  - Objects of class sbo_kgram_freqs, sbo_dictionary, sbo_predictor
    and sbo_predtable can be constructed either through the homonymous
    constructors, or through the aliases kgram_freqs(), dictionary(),
    predictor(), predtable().
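
The coverage-based dictionaries mentioned in the list above can be
illustrated with a short, language-independent sketch. This is not the
package's implementation: the function name coverage_dictionary(), its
arguments and the alphabetical tie-breaking are assumptions made for
illustration only. The idea is to keep the most frequent words until
their combined counts reach the target fraction of all tokens.

```python
from collections import Counter


def coverage_dictionary(sentences, target):
    """Return the most frequent words whose combined counts cover at
    least `target` (a fraction in (0, 1]) of all tokens in the corpus.

    Hypothetical helper, for illustration only.
    """
    if not 0 < target <= 1:
        raise ValueError("target must be in (0, 1]")
    counts = Counter(word for s in sentences for word in s.split())
    total = sum(counts.values())
    # Rank by decreasing count; ties are broken alphabetically so that
    # the result is deterministic (an assumption of this sketch).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    dictionary, covered = [], 0
    for word, n in ranked:
        if covered >= target * total:
            break
        dictionary.append(word)
        covered += n
    return dictionary


corpus = ["the cat sat", "the dog sat", "the end"]
words = coverage_dictionary(corpus, target=0.6)
print(words)
```

With this toy corpus of 8 tokens, "the" (3 occurrences) and "sat" (2)
together cover 5/8 of the tokens, which is the first point at which the
60% target is reached.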

Other improvements and patches

  - sbo now has SystemRequirements: C++11, for correct integration with
    C++11 code (in particular std::unordered_map).

  - Model training (with sbo_predictor()) is now considerably faster,
    due to optimizations in the algorithm for building Stupid Back-Off
    prediction tables.

  - The Stupid Back-Off algorithm is now thoroughly tested, and small
    inconsistencies between the predict.kgram_freqs() and
    predict.sbo_predictor() methods have been fixed, including:
    
      - Proper handling of unknown words.
      - Consistent handling of ties in prediction probabilities.

  - Model evaluation in eval_sbo_predictor() is now carried out by
    sampling a single sentence from each document in the test corpus.

  - Removed unnecessary dependencies from Depends and Imports package
    fields.
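
To make the two Stupid Back-Off fixes listed above concrete, here is a
minimal sketch of the scoring scheme. It is an illustration under
stated assumptions, not the code used by sbo: the names train(),
score() and predict(), the "<UNK>" placeholder, the back-off penalty of
0.4 and the alphabetical tie-breaking rule are all assumptions of this
sketch. Unknown words in the input are mapped to a single placeholder
token, and equal scores are ordered deterministically.

```python
from collections import Counter

UNK = "<UNK>"  # hypothetical placeholder for out-of-dictionary words


def train(sentences, n, dictionary):
    """Count all k-grams (k = 1..n), mapping unknown words to UNK."""
    counts = Counter()
    for s in sentences:
        words = [w if w in dictionary else UNK for w in s.split()]
        for k in range(1, n + 1):
            for i in range(len(words) - k + 1):
                counts[tuple(words[i:i + k])] += 1
    return counts


def score(counts, context, word, total, lam=0.4):
    """Stupid Back-Off score of `word` following the tuple `context`."""
    if not context:
        return counts[(word,)] / total  # unigram relative frequency
    seen = counts[context + (word,)]
    if seen > 0:
        return seen / counts[context]
    # Unseen k-gram: drop the oldest context word and apply the penalty.
    return lam * score(counts, context[1:], word, total, lam)


def predict(counts, dictionary, text, n, top=3, lam=0.4):
    """Return the `top` highest-scoring dictionary words after `text`."""
    words = [w if w in dictionary else UNK for w in text.split()]
    context = tuple(words[-(n - 1):]) if n > 1 else ()
    total = sum(c for gram, c in counts.items() if len(gram) == 1)
    scored = [(score(counts, context, w, total, lam), w) for w in dictionary]
    # Ties in score are broken alphabetically (assumed rule).
    scored.sort(key=lambda sw: (-sw[0], sw[1]))
    return [w for _, w in scored[:top]]


dictionary = {"the", "cat", "dog", "sat", "ran"}
counts = train(["the cat sat", "the cat ran", "the dog sat"], 2, dictionary)
known = predict(counts, dictionary, "the cat", n=2)
unknown = predict(counts, dictionary, "the zebra", n=2)
print(known, unknown)
```

In the first call, "ran" and "sat" both follow "cat" once and are tied,
so the assumed alphabetical rule decides their order. In the second
call, "zebra" is not in the dictionary, so no bigram matches and the
scores fall back to penalized unigram frequencies.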

                 Changes in version 0.3.2 (2020-11-09)                  

  - Patch addressing unexpected behaviour of the erase argument in
    preprocess() and kgram_freqs_fast(), cf. issue #17.

                        Changes in version 0.3.1                        

  - Changed leading underscores to trailing underscores in the names of
    private member variables of the C++ kgramFreqs class, as per §1.6.4
    of the "Writing R Extensions" guide.
  - Removed Catch tests infrastructure for C++ code.

                 Changes in version 0.3.0 (2020-11-04)                  

  - Added kgram_freqs_fast() for fast and memory-efficient k-gram
    tokenization using the default text preprocessing utility.

                        Changes in version 0.2.0                        

  - The infrastructure of kgram_freqs(), get_word_freqs(), preprocess(),
    and predict.sbo_preds() has been entirely rewritten in C++.
  - Added tokenize_sentences() function for sentence-level tokenization.
  - kgram_freqs() now accepts any user-defined single-character EOS
    token, through the EOS argument.

                        Changes in version 0.1.2                        

  - Added preproc argument to kgram_freqs() and get_word_freqs(), for
    custom training corpus preprocessing.
  - The dict argument of kgram_freqs() now also accepts numeric values,
    which allows building a dictionary directly from the training
    corpus.

                        Changes in version 0.1.1                        

  - Added predict method for sbo_kgram_freqs class.