python-crfsuite is a Python binding to CRFsuite.
pip install python-crfsuite
python-crfsuite is licensed under the MIT license. The CRFsuite C/C++ library is licensed under the BSD license.
Development happens on GitHub: https://github.com/tpeng/python-crfsuite
- ItemSequence wrapper is added;
- tox tests are fixed.
- Switch to setuptools;
- wheels are uploaded to PyPI for faster installation.
- Exceptions in logging message handlers are now propagated and raised. This makes it possible, for example, to stop training early by pressing Ctrl-C.
- It is now possible to customize pycrfsuite.Trainer logging more easily by overriding the following methods: pycrfsuite.Trainer.on_start(), pycrfsuite.Trainer.on_featgen_progress(), pycrfsuite.Trainer.on_featgen_end(), pycrfsuite.Trainer.on_prepared(), pycrfsuite.Trainer.on_prepare_error(), pycrfsuite.Trainer.on_iteration(), pycrfsuite.Trainer.on_optimization_end(), pycrfsuite.Trainer.on_end(). The feature is implemented by parsing the CRFsuite log. pycrfsuite.BaseTrainer does not parse the log and so does not provide these hooks.
- (backwards-incompatible) training parameters are now passed using params argument of pycrfsuite.Trainer constructor instead of **kwargs;
- (backwards-incompatible) logging support is dropped;
- verbose argument for pycrfsuite.Trainer constructor;
- pycrfsuite.Trainer.get_params() and pycrfsuite.Trainer.set_params() for getting/setting multiple training parameters at once;
- string handling in Python 3.x is fixed by rebuilding the wrapper with Cython 0.21dev;
- algorithm names are normalized to support names used by crfsuite console utility and documented in crfsuite manual;
- type conversion for training parameters is fixed: feature.minfreq now works, and boolean arguments become boolean.
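The Trainer changes listed above (the params argument, the verbose flag, get_params()/set_params(), and the overridable logging methods) can be sketched roughly as follows. This is an illustration, not code from the project: the parameter values, the QuietTrainer class and the build_trainer() helper are hypothetical, and the on_iteration(log, info) signature, with info being a dict parsed from the CRFsuite log, is an assumption worth checking against the installed version. The import is deferred into the function so that the block can be loaded without the library.

```python
# Hypothetical training parameters, chosen only for illustration.
PARAMS = {
    "c1": 1.0,                             # L1 regularization coefficient
    "c2": 1e-3,                            # L2 regularization coefficient
    "max_iterations": 50,                  # stop after this many iterations
    "feature.minfreq": 2,                  # drop features seen fewer times
    "feature.possible_transitions": True,  # boolean parameter
}


def build_trainer(params=PARAMS):
    """Return a trainer configured through the params argument."""
    import pycrfsuite  # deferred: requires python-crfsuite to be installed

    class QuietTrainer(pycrfsuite.Trainer):
        # Assumed hook signature: `log` is the raw log text for this
        # iteration, `info` is a dict parsed from it.
        def on_iteration(self, log, info):
            print("iteration", info.get("num"), "loss", info.get("loss"))

    # Parameters are passed as a dict, not as **kwargs.
    trainer = QuietTrainer(algorithm="lbfgs", params=params, verbose=False)

    # Several parameters can also be changed at once after construction.
    trainer.set_params({"max_iterations": 100})
    return trainer
```

Calling build_trainer().get_params() should return a dict of the current training parameters, which is a convenient way to check how the values were converted.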
python-crfsuite now detects the feature format (dict vs list of strings) automatically; it turns out the performance overhead is negligible.
- Trainer.append_stringslists and Trainer.append_dicts methods are replaced with a single pycrfsuite.Trainer.append() method;
- Tagger.set_stringlists and Tagger.set_dicts methods are removed in favor of pycrfsuite.Tagger.set() method;
- feature_format arguments in pycrfsuite.Tagger methods and constructor are dropped.
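As a rough sketch of what automatic format detection means in practice, the same sequence can be written either as lists of feature strings or as dicts mapping feature names to weights, and both go through the same append() call. The feature names, labels and the train_and_tag() helper below are made up for illustration, and the equivalence of a bare string feature to a weight of 1.0 is an assumption based on CRFsuite's usual behaviour.

```python
# One two-token sequence in the list-of-strings format.
XSEQ_STRINGS = [
    ["bias", "word=hello"],
    ["bias", "word=world", "is_last"],
]
YSEQ = ["GREETING", "NOUN"]  # hypothetical labels, one per token


def as_dict(features):
    """Convert a list-of-strings item to a dict item with weight 1.0."""
    return {name: 1.0 for name in features}


# The same sequence in the dict format.
XSEQ_DICTS = [as_dict(item) for item in XSEQ_STRINGS]


def train_and_tag(model_path):
    """Train on both formats and tag one sequence (needs python-crfsuite)."""
    import pycrfsuite  # deferred: requires python-crfsuite to be installed

    trainer = pycrfsuite.Trainer(verbose=False)
    trainer.append(XSEQ_STRINGS, YSEQ)  # lists of strings
    trainer.append(XSEQ_DICTS, YSEQ)    # dicts, no feature_format argument
    trainer.train(model_path)

    tagger = pycrfsuite.Tagger()
    tagger.open(model_path)
    try:
        return tagger.tag(XSEQ_DICTS)   # one predicted label per token
    finally:
        tagger.close()
```

With the library installed, train_and_tag("model.crfsuite") writes the model to that path and returns the predicted labels for the sequence.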
Many changes; python-crfsuite is almost rewritten.