0.99 · Tags · show ai / spacy

0.99

49aa9b3d · * Adjust travis.yml, so that we don't test with so much memory · Nov 09, 2015

Improve span merging, internal refactoring

* Merging multi-word tokens into one, via the doc.merge() and span.merge() methods, no longer invalidates existing Span objects. This makes it much easier to merge multiple spans, e.g. to merge all named entities, or all base noun phrases. Thanks to @andreasgrv for help on this patch.
* Lots of internal refactoring, especially around the machine learning module, thinc. The thinc API has now been improved, and the spacy._ml wrapper module is no longer necessary.
* The lemmatizer now lower-cases non-noun, non-verb and non-adjective words.
* A new attribute, .rank, is added to Token and Lexeme objects, giving the frequency rank of the word.
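
To illustrate why it matters that merging no longer invalidates existing spans, here is a minimal, self-contained sketch in plain Python. It does not use spaCy and is not how spaCy implements merging: `ToyDoc`, `ToySpan` and `_remap` are hypothetical names invented for this example. The assumption is simply that a document can re-index the spans it handed out whenever tokens are merged, so that a loop over several spans keeps working.

```python
class ToySpan:
    """Hypothetical span: a half-open token range [start, end) over a ToyDoc."""

    def __init__(self, doc, start, end):
        self.doc, self.start, self.end = doc, start, end

    @property
    def text(self):
        return " ".join(self.doc.words[self.start:self.end])

    def merge(self):
        # Delegate to the document, which also re-indexes every other span.
        self.doc.merge(self.start, self.end)


def _remap(pos, start, end, is_end):
    """Map a token boundary to its position after merging [start, end)."""
    if pos <= start:
        return pos                      # before the merged region: unchanged
    if pos >= end:
        return pos - (end - start - 1)  # after it: shift left by tokens removed
    return start + 1 if is_end else start  # inside it: snap to the merged token


class ToyDoc:
    """Hypothetical document: a list of token strings plus the spans it created."""

    def __init__(self, words):
        self.words = list(words)
        self.spans = []

    def span(self, start, end):
        span = ToySpan(self, start, end)
        self.spans.append(span)  # remember it so merges can keep it valid
        return span

    def merge(self, start, end):
        if end - start < 2:
            return  # nothing to merge
        self.words[start:end] = [" ".join(self.words[start:end])]
        for span in self.spans:
            span.start = _remap(span.start, start, end, is_end=False)
            span.end = _remap(span.end, start, end, is_end=True)


doc = ToyDoc("The New York Times praised San Francisco".split())
entities = [doc.span(1, 4), doc.span(5, 7)]
for ent in entities:
    ent.merge()  # the second span is still valid after the first merge

print(doc.words)
print([ent.text for ent in entities])
```

In this toy model the second span would point at the wrong tokens after the first merge if the document did not re-index it, which is the kind of breakage the change described above avoids.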
