Pay attention to case (should all letters be converted to lower case?) and to lemmatization. Lemmatization is often done with morphological parsing; as a simpler alternative, you can try stemming.
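As a rough illustration of the two steps, here is a minimal Python sketch that lower-cases the text and then applies a deliberately naive suffix-stripping stemmer. The function names (`naive_stem`, `normalize`) and the suffix list are made up for this example; this is not the Porter algorithm or any library's implementation, and a real project would more likely use an established stemmer or lemmatizer.

```python
import re

# Hypothetical, deliberately tiny suffix list for illustration only;
# real stemmers (e.g. Porter) apply many more rules and conditions.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text):
    """Lower-case the text, split it into alphabetic tokens, and stem each token."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(token) for token in tokens]


if __name__ == "__main__":
    print(normalize("The Cats jumped over the boxes"))
```

Stemming of this kind is crude: for example, it turns "running" into "runn" rather than "run", which is one reason a proper lemmatizer based on morphological analysis is often preferred when accuracy matters.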