Details
Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 1.6.1
Fix Version/s: 1.7
Component/s: i18n - Translation, o.c.common.util, o.c.jsword.index
Labels: None
Environment: All
Description
After applying the fix for JS-192 (iso639full.properties was always used and iso639.properties was always ignored), I now find that JS-189 (SnowballAnalyzer configured for unavailable stemmer Spanish (Español)) is occurring again.
Reason
The reason appears to be that iso639full.properties contains
es=Spanish
But iso639_en.properties contains
es=Spanish (Espa\u00F1ol)
Also iso639.properties contains
es=Espa\u00F1ol
(There are many other differences as well, e.g. French, German, ...)
ConfigurableSnowballAnalyzer contains a list of allowed stemmers:
private static Pattern allowedStemmers = Pattern.compile("(Danish|Dutch|English|Finnish|French|German2|German|Italian|Kp|Lovins|Norwegian|Porter|Portuguese|Russian|Spanish|Swedish)");
This list only matches the language names in iso639full.properties and not those in any other iso* file, so a name such as "Spanish (Español)" or "Español" is rejected.
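The mismatch can be reproduced in isolation with a minimal sketch. The pattern is copied from the line quoted above. The class name StemmerNameCheck and the method isAllowed are hypothetical, and the sketch assumes the analyzer requires the whole name to match (matches() rather than find()), which this report does not confirm.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of JSword.
final class StemmerNameCheck {
    // Copied from the ConfigurableSnowballAnalyzer line quoted in this issue.
    private static final Pattern ALLOWED_STEMMERS = Pattern.compile(
        "(Danish|Dutch|English|Finnish|French|German2|German|Italian|Kp|Lovins"
        + "|Norwegian|Porter|Portuguese|Russian|Spanish|Swedish)");

    // Assumption: the whole language name must match one of the alternatives.
    static boolean isAllowed(String languageName) {
        return ALLOWED_STEMMERS.matcher(languageName).matches();
    }
}
```

Under that assumption, isAllowed("Spanish") accepts the value from iso639full.properties, while isAllowed("Spanish (Espa\u00F1ol)") and isAllowed("Espa\u00F1ol") reject the values from iso639_en.properties and iso639.properties.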
The fix looks non-trivial; I tried using the language code instead of the name but got the error:
java.lang.ClassNotFoundException: org.tartarus.snowball.ext.esStemmer
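The error suggests that the stemmer class is looked up by name as org.tartarus.snowball.ext.<name>Stemmer, so a bare language code cannot be passed straight through. One possible direction, sketched here under assumptions and not as the actual fix, is to translate the ISO 639 code into the English stemmer name before the lookup, which would make the result independent of whichever iso* properties file supplies the display name. The class StemmerNameLookup and its methods are hypothetical, the class-name pattern is inferred only from the exception above, and the map covers just the natural-language entries of the allowedStemmers list (German2, Kp, Lovins and Porter have no language code and are left out).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of JSword.
final class StemmerNameLookup {
    // ISO 639 two-letter code -> stemmer name from the allowedStemmers list.
    private static final Map<String, String> CODE_TO_STEMMER = new HashMap<String, String>();
    static {
        CODE_TO_STEMMER.put("da", "Danish");
        CODE_TO_STEMMER.put("nl", "Dutch");
        CODE_TO_STEMMER.put("en", "English");
        CODE_TO_STEMMER.put("fi", "Finnish");
        CODE_TO_STEMMER.put("fr", "French");
        CODE_TO_STEMMER.put("de", "German");
        CODE_TO_STEMMER.put("it", "Italian");
        CODE_TO_STEMMER.put("no", "Norwegian");
        CODE_TO_STEMMER.put("pt", "Portuguese");
        CODE_TO_STEMMER.put("ru", "Russian");
        CODE_TO_STEMMER.put("es", "Spanish");
        CODE_TO_STEMMER.put("sv", "Swedish");
    }

    // Returns the stemmer name for a language code, or null if none is known.
    static String stemmerNameFor(String languageCode) {
        return CODE_TO_STEMMER.get(languageCode);
    }

    // Assumption: class names follow the pattern seen in the exception message.
    static String stemmerClassNameFor(String languageCode) {
        String name = stemmerNameFor(languageCode);
        return name == null ? null : "org.tartarus.snowball.ext." + name + "Stemmer";
    }
}
```

With this mapping, the code "es" resolves to the name "Spanish" rather than to "es", and a code with no entry returns null so the caller can decide how to handle a language without a stemmer.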
I am going to roll back the fix for JS-192 until DM has a chance to look at this.