Modules / MOD-380

Wrong non-ASCII characters in DutKant module

    Details

      Description

      And Bible user reports:

       

      (Warning: Non-ASCII characters in this bug report.)

      *Describe the bug*
      Using DutKant, the Dutch "Kanttekeningen" (side notes) for the Dutch Statenvertaling, all non-ASCII characters (often e-with-diaeresis / ë / Unicode U+00EB) are shown as a replacement character (looking like a crossed square / ⛝ / Unicode U+26DD).

      *Bug was found on And Bible version*
      Any build I've ever used, including the most recent beta 3.3.382.

      *To Reproduce*
      Steps to reproduce the behavior:
      1. Install DutKant (language: Dutch/Nederlands, type Commentary/Commentaar, from CrossWire), and choose that document.
      2. Go to Matthew 1:12.
      3. Observe something that looks like "11) Salathi⛝l [...]".

      *Expected behavior*
      I would expect to see "Salathiël" instead.

      *Screenshots*
      [I might add a screenshot later.]

      *Smartphone:*

      • Device: Nokia 6.1 / TA-1043
      • OS: basically-stock Android, language Nederlands (Dutch).
      • Version 10 (patch level May 1, 2020).

      *Additional context*
      Copying this word to QuickEdit on Android stores the character as a single 0x89 byte. (See https://www.fileformat.info/info/unicode/char/eb/charset_support.htm and https://www.fileformat.info/info/unicode/char/eb/codepage_support.htm for the character sets and code pages that encode ë as 0x89: IBM 437 (aka PC-8 or DOS Latin US, see https://en.wikipedia.org/wiki/Code_page_437) and also several of IBM 850...865.)

      Also, copying this word to the Gmail app turns it into a 'per mille' sign (U+2030), which has 0x89 as its encoding in Windows code pages 1250..1258.
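
      The two observations above can be checked with a short Python sketch that decodes the single byte 0x89 under a few of the encodings mentioned. The codec names are Python's standard ones; the function name and the selection of encodings are only illustrative and say nothing about what AndBible or the module itself does.

```python
# Decode the single byte 0x89 under several legacy encodings.
RAW = b"\x89"

def decode_0x89(encodings=("cp437", "cp850", "cp1252", "latin-1")):
    """Return {encoding name: decoded character} for the byte 0x89."""
    return {name: RAW.decode(name) for name in encodings}

if __name__ == "__main__":
    for name, char in decode_0x89().items():
        # Show the code point next to the character itself.
        print(f"{name:8} U+{ord(char):04X} {char!r}")
```

      Under cp437 and cp850 the byte comes out as ë (U+00EB), under cp1252 as the per mille sign (U+2030), and under latin-1 as the control character U+0089.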

      So the underlying byte in the source document seems to actually be 0x89. That would mean the document is in code page 437 encoding, and that AndBible interprets the byte as the control character [U+0089](https://codepoints.net/U+0089) instead, perhaps?
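
      If that guess is right (a CP437 byte read one-to-one as a code point), the damage is reversible in principle. The following is a minimal sketch under that assumption; `reinterpret_as_cp437` is a hypothetical helper, not part of AndBible or SWORD, and it only works when every character in the input is below U+0100.

```python
def reinterpret_as_cp437(text: str) -> str:
    """Undo a byte-for-code-point misreading of CP437 data.

    Assumes each character in `text` stands for exactly one original
    byte (code point below U+0100); otherwise encode() raises.
    """
    raw = text.encode("latin-1")  # code points 0..255 back to raw bytes
    return raw.decode("cp437")    # read those bytes as code page 437

if __name__ == "__main__":
    shown = "11) Salathi\u0089l"  # what the app appears to hold
    print(reinterpret_as_cp437(shown))
```

      Plain ASCII text passes through unchanged, so only the bytes at 0x80 and above are affected by the reinterpretation.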

      *My questions*

      • Is this an issue in AndBible? Or in the DutKant document? Or possibly in both?
      • If in DutKant: Where is the original DutKant document? So where is AndBible getting the binary from, and from which source is that built?
      • What can I do to help troubleshoot this?

          Activity

          lafricain Cyrille added a comment - edited

          In which repository did you find this Bible? I can't find it in any repository.

          OK, it's a commentary?

          tuomas Tuomas Airaksinen added a comment -

          It is in the standard CrossWire repository (as the URL in the above comment suggests).

          marnix.klooster Marnix Klooster added a comment -

          For the record, as suggested on the Freenode IRC channel #sword, I asked the same question to sword-support@crosswire.org and received the following response:

          This is an ancient module and we do not have any record from where it was derived. Usually (now) we record the source in our conf files, but this being from 2002, it has no such information.

          You could look through our mailing lists from that time, but other than this, the best is probably simply to go on the search again.

          Irrespective of the shortcomings of this text, we would now only accept a source text which is guaranteed to be better, i.e. not just replacing this error with another one. In particular, modules from other programmes are not acceptable sources.

          So I will go over these mailing lists (https://wiki.crosswire.org/Help:Mailing_Lists), go over the modules part of the Developer's Wiki (the 'Module Development' part of https://wiki.crosswire.org/Main_Page), and see if I can find out more about a potential successor to DutKant+DutSVV which I found: STV (and STVA) over at https://github.com/Isidore-Guild/statenvertaling.

          lafricain Cyrille added a comment -

          @Marnix, you found a wonderful source text with a CC license. I am investigating DutKant too. I wrote to https://statenvertaling.nl/index.html and I'm waiting for an answer.

          But your resource is really better. When I have free time I'll use this source text.

          Thank you for your investigation!

          lafricain Cyrille added a comment -

          Solved in DutSVVA, now in beta. (It merges both DutKant and DutSVV.)

            People

            • Assignee:
              refdoc Peter von Kaehne
              Reporter:
              tuomas Tuomas Airaksinen