  1. Module Tools
  2. MODTOOLS-1

osis2mod: reversified material not being written to files... sometimes

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: osis2mod
    • Labels:
      None
    • Environment:

      Win32, VC2008-compiled utilities, based on SVN (r2473)

      Description

      There's a very odd bug in osis2mod that is resulting in some re-versified material not being written to files. Here's some example output from osis2mod:

      INFO(V11N): Hosea 12:15 is not in the KJV versification.
      INFO(V11N): Hos.12.15 is not in the KJV versification. Appending content to Hos.12.14
      INFO(WRITE): Appending entry: Hos.12.14: הכעיס אפרים תמרורים ודמיו עליו יטוש וחרפתו ישיב לו אדניו<lb/> <chapter eID="gen23199" osisID="Hos.12"/>
      INFO(V11N): Hosea 14:10 is not in the KJV versification.
      INFO(V11N): Hos.14.10 is not in the KJV versification. Appending content to Hos.14.9

      The first line indicates that a non-KJV verse has been found. Line 2 indicates that it will be written to the previous verse. Line 3 indicates that it is being written to that verse.
      The fourth line indicates that a non-KJV verse has been found. Line 5 indicates that it will be written to the previous verse. But there is no indication that it is being written to the verse. (And it doesn't get written.)
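
The fallback described by the log lines can be sketched roughly as follows. This is a minimal illustration, not osis2mod's actual code: the names VerseRef, resolveTarget and hoseaSampleTable are hypothetical, and the table holds only the two chapter lengths implied by the log output above (Hosea 12 ends at verse 14 and Hosea 14 ends at verse 9 in the KJV versification).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical types for illustration only; none of this is SWORD API.
using ChapterKey = std::pair<std::string, int>;  // (book, chapter)

struct VerseRef {
    std::string book;
    int chapter;
    int verse;
};

// Sample table with just the two chapters that appear in the log above.
inline std::map<ChapterKey, int> hoseaSampleTable() {
    std::map<ChapterKey, int> maxVerse;
    maxVerse[ChapterKey("Hos", 12)] = 14;
    maxVerse[ChapterKey("Hos", 14)] = 9;
    return maxVerse;
}

// Return the verse that should receive the content: the verse itself when
// it exists in the table, otherwise the last verse of the same chapter.
// Chapters missing from the table are passed through unchanged.
inline VerseRef resolveTarget(const std::map<ChapterKey, int> &maxVerse,
                              const VerseRef &ref) {
    auto it = maxVerse.find(ChapterKey(ref.book, ref.chapter));
    if (it == maxVerse.end() || ref.verse <= it->second) {
        return ref;
    }
    return VerseRef{ref.book, ref.chapter, it->second};
}
```

With this table, Hos.12.15 resolves to Hos.12.14 and Hos.14.10 resolves to Hos.14.9, matching the "Appending content to" lines in the log.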

      There's nothing in the markup to suggest why this is occurring, and I have no idea how long it has been a problem. We might have to do many many module re-issues once we find the source of the problem.

          Activity

          dfh David Haslam added a comment -

          Has there been any further progress towards solving this issue?

          dmsmith DM Smith added a comment -

          I'm not able to fix it, as I cannot find it.

          chrislit Chris Little added a comment -

          The problem comes down to the line:
          SWBuf currentText = module->getRawEntry();

          For compressed modules, currentText will sometimes be an empty string despite the fact the associated verse has already been assigned some text.
          For uncompressed modules, this is never a problem.

          So it does seem to be an issue with flushing.

          chrislit Chris Little added a comment -

          This bug makes sense to me now, I suppose.

          The line SWBuf currentText = module->getRawEntry(); needs to be able to read text that has already been written. But that text doesn't get written until we have moved to the next compression block and have called module->flush();.

          Adding module->flush(); immediately before this line "fixes" the problem in that it eliminates the loss of data. However, it also amounts to compression at the verse level, which means the output files are much larger than the input.

          I'm not sure how to fix the issue under the current architecture.

          (Obviously there's a workaround: produce uncompressed modules with osis2mod then compress them with mod2zmod.)
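
The read-before-flush problem can be reproduced in a toy model. This is only a sketch under stated assumptions: ToyBlockStore and appendNaive are hypothetical names, the class borrows the method names getRawEntry and flush purely for readability, and it does not reflect how SWORD's compressed drivers are actually implemented. In this toy the earlier verse text is what gets overwritten; which text is lost in osis2mod itself may differ.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy model, NOT the SWORD API: entries written to a
// "compressed" store sit in a pending cache until flush() commits them.
class ToyBlockStore {
public:
    // Queue text for a verse key; it is not readable until flush().
    void setEntry(const std::string &key, const std::string &text) {
        pending_[key] = text;
    }

    // Read committed text only, mimicking the behaviour described above.
    std::string getRawEntry(const std::string &key) const {
        auto it = committed_.find(key);
        return it == committed_.end() ? std::string() : it->second;
    }

    // Commit everything queued so far (one "compression block").
    void flush() {
        for (const auto &kv : pending_) {
            committed_[kv.first] = kv.second;
        }
        pending_.clear();
    }

private:
    std::map<std::string, std::string> pending_;
    std::map<std::string, std::string> committed_;
};

// Append text to a verse by reading the current text, concatenating,
// and writing back. If the verse has not been flushed yet, the read
// returns an empty string and the earlier text is silently replaced.
inline void appendNaive(ToyBlockStore &store, const std::string &key,
                        const std::string &extra) {
    std::string current = store.getRawEntry(key);
    store.setEntry(key, current + extra);
}
```

In this model, calling setEntry("Hos.14.9", "A") followed by appendNaive(store, "Hos.14.9", "B") and then flush() leaves "B" in the store, while flushing before the append leaves "AB".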

          chrislit Chris Little added a comment -

          fixed in r3127:
          Basically, I used SWModule::hasEntry to determine if we've assigned text to a given verse. If so, I flush the cache, get the current verse text (which is now non-empty), concatenate, etc.

          It's not ideal, but since it only flushes the cache in the rare cases of reversified material, the final file size is still good.

          Optimal compression still requires (re)compressing with mod2zmod.
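
The shape of that fix can be sketched in a toy model. The code in r3127 is not shown in this issue, so everything below is an assumption for illustration: CachedStore and appendChecked are hypothetical names, and this hasEntry simply reports whether any text (pending or committed) has been assigned to a key, which may not match SWModule::hasEntry exactly.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy store, NOT the SWORD API: writes are cached until flush().
class CachedStore {
public:
    void setEntry(const std::string &key, const std::string &text) {
        pending_[key] = text;
    }

    // True if text has been assigned to the key, flushed or not.
    bool hasEntry(const std::string &key) const {
        return pending_.count(key) > 0 || committed_.count(key) > 0;
    }

    // Reads committed text only; unflushed text is invisible here.
    std::string getRawEntry(const std::string &key) const {
        auto it = committed_.find(key);
        return it == committed_.end() ? std::string() : it->second;
    }

    void flush() {
        for (const auto &kv : pending_) {
            committed_[kv.first] = kv.second;
        }
        pending_.clear();
        ++flushCount_;
    }

    int flushCount() const { return flushCount_; }

private:
    std::map<std::string, std::string> pending_;
    std::map<std::string, std::string> committed_;
    int flushCount_ = 0;
};

// Flush only when the target verse already has text, so that the read
// below sees it; verses with no prior text cost no extra flush.
inline void appendChecked(CachedStore &store, const std::string &key,
                          const std::string &extra) {
    if (store.hasEntry(key)) {
        store.flush();
    }
    store.setEntry(key, store.getRawEntry(key) + extra);
}
```

Because the extra flush happens only on the append path, ordinary verse writes keep their normal block boundaries, which is consistent with the observation that the final file size stays reasonable.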


            People

            • Assignee:
              chrislit Chris Little
              Reporter:
              chrislit Chris Little
            • Votes:
              0
              Watchers:
              2
