Opened 9 years ago

Closed 8 years ago

Last modified 8 years ago

#6808 closed bug (invalid)

Drop the checksum field from catkeys file header

Reported by: rq
Owned by: pulkomandy
Priority: normal
Milestone: R1
Component: Kits/Locale Kit
Version: R1/alpha2
Keywords:
Cc:
Blocked By:
Blocking:
Has a Patch: no
Platform: All

Description

From discussions with PulkoMandy, it seems to me that the checksum field in the catkeys header is superfluous and could be dropped (PulkoMandy doesn't think that way though). My reasons for dropping it are:

1) it's being used at compile time to ensure that the source strings have not been altered, which is IMO not the best way to ensure integrity. Instead, explicit checks could be done at compile time, or a special script could be developed to make checks for missing/obsolete strings easier. Here's an example output of such an app, used for Mozilla:

rq@sugar:~/src/mozilla/mercurial$ compare-locales mozilla-central/browser/locales/l10n.ini . lt
lt
  browser/feedback/main.properties
    +testpilot.welcomePage.backgroundSurvey
    +testpilot.welcomePage.pleaseTake
  toolkit/chrome/global/headsUpDisplay.properties
    +jsWorkspaceTitle
lt:
keys: 984
unchanged: 257
changed: 5257
missing: 3
95% of entries changed
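To illustrate the kind of script point 1 has in mind, here is a minimal Python sketch that reports missing and obsolete strings in a translation. It assumes a catkeys layout of one tab-separated header line followed by entries of the form source string, context, comment, translation. That layout and the function names parse_catkeys and compare_catkeys are assumptions made for this example, not an existing Haiku tool.

```python
def parse_catkeys(text):
    """Split catkeys text into (header_fields, set_of_keys).

    Assumed layout (not verified against the Locale Kit sources):
    the first non-empty line is a tab-separated header, and every
    other line is: source string, context, comment, translation.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split("\t")
    keys = set()
    for line in lines[1:]:
        fields = line.split("\t")
        # An entry is identified by (source, context, comment);
        # the translated text is deliberately ignored here.
        keys.add(tuple(fields[:3]))
    return header, keys


def compare_catkeys(reference_text, translation_text):
    """Report entries missing from, or obsolete in, a translation."""
    _, ref_keys = parse_catkeys(reference_text)
    _, tr_keys = parse_catkeys(translation_text)
    return {
        "missing": sorted(ref_keys - tr_keys),   # in en, not translated
        "obsolete": sorted(tr_keys - ref_keys),  # translated, gone from en
        "shared": len(ref_keys & tr_keys),
    }
```

A report like this pinpoints which entries are out of sync, whereas a single checksum mismatch only says that something differs.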

2) the checksum is also used at runtime for the same purpose. Again, I think it's even less useful in this case. At runtime, the application may have five or fifty new strings that don't exist in my file anyway, and my invalid string may have already been dropped. What's the point of invalidating the whole translation just based on that? I see none. Also, PulkoMandy said that the checksum may at some point be used to block outdated translations from being used. If that were ever to happen, I suggest computing the effective checksum of a localization at runtime instead (which has to be done anyway, at least for languages with fallbacks, like pt_BR).
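The runtime idea in point 2 could look roughly like the Python sketch below: merge the fallback chain first (for example pt_BR over pt), then compute a checksum over the source strings that are actually covered. CRC32 is used purely as a stand-in and is not Haiku's fingerprint algorithm. The dictionaries and the names effective_entries and effective_checksum are hypothetical.

```python
import zlib


def effective_entries(*layers):
    """Merge translation layers; earlier layers win.

    Each layer is a dict mapping source string -> translation,
    e.g. effective_entries(pt_br, pt) lets pt_BR override pt.
    """
    merged = {}
    for layer in reversed(layers):
        merged.update(layer)
    return merged


def effective_checksum(entries):
    """Checksum over the covered source strings only.

    CRC32 is a placeholder for whatever fingerprint the real
    implementation would use. A NUL separator keeps adjacent
    strings from running together.
    """
    crc = 0
    for source in sorted(entries):
        crc = zlib.crc32(source.encode("utf-8") + b"\0", crc)
    return crc
```

Because only source strings feed the checksum, two localizations that cover the same strings get the same value regardless of the translated text.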

3) if someone were to ever edit the catkeys files manually, the checksum isn't easy to calculate.

4) the checksum doesn't say anything about the target language strings, which means these can have any number of errors, bad encoding and anything else. This again mandates a compile-time check, which, once existing, could be expanded to take over the checksum's job anyway.

Change History (7)

comment:1 Changed 9 years ago by diver

Since HTA has been in a semi-working state for several months now, I have had to edit the catkeys files manually, and the need to change the checksum field every time after a small correction is really a PITA. So +1 from a translator POV.

comment:2 Changed 9 years ago by pulkomandy

The checksum is computed only on the English strings, so it shouldn't change. The current HTA is messing with it and creating problems.

For reference, here are the reasons I want to keep this checksumming:

  • It's used for symmetry between the .catalog and .catkeys files. This way, they share more code.
  • It makes it possible to ensure that a catkeys file is correct and was not corrupted. The cause of corruption can be a bug in HTA, or user error (re-encoding the file to something other than UTF-8, for example).
  • Version matching of catkeys. By comparing the fingerprint in en.catkeys and the one in a translation, one can immediately tell whether the translation is up to date.
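The version-matching check in the last item can be sketched in a few lines of Python. This assumes the fingerprint is the fourth tab-separated field of the first line of a catkeys file. That position and the helper names header_fingerprint and is_up_to_date are assumptions for this example.

```python
def header_fingerprint(catkeys_text):
    """Return the fingerprint field from a catkeys header line.

    Assumed header layout (tab-separated):
    version, language, signature, fingerprint.
    """
    first_line = catkeys_text.splitlines()[0]
    fields = first_line.split("\t")
    if len(fields) < 4:
        raise ValueError("header has no fingerprint field")
    return fields[3].strip()


def is_up_to_date(reference_text, translation_text):
    """True when a translation carries the same fingerprint as en.catkeys."""
    return header_fingerprint(reference_text) == header_fingerprint(translation_text)
```

This only compares the stored values; it does not recompute the fingerprint, so it cannot detect a header that was edited by hand.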

The first item is a lame developer excuse. The two others are more important. Note that the pain you feel as a translator from having to update this field would be transferred to me, as I would have to check everything by hand, for all languages. My goal is to make the process fully automatic from translator to svn. This needs some sanity checks on the submitted files, and the catkey allows checking file validity instantly. If the key is out of sync, it means something went wrong somewhere, and this will have consequences down the line. Hiding the symptom is not a good cure.

Now, the reasons mentioned by RQ are also good points in the other direction. There are other, more complicated ways of doing these checks, but in no case will I keep handling this manually. So don't remove the catkey until there's a proven, working way of doing the same checks, with as much automation as possible.

comment:3 Changed 8 years ago by pulkomandy

Resolution: invalid
Status: new → closed

Ok, Pootle now has catkey computation working fine, so I consider this closed.

comment:4 Changed 8 years ago by rq

Pootle wasn't even mentioned as the reason for this bug. I still believe the checksum could be safely dropped without notable negative effects.

comment:5 in reply to:  4 Changed 8 years ago by siarzhuk

Replying to rq:

Pootle wasn't even mentioned as the reason for this bug. I still believe the checksum could be safely dropped without notable negative effects.

Regardless of whether HTA, Pootle or other translation applications are mentioned, checksum protection of catkeys is not only helpful but necessary: it prevents badly formatted files from being used and lets the problem be caught at the catalog compilation stage.

comment:6 Changed 8 years ago by rq

Siarzhuk, I covered that use-case in my initial comment...

comment:7 Changed 8 years ago by siarzhuk

Regardless of theory, it saves me a lot of time in real life when hunting for invalid translation lines coming from HTA. If you have any objections, please initiate a discussion about an improved format on the haiku-development mailing list. In the current state, the checksum is required to catch invalidly formatted catkeys at the compilation stage.
