From Fedora Project Wiki


Revision as of 08:26, 11 January 2018 by Mfabian (talk | contribs)


Glibc collation update and sync with CLDR

Summary

Update the collation data in glibc to the 2015 version of the ISO iso14651_t1_common file (in sync with Unicode 8.0.0) and sync the collation rules of the locales with CLDR.

Owner

  • Name: Mike Fabian
  • Email: <mfabian@redhat.com>
  • Release notes owner:

Current status

  • Targeted release: Fedora 28
  • Last updated: 2018-01-11
  • Tracker bug: <will be assigned by the Wrangler>

Detailed Description

The collation data in glibc is extremely out of date. Most locales base their collation rules on an iso14651_t1_common file that has probably not been updated for more than 15 years. As a result, characters added in later Unicode versions are missing and are not sorted at all, which causes bugs like [[1]]. This change updates the iso14651_t1_common file to the latest version available from ISO, which dates from 2015 and is up to date with Unicode 8.0.0. Because the new iso14651_t1_common file contains additions and syntax changes, updating it requires changing the collation rules of almost all locales. Since all these collation rules have to be touched anyway, this is a good opportunity to fix bugs in them and sync them with the collation rules in CLDR.

Benefit to Fedora

This will fix many collation bugs and make glibc sort strings more correctly according to current standards.

Scope

  • Proposal owners: Work with upstream, file bugs and provide patches where required.
  • Other developers: This change will impact glibc and everything that sorts strings using the collation functions from glibc. Other developers do not need to make any changes on their end, but they should watch how their applications behave with the improved localedata. Proper testing is needed to make sure the change does not break any application.
  • Policies and guidelines: No, this change does not require any updates to policies or packaging guidelines.
  • Trademark approval: N/A (not needed for this Change)

Upgrade/compatibility impact

The sort order of strings in many locales will change somewhat.

How To Test

Test whether locale-specific sorting works correctly according to the sorting rules for a locale. Test whether characters added up to Unicode 8.0.0 sort correctly.
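One way to run such checks by hand is through Python's locale module, which calls the collation functions of the C library (glibc on Fedora). The following is only a minimal sketch: the helper names collation, sort_with_locale and find_equal_pairs are made up for this example, and it assumes that the locale under test is installed on the system. Treating distinct characters that compare as equal as a sign of missing collation data is likewise an assumption of this sketch, not a rule stated in this Change.

```python
import contextlib
import itertools
import locale


@contextlib.contextmanager
def collation(locale_name):
    """Temporarily switch LC_COLLATE to locale_name (hypothetical helper).

    Raises locale.Error if the locale is not installed.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query current setting
    locale.setlocale(locale.LC_COLLATE, locale_name)
    try:
        yield
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)  # always restore


def sort_with_locale(words, locale_name):
    """Return words sorted by the collation rules of locale_name."""
    with collation(locale_name):
        # strxfrm maps each string to a key whose ordering matches strcoll
        return sorted(words, key=locale.strxfrm)


def find_equal_pairs(chars, locale_name):
    """Return pairs of distinct characters that strcoll reports as equal.

    Assumption of this sketch: such pairs hint that the characters have
    no usable collation weights in the locale under test.
    """
    with collation(locale_name):
        return [
            (a, b)
            for a, b in itertools.combinations(chars, 2)
            if a != b and locale.strcoll(a, b) == 0
        ]
```

For example, sort_with_locale(["zebra", "äpple", "apple"], "sv_SE.UTF-8") returns the words in the order the installed Swedish locale produces, which can then be compared by eye against the CLDR rules for that language. The word list and locale name here are illustrative only. Passing a few characters from newer Unicode blocks to find_equal_pairs shows whether any of them are indistinguishable to the collation functions.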

User Experience

Better sorting of strings by glibc, more up-to-date with current standards.

Dependencies

  • Upstream release schedule.
  • If our patches are not accepted upstream, we will not try to patch glibc in Fedora. This change will therefore make it into Fedora 28 only if glibc 2.27 is released in time for Fedora 28.

Contingency Plan

  • Contingency mechanism: The change will be moved to the Fedora 29 release.
  • Contingency deadline: Fedora 29 Beta release.
  • Blocks release? No.
  • Blocks product? No.

Documentation

[[2]]

Release Notes