Localization lab specs


Keywords

l10n, i18n, localization, internationalization, Unicode, ASCII, BOM, multibyte, TCHAR, widechar, translation, GUI

Objectives

Develop a mechanism that enables a program to display its interface in multiple languages, depending on how it is configured.

A GUI is not mandatory; a command-line program that prints some strings in one language or another, depending on its command-line arguments, is sufficient. For example, myProg.exe /FR will display French strings, while myProg.exe /EN will display English strings.


Warnings:

  • "Solutions" such as this one are not solutions: if (param == "EN") {print "English";} else if (param == "FR") {print "French";}. This is just another form of hard-coding, because every new language requires a code change. Avoid it like the plague.
  • You'll use this mechanism in your future assignments.
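
One way to avoid the branching shown in the first warning is to treat the language code as data: it selects a file, and the file supplies the strings. The following is a minimal Python sketch under stated assumptions, not a reference solution. The file naming scheme (strings.<lang>.txt), the key=value line format, and all function names are hypothetical and do not come from this lab spec.

```python
import sys
import tempfile
from pathlib import Path


def load_strings(folder: Path, lang: str) -> dict[str, str]:
    """Read key=value pairs from <folder>/strings.<lang>.txt (UTF-8)."""
    table = {}
    path = folder / f"strings.{lang.lower()}.txt"
    # "utf-8-sig" skips a byte order mark (BOM) if the editor wrote one.
    with path.open(encoding="utf-8-sig") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # ignore blank lines and comments
            key, _, value = line.partition("=")
            table[key.strip()] = value.strip()
    return table


def language_from_args(argv: list[str], default: str = "en") -> str:
    """Return the code from the first '/XX' argument, e.g. '/FR' -> 'fr'."""
    for arg in argv[1:]:
        if arg.startswith("/") and len(arg) > 1:
            return arg[1:].lower()
    return default


def write_demo_files(folder: Path) -> None:
    """Create two tiny string files so that the sketch runs on its own."""
    (folder / "strings.en.txt").write_text(
        "greeting=Hello\nfarewell=Goodbye\n", encoding="utf-8")
    (folder / "strings.fr.txt").write_text(
        "greeting=Bonjour\nfarewell=Au revoir\n", encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        write_demo_files(folder)
        strings = load_strings(folder, language_from_args(sys.argv))
        print(strings["greeting"])
        print(strings["farewell"])
```

With this layout, supporting another language means adding another file; the program code stays unchanged.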


Requirements
  • The program must use Unicode for all string-related procedures
  • The strings for each language must be loaded from a file
  • The program must display the paths to the special folders in the system on which it is run (for the currently logged on user):
    • Program Files
    • My Documents
    • On POSIX systems, look for the home directory; if you're running a fancy desktop environment, get the paths to the Photos and Music folders
  • The program must display the current date and time using the system's regional settings formatting
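
The last two requirements can be sketched in Python as follows. This is only an illustration under assumptions: on Windows it calls the shell function SHGetFolderPathW through ctypes (a relative of the SHGetSpecialFolderLocation function mentioned in the next section); on POSIX it assumes the desktop environment follows the XDG convention of a ~/.config/user-dirs.dirs file, which not every system has. All helper names are hypothetical.

```python
import locale
import sys
from datetime import datetime
from pathlib import Path

CSIDL_PERSONAL = 0x0005       # "My Documents"
CSIDL_PROGRAM_FILES = 0x0026  # "Program Files"


def windows_folder(csidl: int) -> str:
    """Ask the Windows shell for a special folder path (Windows only)."""
    import ctypes
    buffer = ctypes.create_unicode_buffer(260)  # MAX_PATH wide characters
    result = ctypes.windll.shell32.SHGetFolderPathW(None, csidl, None, 0, buffer)
    if result != 0:  # anything other than S_OK
        raise OSError(f"SHGetFolderPathW failed with HRESULT {result:#x}")
    return buffer.value


def parse_user_dirs(text: str, home: Path) -> dict[str, Path]:
    """Parse the contents of an XDG user-dirs.dirs file into name -> path."""
    folders = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("XDG_") or "=" not in line:
            continue  # skip comments and unrelated lines
        name, _, value = line.partition("=")
        value = value.strip().strip('"').replace("$HOME", str(home))
        folders[name] = Path(value)
    return folders


def special_folders() -> dict[str, str]:
    """Return label -> path for the folders the requirements ask for."""
    if sys.platform == "win32":
        return {
            "Program Files": windows_folder(CSIDL_PROGRAM_FILES),
            "My Documents": windows_folder(CSIDL_PERSONAL),
        }
    home = Path.home()
    result = {"Home": str(home)}
    config = home / ".config" / "user-dirs.dirs"  # assumed default location
    if config.is_file():
        dirs = parse_user_dirs(config.read_text(encoding="utf-8"), home)
        for label, key in (("Photos", "XDG_PICTURES_DIR"),
                           ("Music", "XDG_MUSIC_DIR")):
            if key in dirs:
                result[label] = str(dirs[key])
    return result


def local_timestamp() -> str:
    """Format the current date and time using the user's regional settings."""
    try:
        locale.setlocale(locale.LC_TIME, "")  # adopt the environment's locale
    except locale.Error:
        pass  # keep the default "C" locale if the configured one is missing
    return datetime.now().strftime("%c")


if __name__ == "__main__":
    for label, path in special_folders().items():
        print(f"{label}: {path}")
    print(local_timestamp())
```

The labels printed here ("Home", "Music", and so on) are hard-coded only to keep the sketch short; in the assignment they would come from the per-language string file.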


Typical problems with localization

  • Hard-coding strings into the code instead of loading them from an external resource
  • Using ASCII instead of Unicode - this makes the program unable to display special characters if the operating system is not configured accordingly
  • Special folder names, e.g. "Documents and Settings" or "Program Files", differ between versions and language editions of Windows. Instead of hard-coding the path, use a function that determines the path to the folder on the current system: SHGetSpecialFolderLocation
  • String parsing functions that work byte by byte can behave incorrectly when dealing with multibyte characters whose first byte is NUL (0x00), because they treat that byte as the end of the string
  • Hard-coding symbols or formats into your parsing functions, e.g.:
    • ',' as a separator in CSV files is sometimes replaced with ';' in other cultures
    • Date and time formats
  • Calendars - some cultures use different ones
  • Units - some cultures use the metric system, but not all of them
  • Hard-coding coordinates of widgets. Not only may this make the program look ugly on screens with a different resolution, but the layout may also look entirely unnatural to people from other cultures (see 'Arabic interfaces' in the References)
  • Using exact sizes when rendering windows, widgets or texts may result in strings in a different language not fitting into the given space
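  • If strings are not managed in "one place to rule them all", there is a great chance that inconsistencies will arise, e.g.: http://habrahabr.ru/blogs/ui_design_and_usability/70762/
  • Hard-coding the order of elements in a string. Building a message by concatenating fragments fixes a word order that other languages may not share.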
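
Two of the problems listed above are easy to reproduce in a few lines of Python. The sketch below first shows how a byte-oriented routine that stops at the first zero byte (as C's strlen does) miscounts UTF-16 text, and then shows named placeholders as one possible way to let each translation choose its own word order. The helper names and the two message templates are illustrative assumptions, not part of the lab spec.

```python
def c_style_length(data: bytes) -> int:
    """Count bytes up to the first zero byte, the way C's strlen would."""
    position = data.find(b"\x00")
    return len(data) if position == -1 else position


# Each translation decides where the values go; the code only supplies them.
TEMPLATES = {
    "en": "{user} deleted {count} files",
    "fr": "{count} fichiers supprimés par {user}",
}


def render(lang: str, **values) -> str:
    """Fill the named placeholders of the template for the given language."""
    return TEMPLATES[lang].format(**values)


if __name__ == "__main__":
    text = "Ab"
    # In big-endian UTF-16 every ASCII character starts with a zero byte.
    print(text.encode("utf-16-be"))                   # b'\x00A\x00b'
    print(c_style_length(text.encode("utf-16-be")))   # 0
    print(c_style_length(text.encode("utf-16-le")))   # 1
    print(c_style_length(text.encode("utf-8")))       # 2

    print(render("en", user="Ana", count=3))  # Ana deleted 3 files
    print(render("fr", user="Ana", count=3))  # 3 fichiers supprimés par Ana
```

In C or C++ the equivalent precaution is to use the wide-character or TCHAR variants of the string functions consistently, instead of mixing them with byte-oriented ones.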



References