Unicode Standards

From Code::Blocks
Revision as of 20:14, 5 September 2005 by Jmccay (talk | contribs) (Place for the Unicode information)

This page is meant to be a place where developers can find all the current Unicode standards, or good practices, for editing and developing the Code::Blocks program. I am going to try to summarize here the discussions that I was pointed to, but I am leaving out the original authors. Sorry. Feel free to edit this to improve it or keep it up to date. I am new to using wikis, so please excuse the bad design. This is a VERY rough draft with no clear organizational pattern.
-- Joe M.

reference: [1]


Macros

{NOTE: bullet list would look better here, but bold is used for now}
__TFILE__ = wxWidgets-provided equivalent of __FILE__
__TDATE__ = wxWidgets-provided equivalent of __DATE__
__TTIME__ = wxWidgets-provided equivalent of __TIME__
_U() = Use it to convert non-literal char* strings to wxString, for example when reading attributes from TiXmlNodes. If you deal with functions that return char* strings, you must use our _U macro.

Code:

  // wxUSE_UNICODE is always defined (as 0 or 1), so test its value with #if
  #if wxUSE_UNICODE
    #define _U(x) wxString((x),wxConvUTF8)
    #define _UU(x,y) wxString((x),(y))
  #else
    #define _U(x) (x)
    #define _UU(x,y) (x)
  #endif

e.g.: Code:

  const char* incompatible = "This is an incompatible string";
  wxString compatible = _U(incompatible);
  // wxString conftype = conf->Attribute("ConfigurationType"); // before
  wxString conftype = _U(conf->Attribute("ConfigurationType")); // after :)
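
To make the need for an explicit conversion more concrete, here is a minimal, wxWidgets-free sketch of what a UTF-8 to wide-character conversion has to do. `decode_utf8` is a hypothetical helper written for this page, not the code behind wxConvUTF8; it assumes well-formed input and does no validation.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of wxWidgets or Code::Blocks): decodes a
// NUL-terminated UTF-8 string into one code point per element.
// Assumes well-formed input; a real converter must validate every byte.
inline std::u32string decode_utf8(const char* s) {
    assert(s != nullptr);
    std::u32string out;
    while (*s != '\0') {
        const unsigned char lead = static_cast<unsigned char>(*s++);
        char32_t cp = 0;
        int continuation = 0;
        if (lead < 0x80)      { cp = lead;        continuation = 0; }  // 1 byte (ASCII)
        else if (lead < 0xE0) { cp = lead & 0x1F; continuation = 1; }  // 2 bytes
        else if (lead < 0xF0) { cp = lead & 0x0F; continuation = 2; }  // 3 bytes
        else                  { cp = lead & 0x07; continuation = 3; }  // 4 bytes
        // Each continuation byte contributes its low 6 bits.
        while (continuation-- > 0 && *s != '\0') {
            cp = (cp << 6) | (static_cast<unsigned char>(*s++) & 0x3F);
        }
        out.push_back(cp);
    }
    return out;
}
```

The two bytes "\xC3\xA9" decode to a single element, U+00E9, which is why a char* cannot simply be reinterpreted as text in a Unicode build.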

_C() = converts a wxString to a multibyte C string; see the wxWidgets help (wxMBConv classes overview).
It is defined in the code as:

  #if wxUSE_UNICODE
    #define _UU(x,y) wxString((x),(y))
    #define _CC(x,y) (x).mb_str((y))
  #else
    #define _UU(x,y) (x)
    #define _CC(x,y) (x)
  #endif
  #define _U(x) _UU((x),wxConvUTF8)
  #define _C(x) _CC((x),wxConvUTF8)

wxT() = fixed texts, such as XRC resource object names (it only adds an L before the string, and ONLY in a Unicode build).
wxT() is a macro which can be used with character and string literals (in other words, 'x' or "foo") to automatically convert them to Unicode in a Unicode build configuration. Please see the Unicode overview for more information.

In an ASCII build this macro simply returns the value passed to it unchanged. In fact, its definition is:

  #ifdef UNICODE
  #define wxT(x) L ## x
  #else // !Unicode
  #define wxT(x) x
  #endif
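
The effect of the `L ## x` token pasting can be tried without wxWidgets. In the sketch below, `DEMO_UNICODE`, `DEMO_T` and `demo_char` are hypothetical stand-ins for the Unicode build switch, `wxT` and `wxChar`; they are not names from wxWidgets.

```cpp
#define DEMO_UNICODE 1   // hypothetical build switch; set to 0 for the narrow build

#if DEMO_UNICODE
    #define DEMO_T(x) L ## x      // pastes an L prefix onto the literal
    typedef wchar_t demo_char;
#else
    #define DEMO_T(x) x           // leaves the literal untouched
    typedef char demo_char;
#endif

// The element type of each literal follows the build mode.
constexpr const demo_char* kName = DEMO_T("dlgGenericMultiSelect");
const demo_char kLetter = DEMO_T('f');
```

With `DEMO_UNICODE` set to 1, `kLetter` is a `wchar_t`; with 0, the same source line yields a plain `char`.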

_T() = fixed texts, such as XRC resource object names (it only adds an L before the string, and ONLY in a Unicode build).
This macro is exactly the same as wxT and is defined in wxWidgets simply because it may be more intuitive for Windows programmers as the standard Win32 headers also define it (as well as yet another name for the same macro which is _TEXT()).

Don't confuse this macro with _()!

  wxChar _T(char ch)
  const wxChar * _T(const char * s)

_() = texts that might be translated into other user languages
This macro expands into a call to the wxGetTranslation function, so it marks the message for extraction by xgettext just as wxTRANSLATE does, but it also returns the translation of the string for the current locale during execution.

Don't confuse this macro with _T()!

wxPLURAL = This macro is identical to _(), but for the plural variant of wxGetTranslation.
const wxChar * wxPLURAL(const char *sing, const char *plur, size_t n)
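
As an illustration of the singular/plural choice, the sketch below mimics what one would expect when no translation catalog is loaded: the singular form for n == 1, otherwise the plural form. `demo_plural` is a hypothetical stand-in, not the wxWidgets implementation, and a real catalog may apply language-specific plural rules instead.

```cpp
#include <cstddef>

// Hypothetical stand-in for the untranslated fallback of a plural lookup:
// English-style rule, singular only when n is exactly 1.
constexpr const char* demo_plural(const char* sing, const char* plur, std::size_t n) {
    return n == 1 ? sing : plur;
}
```

A call such as `demo_plural("%d file", "%d files", n)` would then pick the format string to use for the count n.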





Guidelines

char & wxChar:
Do not use wxChar for data that is not a text character: in a Unicode build a wxChar is a wide character (wchar_t), typically 16 bits on Windows and 32 bits on Linux, not 8 bits:

Example for text:

  wxChar im_a_character = _T('f');

Example for non-text data (not a character):

  unsigned char im_a_byte = 254;

but perhaps better would be to use:

  byte im_a_byte = 254;

so it's clear that it's a byte and not a character.
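
The size difference can be checked at compile time. This is a minimal sketch in standard C++ that uses `wchar_t` as a stand-in for a Unicode-build wxChar (an assumption; the exact width is platform dependent) and a hypothetical `demo_byte` typedef for raw data.

```cpp
#include <climits>

typedef unsigned char demo_byte;   // hypothetical typedef for raw, non-text data

// A byte value such as 254 fits in an unsigned char ...
const demo_byte im_a_byte = 254;

// ... but not in a signed 8-bit char, whose maximum is SCHAR_MAX.
static_assert(SCHAR_MAX < 254, "254 does not fit in a signed char");

// A wide character occupies more than one byte on common platforms,
// so it should not be used as a byte type.
static_assert(sizeof(wchar_t) > sizeof(demo_byte), "wchar_t is wider than a byte");
```

If either assumption did not hold on a given platform, the build would stop at the corresponding static_assert rather than silently misbehave.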


Other:
Problem code:

  // indent code accordingly
  wxString code = it->second;
  code.Replace("\n", '\n' + lineIndent);

Solution: If the input is a const char*, use "normal strings". If the input is a wxChar or wxString, use the _T("macros"). For example:

  // indent code accordingly
  wxString code = it->second;
  code.Replace(_T("\n"), _T('\n') + lineIndent);
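
The same rule (wide literals with wide strings) can be sketched in standard C++, with `std::wstring` standing in for a Unicode-build wxString. `replace_all` and `indent_lines` are hypothetical helpers written for this example; they are not wxString::Replace.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: replaces every occurrence of `from` with `to`.
inline std::wstring replace_all(std::wstring text, const std::wstring& from, const std::wstring& to) {
    assert(!from.empty());  // an empty search string would never advance
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::wstring::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();   // continue after the inserted text
    }
    return text;
}

// Indents every line after the first. All literals are wide (L"..."),
// matching the wide string type, as _T() would do in a Unicode build.
inline std::wstring indent_lines(const std::wstring& code, const std::wstring& lineIndent) {
    return replace_all(code, L"\n", L'\n' + lineIndent);
}
```

Here `indent_lines(L"a\nb", L"  ")` inserts the indent after each newline while leaving the first line untouched.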


Some of the strings already converted in C::B use _() where they should use _T().

Example:

  WRONG: wxXmlResource::Get()->LoadDialog(this, parent, _("dlgGenericMultiSelect"));

dlgGenericMultiSelect is a reference to a resource. Therefore it must use _T instead.

  RIGHT: wxXmlResource::Get()->LoadDialog(this, parent, _T("dlgGenericMultiSelect"));

And don't forget to test for single characters, too!



All operations with wxStrings (not char*'s) should have _("string") for strings to be displayed to the user, and _T("string") for strings used internally.


For printf-like functions, pass wxString arguments through c_str() (the examples on wxwidgets.org use different arguments for the Unicode and non-Unicode versions, even though the format string is "%s" in both). For example:

  tmpkey.Printf(_T("%s/editor/keywords/%d"), key.c_str(), i);
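
The reason for calling c_str() is that printf-style functions receive their arguments through C varargs, which cannot carry string objects. A minimal sketch with `std::string` and `std::snprintf` as stand-ins for wxString and Printf (both substitutions are assumptions made for illustration; `make_key` is a hypothetical helper):

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: builds a configuration key from a prefix and an index.
inline std::string make_key(const std::string& key, int i) {
    char buf[128];  // output longer than this is truncated by snprintf
    // %s expects a pointer to characters, so pass key.c_str(), not key itself.
    const int written = std::snprintf(buf, sizeof buf, "%s/editor/keywords/%d", key.c_str(), i);
    assert(written >= 0);  // a negative result signals a formatting error
    return std::string(buf);
}
```

Passing `key` directly instead of `key.c_str()` would compile on some compilers but has undefined behaviour, which is the mistake this guideline is meant to prevent.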


XRCID and XRCCTRL macros:
XRCID and XRCCTRL macros must _NOT_ be converted! They're pre-converted already!

  WRONG:   XRCCTRL(*this, _T("lblLabel"), wxStaticText)->SetLabel(label);


  RIGHT:   XRCCTRL(*this, "lblLabel", wxStaticText)->SetLabel(label);


concatenated strings:
_() is a macro which calls one of wxWidgets' internal functions, so the concatenated literals must all go inside a single call:

  _("string 1" "string2" ... )

The _T() macro simply adds 'L' before the string given as a parameter (in a Unicode build, of course; in a normal build it does nothing to the string), so each literal needs its own _T():

  _T("string1") _T("string2") ...
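
Both forms can be tried outside wxWidgets. In this sketch `DEMO_T` and `DEMO_TR` are hypothetical stand-ins for `_T()` in a Unicode build and for `_()`; `demo_translate` is an identity function rather than a real catalog lookup.

```cpp
// Hypothetical stand-in for _T() in a Unicode build: prefixes the literal with L.
#define DEMO_T(x) L ## x

// Hypothetical stand-in for the function behind _(): returns its argument unchanged.
constexpr const char* demo_translate(const char* s) { return s; }
#define DEMO_TR(s) demo_translate(s)

// Adjacent literals are joined by the compiler before the call is made,
// so the whole text goes inside a single DEMO_TR(...).
constexpr const char* kMessage = DEMO_TR("string 1" "string2");

// Each piece gets its own L prefix, then the wide literals are joined.
constexpr const wchar_t* kWide = DEMO_T("string1") DEMO_T("string2");
```

Writing `DEMO_TR("a") DEMO_TR("b")` instead would expand to two function calls placed side by side, which does not compile.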


This needs to be rewritten. If nobody else improves on this, I will try to rewrite it once I have used these macros more. Joe M.