Creating a custom lexer for Code::Blocks editor

From CodeBlocks
Revision as of 08:46, 22 January 2006

The files that add syntax highlighting support for specific file types are found under sdk/resources/lexers. They're simple XML files named lexer_*.xml.

Let's take lexer_cpp.xml as an example and dissect it.

Lexers

<Lexer name="C/C++"
       index="3"
       filemasks="*.c,*.cpp,*.cc,*.cxx,*.h,*.hpp,*.hh,*.hxx,*.inl">

Pretty much self-explanatory, except for the "magic" index number (we'll come to it in a sec).

  • name is the lexer's configuration name. This will appear in the editor's configuration dialog, in the languages drop-down box (on the colors editing page).
  • filemasks is a comma-separated list of the file masks that this lexer should be used for. Matching is case-insensitive.
  • index corresponds to one of the wxSCI_LEX_* constants, found in sdk/wxscintilla/include/wx/wxscintilla.h. In this example, index 3 matches wxSCI_LEX_CPP, which is the lexer id for C/C++ syntax highlighting.

If we were building a lexer configuration for, let's say, XML (a random choice), we would look up the constant wxSCI_LEX_XML, which is defined to be 5. So index=5. Simple.
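
To make the filemasks rule concrete, here is a minimal Python sketch of case-insensitive matching against a comma-separated mask list. It only illustrates the behaviour described above; it is not how Code::Blocks implements it, and the function name matches_filemasks is made up for this example.

```python
from fnmatch import fnmatchcase

# The filemasks attribute from lexer_cpp.xml, copied from the example above.
CPP_FILEMASKS = "*.c,*.cpp,*.cc,*.cxx,*.h,*.hpp,*.hh,*.hxx,*.inl"


def matches_filemasks(filename, filemasks):
    """Return True if filename matches any comma-separated mask, ignoring case.

    Hypothetical helper: both sides are lowercased so that "MAIN.CPP"
    matches "*.cpp", mirroring the case-insensitive rule in the text.
    """
    name = filename.lower()
    masks = [m.strip().lower() for m in filemasks.split(",") if m.strip()]
    return any(fnmatchcase(name, mask) for mask in masks)
```

For instance, matches_filemasks("MAIN.CPP", CPP_FILEMASKS) is true, while a file such as notes.txt is not picked up by this lexer.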

Styles

Next follow many <Style> tags defining the different styles:

       <Style name="Default"
              index="0"
              fg="0,0,0"
              bg="255,255,255"
              bold="0"
              italics="0"
              underlined="0" />
  • name is the style's name. It appears in the editor's configuration dialog, on the colors editing page.
  • fg is the foreground color: a comma-separated list of three numbers from 0 to 255, in the order red, green, blue (RGB).
  • bg is the background color, in the same format as fg.
  • bold is "0" for disabled, "1" for enabled.
  • italics is "0" for disabled, "1" for enabled.
  • underlined is "0" for disabled, "1" for enabled.

You don't have to define all of these attributes. It's good to define them all for the "default" style (all lexers have a default style), but only the attributes needed should be defined for the rest of the styles.
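
As a rough illustration of how these attributes could be read, the following Python sketch parses a <Style> tag with the standard xml.etree.ElementTree module. The helper names parse_rgb and read_style are hypothetical, and the sketch is not the Code::Blocks loader; it simply shows that attributes left out of the tag stay undefined.

```python
import xml.etree.ElementTree as ET

DEFAULT_STYLE = (
    '<Style name="Default" index="0" fg="0,0,0" bg="255,255,255" '
    'bold="0" italics="0" underlined="0" />'
)
COMMENT_STYLE = '<Style name="Comment (normal)" index="1,2" fg="160,160,160" />'


def parse_rgb(text):
    """Turn "r,g,b" into a tuple of three ints, each between 0 and 255."""
    parts = [int(p) for p in text.split(",")]
    if len(parts) != 3 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError("expected three numbers from 0 to 255: %r" % text)
    return tuple(parts)


def read_style(xml_text):
    """Collect only the attributes that the <Style> tag actually defines."""
    elem = ET.fromstring(xml_text)
    style = {"name": elem.get("name")}
    for key in ("fg", "bg"):
        if elem.get(key) is not None:
            style[key] = parse_rgb(elem.get(key))
    for key in ("bold", "italics", "underlined"):
        if elem.get(key) is not None:
            style[key] = elem.get(key) == "1"  # "1" enabled, "0" disabled
    return style
```

Reading COMMENT_STYLE this way yields only a name and an fg color, which matches the advice to define just the attributes a style needs.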

  • The index number in the <Style> tags comes from a different set of constants defined in sdk/wxscintilla/include/wx/wxscintilla.h.

For each language supported by Scintilla, there is a set of styles (lexical states) defined; these are what we're trying to configure with these files.

For example, for C/C++ files (wxSCI_LEX_CPP, remember?) the styles are defined as wxSCI_C_*.

For the "default" style shown above, this would be wxSCI_C_DEFAULT which is defined to be 0. Hence index=0 for "default".

       <Style name="Comment (normal)"
              index="1,2"
              fg="160,160,160" />

This is the style definition for normal comments. As you can see, a single style can be defined for more than one style index, in this case two: 1 and 2 (always comma-separated).

1 is for wxSCI_C_COMMENT (the C comment /* */) and 2 is for wxSCI_C_COMMENTLINE (the C++ comment to end of line // ).
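
The mapping from a comma-separated index attribute to the underlying constants can be sketched in Python as follows. The table holds only the three wxSCI_C_* values mentioned in this article; the names C_STYLE_NAMES, expand_indices and style_constants are invented for the example.

```python
# Only the constants named in the text; the header file defines many more.
C_STYLE_NAMES = {
    0: "wxSCI_C_DEFAULT",
    1: "wxSCI_C_COMMENT",      # C comment /* */
    2: "wxSCI_C_COMMENTLINE",  # C++ comment to end of line //
}


def expand_indices(index_attr):
    """Split an index attribute such as "1,2" into a list of ints."""
    return [int(part) for part in index_attr.split(",")]


def style_constants(index_attr):
    """Look up the constant name for each index (hypothetical helper)."""
    return [C_STYLE_NAMES.get(i, "unknown") for i in expand_indices(index_attr)]
```

So the "Comment (normal)" style with index="1,2" applies the same colors to both comment states.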


Some special styles are defined by Code::Blocks and are available to all lexers:

  • index -99 is the selected text style.
  • index -98 is the active line style (the line the caret is on).
  • index -2 is the breakpoint line style.
  • index -3 is the debugger active line style (while stepping the debugger).
  • index -4 is the compiler warning/error line style. (Note: this index was removed completely?)
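
For quick reference, the special indices listed above can be kept in a small lookup table. This Python sketch is only a restatement of the list; the names SPECIAL_STYLES and describe_index are hypothetical, and index -4 is included only because the list mentions it (it may no longer exist).

```python
# Special style indices defined by Code::Blocks, as listed in the text.
SPECIAL_STYLES = {
    -99: "selected text",
    -98: "active line",
    -2: "breakpoint line",
    -3: "debugger active line",
    -4: "compiler warning/error line",  # possibly removed, see the note above
}


def describe_index(index):
    """Say whether an index is a special style or a lexer-specific one."""
    if index in SPECIAL_STYLES:
        return "special: " + SPECIAL_STYLES[index]
    return "lexer style %d" % index
```

Non-negative indices fall through to the lexer-specific styles such as wxSCI_C_DEFAULT.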

Keywords

Now on to the keywords.

       <Keywords>
              <Language index="0"
                        value="if int long try while and-so-on" />
              <User index="1" />
              <Documentation index="2"
                             value="param remarks return $ @ \ &amp; &lt; &gt; # { } and-so-on" />
       </Keywords>

If the language you're defining a lexer configuration for has keywords, they should be added in the <Keywords> tag. This tag can contain the following tags:

<Language>, <User> and <Documentation>.

  • Language contains the language keywords. These are usually at index 0.
  • User is not used right now but might be in the future.
  • Documentation contains the documentation keywords (if any). If you look at the lexer_cpp.xml file, you'll see that the documentation keywords defined are those of Doxygen.
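
The following Python sketch shows one way the keyword sets could be read back, assuming each value is a whitespace-separated list and that the characters & and < are written as XML entities inside the attribute. The keyword lists are shortened, and read_keyword_sets is a made-up helper, not part of Code::Blocks.

```python
import xml.etree.ElementTree as ET

# Shortened keyword lists; & and < must be escaped inside XML attributes.
KEYWORDS_XML = r"""
<Keywords>
    <Language index="0" value="if int long try while" />
    <User index="1" />
    <Documentation index="2" value="param remarks return $ @ \ &amp; &lt; &gt; # { }" />
</Keywords>
"""


def read_keyword_sets(xml_text):
    """Map each keyword set index to its list of keywords."""
    root = ET.fromstring(xml_text.strip())
    sets = {}
    for child in root:
        # A tag without a value attribute (like <User>) gives an empty list.
        sets[int(child.get("index"))] = child.get("value", "").split()
    return sets
```

Here set 0 holds the language keywords, set 1 is empty, and set 2 holds the documentation keywords with the entities decoded back to & and <.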


Sample Code

The only tag left is SampleCode. This is pretty much self-explanatory:

       <SampleCode value="lexer_cpp.sample"
                   breakpoint_line="20"
                   debug_line="22"
                   error_line="23"/>
  • value is the filename of the code that will be shown in the Preview window.

When creating a lexer_*.sample file, keep the code simple (like a typical "Hello world!" program), yet include all the keywords of the lexer, so the user can preview how they'll be styled.

There are also some optional attributes (Note: this index was removed completely?)

  • breakpoint_line is the number of the line on which a breakpoint line will be previewed.
  • debug_line is the number of the line on which a debug line will be previewed.
  • error_line is the number of the line on which an error line will be previewed.
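
When writing a sample file, it can help to check that the preview line numbers actually fall inside it. The Python sketch below does that under the assumption that line numbers are counted from 1, which this article does not state; out_of_range_lines is a hypothetical helper.

```python
PREVIEW_ATTRS = ("breakpoint_line", "debug_line", "error_line")


def out_of_range_lines(sample_text, attrs):
    """Return the preview attributes that point outside the sample file.

    Assumes 1-based line numbers (an assumption, not confirmed by the text).
    Attributes missing from attrs are skipped, since they are optional.
    """
    line_count = len(sample_text.splitlines())
    bad = []
    for key in PREVIEW_ATTRS:
        if key in attrs and not 1 <= int(attrs[key]) <= line_count:
            bad.append(key)
    return bad
```

With the values from the example above (20, 22 and 23), a sample file of only 22 lines would flag error_line as out of range.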