Difference between revisions of "Creating a custom lexer for Code::Blocks editor"

From Code::Blocks
Line 2: Line 2:
  
 
Let's take <tt>lexer_cpp.xml</tt> as an example and disect it.
 
Let's take <tt>lexer_cpp.xml</tt> as an example and disect it.
 +
 +
==XML==
 +
 +
<?xml version="1.0"?>
 +
 +
Basically this says "I am an XML file". '''Very Important'''
 +
 +
==DOCTYPE==
 +
 +
<!DOCTYPE CodeBlocks_lexer_properties>
 +
 +
This needs to be in every lexer file. Code::Blocks '''will not''' load the lexer if this is not present.
  
 
==Lexers==
 
==Lexers==

Revision as of 01:05, 3 February 2006

The files that add syntax highlighting support for specific files are found under sdk/resources/lexers. They're simple XML files named as lexer_*.xml.

Let's take lexer_cpp.xml as an example and disect it.

XML

<?xml version="1.0"?>

Basically this says "I am an XML file". Very Important

DOCTYPE

<!DOCTYPE CodeBlocks_lexer_properties>

This needs to be in every lexer file. Code::Blocks will not load the lexer if this is not present.

Lexers

<Lexer name="C/C++"
       index="3"
       filemasks="*.c,*.cpp,*.cc,*.cxx,*.h,*.hpp,*.hh,*.hxx,*.inl">

Pretty much self explanatory, except for the "magic" index number (we'll come to it in a sec).

  • name is the lexer's configuration name. This will appear in the editor's configuration dialog, in the languages drop down box (in colors editing page).
  • filemasks is a comma separated list of the extensions that this lexer should be used for. This is case-insensitive.
  • index corresponds with the wxSCI_LEX_* constants, found in sdk/wxscintilla/include/wx/wxscintilla.h. In this example, if you look in sdk/wxscintilla/include/wx/wxscintilla.h, you'll see that index 3 matches wxSCI_LEX_CPP. That is the lexer id for C/C++ syntax highlighting.

If we were building a lexer configuration for let's say, XML (random choice) we would look up the constant wxSCI_LEX_XML which is defined to be number 5. So index=5. Simple.

Styles

Next follows many <Style> tags defining the different styles:

       <Style name="Default"
              index="0"
              fg="0,0,0"
              bg="255,255,255"
              bold="0"
              italics="0"
              underlined="0" />
  • name is the style's name. It appears in the editor's configuration dialog, in the colors editing page.
  • fg is the foreground color. Comma separated list of three numbers from 0 to 255. In order: red, green and blue (RGB).
  • bg is the background color.
  • bold is "0" for disabled, "1" for enabled.
  • italics is "0" for disabled, "1" for enabled.
  • underlined is "0" for disabled, "1" for enabled.

You don't have to define all of these attributes. It's good to define them all for the "default" style (all lexers have a default style), but only the attributes needed should be defined for the rest of the styles.

  • The index number in the <Style> tags, comes from a different set of constants defined in sdk/wxscintilla/include/wx/wxscintilla.h.

For each language supported by scintilla, there is a set of styles (lexical states) defined (these are what we're trying to configure with these files).

For example, for C/C++ files (wxSCI_LEX_CPP, remember?) the styles are defined as wxSCI_C_*.

For the "default" style shown above, this would be wxSCI_C_DEFAULT which is defined to be 0. Hence index=0 for "default".

       <Style name="Comment (normal)"
              index="1,2"
              fg="160,160,160" />

This is the style definition for normal comments. As you can see you can define a single style for more than one style index, in this case two: 1 and 2 (always comma separated).

1 is for wxSCI_C_COMMENT (the C comment /* */) and 2 is for wxSCI_C_COMMENTLINE (the C++ comment to end of line // ).


There are some special styles defined by Code::Blocks and are available to all lexers:

  • index -99 is the selected text style.
  • index -98 is the active line style (the line the caret is on).
  • index -2 is the breakpoint line style.
  • index -3 is the debugger active line style (while stepping the debugger).
  • index -4 is the compiler warning/error line style. (Note: this index was removed completely?)

Keywords

Now on to the keywords.

       <Keywords>
              <Language index="0"
                        value="if int long try while and-so-on" />
              <User index="1" />
              <Documentation index="2"
                             value="param remarks return $ @ \ & < > # { } and-so-on" />
       </Keywords>

If the language you're defining a lexer configuration for, has keywords they should be added in the <Keywords> tag. This tag can contain the following tags:

<Language>, <User> and <Documentation>.

  • Language contains the language keywords. These are usually at index 0.
  • User is not used right now but might be in the future.
  • Documentation contains the documentation keywords (if any). If you look at the lexer_cpp.xml file, you'll see that the documentation keywords defined are those of Doxygen.


Sample Code

The tag left is SampleCode. This is much pretty self explanatory:

       <SampleCode value="lexer_cpp.sample"
                   breakpoint_line="20"
                   debug_line="22"
                   error_line="23"/>
  • value is the filename of the code that will be shown in the Preview window.

When creating a lexer_*.sample try to do it with simple and concise sample code (like the ones found in a typical "Hello world!"), yet include all the keywords of the lexer.

There are other optional options (Note: this index was removed completely?)

  • breakpoint_line is the number of the line in which a breakpoint line will be previewed.
  • debug_line is the number of the line in which a debug line will be previewed.
  • error_line is the number of the line in which an error line will be previewed.


Adding support for code-folding

Note: support for code-folding can't be done from the lexer files. It must be done right in the C::B code.

See the XML or C/C++ code for code-folding for an example of how to do it.

This might help also: http://sphere.sourceforge.net/flik/docs/scintilla-folding.html

Adding support for a lexer not supported in Scintilla

Note: support for a lexer not supported in Scintilla is out of scope of Code::Blocks. It must be done right in the Scintilla code.

Here are some instructions: http://scintilla.sourceforge.net/Lexer.txt

  • After that, make any necesary change to wxScintilla and sumbit the files to the wxScintilla tracker or send a mail to the autor wyo@users.sourceforge.net (Otto Wyss).