Difference between revisions of "Creating a custom lexer for Code::Blocks editor"

Revision as of 06:36, 22 January 2006

The files that add syntax highlighting support for specific files are found under sdk/resources/lexers. They're simple XML files named as lexer_*.xml.

Let's take lexer_cpp.xml as an example and dissect it.

<Lexer name="C/C++" index="3" filemasks="*.c,*.cpp,*.cc,*.cxx,*.h,*.hpp,*.hh,*.hxx,*.inl">

Pretty much self-explanatory, except for the "magic" index number (we'll come to it in a sec).

  • The name is the lexer's configuration name. This will appear in the editor's configuration dialog, in the languages drop-down box (in the colors editing page).
  • The filemasks is a comma-separated list of the file masks that this lexer should be used for. This is case-insensitive.
  • The index corresponds with the wxSTC_LEX_* constants, found in wx/stc/stc.h. In this example, if you look in wx/stc/stc.h, you'll see that index 3 matches wxSTC_LEX_CPP, the lexer id for C/C++ syntax highlighting.

If we were building a lexer configuration for, let's say, XML (random choice), we would look up the constant wxSTC_LEX_XML, which is defined to be number 5. So index=5. Simple.

Next follow many <Style> tags defining the different styles:
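
As a minimal sketch of the XML case just described, the opening tag might look like the following. The index value 5 comes from the text above; the name and the file masks are assumptions chosen for illustration, not copied from a shipped lexer file.

```xml
<!-- Hypothetical opening tag for an XML lexer configuration.
     index="5" is wxSTC_LEX_XML; name and filemasks are illustrative. -->
<Lexer name="XML" index="5" filemasks="*.xml,*.xsl">
```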

<Style name="Default" index="0" fg="0,0,0" bg="255,255,255" bold="0" italics="0" underlined="0"/>

  • name is the style's name. It appears in the editor's configuration dialog, in the colors editing page.
  • fg is the foreground color: a comma-separated list of three numbers from 0 to 255, in the order red, green, blue (RGB).
  • bg is the background color, in the same format.
  • bold is "0" for disabled, "1" for enabled. The same goes for italics and underlined.

You don't have to define all of these attributes. It's good to define them all for the "default" style (all lexers have a default style), but only the attributes needed should be defined for the rest of the styles.

  • The index number in the <Style> tags comes from a different set of constants defined in wx/stc/stc.h. For each language supported by Scintilla, there is a set of styles defined (these are what we're trying to configure with these files). For example, for C/C++ files (wxSTC_LEX_CPP, remember?) the styles are defined as wxSTC_C_*.

For the "default" style shown above, this would be wxSTC_C_DEFAULT which is defined to be 0. Hence index=0 for "default".

<Style name="Comment (normal)" index="1,2" fg="160,160,160"/>

This is the style definition for normal comments. As you can see, a single style can cover more than one style index, in this case two: 1 and 2 (always comma-separated).

1 is for wxSTC_C_COMMENT (the C comment /* */) and 2 is for wxSTC_C_COMMENTLINE (the C++ comment to end of line // ).
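
To illustrate the earlier point about defining only the attributes a style needs, here is a sketch of two more C/C++ styles. It assumes that wxSTC_C_WORD is 5 and wxSTC_C_STRING is 6 in wx/stc/stc.h (check your copy of the header); the names and colors are made up for the example.

```xml
<!-- Keywords (assumed wxSTC_C_WORD = 5): only fg and bold are set. -->
<Style name="Keyword" index="5" fg="0,0,160" bold="1"/>
<!-- Strings (assumed wxSTC_C_STRING = 6): only fg is set. -->
<Style name="String" index="6" fg="0,0,255"/>
```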

I just want to add that there are some special styles that are defined by Code::Blocks and are available to all lexers:

  • Index -99: the selected text style
  • Index -98: the active line style (the line the caret is on)
  • Index -2: the breakpoint line style
  • Index -3: the debugger active line style (while stepping the debugger)
  • Index -4: the compiler warning/error line style
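
These special indices go into ordinary <Style> tags like any other. A rough sketch, where the index values come from the list above and the names and colors are assumptions for illustration only:

```xml
<!-- Special Code::Blocks styles; names and colors are illustrative. -->
<Style name="Selection" index="-99" bg="192,192,192"/>
<Style name="Active line" index="-98" bg="255,255,160"/>
<Style name="Breakpoint line" index="-2" bg="255,160,160"/>
```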


Now on to the keywords. If the language you're defining a lexer configuration for has keywords, they should be added in the <Keywords> tag. This tag can contain the following tags:

<Language>, <User> and <Documentation>.

  • Language contains the language keywords. These are usually at index 0.
  • User is not used right now but might be in the future.
  • Documentation contains the documentation keywords (if any). If you look at the lexer_cpp.xml file, you'll see that the documentation keywords defined are those of doxygen.
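
Put together, a keywords section might look roughly like the sketch below. The tag names and index 0 for <Language> come from the text above; the value attribute, the index used for <Documentation>, and the shortened keyword lists are assumptions, so compare against the real lexer_cpp.xml before relying on them.

```xml
<Keywords>
    <!-- Language keywords, usually at index 0 (list shortened here). -->
    <Language index="0" value="if else for while return"/>
    <!-- Documentation keywords; the index and the list are assumptions. -->
    <Documentation index="2" value="author brief param return see"/>
</Keywords>
```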