Difference between revisions of "Code Completion Design"
Revision as of 03:01, 25 February 2009
How to build
Get the source code
When you check out the Code::Blocks source code from SVN (see Installing_Code::Blocks_from_source_on_Windows#Code::Blocks_sources), the source code of the CodeCompletion plugin is already included.
The screenshot below shows this code opened in Code::Blocks under Windows.
Build the CodeCompletion plugin
Note that you should use "update.bat" to copy the newly generated DLL to its destination and strip the debug information. Here is a modified batch file that only updates codecompletion.dll.
    @echo off
    setlocal
    echo Creating output directory tree
    set CB_DEVEL_RESDIR=devel\share\CodeBlocks
    set CB_OUTPUT_RESDIR=output\share\CodeBlocks
    set ZIPCMD=zip
    xcopy /D /y %CB_DEVEL_RESDIR%\plugins\codecompletion.dll %CB_OUTPUT_RESDIR%\plugins\codecompletion.dll
    echo Stripping debug info from output tree
    strip %CB_OUTPUT_RESDIR%\plugins\codecompletion.dll
See Installing_Code::Blocks_from_source_on_Windows for more information.
Low level parser
If you are not familiar with the terms "token" and "tokenize", read the Wikibooks article "A brief explain of what does a parser do" and the Tokenize article on Wikipedia. In short, a parser treats your C or C++ code as one large array of characters and divides this big string into small atomic strings (tokens), ignoring whitespace and comments.
For example, take the simple C++ program below:
    int main()
    {
        std::cout << "hello world" << std::endl;
        return 0;
    }
After tokenizing, it should give these 19 tokens:
    1 = string "int"
    2 = string "main"
    3 = opening parenthesis
    4 = closing parenthesis
    5 = opening brace
    6 = string "std"
    7 = namespace operator
    8 = string "cout"
    9 = << operator
    10 = string ""hello world""
    11 = << operator
    12 = string "std"
    13 = namespace operator
    14 = string "endl"
    15 = semicolon
    16 = string "return"
    17 = number 0
    18 = semicolon
    19 = closing brace
Tokenizer class
A class named "Tokenizer" is implemented in "tokenizer.h" and "tokenizer.cpp". Running the Tokenizer class involves several steps.
parser thread
A thread must be created to parse a source file. See parserthread.cpp and parserthread.h.
Read a source file
Open the source file and convert the file buffer to Unicode (we all use the Unicode build of Code::Blocks, and the ANSI build is outdated).
Get or Peek a token
The class keeps a pointer to the current character position, and from there you can Get or Peek a token.
    // Get the current token and increase the token index
    wxString GetToken();
    // Peek the current token and do NOT increase the index
    wxString PeekToken();
For example, suppose the Tokenizer is parsing the example code above.
After the Tokenizer is initialized, the first call to GetToken() returns the wxString "int" and advances the token index so that it points to "main". Calling PeekToken() at this point returns the wxString "main", but the token index still points to "main". If you call GetToken() again, it returns "main" and advances the token index.
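The difference between the two calls can be sketched with a small stand-in class. TokenCursor is a hypothetical name: it walks a token list that has already been split and uses std::string where the plugin uses wxString, whereas the real Tokenizer reads from the file buffer.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Tokenizer's Get/Peek interface.
class TokenCursor
{
public:
    explicit TokenCursor(std::vector<std::string> tokens)
        : m_Tokens(std::move(tokens)), m_Index(0) {}

    // Return the current token and advance the index.
    std::string GetToken()
    {
        if (m_Index >= m_Tokens.size())
            return std::string(); // empty string marks the end of input
        return m_Tokens[m_Index++];
    }

    // Return the current token without advancing the index.
    std::string PeekToken() const
    {
        if (m_Index >= m_Tokens.size())
            return std::string();
        return m_Tokens[m_Index];
    }

    // Index of the token that the next GetToken() call will return.
    std::size_t Index() const { return m_Index; }

private:
    std::vector<std::string> m_Tokens;
    std::size_t m_Index;
};
```

With the tokens "int", "main", "(", ")", GetToken() returns "int" and moves the index to 1, PeekToken() then returns "main" and leaves the index at 1, and the next GetToken() returns "main" and moves the index to 2.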
Nested Value
This value is kept to track which brace pair you are currently in. When the Tokenizer meets a {, it increases nestValue, and when it meets a }, it decreases nestValue.
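A minimal sketch of this counting, assuming the tokens are already available as strings. The function name NestLevels is hypothetical; the real Tokenizer updates its nest value while it reads characters.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the brace nesting level in effect
// after each token has been consumed.
std::vector<int> NestLevels(const std::vector<std::string>& tokens)
{
    std::vector<int> levels;
    int nestLevel = 0;
    for (const std::string& tok : tokens)
    {
        if (tok == "{")
            ++nestLevel;      // entering a brace pair
        else if (tok == "}")
            --nestLevel;      // leaving a brace pair
        levels.push_back(nestLevel);
    }
    return levels;
}
```

A level of 0 means the token sits at file scope, 1 means it is inside one brace pair, and so on.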
Return a correct token
Some special tokens must be replaced so that the code is parsed correctly. For example, the standard C++ headers (MinGW) contain a token named "_GLIBCXX_STD", which should be replaced with "std". See the dialog below.
An inline function in the Tokenizer class checks whether a token should be replaced before it is returned.
    // This is a map: check the first string and return the second string
    inline const wxString& ThisOrReplacement(const wxString& str) const
    {
        ConfigManagerContainer::StringToStringMap::const_iterator it = s_Replacements.find(str);
        if (it != s_Replacements.end())
            return it->second;
        return str;
    }
The replacement mapping can be changed in the settings. Note that too many replacement mappings will slow down parsing.
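The same lookup pattern can be tried outside Code::Blocks with the standard library. This sketch assumes std::map and std::string in place of ConfigManagerContainer::StringToStringMap and wxString; the names ReplacementMap and LookupReplacement are hypothetical.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the plugin's string-to-string map.
using ReplacementMap = std::map<std::string, std::string>;

// Return the replacement for str if one is registered, otherwise str itself.
// The result refers either to the map entry or to the argument, so both
// must outlive any use of the returned reference.
const std::string& LookupReplacement(const ReplacementMap& replacements,
                                     const std::string& str)
{
    ReplacementMap::const_iterator it = replacements.find(str);
    if (it != replacements.end())
        return it->second;
    return str;
}
```

With a map containing the pair ("_GLIBCXX_STD", "std"), looking up "_GLIBCXX_STD" gives "std", while any unregistered token such as "vector" comes back unchanged. Each lookup costs one map search per token, which is why a long replacement list adds up over a large source tree.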
High level parser
Token
To speed up the allocation of Tokens, the "new" and "delete" operators are overloaded in its base class: "class Token : public BlockAllocated<Token, 10000>". The BlockAllocated class has only one static member, "static BlockAllocator<T, pool_size, debug> allocator;", which holds all the pre-allocated memory for the derived class. The value 10000 means that memory for 10000 Tokens is pre-allocated.
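The idea of pooling objects behind overloaded "new" and "delete" can be sketched as follows. This is a simplified illustration, not the SDK's implementation: PoolAllocator, PoolAllocated, DemoToken and BlockCount are hypothetical names, the pool lives in a function-local static rather than a static data member, and the sketch assumes only the class itself (not further derived classes) is allocated through it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pool: hands out fixed-size slots from blocks of PoolSize slots.
template <class T, std::size_t PoolSize>
class PoolAllocator
{
    union Slot
    {
        Slot* next;                                   // link while the slot is free
        alignas(T) unsigned char storage[sizeof(T)];  // payload while in use
    };

    std::vector<Slot*> m_Blocks;
    Slot* m_Free = nullptr;

    // Allocate one more block and push all of its slots on the free list.
    void Grow()
    {
        Slot* block = new Slot[PoolSize];
        m_Blocks.push_back(block);
        for (std::size_t i = 0; i < PoolSize; ++i)
        {
            block[i].next = m_Free;
            m_Free = &block[i];
        }
    }

public:
    ~PoolAllocator()
    {
        for (Slot* block : m_Blocks)
            delete[] block;
    }

    void* Allocate()
    {
        if (!m_Free)
            Grow();
        Slot* slot = m_Free;
        m_Free = slot->next;
        return slot;
    }

    void Release(void* p)
    {
        if (!p)
            return;
        Slot* slot = static_cast<Slot*>(p);
        slot->next = m_Free;
        m_Free = slot;
    }

    std::size_t BlockCount() const { return m_Blocks.size(); }
};

// Hypothetical base class: routes new/delete of T through the pool.
template <class T, std::size_t PoolSize>
class PoolAllocated
{
    static PoolAllocator<T, PoolSize>& Pool()
    {
        static PoolAllocator<T, PoolSize> pool;
        return pool;
    }

public:
    static void* operator new(std::size_t size)
    {
        assert(size == sizeof(T)); // sketch only supports T itself
        (void)size;
        return Pool().Allocate();
    }

    static void operator delete(void* p) { Pool().Release(p); }

    static std::size_t BlockCount() { return Pool().BlockCount(); }
};

// Small demo type; a pool size of 4 keeps the behaviour easy to observe.
struct DemoToken : PoolAllocated<DemoToken, 4>
{
    int kind = 0;
};
```

Creating objects with "new DemoToken" only touches the heap once per four objects, and a deleted object's slot is reused by the next allocation. A larger pool size trades memory held in reserve for fewer heap requests.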
TokenTree
Each identifier will be recorded in the TokenTree for later usage.
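As a rough illustration of why a tree is convenient here, the sketch below records identifiers in a simple prefix tree and lists every identifier that starts with a given prefix, which is the basic query behind a completion list. IdentifierTree is a hypothetical class; the plugin's TokenTree is more elaborate, and nothing below is taken from its source.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical prefix tree for identifiers (not the plugin's TokenTree).
class IdentifierTree
{
    struct Node
    {
        bool isEnd = false;                              // an identifier ends here
        std::map<char, std::unique_ptr<Node>> children;  // ordered by character
    };

    Node m_Root;

    // Depth-first walk that appends every identifier below node to out.
    static void Collect(const Node& node, std::string& current,
                        std::vector<std::string>& out)
    {
        if (node.isEnd)
            out.push_back(current);
        for (const auto& entry : node.children)
        {
            current.push_back(entry.first);
            Collect(*entry.second, current, out);
            current.pop_back();
        }
    }

public:
    // Record one identifier.
    void Insert(const std::string& name)
    {
        Node* node = &m_Root;
        for (char ch : name)
        {
            std::unique_ptr<Node>& child = node->children[ch];
            if (!child)
                child.reset(new Node());
            node = child.get();
        }
        node->isEnd = true;
    }

    // Return all recorded identifiers that start with prefix, in sorted order.
    std::vector<std::string> FindByPrefix(const std::string& prefix) const
    {
        std::vector<std::string> result;
        const Node* node = &m_Root;
        for (char ch : prefix)
        {
            auto it = node->children.find(ch);
            if (it == node->children.end())
                return result; // no identifier has this prefix
            node = it->second.get();
        }
        std::string current = prefix;
        Collect(*node, current, result);
        return result;
    }
};
```

After inserting "GetToken", "PeekToken", "GetChar" and "main", a query for the prefix "Get" returns "GetChar" and "GetToken".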
UI issue
Debug Log output
If you want to debug your plug-in, you may need to log debug information. Usually the following call is enough:
    Manager::Get()->GetLogManager()->DebugLog(_("XXXXX "));
You also need to start Code::Blocks with the corresponding command line argument. For example, on Windows:
codeblocks.exe --debug-log
A Code::Blocks debug panel is then shown to display the log.
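Useful Links
A discussion on the search tree in the forum: [/index.php/topic,1696.0.html] and [/index.php/topic,1581.0.html]
Another open-source IDE: VCF builder [http://vcfbuilder.org/?q=node/139] or vcfbuilder on sf [https://sourceforge.net/projects/vcfbuilder/], which is based on the antlr parser [http://www.antlr.org/].
An online book: Data Structures and Algorithms II Course [http://www.macs.hw.ac.uk/~alison/alg/details.html] and pdf lectures [http://www.macs.hw.ac.uk/~alison/alg/lectures.html]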