FS#1605 - Regex "Leak" in DokuWiki Lexer as result of Mode Caching
Opened by Slash Bunny (Slash) - Wednesday, 16 September 2009, 17:02 GMT+1
Details
I have discovered what I call a regex “leak”. Every call to dokuwiki_TextFormatter::render() (done by templates/details.tabs.comment.tpl) causes each of DokuWiki’s rendering mode lexers to append more and more regular expressions to its internal pattern array. They are duplicate patterns, not new ones. Under normal circumstances, the number of patterns should not change while Flyspray/DokuWiki is rendering comments on a task. Eventually, the compound regular expression that Doku_LexerParallelRegex::_getCompoundedRegex() builds by joining all these patterns together gets so large that PHP’s PCRE engine explodes and throws an error like this:
After this point, no more rendering is done, as the regex that Doku_LexerParallelRegex::_getCompoundedRegex() tries to build simply fails over and over.

How many comments trigger this behavior seems to vary, and I’m not sure why. I have seen it on tasks with about 30 comments. Then again, I have seen tasks approximately that long that did not “overflow” with regexes yet. But it is definitely an issue that only affects tasks with a lot of comments.

Tracking down this issue is complicated by the way Flyspray caches instructions. To quickly confirm this is happening, you will need to modify a file and view a non-cached Task page. The change below outputs the number of regexes on the “base” mode, but note that it happens to all registered modes.

File: plugins/dokuwiki/dokuwiki_formattext.inc.php
Function: render()

    // Add modes to parser
    foreach($modes as $mode){
        $Parser->addMode($mode['mode'], $mode['obj']);
    }
    $instructions = $Parser->parse($text);
    // Output number of patterns in the base mode's lexer
    echo count($Parser->modes['base']->Lexer->_regexes["base"]->_patterns);

On my installation, successive calls echo 55, 66, 77, 88, 99, 110, etc. The exact numbers will most likely vary. However, if I remove the Flyspray “caching”, like so:

File: plugins/dokuwiki/inc/parserutils.php
Function: p_get_parsermodes()

/*
//reuse old data
static $modes = null;
if($modes != null){
return $modes;
}
*/
The result is 55, 55, 55, 55, 55, etc., as expected.

I don’t know if this is a Flyspray bug or a DokuWiki bug. For my install, I just commented out the “mode caching” in p_get_parsermodes(), as above, to resolve the problem. However, this only fixes the symptom and not the real problem.

If you var_dump() the _patterns array (instead of doing a count()), you’ll see that the duplicates all come from the externallink rendering mode. The patterns all include regexes looking for http, https, ftp, gopher, telnet, etc. So it’s possible it’s a bug in Doku_Parser_Mode_externallink specifically.
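Reading a var_dump() of a few hundred patterns by eye is tedious, so a small helper that tallies repeats may be quicker. The following is only a sketch: sketch_duplicate_patterns() is a hypothetical name, and the sample array is made-up data standing in for the real _patterns array, which I am assuming is a flat list of pattern strings.

```php
<?php
// Hypothetical helper (not part of Flyspray or DokuWiki): report which
// entries in a list of pattern strings occur more than once.
function sketch_duplicate_patterns(array $patterns) {
    // array_count_values() maps each distinct string to its number of occurrences
    $counts = array_count_values($patterns);
    $duplicates = array();
    foreach ($counts as $pattern => $count) {
        if ($count > 1) {
            $duplicates[$pattern] = $count;
        }
    }
    return $duplicates;
}

// Made-up sample data; the real input would be the lexer's _patterns array.
$sample = array('\bhttp://\S+', '\bftp://\S+', '\bhttp://\S+', '\*\*', '\bhttp://\S+');
$report = sketch_duplicate_patterns($sample);
print_r($report);
```

On a live install, the argument would presumably be the same expression used in the count() call above, with the count() wrapper removed.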
Actually, the more I look into this, the more I think it’s a DokuWiki issue. But I have only been looking at this code for a few days, so I’m not sure exactly how the DokuWiki integration was done.
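One mechanism that would fit the numbers above, offered as a guess rather than a confirmed diagnosis: if a cached mode object rebuilds its own pattern list on every render without clearing it first, each render hands a longer list to a fresh lexer. The sketch below models that with hypothetical classes (SketchLexer, SketchLinkMode, sketch_render); none of them are DokuWiki's real code, and the three schemes are placeholders.

```php
<?php
// Hypothetical stand-ins for illustration only; not DokuWiki's classes.
class SketchLexer {
    public $patterns = array();
    public function addPattern($pattern) {
        $this->patterns[] = $pattern;
    }
}

class SketchLinkMode {
    public $patterns = array();

    // Builds the mode's own pattern list. Without the guard, every call
    // appends the same patterns again to the same (cached) object.
    public function preConnect($guard) {
        if ($guard && count($this->patterns) > 0) {
            return; // list was already built on an earlier render
        }
        foreach (array('http', 'https', 'ftp') as $scheme) {
            $this->patterns[] = '\b' . $scheme . '://\S+';
        }
    }

    // Copies every pattern the mode currently holds into the lexer.
    public function connectTo(SketchLexer $lexer) {
        foreach ($this->patterns as $pattern) {
            $lexer->addPattern($pattern);
        }
    }
}

// One "render": a fresh lexer, but a mode object reused from a cache.
// Returns how many patterns the lexer ends up with.
function sketch_render(SketchLinkMode $cachedMode, $guard) {
    $lexer = new SketchLexer();
    $cachedMode->preConnect($guard);
    $cachedMode->connectTo($lexer);
    return count($lexer->patterns);
}

$leakyMode = new SketchLinkMode();
$guardedMode = new SketchLinkMode();
$leakyCounts = array();
$guardedCounts = array();
for ($i = 0; $i < 3; $i++) {
    $leakyCounts[] = sketch_render($leakyMode, false);
    $guardedCounts[] = sketch_render($guardedMode, true);
}

echo implode(', ', $leakyCounts), "\n";   // 3, 6, 9
echo implode(', ', $guardedCounts), "\n"; // 3, 3, 3
```

If this is what is happening, the unguarded case grows by a fixed amount per render, much like the 55, 66, 77 sequence, and a guard in the mode itself would let the mode cache stay enabled. Whether the real externallink mode behaves this way would need to be checked against its source.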