Feature/regex match speedup #1456
Compile the diagnostic output regex pattern once and reuse it across all stream name checks instead of calling regcomp() for every comparison. This avoids repeated regex compilation overhead in regex_matching.c, which could account for more than 30% of execution time.

Timer information shows a 10x speedup in 'diagnostic_fields' on the CWA Fujitsu FX1000 machine.

Credit: Yu-Tze Hong <aspen.hong@gmail.com>
@yjwu890355 Thank you for identifying this opportunity to speed up regular expression matching. Under the assumption that most calls to the Would you be willing to consider the code changes in commit ae7a389 instead?

Yes, thank you for the suggestion. Based on my testing, this approach also gives about a 10x speedup in diagnostic_fields (200 s -> 14 s) on the CWA Fujitsu FX1000 machine. I agree that commit ae7a389 is a simpler and more effective solution.
I've just opened PR #1466 with the simplified optimization. Please let me know how you and Yu-Tze Hong would like to be credited (name, GitHub ID, institution, etc.). Thanks so much again for finding this optimization! |
Speed up regex matching in diagnostic output
Compile the diagnostic output regex pattern once and reuse it across all stream
name checks instead of calling regcomp() for every comparison. This avoids
repeated regex compilation overhead in regex_matching.c, which could account for
more than 30% of execution time.
Timer information shows a 10x speedup in 'diagnostic_fields' on the CWA Fujitsu FX1000 machine.
Credit: Yu-Tze Hong <aspen.hong@gmail.com>
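
For readers unfamiliar with the pattern, the compile-once idea described in the commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from this PR or from commit ae7a389: the function name `cached_regex_match` and the counter `compile_count` are hypothetical, and the sketch assumes callers mostly pass the same pattern on consecutive calls. It uses only the POSIX `regcomp()`, `regexec()`, and `regfree()` calls.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter, present only so the effect of caching can be observed. */
static int compile_count = 0;

/*
 * Hypothetical helper (not the routine in regex_matching.c): remembers the
 * most recently compiled pattern so that repeated calls with the same
 * pattern skip regcomp().
 * Returns 1 on match, 0 on no match, -1 if the pattern cannot be compiled
 * or memory cannot be allocated.
 */
int cached_regex_match(const char *pattern, const char *str)
{
    static regex_t compiled;              /* compiled form of cached_pattern */
    static char *cached_pattern = NULL;   /* copy of the last pattern seen   */

    if (cached_pattern == NULL || strcmp(cached_pattern, pattern) != 0) {
        /* Pattern changed (or first call): discard the old compiled regex. */
        if (cached_pattern != NULL) {
            regfree(&compiled);
            free(cached_pattern);
            cached_pattern = NULL;
        }

        if (regcomp(&compiled, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
            return -1;
        }
        compile_count++;

        cached_pattern = malloc(strlen(pattern) + 1);
        if (cached_pattern == NULL) {
            regfree(&compiled);
            return -1;
        }
        strcpy(cached_pattern, pattern);
    }

    /* Reuse the compiled regex; only the cheap matching step runs here. */
    return regexec(&compiled, str, 0, NULL, 0) == 0 ? 1 : 0;
}
```

With this sketch, checking many stream names against one pattern calls `regcomp()` once rather than once per name. The static cache is not thread-safe, so a real implementation called from multiple threads would need its own synchronization or per-thread storage.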