Last modified: 2014-05-27 22:21:17 UTC
An IP can be checked for inclusion in a CIDR block via bitwise operations. The current general purpose solution provided by IP::isInRange() does not take this potential optimization into account.
This is probably premature optimization (as silly as the code is). That said, I'd like to see toUnsigned() removed, it's usage in the class is roundabout and overly complex (hurting readability).
(In reply to comment #1) > This is probably premature optimization (as silly as the code is). That said, > I'd like to see toUnsigned() removed, it's usage in the class is roundabout > and > overly complex (hurting readability). Really? Repeated calls resulted in a large jump in cluster CPU usage...
(In reply to comment #2) > (In reply to comment #1) > > This is probably premature optimization (as silly as the code is). That said, > > I'd like to see toUnsigned() removed, it's usage in the class is roundabout > > and > > overly complex (hurting readability). > > Really? Repeated calls resulted in a large jump in cluster CPU usage... Where is the graph? I guess if you call something enough times...
That would be http://ganglia.wikimedia.org/latest/graph.php?r=day&z=xlarge&c=Application+servers+eqiad&m=cpu_report&s=by+name&mc=2&g=cpu_report (ephemeral), I left some notes about it on 52829.
(In reply to comment #3) > (In reply to comment #2) > > Really? Repeated calls resulted in a large jump in cluster CPU usage... > > Where is the graph? I guess if you call something enough times... This was almost definitely a case of "calling something enough times". When I made the patch for 52829 I didn't fully comprehend the size of the WMF $wgSquidServersNoPurge list and the expected number of times it would be traversed in the course of a normal request. I have a hunch that if the configuration had been converted to the 10-15 CIDR ranges that should be able to cover the production cache clusters at the same time that the change enabling CIDR usage was rolled out we wouldn't be bothering to look at this. The follow on patches that have been made for 52829 should help greatly, but there will still be a performance hit for sites configured with a large number of $wgSquidServersNoPurge entries. If the large list is primarily/entirely single IPv4/IPv6 addresses the cost should be constrained to a list traversal with a single strpos() call per entry.
Brandon tried again to use the CIDR ranges in I5a2d86ef060b5b95412dbf42f8f955f3834aad8e, but this resulted in CPU utilization on the API servers jumping from ~50% to ~90%. It was reverted promptly.
Change 131758 had a related patch set uploaded by MaxSem: Speed up CIDR matching from $wgSquidServersNoPurge https://gerrit.wikimedia.org/r/131758
Brandon Black implemented a highly optimized CIDR matching engine in Ia3b12fb90c3e7e492374a128943b014481cc2730.