Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Remove unnecessary string casting of bytes #276

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

BYK
Copy link

@BYK BYK commented Nov 14, 2024

@Kriechi
Copy link
Member

Kriechi commented Nov 15, 2024

To call out the explicit: is this a behaviour change or a performance improvement?

If its a performance related change, I would like to see a benchmark showing "time spent" or similar metrics that changed.

@BYK
Copy link
Author

BYK commented Nov 15, 2024

Great call! Thanks to your prompt I actually timed the new version and improved it even further (code will be pushed when you are reading this).

So this should keep the same behavior in terms of types. The caveat is, the original version always returns a new bytes object due to the bytes -> str -> bytes conversion whereas the new version simply returns the same object if it is already bytes. This should be okay, even good as the intent was never to get a new copy but worth calling out. Otherwise this is purely a performance optimization with the following numbers where _to_bytes is the old version and _to_bytes_new is the new version proposed in this PR:

>>> s
'someheader'
>>> b
b'someheader'
>>> timeit(lambda: _to_bytes(s));timeit(lambda: _to_bytes(b));timeit(lambda: _to_bytes_new(s));timeit(lambda: _to_bytes_new(b))
0.22222996300115483
0.16239697000128217
0.18601619399851188
0.1108450250030728
>>> timeit(lambda: _to_bytes(s));timeit(lambda: _to_bytes(b));timeit(lambda: _to_bytes_new(s));timeit(lambda: _to_bytes_new(b))
0.20466226000280585
0.16354602499632165
0.1937305280007422
0.12870584500342375
>>> timeit(lambda: _to_bytes(s));timeit(lambda: _to_bytes(b));timeit(lambda: _to_bytes_new(s));timeit(lambda: _to_bytes_new(b))
0.20280992899643024
0.16408630299702054
0.1972411400056444
0.12873706800019136
>>> timeit(lambda: _to_bytes(s));timeit(lambda: _to_bytes(b));timeit(lambda: _to_bytes_new(s));timeit(lambda: _to_bytes_new(b))
0.21261542499996722
0.16578178100462537
0.18726225899445126
0.15498205900075845

As you can see it is consistently faster than the old version for both str and bytes inputs and significantly faster for bytes inputs which is what we should be dealing with after python-hyper/h2#1286

These numbers are using Python 3.10.12. When using Python 3.12.4, they are even better:

>>> timeit(lambda: _to_bytes(s));timeit(lambda: _to_bytes(b));timeit(lambda: _to_bytes_new(s));timeit(lambda: _to_bytes_new(b))
0.13931607300037285
0.1050484180013882
0.11498607300018193
0.07148081599734724
>>> timeit(lambda: _to_bytes(s));timeit(lambda: _to_bytes(b));timeit(lambda: _to_bytes_new(s));timeit(lambda: _to_bytes_new(b))
0.14760045900038676
0.0974436950054951
0.11569900900212815
0.06948418200045126
>>> timeit(lambda: _to_bytes(s));timeit(lambda: _to_bytes(b));timeit(lambda: _to_bytes_new(s));timeit(lambda: _to_bytes_new(b))
0.13979562999884365
0.09725762299785856
0.11685396800021408
0.075411345998873

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants