Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No byteswarning #1290

Merged
merged 3 commits into from
Dec 18, 2024
Merged

No byteswarning #1290

merged 3 commits into from
Dec 18, 2024

Conversation

Kriechi
Copy link
Member

@Kriechi Kriechi commented Nov 23, 2024

super-seeds and closes #1286, #1236

@BYK please review. I combined most of your previous commits from #1286, so the only code change should be in my most recent commit on this PR.

BYK and others added 3 commits November 23, 2024 11:44
Fixes python-hyper#1236.

This patch makes all header operations operate on `bytes` and converts all headers and values to bytes before operation. With a follow up patch to `hpack` it should also increase efficiency as currently, `hpack` casts everything to a `str` first before converting back to bytes: https://github.com/python-hyper/hpack/blob/02afcab28ca56eb5259904fd414baa89e9f50266/src/hpack/hpack.py#L150-L151
Copy link
Contributor

@BYK BYK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't fully agree with the .startswith() changes and removal of list comprehension in favor of incremental list building. That said I'm not gonna block them as fixing the bytes warning is more important.

If you agree with my performance concerns, you can revert those changes back to my version before merging.

Finally, thank you so much for pushing this through!


if n and n[0] != SIGIL:
if not n.startswith(b':'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? (And the other .startswith() below from my n and n[0] == SIGIL version?) I'm quite sure they are significantly slower as they involve a function call and are not optimized for 1-byte look up.

Comment on lines +517 to +524
encoded_headers = []
for header in headers:
h = (_to_bytes(header[0]), _to_bytes(header[1]))
if isinstance(header, HeaderTuple):
encoded_headers.append(header.__class__(h[0], h[1]))
else:
encoded_headers.append(h)
return encoded_headers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is again, quite less efficient compared to using a list-comprehension directly.

in the block, and so that it can stop looking when it finds the first
header field whose name does not begin with a colon.
are well formed and encoded as bytes: that is, that the HTTP/2 special
headers are first in the block, and so that it can stop looking when it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
headers are first in the block, and so that it can stop looking when it
headers are first in the list, and so that it can stop looking when it

@@ -345,30 +323,26 @@ def _reject_pseudo_header_fields(headers, hdr_validation_flags):
method = None

for header in headers:
if _custom_startswith(header[0], b':', u':'):
if header[0][0] == SIGIL:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to go down the .startswith() path, we should align this line with that too. This also probably means the top-level SIGIL and INFORMATIONAL_START become obsolete.

@BYK
Copy link
Contributor

BYK commented Dec 4, 2024

Shall we merge this? I'm happy to follow up with the issues I raised myself :)

@Kriechi Kriechi merged commit 5f129a0 into python-hyper:master Dec 18, 2024
8 checks passed
@Kriechi
Copy link
Member Author

Kriechi commented Dec 18, 2024

Yes, let's merge this so we can iterate an smaller changes going forward.
Thanks for your contribution!

@Kriechi Kriechi deleted the no-byteswarning branch December 18, 2024 20:45
@Kriechi
Copy link
Member Author

Kriechi commented Dec 18, 2024

For the performance discussion, I would really like to see a proper benchmark comparing v4.1.0 as released, against latest main branch - ideally across all supported Python versions to rule out any weird optimization edge cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants