-
Notifications
You must be signed in to change notification settings - Fork 157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
No byteswarning #1290
No byteswarning #1290
Conversation
Fixes python-hyper#1236. This patch makes all header operations operate on `bytes` and converts all headers and values to bytes before operation. With a follow up patch to `hpack` it should also increase efficiency as currently, `hpack` casts everything to a `str` first before converting back to bytes: https://github.com/python-hyper/hpack/blob/02afcab28ca56eb5259904fd414baa89e9f50266/src/hpack/hpack.py#L150-L151
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't fully agree with the .startswith()
changes and removal of list comprehension in favor of incremental list building. That said I'm not gonna block them as fixing the bytes warning is more important.
If you agree with my performance concerns, you can revert those changes back to my version before merging.
Finally, thank you so much for pushing this through!
|
||
if n and n[0] != SIGIL: | ||
if not n.startswith(b':'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change? (And the other .startswith()
below from my n and n[0] == SIGIL
version?) I'm quite sure they are significantly slower as they involve a function call and are not optimized for 1-byte look up.
encoded_headers = [] | ||
for header in headers: | ||
h = (_to_bytes(header[0]), _to_bytes(header[1])) | ||
if isinstance(header, HeaderTuple): | ||
encoded_headers.append(header.__class__(h[0], h[1])) | ||
else: | ||
encoded_headers.append(h) | ||
return encoded_headers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is again, quite less efficient compared to using a list-comprehension directly.
in the block, and so that it can stop looking when it finds the first | ||
header field whose name does not begin with a colon. | ||
are well formed and encoded as bytes: that is, that the HTTP/2 special | ||
headers are first in the block, and so that it can stop looking when it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
headers are first in the block, and so that it can stop looking when it | |
headers are first in the list, and so that it can stop looking when it |
@@ -345,30 +323,26 @@ def _reject_pseudo_header_fields(headers, hdr_validation_flags): | |||
method = None | |||
|
|||
for header in headers: | |||
if _custom_startswith(header[0], b':', u':'): | |||
if header[0][0] == SIGIL: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to go down the .startswith()
path, we should align this line with that too. This also probably means the top-level SIGIL
and INFORMATIONAL_START
become obsolete.
Shall we merge this? I'm happy to follow up with the issues I raised myself :) |
Yes, let's merge this so we can iterate an smaller changes going forward. |
For the performance discussion, I would really like to see a proper benchmark comparing |
super-seeds and closes #1286, #1236
@BYK please review. I combined most of your previous commits from #1286, so the only code change should be in my most recent commit on this PR.