Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Free GIL when running check_bytes_arrays_within_dist #23

Open
RandomNameUser opened this issue Jan 29, 2023 · 2 comments · May be fixed by #27
Open

Free GIL when running check_bytes_arrays_within_dist #23

RandomNameUser opened this issue Jan 29, 2023 · 2 comments · May be fixed by #27

Comments

@RandomNameUser
Copy link

The check_bytes_arrays_within_dist function has the potential to take a long time when searching in a large array. It would be helpful if the lib could free the GIL to let other threads run while that is going on.

Never having written a C Python extension, if I interpret the docs (https://docs.python.org/3/c-api/init.html#c.Py_BEGIN_ALLOW_THREADS) right, all it would take to do that is to add Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS macros and remove the in-loop return, i.e. change the loop to:

    int return_value = -1;

    Py_BEGIN_ALLOW_THREADS

    int res;
    uint64_t number_of_elements = big_array_size / small_array_size;
    uint8_t* pBig = big_array;
    for (uint64_t i = 0; i < number_of_elements; i++, pBig += small_array_size) {
        res = (int)ptr__hamming_distance_bytes(pBig, small_array, small_array_size, max_dist);
        if (res == 1)
            return_value = i;
            break;
        }

    Py_END_ALLOW_THREADS

   return Py_BuildValue("i", return_value);

I don't have a good way to test this right now, but it seems trivial enough. I would appreciate if you could give this a try.

Thanks for a greatly useful little library. Stuff like this is why I love Python! ;)

@mrecachinas
Copy link
Owner

👋 I'll need to think a little about this. It would help if you could give your use case -- e.g., are you planning to use this in a multithreaded application or are you intending for this to be akin to a coroutine, where the program keeps executing until you get a result (i.e., attempt to access the returned variable)? 🤔

@RandomNameUser
Copy link
Author

Currently the former. I am running a somewhat lengthy search in a thread parallel to the GUI (Qt) main thread, and things get pretty sluggish while it's running.

Not sure if the coroutine one is more complicated, but for my current use case it's not needed.

Thanks!

@github-staff github-staff deleted a comment Sep 19, 2024
mrecachinas added a commit that referenced this issue Nov 30, 2024
Fixes #23

Add GIL release in `check_bytes_arrays_within_dist` function.

- Add `Py_BEGIN_ALLOW_THREADS` macro before the loop in `check_bytes_arrays_within_dist_wrapper` function in `hexhamming/python_hexhamming.cc`.
- Add `Py_END_ALLOW_THREADS` macro after the loop in `check_bytes_arrays_within_dist_wrapper` function in `hexhamming/python_hexhamming.cc`.
- Remove the in-loop return statement in `check_bytes_arrays_within_dist_wrapper` function in `hexhamming/python_hexhamming.cc`.
- Add unit test `test_check_bytes_arrays_within_dist_gil` in `test/test_hexhamming.py` to verify the functionality of `check_bytes_arrays_within_dist` function with GIL released.

---

For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/mrecachinas/hexhamming/issues/23?shareId=XXXX-XXXX-XXXX-XXXX).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants