Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discord Notification Per-Route Rate Limit Exceeded with Notification Batches #912

Open
kibawulf opened this issue Aug 1, 2024 · 0 comments

Comments

@kibawulf
Copy link

kibawulf commented Aug 1, 2024

Problem:
Certain conditions, such as when large numbers of SeAT notification jobs fire in a short period of time utilizing Discord webhooks, can trigger Discord's per-route rate limiting, causing POSTs to fail with a 429 Too Many Requests which can lead to a MaxAttemptsExceededException SeAT exception.

This causes scenarios where large numbers of notification jobs may end up in a failed state and notifications will never be sent which could be risky depending on the severity of the notification (e.g. structure reinforcement, members leaving corps, etc). Current rate limits in SeAT appear to be set to 45/minute to avoid the documented global rate limit of 50/minute, but Discord also has an additional per-route rate limit value that is not specifically documented but is mentioned. This likely means that if a certain number of notifications that produce a Discord message configured to use the same webhook fire at or near the same time, jobs can fail. When this occurs enough times to cause a job failure, no message is sent to the Discord channel, the job retries a few times until it fails, and effectively the notification is lost.

Discord Rate Limiting Documentation:
https://discord.com/developers/docs/topics/rate-limits

Expected: Rate limits are handled dynamically via header as per the Discord documentation to avoid rate limit conditions or adjusted further to avoid potential conditions.

  • In the interim, adjusting the rate limiting settings to ensure a low-ish per second maximum, some online forums say anywhere from 3-5 per second max, in addition to the 45/minute may be enough to mitigate the issue.
  • Any limits hardcoded in SeAT would not apply to any other software running on the same system, due to global rate limiting occurring on a per-IP basis. While this is not the case in my instance as no other software on this machine is using Discord webhooks, it may provide clarity whether to determine if rate limit avoidance via header is best rather than adjusting the limits as-is.
  • Laravel documentation seems to indicate limiting using multiple rate limiters may be possible: https://laravel.com/docs/10.x/routing#multiple-rate-limits

Code sections that appear to be configured to handle Discord rate limiting:
https://github.com/eveseat/notifications/blob/5.0.x/src/Notifications/AbstractDiscordNotification.php#L32
https://github.com/eveseat/notifications/blob/5.0.x/src/NotificationsServiceProvider.php#L226

Logs / Screenshots / Proof:

Note: This webhook is only being used for SeAT - it is not re-used for any other process.

First image shows API return hitting the rate limit and indicates that it is not the global limit - hence why I believe this is related to a per-route limit:
image

This image shows behavior once the job fails enough times, rendering the job permanently failed (no notification):
image

This image shows a number of failures around the same timeframe, two are pictured above. (Note: Items with "3" attempts are 429 Too Many Requests failures, items with "4" attempts are MaxAttemptsExceededException.
image

Version Info: SeAT v5, all modules up-to-date as of 2024-07-31.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant