Guidance Needed: Intermittent Unresponsiveness in Dogtag PKI Web Services in Podman Container Linked to 389ds Undefined Backend Issues #4728

Open
Meloknight89 opened this issue Apr 24, 2024 · 2 comments

Comments

@Meloknight89

Issue Description
We are experiencing intermittent unresponsiveness in the Dogtag PKI web services for the Certificate Authority (CA) and Registration Authority (RA), which run in a Podman container. These episodes correlate with apparent crashes or non-responsive periods of the 389 Directory Server (the LDAP database these services depend on). We are looking for detailed guidance on troubleshooting and diagnosing the root cause of these 389ds disruptions.

Package Version and Platform:

Platform: AlmaLinux
Package and version: Dogtag PKI latest, 389-ds-base.x86_64 2.0.15-1.module_el8+14185+adb3f555

Steps to Reproduce
Because the unresponsiveness is intermittent and tied to 389ds behavior, there are no deterministic steps to reproduce the issue. It is typically observed:

  1. When accessing the CA or RA web services via their respective interfaces.
  2. During LDAP queries when 389ds is in a non-responsive state, impacting operations within the container.

Request for Guidance

  • What specific logs or debugging tools should be enabled to capture detailed information when 389ds becomes unresponsive?
  • Are there known configurations or conditions that may predispose 389ds to such failures?
  • Suggestions for monitoring setups or diagnostic queries that could help pinpoint triggers or patterns leading to service disruptions.
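
As a starting point for the monitoring question above, here is a minimal sketch of a reachability probe in Python. It only times a TCP connect to the LDAP port, so it is a coarse check: a hung 389ds process may still accept connections at the socket level. The names `ProbeResult`, `probe_tcp`, and `watch`, as well as the host name and port in the usage note, are assumptions made for illustration and are not part of 389ds or Dogtag PKI.

```python
import socket
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    ok: bool                     # True if the TCP connect succeeded
    seconds: float               # time spent on the attempt
    error: Optional[str] = None  # OS error text when the connect failed


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> ProbeResult:
    """Time a TCP connect to host:port and report failure instead of raising."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return ProbeResult(True, time.monotonic() - start)
    except OSError as exc:
        return ProbeResult(False, time.monotonic() - start, str(exc))


def watch(host: str, port: int, interval: float = 30.0, rounds: int = 10) -> None:
    """Print one timestamped probe result per interval (hypothetical helper)."""
    for _ in range(rounds):
        print(time.strftime("%Y-%m-%dT%H:%M:%S"), probe_tcp(host, port))
        time.sleep(interval)
```

For example, `watch("ldap.example.com", 636)` (a placeholder host, with 636 being the usual LDAPS port) would print one line every 30 seconds. Comparing the timestamps of failed or slow probes with the CA log could help show whether the outages line up with 389ds being unreachable.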

Additional context

  • The entire PKI system is containerized, with Dogtag PKI running in Podman, increasing the complexity of network and service dependencies.
  • 389 Directory Server is not running in a Podman container.
  • Restarting the 389ds service temporarily mitigates the issue.
  • Information on how to identify
@Meloknight89

We would greatly appreciate some assistance with this matter. In the meantime, we have made some additional findings. The service becomes unresponsive because the LDAP server fails to handle search queries properly. While examining the access logs on the LDAP server, we also found the following warning on the CA side:

[CertStatusUpdateTask] WARNING: CertStatusUpdateTask: CertRecordPagedList: Error to get a new page
java.lang.RuntimeException: CertRecordPagedList: Error to get a new page
at com.netscape.cmscore.dbs.RecordPagedList.<init>(RecordPagedList.java:46)
at com.netscape.cmscore.dbs.CertificateRepository.findPagedCertRecords(CertificateRepository.java:1269)
at com.netscape.cmscore.dbs.CertificateRepository.getRevokedCertsByNotAfterDate(CertificateRepository.java:1970)
at com.netscape.cmscore.dbs.CertStatusUpdateTask.updateRevokedExpiredCertificates(CertStatusUpdateTask.java:128)
at com.netscape.cmscore.dbs.CertStatusUpdateTask.updateCertStatus(CertStatusUpdateTask.java:167)
at com.netscape.cmscore.dbs.CertStatusUpdateTask.run(CertStatusUpdateTask.java:198)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: Unable to search LDAP record: Failed to send request
at com.netscape.cmscore.dbs.LDAPPagedSearch.getPage(LDAPPagedSearch.java:129)
at com.netscape.cmscore.dbs.LDAPPagedSearch.getPage(LDAPPagedSearch.java:76)
at com.netscape.cmscore.dbs.RecordPagedList.<init>(RecordPagedList.java:44)
... 11 more
Caused by: netscape.ldap.LDAPException: Failed to send request (80)
at netscape.ldap.LDAPConnection.sendRequest(LDAPConnection.java:1875)
at netscape.ldap.LDAPConnection.search(LDAPConnection.java:2617)
at com.netscape.cmscore.dbs.LDAPPagedSearch.getPage(LDAPPagedSearch.java:118)
... 13 more

In the LDAP server access log, we can see the following entries:

[26/Apr/2024:19:11:17.634742447 +0200] conn=163 fd=157 slot=157 SSL connection from x.x.x.x to x.x.x.x
[26/Apr/2024:19:11:17.641196105 +0200] conn=163 TLS1.3 128-bit AES-GCM
[26/Apr/2024:19:11:17.642141920 +0200] conn=163 op=0 BIND dn="cn=Directory Manager" method=128 version=3
[26/Apr/2024:19:11:17.642308589 +0200] conn=163 op=0 RESULT err=0 tag=97 nentries=0 wtime=0.007012918 optime=0.000186442 etime=0.007192975 dn="cn=directory manager"
[26/Apr/2024:19:11:17.642806488 +0200] conn=163 op=1 SRCH base="ou=certificateRepository,ou=ca,dc=ca,dc=pki,dc=domain,dc=com" scope=1 filter="(&(certStatus=VALID)(notAfter<=20240426171117Z))" attrs="objectClass serialno notBefore notAfter duration extension subjectName issuerName userCertificate version algorithmId signingAlgorithmId publicKeyData"
[26/Apr/2024:19:11:17.643079624 +0200] conn=163 op=1 RESULT err=0 tag=101 nentries=0 wtime=0.000176064 optime=0.000277317 etime=0.000442265 notes=P details="Paged Search" pr_idx=0 pr_cookie=-1
[26/Apr/2024:19:11:17.643952845 +0200] conn=163 op=-1 fd=157 closed error - B1

The "closed error - B1" entry suggests either network problems or improper LDAP client behavior, such as a client aborting before receiving all the results. However, a network problem seems unlikely: if the network were at fault, spawning the PKI instance would fail as well, and it does not.
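
To check whether these B1 closures follow a pattern, the access log could be scanned with a small script. The following is a minimal sketch that assumes the line layout shown in the excerpt above. The function name `find_abnormal_closes` and the regular expressions are our own illustration, not a 389ds tool.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Line layout assumed from the excerpt: "[timestamp] conn=N op=M ..."
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] conn=(?P<conn>\d+) (?P<rest>.*)$")
OP_RE = re.compile(r"^op=(?P<op>-?\d+) (?P<body>.*)$")
CLOSE_RE = re.compile(r"\bclosed(?: error)? - (?P<code>\w+)\s*$")


def find_abnormal_closes(
    lines: Iterable[str], code: str = "B1"
) -> List[Tuple[int, str, str]]:
    """Return (conn, close timestamp, last request) per connection closed with `code`."""
    last_request: Dict[int, str] = {}
    hits: List[Tuple[int, str, str]] = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        conn = int(m["conn"])
        op = OP_RE.match(m["rest"])
        if not op:
            continue  # connection setup lines, e.g. the TLS cipher line
        body = op["body"]
        close = CLOSE_RE.search(body)
        if close:
            if close["code"] == code:
                hits.append((conn, m["ts"], last_request.get(conn, "")))
            last_request.pop(conn, None)  # connection is finished either way
        elif not body.startswith("RESULT"):
            last_request[conn] = body  # remember the latest BIND/SRCH/... request
    return hits
```

Running it over the access log of the instance (opened with `open(path)`, where the path depends on the installation) yields one tuple per B1 closure along with the last request sent on that connection. This would show whether the closures always follow the same paged search from `CertStatusUpdateTask`.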

Thanks in advance and Best Regards,
Joel

@jchapma

jchapma commented May 14, 2024

Hi Joel, apologies for the delay. I am looking into this issue now.
