POD DNS reverse lookup #266
Is this a new trend? Is it actually important to these workloads' correctness, or can we convince them they are following a bad pattern?
In general, these names are not very useful names. It's (in theory) possible to get those names into
To do "real" reverse lookups of pod names (the default hostname) would require a larger DNS architecture change - watching all pods is expensive, so it would have to be multi-level. Are the IP-based names sufficient for this sort of use-case? What about other use-cases that have emerged for DNS PTR? @bowei @kubernetes/sig-network-feature-requests |
I am told this already works in CoreDNS, and I tried to make a PR to make this work for kube-dns, but that didn't move, mainly because I was not able to convince the team. Can you try CoreDNS and see if that helps? Also, if CoreDNS implements this, I am not sure whether it's in violation of the DNS spec in any way or not. @johnbelamaric |
It's not in violation of the spec, but it goes beyond what the spec prescribes. You need to use the |
Spark changed it just recently with version 2.3. Other products like Hadoop have been doing it for ages. I don't really know what is driving them, as I have a hard time believing that, even outside the kubernetes world, every datacenter has a solid DNS setup by default. I guess the majority of self-hosters have to set it up in order to comply with the hadoop-ish software stack.
What's the purpose of these names then? Why do they exist when, on the other hand, we are afraid to make the pod names available in DNS because of spamming the DNS server?
|
What "every datacenter" has is not the question. The (v-)servers in big ones all have hostnames that can be resolved. I think almost every root server or cloud node can be accessed by just using its host name. Maybe the Apache guys generalized from that. |
I had a look and this is unfortunately not what we are looking for. Let me give you an example. A pod (job) connects to a Service … Having endpoint names in CoreDNS doesn't help, because: … What I would expect is that every pod hostname is available in DNS, similar to the IP-based hostname under … |
Can someone give a pointer to the spark docs that reference this behavior? Maybe for spark (and associated jobs), we can give the job the synthetic pod IP as its "hostname" |
https://spark.apache.org/docs/latest/configuration.html#networking |
How? |
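One way this suggestion could look in practice, as a minimal sketch rather than anything prescribed in this thread: use the downward API to hand the pod IP to Spark as spark.driver.host, so executors connect back by IP instead of by the unresolvable pod hostname. The namespace, image, master URL and job path below are placeholders; spark.driver.host and spark.driver.bindAddress are the networking properties from the configuration page linked above.

# Hedged sketch: give the Spark driver its own pod IP as its "hostname".
# Namespace, image, master URL and job path are illustrative placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: spark-driver-example
  namespace: ops
spec:
  containers:
  - name: driver
    image: my-spark-image:latest          # placeholder image with spark-submit
    env:
    - name: POD_IP                        # downward API: the pod's own IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
    command: ["/bin/bash", "-c"]
    args:
    - >
      spark-submit
      --conf spark.driver.host=$POD_IP
      --conf spark.driver.bindAddress=0.0.0.0
      --master spark://spark-master.ops.svc.cluster.local:7077
      /opt/job/my-job.py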
Is this issue dead? |
@realknorke nope, there are other people with this problem too. Just ran into it today with Flink 1.7. As for @thockin's question on whether or not this is a new trend: unfortunately not, it's been like this in Apache projects for almost a decade now. These issues have impacted other environments before kubernetes was a thing, and yet they have not been addressed. There is virtually no hope of fixing this giant corpus of horrible Java code. And I say this as someone who has been part of some of those Apache projects for years and has contributed... 😰 |
@scrwr I didn't understand this. I thought CoreDNS makes pod names DNS-resolvable. |
@krmayankk Not from the "outside" (e.g. from within another pod). |
Can we get a clear definition of what you need, for example in the form used in the DNS spec? |
@johnbelamaric Hi. Sadly I'm not that much of a DNS guy, but I'd like to give you a detailed problem description and findings. From a DNS perspective, everything written below relates (only) to A record lookups. Please see the shorter version below.

Long story

The situation is as follows (I use Apache Spark as an example to illustrate): I have a running Spark cluster, that is, one master pod with a (k8s) Service (some ports open to connect to), and several slave pods. Everything is fine here. In order to get work carried out, an external application (the "Spark program", called the driver) connects to the spark-master pod (via the K8S Service). When the slaves are done with the work (or parts of it) they want to connect directly to the driver and submit a) metadata (like progress) and b) results. And here the whole thing hits the fan: the executor cannot connect to the driver. Or, to put it in K8S language, the spark-slave pod cannot connect to the driver job/pod, because the job's hostname is not DNS-resolvable from outside of the job itself.

Short description

From within a pod, a call to … Using a K8S …

Findings

Some findings, in no particular order: …
Final Thoughts

Manually adding the IP-ish hostname to a pod's /etc/hosts … Alternatively, adding the pod's current hostname to DNS, so that it is resolvable by all (other) pods, may be a solution (but we then have to add search paths to the DNS config, like …). Alternatively, making a …

Please don't be mad if I mixed something up or misunderstood concepts. This issue is very important to my company and we have already spent a lot of time searching for workarounds. Please ask for more details if necessary. |
Is there any update on this issue? |
Can this issue be illustrated with a specific example? I'm not quite grasping the issue from the explanations above, after having read them many times over. |
Hi Chris, start two Pods. Let's assume the pods' names are pod1 and pod2. Exec into pod2 and do a "ping pod1". It does not work, because the hostname pod1 cannot be resolved from within pod2. That's the problem in a nutshell. |
In your use case, are the hostnames of the pods predictable in any way, or are they completely random? Can you give an actual, real example, with real hostnames? |
Sure @chrisohaver. Let's assume I have two Pods in a RS:

apiVersion: extensions/v1beta1
kind: ReplicaSet
metadata:
  name: dnstest
  namespace: ops
spec:
  replicas: 2
  selector:
    matchLabels:
      k8s-app: dnstest
  template:
    metadata:
      labels:
        k8s-app: dnstest
    spec:
      containers:
      - name: dnstestpod
        image: opensuse/tumbleweed
        ports:
        - containerPort: 12345
        command: ["/bin/bash"]
        args: ["-c", "zypper -q install -y netcat-openbsd net-tools && hostname -A && hostname -I && grep $(hostname -i) /etc/hosts && while :; do netcat -lvp 12345; done"]

I start the RS/Pods.
Let's check STDOUT of one of the pods: …
The pod with name … Now let's exec into the second Pod and try to connect to the first one.
Connecting via IP works. Connecting via the IP-ish hostname (manually generated by me using the K8S DNS documentation!) works, too. Connecting via the hostname … does not work. Log of the receiving pod (already known lines removed): …
Problem: the hostname … is not resolvable. Possible fix/workaround (yet to be implemented): make the IP-ish hostname, WHICH IS ALWAYS RESOLVABLE, part of the pod's /etc/hosts: …
Please note the difference in the result of calling … Result achieved by this workaround: the pod can now retrieve its globally resolvable hostname by asking the OS's DNS subsystem (e.g. …). PS: The IP-ish hostname may not be the only way (or not the best way) to make a Pod's hostname resolvable. Adding a Pod's hostname to the K8S DNS server could also do the trick, as discussed above. |
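A rough sketch of what that workaround could look like as a pod spec, assuming the namespace ops and the cluster domain cluster.local from the example above; the image and the trailing sleep are placeholders:

# Hedged sketch of the described workaround: append the always-resolvable
# IP-ish name (e.g. 1-2-3-4.ops.pod.cluster.local) to the pod's own /etc/hosts
# before starting the real workload.
apiVersion: v1
kind: Pod
metadata:
  name: dnstest-hosts-workaround
  namespace: ops
spec:
  containers:
  - name: main
    image: opensuse/tumbleweed
    command: ["/bin/bash", "-c"]
    args:
    - |
      ip=$(hostname -i)
      ipish="$(echo "$ip" | tr . -).ops.pod.cluster.local"
      # map the IP-ish DNS name (and the short hostname) to the pod IP locally
      echo "$ip $ipish $(hostname)" >> /etc/hosts
      getent hosts "$ipish"   # sanity check: the IP-ish name now resolves locally
      exec sleep infinity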
Thanks, is there a need to be able to connect to a specific pod in a replica set? Or would connecting to any pod in the replica set be ok? |
If connecting to any pod in the replica set is OK, then you could do this with Service selection and the CoreDNS rewrite plugin. CoreDNS rewrite plugin: re-write any request for … When a Pod in the namespace … |
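For illustration, a minimal sketch of such a rewrite in a CoreDNS Corefile, wrapped in the usual coredns ConfigMap in kube-system. The regex, the ops namespace and the dnstest names are assumptions based on the example above, not a tested recipe; a real setup may also need the answer name rewritten back so that clients accept the response.

# Hedged sketch: rewrite queries for individual dnstest pods onto the dnstest
# Service. Regex and names are illustrative only.
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        # map e.g. dnstest-abc12.ops.svc.cluster.local -> dnstest.ops.svc.cluster.local
        rewrite name regex dnstest-[a-z0-9]+\.ops\.svc\.cluster\.local dnstest.ops.svc.cluster.local
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        forward . /etc/resolv.conf
        cache 30
    }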
Sadly, it has to work for every container managed by K8S. In my particular case it has to work for Kubernetes Jobs. And for your first question: it's not … |
You could create a headless service that selects all the jobs, and (as @johnbelamaric says above) use the … If the service is named …
A CoreDNS rewrite rule could then rewrite …, e.g. when a pod in the ops namespace queries for … |
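A sketch of the headless-Service part of that suggestion, with the Service name jobs and the namespace ops as placeholders. The option referenced above is truncated in the thread; my assumption is that it is the CoreDNS kubernetes plugin's endpoint_pod_names setting, which publishes headless-service endpoints under their pod names (pod-name.jobs.ops.svc.cluster.local).

# Hedged sketch: a headless Service selecting the job pods. Combined with the
# CoreDNS kubernetes plugin option `endpoint_pod_names` (an assumption about
# the truncated suggestion above), each selected pod gets an A record under
# <pod-name>.jobs.ops.svc.cluster.local.
apiVersion: v1
kind: Service
metadata:
  name: jobs
  namespace: ops
spec:
  clusterIP: None          # headless: no virtual IP, DNS returns per-pod records
  selector:
    k8s-app: dnstest       # label carried by the job/driver pods (placeholder)
  ports:
  - port: 12345            # matches the containerPort from the example above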
Thanks for the advice. Please give me some time to try this out. At the moment we're not using CoreDNS but Kubernetes-DNS. I'll get back to you later with results. |
EDIT: This was resolved and an alternative was found, please see the next comment.
We would like to chip in here as well. We're running Couchbase in StatefulSet mode.
If we try to use any … Our core problem: we need a stable … We tried looking around: …
Generally speaking, if we boot a VM in almost all Cloud providers (including AWS and Google Cloud), the VM has a … One could indeed say that this is a problem of Couchbase, Spark, or Hadoop (found Zookeeper reported somewhere, but not sure if it's related). But if we're able to run them on VMs, then a naive point of view is: we should not need to change their source code or fix couchbase clustering just because statefulset pods don't have a stable DNS identity. We're not complaining, we really love kubernetes, and we're trying our best to run clustered applications on kubernetes, but somehow this looks increasingly like a kubernetes limitation rather than a problem of the underlying software (couchbase/spark/hadoop/zk). If readiness probes were available on a Service level instead of a Pod level (again not really a hack, but a standard cloud load balancer pattern; both AWS and Google Cloud allow setting probes on Load Balancers), we might have created a headless service with no health checks and used that to provide a stable DNS name that always points to the current running pod no matter what its state is. But for good reasons that's not possible in k8s, and probes are directly on the pods, so we're really stuck here. At the very least, this looks to be a documentation bug in the service resolution page. At best, it would be great to have a solution for a stable DNS identity corresponding to a stable Pod identity. I'm not sure if what I'm saying makes any sense, but we've hit this chain of issues multiple times when we try to run stateful clustered resources, hence thought of adding it here. |
@rdsubhas Would |
Hi @MrHohn this looks like it could do the trick, will check it out and update (and will check out what version it's available from) 👍 Thank you so much for the tip! |
@MrHohn quick update: it works flawlessly, and is just what we needed! Now we're moving to bootstrap every StatefulSet with … Thanks for the heads-up again! To follow up: what do you think about this docs page, which still mentions the deprecated …? |
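For reference, a hedged sketch of such a bootstrap, assuming the field referred to above is service.spec.publishNotReadyAddresses; all names, ports and the image are placeholders:

# Hedged sketch: a headless governing Service that publishes DNS records even
# for not-yet-ready pods, so StatefulSet pods keep a stable DNS identity
# (couchbase-0.couchbase.ops.svc.cluster.local) while they bootstrap.
apiVersion: v1
kind: Service
metadata:
  name: couchbase
  namespace: ops
spec:
  clusterIP: None
  publishNotReadyAddresses: true   # assumed to be the field suggested above
  selector:
    app: couchbase
  ports:
  - port: 8091
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: couchbase
  namespace: ops
spec:
  serviceName: couchbase           # governing headless Service for pod DNS
  replicas: 3
  selector:
    matchLabels:
      app: couchbase
  template:
    metadata:
      labels:
        app: couchbase
    spec:
      containers:
      - name: couchbase
        image: couchbase:community   # placeholder image
        ports:
        - containerPort: 8091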
@rdsubhas Thanks for pointing that out, I will see if we can update https://kubernetes.io/docs/concepts/services-networking/dns-pod-service to match up with the latest DNS specification. |
I just ran into the same issue when deploying a Jupyter notebook onto a kubernetes cluster and trying to execute spark jobs against the cluster: the executor pods spin up, then error out because they cannot talk back to the notebook, since they were given the hostname of the Jupyter notebook, which is a pod name. |
We have created a small operator that watches for pods with a certain annotation and creates a headless service for them. It helps us as a workaround for this issue: https://github.com/src-d/k8s-pod-headless-service-operator |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
So everybody with (relatively) long-running services found a workaround here. But what about the short-lived fire-and-forget jobs (e.g. spark jobs)? You don't want (headless) services for those, right? |
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
We patched the kubernetes plugin of CoreDNS. Now it is possible to resolve a pod name. If someone is interested, just give me a shout. |
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
@fejta-bot: Closing this issue. |
@realknorke I am quite interested in your patch for CoreDNS. Could you share it here? |
Can you read the fork: https://github.com/smartclip/coredns ? |
@realknorke yes, I can read the fork. Perhaps I misunderstood your patch logic. I am looking for the pod name, which does not have the dashed IP in it. Currently, in my AKS cluster with CoreDNS, my StatefulSet pods are resolved as … |
The IP-ish hostname is always resolvable. What is not resolvable is the pod name (this is NOT the service name; I mean the hostnames with $random in them for ReplicaSets and Deployments, or with ordinal numbers for StatefulSets). We need pod names to be resolvable because, for some (micro-)services, the application in the pod determines its hostname, transfers that hostname to some other instance, and that instance then wants to connect back to the first pod. That is not possible when the hostname (= pod name) is not resolvable. Our patch solves exactly this. |
Understood your case now. In our situation, we want StatefulSet pod hostnames to remain as: … |
We have an increasing problem with Apache Hadoop-like services such as Spark, Flink and co. These try to communicate via their hostnames in the cluster instead of their IPs, so they look up their own hostname and hence come up with the unresolvable pod name. We currently see two incomplete solutions:
a) Pod A records are created in a format like the 1-2-3-4.namespace.pod.cluster.local variant. But the pod itself cannot be spec'ed to use this A record as its own hostname.
b) One can use hostname and subdomain together with a headless service in order to create an FQDN in KubeDNS and in the pod's hostname, but this requires static hostnames and won't work with ReplicaSets or DaemonSets (see the sketch below).
We are looking for a complete solution, e.g. optionally switching the pod hostname to its KubeDNS A record, or injecting the A record as the first entry in /etc/hosts, etc. The type of pods most heavily affected by the issue are Jobs. In the Spark context, these are driver pods/jobs. But the problem is not limited to Spark; we see the same effects all over recent Apache projects, e.g. Flink has a similar issue.
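A hedged sketch of option (b) above, with placeholder names: a Pod with explicit hostname and subdomain plus a matching headless Service, which yields a stable FQDN in cluster DNS, but only works when the hostname can be fixed in the spec.

# Hedged sketch of option (b): hostname + subdomain + matching headless Service.
# The pod becomes resolvable as driver-1.drivers.ops.svc.cluster.local, but the
# hostname must be static, so this does not help ReplicaSets or DaemonSets.
apiVersion: v1
kind: Service
metadata:
  name: drivers
  namespace: ops
spec:
  clusterIP: None
  selector:
    role: driver
  ports:
  - port: 7078
---
apiVersion: v1
kind: Pod
metadata:
  name: driver-1
  namespace: ops
  labels:
    role: driver
spec:
  hostname: driver-1      # static hostname (the limitation noted above)
  subdomain: drivers      # must match the headless Service name
  containers:
  - name: main
    image: opensuse/tumbleweed
    command: ["/bin/bash", "-c", "sleep infinity"]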