A tag for anything related to Kubernetes. For the discussion see T147187: Create a tag for #kubernetes.
See also:
As far as I know, here's the current state of this task:
Change #1074404 merged by JMeybohm:
[operations/puppet@production] ferm: Use ferm-status to restart ferm on wikikube-staging
https://gerrit.wikimedia.org/r/1074404
As no better ideas surfaced, I'll add the external-services DNS name to the SAN of the kafka broker certificates as it seems the easiest and most automated way of doing this.
Change #1074411 had a related patch set uploaded (by JMeybohm; author: JMeybohm):
[operations/puppet@production] kafka::broker: Add the external-services DNS name to the certs
https://gerrit.wikimedia.org/r/1074411
Change #1074405 had a related patch set uploaded (by JMeybohm; author: JMeybohm):
[operations/puppet@production] ferm: Make reload via ferm-status the default
https://gerrit.wikimedia.org/r/1074405
Change #1074404 had a related patch set uploaded (by JMeybohm; author: JMeybohm):
[operations/puppet@production] ferm: Use ferm-status to restart ferm on wikikube-staging
https://gerrit.wikimedia.org/r/1074404
Change #1074371 merged by JMeybohm:
[operations/puppet@production] ferm: Fix systemd override to not append ExecReload
https://gerrit.wikimedia.org/r/1074371
Change #1074371 had a related patch set uploaded (by JMeybohm; author: JMeybohm):
[operations/puppet@production] ferm: Fix systemd override to not append ExecReload
https://gerrit.wikimedia.org/r/1074371
Change #1074185 merged by JMeybohm:
[operations/puppet@production] ferm: Use ferm-status to start ferm on diffs
https://gerrit.wikimedia.org/r/1074185
Change #1072597 abandoned by Bking:
[operations/deployment-charts@master] rdf-streaming-updater: trigger a savepoint before firewall changes
Reason:
Successfully migrated without savepoint
https://gerrit.wikimedia.org/r/1072597
Fixing ferm_status.py is still not enough. When puppet reverts an on-disk ferm config change (one that was never applied to iptables) back to the previous state, it still reloads ferm even though ferm-status returns 0. In addition, the confd-related code (requestctl rules) was restarting ferm via systemctl directly, bypassing the puppet service hack completely.
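The gating idea discussed here (reload only when the status check reports a difference, and send every caller through the same gate) can be sketched as follows. This is a minimal illustration, not the puppet implementation: reload_if_needed and its parameters are hypothetical names, and the commands shown in the usage comment are assumptions about how ferm-status and the service would be invoked.

```python
import subprocess
from typing import Callable, Sequence


def reload_if_needed(
    status_cmd: Sequence[str],
    reload_cmd: Sequence[str],
    run: Callable = subprocess.run,
) -> bool:
    """Run status_cmd and run reload_cmd only if it exits non-zero.

    Returns True when a reload was triggered, False when the status
    check exited 0 (running rules already match the config).
    """
    status = run(list(status_cmd), check=False)
    if status.returncode == 0:
        # Nothing differs, so leave the firewall untouched.
        return False
    # A difference was reported: reload, and raise if the reload fails.
    run(list(reload_cmd), check=True)
    return True


# Hypothetical usage (command names are assumptions, not taken from the patches):
# reload_if_needed(["ferm-status"], ["systemctl", "reload", "ferm"])
```

Any component that calls systemctl on its own, as the confd code did, never passes through a gate like this, which is why fixing the status script alone was not sufficient.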
Change #1074185 had a related patch set uploaded (by JMeybohm; author: JMeybohm):
[operations/puppet@production] ferm: Use ferm-status to start ferm on diffs
https://gerrit.wikimedia.org/r/1074185
Change #1073842 abandoned by Bking:
[operations/deployment-charts@master] rdf-streaming-updater: remove references to old-style network policies
Reason:
we already did this in I6a040e53d9fb21b6d0f6cae6b3c9fa9ef64633c6
https://gerrit.wikimedia.org/r/1073842
Change #1074155 merged by JMeybohm:
[operations/puppet@production] profile::firewall: Absent confd config when it is disabled
https://gerrit.wikimedia.org/r/1074155
Change #1074155 had a related patch set uploaded (by JMeybohm; author: JMeybohm):
[operations/puppet@production] profile::firewall: Absent confd config when it is disabled
https://gerrit.wikimedia.org/r/1074155
Change #1074113 merged by JMeybohm:
[operations/puppet@production] ferm: Allow to specify a different ferm-status command to use
https://gerrit.wikimedia.org/r/1074113
Change #1073857 merged by JMeybohm:
[operations/puppet@production] wikikube: Remove remaining hiera files and role for non stacked masters
https://gerrit.wikimedia.org/r/1073857
Change #1073859 merged by JMeybohm:
[operations/puppet@production] wikikube: Disable requestctl ferm rules and definitions
https://gerrit.wikimedia.org/r/1073859
Change #1074113 had a related patch set uploaded (by JMeybohm; author: JMeybohm):
[operations/puppet@production] ferm: Allow to specify a different ferm-status command to use
https://gerrit.wikimedia.org/r/1074113
Change #1073760 merged by JMeybohm:
[operations/puppet@production] Fix ferm_status to actually compare rules
https://gerrit.wikimedia.org/r/1073760
Change #1074091 had a related patch set uploaded (by DCausse; author: DCausse):
[operations/deployment-charts@master] cirrus-streaming-updater: disable legacy network policies for kafka
https://gerrit.wikimedia.org/r/1074091
Change #1074090 had a related patch set uploaded (by DCausse; author: DCausse):
[operations/deployment-charts@master] cirrus-streaming-update: enable calico network policies
https://gerrit.wikimedia.org/r/1074090
Change #1073859 had a related patch set uploaded (by JMeybohm; author: JMeybohm):
[operations/puppet@production] wikikube: Disable requestctl ferm rules and definitions
https://gerrit.wikimedia.org/r/1073859
Change #1073857 had a related patch set uploaded (by JMeybohm; author: JMeybohm):
[operations/puppet@production] wikikube: Remove remaining hiera files and role for non stacked masters
https://gerrit.wikimedia.org/r/1073857
Change #1072243 merged by Bking:
[operations/deployment-charts@master] rdf-streaming-updater: switch to calico-based network policies
https://gerrit.wikimedia.org/r/1072243
Change #1073842 had a related patch set uploaded (by Bking; author: Bking):
[operations/deployment-charts@master] rdf-streaming-updater: remove references to old-style network policies
https://gerrit.wikimedia.org/r/1073842
Change #1072236 merged by jenkins-bot:
[operations/deployment-charts@master] flink-app: customize calico label selector
https://gerrit.wikimedia.org/r/1072236
Change #1073760 had a related patch set uploaded (by JMeybohm; author: JMeybohm):
[operations/puppet@production] Fix ferm_status to actually compare rules
https://gerrit.wikimedia.org/r/1073760
In the RFC I read:
"In some cases, the URI is specified as an IP address rather than a hostname. In this case, the iPAddress subjectAltName must be present in the certificate and must exactly match the IP in the URI."
It does not feel right to bake the external-services stuff into the kafka certificates. But if there is no option to make the library use the IPs for validation, that's probably the only way to go.
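To make the quoted rule concrete, here is a simplified sketch of how a client might match a connection target against a certificate's SAN entries. It is an assumption-laden illustration, not the JVM's or Kafka's actual verifier: san_matches is a hypothetical helper, wildcard names are deliberately not handled, and the example names and addresses in the usage comment are placeholders.

```python
import ipaddress
from typing import Iterable


def san_matches(target: str, dns_names: Iterable[str], ip_addresses: Iterable[str]) -> bool:
    """Return True if target is covered by the given SAN entries.

    An IP literal must equal one of the iPAddress entries; anything
    else is compared case-insensitively against the dNSName entries.
    Wildcards are intentionally not supported in this sketch.
    """
    try:
        ip = ipaddress.ip_address(target)
    except ValueError:
        ip = None  # not an IP literal, treat it as a hostname

    if ip is not None:
        # IP targets are only ever matched against iPAddress entries.
        return any(ipaddress.ip_address(entry) == ip for entry in ip_addresses)

    host = target.rstrip(".").lower()
    return any(host == name.rstrip(".").lower() for name in dns_names)


# Placeholder data: a cert that lists one hostname and one IP.
# san_matches("192.0.2.10", ["broker1.example.org"], ["192.0.2.10"])
# san_matches("kafka-main-eqiad.external-services.svc.cluster.local",
#             ["broker1.example.org"], ["192.0.2.10"])
```

Under this model, a client that validates the service name rather than the broker IP only succeeds once that name is itself listed as a DNS entry in the certificate.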
Thanks for looking into this!
Now failing with javax.net.ssl.SSLHandshakeException: No subject alternative DNS name matching kafka-main-eqiad.external-services.svc.cluster.local found. With use_all_dns_ips set, it now seems to want to validate the hostname passed via bootstrap.servers...
I'll investigate further to see if there are more options. If we fail to work around this, do you think it would be acceptable to add kafka-main-eqiad.external-services.svc.cluster.local as a valid alternative name in the cert?
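A toy model of the behaviour reported in this comment may help: every resolved address is tried, but the name checked against the certificate stays the configured bootstrap hostname. This only mirrors what the comment observed, it is not the Kafka client's code; connection_plan and the resolver callback are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def connection_plan(
    bootstrap_host: str,
    resolve: Callable[[str], Iterable[str]],
) -> List[Tuple[str, str]]:
    """Return (address_to_connect, name_to_verify) pairs.

    Each resolved IP becomes a connection attempt, while the TLS
    verification name remains the hostname the client was given.
    """
    return [(ip, bootstrap_host) for ip in resolve(bootstrap_host)]


# Hypothetical usage with a stub resolver returning placeholder addresses:
# connection_plan(
#     "kafka-main-eqiad.external-services.svc.cluster.local",
#     lambda host: ["192.0.2.10", "192.0.2.11"],
# )
```

In this model, resolving to more addresses changes where the client connects but not which name the certificate has to contain.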
In T374729#10149804, @JMeybohm wrote:
Bummer... looks like we need to fix our DNS config in kubernetes then.
Change #1073402 merged by JMeybohm:
[operations/puppet@production] kafka::broker: Populate cert SAN with hostname and IPs
https://gerrit.wikimedia.org/r/1073402
With 1073233 merged, ferm is correctly reloaded (not stopped/started) on notify:
Change #1073233 merged by JMeybohm:
[operations/puppet@production] Don't restart(stop,start) ferm on puppet notify, use reload instead
https://gerrit.wikimedia.org/r/1073233
Change #1073402 had a related patch set uploaded (by JMeybohm; author: JMeybohm):
[operations/puppet@production] kafka::broker: Populate cert SAN with hostname and IPs
https://gerrit.wikimedia.org/r/1073402