
Kubernetes
Tag · Active · Public

Details

Description

A tag for anything related to Kubernetes. For the discussion see T147187: Create a tag for #kubernetes.


Recent Activity

Fri, Sep 20

CDanis added a comment to T344171: Reverse DNS for k8s pods IPs.

As best I know here's the current state of this task:

Fri, Sep 20, 5:40 PM · Traffic, serviceops, Prod-Kubernetes, Kubernetes
JMeybohm updated the task description for T362408: Migration to containerd and away from docker.
Fri, Sep 20, 3:00 PM · Prod-Kubernetes, Kubernetes, serviceops
gerritbot added a comment to T374366: Race condition in iptables rules during puppet runs on k8s nodes.

Change #1074404 merged by JMeybohm:

[operations/puppet@production] ferm: Use ferm-status to restart ferm on wikikube-staging

https://gerrit.wikimedia.org/r/1074404

Fri, Sep 20, 1:15 PM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
JMeybohm added a comment to T374729: Use kafka-main-[eqiad|codfw].external-services.svc.cluster.local to discover kafka brokers in kafka client running in k8s.

As no better ideas surfaced, I'll add the external-services DNS name to the SAN of the kafka broker certificates as it seems the easiest and most automated way of doing this.

Fri, Sep 20, 12:54 PM · Patch-For-Review, Prod-Kubernetes, Kubernetes, Discovery-Search (Current work), serviceops
gerritbot added a project to T374729: Use kafka-main-[eqiad|codfw].external-services.svc.cluster.local to discover kafka brokers in kafka client running in k8s: Patch-For-Review.
Fri, Sep 20, 12:50 PM · Patch-For-Review, Prod-Kubernetes, Kubernetes, Discovery-Search (Current work), serviceops
gerritbot added a comment to T374729: Use kafka-main-[eqiad|codfw].external-services.svc.cluster.local to discover kafka brokers in kafka client running in k8s.

Change #1074411 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] kafka::broker: Add the external-services DNS name to the certs

https://gerrit.wikimedia.org/r/1074411

Fri, Sep 20, 12:50 PM · Patch-For-Review, Prod-Kubernetes, Kubernetes, Discovery-Search (Current work), serviceops
gerritbot added a comment to T374366: Race condition in iptables rules during puppet runs on k8s nodes.

Change #1074405 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] ferm: Make reload via ferm-status the default

https://gerrit.wikimedia.org/r/1074405

Fri, Sep 20, 12:38 PM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a project to T374366: Race condition in iptables rules during puppet runs on k8s nodes: Patch-For-Review.
Fri, Sep 20, 12:35 PM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T374366: Race condition in iptables rules during puppet runs on k8s nodes.

Change #1074404 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] ferm: Use ferm-status to restart ferm on wikikube-staging

https://gerrit.wikimedia.org/r/1074404

Fri, Sep 20, 12:35 PM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
Maintenance_bot removed a project from T374366: Race condition in iptables rules during puppet runs on k8s nodes: Patch-For-Review.
Fri, Sep 20, 9:31 AM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T374366: Race condition in iptables rules during puppet runs on k8s nodes.

Change #1074371 merged by JMeybohm:

[operations/puppet@production] ferm: Fix systemd override to not append ExecReload

https://gerrit.wikimedia.org/r/1074371

Fri, Sep 20, 8:51 AM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T374366: Race condition in iptables rules during puppet runs on k8s nodes.

Change #1074371 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] ferm: Fix systemd override to not append ExecReload

https://gerrit.wikimedia.org/r/1074371

Fri, Sep 20, 8:48 AM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T374366: Race condition in iptables rules during puppet runs on k8s nodes.

Change #1074185 merged by JMeybohm:

[operations/puppet@production] ferm: Use ferm-status to start ferm on diffs

https://gerrit.wikimedia.org/r/1074185

Fri, Sep 20, 8:39 AM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops

Thu, Sep 19

gerritbot added a comment to T373195: Migrate Search Platform-owned helm charts to Calico Network Policies.

Change #1072597 abandoned by Bking:

[operations/deployment-charts@master] rdf-streaming-updater: trigger a savepoint before firewall changes

Reason:

Successfully migrated without savepoint

https://gerrit.wikimedia.org/r/1072597

Thu, Sep 19, 2:31 PM · Patch-For-Review, Data-Platform-SRE (2024.09.06 - 2024.09.27), Prod-Kubernetes, Kubernetes, serviceops
bking updated the task description for T373195: Migrate Search Platform-owned helm charts to Calico Network Policies.
Thu, Sep 19, 2:31 PM · Patch-For-Review, Data-Platform-SRE (2024.09.06 - 2024.09.27), Prod-Kubernetes, Kubernetes, serviceops
JMeybohm added a comment to T374366: Race condition in iptables rules during puppet runs on k8s nodes.

Fixing ferm_status.py is still not enough. When puppet reverts an on-disk ferm config change (one that has not been applied to iptables) back to the previous state, it still reloads ferm even though ferm-status returns 0. Also, the confd-related code (requestctl rules) was restarting ferm via systemctl directly, bypassing the puppet service hack completely.

Thu, Sep 19, 2:19 PM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a project to T374366: Race condition in iptables rules during puppet runs on k8s nodes: Patch-For-Review.
Thu, Sep 19, 2:00 PM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T374366: Race condition in iptables rules during puppet runs on k8s nodes.

Change #1074185 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] ferm: Use ferm-status to start ferm on diffs

https://gerrit.wikimedia.org/r/1074185

Thu, Sep 19, 2:00 PM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
Maintenance_bot removed a project from T374366: Race condition in iptables rules during puppet runs on k8s nodes: Patch-For-Review.
Thu, Sep 19, 1:30 PM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T373195: Migrate Search Platform-owned helm charts to Calico Network Policies.

Change #1073842 abandoned by Bking:

[operations/deployment-charts@master] rdf-streaming-updater: remove references to old-style network policies

Reason:

we already did this in I6a040e53d9fb21b6d0f6cae6b3c9fa9ef64633c6

https://gerrit.wikimedia.org/r/1073842

Thu, Sep 19, 1:19 PM · Patch-For-Review, Data-Platform-SRE (2024.09.06 - 2024.09.27), Prod-Kubernetes, Kubernetes, serviceops
gerritbot added a comment to T374366: Race condition in iptables rules during puppet runs on k8s nodes.

Change #1074155 merged by JMeybohm:

[operations/puppet@production] profile::firewall: Absent confd config when it is disabled

https://gerrit.wikimedia.org/r/1074155

Thu, Sep 19, 1:15 PM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a project to T374366: Race condition in iptables rules during puppet runs on k8s nodes: Patch-For-Review.
Thu, Sep 19, 12:00 PM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T374366: Race condition in iptables rules during puppet runs on k8s nodes.

Change #1074155 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] profile::firewall: Absent confd config when it is disabled

https://gerrit.wikimedia.org/r/1074155

Thu, Sep 19, 12:00 PM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
Maintenance_bot removed a project from T374366: Race condition in iptables rules during puppet runs on k8s nodes: Patch-For-Review.
Thu, Sep 19, 11:31 AM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T374366: Race condition in iptables rules during puppet runs on k8s nodes.

Change #1074113 merged by JMeybohm:

[operations/puppet@production] ferm: Allow to specify a different ferm-status command to use

https://gerrit.wikimedia.org/r/1074113

Thu, Sep 19, 11:29 AM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T353464: Migrate wikikube control planes to hardware nodes.

Change #1073857 merged by JMeybohm:

[operations/puppet@production] wikikube: Remove remaining hiera files and role for non stacked masters

https://gerrit.wikimedia.org/r/1073857

Thu, Sep 19, 9:41 AM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes
gerritbot added a comment to T374366: Race condition in iptables rules during puppet runs on k8s nodes.

Change #1073859 merged by JMeybohm:

[operations/puppet@production] wikikube: Disable requestctl ferm rules and definitions

https://gerrit.wikimedia.org/r/1073859

Thu, Sep 19, 9:41 AM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T374366: Race condition in iptables rules during puppet runs on k8s nodes.

Change #1074113 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] ferm: Allow to specify a different ferm-status command to use

https://gerrit.wikimedia.org/r/1074113

Thu, Sep 19, 9:13 AM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T374366: Race condition in iptables rules during puppet runs on k8s nodes.

Change #1073760 merged by JMeybohm:

[operations/puppet@production] Fix ferm_status to actually compare rules

https://gerrit.wikimedia.org/r/1073760

Thu, Sep 19, 8:39 AM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T373195: Migrate Search Platform-owned helm charts to Calico Network Policies.

Change #1074091 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/deployment-charts@master] cirrus-streaming-updater: disable legacy network policies for kafka

https://gerrit.wikimedia.org/r/1074091

Thu, Sep 19, 7:52 AM · Patch-For-Review, Data-Platform-SRE (2024.09.06 - 2024.09.27), Prod-Kubernetes, Kubernetes, serviceops
gerritbot added a comment to T373195: Migrate Search Platform-owned helm charts to Calico Network Policies.

Change #1074090 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/deployment-charts@master] cirrus-streaming-update: enable calico network policies

https://gerrit.wikimedia.org/r/1074090

Thu, Sep 19, 7:52 AM · Patch-For-Review, Data-Platform-SRE (2024.09.06 - 2024.09.27), Prod-Kubernetes, Kubernetes, serviceops

Wed, Sep 18

gerritbot added a comment to T374366: Race condition in iptables rules during puppet runs on k8s nodes.

Change #1073859 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] wikikube: Disable requestctl ferm rules and definitions

https://gerrit.wikimedia.org/r/1073859

Wed, Sep 18, 5:44 PM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T353464: Migrate wikikube control planes to hardware nodes.

Change #1073857 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] wikikube: Remove remaining hiera files and role for non stacked masters

https://gerrit.wikimedia.org/r/1073857

Wed, Sep 18, 5:40 PM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes
gerritbot added a comment to T373195: Migrate Search Platform-owned helm charts to Calico Network Policies.

Change #1072243 merged by Bking:

[operations/deployment-charts@master] rdf-streaming-updater: switch to calico-based network policies

https://gerrit.wikimedia.org/r/1072243

Wed, Sep 18, 4:10 PM · Patch-For-Review, Data-Platform-SRE (2024.09.06 - 2024.09.27), Prod-Kubernetes, Kubernetes, serviceops
gerritbot added a comment to T373195: Migrate Search Platform-owned helm charts to Calico Network Policies.

Change #1073842 had a related patch set uploaded (by Bking; author: Bking):

[operations/deployment-charts@master] rdf-streaming-updater: remove references to old-style network policies

https://gerrit.wikimedia.org/r/1073842

Wed, Sep 18, 4:05 PM · Patch-For-Review, Data-Platform-SRE (2024.09.06 - 2024.09.27), Prod-Kubernetes, Kubernetes, serviceops
gerritbot added a comment to T373195: Migrate Search Platform-owned helm charts to Calico Network Policies.

Change #1072236 merged by jenkins-bot:

[operations/deployment-charts@master] flink-app: customize calico label selector

https://gerrit.wikimedia.org/r/1072236

Wed, Sep 18, 3:03 PM · Patch-For-Review, Data-Platform-SRE (2024.09.06 - 2024.09.27), Prod-Kubernetes, Kubernetes, serviceops
isarantopoulos moved T369493: Migrate ml-staging/ml-serve clusters off of Pod Security Policies from Unsorted to Backlog/SRE on the Machine-Learning-Team board.
Wed, Sep 18, 2:26 PM · Machine-Learning-Team, Kubernetes
gerritbot added a project to T374366: Race condition in iptables rules during puppet runs on k8s nodes: Patch-For-Review.
Wed, Sep 18, 11:16 AM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T374366: Race condition in iptables rules during puppet runs on k8s nodes.

Change #1073760 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] Fix ferm_status to actually compare rules

https://gerrit.wikimedia.org/r/1073760

Wed, Sep 18, 11:16 AM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops

Tue, Sep 17

dcausse added a comment to T374729: Use kafka-main-[eqiad|codfw].external-services.svc.cluster.local to discover kafka brokers in kafka client running in k8s.

In the RFC I read

In some cases, the URI is specified as an IP address rather than a
hostname. In this case, the iPAddress subjectAltName must be present
in the certificate and must exactly match the IP in the URI.
Tue, Sep 17, 7:45 PM · Patch-For-Review, Prod-Kubernetes, Kubernetes, Discovery-Search (Current work), serviceops
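
For illustration, the matching rule in the RFC passage quoted above can be sketched roughly as follows. This is a simplified sketch, not the logic of any particular TLS library: the function name `san_matches` and the example host names and addresses are hypothetical, and wildcard dNSName entries are deliberately left out.

```python
import ipaddress


def san_matches(target, dns_names, ip_addresses):
    """Return True if target (a hostname or an IP literal) is covered by the SAN entries."""
    try:
        target_ip = ipaddress.ip_address(target)
    except ValueError:
        target_ip = None

    if target_ip is not None:
        # An IP target is compared only against iPAddress entries, and must match exactly.
        return any(ipaddress.ip_address(ip) == target_ip for ip in ip_addresses)

    # A hostname target is compared against dNSName entries, case-insensitively.
    # Wildcard handling is omitted in this sketch.
    wanted = target.rstrip(".").lower()
    return any(name.rstrip(".").lower() == wanted for name in dns_names)


# Hypothetical certificate contents: one broker hostname and one broker IP.
cert_dns = ["broker1.example"]
cert_ips = ["192.0.2.10"]

print(san_matches("192.0.2.10", cert_dns, cert_ips))            # IP listed in the SAN
print(san_matches("kafka.services.example", cert_dns, cert_ips))  # alias not listed in the SAN
```

Under these assumptions, a client that connects through a service alias only passes the check once that alias is itself listed as a dNSName entry, or if it validates against an IP that is listed as an iPAddress entry.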
JMeybohm added a comment to T374729: Use kafka-main-[eqiad|codfw].external-services.svc.cluster.local to discover kafka brokers in kafka client running in k8s.

It does not feel right to bake the external-services stuff into the kafka certificates. But if there is no option to make the library use the IPs for validation, that's probably the only way to go.

Tue, Sep 17, 3:37 PM · Patch-For-Review, Prod-Kubernetes, Kubernetes, Discovery-Search (Current work), serviceops
dcausse reopened T374729: Use kafka-main-[eqiad|codfw].external-services.svc.cluster.local to discover kafka brokers in kafka client running in k8s as "Open".

Thanks for looking into this!
Now failing with javax.net.ssl.SSLHandshakeException: No subject alternative DNS name matching kafka-main-eqiad.external-services.svc.cluster.local found use_all_dns_ips and now it seems that it wants to validate the hostname passed via bootstrap.servers...
I'll investigate more to see if there are other options. If we fail to work around this, do you think it would be acceptable to add kafka-main-eqiad.external-services.svc.cluster.local as a valid alternative name in the cert?

Tue, Sep 17, 1:52 PM · Patch-For-Review, Prod-Kubernetes, Kubernetes, Discovery-Search (Current work), serviceops
JMeybohm closed T374729: Use kafka-main-[eqiad|codfw].external-services.svc.cluster.local to discover kafka brokers in kafka client running in k8s as Resolved.

Bummer...looks like we need to fix our DNS config in kubernetes then.

Tue, Sep 17, 1:28 PM · Patch-For-Review, Prod-Kubernetes, Kubernetes, Discovery-Search (Current work), serviceops
Maintenance_bot removed a project from T374729: Use kafka-main-[eqiad|codfw].external-services.svc.cluster.local to discover kafka brokers in kafka client running in k8s: Patch-For-Review.
Tue, Sep 17, 10:30 AM · Patch-For-Review, Prod-Kubernetes, Kubernetes, Discovery-Search (Current work), serviceops
gerritbot added a comment to T374729: Use kafka-main-[eqiad|codfw].external-services.svc.cluster.local to discover kafka brokers in kafka client running in k8s.

Change #1073402 merged by JMeybohm:

[operations/puppet@production] kafka::broker: Populate cert SAN with hostname and IPs

https://gerrit.wikimedia.org/r/1073402

Tue, Sep 17, 9:46 AM · Patch-For-Review, Prod-Kubernetes, Kubernetes, Discovery-Search (Current work), serviceops
Maintenance_bot removed a project from T374366: Race condition in iptables rules during puppet runs on k8s nodes: Patch-For-Review.
Tue, Sep 17, 9:31 AM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
JMeybohm added a comment to T374366: Race condition in iptables rules during puppet runs on k8s nodes.

With 1073233 merged, ferm is correctly reloaded (not stopped/started) on notify:

Tue, Sep 17, 9:25 AM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T374366: Race condition in iptables rules during puppet runs on k8s nodes.

Change #1073233 merged by JMeybohm:

[operations/puppet@production] Don't restart(stop,start) ferm on puppet notify, use reload instead

https://gerrit.wikimedia.org/r/1073233

Tue, Sep 17, 9:08 AM · Patch-For-Review, Infrastructure-Foundations, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a project to T374729: Use kafka-main-[eqiad|codfw].external-services.svc.cluster.local to discover kafka brokers in kafka client running in k8s: Patch-For-Review.
Tue, Sep 17, 8:58 AM · Patch-For-Review, Prod-Kubernetes, Kubernetes, Discovery-Search (Current work), serviceops
gerritbot added a comment to T374729: Use kafka-main-[eqiad|codfw].external-services.svc.cluster.local to discover kafka brokers in kafka client running in k8s.

Change #1073402 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] kafka::broker: Populate cert SAN with hostname and IPs

https://gerrit.wikimedia.org/r/1073402

Tue, Sep 17, 8:58 AM · Patch-For-Review, Prod-Kubernetes, Kubernetes, Discovery-Search (Current work), serviceops