Add stream metadata #477

Closed
cgwalters opened this issue Dec 18, 2020 · 10 comments

@cgwalters
Member

cgwalters commented Dec 18, 2020

Moving a subthread from coreos/fedora-coreos-tracker#98 (comment), which is part of openshift/enhancements#201, here.

Basically let's add https://mirror.openshift.com/dependencies/rhcos/stream-4.8.json

(Note that stream metadata contains multiple architectures by design, so it's not under the existing arch-dependent location).

This could then also be used by coreos-installer download -o rhcos -s rhcos-4.8, and we'd also update all the UPI documentation to refer to it instead of e.g. hardcoding AMIs in the documentation.
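To illustrate how UPI documentation or tooling could look up an AMI from stream metadata rather than hardcoding it, here is a minimal Python sketch. It assumes the RHCOS stream would use the same layout as FCOS stream metadata (`architectures.<arch>.images.aws.regions.<region>.image`); the sample document, release string, and AMI ID are placeholders, not real values.

```python
import json

# Placeholder document shaped like FCOS stream metadata; every value is made up.
SAMPLE_STREAM = json.loads("""
{
  "stream": "rhcos-4.8",
  "architectures": {
    "x86_64": {
      "images": {
        "aws": {
          "regions": {
            "us-east-1": {
              "release": "48.0.0-example",
              "image": "ami-00000000000000000"
            }
          }
        }
      }
    }
  }
}
""")


def aws_image(stream, arch, region):
    """Return the AMI ID for (arch, region); raise KeyError with a readable message."""
    try:
        regions = stream["architectures"][arch]["images"]["aws"]["regions"]
        return regions[region]["image"]
    except KeyError as err:
        raise KeyError(f"no AWS image for arch={arch} region={region}") from err


print(aws_image(SAMPLE_STREAM, "x86_64", "us-east-1"))
```

Because one document covers every architecture, the same lookup works for other architectures by changing only the `arch` argument.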

Now, we have the option to move this data out of openshift/installer entirely; in connected installs we just download it dynamically, in disconnected installs we require the user to mirror it. I'm on the fence about this; perhaps too big of a change for 4.8.

Proposal: mirror.openshift.com path

  • Introduce github.com/openshift/rhel-coreos-bootimage that contains a single stream.json file with multiple branches.
  • Add an initially manual flow to submit a PR there to bump it, which works just like bumping openshift/installer would (e.g. with CI hooked up)
  • Add an ART process that scrapes this JSON from git and syncs it to mirror.openshift.com
  • Initially also do a manual PR to update openshift/installer

The major disadvantage of this is that we would have bootimages in two places, and the potential for divergence between IPI and UPI. But, I am hopeful that if we streamline CI for rhel-coreos-bootimage we can just wave through sync PRs to update openshift/installer.

It also slightly breaks the "hermetic" nature of the current (openshift-install, release image) tuple because there's another input which can change over time not versioned with those. Perhaps a middle ground compromise is that we add versioned stream metadata, i.e. https://mirror.openshift.com/dependencies/rhcos/stream-4.8-20200325.json and openshift-install has a pinned copy of it.

Proposal: Embed in release image path

  • Introduce github.com/openshift/rhel-coreos-bootimage that contains a single stream.json file with multiple branches.
  • Add a Dockerfile which puts this JSON in a container image
  • Add that container image to the release image
  • Change openshift/installer to pull and extract that piece of the release image

Advantages: Makes it more obvious how to get this data "in cluster" for eventual in-cluster updates. Versioned with the release image naturally.
Disadvantages: Nontrivial change to openshift/installer because it has to pull the release image client side (but maybe openshift-install can just require oc to be present and fork off oc image extract?)
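As a rough sketch of the "fork off oc image extract" idea, the snippet below only builds the command line; it does not run it. The image pullspec and the in-image path of the stream file are hypothetical, since this thread does not define where the JSON would live inside the container image.

```python
import shlex


def oc_image_extract_argv(image, src_path, dest_dir):
    """Build the argv for `oc image extract`, copying src_path from the image into dest_dir."""
    return ["oc", "image", "extract", image, "--path", f"{src_path}:{dest_dir}"]


# Hypothetical pullspec and in-image path, for illustration only.
argv = oc_image_extract_argv(
    "registry.example.com/ocp/rhel-coreos-bootimage:4.8",
    "/coreos/stream.json",
    "/tmp/bootimage",
)
print(shlex.join(argv))
```

A caller would pass `argv` to `subprocess.run(argv, check=True)` and then read the extracted file from `dest_dir`.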

@cgwalters
Member Author

For the implementation of this, I think the simplest is probably to directly go from the existing RHCOS cosa metadata and every time we do a build that generates bootimages, update the stream immediately at bootimages.svc. The mirror.openshift.com version would derive from the installer-pinned version.

@bgilbert
Contributor

I think the simplest is probably to directly go from the existing RHCOS cosa metadata

To be clear, is the proposal that the stream metadata would be exactly the meta.json? I'd be in favor of using an explicitly designed schema, either the FCOS stream metadata schema or a new one.

@cgwalters
Member Author

Oh no, the idea is that it looks exactly the same as the existing FCOS stream metadata; why would it differ? I'm just saying that we don't create an intermediate release.json, i.e. the translation is just cosa meta.json ➡️ stream, right?

@bgilbert
Contributor

Ah, okay. As an implementation detail, it might make sense to internally generate an intermediate release.json so we can reuse existing tooling.
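To make the direct "cosa meta.json to stream" translation concrete, here is a minimal Python sketch covering only AWS images. The input field names (`buildid`, `coreos-assembler.basearch`, and `amis` entries with `name` and `hvm`) are assumptions about the cosa meta.json layout, and the sample values are placeholders; this is not the actual coreos-assembler implementation.

```python
import json


def meta_to_stream(stream_name, metas):
    """Fold one cosa-style meta.json dict per architecture into a single stream dict."""
    stream = {"stream": stream_name, "architectures": {}}
    for meta in metas:
        arch = meta["coreos-assembler.basearch"]
        # Each AMI entry becomes one region record carrying the build ID as the release.
        regions = {
            ami["name"]: {"release": meta["buildid"], "image": ami["hvm"]}
            for ami in meta.get("amis", [])
        }
        arch_entry = stream["architectures"].setdefault(arch, {"images": {}})
        if regions:
            arch_entry["images"]["aws"] = {"regions": regions}
    return stream


# Placeholder build metadata; the build ID and AMI ID are made up.
sample_meta = {
    "buildid": "48.0.0-example",
    "coreos-assembler.basearch": "x86_64",
    "amis": [{"name": "us-east-1", "hvm": "ami-00000000000000000"}],
}
result = meta_to_stream("rhcos-4.8", [sample_meta])
print(json.dumps(result, indent=2))
```

Passing one meta.json per architecture in `metas` yields a single multi-architecture document, which is the property the stream format is meant to provide.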

cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jan 13, 2021
Part of implementing openshift/os#477

For now I decided to go directly from cosa to a stream because:

 - The cosa2release code is in a separate repo in Python
 - The RHCOS pipeline currently generates "one big build" anyways
cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jan 14, 2021
This is a straight up import of almost-exactly the contents
of https://github.com/coreos/fedora-coreos-releng-automation/blob/master/coreos-meta-translator/trans.py

Prep for adding more stream tooling into coreos-assembler as
part of openshift/os#477
openshift-merge-robot pushed a commit to coreos/coreos-assembler that referenced this issue Jan 14, 2021
This is a straight up import of almost-exactly the contents
of https://github.com/coreos/fedora-coreos-releng-automation/blob/master/coreos-meta-translator/trans.py

Prep for adding more stream tooling into coreos-assembler as
part of openshift/os#477
@cgwalters
Member Author

cgwalters commented Jan 15, 2021

That said, openshift/enhancements#201 calls for having this as part of the release image. Though...hmm, perhaps the simplest thing is that the installer uploads it as a configmap into the cluster. Now if we decide that this will ultimately be the MCO's job, perhaps it becomes oc -n machine-config-operator configmap/bootimages? Or it could go in oc -n openshift configmap/coreos-bootimages?
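As a sketch of what the installer-uploaded configmap might look like, the Python below wraps a stream document in a ConfigMap manifest. The namespace and name follow the second option floated above (`openshift` / `coreos-bootimages`); the `stream` data key and the sample stream are assumptions made for illustration.

```python
import json


def bootimages_configmap(stream, namespace="openshift", name="coreos-bootimages"):
    """Wrap a stream metadata dict in a ConfigMap manifest (as a plain dict)."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"namespace": namespace, "name": name},
        # ConfigMap data values must be strings, so the stream is serialized to JSON.
        # The key name "stream" is hypothetical.
        "data": {"stream": json.dumps(stream, sort_keys=True)},
    }


manifest = bootimages_configmap({"stream": "rhcos-4.8", "architectures": {}})
print(json.dumps(manifest, indent=2))
```

The printed manifest is plain JSON, so it could be saved to a file and applied with `oc apply -f`.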

But a problem is that doesn't make it easy to get from oc for UPI installs in a way that reliably picks up the same installer-pinned version...which we might want for the "simulate UPI in CI" case. But some UPI installs are going to want to access the mirror.openshift.com data (particularly bare metal), so we could say even e.g. AWS UPI installs should get the AMIs from there, and we do the same for our CI.

But if we do want to implement the oc adm release info --bootimages or whatever...:

"oc adm release new scrapes openshift-install"

First, something like openshift/installer#4102
Then next, we change oc adm release new to explicitly scrape the data out of the installer image and add it to the image it generates (so it's convenient to download). Or we could even try to compress it into container image metadata for the image.

@LorbusChris
Member

@cgwalters one reason it seemed enticing to me to store the bootimage refs in the payload as opposed to the installer repo was that they wouldn't have to be maintained in the installer repo's rhcos.json (and fcos.json for OKD) in the future.

I nonetheless think having a configmap for the bootimages does make a lot of sense, but why not scrape that data off the payload?

cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jan 15, 2021
Part of implementing openshift/os#477

I don't think for RHCOS (at least initially) we will go
through the whole "release.json" middle ground, so let's
add a command to directly turn 1 or more cosa build(s) into a stream
JSON.

Because the "cosa build" -> "release" bits are in Python,
we semi-hackily fork it off as a subprocess.  But, it works.
@cgwalters
Member Author

@cgwalters one reason it seemed enticing to me to store the bootimage refs in the payload as opposed to the installer repo was that they wouldn't have to be maintained in the installer repo's rhcos.json (and fcos.json for OKD) in the future.

But the core problem there is the bootstrap node. Unless we change the installer to pivot in all cases and decide to accept the potential skew with the "in cluster bootimages", the installer needs to pin.

Honestly, long term the way I think things should work is basically that if you run openshift-install outside of the target cloud, it can spin up a traditional RHEL cloud image (if available, or RHCOS, or FCOS, or heck even anything that can run podman) and use that as a "bootstrap bootstrap node" that then e.g. downloads the release image and (short term) spins up e.g. an RHCOS bootstrap node and proceeds from there. Longer term, if we made some of the bootstrap logic run inside a containerized kubelet, then we'd only have one bootstrap node.

cgwalters added a commit to cgwalters/coreos-assembler that referenced this issue Jan 19, 2021
Part of implementing openshift/os#477

I don't think for RHCOS (at least initially) we will go
through the whole "release.json" middle ground, so let's
add a command to directly turn 1 or more cosa build(s) into a stream
JSON.

Because the "cosa build" -> "release" bits are in Python,
we semi-hackily fork it off as a subprocess.  But, it works.
openshift-merge-robot pushed a commit to coreos/coreos-assembler that referenced this issue Jan 20, 2021
Part of implementing openshift/os#477

I don't think for RHCOS (at least initially) we will go
through the whole "release.json" middle ground, so let's
add a command to directly turn 1 or more cosa build(s) into a stream
JSON.

Because the "cosa build" -> "release" bits are in Python,
we semi-hackily fork it off as a subprocess.  But, it works.
@cgwalters
Member Author

openshift/installer#4576

cgwalters added a commit to cgwalters/release that referenced this issue Mar 1, 2021
This is part of adapting RHCOS to use the same stream metadata JSON format
that FCOS uses:

openshift/os#477

This change specifically will pair with a pending change in:
openshift/installer#4582
to embed CoreOS stream metadata in the installer git instead
of the undocumented ad-hoc JSON format that ships there now.
@cgwalters
Member Author

This is now openshift/enhancements#679

cgwalters added a commit to cgwalters/installer that referenced this issue Mar 24, 2021
This implements part of the plan from:
openshift/os#477

When we originally added the pinned RHCOS metadata `rhcos.json`
to the installer, we also changed the coreos-assembler `meta.json`
format into an arbitrary new format in the name of some cleanups.
In retrospect, this was a big mistake because we now have two
formats.

Then Fedora CoreOS appeared and added streams JSON as a public API.

We decided to unify on streams metadata; there's now a published
Go library for it: https://github.com/coreos/stream-metadata-go

Among other benefits, it is a single file that supports multiple
architectures.

UPI installs should now use stream metadata, particularly
to find public cloud images.  This is exposed via a new
`openshift-install coreos print-stream-json` command.

This is an important preparatory step for exposing this via
`oc` as well as having something in the cluster update to
it.

HOWEVER as a (really hopefully temporary) hack, we *duplicate*
the metadata so that IPI installs use the new stream format,
and UPI CI jobs can still use the old format (with different RHCOS versions).

We will port the UPI docs and CI jobs after this merges.

Co-authored-by: Matthew Staebler <staebler@redhat.com>
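
For UPI tooling, a script could consume the output of the `openshift-install coreos print-stream-json` command named in the commit message above. The Python sketch below assumes the command writes FCOS-style stream JSON to stdout; the `runner` parameter is a hypothetical hook so the parsing can be exercised without the installer binary present.

```python
import json
import subprocess


def load_stream(runner=subprocess.run):
    """Run `openshift-install coreos print-stream-json` and parse its stdout as JSON."""
    proc = runner(
        ["openshift-install", "coreos", "print-stream-json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(proc.stdout)


def platforms_by_arch(stream):
    """Map each architecture to the sorted list of platforms that have cloud images."""
    return {
        arch: sorted(data.get("images", {}))
        for arch, data in stream.get("architectures", {}).items()
    }
```

With the installer on `PATH`, `platforms_by_arch(load_stream())` would summarize which public cloud images the pinned stream provides for each architecture.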