How do idempotent runs work?

Background

tip

To learn about what idempotent runs are in Kurtosis and the motivation behind this feature, go here.

When running the kurtosis run command, you may see the following message printed:

SKIPPED - This instruction has already been run in this enclave

This happens because Kurtosis optimizes each run of a Starlark package based on what has already been run in a given enclave, which reduces execution time and resource usage.

This means when you try to run the exact same package twice in a row, Kurtosis will skip all the instructions for the second run because they were already executed in the first run.

info

This feature is still experimental and can be deactivated by adding the --experimental NO_INSTRUCTIONS_CACHING parameter to the kurtosis run command.

How it works

Definitions

The enclave plan is defined as the sequence of Starlark instructions that were previously executed inside a given enclave. The submitted plan is defined as the sequence of instructions generated by interpreting the package, before any of them is executed.

When running a Starlark package in a world without idempotent runs, all the instructions are naively executed inside the enclave and the new post-execution enclave plan is set to the concatenation of the previous enclave plan and the submitted plan.

To avoid re-running instructions that have already been run inside the enclave, Kurtosis tries to maximize the overlap between the beginning of the submitted plan and the tail end of the enclave plan. In the overlapping portion, if any, Kurtosis re-runs only the instructions that were updated. Then, if there are new instructions at the end of the submitted plan that were not in the enclave plan, they are executed as new instructions and added to the enclave plan.
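The overlap search described above can be sketched in Python. This is a simplified model under stated assumptions, not Kurtosis's actual implementation: instructions are represented as plain strings, two instructions match only when their strings are identical (instruction updates are ignored here), and the function names find_overlap and resolve_plans are hypothetical.

```python
def find_overlap(enclave_plan, submitted_plan):
    """Return the size of the largest overlap between the tail of the
    enclave plan and the head of the submitted plan (0 if disjoint)."""
    max_size = min(len(enclave_plan), len(submitted_plan))
    # Try the largest candidate first so the overlap is maximized.
    for size in range(max_size, 0, -1):
        if enclave_plan[-size:] == submitted_plan[:size]:
            return size
    return 0


def resolve_plans(enclave_plan, submitted_plan):
    """Split the submitted plan into skipped and executed instructions and
    compute the post-execution enclave plan."""
    size = find_overlap(enclave_plan, submitted_plan)
    skipped = submitted_plan[:size]    # already run in this enclave
    executed = submitted_plan[size:]   # new instructions, run and appended
    return skipped, executed, enclave_plan + executed


# Hypothetical plans, written as "<instruction> <target>" strings.
enclave = ["upload_file artifact_1", "add_service service_1", "add_service service_2"]
submitted = ["add_service service_1", "add_service service_2", "exec service_1"]
skipped, executed, new_enclave_plan = resolve_plans(enclave, submitted)
print("skipped:", skipped)
print("executed:", executed)
```

With a size of 0 the model degenerates to the naive behavior: nothing is skipped and the whole submitted plan is appended to the enclave plan.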

Instruction equality

To spot overlap between the enclave plan and the submitted plan, Kurtosis needs to compare instructions one by one. There are different levels of equality:

  • The submitted plan instruction is equal to the enclave plan instruction - the instructions are of the same type (e.g. two exec, two wait, two upload_file, etc.) and their sets of arguments are strictly identical.
  • The submitted plan instruction is an update of the enclave plan instruction - the instructions are of the same type but only a pre-defined subset of their arguments is identical. This only exists for certain instructions:
    • add_service instruction adding a service with the same name but a different ServiceConfig object will be considered as an update to the enclave plan instruction. The service will be restarted inside the enclave with the new service configuration.
    • upload_file instruction uploading a files artifact with the same name but different file contents will be considered as an update to the enclave plan instruction. The files artifact will be updated with the new contents inside the enclave.
    • render_template instruction creating a files artifact with the same name but different contents will be considered as an update to the enclave plan instruction. Similarly to upload_file, the contents of the files artifact will be updated inside the enclave.
    • store_service_file instruction creating a files artifact with the same name but either a different source path or a different service name will be considered as an update to the enclave plan instruction. Similarly to upload_file, the content of the files artifact will be updated inside the enclave.

Two instructions that do not fit into either of the two categories above are considered different (i.e. independent from each other).
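The three outcomes above (equal, update, different) can be illustrated with a small Python model. It is only a sketch under assumptions: the Instruction class, its fields, and the compare function are hypothetical, the "name" field stands for the service name or files artifact name that identifies the instruction, and the image tags are made up for the example.

```python
from dataclasses import dataclass, field

# Instruction types for which the text above defines an "update" relationship.
UPDATABLE_TYPES = {"add_service", "upload_file", "render_template", "store_service_file"}


@dataclass
class Instruction:
    type: str                                  # e.g. "add_service", "exec"
    name: str                                  # service or files artifact name
    args: dict = field(default_factory=dict)   # all remaining arguments


def compare(submitted, enclave):
    """Classify a submitted instruction against an enclave plan instruction."""
    if submitted.type != enclave.type:
        return "different"
    if submitted.name == enclave.name and submitted.args == enclave.args:
        return "equal"      # same type, strictly identical arguments
    if submitted.type in UPDATABLE_TYPES and submitted.name == enclave.name:
        return "update"     # same type and name, other arguments changed
    return "different"


old = Instruction("add_service", "service_1", {"image": "app:1.0"})
new = Instruction("add_service", "service_1", {"image": "app:2.0"})
print(compare(new, old))
```

In this model an exec whose command changed is classified as different rather than as an update, because exec is not in the list of updatable instruction types.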

Note that a few Kurtosis instructions are fundamentally incompatible with the concept of idempotency. Using any of them in a package makes the plans unresolvable, and Kurtosis falls back to the "naive" execution strategy of running the submitted plan on top of the enclave plan, without trying to overlap them. Those instructions currently are:

  • remove_service
  • start_service
  • stop_service
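The fallback rule can be expressed as a short check. This is a sketch, assuming that only the submitted plan is inspected and that instructions are identified by their type name. The function name choose_strategy and the "naive" / "overlap" labels are hypothetical.

```python
# Instruction types listed above as incompatible with idempotent runs.
NON_IDEMPOTENT_TYPES = {"remove_service", "start_service", "stop_service"}


def choose_strategy(submitted_instruction_types):
    """Return "naive" if any instruction prevents plan resolution,
    otherwise "overlap"."""
    for instruction_type in submitted_instruction_types:
        if instruction_type in NON_IDEMPOTENT_TYPES:
            return "naive"    # run the whole submitted plan as-is
    return "overlap"          # try to overlap with the enclave plan


print(choose_strategy(["add_service", "exec"]))
print(choose_strategy(["add_service", "stop_service", "exec"]))
```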

Instruction dependencies

Certain instructions depend on other ones, and with the concept of instruction update explained above comes the concept of dependency between instructions. It's easier to understand the concept with an example.

Let's consider a submitted plan with 2 instructions: an add_service adding service_1 and an exec on service_1. If the first add_service instruction is considered an update when running the package, service_1 will be updated and therefore restarted. In that case, even if the exec is equal to the matching instruction in the enclave plan it will be re-run because it runs on a component (service_1) that has been updated. It is said that the exec instruction depends on the add_service instruction.

Dependency relationships can be the following:

  • add_service instruction depends on the files artifacts mounted onto the service. If one of the files artifacts is updated, the add_service will be re-run.
  • exec instruction depends on the service on which it runs. If the service is updated, the exec will be re-run.
  • request instruction depends on the service on which it runs, similarly to exec.
  • store_service_file instruction depends on the service on which it runs, similarly to exec.
  • wait instruction depends on the service on which it runs, similarly to exec.
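The way an update propagates through these dependencies can be sketched as a single pass over the overlapping instructions. This is an illustrative model, not Kurtosis's implementation: the Step class and the steps_to_rerun function are hypothetical, and the sketch assumes that any re-run instruction marks the service or files artifact it produces as updated.

```python
from dataclasses import dataclass


@dataclass
class Step:
    label: str
    status: str               # "equal" or "update", from the comparison
    produces: str = ""        # service or files artifact created, if any
    depends_on: tuple = ()    # services or files artifacts it relies on


def steps_to_rerun(steps):
    """Return the labels of the steps that must be re-run, in plan order."""
    updated = set()   # components updated so far in this run
    rerun = []
    for step in steps:
        depends_on_updated = any(dep in updated for dep in step.depends_on)
        if step.status == "update" or depends_on_updated:
            rerun.append(step.label)
            if step.produces:
                updated.add(step.produces)
    return rerun


# An updated files artifact forces the service mounting it to be re-run,
# which in turn forces the exec on that service to be re-run.
plan = [
    Step("upload_file artifact_1", "update", produces="artifact_1"),
    Step("add_service service_1", "equal", produces="service_1", depends_on=("artifact_1",)),
    Step("add_service service_2", "equal", produces="service_2"),
    Step("exec on service_1", "equal", depends_on=("service_1",)),
    Step("wait on service_2", "equal", depends_on=("service_2",)),
]
print(steps_to_rerun(plan))
```

Steps that are equal and whose dependencies were left untouched (here, everything related to service_2) are not returned and would be skipped.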

Examples

Case of a submitted plan being disjoint from the enclave plan

No instruction gets skipped: all instructions from the submitted plan are executed and appended to the enclave plan.

disjoint-plans-v2.png

Case of a submitted plan partially overlapping the enclave plan

The first two add_service instructions from the submitted plan are equal to the last two instructions of the enclave plan. They are therefore skipped, and only the exec and store_service_files from the submitted plan are executed.

overlapping-plans-v2.png

Case of a submitted plan partially overlapping the enclave plan with instruction updates

The upload_file instruction is equal, so it will be skipped, as in the case explained above.

The add_service instruction from the submitted plan adding service service_1 is an update of the add_service instruction from the enclave plan adding service_1 (notice the *** in the diagram: the ServiceConfig object has been changed, for example to bump the container image version). It will therefore be re-run and service_1 will be updated inside the enclave.

The second add_service instruction from the submitted plan adding service service_2 is equal to the one from the enclave plan. It will be skipped.

The exec instruction from the submitted plan is equal to the one from the enclave plan. However, since it operates on service_1 and service_1 was updated in the submitted plan, this instruction will also be re-run.

The store_service_file from the submitted plan is equal to the one from the enclave plan, and the service on which it runs (service_2) was left intact in the submitted plan. It will therefore be skipped.

overlapping-plans-with-updates-v2.png