Feedback to Software Carbon Intensity for AI Specification #128

@bgamazay

Misc comments:

  1. Life Cycle Stages should reference the stages already defined in ISO. The ones in the spec are very close, but I think aligning with the existing definitions makes the most sense.

  2. Design and Development stage:
    a. Explicitly call out that mid- and post-training are included in this phase (rather than just pre-training). This is a growing component.
    b. Explicitly call out that it covers not just the final training run but also any test runs. You mention the entire training duration, but there are often many training runs before the final one.

  3. Consumer vs Provider: I think in practice there is a lot of gray area in these definitions. I recommend checking out the EU AI Act definitions and aligning with what's already out there.

  4. Consumer Functional Unit: for LLMs I think Query should be the FU rather than Token. LLMs behave very differently from one another, especially with the rise of reasoning models, and the number of tokens they generate for the same query can differ massively from model to model. The query is really what the end user wants.

  5. Provider Functional Units: I don't think normalizing training by anything really makes sense or provides useful information. A more useful addition here would be guidance on how this phase can be properly attributed (amortized?) to the per-inference CO2 value.

  6. Examples: Have you done any real-world testing of this spec? Reality is much messier and more complicated than your examples suggest, and results vary widely depending on small decisions. For instance, you say "Carbon emitted of inference servers". Just that one line raises about 50 questions in my head, ranging from what's included in the energy portion (GPU, CPU, RAM, PUE, idle energy), to how carbon is calculated (location-based vs market-based, regional vs global), to how the model is configured (data precision, batching, quantization, etc.).
    I understand you are trying to be general, but choices like these create so much variability that results from this spec won't be comparable as currently written. I highly recommend integrating learnings from real-world implementations before proceeding.
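
To make points 4 to 6 more concrete, here is a minimal Python sketch of how the choices interact. It is an illustration only, not something taken from the spec: the names (`ServingWindow`, `gco2_per_query`, etc.) and every number are hypothetical, and it assumes the simplest possible model (IT energy times PUE times a single grid carbon-intensity factor, with training emissions spread evenly over an assumed lifetime query count).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServingWindow:
    """Totals measured over one serving period. All values are hypothetical."""
    it_energy_kwh: float      # assumed to cover GPU + CPU + RAM, including idle draw
    pue: float                # data-center power usage effectiveness
    grid_gco2_per_kwh: float  # carbon intensity; location- or market-based is a choice
    queries: int
    output_tokens: int


def operational_gco2(w: ServingWindow) -> float:
    """Operational emissions: IT energy scaled by PUE and grid intensity."""
    return w.it_energy_kwh * w.pue * w.grid_gco2_per_kwh


def amortized_training_gco2(total_training_gco2: float,
                            expected_lifetime_queries: int) -> float:
    """Spread training emissions (all runs, not only the final one) per query."""
    if expected_lifetime_queries <= 0:
        raise ValueError("expected_lifetime_queries must be positive")
    return total_training_gco2 / expected_lifetime_queries


def gco2_per_token(w: ServingWindow) -> float:
    return operational_gco2(w) / w.output_tokens


def gco2_per_query(w: ServingWindow, training_share_per_query: float = 0.0) -> float:
    return operational_gco2(w) / w.queries + training_share_per_query


# Two hypothetical models serving the same 10,000 queries on the same grid.
# The "reasoning" model emits 10x the tokens per query and uses 10x the energy.
standard = ServingWindow(it_energy_kwh=10, pue=1.2, grid_gco2_per_kwh=400,
                         queries=10_000, output_tokens=2_000_000)
reasoning = ServingWindow(it_energy_kwh=100, pue=1.2, grid_gco2_per_kwh=400,
                          queries=10_000, output_tokens=20_000_000)

# Hypothetical: 50 t CO2 across every training run, amortized over 100M queries.
training_share = amortized_training_gco2(50_000_000, 100_000_000)

for name, w in [("standard", standard), ("reasoning", reasoning)]:
    print(f"{name}: {gco2_per_token(w):.4f} g/token, "
          f"{gco2_per_query(w):.2f} g/query, "
          f"{gco2_per_query(w, training_share):.2f} g/query incl. training")
```

With these made-up inputs the two models look identical per token but differ by a factor of ten per query, which is the comparability problem with a Token FU. Each input (what counts as IT energy, which PUE, which carbon-intensity factor, what lifetime query count) is also a decision that would need to be pinned down for results to be comparable.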

Overall thought:
The European Committee for Standardization (CEN) is currently working (with leading experts) on a standard titled "Guidelines and metrics for the environmental impact of artificial intelligence systems and services", which goes into significant technical depth. ISO standardization is a goal of theirs, I believe, and I fear the space could become confused and fragmented if your spec beats them to the punch. Therefore, I discourage you from proceeding with ISO standardization of this spec as currently written, as it would confuse the industry and ultimately be detrimental to the greening of software. I still think the spec could be useful if it has real-world numbers to ground it.

Labels: question (Further information is requested)
