Agreed, but my point is that stating “x-core CPU, y-core GPU, z-core NPU” is basically non-information.
CPUs run general logical processing
GPUs run bulk integer/float matrix math
NPUs run low-precision matrix math for inference
I’d like to see the TOPS for each of those, instead of a “core count” that tells me nothing about actual performance. Even TOPS figures are only indicative… but they would be a good start.
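To illustrate why TOPS is only indicative: a peak-TOPS figure is usually just units × clock × ops-per-cycle, which says nothing about sustained throughput on real workloads. A minimal sketch, with entirely hypothetical NPU specs:

```python
def peak_tops(units, clock_ghz, macs_per_cycle_per_unit):
    """Theoretical peak TOPS from datasheet-style specs.

    One MAC (multiply-accumulate) counts as 2 ops, which is the
    convention most marketing TOPS numbers use. Result is in
    tera-operations per second.
    """
    return units * clock_ghz * macs_per_cycle_per_unit * 2 / 1000

# Hypothetical NPU: 16 compute units at 1.0 GHz,
# each doing 1024 INT8 MACs per cycle (assumed numbers)
print(peak_tops(16, 1.0, 1024))  # → 32.768
```

Note this is the ceiling under perfect utilization; memory bandwidth and precision (INT8 vs FP16) can move the achievable number far below it, which is exactly why TOPS alone is still orientative.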
Agreed! I’m just not sure TOPS is the right metric for a CPU, though, because the CPU data pipeline is so different from a GPU’s. Pipeline bubbles vs. clean instruction streams are one thing, but the dominant instruction type in a calculation also affects how many instructions can be retired per clock cycle pretty significantly, whereas in matrix-optimized silicon it’s a lot more fair to generalize over a bulk workload.
Generally, I think it’s fundamentally hard to produce a single number that represents CPU performance across different workloads.
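The single-number problem can be sketched concretely: benchmark suites in the SPEC style collapse per-workload speedups into one geometric mean, and that one number hides the workload-to-workload variance the post is talking about. The workload names and speedups below are made up for illustration:

```python
import statistics

# Hypothetical speedups of CPU A over CPU B on four workloads (assumed numbers)
speedups = {"compile": 1.4, "compress": 0.8, "crypto": 2.0, "browse": 1.0}

# A geometric mean collapses these into one headline number...
single_number = statistics.geometric_mean(speedups.values())
print(round(single_number, 3))

# ...but it hides that A is actually *slower* on compression
# while being 2x faster on crypto. Which CPU is "faster" depends
# entirely on what you run.
```

The aggregate lands around 1.22x, yet a compression-heavy user would see a regression, which is the point: no single scalar captures that.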