There’s no inherent way to know exactly which client a validator is running. Researchers instead rely on indirect metrics to infer which client a validator is most likely operating, so no method can identify the client with 100% certainty.
Miga Labs - A crawler counts beacon nodes and records each node’s self-reported identity. Because validators sharing a node are counted only once, nodes with fewer validators have a disproportionately large influence on the estimate.
Rated - Methodology unknown.
Ethernodes - Methodology unknown.
execution-diversity.info - Through a social effort, execution-diversity.info (led by Sonic) gathers self-reported client breakdown data. Clientdiversity.org then takes the pool-related portion of that data and weights it by validator counts from Rated Network. While this doesn’t capture data on the entire network, the market share of the entities involved is substantial enough to be considered representative. Operator data is omitted because its overlap with the pool data is unknown.
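
The weighting step described for execution-diversity.info can be illustrated with a minimal sketch. It assumes each pool reports the fraction of its validators on each client and that a validator count is available per pool. The function name, the data layout, and all numbers below are hypothetical and chosen for illustration only; they are not the actual Clientdiversity.org pipeline or real network data.

```python
from collections import defaultdict


def weighted_client_share(pools):
    """Combine per-pool client breakdowns into one validator-weighted share.

    `pools` is a list of dicts (hypothetical layout) with:
      - "validators": number of validators the pool runs
      - "clients": mapping of client name -> fraction of that pool's validators
    """
    totals = defaultdict(float)
    for pool in pools:
        for client, fraction in pool["clients"].items():
            # A pool's contribution scales with its validator count.
            totals[client] += pool["validators"] * fraction

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {client: count / grand_total for client, count in totals.items()}


# Made-up example data, not real measurements.
pools = [
    {"name": "pool_a", "validators": 1000, "clients": {"geth": 0.5, "nethermind": 0.5}},
    {"name": "pool_b", "validators": 3000, "clients": {"geth": 1.0}},
]

shares = weighted_client_share(pools)
for client, share in sorted(shares.items()):
    print(f"{client}: {share:.1%}")
```

Weighting by validator count avoids the bias noted for node crawlers, where a node hosting many validators counts the same as a node hosting one. A plain average of the two pools above would put the second client at 25%, while the weighted figure is lower because the larger pool does not run it.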