Although comprehensive benchmarks are commonly used to gauge the performance of a processing unit’s hardware components, this is often not a great idea, mainly because of the general limitations mentioned in my earlier blog post.
An alternative way to quickly estimate the potential and the likely performance limits of a given GPU under stress is to look at its raw GFLOPS score. Although it is not definitive, it is a good predictor and a great way to scope the potential of any given GPU.
It is quite straightforward to calculate the GFLOPS of a given GPU if you can obtain the number of shaders and the GPU core frequency. That is not always possible: some manufacturers withhold even the information about which GPU is being used, so in some cases the only solution is to literally google for the data.
Many informed folks will be quick to mention that other factors may play a huge role as well, including the bus clock speed, the amount of RAM available and so on. Still, as a general rule of thumb, the more GFLOPS a GPU can process, the better the outcome will be, and the improvement is likely to be roughly proportional.
The basic formula for obtaining the total number of shaders is very straightforward. The factors to take into consideration are the following:
- Number of multiply/add operations per ALU per clock cycle
- Number of SIMD (single instruction, multiple data) units per core
- Number of GPU cores (compute units)
With X as the number of shaders, the formula generally takes the following form:
X = number of compute units * 4 SIMD units per compute unit * number of ALUs per SIMD unit
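As a rough illustration, the shader count can be sketched in a few lines of Python. The function name `shader_count` and the example GPU (2 compute units, 16 ALUs per SIMD unit) are made up for this sketch and do not describe any real chip; the default of 4 SIMD units per compute unit is simply the value used in the formula above and may differ between architectures.

```python
def shader_count(compute_units: int, alus_per_simd: int, simd_per_cu: int = 4) -> int:
    """Return X, the total number of shaders (ALUs) on the GPU.

    simd_per_cu defaults to 4 only because the formula in the text uses 4;
    treat it as an assumption and override it for other architectures.
    """
    if min(compute_units, alus_per_simd, simd_per_cu) < 1:
        raise ValueError("all counts must be positive integers")
    return compute_units * simd_per_cu * alus_per_simd


# Hypothetical GPU with made-up numbers: 2 compute units, 16 ALUs per SIMD unit.
print(shader_count(compute_units=2, alus_per_simd=16))  # 2 * 4 * 16 = 128
```

Keeping the SIMD count as a parameter rather than a hard-coded constant makes it easy to re-run the calculation when a spec sheet lists a different layout.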
The total number of FLOPS is the result of multiplying the number of shaders by the number of operations per clock cycle and by the core clock speed (Y being the theoretical GFLOPS score, with the frequency expressed in GHz):
Y = X * 2 * frequency
ALUs that can process both a multiply and an add operation within a single clock cycle have their total FLOPS multiplied by two, which is where the factor of 2 comes from. In practice, this means that generally any SoC GPU you may encounter in everyday life will have its score multiplied by two. The resulting score Y is the theoretical peak score.
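The calculation of Y can be sketched the same way. This is only a minimal example under stated assumptions: the function name `peak_gflops` is made up, the clock is taken in MHz (as spec sheets often list it) and converted to GHz, and the example GPU with 128 shaders at 500 MHz is hypothetical.

```python
def peak_gflops(shaders: int, clock_mhz: float, ops_per_cycle: int = 2) -> float:
    """Return Y, the theoretical peak GFLOPS score.

    ops_per_cycle defaults to 2, assuming each ALU completes one multiply
    and one add per clock cycle. The clock is given in MHz and divided by
    1000 to get GHz, so the result comes out in GFLOPS.
    """
    if shaders < 1 or clock_mhz <= 0 or ops_per_cycle < 1:
        raise ValueError("shaders, clock_mhz and ops_per_cycle must be positive")
    return shaders * ops_per_cycle * clock_mhz / 1000.0


# Hypothetical GPU with made-up numbers: 128 shaders at 500 MHz.
print(peak_gflops(shaders=128, clock_mhz=500))  # 128 * 2 * 0.5 GHz = 128.0
```

Keep in mind that this is a theoretical ceiling; real workloads will land somewhere below it.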
Finding reliable information on all the necessary data is often hard, mainly because some manufacturers want to hide the "lack" of raw performance while relying on benchmark testing. Although no site is perfectly reliable, a number of hardware enthusiasts publish comprehensive scores for SoC GPUs on their sites, or at least list the core clock and the GPU GFLOPS score. To verify the data, you should check some basic information first: whether the number of cores, the SoC manufacturer and the clock speed match. Although not 100% accurate, the link below offers a comprehensive list of scores for various SoCs manufactured in previous years, including many of the newer SoCs:
Marketing schemes to mislead the users
Most users aren’t aware of these simple ways to quickly scope the raw potential and likely performance of their SoC, or of its GPU in particular, even though the GPU GFLOPS score is one of the crucial factors for image-processing tasks. This applies to single-board computers, mobile phones and any other device.
A common strategy is to emphasize the clock speed or the number of cores of a given GPU without releasing any information on its GFLOPS score. An even more common strategy is to disregard the GPU and focus solely on the CPU, again highlighting the number of CPU cores and their clock speed, or other factors that offer no easy way to properly scope the performance of any SoC’s CPU. I will cover this in more detail in another post, since the practice is much more pronounced in CPU performance marketing, whereas GPU marketing mostly relies on withholding any reliable data on the GFLOPS score of the SoC’s GPU.