eph wrote: How would you say the current OpenCL port compares to CUDA speed? Does it have the potential to equal / surpass it in terms of performance?

In theory, a more direct route (Presto CPU or proprietary CUDA) should give better performance than OpenCL. Beyond that, until we have a stable build and tune it (i.e. optimize it as much as we did with the other APIs), the comparison will not be fair, so for now it is only a qualitative indication.
Best wishes