compjootery: Go Slow

Friday, 26 March 2010

Go Slow

Cores up to your ears, so stop writing single-threaded code! You can even take up multi-machine programming: you'll hardly notice the difference, and some things can really benefit from it. Imagine image editors grabbing spare cycles from machines on the LAN as required. It's here today and it's called Go.

In this entry I'm going to take someone's existing Go implementation of the smallpt global illumination renderer and parallelise it. I hope to demonstrate how easily CSP-based techniques go multi-threaded, even spanning machines!

The single-threaded Go version gives me:

% time ./smallpt 8
Rendering (8 spp) 100.00
721.50 user 5.08 system 12:06.84 elapsed 99%CPU

Single threaded C++ with -O3

% time ./small-sthread 8
Rendering (8 spp) 100.00%
61.76u 0.27s 62.04r     ./small-sthread 8

OpenMP Multi threaded Code, 2 cores + HT

% time ./small-omp 8
Rendering (8 spp) 100.00%
88.55u 0.90s 24.83r     ./small-omp 8

Oh dear, the Go version is amazingly piss poor: about 727 s elapsed against 62 s for the single-threaded C++. An order of magnitude to find!

OK, I added an I/O thread to gather the pixels:

time ./smallpt.io 8
Rendering (8 spp) 100.00
849.65 user 54.19 system 14:02.77 elapsed 107%CPU

And, not surprisingly, I added about 50 s of syscalls, some 10% overhead, but at least they seem to have ended up on one CPU.
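A minimal sketch of the gathering pattern, for illustration only: one collector goroutine owns the image buffer and receives finished pixels over a channel. The `pixel` type, the `shade` stand-in and the buffer size of 64 are assumptions made up for this example, not code from the renderer.

```go
package main

import "fmt"

// pixel is a hypothetical result record: an image index plus a grey value.
type pixel struct {
	index int
	value float64
}

// shade is a stand-in for the real radiance computation.
func shade(i int) float64 { return float64(i%256) / 255.0 }

// render computes n pixels in the calling goroutine and hands each one to a
// single collector goroutine, which is the only writer of the image buffer.
func render(n int) []float64 {
	img := make([]float64, n)
	results := make(chan pixel, 64) // buffer size is an arbitrary choice
	done := make(chan struct{})

	// Collector: drains the channel until it is closed, then signals done.
	go func() {
		for p := range results {
			img[p.index] = p.value
		}
		close(done)
	}()

	for i := 0; i < n; i++ {
		results <- pixel{index: i, value: shade(i)}
	}
	close(results) // no more pixels coming
	<-done         // wait until the collector has stored the last one
	return img
}

func main() {
	img := render(16)
	fmt.Println(len(img), img[15])
}
```

Every pixel costs one channel send and one receive, so per-pixel messaging like this is one plausible place for extra overhead to come from.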

Ultimate overload: 1 thread per pixel.

Crashed.

With a thread pool of 4 on a machine with 2 HT CPUs:

time ./smallpt 8
Submitting (8 spp) 100.00
1106.38 user 369.83 system 15:47.15 elapsed 155% CPU

This used more CPU time but the running time was much the same. All the SMP gain is used up by the overhead.
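For reference, a fixed pool of workers fed from a channel can be sketched as below. This is not the code timed above: `renderRow` is a made-up stand-in, and handing out whole rows rather than single pixels is an assumption chosen to keep channel traffic low.

```go
package main

import (
	"fmt"
	"sync"
)

// renderRow is a stand-in for tracing one scanline of the image.
func renderRow(y, width int) []float64 {
	row := make([]float64, width)
	for x := range row {
		row[x] = float64(x + y)
	}
	return row
}

// renderPool starts a fixed number of worker goroutines that pull row
// numbers from a channel until it is closed.
func renderPool(width, height, workers int) [][]float64 {
	img := make([][]float64, height)
	rows := make(chan int)
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for y := range rows {
				// Each row index is received by exactly one worker,
				// so no two goroutines write the same slot.
				img[y] = renderRow(y, width)
			}
		}()
	}

	for y := 0; y < height; y++ {
		rows <- y
	}
	close(rows) // lets the workers' range loops finish
	wg.Wait()
	return img
}

func main() {
	img := renderPool(8, 6, 4)
	fmt.Println(len(img), len(img[0]))
}
```

With a coarse unit of work the number of channel operations drops from one per pixel to one per row, which is the usual first thing to try when the pool's overhead eats the gain.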

I manually inlined the function calls, getting rid of the Vec methods:

% time ./smallpt.un 8
Rendering (8 spp) 100.00
273.82 user 5.62 system 4:40.05 elapsed 99% CPU

Shit, now that I've fixed it, it is still roughly 4x slower than the single-threaded C++ (280 s elapsed against 62 s)!
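To show what inlining the Vec methods means in practice, here is a small sketch. The `Vec` layout and the `Add` and `Scale` method names are assumptions for illustration; the renderer's own type may differ.

```go
package main

import "fmt"

// Vec is an assumed three-component vector, as used for points and colours.
type Vec struct{ X, Y, Z float64 }

// Add and Scale are the kind of tiny methods that were inlined by hand.
func (a Vec) Add(b Vec) Vec       { return Vec{a.X + b.X, a.Y + b.Y, a.Z + b.Z} }
func (a Vec) Scale(s float64) Vec { return Vec{a.X * s, a.Y * s, a.Z * s} }

// withMethods computes the point o + d*t through two method calls.
func withMethods(o, d Vec, t float64) Vec {
	return o.Add(d.Scale(t))
}

// inlined computes the same point with the arithmetic written out,
// so no calls are made and no temporary Vec is passed around.
func inlined(o, d Vec, t float64) Vec {
	return Vec{o.X + d.X*t, o.Y + d.Y*t, o.Z + d.Z*t}
}

func main() {
	o := Vec{1, 2, 3}
	d := Vec{0.5, 0.25, 2}
	fmt.Println(withMethods(o, d, 4), inlined(o, d, 4))
}
```

The hand-written form only pays off when the compiler does not inline small methods itself, so the size of the gain depends heavily on the toolchain version.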

One interesting question I have: do goroutines without a CPU of their own run in non-pre-emptive, co-operative mode, like Limbo? So if I spawn a goroutine to do slow I/O, will it be interleaved, or will it sit there doing nothing while the heavy CPU work hogs the processor?
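One way to probe that question is a small experiment, sketched here under loud assumptions: the scheduler is pinned to a single processor with `runtime.GOMAXPROCS(1)`, a sleeping goroutine stands in for slow I/O, and the `probe` helper is hypothetical. The count it returns depends on the Go version and the machine, so no particular value is expected.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// probe pins the scheduler to one processor, starts a goroutine that wakes
// up about once a millisecond (a stand-in for slow I/O), then spins on the
// CPU for roughly ms milliseconds. It returns how often the sleeper ran.
func probe(ms int) int64 {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev) // restore the previous setting

	var wakeups int64
	stop := make(chan struct{})
	go func() {
		for {
			select {
			case <-stop:
				return
			default:
				time.Sleep(time.Millisecond)
				atomic.AddInt64(&wakeups, 1)
			}
		}
	}()

	// CPU-bound busy loop that never blocks on a channel or on I/O.
	deadline := time.Now().Add(time.Duration(ms) * time.Millisecond)
	for time.Now().Before(deadline) {
	}

	close(stop)
	return atomic.LoadInt64(&wakeups)
}

func main() {
	fmt.Println("sleeper wakeups:", probe(50))
}
```

A count of zero would suggest the sleeper was starved while the busy loop held the processor; a positive count means the two were interleaved.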

Goroutine version:

time ./smallpt_gr.go 8
1012.35 user 454.21 system 16:21.66 elapsed 149% CPU

You'll also need the vector code, which has been moved out into its own library.
