I have encountered the following problem while using the GPU on the BIC system:
As some people might know, there is a machine with a Tesla K80 GPU, known as gpunode. I have started using it for our lab’s deep learning experiments, connecting directly to the node via SSH.
However, on 2017-04-10 at 06:07 PM I received the following email from the BIC system administrator, Sylvain Milot:
could you please start using gpu.q instead of using it interactively ?
actually I’m telling you … thanks!
should you need anything else, ask JF … I’m back on April 24th.
Since my computation task uses both GPU cores provided by the Tesla K80, I submitted a job to gpu.q using a parallel environment with two slots. The following morning I received the following email from the same system administrator:
a 2-node parallel environment on gpu.q, really ? Isn’t this a little greedy ? Unless you give me a good reason for this, aside from it being possible, I think we should disable parallel environments for that queue.
… and your job has been running for 8 hours already … must be a tough problem to solve!
Anyway I suppose you have until my return (April 24th) to play!
I replied that I needed to use both GPU nodes, and received the following email:
I think you’re missing the point Vlad.
This is in fact just one adapter with 2 GPUs and I have disabled parallel environments.
Also, at the time there were no jobs waiting in the queue to be executed, which I pointed out in my reply.
I then received the following response:
I understand that Vlad, but this is a limited ressource and I wan’t to give others the chance to use it, especially if you plan to run jobs which take days to run …
If you have money to spend, this server can house 6 adapters total, at ~ $5500 per adapter (Tesla K80)
Highlighted by me.
So, I would like to find out: is usage of the GPU considered a “premium” service provided by the BIC? Am I allowed to use it for my research?