
A similar question has already been asked, but in that case the method didn't work due to OpenCL compatibility. I have an older CUDA card that worked great with CUDALink in Mathematica 8:

CUDAInformation[]

{1 -> {"Name" -> "GeForce GTX 460", "Clock Rate" -> 1526000,
 "Compute Capabilities" -> 2.1, "GPU Overlap" -> 1, 
 "Maximum Block Dimensions" -> {1024, 1024, 64},
 "Maximum Grid Dimensions" -> {65535, 65535, 65535},
 "Maximum Threads Per Block" -> 1024,
 "Maximum Shared Memory Per Block" -> 49152,
 "Total Constant Memory" -> 65536, "Warp Size" -> 32, 
 "Maximum Pitch" -> 2147483647, "Maximum Registers Per Block" -> 32768,
 "Texture Alignment" -> 512, "Multiprocessor Count" -> 7,
 "Core Count" -> 224, "Execution Timeout" -> 1, "Integrated" -> False,
 "Can Map Host Memory" -> True, "Compute Mode" -> "Default", 
 "Texture1D Width" -> 65536, "Texture2D Width" -> 65536,
 "Texture2D Height" -> 65535, "Texture3D Width" -> 2048,
 "Texture3D Height" -> 2048, "Texture3D Depth" -> 2048,
 "Texture2D Array Width" -> 16384, 
 "Texture2D Array Height" -> 16384,
 "Texture2D Array Slices" -> 2048, "Surface Alignment" -> 512,
 "Concurrent Kernels" -> True, "ECC Enabled" -> False, 
 "TCC Enabled" -> False, "Total Memory" -> 805306368}}

When I run NetTrain in Mathematica 11 with TargetDevice -> "GPU", I receive:

Failure[⚠  Message: TargetDevice -> {GPU, 0} could not be used, please
  ensure that you have a compatible graphics card and have installed
  CUDA drivers.
  Tag: NetTrain
]
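
For reference, the failure shows up with a call of this general form (a minimal sketch with a toy classifier and synthetic data, not my actual network):

net = NetChain[{2, LogisticSigmoid, 3, SoftmaxLayer[]},
  "Input" -> {150}, "Output" -> NetDecoder[{"Class", {-1, 0, 1}}]];
(* synthetic training data, purely for illustration *)
data = Table[RandomReal[{-1, 1}, 150] -> RandomChoice[{-1, 0, 1}], {1000}];
(* this is the call that returns the Failure shown above *)
NetTrain[net, data, TargetDevice -> "GPU"]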

Any ideas why it won't use my GPU? I understand that this puny old card is laughable compared to some of the GPU computation setups you've been using, but I need to verify that this works and does what I need before I can justify an expensive upgrade.

UPDATE WITH ANOTHER GPU AND V11.1:

Tried running Mathematica 11.1 on a 64-bit Win7 laptop with two GPUs: an Intel integrated GPU and an NVIDIA Quadro K2100M discrete GPU (compute capability 3.0). The latest drivers are installed, the system has been restarted a number of times and set to use the discrete graphics card, the NVIDIA control panel shows Mathematica running on the GPU, and CUDALink functions/demos work fine:

CUDAInformation[]
{1 -> {"Name" -> "Quadro K2100M", "Clock Rate" -> 666500,
 "Compute Capabilities" -> 3., "GPU Overlap" -> 1,
 "Maximum Block Dimensions" -> {1024, 1024, 64},
 "Maximum Grid Dimensions" -> {2147483647, 65535, 65535},
 "Maximum Threads Per Block" -> 1024,
 "Maximum Shared Memory Per Block" -> 49152,
 "Total Constant Memory" -> 65536, "Warp Size" -> 32,
 "Maximum Pitch" -> 2147483647, "Maximum Registers Per Block" -> 65536,
 "Texture Alignment" -> 512, "Multiprocessor Count" -> 3,
 "Core Count" -> 96, "Execution Timeout" -> 1, "Integrated" -> False,
 "Can Map Host Memory" -> True, "Compute Mode" -> "Default",
 "Texture1D Width" -> 65536, "Texture2D Width" -> 65536,
 "Texture2D Height" -> 65536, "Texture3D Width" -> 4096,
 "Texture3D Height" -> 4096, "Texture3D Depth" -> 4096,
 "Texture2D Array Width" -> 16384, "Texture2D Array Height" -> 16384,
 "Texture2D Array Slices" -> 2048, "Surface Alignment" -> 512,
 "Concurrent Kernels" -> True, "ECC Enabled" -> False,
 "TCC Enabled" -> False, "Total Memory" -> 2147483648}}

CUDADriverVersion[]
368.39

NetTrain[] returns this message:

NetTrain::badtrgdev: TargetDevice -> GPU could not be used, please ensure that you have a compatible NVIDIA graphics card and have installed the latest drivers.

So a compute-capability-3.0 card and waiting for 11.1 did not fix the problem. What gives? Any ideas?

Gregory Klopper

2 Answers


See Wolfram Community:

(screenshot from the linked Wolfram Community post)

Here is the link
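
If the problem discussed there is the driver / compute-capability requirement, a quick way to see what CUDALink reports for your card is (a minimal sketch, assuming CUDALink loads on your system):

Needs["CUDALink`"]
CUDADriverVersion[]                          (* installed NVIDIA driver version *)
CUDAInformation[1, "Compute Capabilities"]   (* compute capability of the first CUDA device *)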

andre314
  • Trying version 11.1 on my laptop with Quadro K2100M (compute capability 3, latest drivers) - no luck. CUDAInformation[] properly sees it as the first and only device. What gives? I get this message: NetTrain::badtrgdev: TargetDevice -> GPU could not be used, please ensure that you have a compatible NVIDIA graphics card and have installed the latest drivers. – Gregory Klopper Apr 03 '17 at 21:12
  • @Gregory Klopper I discovered that message on Wolfram Community when I tried to play with neural networks on my old NVIDIA GT540M (compute capability 2.1). As it was the fourth problem I had encountered, I decided to give up on neural networks (the other problems have nothing to do with yours). So I have no further information. – andre314 Apr 04 '17 at 18:39
  • Appreciate the effort. I was very hopeful. The NN tools in 11.1 are extraordinary and work great on CPU. I just need to figure out how to get the darn GPU to work. Or get out there and splurge for a 10xx Pascal card. – Gregory Klopper Apr 04 '17 at 20:26

As of this writing, the latest NVIDIA driver for the Quadro K2100M is newer than the 368.39 that CUDADriverVersion is reporting on your machine. Update to the latest NVIDIA drivers.

I have a laptop with both an Intel integrated GPU and an NVIDIA GeForce GTX 860M. Although it is a newer card, it shows that NetTrain will work with an integrated + discrete configuration. I also had an issue with NetTrain a few months back, and upgrading the driver to the latest version fixed it.

Hope this helps.

Edmund
  • You're a genius! I don't know how I missed it. The system reported it as the latest driver and NVIDIA Update said "No update available"; however, installing 378.66 made things work! I also really appreciate the reassurance that the dual-video config works. The crazy thing is that the GPU is MUCH, MUCH slower at training my absolutely trivial neural net than my mobile i7 CPU: NetChain[{2, LogisticSigmoid, 3, SoftmaxLayer[]}, "Input" -> {150}, "Output" -> NetDecoder[{"Class", {-1, 0, 1}}]] – Gregory Klopper Apr 05 '17 at 05:39
  • @GregoryKlopper Did you run it twice? If it was the first time running with TargetDevice -> "GPU" then things may take longer as Mathematica sets up all the required items in the background. The second run (in the same session) is usually more representative when doing a small example. – Edmund Apr 05 '17 at 09:53
  • @GregoryKlopper Try comparing the two using the Basic Example in the TargetDevice documentation page (a rough timing sketch along these lines follows these comments). There is a huge difference between "GPU" and "CPU" as the TargetDevice. Perhaps the CUDA overheads make very tiny neural nets go a bit slower on GPU. However, for non-trivial neural nets you see a massive difference, as in the documentation page example; "GPU" took seconds and "CPU" was at 53% after 7 minutes. – Edmund Apr 05 '17 at 10:06
  • It's probably just my older GPU. The example in the help docs is barely faster on GPU, but the network I built is about 10 times faster on CPU. I think it came down to precision: using whole numbers yielded better speed on the GPU, but the CPU was still twice as fast. Using small reals in the range -1..1 gives the best outcome for the network as well as the best CPU speed (half a million inputs per second); the GPU was AT BEST at about 1/10th that speed. Need a new GPU. – Gregory Klopper Apr 06 '17 at 08:10
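
A rough way to reproduce the CPU-vs-GPU comparison from these comments (a minimal sketch using the toy net quoted above; the synthetic data, sample count, and number of training rounds are assumptions, not the actual workload):

net = NetChain[{2, LogisticSigmoid, 3, SoftmaxLayer[]},
  "Input" -> {150}, "Output" -> NetDecoder[{"Class", {-1, 0, 1}}]];
data = Table[RandomReal[{-1, 1}, 150] -> RandomChoice[{-1, 0, 1}], {5000}];

(* time a fixed number of training rounds on each device *)
cpuTime = First@AbsoluteTiming[
   NetTrain[net, data, MaxTrainingRounds -> 10, TargetDevice -> "CPU"]];
gpuTime = First@AbsoluteTiming[
   NetTrain[net, data, MaxTrainingRounds -> 10, TargetDevice -> "GPU"]];
{cpuTime, gpuTime}

As Edmund notes above, the first GPU run in a session pays a one-off setup cost, so repeating the GPU timing in the same session gives a fairer number.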