
gnupg v2 does not allow for parallel processing any more
Closed, Invalid · Public

Description

I used gnupg v1 to decrypt files in parallel. Now that gnupg v2 is using gpg-agent
for all of the hard work, and gpg-agent either gets locked or isn't parallelized,
this does not work any more. Can we please fix this?

Event Timeline

Now that gnupg v2 is using gpg-agent for all of the hard work,

It isn't. The agent merely decrypts the session key. gpg then decrypts the
actual data with the symmetric cipher.
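For illustration (my addition, not part of the reply; message.gpg is a placeholder), this split is visible in the packet structure of a public-key encrypted file:

gpg --list-packets message.gpg
# the ":pubkey enc packet:" is the encrypted session key, handled by gpg-agent
# the ":encrypted data packet:" is the bulk data, decrypted by gpg itself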

and gpg-agent either gets locked

It isn't.

or isn't parallelized,

It is.

this does not work any more.

Can you please be more specific?

Well, I can only say right now that since upgrading to Ubuntu 16.10, the gpg
command is gnupg v2 by default, and my parallel decryption using multiple gpg
processes does not work any more. "Not working" means there is only one
gpg-agent process using any CPU at all, and it is using only one CPU core at
100% for a very long time. Nothing else shows up in top regarding CPU usage;
75% of the CPU cores remain idle. So my guess is that gpg-agent does all of
the work and therefore prevents multiple parallel executions. My conclusion
seems pretty obvious to me. But maybe it has to do with changes made by
downstream Debian or Ubuntu packagers?

I just tried:

$ g10/gpg --encrypt -r samuel </dev/urandom >/dev/null

As expected, the gpg process eats a lot of CPU time, and I can spawn two of them
just fine. This works both with my build and with gpg from Debian testing.
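For reference, running two such encryptions concurrently (the same command as
above, simply backgrounded; "samuel" is the recipient from that example) might
look like this:

g10/gpg --encrypt -r samuel </dev/urandom >/dev/null &
g10/gpg --encrypt -r samuel </dev/urandom >/dev/null &
top    # both gpg processes should each keep a core busy; kill them when done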

In gpg-agent, only a single thread of execution runs at a time. So it is
entirely possible that what you are describing happens. For us to debug it, we
need a very concrete example. Please provide us with the command line(s) that
you are using to decrypt the files in parallel. Also, please list the keys. (A
small guess: you are using 16k RSA.)
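For completeness (a standard gpg invocation, not quoted from this report), the
keys and their algorithms and sizes can be listed like this; the "sec rsaNNNN/..."
lines show the algorithm and key size:

gpg --list-secret-keys --keyid-format long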

Not quite true. As soon as a blocking system call is used, another thread is
scheduled. Long-running operations like generating a new key may indeed take a
long time and inhibit other threads from running. They run long because they
need to collect entropy; having other threads running at that time would not
really be helpful. Having used gpg-agent for more than a decade now, I have
never experienced that.

The more likely reason for the problem is that no working pinentry is installed
and both threads are waiting for the pinentry (pinentry access is obviously
serialized).
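One way to check whether pinentry is the culprit (my suggestion, not part of
the original report; it assumes a default signing key is configured) is to
clear the agent's passphrase cache and force a prompt:

gpg-connect-agent reloadagent /bye       # restart the running agent, clearing its cache
echo test | gpg --clearsign >/dev/null   # should now pop up a pinentry prompt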

We need a log file from gpg-agent: put this into gpg-agent.conf

log-file /tmp/foo/agent.log
debug 1024
verbose

and restart the agent.
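For reference (the thread does not spell this out), the agent can be restarted
so that it picks up the new configuration with gpgconf; it is started again on
demand by the next gpg invocation:

gpgconf --kill gpg-agent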

The difference (according to the gpg-agent log) is that gpg v1 obviously caches
the decrypted private key used to decrypt the files with the option
"-d --multifile", whereas gpg v2 in my case repeatedly requests the decryption
of the private key for each single file. Any way to change that?
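For context, a multifile decryption run of the kind described might look
roughly like this (file names are placeholders; --decrypt-files is the
equivalent shortcut for -d --multifile):

gpg -d --multifile file1.gpg file2.gpg file3.gpg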

Did you change --default-cache-ttl or --max-cache-ttl to zero or another small
value? The multifile feature requires that the passphrase cache is enabled.
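For illustration (the values below are arbitrary examples, not recommendations
from the thread), these cache lifetimes are set in gpg-agent.conf:

default-cache-ttl 600    # keep a cached passphrase for 10 minutes after last use
max-cache-ttl 7200       # never keep a cached passphrase longer than 2 hours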

To make this work again, I think gpg-agent needs to cache the decrypted private key or support batch operations (which would require some restructuring in gpg to request such a batch operation).

werner removed a project: Bug Report.

No info received and thus assuming that the caching was disabled.

Testcase:

I create 1000 empty files and sign them using GNU parallel + gpg, trying various parallelization factors. (The CPU used is an AMD 3700X with 16 threads.)

mkdir t; cd t
for ((i=0;i<1000;++i)); do >"t$i"; done
gpg -ab t0; rm -f t0.asc; # get gpg-agent launched and passphrase cached
rm -f *.asc; time find . -type f | parallel -j1 gpg -ab
rm -f *.asc; time find . -type f | parallel -j2 gpg -ab
rm -f *.asc; time find . -type f | parallel -j4 gpg -ab
rm -f *.asc; time find . -type f | parallel -j8 gpg -ab
rm -f *.asc; time find . -type f | parallel -j16 gpg -ab

Run /usr/bin/top in the background at your leisure.

Observed:

  • -j1: 69 s
  • -j2: 56 s
  • -j4: 56 s
  • at that point I gave up on higher -jN

Expected:

  • Linear scaling

Well, this depends of course. If the "hard work" is the actual signing, it depends a lot on your key: an ECC key will go much quicker than, for example, RSA-4096. But IMO the "hard work" when signing is the hashing, and that is done in parallel in the gpg processes. For extremely specialized setups you could run multiple gpg-agents in different homedirs with access to the same key, as sketched below.
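A rough sketch of that multiple-homedir approach (my illustration, not from the
thread; KEYID and the directory names are placeholders, and it assumes the
secret key can be exported and its passphrase is known):

# export the signing key once
gpg --export-secret-keys --armor KEYID > signkey.asc

# create independent homedirs; each gets its own gpg-agent on first use
for i in 1 2 3 4; do
    mkdir -p -m 700 "$HOME/gnupg-worker$i"
    gpg --homedir "$HOME/gnupg-worker$i" --import signkey.asc
done

# spread the files over the homedirs, e.g. with GNU parallel
# ({%} is the parallel job slot, 1..4; each agent caches the passphrase separately)
find . -type f | parallel -j4 'gpg --homedir "$HOME/gnupg-worker{%}" -ab {}'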

Why couldn't gpg-agent just fake these homedirs on its own?

By default we keep the unlocked secret key limited to this very tiny process (gpg-agent), which only does the secret-key operations. That is, I think, the best decision. It is IMO not really a bottleneck, since except for very small data bits the bottleneck is usually the hashing. What is your use case for doing a thousand secret-key operations (signing) on apparently extremely small data files per minute? And even then, are you sure the bottleneck is in fact gpg-agent and not your disk I/O?

What is your use case for doing a thousand secret-key operations (signing) on apparently extremely small data files per minute?

Signing .rpm files for Linux distributions. openSUSE for example:

     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
7.401e+03 3.313e+04 7.602e+04 1.787e+06 2.880e+05 1.730e+09
       7K       33K       76K      1.7M      288K     1.73G

File sizes in bytes, across 51136 files.