trust signature domain restrictions don't work
Closed, ResolvedPublic
Actions

Assigned To

Authored By

	linsam
	Jan 16 2017, 6:43 AM

Description

OSes tested: Ubuntu 16.04, Centos7

When issuing a trust signature for, say "customer.com", gnupg saves the
trust_regexp to '<[^>]+[@.]customer\.com>$', which looks right. However, when
trust checks are done, that string is passed to sanitize_regexp(), which adds
backslashes before every letter. The resulting regexp doesn't match email
addresses in the domain.

If I modify sanitize_regexp() to leave alphabet letters alone, it matches and
the trust sig works as expected.

I think this is related to bug T2284.

(I have example output and an example patch, but adding these makes the bug
tracker think I'm spam; I'll try to add again later)

Details

Version: 1.4.20, 2.0.22, 2.1.11

Revisions and Commits

rG GnuPG
	rG0d0b9eb0d4f9 g10: Fix regexp sanitization.
	rG9441946e1824 g10: Fix regexp sanitization.
	rG9ba0e2c76c0c g10: Fix regexp sanitization.
	rGccf3ba92087e g10: Fix regexp sanitization.

Related Objects

Mentioned In: T2284: tsign behavior does not achieve what dkg says it should
T2283: tsign domain not documented
Mentioned Here: rGccf3ba92087e: g10: Fix regexp sanitization.
T2284: tsign behavior does not achieve what dkg says it should
D406: 944_example.patch

Event Timeline

linsam added projects: gnupg (gpg21), gnupg (gpg14), gnupg (gpg20), gnupg, Bug Report.Jan 16 2017, 6:43 AM

linsam set Version to 1.4.20, 2.0.22, 2.1.11.

linsam added a subscriber: linsam.

943_example.txt1 KBDownload

Attached example are the following setup:

user1 tsign user2 with full trust, depth 1, domain="customer.com". User2 signs
user3 through user5 (regular signatures). User4 is at customer.com, users 3 and
5 are at example.com.

D406: 944_example.patch

Attached example patch prevents escaping normal lowercase letters.

Note that this isn't a general solution, though it does solve the issue for me.
For example, some email addresses have numbers (I don't know if having backslash
before numbers is an issue like it is for letters)

Attached example output after patch is applied. Now User4 has full validity like
expected, and the debug output shows a match for User4's email address (NOTE:
the debug output has 'YES' for no match and 'NO' for successful match)

945_example-patched.txt2 KBDownload

gouttegd added a subscriber: gouttegd.Jul 2 2017, 12:23 AM

For information, this issue was also discussed on both gnupg-user and gnupg-devel back in january 2017. I mention it here for reference.

In particular:

John Lane proposed an useful script to test the behavior of trust signatures;
two patches were also proposed to fix the issue, one by myself and one by John O'Meara.

marcus updated the task description. (Show Details)Jul 3 2017, 11:07 AM

marcus removed projects: gnupg, gnupg (gpg20), gnupg (gpg14).

We need to find out when this regression happened. Back when David implemented trust signatures he ran tests with the PGP folks to make sure our implementation are compatible. This is why I call this a regression.

The cause of the regression may actually not be in GnuPG's code.

As it is, the sanitize_regexp function obviously assumes that it is OK to escape characters that do not have any special meaning, and it escapes everything. But according to POSIX, the behavior in that case is left to the implementations:

The interpretation of an ordinary character preceded by an unescaped <backslash> is undefined.

So, maybe the behavior of glibc's regex implementation on escaped ordinary characters changed at some point between the time sanitize_regexp was written and now?

In any case, I think we should escape only the characters that must be escaped according to POSIX, so that we do not depend on the behavior of a particular regex engine.

gouttegd mentioned this in T2283: tsign domain not documented.Jul 14 2017, 12:19 PM

including these tests (or something similar) in the gpg test suite would be a good way to avoid future regressions.

dkg renamed this task from tust signature domain restrictions don't work to trust signature domain restrictions don't work.Jul 14 2017, 1:01 PM

In T2923#100519, @dkg wrote:

including these tests (or something similar) in the gpg test suite would be a good way to avoid future regressions.

Agreed, but difficult. So far we have no tests for the trust model pgp.

In T2923#99651, @gouttegd wrote:

As it is, the sanitize_regexp function obviously assumes that it is OK to escape characters that do not have any special meaning, and it escapes everything.

Maybe that is because rfc4880 is vague at best about the syntax of regular expressions. https://tools.ietf.org/html/rfc4880#section-8 says:

An atom is a [...] '\' followed by a single character (matching that character) [...]

The way I read that allows what sanitize_regexp is doing. However, maybe we cannot use glibc's regex implementation then.

• werner edited projects, added gnupg (gpg23); removed gnupg (gpg21).Sep 21 2017, 3:49 PM

wiktor-k added a subscriber: wiktor-k.Nov 7 2017, 3:41 PM

For the reference sanitize_regexp was introduced in this commit from 2007 to "Protect against malloc bombs.": and I see no changes to it (except typo correction) in git blame in trustdb.c.

It might be not a regression. The possibilities are: (1) it was tested by using non-GNU operating system. (2) Tests didn't cover characters (b, B, w, W, s, and S).

GNU extension for regular expression supports: \b, \B (word boundary), \w, \W (word), \s, \S (space). This extension is enabled by GNU C library for regcomp.
Apparently, sanitize_regexp doesn't work well with the GNU extension for regexp.

For what is worth I think sanitize_regexp was programmed while reading 4880 because the RFC allows backslash + any character (section 8: Regular Expressions):

An atom is
(...)
a '\' followed by a single character (matching that character)

• gniibe claimed this task.Nov 8 2017, 9:06 AM

Henry Spencer wrote three implementations (old, BSD, and Tcl): https://garyhouston.github.io/regex/
Indeed, for the one in old library and BSD library, \ + CHAR means that single CHAR.
For one in Tcl library, \s, \S, \w, \W is supported (just like GNU), and \d, \D (digit) is also supported.

I think that the sanitize_regexp refers the one in old library which doesn't support "{}" operator.

I'm going to apply @gouttegd 's patch. Since the commit message is wrong (example.com just works well, while example.es doesn't), I will write my own comment.
I can't adopt John O'Meara's approach, because it's not a bug fix, but adding feature, assuming GNU regexp use somehow.

• gniibe added a commit: rGccf3ba92087e: g10: Fix regexp sanitization..Nov 9 2017, 7:39 AM

• gniibe closed this task as a duplicate of T2284: tsign behavior does not achieve what dkg says it should.Nov 9 2017, 7:41 AM

No, I was not accurate. EXAMPLE.COM works, while example.com doesn't work.

Anyway, I committed a change to master. rGccf3ba92087e: g10: Fix regexp sanitization.

• gniibe reopened this task as Open.Nov 9 2017, 7:44 AM

• gniibe merged a task: T2284: tsign behavior does not achieve what dkg says it should.

• gniibe added subscribers: clint, neal.

It might be easier to include a regexp implementation in GnUPG proper. This way we have a well defined behaviour and it will work also on Windows. The gpg-check-pattern tool might slightly change its behaviour, though.

It's fixed in master.
It is good to backport this to GnuPG 2.2 and GnuPG 1.4.

• gniibe added a commit: rG9ba0e2c76c0c: g10: Fix regexp sanitization..Dec 4 2017, 11:55 AM

• gniibe added a commit: rG9441946e1824: g10: Fix regexp sanitization..

• gniibe added a commit: rG0d0b9eb0d4f9: g10: Fix regexp sanitization..

It's in 2.2.4 and 1.4.23.
Closing.