From 8da50542d1884775d40510285f995a53488f383b Mon Sep 17 00:00:00 2001
From: Sophie Hirn
Date: Sun, 19 Oct 2025 11:54:12 +0200
Subject: [PATCH] json: Allow encoding multibyte UTF-8 sequences

The current implementation rejects all input with bytes > 0x7E, which
includes all multibyte UTF-8 sequences. According to ECMA-404, Section 9,
only double quotation marks, backslashes, and characters 0x00 - 0x1F must
be escaped in JSON strings, so non-ASCII bytes can be passed through
without escaping. This also mirrors what the decoder does above.

Of course this allows invalid UTF-8 sequences to be encoded. Checks for
this could be added as well, but at least the decoder does not seem to do
that. And from what I can tell from a quick glance, the text output path
does not check that either.

Fixes: https://gitlab.freedesktop.org/pulseaudio/pulseaudio/-/issues/1310
---
 src/pulsecore/json.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/pulsecore/json.c b/src/pulsecore/json.c
index 51bbea9ae..0f204b051 100644
--- a/src/pulsecore/json.c
+++ b/src/pulsecore/json.c
@@ -763,8 +763,8 @@ static char *pa_json_escape(const char *p) {
                 *output++ = 't';
                 break;
             default:
-                if (*s < 0x20 || *s > 0x7E) {
-                    pa_log("Invalid non-ASCII character: 0x%x", (unsigned int) *s);
+                if (*s < 0x20 || *s == 0x7F) {
+                    pa_log("Invalid ASCII character: 0x%x", (unsigned int) *s);
                     pa_xfree(out_string);
                     return NULL;
                 }
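
Review note: the sketch below restates the new default-branch check as a standalone predicate, to make explicit which bytes are meant to pass. The helper names `default_branch_accepts` and `string_passes_default_branch` are hypothetical and do not exist in json.c. The hunk does not show how `s` is declared. If it is a plain `const char *` and `char` is signed on the target, bytes 0x80..0xFF compare as negative values, so `*s < 0x20` would still reject them; the sketch therefore assumes the byte is converted to `unsigned char` before comparing.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of json.c: true if the byte would be
 * accepted by the patched default branch. Control characters (0x00-0x1F)
 * and DEL (0x7F) are rejected; everything else, including 0x80-0xFF
 * (UTF-8 lead and continuation bytes), is accepted. */
static bool default_branch_accepts(unsigned char c) {
    return !(c < 0x20 || c == 0x7F);
}

/* Hypothetical helper: applies the predicate to a NUL-terminated string.
 * The cast to unsigned char is the assumption discussed above: without
 * it, a signed char holding 0xC3 is negative and compares below 0x20. */
static bool string_passes_default_branch(const char *p) {
    for (const char *s = p; *s; s++) {
        if (!default_branch_accepts((unsigned char) *s))
            return false;
    }
    return true;
}
```

With these assumptions, the UTF-8 string "h\xc3\xa9llo" passes, while a string containing 0x01 or 0x7F does not. The sketch does not validate UTF-8, which matches the behaviour described in the commit message.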