[Guile-commits] 03/11: Fix uri-decode behavior for "+"
From: Andy Wingo
Subject: [Guile-commits] 03/11: Fix uri-decode behavior for "+"
Date: Tue, 21 Jun 2016 16:10:00 +0000 (UTC)
wingo pushed a commit to branch stable-2.0
in repository guile.
commit 0a3ea0586d77030e5dda2b39b0a949a2231f0aee
Author: Andy Wingo <address@hidden>
Date: Mon Jun 20 14:34:19 2016 +0200
Fix uri-decode behavior for "+"
* module/web/uri.scm (uri-decode): Add #:decode-plus-to-space? keyword
argument.
(split-and-decode-uri-path): Don't decode plus to space.
* doc/ref/web.texi (URIs): Update documentation.
* test-suite/tests/web-uri.test ("decode"): Add tests.
* NEWS: Add entry.
Based on a patch by Brent <address@hidden>.
---
NEWS | 7 +++++++
doc/ref/web.texi | 7 ++++++-
module/web/uri.scm | 11 ++++++++---
test-suite/tests/web-uri.test | 5 ++++-
4 files changed, 25 insertions(+), 5 deletions(-)
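
For readers skimming the patch, here is a minimal sketch of the behaviour
described in the commit message. It assumes a Guile build that includes
this commit; the input strings are illustrative only, apart from "foo+bar",
which mirrors the tests added below.

```scheme
(use-modules (web uri))

;; Default: "+" is still decoded to a space, which is what
;; application/x-www-form-urlencoded data needs.
(define form-value (uri-decode "foo+bar"))            ; "foo bar"

;; With the new keyword set to #f, "+" is kept as-is, while
;; percent-escapes such as %20 are still decoded.
(define literal-plus
  (uri-decode "foo+bar%20baz" #:decode-plus-to-space? #f)) ; "foo+bar baz"

;; Path splitting now passes #:decode-plus-to-space? #f internally,
;; so a "+" in a path component survives.
(define path-parts
  (split-and-decode-uri-path "/foo+bar/baz%20qux/"))  ; ("foo+bar" "baz qux")
```

Callers that decode query strings or form bodies can keep relying on the
default; only code that decodes path components needs the new keyword.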
diff --git a/NEWS b/NEWS
index 8ed1b8d..80f1cd4 100644
--- a/NEWS
+++ b/NEWS
@@ -19,6 +19,13 @@ Guile's reader conform more closely to the R6RS syntax. In
particular:
- It enables the `square-brackets', `hungry-eol-escapes' and
`r6rs-hex-escapes' reader options.
+* Bug fixes
+
+** Don't replace + with space when splitting and decoding URI paths
+
+See the documentation for `uri-decode', for more on the new
+`#:decode-plus-to-space?' keyword argument.
+
Changes in 2.0.11 (since 2.0.10):
diff --git a/doc/ref/web.texi b/doc/ref/web.texi
index 9e6e0fd..bb478e0 100644
--- a/doc/ref/web.texi
+++ b/doc/ref/web.texi
@@ -245,7 +245,7 @@ serialization.
Declare a default port for the given URI scheme.
@end deffn
-@deffn {Scheme Procedure} uri-decode str [#:encoding=@code{"utf-8"}]
+@deffn {Scheme Procedure} uri-decode str [#:encoding=@code{"utf-8"}] [#:decode-plus-to-space? #t]
Percent-decode the given @var{str}, according to @var{encoding}, which
should be the name of a character encoding.
@@ -262,6 +262,11 @@ decoded bytes are not valid for the given encoding. Pass @code{#f} for
@xref{Ports, @code{set-port-encoding!}}, for more information on
character encodings.
+If @var{decode-plus-to-space?} is true, which is the default, also
+replace instances of the plus character @samp{+} with a space character.
+This is needed when parsing @code{application/x-www-form-urlencoded}
+data.
+
Returns a string of the decoded characters, or a bytevector if
@var{encoding} was @code{#f}.
@end deffn
diff --git a/module/web/uri.scm b/module/web/uri.scm
index 3ab820d..179618d 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -304,7 +304,7 @@ serialization."
(define hex-chars
(string->char-set "0123456789abcdefABCDEF"))
-(define* (uri-decode str #:key (encoding "utf-8"))
+(define* (uri-decode str #:key (encoding "utf-8") (decode-plus-to-space? #t))
"Percent-decode the given STR, according to ENCODING,
which should be the name of a character encoding.
@@ -320,6 +320,10 @@ bytes are not valid for the given encoding. Pass ‘#f’ for ENCODING if
you want decoded bytes as a bytevector directly. ‘set-port-encoding!’,
for more information on character encodings.
+If DECODE-PLUS-TO-SPACE? is true, which is the default, also replace
+instances of the plus character (+) with a space character. This is
+needed when parsing application/x-www-form-urlencoded data.
+
Returns a string of the decoded characters, or a bytevector if
ENCODING was ‘#f’."
(let* ((len (string-length str))
@@ -330,7 +334,7 @@ ENCODING was ‘#f’."
(if (< i len)
(let ((ch (string-ref str i)))
(cond
- ((eqv? ch #\+)
+ ((and (eqv? ch #\+) decode-plus-to-space?)
(put-u8 port (char->integer #\space))
(lp (1+ i)))
((and (< (+ i 2) len) (eqv? ch #\%)
@@ -413,7 +417,8 @@ removing empty components.
For example, ‘\"/foo/bar%20baz/\"’ decodes to the two-element list,
‘(\"foo\" \"bar baz\")’."
(filter (lambda (x) (not (string-null? x)))
- (map uri-decode (string-split path #\/))))
+ (map (lambda (s) (uri-decode s #:decode-plus-to-space? #f))
+ (string-split path #\/))))
(define (encode-and-join-uri-path parts)
"URI-encode each element of PARTS, which should be a list of
diff --git a/test-suite/tests/web-uri.test b/test-suite/tests/web-uri.test
index 3d14d9d..e1b6ca3 100644
--- a/test-suite/tests/web-uri.test
+++ b/test-suite/tests/web-uri.test
@@ -255,7 +255,10 @@
(equal? "foo bar" (uri-decode "foo%20bar")))
(pass-if "foo+bar"
- (equal? "foo bar" (uri-decode "foo+bar"))))
+ (equal? "foo bar" (uri-decode "foo+bar")))
+
+ (pass-if "foo+bar"
+ (equal? '("foo+bar") (split-and-decode-uri-path "foo+bar"))))
(with-test-prefix "encode"
(pass-if (equal? "foo%20bar" (uri-encode "foo bar")))