URL decode problem: a solution
2024-05-04 23:16:19
Contributed by: a reader
 
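I tried Python's urllib library and JavaScript's encodeURIComponent: neither substitutes a plus sign for a space, and both encode a space as '%20'. Python provides urllib.quote_plus and urllib.unquote_plus (urllib.parse.quote_plus and urllib.parse.unquote_plus in Python 3) to handle the space -> plus conversion, which seems fairly reasonable.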
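A minimal sketch of that difference, assuming Python 3, where the functions live in urllib.parse rather than directly under urllib. The sample string "a b+c" and the variable names are made up for illustration.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

text = "a b+c"  # hypothetical sample value: one space, one literal plus

# RFC-style percent-encoding: space becomes %20, a literal '+' becomes %2B.
rfc_style = quote(text)
print(rfc_style)        # a%20b%2Bc

# Form-style encoding: space becomes '+', a literal '+' still becomes %2B.
form_style = quote_plus(text)
print(form_style)       # a+b%2Bc

# Decoding is where the two diverge on a bare '+'.
kept_plus = unquote("a+b%20c")          # '+' is left alone
plus_as_space = unquote_plus("a+b%20c") # '+' is turned into a space
print(kept_plus)        # a+b c
print(plus_as_space)    # a b c
```

Whichever pair is used, the encoder and decoder need to match: decoding with unquote_plus something that was never form-encoded turns a literal '+' into a space.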
I looked up RFC 3986, which contains this passage:
Scheme names consist of a sequence of characters beginning with a letter and followed by any combination of letters, digits, plus ("+"), period ("."), or hyphen ("-"). 
RFC 2396 contains this passage:
The plus "+", dollar "$", and comma "," characters have been added to those in the "reserved" set, since they are treated as reserved within the query component. 
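This means the plus sign is already a reserved character in URLs and does not need to be escaped.
Only the HTML 4 documentation says anything about using the plus sign as an escape: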
application/x-www-form-urlencoded 
Forms submitted with this content type must be encoded as follows: 
Control names and values are escaped. Space characters are replaced by `+', and then reserved characters.....
So the plus-for-space escaping applies only when the content type is application/x-www-form-urlencoded.
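A rough illustration of the form-encoding rule, again assuming Python 3's urllib.parse. The field name "q" and the value "1 + 1" are hypothetical.

```python
from urllib.parse import urlencode, parse_qs, unquote

form = {"q": "1 + 1"}  # hypothetical form field and value

# urlencode produces an application/x-www-form-urlencoded body:
# spaces become '+', and the literal '+' is escaped as %2B.
body = urlencode(form)
print(body)             # q=1+%2B+1

# parse_qs applies the matching form rules and restores the value.
parsed = parse_qs(body)
print(parsed)           # {'q': ['1 + 1']}

# A plain percent-decoder does not know the form rule,
# so every '+' stays a plus sign.
naive = unquote("1+%2B+1")
print(naive)            # 1+++1
```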
Then I went through the PHP documentation and found this:
rawurlencode() - URL-encode according to RFC 3986 
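In other words, PHP went and added rawurlencode and rawurldecode to implement the standard...
Couldn't it have been the other way round? After all, most people will probably just reach for urlencode. PHP really is a pain...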