phpjs

<a href="http://an3m1.com/" rel="nofollow">???? ????????</a> on 2012-04-17 15:33:39 If I might —perhaps you should consider adding a few images. I don’t mean to disrespect what you’ve said ; its very enlightening, indeed. However, I think would respond to it more positively if they could be something tangible to your ideas <hr /> <a href="http://brett-zamir.me" rel="nofollow">Brett Zamir</a> on 2009-10-28 02:32:00 @Joris: Thanks for your code (it's helpful to have here for reference and makes a good complement to the previous code for urlencode and reposted just now in the comments at http://phpjs.org/functions/urlencode ), but I went ahead with the faster use of decodeURIComponent(). <hr /> <a href="http://brett-zamir.me" rel="nofollow">Brett Zamir</a> on 2009-10-26 01:07:41 @Joris: Sorry I haven't gotten to your post yet; that looks great! I'd like to test it out a little first and then commit, if you can bear with me a little... <hr /> Joris van der Wel on 2009-09-29 16:14:40 Incase anyone is interested, here is a version with full UTF-8 support written without decodeURIComponent or any maps. <pre><code> function rawurldecode(url) { // This function mimmicks PHP's rawurldecode under UTF-8 // Any percentage notation is converted to its UTF-16 character. // Only tested on Mozilla browsers (Firefox 3.5) // Does NOT use any of decodeURIComponent, decodeURI, unescape, etc // Supports 4 byte characters (so unicode characters 0x0000 through 0x10FFFF) // // Original by Joris van der Wel var chr, a, len, ret, c, c2, c3, c4, hi, low; ret = ''; for (a = 0, len = url.length; a < len; a++) { chr = url.charAt(a); if (chr != '%') { ret += chr; continue; } c = parseInt(url.charAt(a+1) + url.charAt(a+2), 16); if (isNaN(c)) { ret += '%'; // If php comes across something invalid, it just shows it without parsing continue; } a += 2; // skip 2 ret += String.fromCharCode(c); } // second pass, convert UTF-8 to UTF-16 (Strings in javascript (ECMA-262 to be exact) are UTF-16) url = ret; ret = ''; for (a = 0, len = url.length; a < len; a++) { c = url.charCodeAt(a); // c & 1000 0000 === 0000 0000 if( (c & 0x80) === 0 ) // 0xxxxxxx { ret += url.charAt(a); } // c & 1110 0000 === 1100 0000 else if ((c & 0xE0) === 0xC0) // 110y yyxx 10xx xxxx { a++; c2 = url.charCodeAt(a); ret += String.fromCharCode( ((c & 0x1F) << 6) | ((c2 & 0x3F) << 0) ); } // c & 1111 0000 === 1110 0000 else if ((c & 0xF0) === 0xE0) // 1110 yyyy 10yy yyxx 10xx xxxx { a++; c2 = url.charCodeAt(a); a++; c3 = url.charCodeAt(a); ret += String.fromCharCode( ((c & 0x0F) << 12) | ((c2 & 0x3F) << 6 ) | ((c3 & 0x3F) << 0 ) ); } // c & 1111 1000 === 1111 0000 else if ((c & 0xF8) === 0xF0) // 1111 0zzz 10zz yyyy 10yy yyxx 10xx xxxx { a++; c2 = url.charCodeAt(a); a++; c3 = url.charCodeAt(a); a++; c4 = url.charCodeAt(a); c = ((c & 0x07) << 18) | ((c2 & 0x3F) << 12) | ((c3 & 0x3F) << 6 ) | ((c4 & 0x3F) << 0 ) ; if (c >= 0x10000) // split it up using surrogates { c -= 0x10000; hi = (c & 0xFFC00) >> 10; // first 10 bits low = c & 0x003FF; // last 10 bits hi += 0xD800; // high surrogate range low += 0xDC00; // low surrogate range ret += String.fromCharCode(hi, low); } else { ret += String.fromCharCode(c); } } } return ret; } </code></pre> You could probably rewrite it to use only one loop, but that would turn into spaghetti code very fast Gr. <hr /> <a href="http://kevin.vanzonneveld.net" rel="nofollow">Kevin van Zonneveld</a> on 2009-04-03 13:33:35 @ travc & Brett Zamir: Fixed in SVN: Thanks! <hr /> Brett Zamir on 2009-03-30 16:19:31 Sorry, it looks like to fully reflect PHP's behavior, you have to add this to the histogram (all of our other related functions should be converted accordingly as well): <pre><code> histogram['\u20AC'] = '%80'; histogram['\u0081'] = '%81'; histogram['\u201A'] = '%82'; histogram['\u0192'] = '%83'; histogram['\u201E'] = '%84'; histogram['\u2026'] = '%85'; histogram['\u2020'] = '%86'; histogram['\u2021'] = '%87'; histogram['\u02C6'] = '%88'; histogram['\u2030'] = '%89'; histogram['\u0160'] = '%8A'; histogram['\u2039'] = '%8B'; histogram['\u0152'] = '%8C'; histogram['\u008D'] = '%8D'; histogram['\u017D'] = '%8E'; histogram['\u008F'] = '%8F'; histogram['\u0090'] = '%90'; histogram['\u2018'] = '%91'; histogram['\u2019'] = '%92'; histogram['\u201C'] = '%93'; histogram['\u201D'] = '%94'; histogram['\u2022'] = '%95'; histogram['\u2013'] = '%96'; histogram['\u2014'] = '%97'; histogram['\u02DC'] = '%98'; histogram['\u2122'] = '%99'; histogram['\u0161'] = '%9A'; histogram['\u203A'] = '%9B'; histogram['\u0153'] = '%9C'; histogram['\u009D'] = '%9D'; histogram['\u017E'] = '%9E'; histogram['\u0178'] = '%9F'; </code></pre> and then add this line right before the call to decodeURIComponent(): <pre><code> ret = ret.replace(/%([a-fA-F][0-9a-fA-F])/g, function (all, hex) {return String.fromCharCode('0x'+hex);}); // These Latin-B have the same values in Unicode, so we can convert them like this </code></pre> <hr /> travc on 2009-03-30 12:57:23 Found an apparent bug... I'll try to track it down, but I'm a javascript noob. <pre><code> <? print rawurldecode('-22%97bc%2Fbc'); ?> <script type="text/javascript"> var foo = rawurldecode('-22%97bc%2Fbc'); alert(foo); </script> </code></pre> php part works fine, js breaks. firebug reports: malformed URI sequence rawurldecode("-22%97bc%2Fbc") And, yes, this string comes from encoding in php with rawurlencode (from a big nasty db response). <hr />