com.google.gdata.util.common.net
Class UriEncoder

java.lang.Object
  extended by com.google.gdata.util.common.net.UriEncoder

public final class UriEncoder
extends java.lang.Object

Implements percent-encoding, specifying how to encode non-US-ASCII and reserved characters in URIs.

Per Section 2.1 of RFC 3986, URIs should contain only characters that are part of US-ASCII, and some of those characters are further reserved to delimit components or subcomponents; therefore, characters outside the allowed set need to be encoded. Each such character is converted to bytes in the chosen character encoding, and each byte is written as the escape sequence "%XX", where XX is the two-digit hexadecimal value of that byte.

This encoding format is used for the application/x-www-form-urlencoded content type, as defined by section 17.13.4 of the W3C's HTML 4.01 Specification.

For example, the Unicode string "flambé" is represented as the byte sequence [0x66, 0x6c, 0x61, 0x6d, 0x62, 0xe9] in ISO-8859-1. In UTF-8, it is represented as [0x66, 0x6c, 0x61, 0x6d, 0x62, 0xc3, 0xa9]. The first five characters are unreserved and do not require encoding, but the last character is outside US-ASCII and must be encoded, so the URI representation is "flamb%E9" in ISO-8859-1 and "flamb%C3%A9" in UTF-8. The hexadecimal digits in escape sequences are not case-sensitive: "%e9" and "%E9" are equivalent.
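
The byte-by-byte rule in the example above can be sketched by hand. This is an illustration only, not the implementation of UriEncoder: the class PercentEncodingSketch and the helper percentEncode are hypothetical names, the unreserved set is taken from RFC 3986, Section 2.3, and the sketch assumes an ASCII-compatible character encoding such as ISO-8859-1 or UTF-8. Unlike the encode methods of this class, it writes a space as "%20" rather than '+'.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class PercentEncodingSketch {
    // Unreserved characters per RFC 3986, Section 2.3.
    private static final String UNRESERVED =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

    /**
     * Hypothetical helper: writes every byte outside the unreserved set as "%XX".
     * Assumes an ASCII-compatible charset (for example ISO-8859-1 or UTF-8).
     */
    static String percentEncode(String s, Charset charset) {
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(charset)) {
            int value = b & 0xFF; // treat the byte as unsigned
            if (value < 0x80 && UNRESERVED.indexOf((char) value) >= 0) {
                out.append((char) value);                    // unreserved: copy as-is
            } else {
                out.append(String.format("%%%02X", value));  // everything else: %XX
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String s = "flamb\u00e9"; // "flambé"
        System.out.println(percentEncode(s, StandardCharsets.ISO_8859_1)); // flamb%E9
        System.out.println(percentEncode(s, StandardCharsets.UTF_8));      // flamb%C3%A9
    }
}
```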

See Also:
Uri

Field Summary
static java.nio.charset.Charset DEFAULT_ENCODING
          The default character encoding, UTF-8, per Section 2.5 of RFC 3986.
 
Method Summary
static java.lang.String decode(java.lang.String string)
          Percent-decodes a US-ASCII string into a Unicode string.
static java.lang.String decode(java.lang.String string, java.nio.charset.Charset encoding)
          Percent-decodes a US-ASCII string into a Unicode string.
static java.lang.String encode(java.lang.String string)
          Percent-encodes a Unicode string into a US-ASCII string.
static java.lang.String encode(java.lang.String string, java.nio.charset.Charset encoding)
          Percent-encodes a Unicode string into a US-ASCII string.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_ENCODING

public static final java.nio.charset.Charset DEFAULT_ENCODING
The default character encoding, UTF-8, per Section 2.5 of RFC 3986.

See Also:
Charsets

Method Detail

encode

public static java.lang.String encode(java.lang.String string)
Percent-encodes a Unicode string into a US-ASCII string. The DEFAULT_ENCODING, UTF-8, is used to determine how non-US-ASCII and reserved characters should be represented as consecutive sequences of the form "%XX".

This method replaces ' ' with '+', so it should not be used for strings that are not application/x-www-form-urlencoded, such as the host and path components of a URI.

Parameters:
string - a Unicode string
Returns:
a percent-encoded US-ASCII string
Throws:
java.lang.NullPointerException - if string is null
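
The space handling described above can be illustrated with the JDK's java.net.URLEncoder, which also produces application/x-www-form-urlencoded output. This is a sketch under the assumption that the two classes agree on these inputs; the class FormEncodingSketch and the helper formEncode are hypothetical stand-ins, not part of this API.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class FormEncodingSketch {
    /** Hypothetical stand-in for encode(String): form-style encoding with UTF-8. */
    static String formEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Spaces become '+', non-US-ASCII characters become %XX sequences.
        System.out.println(formEncode("caf\u00e9 au lait")); // caf%C3%A9+au+lait

        // A literal '+' is escaped, so '+' in form data always means a space.
        System.out.println(formEncode("a+b"));               // a%2Bb
    }
}
```

In a path segment, by contrast, '+' is an ordinary character, so "au+lait" would be read back with a literal plus sign rather than a space. That is why form-style encoding is unsuitable for host and path components.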

encode

public static java.lang.String encode(java.lang.String string,
                                      java.nio.charset.Charset encoding)
Percent-encodes a Unicode string into a US-ASCII string. The specified encoding is used to determine how non-US-ASCII and reserved characters should be represented as consecutive sequences of the form "%XX".

This method replaces ' ' with '+', so it should not be used for strings that are not application/x-www-form-urlencoded, such as the host and path components of a URI.

Parameters:
string - a Unicode string
encoding - a character encoding
Returns:
a percent-encoded US-ASCII string
Throws:
java.lang.NullPointerException - if any argument is null
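
The choice of character encoding changes the output, and the same encoding must be used when decoding. The sketch below shows this with the JDK's java.net.URLEncoder and java.net.URLDecoder as stand-ins; the class CharsetMismatchSketch and its encode and decode helpers are hypothetical and only mirror the shape of the methods documented here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetMismatchSketch {
    /** Hypothetical stand-in for encode(String, Charset). */
    static String encode(String s, Charset charset) {
        try {
            return URLEncoder.encode(s, charset.name());
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // cannot happen for an existing Charset
        }
    }

    /** Hypothetical stand-in for decode(String, Charset). */
    static String decode(String s, Charset charset) {
        try {
            return URLDecoder.decode(s, charset.name());
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // cannot happen for an existing Charset
        }
    }

    public static void main(String[] args) {
        String original = "flamb\u00e9"; // "flambé"

        String latin1 = encode(original, StandardCharsets.ISO_8859_1);
        System.out.println(latin1);                                   // flamb%E9
        System.out.println(encode(original, StandardCharsets.UTF_8)); // flamb%C3%A9

        // Decoding with the encoding used for encoding restores the original.
        System.out.println(original.equals(decode(latin1, StandardCharsets.ISO_8859_1))); // true

        // Decoding with a different encoding does not: 0xE9 alone is not valid UTF-8.
        System.out.println(original.equals(decode(latin1, StandardCharsets.UTF_8)));      // false
    }
}
```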

decode

public static java.lang.String decode(java.lang.String string)
Percent-decodes a US-ASCII string into a Unicode string. The DEFAULT_ENCODING, UTF-8, is used to determine what characters are represented by any consecutive sequences of the form "%XX".

This method replaces '+' with ' ', so it should not be used for strings that are not application/x-www-form-urlencoded, such as the host and path components of a URI.

Parameters:
string - a percent-encoded US-ASCII string
Returns:
a Unicode string
Throws:
java.lang.NullPointerException - if string is null
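
The decoding rules above ('+' becomes a space, "%XX" sequences are interpreted as UTF-8 bytes) can be illustrated with the JDK's java.net.URLDecoder. This is a sketch that assumes the two classes agree on these inputs; the class FormDecodingSketch and the helper formDecode are hypothetical stand-ins, not part of this API.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class FormDecodingSketch {
    /** Hypothetical stand-in for decode(String): form-style decoding with UTF-8. */
    static String formDecode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // '+' decodes to a space; %C3%A9 is the UTF-8 byte pair for 'é'.
        System.out.println(formDecode("caf%C3%A9+au+lait")); // café au lait

        // Hexadecimal digits are not case-sensitive, so this gives the same result.
        System.out.println(formDecode("caf%c3%a9+au+lait")); // café au lait

        // A literal plus sign must arrive escaped as %2B.
        System.out.println(formDecode("a%2Bb"));             // a+b
    }
}
```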

decode

public static java.lang.String decode(java.lang.String string,
                                      java.nio.charset.Charset encoding)
Percent-decodes a US-ASCII string into a Unicode string. The specified encoding is used to determine what characters are represented by any consecutive sequences of the form "%XX". Decoding is strict: an exception is thrown if any "%XX" sequence encountered is invalid (for example, "%HH").

This method replaces '+' with ' ', so it should not be used for strings that are not application/x-www-form-urlencoded, such as the host and path components of a URI.

Parameters:
string - a percent-encoded US-ASCII string
encoding - a character encoding
Returns:
a Unicode string
Throws:
java.lang.NullPointerException - if any argument is null
java.lang.RuntimeException - if decoding fails because a "%XX" sequence in the string is invalid (for example, "%HH")
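
Callers that handle untrusted input may want to guard against the strict-decoding failure. The sketch below uses the JDK's java.net.URLDecoder as a stand-in: it is also strict and throws IllegalArgumentException, a subclass of RuntimeException, on malformed escapes. The exact exception type thrown by this class is not stated beyond RuntimeException, and the class StrictDecodingSketch and the helper isDecodable are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class StrictDecodingSketch {
    /** Hypothetical helper: reports whether a strict decoder accepts the input. */
    static boolean isDecodable(String s) {
        try {
            URLDecoder.decode(s, "UTF-8");
            return true;
        } catch (IllegalArgumentException e) {
            return false; // malformed or incomplete "%XX" sequence
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isDecodable("flamb%C3%A9")); // true
        System.out.println(isDecodable("%HH"));         // false: 'H' is not a hex digit
        System.out.println(isDecodable("100%"));        // false: incomplete trailing escape
    }
}
```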