<?xml version="1.0" encoding="UTF-8" standalone="no"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"><head><title>Strings</title><link rel="stylesheet" href="core.css" type="text/css"/><meta name="generator" content="DocBook XSL Stylesheets V1.74.0"/></head><body><div class="sect1" title="Strings"><div class="titlepage"><div><div><h1 class="title"><a id="learnjava3-CHP-10-SECT-2"/>Strings</h1></div></div></div><p>We’ll start by taking a closer look at the Java <code class="literal">String</code> class (or, more specifically, <code class="literal">java.lang.String</code>). Because working with <code class="literal">String</code>s is so fundamental, it’s important to understand how they are implemented and what you can do with them. A <code class="literal">String</code> object encapsulates a sequence of Unicode characters. Internally, these characters are stored in a regular Java array, but the <code class="literal">String</code> object guards this array jealously and gives you access to it only through its own API. This is to support the idea that <code class="literal">String</code>s are <a id="I_indexterm10_id724416" class="indexterm"/><span class="emphasis"><em>immutable</em></span>; once you create a <code class="literal">String</code> object, you can’t change its value. Lots of operations on a <code class="literal">String</code> object appear to change the characters or length of a string, but what they really do is return a new <code class="literal">String</code> object that copies or internally references the needed characters of the original. Java implementations make an effort to consolidate identical strings used in the same class into a shared-string pool and to share parts of <code class="literal">String</code>s where possible.</p><p>The original motivation for all of this was performance. 
Immutable <code class="literal">String</code>s can save memory and be optimized for speed by the Java VM. The flip side is that a programmer should have a basic understanding of the <code class="literal">String</code> class in order to avoid creating an excessive number of <code class="literal">String</code> objects in places where performance is an issue. That was especially true in the past, when VMs were slow and handled memory poorly. Nowadays, string usage is not usually an issue in the overall performance of a real application.<sup>[<a id="learnjava3-CHP-10-FNOTE-1" href="#ftn.learnjava3-CHP-10-FNOTE-1" class="footnote">29</a>]</sup></p><div class="sect2" title="Constructing Strings"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-10-SECT-2.1"/>Constructing Strings</h2></div></div></div><p><a id="idx10572" class="indexterm"/> <a id="idx10589" class="indexterm"/>Literal strings, defined in your source code, are declared with double quotes and can be assigned to a <code class="literal">String</code> variable:</p><a id="I_10_tt549"/><pre class="programlisting"> <code class="n">String</code> <code class="n">quote</code> <code class="o">=</code> <code class="s">"To be or not to be"</code><code class="o">;</code></pre><p>Java automatically converts the literal string into a <code class="literal">String</code> object and assigns it to the variable.</p><p><code class="literal">String</code>s keep track of their own length, so <code class="literal">String</code> objects in Java don’t require special terminators. You can get the length of a <code class="literal">String</code> with the <a id="I_indexterm10_id724566" class="indexterm"/><code class="literal">length()</code> method. 
You can also test for a zero-length string by using <code class="literal">isEmpty()</code>:</p><a id="I_10_tt550"/><pre class="programlisting"> <code class="kt">int</code> <code class="n">length</code> <code class="o">=</code> <code class="n">quote</code><code class="o">.</code><code class="na">length</code><code class="o">();</code>
 <code class="kt">boolean</code> <code class="n">empty</code> <code class="o">=</code> <code class="n">quote</code><code class="o">.</code><code class="na">isEmpty</code><code class="o">();</code></pre><p><code class="literal">String</code>s can take advantage of the only overloaded operator in Java, the <a id="I_indexterm10_id724600" class="indexterm"/><a id="I_indexterm10_id724608" class="indexterm"/><code class="literal">+</code> operator, for string concatenation. The following code produces equivalent strings:</p><a id="I_10_tt551"/><pre class="programlisting"> <code class="n">String</code> <code class="n">name</code> <code class="o">=</code> <code class="s">"John "</code> <code class="o">+</code> <code class="s">"Smith"</code><code class="o">;</code>
 <code class="n">name</code> <code class="o">=</code> <code class="s">"John "</code><code class="o">.</code><code class="na">concat</code><code class="o">(</code><code class="s">"Smith"</code><code class="o">);</code></pre><p>Literal strings can’t span lines in Java source files, but we can concatenate lines to produce the same effect:</p><a id="I_10_tt552"/><pre class="programlisting"> <code class="n">String</code> <code class="n">poem</code> <code class="o">=</code>
     <code class="s">"'Twas brillig, and the slithy toves\n"</code> <code class="o">+</code>
     <code class="s">" Did gyre and gimble in the wabe:\n"</code> <code class="o">+</code>
     <code class="s">"All mimsy were the borogoves,\n"</code> <code class="o">+</code>
     <code class="s">" And the mome raths outgrabe.\n"</code><code class="o">;</code></pre><p>Embedding lengthy text in source code is not 
normally something you want to do. In this and the following chapter, we’ll talk about ways to load <code class="literal">String</code>s from files, special packages called resource bundles, and URLs. Technologies like JavaServer Pages and template engines also provide a way to factor out large amounts of text from your code. For example, in <a class="xref" href="ch14.html" title="Chapter 14. Programming for the Web">Chapter 14</a>, we’ll see how to load our poem from a web server by opening a URL like this:</p><a id="I_10_tt553"/><pre class="programlisting"> <code class="n">InputStream</code> <code class="n">poem</code> <code class="o">=</code> <code class="k">new</code> <code class="n">URL</code><code class="o">(</code>
     <code class="s">"http://myserver/~dodgson/jabberwocky.txt"</code><code class="o">).</code><code class="na">openStream</code><code class="o">();</code></pre><p>In addition to making strings from literal expressions, you can construct a <code class="literal">String</code> directly from an array of characters:</p><a id="I_10_tt554"/><pre class="programlisting"> <code class="kt">char</code> <code class="o">[]</code> <code class="n">data</code> <code class="o">=</code> <code class="k">new</code> <code class="kt">char</code> <code class="o">[]</code> <code class="o">{</code> <code class="sc">'L'</code><code class="o">,</code> <code class="sc">'e'</code><code class="o">,</code> <code class="sc">'m'</code><code class="o">,</code> <code class="sc">'m'</code><code class="o">,</code> <code class="sc">'i'</code><code class="o">,</code> <code class="sc">'n'</code><code class="o">,</code> <code class="sc">'g'</code> <code class="o">};</code>
 <code class="n">String</code> <code class="n">lemming</code> <code class="o">=</code> <code class="k">new</code> <code class="n">String</code><code class="o">(</code> <code class="n">data</code> <code class="o">);</code></pre><p>You can also construct a <code class="literal">String</code> from an array of bytes:</p><a 
id="I_10_tt555"/><pre class="programlisting"> <code class="kt">byte</code> <code class="o">[]</code> <code class="n">data</code> <code class="o">=</code> <code class="k">new</code> <code class="kt">byte</code> <code class="o">[]</code> <code class="o">{</code> <code class="o">(</code><code class="kt">byte</code><code class="o">)</code><code class="mi">97</code><code class="o">,</code> <code class="o">(</code><code class="kt">byte</code><code class="o">)</code><code class="mi">98</code><code class="o">,</code> <code class="o">(</code><code class="kt">byte</code><code class="o">)</code><code class="mi">99</code> <code class="o">};</code>
 <code class="n">String</code> <code class="n">abc</code> <code class="o">=</code> <code class="k">new</code> <code class="n">String</code><code class="o">(</code><code class="n">data</code><code class="o">,</code> <code class="s">"ISO8859_1"</code><code class="o">);</code></pre><p>In this case, the second argument to the <code class="literal">String</code> constructor is the name of a character-encoding scheme. The <code class="literal">String</code> constructor uses it to convert the raw bytes in the specified encoding to the internally used standard 2-byte Unicode characters. If you don’t specify a character encoding, the default encoding scheme on your system is used. We’ll discuss character encodings more when we talk about the <code class="literal">Charset</code> class and I/O in <a class="xref" href="ch12.html" title="Chapter 12. 
Input/Output Facilities">Chapter 12</a>.<sup>[<a id="learnjava3-CHP-10-FNOTE-2" href="#ftn.learnjava3-CHP-10-FNOTE-2" class="footnote">30</a>]</sup></p><p>Conversely, the <a id="I_indexterm10_id724745" class="indexterm"/><code class="literal">charAt()</code> method of the <code class="literal">String</code> class lets you access the characters of a <code class="literal">String</code> in an array-like fashion:</p><a id="I_10_tt556"/><pre class="programlisting"> <code class="n">String</code> <code class="n">s</code> <code class="o">=</code> <code class="s">"Newton"</code><code class="o">;</code>
 <code class="k">for</code> <code class="o">(</code> <code class="kt">int</code> <code class="n">i</code> <code class="o">=</code> <code class="mi">0</code><code class="o">;</code> <code class="n">i</code> <code class="o">&lt;</code> <code class="n">s</code><code class="o">.</code><code class="na">length</code><code class="o">();</code> <code class="n">i</code><code class="o">++</code> <code class="o">)</code>
     <code class="n">System</code><code class="o">.</code><code class="na">out</code><code class="o">.</code><code class="na">println</code><code class="o">(</code> <code class="n">s</code><code class="o">.</code><code class="na">charAt</code><code class="o">(</code> <code class="n">i</code> <code class="o">)</code> <code class="o">);</code></pre><p>This code prints the characters of the string one at a time. Alternatively, we can get the characters all at once with <a id="I_indexterm10_id724782" class="indexterm"/><code class="literal">toCharArray()</code>. 
Here’s a way to save typing a bunch of single quotes and get an array holding the alphabet:</p><a id="I_10_tt557"/><pre class="programlisting"> <code class="kt">char</code> <code class="o">[]</code> <code class="n">abcs</code> <code class="o">=</code> <code class="s">"abcdefghijklmnopqrstuvwxyz"</code><code class="o">.</code><code class="na">toCharArray</code><code class="o">();</code></pre><p>The notion that a <code class="literal">String</code> is a sequence of characters is also codified by the <code class="literal">String</code> class implementing the interface <code class="literal">java.lang.CharSequence</code>, which prescribes the methods <code class="literal">length()</code> and <code class="literal">charAt()</code> as well as a way to get a subset of the characters.<a id="I_indexterm10_id724834" class="indexterm"/><a id="I_indexterm10_id724841" class="indexterm"/></p></div><div class="sect2" title="Strings from Things"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-10-SECT-2.2"/>Strings from Things</h2></div></div></div><p><a id="idx10575" class="indexterm"/> <a id="idx10592" class="indexterm"/>Objects and primitive types in Java can be turned into a default textual representation as a <code class="literal">String</code>. For primitive types like numbers, the string should be fairly obvious; for object types, it is under the control of the object itself. We can get the string representation of an item with the static <a id="I_indexterm10_id724895" class="indexterm"/><code class="literal">String.valueOf()</code> method. 
Various overloaded versions of this method accept each of the primitive types:</p><a id="I_10_tt558"/><pre class="programlisting"> <code class="n">String</code> <code class="n">one</code> <code class="o">=</code> <code class="n">String</code><code class="o">.</code><code class="na">valueOf</code><code class="o">(</code> <code class="mi">1</code> <code class="o">);</code> <code class="c1">// integer, "1"</code>
 <code class="n">String</code> <code class="n">two</code> <code class="o">=</code> <code class="n">String</code><code class="o">.</code><code class="na">valueOf</code><code class="o">(</code> <code class="mf">2.384f</code> <code class="o">);</code> <code class="c1">// float, "2.384"</code>
 <code class="n">String</code> <code class="n">notTrue</code> <code class="o">=</code> <code class="n">String</code><code class="o">.</code><code class="na">valueOf</code><code class="o">(</code> <code class="kc">false</code> <code class="o">);</code> <code class="c1">// boolean, "false"</code></pre><p>All objects in Java have a <a id="I_indexterm10_id724919" class="indexterm"/><code class="literal">toString()</code> method that is inherited from the <code class="literal">Object</code> class. For many objects, this method returns a useful result that displays the contents of the object. For example, a <code class="literal">java.util.Date</code> object’s <code class="literal">toString()</code> method returns the date it represents formatted as a string. For objects that do not provide a representation, the string result is just an identifier (the class name and a hash code) that can be used for debugging. The <code class="literal">String.valueOf()</code> method, when called for an object, invokes the object’s <code class="literal">toString()</code> method and returns the result. 
The only real difference in using this method is that if you pass it a null object reference, it returns the <code class="literal">String</code> “null” for you, instead of producing a <code class="literal">NullPointerException</code>:</p><a id="I_10_tt559"/><pre class="programlisting"> <code class="n">Date</code> <code class="n">date</code> <code class="o">=</code> <code class="k">new</code> <code class="n">Date</code><code class="o">();</code>
 <code class="c1">// Equivalent, e.g., "Fri Dec 19 05:45:34 CST 1969"</code>
 <code class="n">String</code> <code class="n">d1</code> <code class="o">=</code> <code class="n">String</code><code class="o">.</code><code class="na">valueOf</code><code class="o">(</code> <code class="n">date</code> <code class="o">);</code>
 <code class="n">String</code> <code class="n">d2</code> <code class="o">=</code> <code class="n">date</code><code class="o">.</code><code class="na">toString</code><code class="o">();</code>
 <code class="n">date</code> <code class="o">=</code> <code class="kc">null</code><code class="o">;</code>
 <code class="n">d1</code> <code class="o">=</code> <code class="n">String</code><code class="o">.</code><code class="na">valueOf</code><code class="o">(</code> <code class="n">date</code> <code class="o">);</code> <code class="c1">// "null"</code>
 <code class="n">d2</code> <code class="o">=</code> <code class="n">date</code><code class="o">.</code><code class="na">toString</code><code class="o">();</code> <code class="c1">// NullPointerException!</code></pre><p>String concatenation uses the <code class="literal">valueOf()</code> method internally, so if you “add” an object or primitive using the plus operator (+), you get a <code class="literal">String</code>:</p><a id="I_10_tt560"/><pre class="programlisting"> <code class="n">String</code> <code class="n">today</code> <code class="o">=</code> <code class="s">"Today's date is :"</code> <code class="o">+</code> <code class="n">date</code><code 
class="o">;</code></pre><p>You’ll sometimes see people use the empty string and the plus operator (+) as shorthand to get the string value of an object. For example:<a id="I_indexterm10_id725028" class="indexterm"/><a id="I_indexterm10_id725035" class="indexterm"/></p><a id="I_10_tt561"/><pre class="programlisting"> <code class="n">String</code> <code class="n">two</code> <code class="o">=</code> <code class="s">""</code> <code class="o">+</code> <code class="mf">2.384f</code><code class="o">;</code>
 <code class="n">String</code> <code class="n">today</code> <code class="o">=</code> <code class="s">""</code> <code class="o">+</code> <code class="k">new</code> <code class="n">Date</code><code class="o">();</code></pre></div><div class="sect2" title="Comparing Strings"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-10-SECT-2.3"/>Comparing Strings</h2></div></div></div><p><a id="idx10571" class="indexterm"/> <a id="idx10588" class="indexterm"/>The standard <a id="I_indexterm10_id725085" class="indexterm"/><code class="literal">equals()</code> method compares strings for <span class="emphasis"><em>equality</em></span>; that is, it checks whether they contain exactly the same characters in the same order. 
You can use a different method, <a id="I_indexterm10_id725104" class="indexterm"/><code class="literal">equalsIgnoreCase()</code>, to check the equivalence of strings in a case-insensitive way:</p><a id="I_10_tt562"/><pre class="programlisting"> <code class="n">String</code> <code class="n">one</code> <code class="o">=</code> <code class="s">"FOO"</code><code class="o">;</code>
 <code class="n">String</code> <code class="n">two</code> <code class="o">=</code> <code class="s">"foo"</code><code class="o">;</code>
 <code class="n">one</code><code class="o">.</code><code class="na">equals</code><code class="o">(</code> <code class="n">two</code> <code class="o">);</code> <code class="c1">// false</code>
 <code class="n">one</code><code class="o">.</code><code class="na">equalsIgnoreCase</code><code class="o">(</code> <code class="n">two</code> <code class="o">);</code> <code class="c1">// true</code></pre><p>A common mistake for novice programmers in Java is to compare strings with the <a id="I_indexterm10_id725128" class="indexterm"/><code class="literal">==</code> operator when they intend to use the <code class="literal">equals()</code> method. Remember that strings are objects in Java, and <code class="literal">==</code> tests for object <span class="emphasis"><em>identity</em></span>; that is, whether the two arguments being tested are the same object. In Java, it’s easy to make two strings that have the same characters but are not the same string object. 
For example:</p><a id="I_10_tt563"/><pre class="programlisting"> <code class="n">String</code> <code class="n">foo1</code> <code class="o">=</code> <code class="s">"foo"</code><code class="o">;</code>
 <code class="n">String</code> <code class="n">foo2</code> <code class="o">=</code> <code class="n">String</code><code class="o">.</code><code class="na">valueOf</code><code class="o">(</code> <code class="k">new</code> <code class="kt">char</code> <code class="o">[]</code> <code class="o">{</code> <code class="sc">'f'</code><code class="o">,</code> <code class="sc">'o'</code><code class="o">,</code> <code class="sc">'o'</code> <code class="o">}</code> <code class="o">);</code>
 <code class="n">foo1</code> <code class="o">==</code> <code class="n">foo2</code> <code class="c1">// false!</code>
 <code class="n">foo1</code><code class="o">.</code><code class="na">equals</code><code class="o">(</code> <code class="n">foo2</code> <code class="o">)</code> <code class="c1">// true</code></pre><p>This mistake is particularly dangerous because it often works for the common case in which you are comparing literal strings (strings declared with double quotes right in the code). The reason for this is that Java tries to manage strings efficiently by combining them. At compile time, Java finds all the identical strings within a given class and makes only one object for them. This is safe because strings are immutable and cannot change. You can coalesce strings yourself in this way at runtime using the <code class="literal">String intern()</code> method. Interning a string returns an equivalent string reference that is unique across the VM.</p><p>The <a id="I_indexterm10_id725182" class="indexterm"/><code class="literal">compareTo()</code> method compares the lexical value of the <code class="literal">String</code> to another <code class="literal">String</code>, determining whether it sorts alphabetically earlier than, the same as, or later than the target string. 
It returns an integer that is less than, equal to, or greater than zero:</p><a id="I_10_tt564"/><pre class="programlisting"> <code class="n">String</code> <code class="n">abc</code> <code class="o">=</code> <code class="s">"abc"</code><code class="o">;</code>
 <code class="n">String</code> <code class="n">def</code> <code class="o">=</code> <code class="s">"def"</code><code class="o">;</code>
 <code class="n">String</code> <code class="n">num</code> <code class="o">=</code> <code class="s">"123"</code><code class="o">;</code>
 <code class="k">if</code> <code class="o">(</code> <code class="n">abc</code><code class="o">.</code><code class="na">compareTo</code><code class="o">(</code> <code class="n">def</code> <code class="o">)</code> <code class="o">&lt;</code> <code class="mi">0</code> <code class="o">)</code> <code class="c1">// true</code>
 <code class="k">if</code> <code class="o">(</code> <code class="n">abc</code><code class="o">.</code><code class="na">compareTo</code><code class="o">(</code> <code class="n">abc</code> <code class="o">)</code> <code class="o">==</code> <code class="mi">0</code> <code class="o">)</code> <code class="c1">// true</code>
 <code class="k">if</code> <code class="o">(</code> <code class="n">abc</code><code class="o">.</code><code class="na">compareTo</code><code class="o">(</code> <code class="n">num</code> <code class="o">)</code> <code class="o">&gt;</code> <code class="mi">0</code> <code class="o">)</code> <code class="c1">// true</code></pre><p>The <code class="literal">compareTo()</code> method compares strings strictly by their characters’ positions in the Unicode specification. This works for simple text but does not handle all language variations well. 
The <code class="literal">Collator</code> class, discussed next, can be used for more sophisticated comparisons.</p><div class="sect3" title="The Collator class"><div class="titlepage"><div><div><h3 class="title"><a id="learnjava3-CHP-10-SECT-2.3.1"/>The Collator class</h3></div></div></div><p><a id="idx10517" class="indexterm"/>The <code class="literal">java.text</code> package provides a sophisticated set of classes for comparing strings in specific languages. German, for example, has vowels with umlauts and another character that resembles the Greek letter beta and represents a double “s.” How should we sort these? Although the rules for sorting such characters are precisely defined, you can’t assume that the lexical comparison we used earlier has the correct meaning for languages other than English. Fortunately, the <code class="literal">Collator</code> class takes care of these complex sorting problems.</p><p>In the following example, we use a <code class="literal">Collator</code> designed to compare German strings. You can obtain a default <code class="literal">Collator</code> by calling the <code class="literal">Collator.getInstance()</code> method with no arguments. Once you have an appropriate <code class="literal">Collator</code> instance, you can use its <a id="I_indexterm10_id725298" class="indexterm"/><code class="literal">compare()</code> method, which returns values just like <code class="literal">String</code>’s <code class="literal">compareTo()</code> method. The following code creates two strings for the German translations of “fun” and “later,” using Unicode constants for these two special characters. It then compares them, using a <code class="literal">Collator</code> for the German locale. (<code class="literal">Locale</code>s help you deal with issues relevant to particular languages and cultures; we’ll talk about them in detail later in this chapter.) 
The result in this case is that “fun” (Spaß) sorts before “later” (später):</p><a id="I_10_tt565"/><pre class="programlisting"> <code class="n">String</code> <code class="n">fun</code> <code class="o">=</code> <code class="s">"Spa\u00df"</code><code class="o">;</code>
 <code class="n">String</code> <code class="n">later</code> <code class="o">=</code> <code class="s">"sp\u00e4ter"</code><code class="o">;</code>
 <code class="n">Collator</code> <code class="n">german</code> <code class="o">=</code> <code class="n">Collator</code><code class="o">.</code><code class="na">getInstance</code><code class="o">(</code><code class="n">Locale</code><code class="o">.</code><code class="na">GERMAN</code><code class="o">);</code>
 <code class="k">if</code> <code class="o">(</code><code class="n">german</code><code class="o">.</code><code class="na">compare</code><code class="o">(</code><code class="n">fun</code><code class="o">,</code> <code class="n">later</code><code class="o">)</code> <code class="o">&lt;</code> <code class="mi">0</code><code class="o">)</code> <code class="c1">// true</code></pre><p>Using collators is essential if you’re working with languages other than English. In Spanish, for example, “ll” and “ch” are treated as unique characters and alphabetized separately. A collator handles cases like these automatically.<a id="I_indexterm10_id725355" class="indexterm"/><a id="I_indexterm10_id725362" class="indexterm"/><a id="I_indexterm10_id725370" class="indexterm"/></p></div></div><div class="sect2" title="Searching"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-10-SECT-2.4"/>Searching</h2></div></div></div><p><a id="I_indexterm10_id725383" class="indexterm"/> <a id="I_indexterm10_id725390" class="indexterm"/> <a id="I_indexterm10_id725398" class="indexterm"/>The <code class="literal">String</code> class provides several simple methods for finding fixed substrings within a string. 
The <a id="I_indexterm10_id725415" class="indexterm"/><code class="literal">startsWith()</code> and <a id="I_indexterm10_id725426" class="indexterm"/><code class="literal">endsWith()</code> methods compare an argument string with the beginning and end of the <code class="literal">String</code>, respectively:</p><a id="I_10_tt566"/><pre class="programlisting"> <code class="n">String</code> <code class="n">url</code> <code class="o">=</code> <code class="s">"http://foo.bar.com/"</code><code class="o">;</code>
 <code class="k">if</code> <code class="o">(</code> <code class="n">url</code><code class="o">.</code><code class="na">startsWith</code><code class="o">(</code><code class="s">"http:"</code><code class="o">)</code> <code class="o">)</code> <code class="c1">// true</code></pre><p>The <a id="I_indexterm10_id725453" class="indexterm"/><code class="literal">indexOf()</code> method searches for the first occurrence of a character or substring and returns the starting character position, or <code class="literal">-1</code> if the substring is not found:</p><a id="I_10_tt567"/><pre class="programlisting"> <code class="n">String</code> <code class="n">abcs</code> <code class="o">=</code> <code class="s">"abcdefghijklmnopqrstuvwxyz"</code><code class="o">;</code>
 <code class="kt">int</code> <code class="n">i</code> <code class="o">=</code> <code class="n">abcs</code><code class="o">.</code><code class="na">indexOf</code><code class="o">(</code> <code class="sc">'p'</code> <code class="o">);</code> <code class="c1">// 15</code>
 <code class="n">i</code> <code class="o">=</code> <code class="n">abcs</code><code class="o">.</code><code class="na">indexOf</code><code class="o">(</code> <code class="s">"def"</code> <code class="o">);</code> <code class="c1">// 3</code>
 <code class="n">i</code> <code class="o">=</code> <code class="n">abcs</code><code class="o">.</code><code class="na">indexOf</code><code class="o">(</code> 
<code class="s">"Fang"</code> <code class="o">);</code> <code class="c1">// -1</code></pre><p>Similarly, <code class="literal">lastIndexOf()</code> searches backward through the string for the last occurrence of a character or substring.</p><p>The <a id="I_indexterm10_id725491" class="indexterm"/><code class="literal">contains()</code> method handles the very common task of checking to see whether a given substring is contained in the target string:</p><a id="I_10_tt568"/><pre class="programlisting"> <code class="n">String</code> <code class="n">log</code> <code class="o">=</code> <code class="s">"There is an emergency in sector 7!"</code><code class="o">;</code>
 <code class="k">if</code> <code class="o">(</code> <code class="n">log</code><code class="o">.</code><code class="na">contains</code><code class="o">(</code><code class="s">"emergency"</code><code class="o">)</code> <code class="o">)</code> <code class="n">pageSomeone</code><code class="o">();</code>
 <code class="c1">// equivalent to</code>
 <code class="k">if</code> <code class="o">(</code> <code class="n">log</code><code class="o">.</code><code class="na">indexOf</code><code class="o">(</code><code class="s">"emergency"</code><code class="o">)</code> <code class="o">!=</code> <code class="o">-</code><code class="mi">1</code> <code class="o">)</code> <code class="o">...</code></pre><p>For more complex searching, you can use the Regular Expression API, which allows you to look for and parse complex patterns. 
We’ll talk about regular expressions later in this chapter.</p></div><div class="sect2" title="Editing"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-10-SECT-2.5"/>Editing</h2></div></div></div><p><a id="I_indexterm10_id725525" class="indexterm"/> <a id="I_indexterm10_id725531" class="indexterm"/> <a id="I_indexterm10_id725539" class="indexterm"/>A number of methods operate on the <code class="literal">String</code> and return a new <code class="literal">String</code> as a result. While this is useful, you should be aware that creating lots of strings in this manner can affect performance. If you need to modify a string often or build a complex string from components, you should use the <code class="literal">StringBuilder</code> class, as we’ll discuss shortly.</p><p><a id="I_indexterm10_id725570" class="indexterm"/> <code class="literal">trim()</code> is a useful method that removes leading and trailing whitespace (e.g., space, carriage return, newline, and tab) from the <code class="literal">String</code>:</p><a id="I_10_tt569"/><pre class="programlisting"> <code class="n">String</code> <code class="n">str</code> <code class="o">=</code> <code class="s">" abc "</code><code class="o">;</code>
 <code class="n">str</code> <code class="o">=</code> <code class="n">str</code><code class="o">.</code><code class="na">trim</code><code class="o">();</code> <code class="c1">// "abc"</code></pre><p>In this example, we threw away the original <code class="literal">String</code> (with excess whitespace), and it will be garbage-collected.</p><p>The <a id="I_indexterm10_id725608" class="indexterm"/><code class="literal">toUpperCase()</code> and <a id="I_indexterm10_id725618" class="indexterm"/><code class="literal">toLowerCase()</code> methods return a new <code class="literal">String</code> of the appropriate case:</p><a id="I_10_tt570"/><pre class="programlisting"> <code class="n">String</code> <code class="n">down</code> <code class="o">=</code> <code 
class="s">"FOO"</code><code class="o">.</code><code class="na">toLowerCase</code><code class="o">();</code> <code class="c1">// "foo"</code>
 <code class="n">String</code> <code class="n">up</code> <code class="o">=</code> <code class="n">down</code><code class="o">.</code><code class="na">toUpperCase</code><code class="o">();</code> <code class="c1">// "FOO"</code></pre><p><a id="idx10577" class="indexterm"/> <code class="literal">substring()</code> returns a specified range of characters. The starting index is <span class="emphasis"><em>inclusive</em></span>; the ending is <span class="emphasis"><em>exclusive</em></span>:</p><a id="I_10_tt571"/><pre class="programlisting"> <code class="n">String</code> <code class="n">abcs</code> <code class="o">=</code> <code class="s">"abcdefghijklmnopqrstuvwxyz"</code><code class="o">;</code>
 <code class="n">String</code> <code class="n">cde</code> <code class="o">=</code> <code class="n">abcs</code><code class="o">.</code><code class="na">substring</code><code class="o">(</code> <code class="mi">2</code><code class="o">,</code> <code class="mi">5</code> <code class="o">);</code> <code class="c1">// "cde"</code></pre><p>The <a id="I_indexterm10_id725680" class="indexterm"/><code class="literal">replace()</code> method provides simple, literal string substitution. Every occurrence of the target string is replaced with the replacement string, moving from beginning to end. 
For example:</p><a id="I_10_tt572"/><pre class="programlisting"> <code class="n">String</code> <code class="n">message</code> <code class="o">=</code> <code class="s">"Hello NAME, how are you?"</code><code class="o">.</code><code class="na">replace</code><code class="o">(</code> <code class="s">"NAME"</code><code class="o">,</code> <code class="s">"Penny"</code> <code class="o">);</code> <code class="c1">// "Hello Penny, how are you?"</code> <code class="n">String</code> <code class="n">xy</code> <code class="o">=</code> <code class="s">"xxooxxxoo"</code><code class="o">.</code><code class="na">replace</code><code class="o">(</code> <code class="s">"xx"</code><code class="o">,</code> <code class="s">"X"</code> <code class="o">);</code> <code class="c1">// "XooXxoo"</code></pre><p>The <code class="literal">String</code> class also has two methods that allow you to do more complex pattern substitution: <a id="I_indexterm10_id725710" class="indexterm"/><code class="literal">replaceAll()</code> and <a id="I_indexterm10_id725721" class="indexterm"/><code class="literal">replaceFirst()</code>. Unlike the simple <code class="literal">replace()</code> method, these methods use regular expressions (a special syntax) to describe the replacement pattern, which we’ll cover later in this chapter.</p></div><div class="sect2" title="String Method Summary"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-10-SECT-2.6"/>String Method Summary</h2></div></div></div><p><a id="idx10573" class="indexterm"/> <a id="idx10590" class="indexterm"/> <a class="xref" href="ch10s02.html#learnjava3-CHP-10-TABLE-2" title="Table 10-2. String methods">Table 10-2</a> summarizes the methods provided by the <code class="literal">String</code> class.</p><div class="table"><a id="learnjava3-CHP-10-TABLE-2"/><p class="title">Table 10-2. 
String methods</p><div class="table-contents"><table summary="String methods" style="border-collapse: collapse;border-top: 0.5pt solid ; border-bottom: 0.5pt solid ; "><colgroup><col/><col/></colgroup><thead><tr><th style="text-align: left"><p>Method</p></th><th style="text-align: left"><p>Functionality</p></th></tr></thead><tbody><tr><td style="text-align: left"><p> <a id="I_indexterm10_id725829" class="indexterm"/> <code class="literal">charAt()</code> </p></td><td style="text-align: left"><p>Gets a particular character in the string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id725853" class="indexterm"/> <code class="literal">compareTo()</code> </p></td><td style="text-align: left"><p>Compares the string with another string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id725876" class="indexterm"/> <code class="literal">concat()</code> </p></td><td style="text-align: left"><p>Concatenates the string with another string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id725900" class="indexterm"/> <code class="literal">contains()</code> </p></td><td style="text-align: left"><p>Checks whether the string contains another string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id725924" class="indexterm"/> <code class="literal">copyValueOf()</code> </p></td><td style="text-align: left"><p>Returns a string equivalent to the specified character array</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id725948" class="indexterm"/> <code class="literal">endsWith()</code> </p></td><td style="text-align: left"><p>Checks whether the string ends with a specified suffix</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id725972" class="indexterm"/> <code class="literal">equals()</code> </p></td><td style="text-align: left"><p>Compares the string with another string</p></td></tr><tr><td style="text-align: left"><p> <a 
id="I_indexterm10_id725995" class="indexterm"/> <code class="literal">equalsIgnoreCase()</code> </p></td><td style="text-align: left"><p>Compares the string with another string, ignoring case</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726019" class="indexterm"/> <code class="literal">getBytes()</code> </p></td><td style="text-align: left"><p>Copies characters from the string into a byte array</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726043" class="indexterm"/> <code class="literal">getChars()</code> </p></td><td style="text-align: left"><p>Copies characters from the string into a character array</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726067" class="indexterm"/> <code class="literal">hashCode()</code> </p></td><td style="text-align: left"><p>Returns a hashcode for the string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726091" class="indexterm"/> <code class="literal">indexOf()</code> </p></td><td style="text-align: left"><p>Searches for the first occurrence of a character or substring in the string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726115" class="indexterm"/> <code class="literal">intern()</code> </p></td><td style="text-align: left"><p>Fetches a unique instance of the string from a global shared-string pool</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726139" class="indexterm"/> <code class="literal">isEmpty()</code> </p></td><td style="text-align: left"><p>Returns true if the string is zero length</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726162" class="indexterm"/> <code class="literal">lastIndexOf()</code> </p></td><td style="text-align: left"><p>Searches for the last occurrence of a character or substring in a string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726186" class="indexterm"/> <code 
class="literal">length()</code> </p></td><td style="text-align: left"><p>Returns the length of the string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726210" class="indexterm"/> <code class="literal">matches()</code> </p></td><td style="text-align: left"><p>Determines if the whole string matches a regular expression pattern</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726234" class="indexterm"/> <code class="literal">regionMatches()</code> </p></td><td style="text-align: left"><p>Checks whether a region of the string matches the specified region of another string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726258" class="indexterm"/> <code class="literal">replace()</code> </p></td><td style="text-align: left"><p>Replaces all occurrences of a character or literal substring in the string with another</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726282" class="indexterm"/> <code class="literal">replaceAll()</code> </p></td><td style="text-align: left"><p>Replaces all occurrences of a regular expression pattern with a replacement string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726306" class="indexterm"/> <code class="literal">replaceFirst()</code> </p></td><td style="text-align: left"><p>Replaces the first occurrence of a regular expression pattern with a replacement string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726330" class="indexterm"/> <code class="literal">split()</code> </p></td><td style="text-align: left"><p>Splits the string into an array of strings using a regular expression pattern as a delimiter</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726354" class="indexterm"/> <code class="literal">startsWith()</code> </p></td><td style="text-align: left"><p>Checks whether the string starts with a specified prefix</p></td></tr><tr><td style="text-align: left"><p> <code 
class="literal">substring()</code> </p></td><td style="text-align: left"><p>Returns a substring from the string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726396" class="indexterm"/> <code class="literal">toCharArray()</code> </p></td><td style="text-align: left"><p>Returns the array of characters from the string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726419" class="indexterm"/> <code class="literal">toLowerCase()</code> </p></td><td style="text-align: left"><p>Converts the string to lowercase</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726443" class="indexterm"/> <code class="literal">toString()</code> </p></td><td style="text-align: left"><p>Returns the string value of an object</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726467" class="indexterm"/> <code class="literal">toUpperCase()</code> </p></td><td style="text-align: left"><p>Converts the string to uppercase</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726491" class="indexterm"/> <code class="literal">trim()</code> </p></td><td style="text-align: left"><p>Removes leading and trailing whitespace from the string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726514" class="indexterm"/> <code class="literal">valueOf()</code> </p></td><td style="text-align: left"><p>Returns a string representation of a value<a id="I_indexterm10_id726532" class="indexterm"/><a id="I_indexterm10_id726539" class="indexterm"/></p></td></tr></tbody></table></div></div></div><div class="sect2" title="StringBuilder and StringBuffer"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-10-SECT-2.7"/>StringBuilder and StringBuffer</h2></div></div></div><p><a id="idx10569" class="indexterm"/> <a id="idx10570" class="indexterm"/> <a id="idx10574" class="indexterm"/> <a id="idx10591" class="indexterm"/>In contrast to the immutable string, the 
<code class="literal">java.lang.StringBuilder</code> class is a modifiable and expandable buffer for characters. You can use it to create a big string efficiently. <code class="literal">StringBuilder</code> and <code class="literal">StringBuffer</code> are twins; they have exactly the same API. <code class="literal">StringBuilder</code> was added in Java 5.0 as a drop-in, unsynchronized replacement for <code class="literal">StringBuffer</code>. We’ll come back to that in a bit.</p><p>First, let’s look at some examples of <code class="literal">String</code> construction:</p><a id="I_10_tt573"/><pre class="programlisting"> <code class="c1">// Could be better</code> <code class="n">String</code> <code class="n">ball</code> <code class="o">=</code> <code class="s">"Hello"</code><code class="o">;</code> <code class="n">ball</code> <code class="o">=</code> <code class="n">ball</code> <code class="o">+</code> <code class="s">" there."</code><code class="o">;</code> <code class="n">ball</code> <code class="o">=</code> <code class="n">ball</code> <code class="o">+</code> <code class="s">" How are you?"</code><code class="o">;</code></pre><p>This example creates an unnecessary <code class="literal">String</code> object each time we use the concatenation operator (+). Whether this is significant depends on how often this code is run and how big the string actually gets. Here’s a more extreme example:</p><a id="I_10_tt574"/><pre class="programlisting"> <code class="c1">// Bad use of + ...</code> <code class="k">while</code><code class="o">(</code> <code class="o">(</code><code class="n">line</code> <code class="o">=</code> <code class="n">readLine</code><code class="o">())</code> <code class="o">!=</code> <code class="n">EOF</code> <code class="o">)</code> <code class="n">text</code> <code class="o">+=</code> <code class="n">line</code><code class="o">;</code></pre><p>This example repeatedly produces new <code class="literal">String</code> objects. 
The character array must be copied over and over, which can adversely affect performance. The solution is to use a <code class="literal">StringBuilder</code> object and its <a id="I_indexterm10_id726684" class="indexterm"/><code class="literal">append()</code> method:</p><a id="I_10_tt575"/><pre class="programlisting"> <code class="n">StringBuilder</code> <code class="n">sb</code> <code class="o">=</code> <code class="k">new</code> <code class="n">StringBuilder</code><code class="o">(</code><code class="s">"Hello"</code><code class="o">);</code> <code class="n">sb</code><code class="o">.</code><code class="na">append</code><code class="o">(</code><code class="s">" there."</code><code class="o">);</code> <code class="n">sb</code><code class="o">.</code><code class="na">append</code><code class="o">(</code><code class="s">" How are you?"</code><code class="o">);</code> <code class="n">StringBuilder</code> <code class="n">text</code> <code class="o">=</code> <code class="k">new</code> <code class="n">StringBuilder</code><code class="o">();</code> <code class="k">while</code><code class="o">(</code> <code class="o">(</code><code class="n">line</code> <code class="o">=</code> <code class="n">readLine</code><code class="o">())</code> <code class="o">!=</code> <code class="n">EOF</code> <code class="o">)</code> <code class="n">text</code><code class="o">.</code><code class="na">append</code><code class="o">(</code> <code class="n">line</code> <code class="o">);</code></pre><p>Here, the <code class="literal">StringBuilder</code> efficiently handles expanding the array as necessary. 
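</p><p>The <code class="literal">readLine()</code> loop above is only pseudocode, so here is a minimal, runnable sketch of the same idea. The class name <code class="literal">LineJoiner</code>, the helper <code class="literal">joinLines()</code>, and the sample array are hypothetical names chosen for illustration; they stand in for whatever source of lines a real program would have:</p><pre class="programlisting">
public class LineJoiner {

    // Hypothetical helper: concatenates every element of lines, in order.
    static String joinLines(String[] lines) {
        StringBuilder text = new StringBuilder();
        for (String line : lines) {
            // append() adds to the builder's internal buffer instead of
            // creating a new String on every pass through the loop.
            text.append(line);
        }
        // A single String is created here, once the loop is finished.
        return text.toString();
    }

    public static void main(String[] args) {
        String[] lines = { "Hello", " there.", " How are you?" };
        System.out.println(joinLines(lines)); // prints: Hello there. How are you?
    }
}
</pre><p>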
We can get a <code class="literal">String</code> back from the <code class="literal">StringBuilder</code> with its <code class="literal">toString()</code> method:</p><a id="I_10_tt576"/><pre class="programlisting"> <code class="n">String</code> <code class="n">message</code> <code class="o">=</code> <code class="n">sb</code><code class="o">.</code><code class="na">toString</code><code class="o">();</code></pre><p>You can also retrieve part of a <code class="literal">StringBuilder</code> as a <code class="literal">String</code> by using one of the <a id="I_indexterm10_id726750" class="indexterm"/><code class="literal">substring()</code> methods.</p><p>You might be interested to know that when you write a long expression using string concatenation, the compiler generates code that uses a <code class="literal">StringBuilder</code> behind the scenes:</p><a id="I_10_tt577"/><pre class="programlisting"> <code class="n">String</code> <code class="n">foo</code> <code class="o">=</code> <code class="s">"To "</code> <code class="o">+</code> <code class="s">"be "</code>