<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Strings</title><link rel="stylesheet" href="core.css" type="text/css"/><meta name="generator" content="DocBook XSL Stylesheets V1.74.0"/></head><body><div class="sect1" title="Strings"><div class="titlepage"><div><div><h1 class="title"><a id="learnjava3-CHP-10-SECT-2"/>Strings</h1></div></div></div><p>We’ll start by taking a closer look at the Java <code class="literal">String</code> class (or, more specifically, <code class="literal">java.lang.String</code>). Because working with <code class="literal">String</code>s is so fundamental, it’s important to
understand how they are implemented and what you can do with them. A
<code class="literal">String</code> object encapsulates a sequence
of Unicode characters. Internally, these characters are stored in a
regular Java array, but the <code class="literal">String</code>
object guards this array jealously and gives you access to it only through
its own API. This is to support the idea that <code class="literal">String</code>s are <a id="I_indexterm10_id724416" class="indexterm"/><span class="emphasis"><em>immutable</em></span>; once you create a <code class="literal">String</code> object, you can’t change its value. Lots
of operations on a <code class="literal">String</code> object appear
to change the characters or length of a string, but what they really do is
return a new <code class="literal">String</code> object that copies
or internally references the needed characters of the original. Java
implementations make an effort to consolidate identical strings used in
the same class into a shared-string pool and to share parts of <code class="literal">String</code>s where possible.</p><p>The original motivation for all of this was performance. Immutable
<code class="literal">String</code>s can save memory and be
optimized for speed by the Java VM. The flip side is that a programmer
should have a basic understanding of the <code class="literal">String</code> class in order to avoid creating an
excessive number of <code class="literal">String</code> objects in
places where performance is an issue. That was especially true in the
past, when VMs were slow and handled memory poorly. Nowadays, string usage
is not usually an issue in the overall performance of a real
application.<sup>[<a id="learnjava3-CHP-10-FNOTE-1" href="#ftn.learnjava3-CHP-10-FNOTE-1" class="footnote">29</a>]</sup></p><div class="sect2" title="Constructing Strings"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-10-SECT-2.1"/>Constructing Strings</h2></div></div></div><p><a id="idx10572" class="indexterm"/> <a id="idx10589" class="indexterm"/>Literal strings, defined in your source code, are declared
with double quotes and can be assigned to a <code class="literal">String</code> variable:</p><a id="I_10_tt549"/><pre class="programlisting"> <code class="n">String</code> <code class="n">quote</code> <code class="o">=</code> <code class="s">"To be or not to be"</code><code class="o">;</code></pre><p>Java automatically converts the literal string into a <code class="literal">String</code> object and assigns it to the
variable.</p><p><code class="literal">String</code>s keep track of their own
length, so <code class="literal">String</code> objects in Java
don’t require special terminators. You can get the length of a <code class="literal">String</code> with the <a id="I_indexterm10_id724566" class="indexterm"/><code class="literal">length()</code> method. You
can also test for a zero length string by using <code class="literal">isEmpty()</code>:</p><a id="I_10_tt550"/><pre class="programlisting"> <code class="kt">int</code> <code class="n">length</code> <code class="o">=</code> <code class="n">quote</code><code class="o">.</code><code class="na">length</code><code class="o">();</code>
<code class="kt">boolean</code> <code class="n">empty</code> <code class="o">=</code> <code class="n">quote</code><code class="o">.</code><code class="na">isEmpty</code><code class="o">();</code></pre><p><code class="literal">String</code>s can take advantage of
the only overloaded operator in Java, the <a id="I_indexterm10_id724600" class="indexterm"/><a id="I_indexterm10_id724608" class="indexterm"/><code class="literal">+</code> operator, for string
concatenation. The following code produces equivalent strings:</p><a id="I_10_tt551"/><pre class="programlisting"> <code class="n">String</code> <code class="n">name</code> <code class="o">=</code> <code class="s">"John "</code> <code class="o">+</code> <code class="s">"Smith"</code><code class="o">;</code>
<code class="n">String</code> <code class="n">sameName</code> <code class="o">=</code> <code class="s">"John "</code><code class="o">.</code><code class="na">concat</code><code class="o">(</code><code class="s">"Smith"</code><code class="o">);</code></pre><p>Literal strings can’t span lines in Java source files, but we can
concatenate lines to produce the same effect:</p><a id="I_10_tt552"/><pre class="programlisting"> <code class="n">String</code> <code class="n">poem</code> <code class="o">=</code>
<code class="s">"'Twas brillig, and the slithy toves\n"</code> <code class="o">+</code>
<code class="s">" Did gyre and gimble in the wabe:\n"</code> <code class="o">+</code>
<code class="s">"All mimsy were the borogoves,\n"</code> <code class="o">+</code>
<code class="s">" And the mome raths outgrabe.\n"</code><code class="o">;</code></pre><p>Embedding lengthy text in source code is not normally something
you want to do. In this and the following chapter, we’ll talk about ways
to load <code class="literal">String</code>s from files, special
packages called resource bundles, and URLs. Technologies like Java
Server Pages and template engines also provide a way to factor out large
amounts of text from your code. For example, in <a class="xref" href="ch14.html" title="Chapter 14. Programming for the Web">Chapter 14</a>, we’ll see how to load our poem from a
web server by opening a URL like this:</p><a id="I_10_tt553"/><pre class="programlisting"> <code class="n">InputStream</code> <code class="n">poem</code> <code class="o">=</code> <code class="k">new</code> <code class="n">URL</code><code class="o">(</code>
<code class="s">"http://myserver/~dodgson/jabberwocky.txt"</code><code class="o">).</code><code class="na">openStream</code><code class="o">();</code></pre><p>In addition to making strings from literal expressions, you can
construct a <code class="literal">String</code> directly from an
array of characters:</p><a id="I_10_tt554"/><pre class="programlisting"> <code class="kt">char</code> <code class="o">[]</code> <code class="n">data</code> <code class="o">=</code> <code class="k">new</code> <code class="kt">char</code> <code class="o">[]</code> <code class="o">{</code> <code class="sc">'L'</code><code class="o">,</code> <code class="sc">'e'</code><code class="o">,</code> <code class="sc">'m'</code><code class="o">,</code> <code class="sc">'m'</code><code class="o">,</code> <code class="sc">'i'</code><code class="o">,</code> <code class="sc">'n'</code><code class="o">,</code> <code class="sc">'g'</code> <code class="o">};</code>
<code class="n">String</code> <code class="n">lemming</code> <code class="o">=</code> <code class="k">new</code> <code class="n">String</code><code class="o">(</code> <code class="n">data</code> <code class="o">);</code></pre><p>You can also construct a <code class="literal">String</code>
from an array of bytes:</p><a id="I_10_tt555"/><pre class="programlisting"> <code class="kt">byte</code> <code class="o">[]</code> <code class="n">data</code> <code class="o">=</code> <code class="k">new</code> <code class="kt">byte</code> <code class="o">[]</code> <code class="o">{</code> <code class="o">(</code><code class="kt">byte</code><code class="o">)</code><code class="mi">97</code><code class="o">,</code> <code class="o">(</code><code class="kt">byte</code><code class="o">)</code><code class="mi">98</code><code class="o">,</code> <code class="o">(</code><code class="kt">byte</code><code class="o">)</code><code class="mi">99</code> <code class="o">};</code>
<code class="n">String</code> <code class="n">abc</code> <code class="o">=</code> <code class="k">new</code> <code class="n">String</code><code class="o">(</code><code class="n">data</code><code class="o">,</code> <code class="s">"ISO8859_1"</code><code class="o">);</code></pre><p>In this case, the second argument to the <code class="literal">String</code> constructor is the name of a
character-encoding scheme. The <code class="literal">String</code>
constructor uses it to convert the raw bytes in the specified encoding
to the internally used standard 2-byte Unicode characters. If you don’t
specify a character encoding, the default encoding scheme on your system
is used. We’ll discuss character encodings more when we talk about the
<code class="literal">Charset</code> class and I/O in <a class="xref" href="ch12.html" title="Chapter 12. Input/Output Facilities">Chapter 12</a>.<sup>[<a id="learnjava3-CHP-10-FNOTE-2" href="#ftn.learnjava3-CHP-10-FNOTE-2" class="footnote">30</a>]</sup></p><p>Conversely, the <a id="I_indexterm10_id724745" class="indexterm"/><code class="literal">charAt()</code> method of the
<code class="literal">String</code> class lets you access the
characters of a <code class="literal">String</code> in an
array-like fashion:</p><a id="I_10_tt556"/><pre class="programlisting"> <code class="n">String</code> <code class="n">s</code> <code class="o">=</code> <code class="s">"Newton"</code><code class="o">;</code>
<code class="k">for</code> <code class="o">(</code> <code class="kt">int</code> <code class="n">i</code> <code class="o">=</code> <code class="mi">0</code><code class="o">;</code> <code class="n">i</code> <code class="o">&lt;</code> <code class="n">s</code><code class="o">.</code><code class="na">length</code><code class="o">();</code> <code class="n">i</code><code class="o">++</code> <code class="o">)</code>
<code class="n">System</code><code class="o">.</code><code class="na">out</code><code class="o">.</code><code class="na">println</code><code class="o">(</code> <code class="n">s</code><code class="o">.</code><code class="na">charAt</code><code class="o">(</code> <code class="n">i</code> <code class="o">)</code> <code class="o">);</code></pre><p>This code prints the characters of the string one at a time.
Alternatively, we can get the characters all at once with <a id="I_indexterm10_id724782" class="indexterm"/><code class="literal">toCharArray()</code>. Here’s a
way to save typing a bunch of single quotes and get an array holding the
alphabet:</p><a id="I_10_tt557"/><pre class="programlisting"> <code class="kt">char</code> <code class="o">[]</code> <code class="n">abcs</code> <code class="o">=</code> <code class="s">"abcdefghijklmnopqrstuvwxyz"</code><code class="o">.</code><code class="na">toCharArray</code><code class="o">();</code></pre><p>The notion that a <code class="literal">String</code> is a
sequence of characters is also codified by the <code class="literal">String</code> class implementing the interface
<code class="literal">java.lang.CharSequence</code>, which
prescribes the methods <code class="literal">length()</code> and
<code class="literal">charAt()</code> as well as a way to get a
subset of the characters.<a id="I_indexterm10_id724834" class="indexterm"/><a id="I_indexterm10_id724841" class="indexterm"/></p></div><div class="sect2" title="Strings from Things"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-10-SECT-2.2"/>Strings from Things</h2></div></div></div><p><a id="idx10575" class="indexterm"/> <a id="idx10592" class="indexterm"/>Objects and primitive types in Java can be turned into a
default textual representation as a <code class="literal">String</code>. For primitive types like numbers, the
string should be fairly obvious; for object types, it is under the
control of the object itself. We can get the string representation of an item with the static
<a id="I_indexterm10_id724895" class="indexterm"/><code class="literal">String.valueOf()</code>
method. Various overloaded versions of this method accept each of the
primitive types:</p><a id="I_10_tt558"/><pre class="programlisting"> <code class="n">String</code> <code class="n">one</code> <code class="o">=</code> <code class="n">String</code><code class="o">.</code><code class="na">valueOf</code><code class="o">(</code> <code class="mi">1</code> <code class="o">);</code> <code class="c1">// integer, "1"</code>
<code class="n">String</code> <code class="n">two</code> <code class="o">=</code> <code class="n">String</code><code class="o">.</code><code class="na">valueOf</code><code class="o">(</code> <code class="mf">2.384f</code> <code class="o">);</code> <code class="c1">// float, "2.384"</code>
<code class="n">String</code> <code class="n">notTrue</code> <code class="o">=</code> <code class="n">String</code><code class="o">.</code><code class="na">valueOf</code><code class="o">(</code> <code class="kc">false</code> <code class="o">);</code> <code class="c1">// boolean, "false"</code></pre><p>All objects in Java have a <a id="I_indexterm10_id724919" class="indexterm"/><code class="literal">toString()</code> method that
is inherited from the <code class="literal">Object</code> class.
For many objects, this method returns a useful result that displays the
contents of the object. For example, a <code class="literal">java</code>.<code class="literal">util</code>.<code class="literal">Date</code>
object’s <code class="literal">toString()</code> method returns
the date it represents formatted as a string. For objects that do not
provide a representation, the string result is just an identifier (the class name and a hash code)
that can be used for debugging. The <code class="literal">String.valueOf()</code> method, when called for an
object, invokes the object’s <code class="literal">toString()</code> method and returns the result. The
only real difference in using this method is that if you pass it a null
object reference, it returns the <code class="literal">String</code> “null” for you, instead of producing a
<code class="literal">NullPointerException</code>:</p><a id="I_10_tt559"/><pre class="programlisting"> <code class="n">Date</code> <code class="n">date</code> <code class="o">=</code> <code class="k">new</code> <code class="n">Date</code><code class="o">();</code>
<code class="c1">// Equivalent, e.g., "Fri Dec 19 05:45:34 CST 1969"</code>
<code class="n">String</code> <code class="n">d1</code> <code class="o">=</code> <code class="n">String</code><code class="o">.</code><code class="na">valueOf</code><code class="o">(</code> <code class="n">date</code> <code class="o">);</code>
<code class="n">String</code> <code class="n">d2</code> <code class="o">=</code> <code class="n">date</code><code class="o">.</code><code class="na">toString</code><code class="o">();</code>
<code class="n">date</code> <code class="o">=</code> <code class="kc">null</code><code class="o">;</code>
<code class="n">d1</code> <code class="o">=</code> <code class="n">String</code><code class="o">.</code><code class="na">valueOf</code><code class="o">(</code> <code class="n">date</code> <code class="o">);</code> <code class="c1">// "null"</code>
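<code class="c1">// Added sketch, not part of the original listing (d3 is an illustrative name):</code>
<code class="c1">// string concatenation also converts a null reference to "null".</code>
<code class="n">String</code> <code class="n">d3</code> <code class="o">=</code> <code class="s">""</code> <code class="o">+</code> <code class="n">date</code><code class="o">;</code> <code class="c1">// "null"</code>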
<code class="n">d2</code> <code class="o">=</code> <code class="n">date</code><code class="o">.</code><code class="na">toString</code><code class="o">();</code> <code class="c1">// NullPointerException!</code></pre><p>String concatenation uses the <code class="literal">valueOf()</code> method internally, so if you “add”
an object or primitive using the plus operator (+), you get a <code class="literal">String</code>:</p><a id="I_10_tt560"/><pre class="programlisting"> <code class="n">String</code> <code class="n">today</code> <code class="o">=</code> <code class="s">"Today's date is: "</code> <code class="o">+</code> <code class="n">date</code><code class="o">;</code></pre><p>You’ll sometimes see people use the empty string and the plus
operator (+) as shorthand to get the string value of an object. For
example:<a id="I_indexterm10_id725028" class="indexterm"/><a id="I_indexterm10_id725035" class="indexterm"/></p><a id="I_10_tt561"/><pre class="programlisting"> <code class="n">String</code> <code class="n">two</code> <code class="o">=</code> <code class="s">""</code> <code class="o">+</code> <code class="mf">2.384f</code><code class="o">;</code>
<code class="n">String</code> <code class="n">today</code> <code class="o">=</code> <code class="s">""</code> <code class="o">+</code> <code class="k">new</code> <code class="n">Date</code><code class="o">();</code></pre></div><div class="sect2" title="Comparing Strings"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-10-SECT-2.3"/>Comparing Strings</h2></div></div></div><p><a id="idx10571" class="indexterm"/> <a id="idx10588" class="indexterm"/>The standard <a id="I_indexterm10_id725085" class="indexterm"/><code class="literal">equals()</code> method can
compare strings for <span class="emphasis"><em>equality</em></span>, that is, whether they contain exactly
the same characters in the same order. You can use a different method,
<a id="I_indexterm10_id725104" class="indexterm"/><code class="literal">equalsIgnoreCase()</code>, to
check the equivalence of strings in a case-insensitive way:</p><a id="I_10_tt562"/><pre class="programlisting"> <code class="n">String</code> <code class="n">one</code> <code class="o">=</code> <code class="s">"FOO"</code><code class="o">;</code>
<code class="n">String</code> <code class="n">two</code> <code class="o">=</code> <code class="s">"foo"</code><code class="o">;</code>
<code class="n">one</code><code class="o">.</code><code class="na">equals</code><code class="o">(</code> <code class="n">two</code> <code class="o">);</code> <code class="c1">// false</code>
<code class="n">one</code><code class="o">.</code><code class="na">equalsIgnoreCase</code><code class="o">(</code> <code class="n">two</code> <code class="o">);</code> <code class="c1">// true</code></pre><p>A common mistake for novice programmers in Java is to compare
strings with the <a id="I_indexterm10_id725128" class="indexterm"/><code class="literal">==</code> operator when they
intend to use the <code class="literal">equals()</code> method.
Remember that strings are objects in Java, and <code class="literal">==</code> tests for object
<span class="emphasis"><em>identity</em></span>; that is, whether the two arguments being
tested are the same object. In Java, it’s easy to make two strings that
have the same characters but are not the same string object. For
example:</p><a id="I_10_tt563"/><pre class="programlisting"> <code class="n">String</code> <code class="n">foo1</code> <code class="o">=</code> <code class="s">"foo"</code><code class="o">;</code>
<code class="n">String</code> <code class="n">foo2</code> <code class="o">=</code> <code class="n">String</code><code class="o">.</code><code class="na">valueOf</code><code class="o">(</code> <code class="k">new</code> <code class="kt">char</code> <code class="o">[]</code> <code class="o">{</code> <code class="sc">'f'</code><code class="o">,</code> <code class="sc">'o'</code><code class="o">,</code> <code class="sc">'o'</code> <code class="o">}</code> <code class="o">);</code>
<code class="n">foo1</code> <code class="o">==</code> <code class="n">foo2</code> <code class="c1">// false!</code>
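<code class="c1">// Added sketch, not part of the original listing: intern() returns the pooled</code>
<code class="c1">// copy, which is the same object as the "foo" literal, so identity holds.</code>
<code class="n">foo1</code> <code class="o">==</code> <code class="n">foo2</code><code class="o">.</code><code class="na">intern</code><code class="o">()</code> <code class="c1">// true</code>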
<code class="n">foo1</code><code class="o">.</code><code class="na">equals</code><code class="o">(</code> <code class="n">foo2</code> <code class="o">)</code> <code class="c1">// true</code></pre><p>This mistake is particularly dangerous because it often works for
the common case in which you are comparing literal strings (strings
declared with double quotes right in the code). The reason for this is
that Java tries to manage strings efficiently by combining them. At
compile time, Java finds all the identical strings within a given class
and makes only one object for them. This is safe because strings are
immutable and cannot change. You can coalesce strings yourself in this
way at runtime using the <code class="literal">String
intern()</code> method. Interning a string returns an equivalent
string reference that is unique across the VM.</p><p>The <a id="I_indexterm10_id725182" class="indexterm"/><code class="literal">compareTo()</code> method
compares the lexical value of the <code class="literal">String</code> to another <code class="literal">String</code>, determining whether it sorts
alphabetically earlier than, the same as, or later than the target
string. It returns an integer that is less than, equal to, or greater
than zero:</p><a id="I_10_tt564"/><pre class="programlisting"> <code class="n">String</code> <code class="n">abc</code> <code class="o">=</code> <code class="s">"abc"</code><code class="o">;</code>
<code class="n">String</code> <code class="n">def</code> <code class="o">=</code> <code class="s">"def"</code><code class="o">;</code>
<code class="n">String</code> <code class="n">num</code> <code class="o">=</code> <code class="s">"123"</code><code class="o">;</code>
<code class="k">if</code> <code class="o">(</code> <code class="n">abc</code><code class="o">.</code><code class="na">compareTo</code><code class="o">(</code> <code class="n">def</code> <code class="o">)</code> <code class="o">&lt;</code> <code class="mi">0</code> <code class="o">)</code> <code class="c1">// true</code>
<code class="k">if</code> <code class="o">(</code> <code class="n">abc</code><code class="o">.</code><code class="na">compareTo</code><code class="o">(</code> <code class="n">abc</code> <code class="o">)</code> <code class="o">==</code> <code class="mi">0</code> <code class="o">)</code> <code class="c1">// true</code>
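<code class="c1">// Added sketch, not part of the original listing: uppercase letters precede</code>
<code class="c1">// lowercase ones in Unicode, so "abc" compares as greater than "ABC".</code>
<code class="k">if</code> <code class="o">(</code> <code class="n">abc</code><code class="o">.</code><code class="na">compareTo</code><code class="o">(</code> <code class="s">"ABC"</code> <code class="o">)</code> <code class="o">&gt;</code> <code class="mi">0</code> <code class="o">)</code> <code class="c1">// true</code>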
<code class="k">if</code> <code class="o">(</code> <code class="n">abc</code><code class="o">.</code><code class="na">compareTo</code><code class="o">(</code> <code class="n">num</code> <code class="o">)</code> <code class="o">></code> <code class="mi">0</code> <code class="o">)</code> <code class="c1">// true</code></pre><p>The <code class="literal">compareTo()</code> method compares
strings strictly by their characters’ positions in the Unicode
specification. This works for simple text but does not handle all
language variations well. The <code class="literal">Collator</code> class, discussed next, can be used
for more sophisticated comparisons.</p><div class="sect3" title="The Collator class"><div class="titlepage"><div><div><h3 class="title"><a id="learnjava3-CHP-10-SECT-2.3.1"/>The Collator class</h3></div></div></div><p><a id="idx10517" class="indexterm"/>The <code class="literal">java.text</code> package
provides a sophisticated set of classes for comparing strings in
specific languages. German, for example, has vowels with umlauts and
another character that resembles the Greek letter beta and represents
a double “s.” How should we sort these? Although the rules for sorting
such characters are precisely defined, you can’t assume that the
lexical comparison we used earlier has the correct meaning for
languages other than English. Fortunately, the <code class="literal">Collator</code> class takes care of these complex
sorting problems.</p><p>In the following example, we use a <code class="literal">Collator</code> designed to compare German strings.
You can obtain a default <code class="literal">Collator</code>
by calling the <code class="literal">Collator.getInstance()</code> method with no
arguments. Once you have an appropriate <code class="literal">Collator</code> instance, you can use its
<a id="I_indexterm10_id725298" class="indexterm"/><code class="literal">compare()</code> method,
which returns values just like <code class="literal">String</code>’s <code class="literal">compareTo()</code> method. The following code
creates two strings for the German translations of “fun” and “later,”
using Unicode escape sequences for the special characters ß and ä. It then
compares them, using a <code class="literal">Collator</code> for
the German locale. (<code class="literal">Locale</code>s help
you deal with issues relevant to particular languages and cultures;
we’ll talk about them in detail later in this chapter.) The result in
this case is that “fun” (Spaß) sorts before “later” (später):</p><a id="I_10_tt565"/><pre class="programlisting"> <code class="n">String</code> <code class="n">fun</code> <code class="o">=</code> <code class="s">"Spa\u00df"</code><code class="o">;</code>
<code class="n">String</code> <code class="n">later</code> <code class="o">=</code> <code class="s">"sp\u00e4ter"</code><code class="o">;</code>
<code class="n">Collator</code> <code class="n">german</code> <code class="o">=</code> <code class="n">Collator</code><code class="o">.</code><code class="na">getInstance</code><code class="o">(</code><code class="n">Locale</code><code class="o">.</code><code class="na">GERMAN</code><code class="o">);</code>
<code class="k">if</code> <code class="o">(</code><code class="n">german</code><code class="o">.</code><code class="na">compare</code><code class="o">(</code><code class="n">fun</code><code class="o">,</code> <code class="n">later</code><code class="o">)</code> <code class="o">&lt;</code> <code class="mi">0</code><code class="o">)</code> <code class="c1">// true</code></pre><p>Using collators is essential if you’re working with languages
other than English. In traditional Spanish alphabetization, for example, “ll” and “ch” were treated
as distinct letters and sorted separately. A collator handles
cases like these automatically.<a id="I_indexterm10_id725355" class="indexterm"/><a id="I_indexterm10_id725362" class="indexterm"/><a id="I_indexterm10_id725370" class="indexterm"/></p></div></div><div class="sect2" title="Searching"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-10-SECT-2.4"/>Searching</h2></div></div></div><p><a id="I_indexterm10_id725383" class="indexterm"/> <a id="I_indexterm10_id725390" class="indexterm"/> <a id="I_indexterm10_id725398" class="indexterm"/>The <code class="literal">String</code> class
provides several simple methods for finding fixed substrings within a
string. The <a id="I_indexterm10_id725415" class="indexterm"/><code class="literal">startsWith()</code> and
<a id="I_indexterm10_id725426" class="indexterm"/><code class="literal">endsWith()</code> methods
compare an argument string with the beginning and end of the <code class="literal">String</code>, respectively:</p><a id="I_10_tt566"/><pre class="programlisting"> <code class="n">String</code> <code class="n">url</code> <code class="o">=</code> <code class="s">"http://foo.bar.com/"</code><code class="o">;</code>
<code class="k">if</code> <code class="o">(</code> <code class="n">url</code><code class="o">.</code><code class="na">startsWith</code><code class="o">(</code><code class="s">"http:"</code><code class="o">)</code> <code class="o">)</code> <code class="c1">// true</code></pre><p>The <a id="I_indexterm10_id725453" class="indexterm"/><code class="literal">indexOf()</code> method
searches for the first occurrence of a character or substring and
returns the starting character position, or <code class="literal">-1</code> if the substring is not found:</p><a id="I_10_tt567"/><pre class="programlisting"> <code class="n">String</code> <code class="n">abcs</code> <code class="o">=</code> <code class="s">"abcdefghijklmnopqrstuvwxyz"</code><code class="o">;</code>
<code class="kt">int</code> <code class="n">i</code> <code class="o">=</code> <code class="n">abcs</code><code class="o">.</code><code class="na">indexOf</code><code class="o">(</code> <code class="sc">'p'</code> <code class="o">);</code> <code class="c1">// 15</code>
<code class="kt">int</code> <code class="n">j</code> <code class="o">=</code> <code class="n">abcs</code><code class="o">.</code><code class="na">indexOf</code><code class="o">(</code> <code class="s">"def"</code> <code class="o">);</code> <code class="c1">// 3</code>
<code class="kt">int</code> <code class="n">k</code> <code class="o">=</code> <code class="n">abcs</code><code class="o">.</code><code class="na">indexOf</code><code class="o">(</code> <code class="s">"Fang"</code> <code class="o">);</code> <code class="c1">// -1</code></pre><p>Similarly, <code class="literal">lastIndexOf()</code>
searches backward through the string for the last occurrence of a
character or substring.</p><p>The <a id="I_indexterm10_id725491" class="indexterm"/><code class="literal">contains()</code> method
handles the very common task of checking to see whether a given
substring is contained in the target string:</p><a id="I_10_tt568"/><pre class="programlisting"> <code class="n">String</code> <code class="n">log</code> <code class="o">=</code> <code class="s">"There is an emergency in sector 7!"</code><code class="o">;</code>
<code class="k">if</code> <code class="o">(</code> <code class="n">log</code><code class="o">.</code><code class="na">contains</code><code class="o">(</code><code class="s">"emergency"</code><code class="o">)</code> <code class="o">)</code> <code class="n">pageSomeone</code><code class="o">();</code>
<code class="c1">// equivalent to</code>
<code class="k">if</code> <code class="o">(</code> <code class="n">log</code><code class="o">.</code><code class="na">indexOf</code><code class="o">(</code><code class="s">"emergency"</code><code class="o">)</code> <code class="o">!=</code> <code class="o">-</code><code class="mi">1</code> <code class="o">)</code> <code class="o">...</code></pre><p>For more complex searching, you can use the Regular Expression
API, which allows you to look for and parse complex patterns. We’ll talk
about regular expressions later in this chapter.</p></div><div class="sect2" title="Editing"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-10-SECT-2.5"/>Editing</h2></div></div></div><p><a id="I_indexterm10_id725525" class="indexterm"/> <a id="I_indexterm10_id725531" class="indexterm"/> <a id="I_indexterm10_id725539" class="indexterm"/>A number of methods operate on the <code class="literal">String</code> and return a new <code class="literal">String</code> as a result. While this is useful, you
should be aware that creating lots of strings in this manner can affect
performance. If you need to modify a string often or build a complex
string from components, you should use the <code class="literal">StringBuilder</code> class, as we’ll discuss
shortly.</p><p><a id="I_indexterm10_id725570" class="indexterm"/> <code class="literal">trim()</code> is a useful
method that removes leading and trailing whitespace (e.g., spaces, carriage
returns, newlines, and tabs) from the <code class="literal">String</code>:</p><a id="I_10_tt569"/><pre class="programlisting"> <code class="n">String</code> <code class="n">str</code> <code class="o">=</code> <code class="s">" abc "</code><code class="o">;</code>
<code class="n">str</code> <code class="o">=</code> <code class="n">str</code><code class="o">.</code><code class="na">trim</code><code class="o">();</code> <code class="c1">// "abc"</code></pre><p>In this example, we threw away the original <code class="literal">String</code> (with excess whitespace), and it will
be garbage-collected.</p><p>The <a id="I_indexterm10_id725608" class="indexterm"/><code class="literal">toUpperCase()</code> and
<a id="I_indexterm10_id725618" class="indexterm"/><code class="literal">toLowerCase()</code> methods
return a new <code class="literal">String</code> of the
appropriate case:</p><a id="I_10_tt570"/><pre class="programlisting"> <code class="n">String</code> <code class="n">down</code> <code class="o">=</code> <code class="s">"FOO"</code><code class="o">.</code><code class="na">toLowerCase</code><code class="o">();</code> <code class="c1">// "foo"</code>
<code class="n">String</code> <code class="n">up</code> <code class="o">=</code> <code class="n">down</code><code class="o">.</code><code class="na">toUpperCase</code><code class="o">();</code> <code class="c1">// "FOO"</code></pre><p><a id="idx10577" class="indexterm"/> <code class="literal">substring()</code> returns a
specified range of characters. The starting index is
<span class="emphasis"><em>inclusive</em></span>; the ending is
<span class="emphasis"><em>exclusive</em></span>:</p><a id="I_10_tt571"/><pre class="programlisting"> <code class="n">String</code> <code class="n">abcs</code> <code class="o">=</code> <code class="s">"abcdefghijklmnopqrstuvwxyz"</code><code class="o">;</code>
<code class="n">String</code> <code class="n">cde</code> <code class="o">=</code> <code class="n">abcs</code><code class="o">.</code><code class="na">substring</code><code class="o">(</code> <code class="mi">2</code><code class="o">,</code> <code class="mi">5</code> <code class="o">);</code> <code class="c1">// "cde"</code></pre><p>The <a id="I_indexterm10_id725680" class="indexterm"/><code class="literal">replace()</code> method
provides simple, literal string substitution. Every occurrence of the
target string is replaced with the replacement string, scanning from
beginning to end. For example:</p><a id="I_10_tt572"/><pre class="programlisting"> <code class="n">String</code> <code class="n">message</code> <code class="o">=</code> <code class="s">"Hello NAME, how are you?"</code><code class="o">.</code><code class="na">replace</code><code class="o">(</code> <code class="s">"NAME"</code><code class="o">,</code> <code class="s">"Penny"</code> <code class="o">);</code>
<code class="c1">// "Hello Penny, how are you?"</code>
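<code class="c1">// Illustrative addition (the name dashed is ours): replace() also accepts a pair of chars</code>
<code class="n">String</code> <code class="n">dashed</code> <code class="o">=</code> <code class="s">"a.b.c"</code><code class="o">.</code><code class="na">replace</code><code class="o">(</code> <code class="sc">'.'</code><code class="o">,</code> <code class="sc">'-'</code> <code class="o">);</code>
<code class="c1">// "a-b-c"</code>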
<code class="n">String</code> <code class="n">xy</code> <code class="o">=</code> <code class="s">"xxooxxxoo"</code><code class="o">.</code><code class="na">replace</code><code class="o">(</code> <code class="s">"xx"</code><code class="o">,</code> <code class="s">"X"</code> <code class="o">);</code>
<code class="c1">// "XooXxoo"</code></pre><p>The <code class="literal">String</code> class also has two
methods that allow you to do more complex pattern substitution:
<a id="I_indexterm10_id725710" class="indexterm"/><code class="literal">replaceAll()</code> and
<a id="I_indexterm10_id725721" class="indexterm"/><code class="literal">replaceFirst()</code>. Unlike
the simple <code class="literal">replace()</code> method, these
methods use regular expressions (a special pattern syntax) to describe the
text to be replaced; we’ll cover regular expressions later in this chapter.</p></div><div class="sect2" title="String Method Summary"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-10-SECT-2.6"/>String Method Summary</h2></div></div></div><p><a id="idx10573" class="indexterm"/> <a id="idx10590" class="indexterm"/> <a class="xref" href="ch10s02.html#learnjava3-CHP-10-TABLE-2" title="Table 10-2. String methods">Table 10-2</a> summarizes
the methods provided by the <code class="literal">String</code>
class.</p><div class="table"><a id="learnjava3-CHP-10-TABLE-2"/><p class="title">Table 10-2. String methods</p><div class="table-contents"><table summary="String methods" style="border-collapse: collapse;border-top: 0.5pt solid ; border-bottom: 0.5pt solid ; "><colgroup><col/><col/></colgroup><thead><tr><th style="text-align: left"><p>Method</p></th><th style="text-align: left"><p>Functionality</p></th></tr></thead><tbody><tr><td style="text-align: left"><p> <a id="I_indexterm10_id725829" class="indexterm"/> <code class="literal">charAt()</code>
</p></td><td style="text-align: left"><p>Gets a particular character in the
string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id725853" class="indexterm"/> <code class="literal">compareTo()</code>
</p></td><td style="text-align: left"><p>Compares the string lexicographically with
another string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id725876" class="indexterm"/> <code class="literal">concat()</code>
</p></td><td style="text-align: left"><p>Concatenates the string with another
string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id725900" class="indexterm"/> <code class="literal">contains()</code>
</p></td><td style="text-align: left"><p>Checks whether the string contains
another string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id725924" class="indexterm"/> <code class="literal">copyValueOf()</code>
</p></td><td style="text-align: left"><p>Returns a string equivalent to the
specified character array</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id725948" class="indexterm"/> <code class="literal">endsWith()</code>
</p></td><td style="text-align: left"><p>Checks whether the string ends with a
specified suffix</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id725972" class="indexterm"/> <code class="literal">equals()</code>
</p></td><td style="text-align: left"><p>Tests whether the string is equal to another
string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id725995" class="indexterm"/> <code class="literal">equalsIgnoreCase()</code> </p></td><td style="text-align: left"><p>Tests whether the string is equal to another
string, ignoring case</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726019" class="indexterm"/> <code class="literal">getBytes()</code>
</p></td><td style="text-align: left"><p>Encodes the characters of the string into
a new byte array</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726043" class="indexterm"/> <code class="literal">getChars()</code>
</p></td><td style="text-align: left"><p>Copies characters from the string into
a character array</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726067" class="indexterm"/> <code class="literal">hashCode()</code>
</p></td><td style="text-align: left"><p>Returns a hash code for the
string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726091" class="indexterm"/> <code class="literal">indexOf()</code>
</p></td><td style="text-align: left"><p>Searches for the first occurrence of a
character or substring in the string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726115" class="indexterm"/> <code class="literal">intern()</code>
</p></td><td style="text-align: left"><p>Fetches a unique instance of the
string from a global shared-string pool</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726139" class="indexterm"/> <code class="literal">isEmpty()</code>
</p></td><td style="text-align: left"><p>Returns true if the string is zero
length</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726162" class="indexterm"/> <code class="literal">lastIndexOf()</code>
</p></td><td style="text-align: left"><p>Searches for the last occurrence of a
character or substring in the string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726186" class="indexterm"/> <code class="literal">length()</code>
</p></td><td style="text-align: left"><p>Returns the length of the
string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726210" class="indexterm"/> <code class="literal">matches()</code>
</p></td><td style="text-align: left"><p>Determines if the whole string matches
a regular expression pattern</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726234" class="indexterm"/> <code class="literal">regionMatches()</code> </p></td><td style="text-align: left"><p>Checks whether a region of the string
matches the specified region of another string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726258" class="indexterm"/> <code class="literal">replace()</code>
</p></td><td style="text-align: left"><p>Replaces all occurrences of a
character or literal substring in the string with another</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726282" class="indexterm"/> <code class="literal">replaceAll()</code>
</p></td><td style="text-align: left"><p>Replaces all occurrences of a regular
expression pattern with a replacement string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726306" class="indexterm"/> <code class="literal">replaceFirst()</code>
</p></td><td style="text-align: left"><p>Replaces the first occurrence of a
regular expression pattern with a replacement string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726330" class="indexterm"/> <code class="literal">split()</code>
</p></td><td style="text-align: left"><p>Splits the string into an array of
strings using a regular expression pattern as a
delimiter</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726354" class="indexterm"/> <code class="literal">startsWith()</code>
</p></td><td style="text-align: left"><p>Checks whether the string starts with
a specified prefix</p></td></tr><tr><td style="text-align: left"><p> <code class="literal">substring()</code> </p></td><td style="text-align: left"><p>Returns a substring from the
string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726396" class="indexterm"/> <code class="literal">toCharArray()</code>
</p></td><td style="text-align: left"><p>Returns a new character array copied from
the string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726419" class="indexterm"/> <code class="literal">toLowerCase()</code>
</p></td><td style="text-align: left"><p>Converts the string to
lowercase</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726443" class="indexterm"/> <code class="literal">toString()</code>
</p></td><td style="text-align: left"><p>Returns the string value of an
object</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726467" class="indexterm"/> <code class="literal">toUpperCase()</code>
</p></td><td style="text-align: left"><p>Converts the string to
uppercase</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726491" class="indexterm"/> <code class="literal">trim()</code>
</p></td><td style="text-align: left"><p>Removes leading and trailing
whitespace from the string</p></td></tr><tr><td style="text-align: left"><p> <a id="I_indexterm10_id726514" class="indexterm"/> <code class="literal">valueOf()</code>
</p></td><td style="text-align: left"><p>Returns a string representation of a
value<a id="I_indexterm10_id726532" class="indexterm"/><a id="I_indexterm10_id726539" class="indexterm"/></p></td></tr></tbody></table></div></div></div><div class="sect2" title="StringBuilder and StringBuffer"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-10-SECT-2.7"/>StringBuilder and StringBuffer</h2></div></div></div><p><a id="idx10569" class="indexterm"/> <a id="idx10570" class="indexterm"/> <a id="idx10574" class="indexterm"/> <a id="idx10591" class="indexterm"/>In contrast to the immutable string, the <code class="literal">java.lang.StringBuilder</code> class is a modifiable
and expandable buffer for characters. You can use it to create a big
string efficiently. <code class="literal">StringBuilder</code> and
<code class="literal">StringBuffer</code> are twins; they have
exactly the same API. <code class="literal">StringBuilder</code>
was added in Java 5.0 as a drop-in, unsynchronized replacement for
<code class="literal">StringBuffer</code>. We’ll come back to that
in a bit.</p><p>First, let’s look at some examples of <code class="literal">String</code> construction:</p><a id="I_10_tt573"/><pre class="programlisting"> <code class="c1">// Could be better</code>
<code class="n">String</code> <code class="n">ball</code> <code class="o">=</code> <code class="s">"Hello"</code><code class="o">;</code>
<code class="n">ball</code> <code class="o">=</code> <code class="n">ball</code> <code class="o">+</code> <code class="s">" there."</code><code class="o">;</code>
<code class="n">ball</code> <code class="o">=</code> <code class="n">ball</code> <code class="o">+</code> <code class="s">" How are you?"</code><code class="o">;</code></pre><p>This example creates an unnecessary <code class="literal">String</code> object each time we use the
concatenation operator (+). Whether this is significant depends on how
often this code is run and how big the string actually gets. Here’s a
more extreme example:</p><a id="I_10_tt574"/><pre class="programlisting"> <code class="c1">// Bad use of + ...</code>
<code class="k">while</code><code class="o">(</code> <code class="o">(</code><code class="n">line</code> <code class="o">=</code> <code class="n">readLine</code><code class="o">())</code> <code class="o">!=</code> <code class="n">EOF</code> <code class="o">)</code>
<code class="n">text</code> <code class="o">+=</code> <code class="n">line</code><code class="o">;</code></pre><p>This example repeatedly produces new <code class="literal">String</code> objects. The character array must be
copied over and over, which can adversely affect performance. The
solution is to use a <code class="literal">StringBuilder</code>
object and its <a id="I_indexterm10_id726684" class="indexterm"/><code class="literal">append()</code> method:</p><a id="I_10_tt575"/><pre class="programlisting"> <code class="n">StringBuilder</code> <code class="n">sb</code> <code class="o">=</code> <code class="k">new</code> <code class="n">StringBuilder</code><code class="o">(</code><code class="s">"Hello"</code><code class="o">);</code>
<code class="n">sb</code><code class="o">.</code><code class="na">append</code><code class="o">(</code><code class="s">" there."</code><code class="o">);</code>
<code class="n">sb</code><code class="o">.</code><code class="na">append</code><code class="o">(</code><code class="s">" How are you?"</code><code class="o">);</code>
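<code class="c1">// Illustrative addition (the name chained is ours): append() returns the same StringBuilder, so calls can be chained</code>
<code class="n">StringBuilder</code> <code class="n">chained</code> <code class="o">=</code> <code class="k">new</code> <code class="n">StringBuilder</code><code class="o">(</code><code class="s">"Hello"</code><code class="o">).</code><code class="na">append</code><code class="o">(</code><code class="s">" there."</code><code class="o">).</code><code class="na">append</code><code class="o">(</code><code class="s">" How are you?"</code><code class="o">);</code>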
<code class="c1">// Accumulate input in a loop (line, readLine(), and EOF are placeholders for an input source)</code>
<code class="n">StringBuilder</code> <code class="n">text</code> <code class="o">=</code> <code class="k">new</code> <code class="n">StringBuilder</code><code class="o">();</code>
<code class="k">while</code><code class="o">(</code> <code class="o">(</code><code class="n">line</code> <code class="o">=</code> <code class="n">readLine</code><code class="o">())</code> <code class="o">!=</code> <code class="n">EOF</code> <code class="o">)</code>
<code class="n">text</code><code class="o">.</code><code class="na">append</code><code class="o">(</code> <code class="n">line</code> <code class="o">);</code></pre><p>Here, the <code class="literal">StringBuilder</code>
efficiently handles expanding the array as necessary. We can get a
<code class="literal">String</code> back from the <code class="literal">StringBuilder</code> with its <code class="literal">toString()</code> method:</p><a id="I_10_tt576"/><pre class="programlisting"> <code class="n">String</code> <code class="n">message</code> <code class="o">=</code> <code class="n">sb</code><code class="o">.</code><code class="na">toString</code><code class="o">();</code></pre><p>You can also retrieve part of a <code class="literal">StringBuilder</code> as a <code class="literal">String</code> by using one of the <a id="I_indexterm10_id726750" class="indexterm"/><code class="literal">substring()</code> methods.</p><p>You might be interested to know that when you write a long
expression using string concatenation, the compiler has traditionally generated code that
uses a <code class="literal">StringBuilder</code> behind the
scenes:</p><a id="I_10_tt577"/><pre class="programlisting"> <code class="n">String</code> <code class="n">foo</code> <code class="o">=</code> <code class="s">"To "</code> <code class="o">+</code> <code class="s">"be "</code>