Channel: Baeldung

Create a Mutable String in Java

1. Introduction

In this tutorial, we’ll discuss a few ways to create a mutable String in Java.

2. Immutability of Strings

Unlike strings in languages such as C or C++, Strings in Java are immutable: once a String object is created, its content can’t change.

This immutability means that any operation that appears to modify a String actually creates a new String in memory with the modified content and returns a reference to it. For text that really does need to change, Java provides library classes such as StringBuffer and StringBuilder, which work with mutable character data efficiently.
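As a minimal sketch of what this means in practice, the class below (StringImmutabilityDemo is a hypothetical name used only for illustration) shows that methods such as concat() and toUpperCase() hand back a new object and leave the original untouched:

```java
public class StringImmutabilityDemo {

    // Returns true when concat() produced a separate object and left the original as it was.
    public static boolean concatCreatesNewObject() {
        String original = "Hello";
        String longer = original.concat(" World"); // allocates a new String
        return original != longer && original.equals("Hello");
    }

    // Discarding the return value of a "modifying" method changes nothing.
    public static String ignoredResult() {
        String text = "Hello";
        text.toUpperCase(); // result is thrown away; text still refers to "Hello"
        return text;
    }

    // Reassigning the variable is what makes the change visible to later code.
    public static String reassignedResult() {
        String text = "Hello";
        text = text.toUpperCase();
        return text;
    }

    public static void main(String[] args) {
        System.out.println(concatCreatesNewObject()); // true
        System.out.println(ignoredResult());          // Hello
        System.out.println(reassignedResult());       // HELLO
    }
}
```

The variable can be pointed at a different String, but the String object it used to reference never changes.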

3. Mutable String Using Reflection

We can attempt to create a mutable String in Java by using the Reflection framework. The Reflection framework in Java allows us to inspect and modify the structure of objects, methods, and their attributes at runtime. While it is a very powerful tool, it should be used with caution as it can leave bugs in the program without warnings.

We can employ some of the framework’s methods to update the value of Strings, thereby creating a mutable object. Let’s start by creating two Strings, one as a String literal and another with the new keyword:

String myString = "Hello World";
String otherString = new String("Hello World");

Now, we use Reflection’s getDeclaredField() method on the String class to obtain a Field instance and make it accessible for us to override the value:

Field f = String.class.getDeclaredField("value");
f.setAccessible(true);
f.set(myString, "Hi World".toCharArray());

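Because myString is an interned literal, replacing its value field changes what every other reference to that literal sees. On Java 8, printing the same literal again shows the mutated content:

System.out.println("Hello World");
Hi World

Therefore, we mutated a String, and any reference to this literal now yields “Hi World”. Note that otherString is unaffected: new String() took a reference to the original char[] when it was constructed, and f.set() swaps the array held by myString rather than writing into it, so otherString still prints “Hello World”. Changes like this can introduce bugs that are very hard to trace. Java programs run with the underlying assumption that Strings are immutable, and any deviation from that may be catastrophic.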

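It is also important to note that the above example only works on older releases such as Java 8. Since Java 9, String stores its content in a byte[] rather than a char[], so the f.set() call above fails. Since Java 16, the JDK also strongly encapsulates its internals by default, so setAccessible(true) on a field of a java.lang class is rejected unless the package is explicitly opened, for example with --add-opens.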

4. Charsets and Strings

4.1. Introduction to Charsets

The solution discussed above is fragile and inconvenient. A different way of mutating a String is to implement a custom Charset for our program.

Computers understand man-made characters only by their numeric codes. A Charset is a dictionary that maintains the mapping of characters against their binary counterpart. For example, ASCII has a character set of 128 characters. A standardized character encoding format, along with a defined Charset, ensures that text is properly interpreted in digital systems worldwide.

Java provides extensive support for encodings and conversions. This includes US-ASCII, ISO-8859-1, UTF-8, and UTF-16, to name a few.
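To make the difference between encodings concrete, here is a small sketch that counts how many bytes the same text occupies in three of the charsets listed above. The class name EncodedLengths and its helper methods are hypothetical; the constants come from java.nio.charset.StandardCharsets:

```java
import java.nio.charset.StandardCharsets;

public class EncodedLengths {

    // Number of bytes the text occupies when encoded as UTF-8.
    public static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes when encoded as ISO-8859-1 (one byte per mappable character).
    public static int latin1Length(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // Number of bytes when encoded as big-endian UTF-16 without a byte order mark.
    public static int utf16Length(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String text = "A\u00e9"; // 'A' followed by e-acute
        System.out.println(utf8Length(text));   // 3
        System.out.println(latin1Length(text)); // 2
        System.out.println(utf16Length(text));  // 4
    }
}
```

The same two characters take three, two, or four bytes depending on the charset, which is why the encoder and decoder must agree on it.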

4.2. Using a Charset

Let’s see an example of how we can use Charsets to encode and decode Strings. We’ll take a non-ASCII String and then encode it using UTF-8 charset. Conversely, we’ll then decode the string to the original input using the same charset.

Let’s start with the input String:

String inputString = "Hello, दुनिया";

We obtain a charset for UTF-8 using the Charset.forName() method of java.nio.charset.Charset and also get an encoder:

Charset charset = Charset.forName("UTF-8");
CharsetEncoder encoder = charset.newEncoder();

The encoder object has an encode() method, which expects a CharBuffer object, a ByteBuffer object, and an endOfInput flag.

The CharBuffer object is a buffer for holding character data, and we can obtain one by wrapping the input String. We also allocate a ByteBuffer of 64 bytes to receive the encoded output:

CharBuffer charBuffer = CharBuffer.wrap(inputString);
ByteBuffer byteBuffer = ByteBuffer.allocate(64);

We pass both buffers to the encode() method, setting endOfInput to true because the whole input is supplied in a single call, and then flip the ByteBuffer so that it can be read from the start:

encoder.encode(charBuffer, byteBuffer, true);
byteBuffer.flip();

The byteBuffer object now stores the encoded bytes. We can decode its contents to reveal the original String again:

private static String decodeString(ByteBuffer byteBuffer) {
    Charset charset = Charset.forName("UTF-8");
    CharsetDecoder decoder = charset.newDecoder();
    CharBuffer decodedCharBuffer = CharBuffer.allocate(50);
    decoder.decode(byteBuffer, decodedCharBuffer, true);
    decodedCharBuffer.flip();
    return decodedCharBuffer.toString();
}

The following test verifies that we are able to decode the String back to its original value. Here, ch refers to the object that exposes the encodeString() and decodeString() helpers:

String inputString = "hello दुनिया";
String result = ch.decodeString(ch.encodeString(inputString));
Assertions.assertEquals(inputString, result);
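The test above calls an encodeString() helper that isn't shown. One way the two helpers could fit together is sketched below; the class name CharsetRoundTrip, the static signatures, and the buffer sizing based on maxBytesPerChar() and maxCharsPerByte() are assumptions made for this sketch rather than the article's actual implementation:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {

    // Encodes the input as UTF-8 and returns a buffer that is ready to be read.
    public static ByteBuffer encodeString(String input) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        // Size the output for the encoder's worst case instead of a fixed 64 bytes.
        int capacity = (int) Math.ceil(input.length() * encoder.maxBytesPerChar());
        ByteBuffer out = ByteBuffer.allocate(capacity);
        encoder.encode(CharBuffer.wrap(input), out, true);
        encoder.flush(out);
        out.flip(); // switch the buffer from writing to reading
        return out;
    }

    // Decodes the remaining bytes of the buffer as UTF-8.
    public static String decodeString(ByteBuffer bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        int capacity = (int) Math.ceil(bytes.remaining() * decoder.maxCharsPerByte());
        CharBuffer out = CharBuffer.allocate(capacity);
        decoder.decode(bytes, out, true);
        decoder.flush(out);
        out.flip();
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "hello \u0926\u0941\u0928\u093f\u092f\u093e";
        System.out.println(input.equals(decodeString(encodeString(input)))); // true
    }
}
```

Forgetting either flip() call is the usual mistake here: without it, the reader starts at the end of the data and sees nothing.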

4.3. Creating a Custom Charset

We can also create our own Charset class definition for our programs. To do this, we must provide concrete implementations of the abstract methods of Charset:

  • contains() – this should report whether our charset contains the given Charset
  • newDecoder() – this should return a CharsetDecoder instance
  • newEncoder() – this should return a CharsetEncoder instance

We start with an inline Charset definition by creating a new instance of Charset as follows:

private final Charset myCharset = new Charset("mycharset", null) {
    // implement methods
};

We have already seen that Charsets make extensive use of CharBuffer objects when encoding and decoding characters. In our custom charset definition, we keep a shared reference to a CharBuffer, held in an AtomicReference, to use throughout the program:

private final AtomicReference<CharBuffer> cbRef = new AtomicReference<>();

Let’s now write simple inline implementations of the newEncoder() and newDecoder() methods to complete our Charset definition. The decoder also stores its output buffer in the shared reference cbRef:

@Override
public CharsetDecoder newDecoder() {
    return new CharsetDecoder(this, 1.0f, 1.0f) {
        @Override
        protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
            cbRef.set(out);
            while (in.remaining() > 0) {
                out.append((char) in.get());
            }
            return CoderResult.UNDERFLOW;
        }
    };
}
@Override
public CharsetEncoder newEncoder() {
    CharsetEncoder cd = new CharsetEncoder(this, 1.0f, 1.0f) {
        @Override
        protected CoderResult encodeLoop(CharBuffer in, ByteBuffer out) {
            while (in.hasRemaining()) {
                if (!out.hasRemaining()) {
                    return CoderResult.OVERFLOW;
                }
                char currentChar = in.get();
                if (currentChar > 127) {
                    return CoderResult.unmappableForLength(1);
                }
                out.put((byte) currentChar);
            }
            return CoderResult.UNDERFLOW;
        }
    };
    return cd;
}

4.4. Mutating a String With Custom Charset

We have now completed our Charset definition and can use it in our program. Note that the shared reference cbRef is updated with the output CharBuffer during decoding. This is an essential step towards mutating the String.

The String class in Java provides multiple constructors to create and initialize a String, and one of them takes a byte array and a Charset:

public String(byte[] bytes, Charset charset) {
    this(bytes, 0, bytes.length, charset);
}

We use this constructor to create a String, and we pass our custom charset object myCharset to it:

public String createModifiableString(String s) {
    return new String(s.getBytes(), myCharset);
}

Now that we have our String, let’s try to mutate it by leveraging the CharBuffer we captured:

public void modifyString() {
    CharBuffer cb = cbRef.get();
    cb.position(0);
    cb.put("something");
}

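Here, we overwrite the CharBuffer’s contents starting at position 0. The decoder stored a reference to this buffer in decodeLoop(), and on older JDKs such as Java 8 the String constructor keeps the char[] that backs the buffer instead of copying it, so writing to the buffer also changes the String. The replacement must fit in the buffer, whose capacity matches the number of bytes that were decoded; a longer text makes put() throw a BufferOverflowException. From Java 9 onward, String is backed by a byte[], so the final assertion below is not expected to hold there. We can verify the behavior by adding a test that uses a nine-character input to match “something”:

String s = createModifiableString("Greetings");
Assert.assertEquals("Greetings", s);
modifyString();
Assert.assertEquals("something", s);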
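For readers who want to try the pieces together, the following sketch assembles them into one class. Several things here are assumptions made for the sketch and not taken from the article: the class name MutableStringSketch, the capturedBuffer() helper, the replacement parameter on modifyString(), the contains() implementation (Charset declares it abstract, so a subclass has to provide one), and the choice to leave newEncoder() unsupported because only decoding is exercised. Whether the String itself changes depends on the JDK, so main() prints the result without promising a particular value:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicReference;

public class MutableStringSketch {

    // Holds the CharBuffer most recently handed to decodeLoop().
    private final AtomicReference<CharBuffer> cbRef = new AtomicReference<>();

    private final Charset myCharset = new Charset("mycharset", null) {
        @Override
        public boolean contains(Charset cs) {
            return false; // assumption: this sketch claims to contain no other charset
        }

        @Override
        public CharsetDecoder newDecoder() {
            return new CharsetDecoder(this, 1.0f, 1.0f) {
                @Override
                protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
                    cbRef.set(out); // remember the output buffer for later writes
                    while (in.hasRemaining()) {
                        if (!out.hasRemaining()) {
                            return CoderResult.OVERFLOW;
                        }
                        out.put((char) (in.get() & 0xFF)); // one byte becomes one char
                    }
                    return CoderResult.UNDERFLOW;
                }
            };
        }

        @Override
        public CharsetEncoder newEncoder() {
            // Assumption: encoding is not needed to demonstrate the decoding trick.
            throw new UnsupportedOperationException("encoding is not part of this sketch");
        }
    };

    public String createModifiableString(String s) {
        return new String(s.getBytes(StandardCharsets.US_ASCII), myCharset);
    }

    // Overwrites the captured buffer from the start; the text must fit in the buffer.
    public void modifyString(String replacement) {
        CharBuffer cb = cbRef.get();
        cb.position(0);
        cb.put(replacement);
    }

    // Current contents of the array behind the captured buffer.
    public String capturedBuffer() {
        return new String(cbRef.get().array());
    }

    public static void main(String[] args) {
        MutableStringSketch sketch = new MutableStringSketch();
        String s = sketch.createModifiableString("Greetings");
        sketch.modifyString("something"); // same length as "Greetings"
        System.out.println("buffer: " + sketch.capturedBuffer());
        System.out.println("string: " + s); // depends on whether the JDK shares the array
    }
}
```

The buffer always ends up holding the replacement text. The String follows it only on JDKs where the constructor adopts the decoder's array directly, which is the behavior the test above relies on.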

5. Final Thoughts on String Mutation

We have seen a few ways to mutate a String. String mutation is controversial in the Java world mainly because almost all Java programs assume that Strings never change.

However, we often need to work with text that changes, which is why Java provides the StringBuffer and StringBuilder classes. These classes hold mutable sequences of characters and are therefore easy to modify. StringBuffer synchronizes its methods, while StringBuilder doesn’t and is usually the faster choice in single-threaded code. Using these classes is the best and most efficient way of working with mutable character sequences.
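As a brief illustration of that recommendation, the sketch below edits one StringBuilder in place with a few standard methods. The class name BuilderEdits is hypothetical, and the comments track the builder's contents after each call:

```java
public class BuilderEdits {

    // Applies a series of in-place edits to a single StringBuilder instance.
    public static String edit() {
        StringBuilder sb = new StringBuilder("Hello World");
        sb.replace(0, 5, "Hi"); // "Hi World"
        sb.setCharAt(3, 'w');   // "Hi world"
        sb.append('!');         // "Hi world!"
        sb.insert(0, ">> ");    // ">> Hi world!"
        return sb.toString();   // snapshot the result as an immutable String
    }

    public static void main(String[] args) {
        System.out.println(edit()); // >> Hi world!
    }
}
```

All four edits operate on the same object, and toString() produces an ordinary immutable String only when we ask for one.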

6. Conclusion

In this article, we looked into mutable Strings and ways of mutating a String. We also saw why there is no straightforward, safe way to mutate a String and what can go wrong when we try.

As usual, the code for this article is available over on GitHub.

       
