Java substring: Extract Parts of a String

The substring method on String extracts a contiguous slice of characters. It's one of the most-used methods in Java, but its index conventions trip up beginners.

The two forms

String s = "Hello World";

String a = s.substring(6);     // "World": from index 6 to the end
String b = s.substring(0, 5);  // "Hello": from 0 (inclusive) to 5 (exclusive)

Key rule: beginIndex is inclusive, endIndex is exclusive. The length of the result is endIndex - beginIndex.

Index visualization

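Position:  0   1   2   3   4   5   6   7   8   9   10
Character: H   e   l   l   o  ' '  W   o   r   l   d

s.length() is 11. There is no character at index 11, so 11 is valid only as an exclusive endIndex, or as a beginIndex that yields the empty string.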

s.substring(0, 5)  => "Hello"   (indices 0..4)
s.substring(6)     => "World"   (indices 6..10)
s.substring(6, 11) => "World"   (same result)
s.substring(0, s.length()) => whole string
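
The boundary cases follow from the same inclusive/exclusive rule. The sketch below checks the length relationship and the two empty-result cases; the class name SubstringBounds and the helper sliceLength are made up for illustration.

```java
final class SubstringBounds {
    // Length of a slice, computed from the indices alone (hypothetical helper).
    static int sliceLength(int beginIndex, int endIndex) {
        return endIndex - beginIndex;
    }

    public static void main(String[] args) {
        String s = "Hello World"; // length 11, last character at index 10

        // The result length always equals endIndex - beginIndex.
        System.out.println(s.substring(6, 11).length() == sliceLength(6, 11)); // true

        // beginIndex == endIndex is legal and gives the empty string.
        System.out.println(s.substring(3, 3).isEmpty()); // true

        // beginIndex == s.length() is also legal and gives the empty string.
        System.out.println(s.substring(s.length()).isEmpty()); // true
    }
}
```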

Common slicing patterns

String s = "hello.world.java";

// First N characters
String head = s.substring(0, 5);         // "hello"

// Last N characters
String tail = s.substring(s.length() - 4); // "java"

// Everything after the first dot
int dot = s.indexOf('.');
String rest = s.substring(dot + 1);       // "world.java"

// Everything before the last dot (extension stripping)
int lastDot = s.lastIndexOf('.');
String noExt = s.substring(0, lastDot);   // "hello.world"

// Between two markers
String xml = "<name>Alice</name>";
int start = xml.indexOf("<name>") + "<name>".length();
int end   = xml.indexOf("</name>");
String val = xml.substring(start, end);   // "Alice"
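
The patterns above assume the marker is present. indexOf and lastIndexOf return -1 when it is not, in which case s.substring(dot + 1) quietly returns the whole string and s.substring(0, lastDot) throws. A minimal guarded sketch follows; the class MarkerSlices and its methods are hypothetical helpers, and returning "" for a missing marker is just one possible policy.

```java
final class MarkerSlices {
    // Text after the first occurrence of marker, or "" if the marker is absent.
    static String after(String s, String marker) {
        int i = s.indexOf(marker);
        return i < 0 ? "" : s.substring(i + marker.length());
    }

    // Text between open and the next close that follows it, or "" if either is missing.
    static String between(String s, String open, String close) {
        int start = s.indexOf(open);
        if (start < 0) return "";
        start += open.length();
        int end = s.indexOf(close, start); // search only after the opening marker
        return end < 0 ? "" : s.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(between("<name>Alice</name>", "<name>", "</name>"));        // Alice
        System.out.println(after("hello.world.java", "."));                            // world.java
        System.out.println(between("no markers here", "<name>", "</name>").isEmpty()); // true
    }
}
```

Starting the search for the closing marker at start also avoids the end < start case, which would otherwise throw.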

Exceptions

Out-of-range arguments throw StringIndexOutOfBoundsException:

"abc".substring(-1);       // ❌
"abc".substring(0, 10);    // ❌
"abc".substring(2, 1);     // ❌ endIndex < beginIndex

Always validate or clamp before slicing untrusted input:

public static String safeSubstring(String s, int start, int end) {
    if (s == null) return "";
    start = Math.max(0, Math.min(start, s.length()));
    end   = Math.max(start, Math.min(end, s.length()));
    return s.substring(start, end);
}
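
To see what the clamping does, the sketch below repeats the same safeSubstring logic inside a throwaway class (SafeSubstringDemo is a name chosen for this example) and runs it on the three failing calls from the previous listing.

```java
final class SafeSubstringDemo {
    // Same clamping logic as safeSubstring above.
    static String safeSubstring(String s, int start, int end) {
        if (s == null) return "";
        start = Math.max(0, Math.min(start, s.length()));
        end = Math.max(start, Math.min(end, s.length()));
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(safeSubstring("abc", -1, 2));          // ab   (start clamped to 0)
        System.out.println(safeSubstring("abc", 0, 10));          // abc  (end clamped to length)
        System.out.println(safeSubstring("abc", 2, 1).isEmpty()); // true (end raised to start)
        System.out.println(safeSubstring(null, 0, 1).isEmpty());  // true (null treated as "")
    }
}
```

Clamping trades an exception for a silently shorter result. That suits untrusted input, but for internal callers an exception may be the better signal that an index calculation is wrong.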

The surrogate pair trap

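Strings are indexed by UTF-16 code units, and characters outside the Basic Multilingual Plane (most emoji, for example) occupy two code units, a surrogate pair. Cutting between the two halves leaves an unpaired surrogate, which is not valid text: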

String s = "Hi 😀 there";
String bad = s.substring(0, 4); // "Hi \uD83D" (half an emoji)

// Correct: find a code-point-aligned cutoff
int codeUnits = s.offsetByCodePoints(0, 4); // code unit index just past the first 4 code points: 5
String good = s.substring(0, codeUnits);    // "Hi 😀"

For user content that may contain emoji, use offsetByCodePoints to locate boundaries.
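
offsetByCodePoints throws IndexOutOfBoundsException when the string has fewer code points than requested, so a reusable truncation helper needs a length check first. A minimal sketch, assuming a hypothetical helper named truncate in a class named CodePointTruncate:

```java
final class CodePointTruncate {
    // Keeps at most maxCodePoints code points from the start of s.
    static String truncate(String s, int maxCodePoints) {
        if (maxCodePoints <= 0) return "";
        if (s.codePointCount(0, s.length()) <= maxCodePoints) {
            return s; // already short enough; offsetByCodePoints would throw past the end
        }
        int end = s.offsetByCodePoints(0, maxCodePoints);
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        String s = "Hi \uD83D\uDE00 there"; // U+1F600 written as its surrogate pair

        System.out.println(truncate(s, 4).length());    // 5: four code points, five code units
        System.out.println(truncate(s, 3).length());    // 3: "Hi " has no surrogate pairs
        System.out.println(truncate(s, 100).equals(s)); // true
    }
}
```

This aligns cuts to code points only. Sequences that a reader sees as one character but that span several code points (combining marks, joined emoji) can still be split; java.text.BreakIterator is the usual starting point if that matters.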

Memory: modern JVM

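Before Java 7u6, substring returned a String that shared the parent's char array, so a small slice could keep a very large array reachable. From Java 7u6 on, substring copies the characters, which removes that leak. The trade-off is that each call now allocates a new String (OpenJDK returns the original string only when the requested range covers all of it).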

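For very large strings and many slices, that allocation adds up. In tight loops, consider working with indices instead of materializing each slice: indexOf with a start offset, regionMatches, and StringBuilder.append(CharSequence, int, int) all operate on a range of the original string, and a char[] gives direct access when that is not enough.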
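
As an illustration of the index-based approach, the sketch below counts comma-separated fields equal to a target without creating a String per field. FieldCounter and countFields are hypothetical names, and whether this beats substring in a given program is something to measure rather than assume.

```java
final class FieldCounter {
    // Counts comma-separated fields equal to target, comparing in place.
    static int countFields(String line, String target) {
        int count = 0;
        int start = 0;
        while (start <= line.length()) {
            int end = line.indexOf(',', start);
            if (end < 0) end = line.length(); // last field runs to the end of the line

            // Compare the range [start, end) against target without slicing it out.
            boolean sameLength = (end - start) == target.length();
            if (sameLength && line.regionMatches(start, target, 0, target.length())) {
                count++;
            }
            start = end + 1; // continue after the comma
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countFields("INFO,ERROR,WARN,ERROR", "ERROR")); // 2
    }
}
```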

Alternatives

split: break on a separator

String[] parts = "a,b,c".split(",");  // {"a", "b", "c"}
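
Two details of split are easy to miss: the argument is a regular expression, and trailing empty strings are dropped by default. The sketch below shows both; SplitCaveats and splitLiteral are names invented for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class SplitCaveats {
    // Splits on a literal separator and keeps trailing empty fields.
    static String[] splitLiteral(String s, String separator) {
        return s.split(Pattern.quote(separator), -1);
    }

    public static void main(String[] args) {
        // "." is a regex that matches every character, so only empty pieces remain
        // and all of them are dropped as trailing empties.
        System.out.println("a.b.c".split(".").length); // 0

        // Quoting the separator splits on a literal dot.
        System.out.println(Arrays.toString(splitLiteral("a.b.c", "."))); // [a, b, c]

        // Trailing empty strings disappear unless a negative limit is passed.
        System.out.println("a,b,,".split(",").length);     // 2
        System.out.println("a,b,,".split(",", -1).length); // 4
    }
}
```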

replace / replaceAll: substitute

"Hello World".replace("World", "Java"); // "Hello Java"

Pattern / Matcher: regex extraction

import java.util.regex.*;

Matcher m = Pattern.compile("<name>(.*?)</name>").matcher(xml);
if (m.find()) {
    String name = m.group(1);
}
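
When the input may contain several matches, the same pattern can be applied in a loop. This is a sketch with a hypothetical NameExtractor class. It compiles the pattern once and collects every captured group. A regex like this is only suitable for simple, well-formed fragments; real XML calls for an XML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NameExtractor {
    // Compiled once and reused; (.*?) is reluctant so each match stops at the nearest </name>.
    private static final Pattern NAME = Pattern.compile("<name>(.*?)</name>");

    // Returns the text of every <name>...</name> element found in xml, in order.
    static List<String> names(String xml) {
        List<String> out = new ArrayList<>();
        Matcher m = NAME.matcher(xml);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(names("<name>Alice</name><name>Bob</name>")); // [Alice, Bob]
    }
}
```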

Chaining with trim, toLowerCase etc.

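String trimmed = raw.trim();
String clean = trimmed.substring(0, Math.min(trimmed.length(), 100)).toLowerCase();

Chaining is idiomatic, but every index must be computed from the string it is applied to. Writing raw.trim().substring(0, Math.min(raw.length(), 100)) takes the bound from the untrimmed string, so it throws whenever the trimmed string is shorter than both raw.length() and 100. trim can also produce an empty string; substring(0, 0) on it is fine, but any larger endIndex throws.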

Quick reference

Intent                        Expression
First 10 chars                s.substring(0, Math.min(10, s.length()))
Last 4 chars                  s.substring(s.length() - 4)
After first char              s.substring(1)
Between two indices           s.substring(a, b)
File extension                s.substring(s.lastIndexOf('.') + 1)
File name without extension   s.substring(0, s.lastIndexOf('.'))
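
Several of these expressions assume the input is long enough or contains a dot: the "last 4 chars" form throws on shorter strings, and the "without extension" form throws when there is no dot. A guarded sketch follows, with hypothetical helper names. It treats the argument as a bare file name, not a path, and makes no special case for names such as ".gitignore".

```java
final class FileNames {
    // Extension without the dot, or "" when the name has no dot.
    static String extension(String name) {
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1);
    }

    // Name without its last extension; returned unchanged when there is no dot.
    static String baseName(String name) {
        int dot = name.lastIndexOf('.');
        return dot < 0 ? name : name.substring(0, dot);
    }

    // Last n characters, or the whole string if it is shorter than n (assumes n >= 0).
    static String lastChars(String s, int n) {
        return s.substring(Math.max(0, s.length() - n));
    }

    public static void main(String[] args) {
        System.out.println(extension("hello.world.java"));  // java
        System.out.println(baseName("hello.world.java"));   // hello.world
        System.out.println(extension("README").isEmpty());  // true
        System.out.println(baseName("README"));             // README
        System.out.println(lastChars("abc", 4));            // abc
    }
}
```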

Master the inclusive/exclusive index convention and the surrogate-pair caveat, and substring covers most everyday string-slicing needs in Java.