/*
 * Copyright (c) 2003, 2022, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.  Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */
/**
 * A simple text scanner which can parse primitive types and strings using
 * regular expressions.
 *
 * <p>A {@code Scanner} breaks its input into tokens using a
 * delimiter pattern, which by default matches whitespace. The resulting
 * tokens may then be converted into values of different types using the
 * various {@code next} methods.
 *
 * <p>For example, this code allows a user to read a number from
 * the console.
 * {@snippet :
 *     var con = System.console();
 *     if (con != null) {
 *         // @link substring="reader()" target="java.io.Console#reader()" :
 *         Scanner sc = new Scanner(con.reader());
 *         int i = sc.nextInt();
 *     }
 * }
 *
 * <p>As another example, this code allows {@code long} types to be
 * assigned from entries in a file {@code myNumbers}:
 * {@snippet :
 *     Scanner sc = new Scanner(new File("myNumbers"));
 *     while (sc.hasNextLong()) {
 *         long aLong = sc.nextLong();
 *     }
 * }
 *
 * <p>The scanner can also use delimiters other than whitespace. This
 * example reads several items in from a string:
 * {@snippet :
 *     String input = "1 fish 2 fish red fish blue fish";
 *     Scanner s = new Scanner(input).useDelimiter("\\s*fish\\s*");
 *     System.out.println(s.nextInt());
 *     System.out.println(s.nextInt());
 *     System.out.println(s.next());
 *     System.out.println(s.next());
 *     s.close();
 * }
 * <p>
 * prints the following output:
 * <blockquote><pre>{@code
 *     1
 *     2
 *     red
 *     blue
 * }</pre></blockquote>
 *
 * <p>The same output can be generated with this code, which uses a regular
 * expression to parse all four tokens at once:
 * {@snippet :
 *     String input = "1 fish 2 fish red fish blue fish";
 *     Scanner s = new Scanner(input);
 *     s.findInLine("(\\d+) fish (\\d+) fish (\\w+) fish (\\w+)");
 *     MatchResult result = s.match();
 *     for (int i=1; i<=result.groupCount(); i++)
 *         System.out.println(result.group(i));
 *     s.close();
 * }
 *
 * <p>The <a id="default-delimiter">default whitespace delimiter</a> used
 * by a scanner is as recognized by {@link Character#isWhitespace(char)
 * Character.isWhitespace()}. The {@link #reset reset()}
 * method will reset the value of the scanner's delimiter to the default
 * whitespace delimiter regardless of whether it was previously changed.
 *
 * <p>A scanning operation may block waiting for input.
 *
 * <p>The {@link #next} and {@link #hasNext} methods and their
 * companion methods (such as {@link #nextInt} and
 * {@link #hasNextInt}) first skip any input that matches the delimiter
 * pattern, and then attempt to return the next token.  Both {@code hasNext()}
 * and {@code next()} methods may block waiting for further input.  Whether a
 * {@code hasNext()} method blocks has no connection to whether or not its
 * associated {@code next()} method will block. The {@link #tokens} method
 * may also block waiting for input.
 *
 * <p>The {@link #findInLine findInLine()},
 * {@link #findWithinHorizon findWithinHorizon()},
 * {@link #skip skip()}, and {@link #findAll findAll()}
 * methods operate independently of the delimiter pattern. These methods will
 * attempt to match the specified pattern with no regard to delimiters in the
 * input and thus can be used in special circumstances where delimiters are
 * not relevant. These methods may block waiting for more input.
 *
 * <p>When a scanner throws an {@link InputMismatchException}, the scanner
 * will not pass the token that caused the exception, so that it may be
 * retrieved or skipped via some other method.
 *
 * <p>Depending upon the type of delimiting pattern, empty tokens may be
 * returned. For example, the pattern {@code "\\s+"} will return no empty
 * tokens since it matches multiple instances of the delimiter. The delimiting
 * pattern {@code "\\s"} could return empty tokens since it only passes one
 * space at a time.
 *
 * <p> A scanner can read text from any object which implements the {@link
 * java.lang.Readable} interface.
 * If an invocation of the underlying
 * readable's {@link java.lang.Readable#read read()} method throws an {@link
 * java.io.IOException} then the scanner assumes that the end of the input
 * has been reached.  The most recent {@code IOException} thrown by the
 * underlying readable can be retrieved via the {@link #ioException} method.
 *
 * <p>When a {@code Scanner} is closed, it will close its input source
 * if the source implements the {@link java.io.Closeable} interface.
 *
 * <p>A {@code Scanner} is not safe for multithreaded use without
 * external synchronization.
 *
 * <p>Unless otherwise mentioned, passing a {@code null} parameter into
 * any method of a {@code Scanner} will cause a
 * {@code NullPointerException} to be thrown.
 *
 * <p>A scanner will default to interpreting numbers as decimal unless a
 * different radix has been set by using the {@link #useRadix} method. The
 * {@link #reset} method will reset the value of the scanner's radix to
 * {@code 10} regardless of whether it was previously changed.
 *
 * <h2> <a id="localized-numbers">Localized numbers</a> </h2>
 *
 * <p> An instance of this class is capable of scanning numbers in the standard
 * formats as well as in the formats of the scanner's locale. A scanner's
 * <a id="initial-locale">initial locale </a>is the value returned by the {@link
 * java.util.Locale#getDefault(Locale.Category)
 * Locale.getDefault(Locale.Category.FORMAT)} method; it may be changed via the {@link
 * #useLocale useLocale()} method. The {@link #reset} method will reset the value of the
 * scanner's locale to the initial locale regardless of whether it was
 * previously changed.
 *
 * <p>The localized formats are defined in terms of the following parameters,
 * which for a particular locale are taken from that locale's {@link
 * java.text.DecimalFormat DecimalFormat} object, {@code df}, and its
 * {@link java.text.DecimalFormatSymbols DecimalFormatSymbols} object,
 * {@code dfs}.
 *
 * <blockquote><dl>
 *     <dt><i>LocalGroupSeparator&nbsp;&nbsp;</i>
 *         <dd>The character used to separate thousands groups,
 *         <i>i.e.,</i>&nbsp;{@code dfs.}{@link
 *         java.text.DecimalFormatSymbols#getGroupingSeparator
 *         getGroupingSeparator()}
 *     <dt><i>LocalDecimalSeparator&nbsp;&nbsp;</i>
 *         <dd>The character used for the decimal point,
 *         <i>i.e.,</i>&nbsp;{@code dfs.}{@link
 *         java.text.DecimalFormatSymbols#getDecimalSeparator
 *         getDecimalSeparator()}
 *     <dt><i>LocalPositivePrefix&nbsp;&nbsp;</i>
 *         <dd>The string that appears before a positive number (may
 *         be empty), <i>i.e.,</i>&nbsp;{@code df.}{@link
 *         java.text.DecimalFormat#getPositivePrefix
 *         getPositivePrefix()}
 *     <dt><i>LocalPositiveSuffix&nbsp;&nbsp;</i>
 *         <dd>The string that appears after a positive number (may be
 *         empty), <i>i.e.,</i>&nbsp;{@code df.}{@link
 *         java.text.DecimalFormat#getPositiveSuffix
 *         getPositiveSuffix()}
 *     <dt><i>LocalNegativePrefix&nbsp;&nbsp;</i>
 *         <dd>The string that appears before a negative number (may
 *         be empty), <i>i.e.,</i>&nbsp;{@code df.}{@link
 *         java.text.DecimalFormat#getNegativePrefix
 *         getNegativePrefix()}
 *     <dt><i>LocalNegativeSuffix&nbsp;&nbsp;</i>
 *         <dd>The string that appears after a negative number (may be
 *         empty), <i>i.e.,</i>&nbsp;{@code df.}{@link
 *         java.text.DecimalFormat#getNegativeSuffix
 *         getNegativeSuffix()}
 *     <dt><i>LocalNaN&nbsp;&nbsp;</i>
 *         <dd>The string that represents not-a-number for
 *         floating-point values,
 *         <i>i.e.,</i>&nbsp;{@code dfs.}{@link
 *         java.text.DecimalFormatSymbols#getNaN
 *         getNaN()}
 *     <dt><i>LocalInfinity&nbsp;&nbsp;</i>
 *         <dd>The string that represents infinity for floating-point
 *         values, <i>i.e.,</i>&nbsp;{@code dfs.}{@link
 *         java.text.DecimalFormatSymbols#getInfinity
 *         getInfinity()}
 * </dl></blockquote>
 *
 * <h3> <a id="number-syntax">Number syntax</a> </h3>
 *
 * <p> The strings that can be parsed as numbers by an instance of this class
 * are specified in terms of the following regular-expression grammar, where
 * Rmax is the highest digit in the radix being used (for example, Rmax is 9 in base 10).
 *
 * <dl>
 *   <dt><i>NonAsciiDigit</i>:
 *       <dd>A non-ASCII character c for which
 *            {@link java.lang.Character#isDigit Character.isDigit}{@code (c)}
 *                        returns&nbsp;true
 *
 *   <dt><i>Non0Digit</i>:
 *       <dd>{@code [1-}<i>Rmax</i>{@code ] | }<i>NonAsciiDigit</i>
 *
 *   <dt><i>Digit</i>:
 *       <dd>{@code [0-}<i>Rmax</i>{@code ] | }<i>NonAsciiDigit</i>
 *
 *   <dt><i>GroupedNumeral</i>:
 *       <dd><code>(&nbsp;</code><i>Non0Digit</i>
 *                   <i>Digit</i>{@code ?
 *                   }<i>Digit</i>{@code ?}
 *       <dd>&nbsp;&nbsp;&nbsp;&nbsp;<code>(&nbsp;</code><i>LocalGroupSeparator</i>
 *                         <i>Digit</i>
 *                         <i>Digit</i>
 *                         <i>Digit</i>{@code  )+ )}
 *
 *   <dt><i>Numeral</i>:
 *       <dd>{@code ( ( }<i>Digit</i>{@code + )
 *               | }<i>GroupedNumeral</i>{@code  )}
 *
 *   <dt><a id="Integer-regex"><i>Integer</i>:</a>
 *       <dd>{@code ( [-+]? ( }<i>Numeral</i>{@code
 *                               ) )}
 *       <dd>{@code | }<i>LocalPositivePrefix</i> <i>Numeral</i>
 *                      <i>LocalPositiveSuffix</i>
 *       <dd>{@code | }<i>LocalNegativePrefix</i> <i>Numeral</i>
 *                 <i>LocalNegativeSuffix</i>
 *
 *   <dt><i>DecimalNumeral</i>:
 *       <dd><i>Numeral</i>
 *       <dd>{@code | }<i>Numeral</i>
 *                 <i>LocalDecimalSeparator</i>
 *                 <i>Digit</i>{@code *}
 *       <dd>{@code | }<i>LocalDecimalSeparator</i>
 *                 <i>Digit</i>{@code +}
 *
 *   <dt><i>Exponent</i>:
 *       <dd>{@code ( [eE] [+-]? }<i>Digit</i>{@code + )}
 *
 *   <dt><a id="Decimal-regex"><i>Decimal</i>:</a>
 *       <dd>{@code ( [-+]? }<i>DecimalNumeral</i>
 *                         <i>Exponent</i>{@code ? )}
 *       <dd>{@code | }<i>LocalPositivePrefix</i>
 *                 <i>DecimalNumeral</i>
 *                 <i>LocalPositiveSuffix</i>
 *                 <i>Exponent</i>{@code ?}
 *       <dd>{@code | }<i>LocalNegativePrefix</i>
 *                 <i>DecimalNumeral</i>
 *                 <i>LocalNegativeSuffix</i>
 *                 <i>Exponent</i>{@code ?}
 *
 *   <dt><i>HexFloat</i>:
 *       <dd>{@code [-+]? 0[xX][0-9a-fA-F]*\.[0-9a-fA-F]+
 *                 ([pP][-+]?[0-9]+)?}
 *
 *   <dt><i>NonNumber</i>:
 *       <dd>{@code NaN
 *                          | }<i>LocalNaN</i>{@code
 *                          | Infinity
 *                          | }<i>LocalInfinity</i>
 *
 *   <dt><i>SignedNonNumber</i>:
 *       <dd>{@code ( [-+]? }<i>NonNumber</i>{@code  )}
 *       <dd>{@code | }<i>LocalPositivePrefix</i>
 *                 <i>NonNumber</i>
 *                 <i>LocalPositiveSuffix</i>
 *       <dd>{@code | }<i>LocalNegativePrefix</i>
 *                 <i>NonNumber</i>
 *                 <i>LocalNegativeSuffix</i>
 *
 *   <dt><a id="Float-regex"><i>Float</i></a>:
 *       <dd><i>Decimal</i>
 *           {@code | }<i>HexFloat</i>
 *           {@code | }<i>SignedNonNumber</i>
 *
 * </dl>
 * <p>Whitespace is not significant in the above regular expressions.
 *
 * @since   1.5
 */
public final class Scanner implements Iterator<String>, Closeable {

    // Internal buffer used to hold input
    private CharBuffer buf;

    // Size of internal character buffer
    private static final int BUFFER_SIZE = 1024; // change to 1024;

    // The index into the buffer currently held by the Scanner
    private int position;

    // Internal matcher used for finding delimiters
    private Matcher matcher;

    // Pattern used to delimit tokens
    private Pattern delimPattern;

    // Pattern found in last hasNext operation
    private Pattern hasNextPattern;

    // Position after last hasNext operation
    private int hasNextPosition;

    // Result after last hasNext operation
    private String hasNextResult;

    // The input source
    private Readable source;

    // Boolean is true if source is done
    private boolean sourceClosed = false;

    // Boolean indicating more input is required
    private boolean needInput = false;

    // Boolean indicating if a delim has been skipped this operation
    private boolean skipped = false;

    // A store of a position that the scanner may fall back to
    private int savedScannerPosition = -1;

    // A cache of the last primitive type scanned
    private Object typeCache = null;

    // Boolean indicating if a match result is available
    private boolean matchValid = false;

    // Boolean indicating if this scanner has been closed
    private boolean closed = false;

    // The current radix used by this scanner
    private int radix = 10;

    // The default radix for this scanner
    private int defaultRadix = 10;

    // The locale used by this scanner
    private Locale locale = null;

    // A cache of the last few recently used Patterns
    private PatternLRUCache patternCache = new PatternLRUCache(7);

    // A holder of the last IOException encountered
    private IOException lastException;

    // Number of times this scanner's state has been modified.
    // Generally incremented on most public APIs and checked
    // within spliterator implementations.
    int modCount;

    // A pattern for java whitespace
    private static Pattern WHITESPACE_PATTERN = Pattern.compile(
                                                "\\p{javaWhitespace}+");

    // A pattern for any token
    private static Pattern FIND_ANY_PATTERN = Pattern.compile("(?s).*");

    // A pattern for non-ASCII digits
    private static Pattern NON_ASCII_DIGIT = Pattern.compile(
        "[\\p{javaDigit}&&[^0-9]]");

// Fields and methods to support scanning primitive types

    /**
     * Fields and an accessor method to match booleans
     */
    private static volatile Pattern boolPattern;
    private static final String BOOLEAN_PATTERN = "true|false";
    private static Pattern boolPattern() {
        Pattern bp = boolPattern;
        if (bp == null)
            boolPattern = bp = Pattern.compile(BOOLEAN_PATTERN,
                                          Pattern.CASE_INSENSITIVE);
        return bp;
    }
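
/*
Usage sketch (not part of this class): because the boolean pattern above is
compiled with Pattern.CASE_INSENSITIVE, nextBoolean() should accept tokens
such as "TRUE" or "fAlSe". The class and method names below are hypothetical
and exist only for illustration.

```java
import java.util.Scanner;

public class BooleanTokenDemo {
    // Reads the first whitespace-delimited token of the input as a boolean.
    static boolean firstBoolean(String input) {
        try (Scanner sc = new Scanner(input)) {
            return sc.nextBoolean();
        }
    }

    // Reports whether the first token looks like a boolean at all.
    static boolean startsWithBoolean(String input) {
        try (Scanner sc = new Scanner(input)) {
            return sc.hasNextBoolean();
        }
    }

    public static void main(String[] args) {
        System.out.println(firstBoolean("TRUE"));       // true
        System.out.println(firstBoolean("fAlSe"));      // false
        System.out.println(startsWithBoolean("yes"));   // false: only true|false match
    }
}
```
*/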

    /**
     * Fields and methods to match bytes, shorts, ints, and longs
     */
    private Pattern integerPattern;
    private String digits = "0123456789abcdefghijklmnopqrstuvwxyz";
    private String non0Digit = "[\\p{javaDigit}&&[^0]]";
    private int SIMPLE_GROUP_INDEX = 5;
    private String buildIntegerPatternString() {
        String radixDigits = digits.substring(0, radix);
        // \\p{javaDigit} is not guaranteed to be appropriate
        // here but what can we do? The final authority will be
        // whatever parse method is invoked, so ultimately the
        // Scanner will do the right thing
        String digit = "((?i)["+radixDigits+"\\p{javaDigit}])";
        String groupedNumeral = "("+non0Digit+digit+"?"+digit+"?("+
                                groupSeparator+digit+digit+digit+")+)";
        // digit++ is the possessive form which is necessary for reducing
        // backtracking that would otherwise cause unacceptable performance
        String numeral = "(("+ digit+"++)|"+groupedNumeral+")";
        String javaStyleInteger = "([-+]?(" + numeral + "))";
        String negativeInteger = negativePrefix + numeral + negativeSuffix;
        String positiveInteger = positivePrefix + numeral + positiveSuffix;
        return "("+ javaStyleInteger + ")|(" +
            positiveInteger + ")|(" +
            negativeInteger + ")";
    }
    private Pattern integerPattern() {
        if (integerPattern == null) {
            integerPattern = patternCache.forName(buildIntegerPatternString());
        }
        return integerPattern;
    }
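
/*
Usage sketch (not part of this class): the integer pattern built above depends
on the current radix and on the locale's grouping separator. The example below
exercises both through the public API only; IntegerSyntaxDemo and its method
names are hypothetical.

```java
import java.util.Locale;
import java.util.Scanner;

public class IntegerSyntaxDemo {
    // Parses a single token using the given radix (digits beyond 9 are letters).
    static int parseWithRadix(String token, int radix) {
        try (Scanner sc = new Scanner(token)) {
            return sc.useRadix(radix).nextInt();
        }
    }

    // Parses a single token that may contain the locale's grouping separator.
    static int parseGrouped(String token, Locale locale) {
        try (Scanner sc = new Scanner(token).useLocale(locale)) {
            return sc.nextInt();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseWithRadix("ff", 16));              // 255
        System.out.println(parseGrouped("1,234,567", Locale.US));  // 1234567
    }
}
```

Pinning the locale explicitly, as parseGrouped does, avoids surprises when the
default FORMAT locale uses a different grouping separator.
*/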

    /**
     * Fields and an accessor method to match line separators
     */
    private static volatile Pattern separatorPattern;
    private static volatile Pattern linePattern;
    private static final String LINE_SEPARATOR_PATTERN =
                                           "\r\n|[\n\r\u2028\u2029\u0085]";
    private static final String LINE_PATTERN = ".*("+LINE_SEPARATOR_PATTERN+")|.+$";
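
/*
Usage sketch (not part of this class): in the separator pattern above the
two-character CR LF sequence is listed before the single-character
alternatives, so it should be consumed as one line break rather than two.
LineSeparatorDemo and lines() are hypothetical names used only here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class LineSeparatorDemo {
    // Collects every line of the input, without the trailing separators.
    static List<String> lines(String input) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            while (sc.hasNextLine()) {
                out.add(sc.nextLine());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Mixed separators: CR LF, LF, and a lone CR.
        System.out.println(lines("a\r\nb\nc\rd"));   // [a, b, c, d]
    }
}
```
*/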

    /**
     * Constructs a {@code Scanner} that returns values scanned
     * from the specified source delimited by the specified pattern.
     *
     * @param source A character source implementing the Readable interface
     * @param pattern A delimiting pattern
     */
    private Scanner(Readable source, Pattern pattern) {
        assert source != null : "source should not be null";
        assert pattern != null : "pattern should not be null";
        this.source = source;
        delimPattern = pattern;
        buf = CharBuffer.allocate(BUFFER_SIZE);
        buf.limit(0);
        matcher = delimPattern.matcher(buf);
        matcher.useTransparentBounds(true);
        matcher.useAnchoringBounds(false);
        useLocale(Locale.getDefault(Locale.Category.FORMAT));
    }

    /**
     * Constructs a new {@code Scanner} that produces values scanned
     * from the specified source.
     *
     * @param source A character source implementing the {@link Readable}
     *         interface
     */
    public Scanner(Readable source) {
        this(Objects.requireNonNull(source, "source"), WHITESPACE_PATTERN);
    }

    /**
     * Constructs a new {@code Scanner} that produces values scanned
     * from the specified input stream. Bytes from the stream are converted
     * into characters using the
     * {@linkplain Charset#defaultCharset() default charset}.
     *
     * @param source An input stream to be scanned
     * @see Charset#defaultCharset()
     */
    public Scanner(InputStream source) {
        this(new InputStreamReader(source), WHITESPACE_PATTERN);
    }

    /**
     * Constructs a new {@code Scanner} that produces values scanned
     * from the specified input stream. Bytes from the stream are converted
     * into characters using the specified charset.
     *
     * @param source An input stream to be scanned
     * @param charsetName The encoding type used to convert bytes from the
     *        stream into characters to be scanned
     * @throws IllegalArgumentException if the specified character set
     *         does not exist
     */
    public Scanner(InputStream source, String charsetName) {
        this(source, toCharset(charsetName));
    }

    /**
     * Constructs a new {@code Scanner} that produces values scanned
     * from the specified input stream. Bytes from the stream are converted
     * into characters using the specified charset.
     *
     * @param source an input stream to be scanned
     * @param charset the charset used to convert bytes from the stream
     *        into characters to be scanned
     * @since 10
     */
    public Scanner(InputStream source, Charset charset) {
        this(makeReadable(Objects.requireNonNull(source, "source"), charset),
             WHITESPACE_PATTERN);
    }

    /**
     * Returns a charset object for the given charset name.
     * @throws NullPointerException          if csn is null
     * @throws IllegalArgumentException      if the charset is not supported
     */
    private static Charset toCharset(String csn) {
        Objects.requireNonNull(csn, "charsetName");
        try {
            return Charset.forName(csn);
        } catch (IllegalCharsetNameException|UnsupportedCharsetException e) {
            // IllegalArgumentException should be thrown
            throw new IllegalArgumentException(e);
        }
    }

    /*
     * This method is added so that null-check on charset can be performed before
     * creating InputStream as an existing test required it.
     */
    private static Readable makeReadable(Path source, Charset charset)
            throws IOException {
        Objects.requireNonNull(charset, "charset");
        return makeReadable(Files.newInputStream(source), charset);
    }

    /**
     * Constructs a new {@code Scanner} that produces values scanned
     * from the specified file. Bytes from the file are converted into
     * characters using the
     * {@linkplain Charset#defaultCharset() default charset}.
     *
     * @param source A file to be scanned
     * @throws FileNotFoundException if source is not found
     * @see Charset#defaultCharset()
     */
    public Scanner(File source) throws FileNotFoundException {
        this((ReadableByteChannel)(new FileInputStream(source).getChannel()));
    }

    /**
     * Constructs a new {@code Scanner} that produces values scanned
     * from the specified file. Bytes from the file are converted into
     * characters using the specified charset.
     *
     * @param source A file to be scanned
     * @param charsetName The encoding type used to convert bytes from the file
     *        into characters to be scanned
     * @throws FileNotFoundException if source is not found
     * @throws IllegalArgumentException if the specified encoding is
     *         not found
     */
    public Scanner(File source, String charsetName)
        throws FileNotFoundException
    {
        this(Objects.requireNonNull(source), toDecoder(charsetName));
    }

    /**
     * Constructs a new {@code Scanner} that produces values scanned
     * from the specified file. Bytes from the file are converted into
     * characters using the specified charset.
     *
     * @param source A file to be scanned
     * @param charset The charset used to convert bytes from the file
     *        into characters to be scanned
     * @throws IOException
     *         if an I/O error occurs opening the source
     * @since 10
     */
    public Scanner(File source, Charset charset) throws IOException {
        this(Objects.requireNonNull(source), charset.newDecoder());
    }

    /**
     * Constructs a new {@code Scanner} that produces values scanned
     * from the specified file. Bytes from the file are converted into
     * characters using the
     * {@linkplain Charset#defaultCharset() default charset}.
     *
     * @param   source
     *          the path to the file to be scanned
     * @throws  IOException
     *          if an I/O error occurs opening source
     * @see Charset#defaultCharset()
     *
     * @since   1.7
     */
    public Scanner(Path source)
        throws IOException
    {
        this(Files.newInputStream(source));
    }

    /**
     * Constructs a new {@code Scanner} that produces values scanned
     * from the specified file. Bytes from the file are converted into
     * characters using the specified charset.
     *
     * @param   source
     *          the path to the file to be scanned
     * @param   charsetName
     *          The encoding type used to convert bytes from the file
     *          into characters to be scanned
     * @throws  IOException
     *          if an I/O error occurs opening source
     * @throws  IllegalArgumentException
     *          if the specified encoding is not found
     * @since   1.7
     */
    public Scanner(Path source, String charsetName) throws IOException {
        this(Objects.requireNonNull(source), toCharset(charsetName));
    }

    /**
     * Constructs a new {@code Scanner} that produces values scanned
     * from the specified file. Bytes from the file are converted into
     * characters using the specified charset.
     *
     * @param   source
     *          the path to the file to be scanned
     * @param   charset
     *          the charset used to convert bytes from the file
     *          into characters to be scanned
     * @throws  IOException
     *          if an I/O error occurs opening the source
     * @since   10
     */
    public Scanner(Path source, Charset charset)  throws IOException {
        this(makeReadable(source, charset));
    }

    /**
     * Constructs a new {@code Scanner} that produces values scanned
     * from the specified string.
     *
     * @param  source A string to scan
     */
    public Scanner(String source) {
        this(new StringReader(source), WHITESPACE_PATTERN);
    }

    /**
     * Constructs a new {@code Scanner} that produces values scanned
     * from the specified channel. Bytes from the source are converted into
     * characters using the
     * {@linkplain Charset#defaultCharset() default charset}.
     *
     * @param  source A channel to scan
     * @see Charset#defaultCharset()
     */
    public Scanner(ReadableByteChannel source) {
        this(makeReadable(Objects.requireNonNull(source, "source")),
             WHITESPACE_PATTERN);
    }

    /**
     * Constructs a new {@code Scanner} that produces values scanned
     * from the specified channel. Bytes from the source are converted into
     * characters using the specified charset.
     *
     * @param  source A channel to scan
     * @param charsetName The encoding type used to convert bytes from the
     *        channel into characters to be scanned
     * @throws IllegalArgumentException if the specified character set
     *         does not exist
     */
    public Scanner(ReadableByteChannel source, String charsetName) {
        this(makeReadable(Objects.requireNonNull(source, "source"), toDecoder(charsetName)),
             WHITESPACE_PATTERN);
    }

    /**
     * Constructs a new {@code Scanner} that produces values scanned
     * from the specified channel. Bytes from the source are converted into
     * characters using the specified charset.
     *
     * @param source a channel to scan
     * @param charset the encoding type used to convert bytes from the
     *        channel into characters to be scanned
     * @since 10
     */
    public Scanner(ReadableByteChannel source, Charset charset) {
        this(makeReadable(Objects.requireNonNull(source, "source"), charset),
             WHITESPACE_PATTERN);
    }

    // Clears both regular cache and type cache
    private void clearCaches() {
        hasNextPattern = null;
        typeCache = null;
    }

    // Also clears both the regular cache and the type cache
    private String getCachedResult() {
        position = hasNextPosition;
        hasNextPattern = null;
        typeCache = null;
        return hasNextResult;
    }

    // Also clears both the regular cache and the type cache
    private void useTypeCache() {
        if (closed)
            throw new IllegalStateException("Scanner closed");
        position = hasNextPosition;
        hasNextPattern = null;
        typeCache = null;
    }

    // Tries to read more input. May block.
    private void readInput() {
        if (buf.limit() == buf.capacity())
            makeSpace();
        // Prepare to receive data
        int p = buf.position();
        buf.position(buf.limit());
        buf.limit(buf.capacity());

        int n = 0;
        try {
            n = source.read(buf);
        } catch (IOException ioe) {
            lastException = ioe;
            n = -1;
        }
        if (n == -1) {
            sourceClosed = true;
            needInput = false;
        }
        if (n > 0)
            needInput = false;
        // Restore current position and limit for reading
        buf.limit(buf.position());
        buf.position(p);
    }
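
/*
Illustrative sketch (not part of this class): the position/limit bookkeeping in
readInput() can be tried in isolation. The helper below mirrors the same steps
on a standalone CharBuffer; ReadInputSketch and fill() are hypothetical names,
and an IOException is simply mapped to -1 here instead of being recorded.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.CharBuffer;

public class ReadInputSketch {
    // Appends whatever the source delivers after the buffer's valid data, then
    // restores the read position so unread characters stay visible.
    static int fill(Readable source, CharBuffer buf) {
        int p = buf.position();          // remember where reading left off
        buf.position(buf.limit());       // start writing after the valid data
        buf.limit(buf.capacity());       // expose the free space
        int n;
        try {
            n = source.read(buf);        // may block; -1 signals end of input
        } catch (IOException e) {
            n = -1;                      // assumption: treat failure as end of input
        }
        buf.limit(buf.position());       // valid data ends where writing stopped
        buf.position(p);                 // back to the original read position
        return n;
    }

    public static void main(String[] args) {
        CharBuffer buf = CharBuffer.allocate(16);
        buf.limit(0);                    // start empty, as the constructor does
        StringReader src = new StringReader("hello");
        System.out.println(fill(src, buf) + " -> " + buf);   // 5 -> hello
        System.out.println(fill(src, buf) + " -> " + buf);   // -1 -> hello
    }
}
```
*/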

    // After this method is called there will either be an exception
    // or else there will be space in the buffer
    private boolean makeSpace() {
        clearCaches();
        int offset = savedScannerPosition == -1 ?
            position : savedScannerPosition;
        buf.position(offset);
        // Gain space by compacting buffer
        if (offset > 0) {
            buf.compact();
            translateSavedIndexes(offset);
            position -= offset;
            buf.flip();
            return true;
        }
        // Gain space by growing buffer
        int newSize = buf.capacity() * 2;
        CharBuffer newBuf = CharBuffer.allocate(newSize);
        newBuf.put(buf);
        newBuf.flip();
        translateSavedIndexes(offset);
        position -= offset;
        buf = newBuf;
        matcher.reset(buf);
        return true;
    }
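
/*
Illustrative sketch (not part of this class): makeSpace() first tries to
reclaim already-consumed characters by compacting, and only doubles the buffer
when nothing can be reclaimed. The standalone helper below shows that
compact-or-grow decision on a plain CharBuffer; MakeSpaceSketch and its
parameters are hypothetical, and the index translation done by the real method
is left out.

```java
import java.nio.CharBuffer;

public class MakeSpaceSketch {
    // Returns a buffer in read mode (position 0) that has free capacity.
    // `offset` is the index of the first character that must be kept.
    static CharBuffer makeSpace(CharBuffer buf, int offset) {
        buf.position(offset);
        if (offset > 0) {
            buf.compact();   // slide [offset, limit) to the front
            buf.flip();      // read mode again: limit = number of kept chars
            return buf;
        }
        // Nothing to discard, so grow instead.
        CharBuffer bigger = CharBuffer.allocate(buf.capacity() * 2);
        bigger.put(buf);
        bigger.flip();
        return bigger;
    }

    public static void main(String[] args) {
        CharBuffer full = CharBuffer.allocate(4);
        full.put("abcd");
        full.flip();
        CharBuffer compacted = makeSpace(full, 2);
        System.out.println(compacted + " cap=" + compacted.capacity());  // cd cap=4

        CharBuffer full2 = CharBuffer.allocate(4);
        full2.put("abcd");
        full2.flip();
        CharBuffer grown = makeSpace(full2, 0);
        System.out.println(grown + " cap=" + grown.capacity());          // abcd cap=8
    }
}
```
*/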

    // When a buffer compaction/reallocation occurs the saved indexes must
    // be modified appropriately
    private void translateSavedIndexes(int offset) {
        if (savedScannerPosition != -1)
            savedScannerPosition -= offset;
    }

    // If we are at the end of input then NoSuchElement;
    // If there is still input left then InputMismatch
    private void throwFor() {
        skipped = false;
        if ((sourceClosed) && (position == buf.limit()))
            throw new NoSuchElementException();
        else
            throw new InputMismatchException();
    }
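
/*
Usage sketch (not part of this class): the two outcomes of throwFor() are
visible to callers as NoSuchElementException (input exhausted) and
InputMismatchException (a token exists but has the wrong shape). Since
InputMismatchException is a subclass of NoSuchElementException, it has to be
caught first. ScanFailureDemo and its methods are hypothetical names.

```java
import java.util.InputMismatchException;
import java.util.NoSuchElementException;
import java.util.Scanner;

public class ScanFailureDemo {
    // Classifies what happens when the first token is read as an int.
    static String classify(String input) {
        try (Scanner sc = new Scanner(input)) {
            sc.nextInt();
            return "ok";
        } catch (InputMismatchException e) {   // subclass: must come first
            return "mismatch";
        } catch (NoSuchElementException e) {
            return "exhausted";
        }
    }

    // After a mismatch the offending token is still available via next().
    static String offendingToken(String input) {
        try (Scanner sc = new Scanner(input)) {
            try {
                sc.nextInt();
                return null;                   // no mismatch occurred
            } catch (InputMismatchException e) {
                return sc.next();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("42"));             // ok
        System.out.println(classify("abc"));            // mismatch
        System.out.println(classify(""));               // exhausted
        System.out.println(offendingToken("abc 5"));    // abc
    }
}
```
*/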

    // Returns true if a complete token or partial token is in the buffer.
    // It is not necessary to find a complete token since a partial token
    // means that there will be another token with or without more input.
    private boolean hasTokenInBuffer() {
        matchValid = false;
        matcher.usePattern(delimPattern);
        matcher.region(position, buf.limit());
        // Skip delims first
        if (matcher.lookingAt()) {
            if (matcher.hitEnd() && !sourceClosed) {
                // More input might change the match of the delims, which in
                // turn might change whether there is a token left in the
                // buffer (don't update the "position" in this case)
                needInput = true;
                return false;
            }
            position = matcher.end();
        }
        // If we are sitting at the end, no more tokens in buffer
        if (position == buf.limit())
            return false;
        return true;
    }
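
/*
Illustrative sketch (not part of this class): the buffer-scanning helpers rely
on Matcher.hitEnd() to detect that more input could extend a match, and on
Matcher.requireEnd() to detect that more input could invalidate one. The
standalone example below shows both signals with java.util.regex directly;
HitEndDemo and its method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HitEndDemo {
    // True when a leading run of whitespace reaches the end of the buffered
    // text, so further input might still belong to the same delimiter.
    static boolean delimiterMayGrow(String buffered) {
        Matcher m = Pattern.compile("\\s+").matcher(buffered);
        return m.lookingAt() && m.hitEnd();
    }

    // True when the regex matched only because the text ended where it did.
    static boolean matchNeedsEnd(String regex, String buffered) {
        Matcher m = Pattern.compile(regex).matcher(buffered);
        return m.find() && m.requireEnd();
    }

    public static void main(String[] args) {
        System.out.println(delimiterMayGrow("  "));          // true
        System.out.println(delimiterMayGrow("  x"));         // false
        System.out.println(matchNeedsEnd("foo$", "foo"));    // true
        System.out.println(matchNeedsEnd("foo", "foo"));     // false
    }
}
```
*/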

    /*
     * Returns a "complete token" that matches the specified pattern
     *
     * A token is complete if surrounded by delims; a partial token
     * is prefixed by delims but not postfixed by them
     *
     * The position is advanced to the end of that complete token
     *
     * Pattern == null means accept any token at all
     *
     * Triple return:
     * 1. valid string means it was found
     * 2. null with needInput=false means we won't ever find it
     * 3. null with needInput=true means try again after readInput
     */
    private String getCompleteTokenInBuffer(Pattern pattern) {
        matchValid = false;
        // Skip delims first
        matcher.usePattern(delimPattern);
        if (!skipped) { // Enforcing only one skip of leading delims
            matcher.region(position, buf.limit());
            if (matcher.lookingAt()) {
                // If more input could extend the delimiters then we must wait
                // for more input
                if (matcher.hitEnd() && !sourceClosed) {
                    needInput = true;
                    return null;
                }
                // The delims were whole and the matcher should skip them
                skipped = true;
                position = matcher.end();
            }
        }

        // If we are sitting at the end, no more tokens in buffer
        if (position == buf.limit()) {
            if (sourceClosed)
                return null;
            needInput = true;
            return null;
        }
        // Must look for next delims. Simply attempting to match the
        // pattern at this point may find a match but it might not be
        // the first longest match because of missing input, or it might
        // match a partial token instead of the whole thing.

        // Then look for next delims
        matcher.region(position, buf.limit());
        boolean foundNextDelim = matcher.find();
        if (foundNextDelim && (matcher.end() == position)) {
            // Zero length delimiter match; we should find the next one
            // using the automatic advance past a zero length match;
            // Otherwise we have just found the same one we just skipped
            foundNextDelim = matcher.find();
        }
        if (foundNextDelim) {
            // In the rare case that more input could cause the match
            // to be lost and there is more input coming we must wait
            // for more input. Note that hitting the end is okay as long
            // as the match cannot go away. It is the beginning of the
            // next delims we want to be sure about, we don't care if
            // they potentially extend further.
            if (matcher.requireEnd() && !sourceClosed) {
                needInput = true;
                return null;
            }
            int tokenEnd = matcher.start();
            // There is a complete token.
            if (pattern == null) {
                // Must continue with match to provide valid MatchResult
                pattern = FIND_ANY_PATTERN;
            }
            // Attempt to match against the desired pattern
            matcher.usePattern(pattern);
            matcher.region(position, tokenEnd);
            if (matcher.matches()) {
                String s = matcher.group();
                position = matcher.end();
                return s;
            } else { // Complete token but it does not match
                return null;
            }
        }

        // If we can't find the next delims but no more input is coming,
        // then we can treat the remainder as a whole token
        if (sourceClosed) {
            if (pattern == null) {
                // Must continue with match to provide valid MatchResult
                pattern = FIND_ANY_PATTERN;
            }
            // Last token; Match the pattern here or throw
            matcher.usePattern(pattern);
            matcher.region(position, buf.limit());
            if (matcher.matches()) {
                String s = matcher.group();
                position = matcher.end();
                return s;
            }
            // Last piece does not match
            return null;
        }

        // There is a partial token in the buffer; must read more
        // to complete it
        needInput = true;
        return null;
    }

    // Finds the specified pattern in the buffer up to horizon.
    // Returns true if the specified input pattern was matched,
    // and leaves the matcher field with the current match state.
    private boolean findPatternInBuffer(Pattern pattern, int horizon) {
        matchValid = false;
        matcher.usePattern(pattern);
        int bufferLimit = buf.limit();
        int horizonLimit = -1;
        int searchLimit = bufferLimit;
        if (horizon > 0) {
            horizonLimit = position + horizon;
            if (horizonLimit < bufferLimit)
                searchLimit = horizonLimit;
        }
        matcher.region(position, searchLimit);
        if (matcher.find()) {
            if (matcher.hitEnd() && (!sourceClosed)) {
                // The match may be longer if didn't hit horizon or real end
                if (searchLimit != horizonLimit) {
                    // Hit an artificial end; try to extend the match
                    needInput = true;
                    return false;
                }
                // The match could go away depending on what is next
                if ((searchLimit == horizonLimit) && matcher.requireEnd()) {
                    // Rare case: we hit the end of input and it happens
                    // that it is at the horizon and the end of input is
                    // required for the match.
                    needInput = true;
                    return false;
                }
            }
            // Did not hit end, or hit real end, or hit horizon
            position = matcher.end();
            return true;
        }

        if (sourceClosed)
            return false;

        // If there is no specified horizon, or if we have not searched
        // to the specified horizon yet, get more input
        if ((horizon == 0) || (searchLimit != horizonLimit))
            needInput = true;
        return false;
    }
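
/*
Usage sketch (not part of this class): the horizon handling above is what
callers observe through findWithinHorizon(). A positive horizon bounds how far
the search may look, and a match cannot extend past it; a horizon of 0 means
no bound. HorizonDemo and firstDigits() are hypothetical names.

```java
import java.util.Scanner;

public class HorizonDemo {
    // Searches for a run of digits within the first `horizon` characters
    // (0 means unbounded). Returns null when nothing is found.
    static String firstDigits(String input, int horizon) {
        try (Scanner sc = new Scanner(input)) {
            return sc.findWithinHorizon("\\d+", horizon);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstDigits("abc123", 0));   // 123
        System.out.println(firstDigits("abc123", 4));   // 1 (match is cut at the horizon)
        System.out.println(firstDigits("abc123", 3));   // null
    }
}
```
*/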

    // Attempts to match a pattern anchored at the current position.
    // Returns true if the specified input pattern was matched,
    // and leaves the matcher field with the current match state.
    private boolean matchPatternInBuffer(Pattern pattern) {
        matchValid = false;
        matcher.usePattern(pattern);
        matcher.region(position, buf.limit());
        if (matcher.lookingAt()) {
            if (matcher.hitEnd() && (!sourceClosed)) {
                // Get more input and try again
                needInput = true;
                return false;
            }
            position = matcher.end();
            return true;
        }

        if (sourceClosed)
            return false;

        // Read more to find pattern
        needInput = true;
        return false;
    }

    // Throws if the scanner is closed
    private void ensureOpen() {
        if (closed)
            throw new IllegalStateException("Scanner closed");
    }

    // Public methods
/** * Closes this scanner. * * <p> If this scanner has not yet been closed then if its underlying * {@linkplain java.lang.Readable readable} also implements the {@link * java.io.Closeable} interface then the readable's {@code close} method * will be invoked. If this scanner is already closed then invoking this * method will have no effect. * * <p>Attempting to perform search operations after a scanner has * been closed will result in an {@link IllegalStateException}. *
*/ publicvoid close() { if (closed) return; if (source instanceof Closeable) { try {
((Closeable)source).close();
} catch (IOException ioe) {
lastException = ioe;
}
}
sourceClosed = true;
source = null;
closed = true;
}
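
/*
Usage sketch (not part of the original source). The close() contract above can be
illustrated roughly as follows, assuming a StringReader as the underlying readable.
The class ScannerCloseDemo and its helper method are hypothetical names chosen
for illustration only.

```java
import java.io.StringReader;
import java.util.Scanner;

final class ScannerCloseDemo {
    // Returns true if scanning after close() raised IllegalStateException.
    static boolean throwsAfterClose() {
        Scanner sc = new Scanner(new StringReader("alpha beta"));
        sc.next();    // consume the first token while the scanner is open
        sc.close();   // closes the scanner and the StringReader
        sc.close();   // a second call has no effect
        try {
            sc.next();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsAfterClose());
    }
}
```
*/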

    /**
     * Returns the {@code IOException} last thrown by this
     * {@code Scanner}'s underlying {@code Readable}. This method
     * returns {@code null} if no such exception exists.
     *
     * @return the last exception thrown by this scanner's readable
     */
    public IOException ioException() {
        return lastException;
    }
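
/*
Usage sketch (not part of the original source). A scanner records read failures
instead of propagating them, so callers may want to check ioException() once
scanning stops. The FailingReader below is a hypothetical readable that fails on
every read; it exists only to trigger that path.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Scanner;

final class ScannerIoExceptionDemo {
    // Hypothetical reader that throws on every read attempt.
    static final class FailingReader extends Reader {
        @Override
        public int read(char[] cbuf, int off, int len) throws IOException {
            throw new IOException("simulated failure");
        }

        @Override
        public void close() {
            // nothing to release
        }
    }

    // Returns the message of the IOException recorded by the scanner, or null.
    static String recordedMessage() {
        try (Scanner sc = new Scanner(new FailingReader())) {
            boolean more = sc.hasNext();   // the failed read is treated as end of input
            IOException e = sc.ioException();
            return (!more && e != null) ? e.getMessage() : null;
        }
    }

    public static void main(String[] args) {
        System.out.println(recordedMessage());
    }
}
```
*/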

    /**
     * Returns the {@code Pattern} this {@code Scanner} is currently
     * using to match delimiters.
     *
     * @return this scanner's delimiting pattern.
     */
    public Pattern delimiter() {
        return delimPattern;
    }

    /**
     * Sets this scanner's delimiting pattern to the specified pattern.
     *
     * @param pattern A delimiting pattern
     * @return this scanner
     */
    public Scanner useDelimiter(Pattern pattern) {
        modCount++;
        delimPattern = pattern;
        return this;
    }

    /**
     * Sets this scanner's delimiting pattern to a pattern constructed from
     * the specified {@code String}.
     *
     * <p> An invocation of this method of the form
     * {@code useDelimiter(pattern)} behaves in exactly the same way as the
     * invocation {@code useDelimiter(Pattern.compile(pattern))}.
     *
     * <p> Invoking the {@link #reset} method will set the scanner's delimiter
     * to the <a href="#default-delimiter">default</a>.
     *
     * @param pattern A string specifying a delimiting pattern
     * @return this scanner
     */
    public Scanner useDelimiter(String pattern) {
        modCount++;
        delimPattern = patternCache.forName(pattern);
        return this;
    }
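
/*
Usage sketch (not part of the original source). One way to use a custom delimiter
is to split comma-separated input while tolerating surrounding whitespace. The
class ScannerDelimiterDemo and the delimiter expression are illustrative choices,
not anything prescribed by this file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

final class ScannerDelimiterDemo {
    // Splits the input on commas, ignoring whitespace around each comma.
    static List<String> split(String input) {
        List<String> tokens = new ArrayList<>();
        try (Scanner sc = new Scanner(input).useDelimiter("\\s*,\\s*")) {
            while (sc.hasNext()) {
                tokens.add(sc.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("a, b ,c"));
    }
}
```
*/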

    /**
     * Returns this scanner's locale.
     *
     * <p>A scanner's locale affects many elements of its default
     * primitive matching regular expressions; see
     * <a href="#localized-numbers">localized numbers</a> above.
     *
     * @return this scanner's locale
     */
    public Locale locale() {
        return this.locale;
    }

    /**
     * Sets this scanner's locale to the specified locale.
     *
     * <p>A scanner's locale affects many elements of its default
     * primitive matching regular expressions; see
     * <a href="#localized-numbers">localized numbers</a> above.
     *
     * <p>Invoking the {@link #reset} method will set the scanner's locale to
     * the <a href="#initial-locale">initial locale</a>.
     *
     * @param locale the locale to use
     * @return this scanner
     */
    public Scanner useLocale(Locale locale) {
        if (locale.equals(this.locale))
            return this;

        modCount++;
        this.locale = locale;

        DecimalFormat df = null;
        NumberFormat nf = NumberFormat.getNumberInstance(locale);
        DecimalFormatSymbols dfs = DecimalFormatSymbols.getInstance(locale);
        if (nf instanceof DecimalFormat) {
            df = (DecimalFormat) nf;
        } else {
            // In case where NumberFormat.getNumberInstance() returns
            // other instance (non DecimalFormat) based on the provider
            // used and java.text.spi.NumberFormatProvider implementations,
            // DecimalFormat constructor is used to obtain the instance
            LocaleProviderAdapter adapter = LocaleProviderAdapter
                    .getAdapter(NumberFormatProvider.class, locale);
            if (!(adapter instanceof ResourceBundleBasedAdapter)) {
                adapter = LocaleProviderAdapter.getResourceBundleBased();
            }
            String[] all = adapter.getLocaleResources(locale)
                    .getNumberPatterns();
            df = new DecimalFormat(all[0], dfs);
        }

        // These must be literalized to avoid collision with regex
        // metacharacters such as dot or parenthesis
        groupSeparator = "\\x{" + Integer.toHexString(dfs.getGroupingSeparator()) + "}";
        decimalSeparator = "\\x{" + Integer.toHexString(dfs.getDecimalSeparator()) + "}";

        // Quoting the nonzero length locale-specific things
        // to avoid potential conflict with metacharacters
        nanString = Pattern.quote(dfs.getNaN());
        infinityString = Pattern.quote(dfs.getInfinity());
        positivePrefix = df.getPositivePrefix();
        if (!positivePrefix.isEmpty())
            positivePrefix = Pattern.quote(positivePrefix);
        negativePrefix = df.getNegativePrefix();
        if (!negativePrefix.isEmpty())
            negativePrefix = Pattern.quote(negativePrefix);
        positiveSuffix = df.getPositiveSuffix();
        if (!positiveSuffix.isEmpty())
            positiveSuffix = Pattern.quote(positiveSuffix);
        negativeSuffix = df.getNegativeSuffix();
        if (!negativeSuffix.isEmpty())
            negativeSuffix = Pattern.quote(negativeSuffix);

        // Force rebuilding and recompilation of locale dependent
        // primitive patterns
        integerPattern = null;
        floatPattern = null;

        return this;
    }
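
/*
Usage sketch (not part of the original source). Because the locale drives the
group and decimal separators used by the numeric patterns, switching locale changes
which tokens parse as numbers. The sketch assumes the runtime's locale data for
Locale.GERMANY uses "." for grouping and "," for decimals; ScannerLocaleDemo is a
hypothetical class name.

```java
import java.util.Locale;
import java.util.Scanner;

final class ScannerLocaleDemo {
    // Parses a single German-formatted decimal number from the text.
    static double parseGerman(String text) {
        try (Scanner sc = new Scanner(text).useLocale(Locale.GERMANY)) {
            return sc.nextDouble();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseGerman("1.234,5"));
    }
}
```
*/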

    /**
     * Returns this scanner's default radix.
     *
     * <p>A scanner's radix affects elements of its default
     * number matching regular expressions; see
     * <a href="#localized-numbers">localized numbers</a> above.
     *
     * @return the default radix of this scanner
     */
    public int radix() {
        return this.defaultRadix;
    }

    /**
     * Sets this scanner's default radix to the specified radix.
     *
     * <p>A scanner's radix affects elements of its default
     * number matching regular expressions; see
     * <a href="#localized-numbers">localized numbers</a> above.
     *
     * <p>If the radix is less than {@link Character#MIN_RADIX Character.MIN_RADIX}
     * or greater than {@link Character#MAX_RADIX Character.MAX_RADIX}, then an
     * {@code IllegalArgumentException} is thrown.
     *
     * <p>Invoking the {@link #reset} method will set the scanner's radix to
     * {@code 10}.
     *
     * @param radix The radix to use when scanning numbers
     * @return this scanner
     * @throws IllegalArgumentException if radix is out of range
     */
    public Scanner useRadix(int radix) {
        if ((radix < Character.MIN_RADIX) || (radix > Character.MAX_RADIX))
            throw new IllegalArgumentException("radix:"+radix);

        if (this.defaultRadix == radix)
            return this;
        modCount++;
        this.defaultRadix = radix;
        // Force rebuilding and recompilation of radix dependent patterns
        integerPattern = null;
        return this;
    }
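
/*
Usage sketch (not part of the original source). The difference between the default
radix set by useRadix and a per-call radix can be seen by mixing the two. The
sketch relies on the public nextInt() and nextInt(int) methods of Scanner;
ScannerRadixDemo is a hypothetical class name.

```java
import java.util.Scanner;

final class ScannerRadixDemo {
    // Reads three integers: default radix 16, a one-off radix 2, then radix 16 again.
    // The last element reports the default radix afterwards.
    static int[] read(String input) {
        try (Scanner sc = new Scanner(input).useRadix(16)) {
            int first = sc.nextInt();      // parsed with the default radix (16)
            int second = sc.nextInt(2);    // parsed as binary for this call only
            int third = sc.nextInt();      // default radix is still 16
            return new int[] { first, second, third, sc.radix() };
        }
    }

    public static void main(String[] args) {
        for (int value : read("ff 101 10")) {
            System.out.println(value);
        }
    }
}
```
*/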

    // The next operation should occur in the specified radix but
    // the default is left untouched.
    private void setRadix(int radix) {
        if ((radix < Character.MIN_RADIX) || (radix > Character.MAX_RADIX))
            throw new IllegalArgumentException("radix:"+radix);

        if (this.radix != radix) {
            // Force rebuilding and recompilation of radix dependent patterns
            integerPattern = null;
            this.radix = radix;
        }
    }

    /**
     * Returns the match result of the last scanning operation performed
     * by this scanner. This method throws {@code IllegalStateException}
     * if no match has been performed, or if the last match was
     * not successful.
     *
     * <p>The various {@code next} methods of {@code Scanner}
     * make a match result available if they complete without throwing an
     * exception. For instance, after an invocation of the {@link #nextInt}
     * method that returned an int, this method returns a
     * {@code MatchResult} for the search of the
     * <a href="#Integer-regex"><i>Integer</i></a> regular expression
     * defined above. Similarly the {@link #findInLine findInLine()},
     * {@link #findWithinHorizon findWithinHorizon()}, and {@link #skip skip()}
     * methods will make a match available if they succeed.
     *
     * @return a match result for the last match operation
     * @throws IllegalStateException If no match result is available
     */
    public MatchResult match() {
        if (!matchValid)
            throw new IllegalStateException("No match result available");
        return matcher.toMatchResult();
    }
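
/*
Usage sketch (not part of the original source). A match result is only available
after a successful scanning operation, which the following sketch tries to show
with findInLine and capture groups. The key=value input format and the class name
ScannerMatchDemo are assumptions made for illustration.

```java
import java.util.Scanner;
import java.util.regex.MatchResult;

final class ScannerMatchDemo {
    // Returns { key, value } for the first key=digits pair on the line, or null.
    static String[] firstPair(String line) {
        try (Scanner sc = new Scanner(line)) {
            if (sc.findInLine("(\\w+)=(\\d+)") == null) {
                return null;
            }
            MatchResult result = sc.match();
            return new String[] { result.group(1), result.group(2) };
        }
    }

    // Returns true if match() rejects a scanner that has not matched anything yet.
    static boolean failsBeforeAnyMatch() {
        try (Scanner sc = new Scanner("x")) {
            sc.match();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] pair = firstPair("width=42 height=7");
        System.out.println(pair[0] + " -> " + pair[1]);
        System.out.println(failsBeforeAnyMatch());
    }
}
```
*/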

    /**
     * Returns the string representation of this {@code Scanner}. The
     * string representation of a {@code Scanner} contains information
     * that may be useful for debugging. The exact format is unspecified.
     *
     * @return The string representation of this scanner
     */
    public String toString() {
        StringBuilder sb = new StringBuilder();
        sb.append("java.util.Scanner");
        sb.append("[delimiters=" + delimPattern + "]");
        sb.append("[position=" + position + "]");
        sb.append("[match valid=" + matchValid + "]");
        sb.append("[need input=" + needInput + "]");
        sb.append("[source closed=" + sourceClosed + "]");
        sb.append("[skipped=" + skipped + "]");
        sb.append("[group separator=" + groupSeparator + "]");
        sb.append("[decimal separator=" + decimalSeparator + "]");
        sb.append("[positive prefix=" + positivePrefix + "]");
        sb.append("[negative prefix=" + negativePrefix + "]");
        sb.append("[positive suffix=" + positiveSuffix + "]");
        sb.append("[negative suffix=" + negativeSuffix + "]");
        sb.append("[NaN string=" + nanString + "]");
        sb.append("[infinity string=" + infinityString + "]");
        return sb.toString();
    }

    /**
     * Returns true if this scanner has another token in its input.
     * This method may block while waiting for input to scan.
     * The scanner does not advance past any input.
     *
     * @return true if and only if this scanner has another token
     * @throws IllegalStateException if this scanner is closed
     * @see java.util.Iterator
     */
    public boolean hasNext() {
        ensureOpen();
        saveState();
        modCount++;
        while (!sourceClosed) {
            if (hasTokenInBuffer()) {
                return revertState(true);
            }
            readInput();
        }
        boolean result = hasTokenInBuffer();
        return revertState(result);
    }

    /**
     * Finds and returns the next complete token from this scanner.
     * A complete token is preceded and followed by input that matches
     * the delimiter pattern. This method may block while waiting for input
     * to scan, even if a previous invocation of {@link #hasNext} returned
     * {@code true}.
     *
     * @return the next token
     * @throws NoSuchElementException if no more tokens are available
     * @throws IllegalStateException if this scanner is closed
     * @see java.util.Iterator
     */
    public String next() {
        ensureOpen();
        clearCaches();
        modCount++;
        while (true) {
            String token = getCompleteTokenInBuffer(null);
            if (token != null) {
                matchValid = true;
                skipped = false;
                return token;
            }
            if (needInput)
                readInput();
            else
                throwFor();
        }
    }

    /**
     * The remove operation is not supported by this implementation of
     * {@code Iterator}.
     *
     * @throws UnsupportedOperationException if this method is invoked.
     * @see java.util.Iterator
     */
    public void remove() {
        throw new UnsupportedOperationException();
    }

    /**
     * Returns true if the next token matches the pattern constructed from the
     * specified string. The scanner does not advance past any input.
     *
     * <p> An invocation of this method of the form {@code hasNext(pattern)}
     * behaves in exactly the same way as the invocation
     * {@code hasNext(Pattern.compile(pattern))}.
     *
     * @param pattern a string specifying the pattern to scan
     * @return true if and only if this scanner has another token matching
     *         the specified pattern
     * @throws IllegalStateException if this scanner is closed
     */
    public boolean hasNext(String pattern) {
        return hasNext(patternCache.forName(pattern));
    }

    /**
     * Returns the next token if it matches the pattern constructed from the
     * specified string. If the match is successful, the scanner advances
     * past the input that matched the pattern.
     *
     * <p> An invocation of this method of the form {@code next(pattern)}
     * behaves in exactly the same way as the invocation
     * {@code next(Pattern.compile(pattern))}.
     *
     * @param pattern a string specifying the pattern to scan
     * @return the next token
     * @throws NoSuchElementException if no such tokens are available
     * @throws IllegalStateException if this scanner is closed
     */
    public String next(String pattern) {
        return next(patternCache.forName(pattern));
    }

    /**
     * Returns true if the next complete token matches the specified pattern.
     * A complete token is prefixed and postfixed by input that matches
     * the delimiter pattern. This method may block while waiting for input.
     * The scanner does not advance past any input.
     *
     * @param pattern the pattern to scan for
     * @return true if and only if this scanner has another token matching
     *         the specified pattern
     * @throws IllegalStateException if this scanner is closed
     */
    public boolean hasNext(Pattern pattern) {
        ensureOpen();
        if (pattern == null)
            throw new NullPointerException();
        hasNextPattern = null;
        saveState();
        modCount++;

        while (true) {
            if (getCompleteTokenInBuffer(pattern) != null) {
                matchValid = true;
                cacheResult();
                return revertState(true);
            }
            if (needInput)
                readInput();
            else
                return revertState(false);
        }
    }

    /**
     * Returns the next token if it matches the specified pattern. This
     * method may block while waiting for input to scan, even if a previous
     * invocation of {@link #hasNext(Pattern)} returned {@code true}.
     * If the match is successful, the scanner advances past the input that
     * matched the pattern.
     *
     * @param pattern the pattern to scan for
     * @return the next token
     * @throws NoSuchElementException if no more tokens are available
     * @throws IllegalStateException if this scanner is closed
     */
    public String next(Pattern pattern) {
        ensureOpen();
        if (pattern == null)
            throw new NullPointerException();

        modCount++;
        // Did we already find this pattern?
        if (hasNextPattern == pattern)
            return getCachedResult();
        clearCaches();

        // Search for the pattern
        while (true) {
            String token = getCompleteTokenInBuffer(pattern);
            if (token != null) {
                matchValid = true;
                skipped = false;
                return token;
            }
            if (needInput)
                readInput();
            else
                throwFor();
        }
    }
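
/*
Usage sketch (not part of the original source). Guarding next(Pattern) with
hasNext(Pattern) on the same Pattern instance is one way to consume only the
tokens that match. The token format and the class name ScannerPatternDemo are
hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

final class ScannerPatternDemo {
    // Collects leading tokens that fully match the pattern and stops at the first mismatch.
    static List<String> takeWhileMatching(String input, Pattern pattern) {
        List<String> matched = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            while (sc.hasNext(pattern)) {
                matched.add(sc.next(pattern));
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        Pattern id = Pattern.compile("id\\d+");
        System.out.println(takeWhileMatching("id1 id22 stop id3", id));
    }
}
```
*/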

    /**
     * Returns true if there is another line in the input of this scanner.
     * This method may block while waiting for input. The scanner does not
     * advance past any input.
     *
     * @return true if there is a line separator in the remaining input
     *         or if the input has other remaining characters
     * @throws IllegalStateException if this scanner is closed
     */
    public boolean hasNextLine() {
        saveState();

        modCount++;
        String result = findWithinHorizon(linePattern(), 0);
        if (result != null) {
            MatchResult mr = this.match();
            String lineSep = mr.group(1);
            if (lineSep != null) {
                result = result.substring(0, result.length() -
                                          lineSep.length());
                cacheResult(result);
            } else {
                cacheResult();
            }
        }
        revertState();
        return (result != null);
    }

    /**
     * Advances this scanner past the current line and returns the input
     * that was skipped.
     *
     * This method returns the rest of the current line, excluding any line
     * separator at the end. The position is set to the beginning of the next
     * line.
     *
     * <p>Since this method continues to search through the input looking
     * for a line separator, it may buffer all of the input searching for
     * the line to skip if no line separators are present.
     *
     * @return the line that was skipped
     * @throws NoSuchElementException if no line was found
     * @throws IllegalStateException if this scanner is closed
     */
    public String nextLine() {
        modCount++;
        if (hasNextPattern == linePattern())
            return getCachedResult();
        clearCaches();

        String result = findWithinHorizon(linePattern, 0);
        if (result == null)
            throw new NoSuchElementException("No line found");
        MatchResult mr = this.match();
        String lineSep = mr.group(1);
        if (lineSep != null)
            result = result.substring(0, result.length() - lineSep.length());
        if (result == null)
            throw new NoSuchElementException();
        else
            return result;
    }
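
/*
Usage sketch (not part of the original source). Line-oriented reading with
hasNextLine and nextLine keeps empty lines and strips the separators, which the
following sketch illustrates on an in-memory string. ScannerLineDemo is a
hypothetical class name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

final class ScannerLineDemo {
    // Returns every line of the text without its line separator.
    static List<String> lines(String text) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNextLine()) {
                out.add(sc.nextLine());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(lines("a\n\nb"));
    }
}
```
*/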

    // Public methods that ignore delimiters

    /**
     * Attempts to find the next occurrence of a pattern constructed from the
     * specified string, ignoring delimiters.
     *
     * <p>An invocation of this method of the form {@code findInLine(pattern)}
     * behaves in exactly the same way as the invocation
     * {@code findInLine(Pattern.compile(pattern))}.
     *
     * @param pattern a string specifying the pattern to search for
     * @return the text that matched the specified pattern
     * @throws IllegalStateException if this scanner is closed
     */
    public String findInLine(String pattern) {
        return findInLine(patternCache.forName(pattern));
    }

    /**
     * Attempts to find the next occurrence of the specified pattern ignoring
     * delimiters. If the pattern is found before the next line separator, the
     * scanner advances past the input that matched and returns the string that
     * matched the pattern.
     * If no such pattern is detected in the input up to the next line
     * separator, then {@code null} is returned and the scanner's
     * position is unchanged. This method may block waiting for input that
     * matches the pattern.
     *
     * <p>Since this method continues to search through the input looking
     * for the specified pattern, it may buffer all of the input searching for
     * the desired token if no line separators are present.
     *
     * @param pattern the pattern to scan for
     * @return the text that matched the specified pattern
     * @throws IllegalStateException if this scanner is closed
     */
    public String findInLine(Pattern pattern) {
        ensureOpen();
        if (pattern == null)
            throw new NullPointerException();
        clearCaches();
        modCount++;
        // Expand buffer to include the next newline or end of input
        int endPosition = 0;
        saveState();
        while (true) {
            if (findPatternInBuffer(separatorPattern(), 0)) {
                endPosition = matcher.start();
                break; // up to next newline
            }
            if (needInput) {
                readInput();
            } else {
                endPosition = buf.limit();
                break; // up to end of input
            }
        }
        revertState();
        int horizonForLine = endPosition - position;
        // If there is nothing between the current pos and the next
        // newline simply return null, invoking findWithinHorizon
        // with "horizon=0" will scan beyond the line bound.
        if (horizonForLine == 0)
            return null;
        // Search for the pattern
        return findWithinHorizon(pattern, horizonForLine);
    }
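
/*
Usage sketch (not part of the original source). The line-bounded search described
above means a pattern that only occurs on a later line is not found until the
scanner has moved past the current line. The input text and the class name
ScannerFindInLineDemo are illustrative assumptions.

```java
import java.util.Scanner;

final class ScannerFindInLineDemo {
    // Searches for digits on the first line, then skips that line and searches again.
    // Returns { result on first line, skipped line, result on second line }.
    static String[] searchTwice(String text) {
        try (Scanner sc = new Scanner(text)) {
            String first = sc.findInLine("\\d+");    // null if the first line has no digits
            String skipped = sc.nextLine();          // rest of the current line
            String second = sc.findInLine("\\d+");   // now searches the second line
            return new String[] { first, skipped, second };
        }
    }

    public static void main(String[] args) {
        for (String part : searchTwice("no digits here\n42")) {
            System.out.println(part);
        }
    }
}
```
*/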
/** * Attempts to find the next occurrence of a pattern constructed from the * specified string, ignoring delimiters. * * <p>An invocation of this method of the form