Java FileReader and InputStreamReader Source Code Analysis

Published: 2024/9/21 · java · 豆豆

FileReader

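As mentioned earlier when covering FileInputStream, that class reads bytes from a file; to read characters from a file, use FileReader. FileReader is a convenience class for reading character files. In the source analyzed here, its constructors can only use the default character encoding (the platform default charset) and the default byte-buffer size of 8 KB. To specify these values yourself, construct an InputStreamReader directly on a FileInputStream instead of using FileReader.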
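A minimal sketch of that last point, assuming UTF-8 is the charset you want and using a temporary file as input. The class name ExplicitCharsetRead and the helper readAll are illustrative, not part of the JDK.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class ExplicitCharsetRead {

    // Reads a whole file as UTF-8 text, independent of the platform default charset.
    static String readAll(File file) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new FileInputStream(file), StandardCharsets.UTF_8)) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf, 0, buf.length)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("reader-demo", ".txt");
        tmp.deleteOnExit();
        // Unicode escapes keep this source file encoding-neutral.
        Files.write(tmp.toPath(), "h\u00e9llo, \u4e16\u754c".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(tmp));
    }
}
```

The same FileInputStream wrapped in a plain FileReader(File) would be decoded with whatever the platform default happens to be.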

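FileReader's own code leaves little to analyze: it consists of just the few lines below, and all of its work is done by the parent class. Each constructor takes a file path name, a File object, or a file descriptor, builds a FileInputStream from it, and hands that to the parent; the character stream is a conversion layered on top of the byte stream.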

public class FileReader extends InputStreamReader {

    public FileReader(String fileName) throws FileNotFoundException {
        super(new FileInputStream(fileName));
    }

    public FileReader(File file) throws FileNotFoundException {
        super(new FileInputStream(file));
    }

    public FileReader(FileDescriptor fd) {
        super(new FileInputStream(fd));
    }
}

InputStreamReader

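Next, look at FileReader's parent class, InputStreamReader. It is the bridge from byte streams to character streams: it reads bytes and decodes them into characters using a specific charset. The charset may be specified by name, given explicitly as an object, or default to the platform's default charset.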

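Each call to one of InputStreamReader's read methods may cause one or more bytes to be read from the underlying byte input stream. To convert bytes to characters efficiently, more bytes may be read ahead from the stream than the current read operation strictly needs.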

For better efficiency, consider wrapping an InputStreamReader in a BufferedReader, for example: BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
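A small sketch of the wrapping pattern, using an in-memory ByteArrayInputStream instead of System.in so it can run unattended. BufferedLines and readLines are illustrative names.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class BufferedLines {

    // Wraps the InputStreamReader in a BufferedReader and reads line by line.
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "first\nsecond\nthird".getBytes(StandardCharsets.UTF_8);
        System.out.println(readLines(new ByteArrayInputStream(data)));
    }
}
```

Besides buffering decoded characters, BufferedReader adds readLine, which InputStreamReader itself does not offer.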

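InputStreamReader extends the abstract class Reader. Reader implements some concrete methods that InputStreamReader does not override, such as skip. InputStreamReader has one core field, a StreamDecoder, whose job is to convert the input bytes into characters; it is analyzed in detail later.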

InputStreamReader has four constructor overloads. The InputStream argument is required; the optional second argument selects the charset, either by name, as a Charset object, or as a CharsetDecoder.

// Creates an InputStreamReader that uses the default charset
public InputStreamReader(InputStream in) {
    super(in); // the Reader's lock is the InputStream
    try {
        sd = StreamDecoder.forInputStreamReader(in, this, (String)null); // ## check lock object
    } catch (UnsupportedEncodingException e) {
        // The default encoding should always be available,
        // so this constructor does not declare the exception
        throw new Error(e);
    }
}

// Creates an InputStreamReader that uses the named charset
public InputStreamReader(InputStream in, String charsetName)
    throws UnsupportedEncodingException
{
    super(in);
    if (charsetName == null)
        throw new NullPointerException("charsetName");
    sd = StreamDecoder.forInputStreamReader(in, this, charsetName);
}

// Creates an InputStreamReader that uses the given charset
public InputStreamReader(InputStream in, Charset cs) {
    super(in);
    if (cs == null)
        throw new NullPointerException("charset");
    sd = StreamDecoder.forInputStreamReader(in, this, cs);
}

// Creates an InputStreamReader that uses the given charset decoder
public InputStreamReader(InputStream in, CharsetDecoder dec) {
    super(in);
    if (dec == null)
        throw new NullPointerException("charset decoder");
    sd = StreamDecoder.forInputStreamReader(in, this, dec);
}
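The four overloads can be exercised as follows. This is only a sketch: ConstructorChoices, sample, and isRejected are illustrative names, and "no-such-charset" is assumed not to be installed on the running JVM.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

class ConstructorChoices {

    static InputStream sample() {
        return new ByteArrayInputStream("abc".getBytes(StandardCharsets.US_ASCII));
    }

    // Returns true if the String-based constructor rejects the charset name.
    static boolean isRejected(String charsetName) {
        try {
            new InputStreamReader(sample(), charsetName).close();
            return false;
        } catch (UnsupportedEncodingException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // 1. platform default charset
        new InputStreamReader(sample()).close();
        // 2. charset by name (the only overload with a checked exception)
        new InputStreamReader(sample(), "UTF-8").close();
        // 3. Charset object
        new InputStreamReader(sample(), StandardCharsets.UTF_8).close();
        // 4. caller-supplied decoder
        new InputStreamReader(sample(), StandardCharsets.UTF_8.newDecoder()).close();

        System.out.println("unknown name rejected: " + isRejected("no-such-charset"));
    }
}
```

Passing a Charset or CharsetDecoder avoids the checked UnsupportedEncodingException altogether, which is why StandardCharsets constants are usually the more convenient choice.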

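getEncoding delegates to StreamDecoder's getEncoding() and returns the name of the character encoding this stream uses. If the encoding has a historical name, that name is returned; otherwise the canonical name is returned. If this object was created with InputStreamReader(InputStream, String), the returned name may differ from the name passed to the constructor. If the stream has been closed, null is returned.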

public String getEncoding() {
    return sd.getEncoding();
}
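A quick way to observe this behavior; EncodingName and names are illustrative. The exact string returned while the stream is open depends on the JDK (for UTF-8 it is commonly the historical name UTF8, but treat that as an assumption), so the sketch only prints it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

class EncodingName {

    // Returns { name while open, name after close }.
    static String[] names() throws IOException {
        InputStreamReader reader = new InputStreamReader(
                new ByteArrayInputStream(new byte[0]), "UTF-8");
        String whileOpen = reader.getEncoding();
        reader.close();
        String afterClose = reader.getEncoding(); // documented to be null once closed
        return new String[] { whileOpen, afterClose };
    }

    public static void main(String[] args) throws IOException {
        String[] n = names();
        System.out.println("requested UTF-8, open: " + n[0] + ", closed: " + n[1]);
    }
}
```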

The read, ready, and close methods all call the corresponding StreamDecoder methods directly.

// Reads a single character
public int read() throws IOException {
    return sd.read();
}

// Reads characters into an array and returns the number of characters read,
// or -1 if the end of the stream had already been reached
public int read(char cbuf[], int offset, int length) throws IOException {
    return sd.read(cbuf, offset, length);
}

// Tells whether this stream is ready to be read. An InputStreamReader is ready
// if its input buffer is not empty or if bytes can be read from the underlying byte stream
public boolean ready() throws IOException {
    return sd.ready();
}

// Closes the input stream and releases resources
public void close() throws IOException {
    sd.close();
}

Reader

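Now for InputStreamReader's parent class, Reader, the abstract class for reading character streams. The only methods a subclass must implement are read(char[], int, int) and close(), although most subclasses override some of the other methods defined here for higher efficiency or additional functionality. Reader implements two interfaces: Readable, a source of characters that can be read into a CharBuffer, and Closeable, which means an instance can be closed to release resources such as open files. From the analysis of this class and InputStreamReader, InputStreamReader uses Reader's skip method as is, which reads characters into memory and discards them. That is less efficient than an implementation that can seek directly; it also cannot skip backwards, and mark/reset is not available.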

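Reader has a field named lock, the object used to synchronize operations on the stream. For efficiency, a character stream object may use an object other than itself to protect critical sections, so subclasses should use this object rather than this or a synchronized method. In the constructors shown earlier, InputStreamReader passes the InputStream to Reader as its own lock (super(in)) and passes itself (this) as the lock for its StreamDecoder.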

The constructor can take a specific lock object; if none is passed, the object itself is used.

protected Object lock;

// Creates a new character-stream reader whose critical sections synchronize on the reader itself
protected Reader() {
    this.lock = this;
}

// Creates a new character-stream reader whose critical sections synchronize on the given object
protected Reader(Object lock) {
    if (lock == null) {
        throw new NullPointerException();
    }
    this.lock = lock;
}

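Depending on the overload, read can read a single character, or several characters into an array or into a CharBuffer (an abstract class). In the end all of these are built on the abstract method read(char cbuf[], int off, int len), which is one of the methods a subclass must implement.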

// Attempts to read characters into the given character buffer. The buffer is used as a
// repository of characters as is: the only changes are the results of a put operation;
// no flipping or rewinding is performed.
public int read(java.nio.CharBuffer target) throws IOException {
    int len = target.remaining(); // read as much as the buffer has room for
    char[] cbuf = new char[len];
    int n = read(cbuf, 0, len); // read into a char array; implemented by the subclass; n = chars read
    if (n > 0)
        target.put(cbuf, 0, n); // copy into the CharBuffer; details depend on the CharBuffer subclass
    return n;
}

// Reads a single character. Blocks until a character is available,
// an I/O error occurs, or the end of the stream is reached
public int read() throws IOException {
    char cb[] = new char[1];
    if (read(cb, 0, 1) == -1)
        return -1;
    else
        return cb[0];
}

// Reads characters into an array
public int read(char cbuf[]) throws IOException {
    return read(cbuf, 0, cbuf.length);
}

abstract public int read(char cbuf[], int off, int len) throws IOException;
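To illustrate that a subclass only has to supply read(char[], int, int) and close(), here is a minimal sketch of a Reader over a String. StringSourceReader is a hypothetical class written for this example (the JDK already ships StringReader for real use); it guards its state with the inherited lock field.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.CharBuffer;

class StringSourceReader extends Reader {
    private final String source;
    private int next = 0;
    private boolean closed = false;

    StringSourceReader(String source) {
        super(); // lock defaults to this reader
        this.source = source;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        synchronized (lock) {
            if (closed) throw new IOException("Stream closed");
            if (off < 0 || len < 0 || len > cbuf.length - off) {
                throw new IndexOutOfBoundsException();
            }
            if (len == 0) return 0;
            if (next >= source.length()) return -1; // end of stream
            int n = Math.min(len, source.length() - next);
            source.getChars(next, next + n, cbuf, off);
            next += n;
            return n;
        }
    }

    @Override
    public void close() {
        synchronized (lock) {
            closed = true;
        }
    }

    public static void main(String[] args) throws IOException {
        Reader r = new StringSourceReader("hello");
        int first = r.read();                 // inherited read()
        CharBuffer target = CharBuffer.allocate(3);
        int n = r.read(target);               // inherited read(CharBuffer)
        target.flip();
        System.out.println((char) first + " then " + n + " chars: " + target);
    }
}
```

Both inherited overloads end up in the one overridden method, which is the design the listing above describes.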

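Reader provides a default skip implementation that skips n characters; it cannot skip backwards, and InputStreamReader does not override it. It works by repeatedly reading characters into the skipBuffer array, at most 8192 characters per read, and discarding them until n characters have been skipped or the end of the stream is reached. Every skipped character is still read and decoded, and the scratch buffer adds to the garbage collector's workload, so it is not efficient.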

/** Maximum skip-buffer size: 8K chars */
private static final int maxSkipBufferSize = 8192;

/** Skip buffer, null until allocated */
private char skipBuffer[] = null;

public long skip(long n) throws IOException {
    if (n < 0L)
        // A negative count is rejected, so skipping backwards is impossible;
        // this differs from FileInputStream
        throw new IllegalArgumentException("skip value is negative");
    int nn = (int) Math.min(n, maxSkipBufferSize); // chunk size, capped at 8192 chars
    synchronized (lock) { // skip is synchronized
        if ((skipBuffer == null) || (skipBuffer.length < nn))
            skipBuffer = new char[nn];
        long r = n;
        while (r > 0) {
            int nc = read(skipBuffer, 0, (int)Math.min(r, nn)); // skip by reading and discarding
            if (nc == -1)
                break;
            r -= nc;
        }
        return n - r;
    }
}
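The following sketch shows skip from the caller's side on an InputStreamReader. SkipDemo and its helpers are illustrative names. Note that skip counts characters, not bytes: the two-byte UTF-8 sequence for é counts as one.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class SkipDemo {

    // Skips toSkip characters, then reads the rest; returns "<skipped>:<rest>".
    static String skipThenRead(String text, long toSkip) throws IOException {
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)),
                StandardCharsets.UTF_8);
        long skipped = reader.skip(toSkip);
        StringBuilder rest = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            rest.append((char) c);
        }
        return skipped + ":" + rest;
    }

    // A negative count is rejected rather than treated as a backwards skip.
    static boolean negativeSkipRejected() throws IOException {
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(new byte[0]), StandardCharsets.UTF_8);
        try {
            reader.skip(-1);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(skipThenRead("h\u00e9llo", 2));
        System.out.println("negative skip rejected: " + negativeSkipRejected());
    }
}
```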

ready reports whether the stream is ready to be read; unless overridden, it always returns false.

public boolean ready() throws IOException {
    return false;
}

By default Reader does not support mark/reset either, so the following methods cannot be used unless they are overridden.

// Tells whether this stream supports the mark() operation
public boolean markSupported() {
    return false;
}

// Marks the current position; reset() can later jump back to it
public void mark(int readAheadLimit) throws IOException {
    throw new IOException("mark() not supported");
}

public void reset() throws IOException {
    throw new IOException("reset() not supported");
}
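Since InputStreamReader inherits these defaults, one common workaround (a suggestion of this sketch, not something the source code prescribes) is to wrap it in a BufferedReader, which does implement mark/reset. MarkResetDemo and its helpers are illustrative names.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class MarkResetDemo {

    static Reader raw(String text) {
        return new InputStreamReader(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)),
                StandardCharsets.UTF_8);
    }

    // True if calling mark() on a bare InputStreamReader fails with IOException.
    static boolean rawMarkFails() {
        try {
            raw("abc").mark(1);
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    // Peeks at the first `peek` chars, rewinds, then reads the whole line.
    static String peekThenReadLine(String text, int peek) throws IOException {
        try (BufferedReader reader = new BufferedReader(raw(text))) {
            reader.mark(peek);
            char[] head = new char[peek];
            int n = reader.read(head, 0, peek);
            reader.reset();
            return new String(head, 0, Math.max(n, 0)) + "|" + reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("raw markSupported: " + raw("abc").markSupported());
        System.out.println("raw mark fails: " + rawMarkFails());
        System.out.println(peekThenReadLine("hello", 2));
    }
}
```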

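close closes the stream and releases the associated system resources. Once the stream is closed, further operations throw an IOException. Closing an already closed stream has no effect. It is one of the two methods that must be implemented.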

abstract public void close() throws IOException;

StreamDecoder

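The last class to analyze is sun.nio.cs.StreamDecoder. The source of this package is not shipped with Oracle JDK, so I looked up the corresponding source in OpenJDK 8. This class is the decoder that turns bytes into characters, and it extends the abstract class Reader. Because Eclipse cannot set breakpoints in code under the sun packages, I could only debug by reading the code, so details of the analysis below may be slightly off, though the overall picture should be sound.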

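StreamDecoder always tries to read at least two characters at a time. If the caller needs only one character, the second one is cached in leftoverChar and included in what is returned on the next read. For that reason, an object built on StreamDecoder can only skip by reading and discarding; otherwise the content would get scrambled.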

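// To handle surrogate pairs properly we must never try to produce fewer than
// two characters at a time. If we are only asked to return one character,
// the other is saved here to be returned later.
private boolean haveLeftoverChar = false;
private char leftoverChar;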
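The effect is visible from the outside when a supplementary character is read one char at a time. In this sketch (SurrogatePairRead and readChars are illustrative names), U+1F600 is encoded as four UTF-8 bytes and comes back as two consecutive read() results, a high surrogate followed by a low surrogate.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class SurrogatePairRead {

    // Reads text.length() UTF-16 code units, one read() call each.
    static int[] readChars(String text) throws IOException {
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)),
                StandardCharsets.UTF_8);
        int[] units = new int[text.length()];
        for (int i = 0; i < units.length; i++) {
            units[i] = reader.read();
        }
        return units;
    }

    public static void main(String[] args) throws IOException {
        String smiley = new String(Character.toChars(0x1F600));
        int[] units = readChars(smiley);
        System.out.printf("first: %04X (high surrogate: %b)%n",
                units[0], Character.isHighSurrogate((char) units[0]));
        System.out.printf("second: %04X (low surrogate: %b)%n",
                units[1], Character.isLowSurrogate((char) units[1]));
    }
}
```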

StreamDecoder only works while the stream is open; operations after it has been closed throw an IOException.

private volatile boolean isOpen = true; // whether the stream is open

private void ensureOpen() throws IOException {
    if (!isOpen)
        throw new IOException("Stream closed");
}

StreamDecoder's own constructors are package-private, so classes outside the package cannot call them directly.

private Charset cs;
private CharsetDecoder decoder;
private ByteBuffer bb;

// Exactly one of these is non-null
private InputStream in;
private ReadableByteChannel ch;

StreamDecoder(InputStream in, Object lock, Charset cs) {
    this(in, lock,
         cs.newDecoder()
           // On malformed input, the decoder drops the bad input, substitutes
           // the replacement value, and carries on
           .onMalformedInput(CodingErrorAction.REPLACE)
           // On an unmappable character, it does the same
           .onUnmappableCharacter(CodingErrorAction.REPLACE));
}

// The constructor that does the actual work
StreamDecoder(InputStream in, Object lock, CharsetDecoder dec) {
    super(lock); // for InputStreamReader, lock is the InputStreamReader object itself
    this.cs = dec.charset();
    this.decoder = dec;

    // This path is disabled until direct buffers are faster. In practice the block
    // is dead code, because off-heap memory is slower to operate on than heap memory
    if (false && in instanceof FileInputStream) {
        ch = getChannel((FileInputStream) in);
        if (ch != null)
            bb = ByteBuffer.allocateDirect(DEFAULT_BYTE_BUFFER_SIZE);
    }
    if (ch == null) {
        this.in = in;
        this.ch = null;
        bb = ByteBuffer.allocate(DEFAULT_BYTE_BUFFER_SIZE); // an 8 KB heap ByteBuffer
    }
    bb.flip(); // so that bb is initially empty
    /*
     * flip does two things:
     * 1. sets limit to the current position
     * 2. sets position to 0
     * The data to process is then whatever lies between position and limit,
     * that is, the data that was just read in. Right after allocate the
     * position is 0, so flipping leaves nothing to read.
     */
}

// Reads data from a ReadableByteChannel
StreamDecoder(ReadableByteChannel ch, CharsetDecoder dec, int mbc) {
    this.in = null;
    this.ch = ch;
    this.decoder = dec;
    this.cs = dec.charset();
    // mbc is the minimum ByteBuffer capacity: a negative value selects the 8 KB
    // default, and a value below 32 bytes is raised to 32 bytes
    this.bb = ByteBuffer.allocate(mbc < 0
                                  ? DEFAULT_BYTE_BUFFER_SIZE
                                  : (mbc < MIN_BYTE_BUFFER_SIZE
                                     ? MIN_BYTE_BUFFER_SIZE
                                     : mbc));
    bb.flip();
}
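The REPLACE actions configured in the Charset-based constructor can be observed from outside, and so can the alternative of supplying your own decoder. In this sketch (MalformedInputDemo and its helpers are illustrative names) the byte 0xFF, which is never valid in UTF-8, is either replaced by U+FFFD or reported as an error.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class MalformedInputDemo {

    static final byte[] BAD = { 'A', (byte) 0xFF, 'B' };

    static String drain(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[16];
        int n;
        while ((n = reader.read(buf, 0, buf.length)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    // Charset overload: malformed input is silently replaced.
    static String decodeWithReplace() throws IOException {
        return drain(new InputStreamReader(
                new ByteArrayInputStream(BAD), StandardCharsets.UTF_8));
    }

    // CharsetDecoder overload: the caller decides; REPORT turns bad bytes into exceptions.
    static boolean strictDecoderFails() throws IOException {
        CharsetDecoder strict = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            drain(new InputStreamReader(new ByteArrayInputStream(BAD), strict));
            return false;
        } catch (CharacterCodingException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        String replaced = decodeWithReplace();
        System.out.println("middle char is U+FFFD: " + (replaced.charAt(1) == '\uFFFD'));
        System.out.println("strict decoder fails: " + strictDecoderFails());
    }
}
```

If silent replacement would hide data corruption in your application, passing a REPORT decoder is the way to opt out of it.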

As seen in InputStreamReader, construction goes through the factory method StreamDecoder.forInputStreamReader(in, this, charsetName), which calls a constructor to build the StreamDecoder object. forInputStreamReader serves InputStreamReader, while forDecoder serves java.nio.channels.Channels.newReader.

// Factories for java.io.InputStreamReader
public static StreamDecoder forInputStreamReader(InputStream in, Object lock, String charsetName)
    throws UnsupportedEncodingException
{
    String csn = charsetName;
    if (csn == null)
        csn = Charset.defaultCharset().name(); // no name given: use the platform default charset
    try {
        if (Charset.isSupported(csn))
            return new StreamDecoder(in, lock, Charset.forName(csn)); // lock is the InputStreamReader itself
    } catch (IllegalCharsetNameException x) { }
    throw new UnsupportedEncodingException(csn);
}

public static StreamDecoder forInputStreamReader(InputStream in, Object lock, Charset cs) {
    return new StreamDecoder(in, lock, cs);
}

public static StreamDecoder forInputStreamReader(InputStream in, Object lock, CharsetDecoder dec) {
    return new StreamDecoder(in, lock, dec);
}

// Factory for java.nio.channels.Channels.newReader
public static StreamDecoder forDecoder(ReadableByteChannel ch, CharsetDecoder dec, int minBufferCap) {
    return new StreamDecoder(ch, dec, minBufferCap);
}
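The channel-based path is reachable through the public API Channels.newReader. The sketch below (ChannelReaderDemo and readViaChannel are illustrative names) wraps an in-memory stream in a channel purely for demonstration; passing -1 as the minimum buffer capacity asks for the implementation's default.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.Reader;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;

class ChannelReaderDemo {

    // Decodes everything available from a channel as UTF-8.
    static String readViaChannel(byte[] data) throws IOException {
        ReadableByteChannel channel = Channels.newChannel(new ByteArrayInputStream(data));
        StringBuilder sb = new StringBuilder();
        try (Reader reader = Channels.newReader(
                channel, StandardCharsets.UTF_8.newDecoder(), -1)) {
            char[] buf = new char[64];
            int n;
            while ((n = reader.read(buf, 0, buf.length)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "via channel \u2713".getBytes(StandardCharsets.UTF_8);
        System.out.println(readViaChannel(data));
    }
}
```

Closing the returned Reader also closes the channel, matching the implClose logic in StreamDecoder.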

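read is synchronized on the lock and can read either a single character or several. It first checks for a cached character; if nothing beyond the cached character is needed it returns right away, otherwise implRead(char[], int, int) does the actual reading.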

public int read() throws IOException {
    return read0();
}

@SuppressWarnings("fallthrough")
private int read0() throws IOException {
    synchronized (lock) {

        // If a character is cached, return it and clear the cache
        if (haveLeftoverChar) {
            haveLeftoverChar = false;
            return leftoverChar;
        }

        // Convert more bytes
        char cb[] = new char[2];
        int n = read(cb, 0, 2); // try to read two characters
        switch (n) {
        case -1:
            return -1; // end of stream
        case 2:
            leftoverChar = cb[1]; // two characters read: cache the second one
            haveLeftoverChar = true;
            // FALL THROUGH into case 1
        case 1:
            return cb[0]; // return the first character
        default:
            assert false : n;
            return -1;
        }
    }
}

public int read(char cbuf[], int offset, int length) throws IOException {
    int off = offset;
    int len = length;
    synchronized (lock) { // only one thread at a time can be reading
        ensureOpen(); // make sure the stream is open
        if ((off < 0) || (off > cbuf.length) || (len < 0) ||
            ((off + len) > cbuf.length) || ((off + len) < 0)) {
            throw new IndexOutOfBoundsException();
        }
        if (len == 0)
            return 0;

        int n = 0;

        if (haveLeftoverChar) {
            // Copy the cached leftover character into the array
            cbuf[off] = leftoverChar;
            off++; len--;
            haveLeftoverChar = false;
            n = 1;
            if ((len == 0) || !implReady())
                // Return if no more characters are wanted or none are available without blocking
                return n;
        }

        if (len == 1) {
            // A one-character read goes through read0: read two, cache one
            int c = read0();
            if (c == -1)
                return (n == 0) ? -1 : n;
            cbuf[off] = (char)c;
            return n + 1;
        }

        // A call coming from read0() (len == 2, nothing cached) skips both if blocks
        // above; the result is the number of characters actually read
        return n + implRead(cbuf, off, off + len);
    }
}

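read calls implRead to do the reading. implRead loops: CharsetDecoder.decode decodes whatever is in the ByteBuffer into characters, and whenever the decoder runs out of input, readBytes pulls as many bytes as possible into the ByteBuffer.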

int implRead(char[] cbuf, int off, int end) throws IOException {

    // To handle surrogate pairs, this method requires that the caller try to
    // read at least two characters, saving the extra character if there is one;
    // that is easier to deal with at the higher level than here.
    assert (end - off > 1);

    CharBuffer cb = CharBuffer.wrap(cbuf, off, end - off);
    if (cb.position() != 0)
        // Ensure that cb[0] == cbuf[off]
        cb = cb.slice();

    // The third argument of decode may be true only when the caller is sure
    // there are no more bytes beyond those already in the buffer
    boolean eof = false;
    for (;;) {
        CoderResult cr = decoder.decode(bb, cb, eof); // decode the bytes in bb into cb
        if (cr.isUnderflow()) { // underflow: the decoder needs more input bytes
            if (eof)
                break;
            if (!cb.hasRemaining())
                break;
            if ((cb.position() > 0) && !inReady())
                break; // block at most once
            int n = readBytes(); // refill bb with as many bytes as possible; negative at end of stream
            if (n < 0) {
                eof = true; // end of stream reached
                if ((cb.position() == 0) && (!bb.hasRemaining()))
                    break;
                decoder.reset(); // reset the decoder, clearing its internal state
            }
            continue;
        }
        if (cr.isOverflow()) { // overflow: cb is full
            assert cb.position() > 0;
            break;
        }
        cr.throwException();
    }

    if (eof) {
        // ## Need to flush decoder
        decoder.reset();
    }

    if (cb.position() == 0) {
        if (eof)
            return -1;
        assert false;
    }
    return cb.position();
}

private int readBytes() throws IOException {
    // Move any unread bytes (position..limit) to the front of the buffer;
    // afterwards position = number of unread bytes and limit = capacity
    bb.compact();
    try {
        if (ch != null) {
            // Read from the channel until bb is full or no data is left
            int n = ch.read(bb);
            if (n < 0)
                return n;
        } else {
            // Read from the input stream, then update the buffer
            int lim = bb.limit();
            int pos = bb.position();
            assert (pos <= lim); // fails (when assertions are enabled) if pos > lim
            int rem = (pos <= lim ? lim - pos : 0); // free space left in the buffer
            assert rem > 0;
            // Read up to rem bytes from the InputStream into the buffer's backing array
            int n = in.read(bb.array(), bb.arrayOffset() + pos, rem);
            if (n < 0)
                return n;
            if (n == 0)
                // 0 bytes is treated as an error; an exhausted stream returns -1
                throw new IOException("Underlying input stream returned zero bytes");
            assert (n <= rem) : "n = " + n + ", rem = " + rem;
            bb.position(pos + n);
        }
    } finally {
        // Flip even when an IOException is thrown,
        // otherwise the stream will stutter
        bb.flip();
    }

    int rem = bb.remaining();
    assert (rem != 0) : rem;
    return rem;
}
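The decode/refill cycle can be reproduced with public java.nio APIs. The following is a simplified sketch, not StreamDecoder's actual code: ChunkedDecodeSketch, decodeAll, and drain are illustrative names, the byte buffer is deliberately tiny so that multi-byte sequences get split across refills, and the output is collected in a StringBuilder instead of a caller-supplied array.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class ChunkedDecodeSketch {

    // Decodes the whole stream as UTF-8 through a small byte buffer.
    static String decodeAll(InputStream in, int byteBufferSize) throws IOException {
        if (byteBufferSize < 4) {
            throw new IllegalArgumentException("buffer must hold a full 4-byte UTF-8 sequence");
        }
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        ByteBuffer bb = ByteBuffer.allocate(byteBufferSize);
        bb.flip();                                   // start out empty
        CharBuffer cb = CharBuffer.allocate(8);
        StringBuilder out = new StringBuilder();
        boolean eof = false;

        while (true) {
            CoderResult cr = decoder.decode(bb, cb, eof);
            if (cr.isOverflow()) {
                drain(cb, out);                      // output full: empty it and retry
            } else if (cr.isUnderflow()) {
                if (eof) break;                      // all input consumed
                bb.compact();                        // keep undecoded tail bytes at the front
                int n = in.read(bb.array(), bb.arrayOffset() + bb.position(), bb.remaining());
                if (n < 0) eof = true;
                else bb.position(bb.position() + n);
                bb.flip();                           // back to read mode for the decoder
            } else {
                cr.throwException();                 // not reached with REPLACE, kept for safety
            }
        }
        drain(cb, out);
        decoder.flush(cb);                           // let the decoder emit any final output
        drain(cb, out);
        return out.toString();
    }

    private static void drain(CharBuffer cb, StringBuilder out) {
        cb.flip();
        out.append(cb);
        cb.clear();
    }

    public static void main(String[] args) throws IOException {
        String text = "a\u00e9\u4e16" + new String(Character.toChars(0x1F600)) + "z";
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeAll(new ByteArrayInputStream(bytes), 4).equals(text));
    }
}
```

The compact-then-flip pairing is the important part: compact preserves a partially received character across refills, and flip hands the buffer back to the decoder.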

ready is overridden; it checks whether the buffers or the underlying source have data that can be read.

public boolean ready() throws IOException {
    synchronized (lock) {
        ensureOpen();
        return haveLeftoverChar || implReady(); // a cached character, or more data available
    }
}

boolean implReady() {
    return bb.hasRemaining() || inReady(); // bytes left in the ByteBuffer, or in the input
}

private boolean inReady() {
    try {
        return (((in != null) && (in.available() > 0))
                || (ch instanceof FileChannel)); // ## RBC.available()?
    } catch (IOException x) {
        return false;
    }
}
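A small sketch of what ready reports over the lifetime of a stream, using an in-memory source so that available() is predictable. ReadyDemo and readyBeforeAndAfter are illustrative names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ReadyDemo {

    // Returns { ready() before any read, ready() after reading to end of stream }.
    static boolean[] readyBeforeAndAfter(String text) throws IOException {
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)),
                StandardCharsets.UTF_8);
        boolean before = reader.ready();
        while (reader.read() != -1) {
            // drain the stream
        }
        boolean after = reader.ready();
        return new boolean[] { before, after };
    }

    public static void main(String[] args) throws IOException {
        boolean[] r = readyBeforeAndAfter("abc");
        System.out.println("before: " + r[0] + ", after: " + r[1]);
    }
}
```

A false result only means a read might block or hit the end of the stream; it is not a reliable end-of-stream test on its own.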

close closes the ReadableByteChannel or the InputStream and then sets isOpen to false; calling it again has no effect.

public void close() throws IOException {
    synchronized (lock) {
        if (!isOpen)
            return;
        implClose();
        isOpen = false;
    }
}

void implClose() throws IOException {
    if (ch != null)
        ch.close();
    else
        in.close();
}
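From the caller's side, the close and ensureOpen logic looks like this. CloseDemo and readAfterCloseFails are illustrative names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class CloseDemo {

    // Closes twice, then checks that a read is refused.
    static boolean readAfterCloseFails() throws IOException {
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream("abc".getBytes(StandardCharsets.US_ASCII)),
                StandardCharsets.US_ASCII);
        reader.close();
        reader.close(); // second close is a no-op
        try {
            reader.read();
            return false;
        } catch (IOException e) {
            return true; // the decoder refuses to read once it is closed
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read after close fails: " + readAfterCloseFails());
    }
}
```

In everyday code a try-with-resources block makes the close call implicit and guarantees it runs exactly where the reader goes out of scope.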

getChannel, which obtains the file channel, has a fail-fast mechanism: once one call has failed, later calls return immediately.

// In the early stages of the build the NIO native code has not been built yet.
// This flag ensures that once the first attempt has caught an UnsatisfiedLinkError,
// later attempts fail immediately
private static volatile boolean channelsAvailable = true;

private static FileChannel getChannel(FileInputStream in) {
    if (!channelsAvailable)
        return null;
    try {
        return in.getChannel();
    } catch (UnsatisfiedLinkError x) {
        channelsAvailable = false;
        return null;
    }
}

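In short, StreamDecoder converts the bytes it reads into characters and returns them. In theory, apart from the decoding itself, which depends on the underlying charset implementation, rewriting the rest yourself should not be much of a problem. It reads at least two characters at a time in order to handle surrogate pairs. Whether using InputStreamReader is faster than reading through a byte stream and then converting with new String remains to be tested.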
