
Java I/O Extensions

Published: 2025/3/17 | java | 25 | 豆豆


Tags: Java basics

NIO

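Java's NIO (New I/O) serves the same purpose as traditional I/O: input and output. It goes about it differently, though. NIO supports memory-mapped files ("file" in the broad Unix sense of "everything is a file"): a file, or a region of a file, is mapped into memory (much like the operating system's virtual memory), so the program can access the file as if it were memory.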

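Channel and Buffer are the two core concepts in NIO:

Channel is NIO's counterpart to the traditional stream: all data in NIO travels through a Channel. The biggest difference from InputStream/OutputStream is that a file channel (FileChannel) offers a map() method that maps a block of data directly into memory. If traditional I/O is stream-oriented, NIO is block-oriented.
Buffer can be thought of as a container that behaves like an array. It sits between the Channel and the program: everything written to a Channel must first be placed in a Buffer (Buffer -> Channel), and everything read from a Channel lands in a Buffer first (Channel -> Buffer).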

Buffer

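Conceptually, java.nio.ByteBuffer is like an array: it holds multiple values of the same type. Buffer itself is only an abstract class; every primitive type except boolean has a corresponding Buffer class: ByteBuffer, CharBuffer, ShortBuffer, IntBuffer, LongBuffer, FloatBuffer, and DoubleBuffer.

Apart from ByteBuffer, which adds a few extras, these classes manage their data with the same or very similar methods; only the element type differs. None of them has a public constructor. Instead, a Buffer is obtained through a static factory method: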

// Allocates a new buffer.
static XxxBuffer allocate(int capacity);

ByteBuffer also has a subclass, MappedByteBuffer, which represents part or all of a disk file mapped into memory. A MappedByteBuffer is normally obtained from FileChannel's map() method.
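To make the mapping idea concrete, here is a minimal sketch that changes a file's first byte through a MappedByteBuffer. The class and method names (MappedFileSketch, overwriteFirstByte) are made up for illustration, and the sketch assumes the file is non-empty and small enough to map in one piece.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, for illustration only.
final class MappedFileSketch {

    /** Overwrites the first byte of a non-empty file through a memory mapping. */
    static void overwriteFirstByte(Path file, byte value) throws IOException {
        // READ_WRITE mapping requires a channel opened for both reading and writing.
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_WRITE, 0, channel.size());
            mapped.put(0, value); // absolute put into the mapped region
            mapped.force();       // ask the OS to write the change back to the file
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped", ".txt");
        file.toFile().deleteOnExit();
        Files.write(file, "hello".getBytes(StandardCharsets.US_ASCII));
        overwriteFirstByte(file, (byte) 'J');
        System.out.println(new String(Files.readAllBytes(file), StandardCharsets.US_ASCII));
    }
}
```

No explicit write() call is involved: the put() on the buffer is the file modification, which is what "accessing the file like memory" means in practice.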

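Key Buffer concepts:

capacity: the maximum number of elements the Buffer can hold;
limit: the index of the first element that must not be read or written;
position: the index of the next element to be read or written;
mark: a remembered index; reset() moves position back to the mark.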

0 <= mark <= position <= limit <= capacity
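The four values are easiest to follow on a tiny buffer. The sketch below (class and method names are illustrative, not from the original article) tracks position, limit, and mark through put, flip, mark, and reset; the comments state the value after each call.

```java
import java.nio.CharBuffer;

// Hypothetical demo class, for illustration only.
final class BufferMarkSketch {

    /** Reads a few chars, jumps back to a mark, and reports what is left. */
    static String readWithMark() {
        CharBuffer buffer = CharBuffer.allocate(8);  // capacity = 8, limit = 8, position = 0
        buffer.put("abcd");                          // position = 4
        buffer.flip();                               // limit = 4, position = 0

        StringBuilder out = new StringBuilder();
        out.append(buffer.get());                    // reads 'a', position = 1
        buffer.mark();                               // mark = 1
        out.append(buffer.get());                    // reads 'b', position = 2
        out.append(buffer.get());                    // reads 'c', position = 3
        buffer.reset();                              // position = mark = 1
        out.append(buffer.get());                    // reads 'b' again, position = 2
        out.append(':').append(buffer.remaining());  // limit - position = 4 - 2
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(readWithMark());
    }
}
```

At every step the invariant 0 <= mark <= position <= limit <= capacity holds; calling reset() without a prior mark() would throw InvalidMarkException.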

Commonly used Buffer methods:

Method	Description

int capacity()	Returns this buffer’s capacity.
int remaining()	Returns the number of elements between the current position and the limit.
int limit()	Returns this buffer’s limit.
int position()	Returns this buffer’s position.
Buffer position(int newPosition)	Sets this buffer’s position.
Buffer reset()	Resets this buffer’s position to the previously-marked position.
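Buffer clear()	Clears this buffer. (The data is not actually erased; position is set to 0 and limit to capacity, ready for the next round of writes.)
Buffer flip()	Flips this buffer. (Limit is set to the current position and position to 0, sealing the data written so far and readying it for reads.)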

Besides these methods from the Buffer base class, every Buffer subclass provides two important methods:

put(): puts data into the Buffer
get(): takes data out of the Buffer

Both put and get can work on a single element or, given an array argument, on a batch of elements. Each also comes in a relative and an absolute form:

Relative: reads or writes starting at the Buffer's current position, then advances position by the number of elements processed.
Absolute: reads or writes at an explicit index; position does not change.

import java.nio.Buffer;
import java.nio.ByteBuffer;

import org.junit.Test;

/**
 * @author jifang
 * @since 16/1/9 8:31 PM
 */
public class BufferTest {

    @Test
    public void client() {
        ByteBuffer buffer = ByteBuffer.allocate(64);
        displayBufferInfo(buffer, "init");

        buffer.put((byte) 'a');
        buffer.put((byte) 'b');
        buffer.put((byte) 'c');
        displayBufferInfo(buffer, "after put");

        buffer.flip();
        displayBufferInfo(buffer, "after flip");

        System.out.println((char) buffer.get());
        displayBufferInfo(buffer, "after a get");

        buffer.clear();
        displayBufferInfo(buffer, "after clear");

        // The data is still accessible after clear()
        System.out.println((char) buffer.get(2));
    }

    private void displayBufferInfo(Buffer buffer, String msg) {
        System.out.println("---------" + msg + "-----------");
        System.out.println("position: " + buffer.position());
        System.out.println("limit: " + buffer.limit());
        System.out.println("capacity: " + buffer.capacity());
    }
}
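The difference between relative and absolute access, and between single-element and bulk access, shows up in how position moves. The following sketch (names are illustrative) records position after each kind of call.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class, for illustration only.
final class RelativeAbsoluteSketch {

    /**
     * Returns {position after bulk put, after absolute put, after the two gets,
     * after bulk get, first byte read, second byte read, first byte of the bulk read}.
     */
    static int[] positions() {
        ByteBuffer buffer = ByteBuffer.allocate(16);

        buffer.put(new byte[] {10, 20, 30, 40});    // bulk relative put: position 0 -> 4
        int afterBulkPut = buffer.position();

        buffer.put(1, (byte) 99);                    // absolute put: index 1 changes, position stays 4
        int afterAbsolutePut = buffer.position();

        buffer.flip();                               // limit = 4, position = 0
        byte first = buffer.get();                   // relative get: reads index 0, position -> 1
        byte second = buffer.get(1);                 // absolute get: reads index 1, position stays 1
        int afterGets = buffer.position();

        byte[] rest = new byte[buffer.remaining()];  // 3 elements left (indices 1..3)
        buffer.get(rest);                            // bulk relative get: position 1 -> 4
        int afterBulkGet = buffer.position();

        return new int[] {afterBulkPut, afterAbsolutePut, afterGets, afterBulkGet,
                first, second, rest[0]};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(positions()));
    }
}
```

Note that the bulk get asks for exactly remaining() elements; requesting more than remain throws BufferUnderflowException.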

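A Buffer created with allocate() is an ordinary (heap) Buffer. ByteBuffer also provides allocateDirect(), which creates a direct buffer (DirectByteBuffer). A direct buffer costs more to create than a heap buffer but is usually faster for native I/O, so it is best suited to long-lived buffers.
Only ByteBuffer has allocateDirect(int capacity), so direct buffers can only be created at the ByteBuffer level. To work with another element type, convert the ByteBuffer into a view of that type, for example with asCharBuffer() or asIntBuffer().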
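As a small sketch of that conversion, the code below allocates a direct ByteBuffer and wraps it in an IntBuffer view. The class and method names are illustrative; the point is that the view shares the same memory rather than copying it.

```java
import java.nio.ByteBuffer;
import java.nio.IntBuffer;

// Hypothetical demo class, for illustration only.
final class DirectBufferSketch {

    /** Writes an int through an IntBuffer view and reads it back from the ByteBuffer. */
    static int roundTrip(int value) {
        ByteBuffer direct = ByteBuffer.allocateDirect(16); // direct buffer, 16 bytes
        IntBuffer ints = direct.asIntBuffer();             // view over the same memory, 4 ints
        ints.put(0, value);                                // absolute put through the view
        return direct.getInt(0);                           // same four bytes, read via the ByteBuffer
    }

    /** The view of a direct buffer is itself direct. */
    static boolean viewIsDirect() {
        return ByteBuffer.allocateDirect(16).asIntBuffer().isDirect();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42));
        System.out.println(viewIsDirect());
    }
}
```

The view inherits the byte order the ByteBuffer had when the view was created, which is why the value survives the round trip unchanged.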

Channel

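Used as above, a Buffer is not exactly enticing (it's just an array, so why all the ceremony? ⊙﹏⊙b). Its real power shows when it is combined with a Channel: a block of a file can be mapped straight into memory, with no need to get/put element by element.

java.nio.channels.Channel resembles a traditional stream object, with two differences:

A file channel (FileChannel) can map part or all of a file directly into a Buffer.
A program cannot access the data in a Channel directly; it must go through a Buffer.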

Java provides Channel implementations such as FileChannel, DatagramChannel, Pipe.SinkChannel, Pipe.SourceChannel, SelectableChannel, SocketChannel, and ServerSocketChannel. Channels are not created through constructors. A FileChannel is returned by the getChannel() method of a FileInputStream, FileOutputStream, or RandomAccessFile; other channels come from factory methods, for example Pipe.open() supplies a Pipe.SourceChannel and a Pipe.SinkChannel. (PipedInputStream and PipedOutputStream have no getChannel() method.)

The three most commonly used methods are MappedByteBuffer map(FileChannel.MapMode mode, long position, long size), read(), and write(). map(), which belongs to FileChannel, maps part or all of the channel's data into a ByteBuffer; read() and write() come in several overloads and transfer data from the channel into a Buffer and from a Buffer into the channel, respectively.

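import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;

import org.junit.Test;

/**
 * @author jifang
 * @since 16/1/9 10:55 PM
 */
public class ChannelTest {

    private CharsetDecoder decoder = Charset.forName("utf-8").newDecoder();

    @Test
    public void client() throws IOException {
        try (FileChannel inChannel = new FileInputStream("save.txt").getChannel();
             FileChannel outChannel = new FileOutputStream("attach.txt").getChannel()) {
            MappedByteBuffer buffer = inChannel.map(FileChannel.MapMode.READ_ONLY, 0,
                    new File("save.txt").length());
            displayBufferInfo(buffer, "init buffer");

            // Write the whole Buffer to the other file's Channel in one call
            outChannel.write(buffer);

            // write() advanced position; rewind to 0 so the data can be read again
            buffer.rewind();

            // Decode into a CharBuffer and print it
            System.out.println(decoder.decode(buffer));
        }
    }

    // ...
}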
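When mapping the whole file is not desirable, read() and write() can be combined with a small reusable Buffer. The sketch below is one common way to write that loop, not the only one; ChannelCopySketch and copy are illustrative names, and the 4096-byte buffer size is an arbitrary choice.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, for illustration only.
final class ChannelCopySketch {

    /** Copies source to target through a fixed-size buffer; returns the bytes written. */
    static long copy(Path source, Path target) throws IOException {
        long total = 0;
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buffer = ByteBuffer.allocate(4096);
            while (in.read(buffer) != -1) {      // Channel -> Buffer, -1 signals end of file
                buffer.flip();                   // switch the buffer from filling to draining
                while (buffer.hasRemaining()) {  // write() may not drain everything at once
                    total += out.write(buffer);  // Buffer -> Channel
                }
                buffer.clear();                  // make the buffer ready for the next read
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("copy-src", ".txt");
        Path target = Files.createTempFile("copy-dst", ".txt");
        source.toFile().deleteOnExit();
        target.toFile().deleteOnExit();
        Files.write(source, new byte[] {1, 2, 3});
        System.out.println(copy(source, target) + " bytes copied");
    }
}
```

The flip()/clear() pair around each write is the same fill-then-drain cycle shown in the Buffer section, applied once per chunk.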

Charset

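Since Java 1.4, java.nio.charset.Charset handles conversion between byte sequences and character sequences (strings). The class provides methods for creating decoders and encoders. Note that Charset is immutable.

The static method Charset.availableCharsets() returns all character sets supported by the current JDK.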

import java.nio.charset.Charset;
import java.util.Map;
import java.util.SortedMap;

import org.junit.Test;

/**
 * @author jifang
 * @since 16/1/10 4:32 PM
 */
public class CharsetLearn {

    @Test
    public void testGetAllCharsets() {
        SortedMap<String, Charset> charsetMap = Charset.availableCharsets();
        for (Map.Entry<String, Charset> charset : charsetMap.entrySet()) {
            System.out.println(charset.getKey() + " aliases -> " + charset.getValue().aliases()
                    + " charset -> " + charset.getValue());
        }
    }
}

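Running the code above shows that each charset has a number of string aliases (UTF-8, for example, is also known as unicode-1-1-utf-8 and UTF8). Once an alias is known, the program can call Charset.forName() to obtain the corresponding Charset object: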

@Test
public void testGetCharset() {
    Charset utf8 = Charset.forName("UTF-8");
    Charset unicode11 = Charset.forName("unicode-1-1-utf-8");
    System.out.println(utf8.name());
    System.out.println(unicode11.name());
    System.out.println(unicode11 == utf8);
}

Since Java 1.7, the JDK also provides the utility class StandardCharsets, whose static fields represent the standard, commonly used charsets:

@Test
public void testGetCharset() {
    // Use the UTF_8 constant
    Charset utf8 = StandardCharsets.UTF_8;
    Charset unicode11 = Charset.forName("unicode-1-1-utf-8");
    System.out.println(utf8.name());
    System.out.println(unicode11.name());
    System.out.println(unicode11 == utf8);
}

With a Charset object in hand, its encode() and decode() methods convert between CharBuffer and ByteBuffer:

Method	Description

ByteBuffer encode(CharBuffer cb)	Convenience method that encodes Unicode characters into bytes in this charset.
ByteBuffer encode(String str)	Convenience method that encodes a string into bytes in this charset.
CharBuffer decode(ByteBuffer bb)	Convenience method that decodes bytes in this charset into Unicode characters.

Alternatively, newDecoder() and newEncoder() on the Charset return a CharsetDecoder and a CharsetEncoder, which allow more flexible decoding and encoding (they, too, provide decode() and encode() methods).

@Test
public void testDecodeEncode() throws IOException {
    File inFile = new File("save.txt");
    // try-with-resources closes the channel (and its underlying stream)
    try (FileChannel in = new FileInputStream(inFile).getChannel()) {
        MappedByteBuffer byteBuffer = in.map(FileChannel.MapMode.READ_ONLY, 0, inFile.length());
        // Charset utf8 = Charset.forName("UTF-8");
        Charset utf8 = StandardCharsets.UTF_8;

        // Decode
        // CharBuffer charBuffer = utf8.decode(byteBuffer);
        CharBuffer charBuffer = utf8.newDecoder().decode(byteBuffer);
        System.out.println(charBuffer);

        // Encode
        // ByteBuffer encoded = utf8.encode(charBuffer);
        ByteBuffer encoded = utf8.newEncoder().encode(charBuffer);
        // Size the array from the encoded buffer rather than from the file length
        byte[] bytes = new byte[encoded.remaining()];
        encoded.get(bytes);
        for (int i = 0; i < bytes.length; ++i) {
            System.out.print(bytes[i]);
        }
        System.out.println();
    }
}

The String class also provides getBytes(String charsetName), which converts a string to a byte sequence using the named charset.
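A short sketch of both routes follows. It uses the getBytes(Charset) overload, which avoids the checked UnsupportedEncodingException of the String-name variant; the class and method names are illustrative. The sample text is written with Unicode escapes (\u4f60\u597d, a two-character Chinese greeting) so the result does not depend on the source file's encoding.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
final class CharsetRoundTripSketch {

    /** Number of bytes the text occupies in the given charset. */
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    /** Encodes the text and decodes it again with the same charset. */
    static String roundTrip(String text, Charset charset) {
        ByteBuffer bytes = charset.encode(text);   // chars -> bytes
        CharBuffer chars = charset.decode(bytes);  // bytes -> chars
        return chars.toString();
    }

    public static void main(String[] args) {
        String text = "\u4f60\u597d";
        System.out.println(encodedLength(text, StandardCharsets.UTF_8));
        System.out.println(encodedLength(text, StandardCharsets.UTF_16BE));
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text));
        // ISO-8859-1 cannot represent these characters, so the round trip is lossy.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1));
    }
}
```

The convenience methods Charset.encode() and Charset.decode() silently replace characters they cannot map, so a lossy charset changes the text without raising an error. When that matters, a CharsetEncoder obtained from newEncoder() reports the problem as an exception instead.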

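Monitoring File Changes with WatchService

In earlier Java versions, a program that needed to monitor the file system had to resort to a background thread that periodically traversed the target directory and treated any difference from the previous traversal as a change. With NIO.2 (Java 7), Path provides a register method for listening to file-system changes.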

WatchKey register(WatchService watcher, WatchEvent.Kind<?>... events);
WatchKey register(WatchService watcher, WatchEvent.Kind<?>[] events, WatchEvent.Modifier... modifiers);

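Strictly speaking, Path implements the Watchable interface, and register is declared by Watchable.

WatchService represents a file-system watch service that monitors the files under a registered Path directory. It is an interface whose instances are created by a FileSystem; a WatchService is typically obtained like this: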

WatchService service = FileSystems.getDefault().newWatchService();

Once register has completed, the following WatchService methods retrieve change events for the watched directory:

Method	Description

WatchKey poll()	Retrieves and removes the next watch key, or null if none are present.
WatchKey poll(long timeout, TimeUnit unit)	Retrieves and removes the next watch key, waiting if necessary up to the specified wait time if none are yet present.
WatchKey take()	Retrieves and removes next watch key, waiting if none are yet present.

With a WatchKey in hand, call its methods to find out which events occurred and obtain the WatchEvent objects:

Method	Description

List<WatchEvent<?>> pollEvents()	Retrieves and removes all pending events for this watch key, returning a List of the events that were retrieved.
boolean reset()	Resets this watch key.

WatchEvent

Method	Description

T context()	Returns the context for the event.
int count()	Returns the event count.
WatchEvent.Kind<T> kind()	Returns the event kind.

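import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

/**
 * @author jifang
 * @since 16/1/10 8:00 PM
 */
public class ChangeWatcher {

    public static void main(String[] args) {
        watch("/Users/jifang/");
    }

    public static void watch(String directory) {
        try {
            WatchService service = FileSystems.getDefault().newWatchService();
            Paths.get(directory).register(service,
                    StandardWatchEventKinds.ENTRY_CREATE,
                    StandardWatchEventKinds.ENTRY_DELETE,
                    StandardWatchEventKinds.ENTRY_MODIFY);
            while (true) {
                // Block until at least one event is available
                WatchKey key = service.take();
                for (WatchEvent<?> event : key.pollEvents()) {
                    System.out.println("File " + event.context() + " triggered a " + event.kind() + " event!");
                }
                // reset() returns false once the key is no longer valid
                if (!key.reset()) {
                    break;
                }
            }
        } catch (IOException | InterruptedException e) {
            throw new RuntimeException(e);
        }
    }
}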

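WatchService makes it easy to monitor changes under a given directory; what to do when a file changes depends on the application. For example, a log analyzer could periodically scan a log directory, check whether a log file has grown, scan only the newly added part, and, if it finds signs of an error there (keywords such as Exception or Timeout), extract that portion and send it to an administrator by email or SMS.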
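The "scan only the changed part" step of such a log analyzer could look roughly like the sketch below. Everything here is an assumption made for illustration: the class and method names, the UTF-8 encoding of the log, and the idea that the caller remembers the file size from its previous scan and passes it in as fromOffset. The sketch also assumes the appended part fits in memory and that the offset falls on a character boundary.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, for illustration only.
final class LogTailSketch {

    /** Returns the text appended to the log after byte offset fromOffset. */
    static String readAppended(Path log, long fromOffset) throws IOException {
        try (FileChannel channel = FileChannel.open(log, StandardOpenOption.READ)) {
            long size = channel.size();
            if (size <= fromOffset) {
                return ""; // nothing new (or the file was truncated)
            }
            ByteBuffer buffer = ByteBuffer.allocate((int) (size - fromOffset));
            channel.position(fromOffset); // skip the part that was already scanned
            while (buffer.hasRemaining() && channel.read(buffer) != -1) {
                // keep reading until the new part is fully loaded
            }
            buffer.flip();
            return StandardCharsets.UTF_8.decode(buffer).toString();
        }
    }

    /** Keyword check taken from the example in the text. */
    static boolean looksLikeError(String text) {
        return text.contains("Exception") || text.contains("Timeout");
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("app", ".log");
        log.toFile().deleteOnExit();
        Files.write(log, "started\n".getBytes(StandardCharsets.UTF_8));
        long scanned = Files.size(log);
        Files.write(log, "request Timeout\n".getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.APPEND);
        String added = readAppended(log, scanned);
        System.out.println(looksLikeError(added) + ": " + added.trim());
    }
}
```

In a real analyzer the ENTRY_MODIFY event from the WatchService would trigger readAppended, and the stored offset would be updated to the new file size after each scan.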

Guava IO

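Two I/O libraries are widely used in day-to-day development: Apache commons-io and the I/O module of Google Guava. At the time of writing, commons-io is the older of the two and is updated slowly (its latest release dates from 2012), whereas Guava is updated more often and has just released version 19.0, so only Guava's extensions to Java I/O are covered here.
To use Guava, add the following dependency to pom.xml:

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>19.0</version>
</dependency>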

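I recently wrote a tool for scraping images from web pages. My first version used URL.openConnection() plus plain I/O streams; the code was tedious and did not perform well (for comparable code, see the article on reading web page content with URL in Java). With Guava, downloading a page takes only a few lines: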

public static String getHtml(String url) {
    if (StringUtils.isBlank(url)) {
        return null;
    }
    try {
        return Resources.toString(new URL(url), StandardCharsets.UTF_8);
    } catch (IOException e) {
        LOGGER.error("getHtml error url = {}", url, e);
        throw new RuntimeException(e);
    }
}

Much clearer.

Resources.readLines(URL url, Charset charset, LineProcessor<T> callback) can also be used to extract only specific content from a page:

public static List<String> processUrl(String url, final String regexp) {
    try {
        return Resources.readLines(new URL(url), StandardCharsets.UTF_8, new LineProcessor<List<String>>() {
            private Pattern pattern = Pattern.compile(regexp);
            private List<String> strings = new ArrayList<>();

            @Override
            public boolean processLine(String line) throws IOException {
                Matcher matcher = pattern.matcher(line);
                while (matcher.find()) {
                    strings.add(matcher.group());
                }
                return true;
            }

            @Override
            public List<String> getResult() {
                return strings;
            }
        });
    } catch (IOException e) {
        LOGGER.error("processUrl error, url = {}, regexp = {}", url, regexp, e);
        throw new RuntimeException(e);
    }
}

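As for performance, I recall a remark about the STL:

The STL may not be the fastest, but it is certainly not the slowest!

I think the same applies to Guava. Three kinds of operations in Guava IO are used most often:

Simplifications of traditional Java I/O operations;
Guava's support for sources and sinks;
File and resource support in Guava's Files and Resources classes.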

Simplifying Java IO

Guava uses InputStream/OutputStream and Readable/Appendable to represent Java's byte streams and character streams (Writer implements Appendable; Reader implements Readable), and supports traditional I/O through com.google.common.io.ByteStreams and com.google.common.io.CharStreams.

These two classes provide many static methods that simplify Java I/O, such as:

static long copy(Readable/InputStream from, Appendable/OutputStream to)
static byte[] toByteArray(InputStream in)
static int read(InputStream in, byte[] b, int off, int len)
static ByteArrayDataInput newDataInput(byte[] bytes, int start)
static String toString(Readable r)

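/**
 * Reads a file's content in one line of code.
 *
 * @throws IOException
 */
@Test
public void getFileContent() throws IOException {
    // CharStreams.toString() does not close the reader, so close it here
    try (FileReader reader = new FileReader("save.txt")) {
        System.out.println(CharStreams.toString(reader));
    }
}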

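For details on ByteStreams and CharStreams, see the Guava documentation.

Guava Sources and Sinks

Guava introduces the concepts of source and sink so that code does not always have to deal with streams directly.
A source or sink is a resource from which you know how to open a stream, such as a File or a URL.
A source is readable; a sink is writable.

Guava's sources are ByteSource and CharSource; its sinks are ByteSink and CharSink.

The benefit of sources and sinks is a common set of operations: once a data source has been wrapped as a ByteSource, for example, you get the same byte-oriented methods regardless of what it originally was. In that sense they are analogous to InputStream/OutputStream and Reader/Writer in Java IO: as long as you can obtain one of them (or a subclass), you can use its operations without caring about the underlying implementation.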

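import com.google.common.io.CharSink;
import com.google.common.io.CharSource;
import com.google.common.io.Files;
import com.google.common.io.Resources;

import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;

import org.junit.Test;

/**
 * @author jifang
 * @since 16/1/11 4:39 PM
 */
public class SourceSinkTest {

    @Test
    public void fileSinkSource() throws IOException {
        File file = new File("save.txt");
        CharSink sink = Files.asCharSink(file, StandardCharsets.UTF_8);
        sink.write("- How are you?\n- I'm fine.");

        CharSource source = Files.asCharSource(file, StandardCharsets.UTF_8);
        System.out.println(source.read());
    }

    @Test
    public void netSource() throws IOException {
        CharSource source = Resources.asCharSource(new URL("http://www.sun.com"), StandardCharsets.UTF_8);
        System.out.println(source.readFirstLine());
    }
}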

Obtaining Sources and Sinks

Common ways to obtain byte sources and sinks:

Byte source	Byte sink

Files.asByteSource(File)	Files.asByteSink(File file, FileWriteMode... modes)
Resources.asByteSource(URL url)	-
ByteSource.wrap(byte[] b)	-
ByteSource.concat(ByteSource... sources)	-

Common ways to obtain character sources and sinks:

Character source	Character sink

Files.asCharSource(File file, Charset charset)	Files.asCharSink(File file, Charset charset, FileWriteMode... modes)
Resources.asCharSource(URL url, Charset charset)	-
CharSource.wrap(CharSequence charSequence)	-
CharSource.concat(CharSource... sources)	-
ByteSource.asCharSource(Charset charset)	ByteSink.asCharSink(Charset charset)
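Several of the factory methods in these tables work purely in memory, which makes them handy for trying the API without touching files or the network. The sketch below assumes Guava is on the classpath; SourceSketch and its methods are illustrative names, not part of Guava.

```java
import com.google.common.io.ByteSource;
import com.google.common.io.CharSource;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical demo class, for illustration only; requires Guava on the classpath.
final class SourceSketch {

    /** Wraps two byte arrays, concatenates them, and reads the result as text. */
    static String joined() throws IOException {
        ByteSource head = ByteSource.wrap("Hello, ".getBytes(StandardCharsets.UTF_8));
        ByteSource tail = ByteSource.wrap("Guava".getBytes(StandardCharsets.UTF_8));
        ByteSource all = ByteSource.concat(head, tail);             // byte source over both parts
        CharSource text = all.asCharSource(StandardCharsets.UTF_8); // view the bytes as characters
        return text.read();                                         // read everything into a String
    }

    /** Wraps a CharSequence and splits it into lines. */
    static List<String> lines() throws IOException {
        return CharSource.wrap("first\nsecond").readLines();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(joined());
        System.out.println(lines());
    }
}
```

Code written against ByteSource or CharSource in this way works unchanged when the in-memory source is later swapped for one created by Files.asByteSource or Resources.asCharSource.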

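Using Sources and Sinks

All four types offer general-purpose read/write methods. Usage resembles Java IO but is simpler and more convenient than working with streams: CharSource, for example, can read all of its data at once with String read(), or copy it in a single call to a sink or an Appendable (such as a Writer) with long copyTo(CharSink/Appendable to).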

@Test
public void saveHtmlFileChar() throws IOException {
    CharSource source = Resources.asCharSource(new URL("http://www.google.com"), StandardCharsets.UTF_8);
    source.copyTo(Files.asCharSink(new File("save1.html"), StandardCharsets.UTF_8));
}

@Test
public void saveHtmlFileByte() throws IOException {
    ByteSource source = Resources.asByteSource(new URL("http://www.google.com"));
    //source.copyTo(new FileOutputStream("save2.html"));
    source.copyTo(Files.asByteSink(new File("save2.html")));
}

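For other usage details, see the Guava documentation.

Files and Resources

The examples above used Files and Resources to turn a URL or File into a ByteSource or CharSource. Both classes offer many more methods that simplify I/O; see the Guava documentation for details.
Common Resources methods

Resources method	Description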

static void copy(URL from, OutputStream to)	Copies all bytes from a URL to an output stream.
static URL getResource(String resourceName)	Returns a URL pointing to resourceName if the resource is found using the context class loader.
static List<String> readLines(URL url, Charset charset)	Reads all of the lines from a URL.
static <T> T readLines(URL url, Charset charset, LineProcessor<T> callback)	Streams lines from a URL, stopping when our callback returns false, or we have read all of the lines.
static byte[] toByteArray(URL url)	Reads all bytes from a URL into a byte array.
static String toString(URL url, Charset charset)	Reads all characters from a URL into a String, using the given character set.

Common Files methods

Files method	Description

static void append(CharSequence from, File to, Charset charset)	Appends a character sequence (such as a string) to a file using the given character set.
static void copy(File from, Charset charset, Appendable to)	Copies all characters from a file to an appendable object, using the given character set.
static void copy(File from, File to)	Copies all the bytes from one file to another.
static void copy(File from, OutputStream to)	Copies all bytes from a file to an output stream.
static File createTempDir()	Atomically creates a new directory somewhere beneath the system’s temporary directory (as defined by the java.io.tmpdir system property), and returns its name.
static MappedByteBuffer map(File file, FileChannel.MapMode mode, long size)	Maps a file in to memory as per FileChannel.map(java.nio.channels.FileChannel.MapMode, long, long) using the requested FileChannel.MapMode.
static void move(File from, File to)	Moves a file from one path to another.
static <T> T readBytes(File file, ByteProcessor<T> processor)	Process the bytes of a file.
static String readFirstLine(File file, Charset charset)	Reads the first line from a file.
static List<String> readLines(File file, Charset charset)	Reads all of the lines from a file.
static <T> T readLines(File file, Charset charset, LineProcessor<T> callback)	Streams lines from a File, stopping when our callback returns false, or we have read all of the lines.
static byte[] toByteArray(File file)	Reads all bytes from a file into a byte array.
static String toString(File file, Charset charset)	Reads all characters from a file into a String, using the given character set.
static void touch(File file)	Creates an empty file or updates the last updated timestamp on the same as the unix command of the same name.
static void write(byte[] from, File to)	Overwrites a file with the contents of a byte array.
static void write(CharSequence from, File to, Charset charset)	Writes a character sequence (such as a string) to a file using the given character set.

References:

Google Guava official tutorial (Chinese edition)

Google Guava official documentation
