Effective C# Item 11: Prefer foreach Loops
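The C# foreach statement is more than a variation on the do, while, and for loops. It generates the best iteration code for whatever collection you have. Its definition is tied to the collection interfaces in the .NET Framework, and the compiler generates the best code for the actual collection type. When you iterate over a collection, use foreach instead of the other looping constructs. Consider these three loops:
int [] foo = new int[100];
// Loop 1:
foreach ( int i in foo)
  Console.WriteLine( i.ToString( ));
// Loop 2:
for ( int index = 0; index < foo.Length; index++ )
  Console.WriteLine( foo[index].ToString( ));
// Loop 3:
int len = foo.Length;
for ( int index = 0; index < len; index++ )
  Console.WriteLine( foo[index].ToString( ));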
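With the current C# compiler (version 1.1 or later), loop 1 is best. It is also less typing, which raises your personal productivity. (The C# 1.0 compiler generated much slower code for loop 1, so loop 2 was best for that version.) Loop 3, which most C and C++ programmers would consider the most efficient, is the worst choice. Hoisting Length out of the loop hinders the JIT compiler's ability to remove the range check from inside the loop.
C# code runs in a safe, managed environment. Every memory location is checked, including array indexes. Expanded a little, the code for loop 3 really looks something like this:
// Loop 3, as generated by compiler:
int len = foo.Length;
for ( int index = 0; index < len; index++ )
{
  if ( index < foo.Length )
    Console.WriteLine( foo[index].ToString( ));
  else
    throw new IndexOutOfRangeException( );
}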
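The JIT compiler does not appreciate that kind of help. You meant to hoist the Length property out of the loop, but you made the JIT compiler do more work and produce slower code. One of the CLR's guarantees is that you cannot write code that accesses memory your variables do not own. Before each element access, the runtime tests against the actual array bounds (not your len variable). You have turned one bounds check into two.
You still pay for an index check on every iteration, and you pay twice. Loops 1 and 2 are faster because the JIT compiler can verify that the loop bounds are safe. Whenever the loop limit is not the array's length, the bounds check runs on every iteration. (Translator's note: the JIT compiler mentioned throughout is the one that compiles IL to native code, not the one that compiles C# or other source code to IL. Unsafe code can be used to bypass these checks and gain speed.)
The original C# compiler generated slow code for foreach over arrays because of boxing, which Item 17 covers in detail. Arrays are type safe, and foreach now generates different IL for arrays than for other collections. The array version does not use the IEnumerator interface, which would require boxing and unboxing: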
IEnumerator it = foo.GetEnumerator( );
while( it.MoveNext( ))
{
  int i = (int) it.Current; // box and unbox here.
  Console.WriteLine( i.ToString( ) );
}
Instead, the foreach statement generates this construct for arrays:
for ( int index = 0; index < foo.Length; index++ )
  Console.WriteLine( foo[index].ToString( ));
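(Translator's note: mind the difference between arrays and collections. An array is a contiguous block of memory allocated once; a collection can grow and be modified dynamically, and is typically backed by a linked list or a resizable buffer. The jagged arrays that C# also supports are a compromise between the two.)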
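foreach always generates the best code. You do not need to remember which construct is most efficient: foreach and the compiler do it for you.
If efficiency is not enough for you, consider language interop. Some people (yes, the ones who use other programming languages) firmly believe that array indexes start at 1, not 0. However hard we try, we will not break them of the habit. The .NET team tried. As a result, this is the initialization code you must write in C# to get an array that starts at something other than 0: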
// Create a single dimension array.
// Its range is [ 1 .. 5 ]
Array test = Array.CreateInstance( typeof( int ),
new int[ ]{ 5 }, new int[ ]{ 1 });
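This code should be enough to make anyone cringe (Translator's note: it does for me, a little). But some people are stubborn; whatever you do, they will start counting at 1. Fortunately, this is one of those problems you can hand off to the compiler. Iterate the test array with foreach:
foreach( int j in test )
  Console.WriteLine ( j );
The foreach statement knows how to check the array's upper and lower bounds, so you do not have to, and it is just as fast as a hand-coded for loop, whatever lower bound someone chooses.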
foreach gives you the same benefits for multidimensional arrays. Suppose you are creating a chessboard. You would write these two fragments:
private Square[,] _theBoard = new Square[ 8, 8 ];
// elsewhere in code:
for ( int i = 0; i < _theBoard.GetLength( 0 ); i++ )
  for( int j = 0; j < _theBoard.GetLength( 1 ); j++ )
    _theBoard[ i, j ].PaintSquare( );
Instead, you can paint the board this simply:
foreach( Square sq in _theBoard )
  sq.PaintSquare( );
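(Translator's note: I do not favor this approach. It hides the row and column structure of the array. The iteration is row-major; if that is not the order you need, this loop is a poor fit.)
The foreach statement generates the proper code to iterate across every dimension of the array. If you later build a 3D chessboard, the foreach loop still works, while the other loop needs this modification: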
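for ( int i = 0; i < _theBoard.GetLength( 0 ); i++ )
  for( int j = 0; j < _theBoard.GetLength( 1 ); j++ )
    for( int k = 0; k < _theBoard.GetLength( 2 ); k++ )
      _theBoard[ i, j, k ].PaintSquare( );
(Translator's note: this is more code, but any programmer can see at a glance that it loops over a three-dimensional array, whereas I doubt anyone can tell at a glance what the foreach version is doing. That is my personal view. It depends on how you look at it; you could equally call this an advantage of foreach.)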
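In fact, the foreach loop also works on a multidimensional array whose dimensions have different lower bounds. (Translator's note: this is distinct from a jagged array, which is an array of arrays.) I do not want to write that kind of code, even as an example. But when someone else writes that kind of collection, foreach can handle it.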
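foreach also gives you flexibility: if you later need to change the underlying data structure from an array, much of your code stays intact. We started this discussion with a simple array:
int [] foo = new int[100];
Suppose that at some later point you find you need capabilities the array class does not provide. You can simply change the array to an ArrayList: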
// Set the initial size:
ArrayList foo = new ArrayList( 100 );
Any hand-coded for loop is broken:
int sum = 0;
for ( int index = 0;
  // won't compile: ArrayList uses Count, not Length
  index < foo.Length;
  index++ )
  // won't compile: foo[ index ] is object, not int.
  sum += foo[ index ];
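The foreach loop, however, compiles to different code depending on the object it operates on, and casts each element to the proper type automatically. Nothing needs to change. This is not limited to the standard collection classes, either; any collection type can be used with foreach.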
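If your collection follows the .NET rules for collections, your users can iterate your type with foreach. For the foreach statement to treat a class as a collection type, the class must have one of several properties. A public GetEnumerator() method makes it a collection class. Explicitly implementing the IEnumerable interface makes it a collection class. In both cases, the object that GetEnumerator() returns must provide MoveNext() and Current, which is the contract the IEnumerator interface defines. foreach works with either form.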
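foreach has one more benefit, concerning resource management. The IEnumerable interface contains one method: GetEnumerator(). On an enumerable type, the foreach statement generates the following code, with some optimizations:
IEnumerator it = foo.GetEnumerator( ) as IEnumerator;
using ( IDisposable disp = it as IDisposable )
{
  while ( it.MoveNext( ))
  {
    int elem = ( int ) it.Current;
    sum += elem;
  }
}
If the compiler can determine for certain whether the enumerator implements IDisposable, it optimizes the code in the finally clause. For you, though, the important point is that foreach generates correct code no matter what.
foreach is a versatile statement. It generates the right code for upper and lower array bounds, iterates multidimensional arrays, casts to the proper type (using the most efficient construct), and, most important, generates the most efficient looping construct. It is the best way to iterate a collection. The code you write with it lasts longer (Translator's note: that is, fewer changes are forced on you later by mistakes) and is simpler to write in the first place. It is a small productivity gain, but it adds up over time.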
=========================
Item 11: Prefer foreach Loops
The C# foreach statement is more than just a variation of the do, while, or for loops. It generates the best iteration code for any collection you have. Its definition is tied to the collection interfaces in the .NET Framework, and the C# compiler generates the best code for the particular type of collection. When you iterate collections, use foreach instead of other looping constructs. Examine these three loops:
int [] foo = new int[100];
// Loop 1:
foreach ( int i in foo)
  Console.WriteLine( i.ToString( ));
// Loop 2:
for ( int index = 0;
  index < foo.Length;
  index++ )
  Console.WriteLine( foo[index].ToString( ));
// Loop 3:
int len = foo.Length;
for ( int index = 0;
  index < len;
  index++ )
  Console.WriteLine( foo[index].ToString( ));
For the current and future C# compilers (version 1.1 and up), loop 1 is best. It's even less typing, so your personal productivity goes up. (The C# 1.0 compiler produced much slower code for loop 1, so loop 2 is best in that version.) Loop 3, the construct most C and C++ programmers would view as most efficient, is the worst option. By hoisting the Length variable out of the loop, you make a change that hinders the JIT compiler's chance to remove range checking inside the loop.
C# code runs in a safe, managed environment. Every memory location is checked, including array indexes. Taking a few liberties, the actual code for loop 3 is something like this:
// Loop 3, as generated by compiler:
int len = foo.Length;
for ( int index = 0;
  index < len;
  index++ )
{
  if ( index < foo.Length )
    Console.WriteLine( foo[index].ToString( ));
  else
    throw new IndexOutOfRangeException( );
}
The JIT C# compiler just doesn't like you trying to help it this way. Your attempt to hoist the Length property access out of the loop just made the JIT compiler do more work to generate even slower code. One of the CLR guarantees is that you cannot write code that overruns the memory that your variables own. The runtime generates a test of the actual array bounds (not your len variable) before accessing each particular array element. You get one bounds check for the price of two.
You still pay to check the array index on every iteration of the loop, and you do so twice. The reason loops 1 and 2 are faster is that the C# compiler and the JIT compiler can verify that the bounds of the loop are guaranteed to be safe. Anytime the loop variable is not the length of the array, the bounds check is performed on each iteration.
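The claim about the three loops is easy to measure for yourself. The following is a minimal timing sketch, not a rigorous benchmark: the class and method names (`LoopTiming`, `SumForeach`, `SumForLength`, `SumForHoisted`) are hypothetical, and the loops sum the elements instead of printing them so that console I/O does not dominate. The relative ordering you observe may differ from the one described above, depending on your compiler and runtime version.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper class for illustration; not part of the original item.
public static class LoopTiming
{
    // Loop 1: foreach over the array.
    public static long SumForeach(int[] data)
    {
        long sum = 0;
        foreach (int value in data)
            sum += value;
        return sum;
    }

    // Loop 2: for loop that tests against data.Length directly.
    public static long SumForLength(int[] data)
    {
        long sum = 0;
        for (int index = 0; index < data.Length; index++)
            sum += data[index];
        return sum;
    }

    // Loop 3: for loop with the length hoisted into a local variable.
    public static long SumForHoisted(int[] data)
    {
        long sum = 0;
        int len = data.Length;
        for (int index = 0; index < len; index++)
            sum += data[index];
        return sum;
    }

    public static void Main()
    {
        int[] data = new int[10000000];
        for (int i = 0; i < data.Length; i++)
            data[i] = i % 7;

        // Warm up so that JIT compilation is not included in the timings.
        SumForeach(data);
        SumForLength(data);
        SumForHoisted(data);

        Stopwatch watch = Stopwatch.StartNew();
        long a = SumForeach(data);
        Console.WriteLine("Loop 1 (foreach):        {0} ms", watch.ElapsedMilliseconds);

        watch = Stopwatch.StartNew();
        long b = SumForLength(data);
        Console.WriteLine("Loop 2 (foo.Length):     {0} ms", watch.ElapsedMilliseconds);

        watch = Stopwatch.StartNew();
        long c = SumForHoisted(data);
        Console.WriteLine("Loop 3 (hoisted length): {0} ms", watch.ElapsedMilliseconds);

        // All three loops must compute the same result.
        Console.WriteLine("Sums agree: {0}", a == b && b == c);
    }
}
```

Run it in a release build; debug builds disable the JIT optimizations that the discussion depends on.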
The reason that foreach and arrays generated very slow code in the original C# compiler concerns boxing, which is covered extensively in Item 17. Arrays are type safe. foreach now generates different IL for arrays than other collections. The array version does not use the IEnumerator interface, which would require boxing and unboxing:
IEnumerator it = foo.GetEnumerator( );
while( it.MoveNext( ))
{
  int i = (int) it.Current; // box and unbox here.
  Console.WriteLine( i.ToString( ) );
}
Instead, the foreach statement generates this construct for arrays:
for ( int index = 0;
  index < foo.Length;
  index++ )
  Console.WriteLine( foo[index].ToString( ));
foreach always generates the best code. You don't need to remember which construct generates the most efficient looping construct: foreach and the compiler will do it for you.
If efficiency isn't enough for you, consider language interop. Some folks in the world (yes, most of them use other programming languages) strongly believe that index variables start at 1, not 0. No matter how much we try, we won't break them of this habit. The .NET team tried. You have to write this kind of initialization in C# to get an array that starts at something other than 0:
// Create a single dimension array.
// Its range is [ 1 .. 5 ]
Array test = Array.CreateInstance( typeof( int ),
new int[ ]{ 5 }, new int[ ]{ 1 });
This code should be enough to make anybody cringe and just write arrays that start at 0. But some people are stubborn. Try as you might, they will start counting at 1. Luckily, this is one of those problems that you can foist off on the compiler. Iterate the test array using foreach:
foreach( int j in test )
  Console.WriteLine ( j );
The foreach statement knows how to check the upper and lower bounds on the array, so you don't have to, and it's just as fast as a hand-coded for loop, no matter what different lower bound someone decides to use.
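To make the contrast concrete, here is a small sketch that fills the 1-based array by hand and then reads it back with foreach. The class and method names (`LowerBoundDemo`, `CreateOneBased`, `Sum`) are hypothetical; the calls `Array.CreateInstance`, `GetLowerBound`, `GetUpperBound`, and `SetValue` are the standard `System.Array` members.

```csharp
using System;

// Hypothetical helper class for illustration; not part of the original item.
public static class LowerBoundDemo
{
    // Builds a one-dimensional int array whose valid indexes are 1..5.
    public static Array CreateOneBased()
    {
        Array test = Array.CreateInstance(typeof(int),
            new int[] { 5 }, new int[] { 1 });

        // A hand-coded loop has to ask for the bounds explicitly.
        for (int i = test.GetLowerBound(0); i <= test.GetUpperBound(0); i++)
            test.SetValue(i * 10, i);

        return test;
    }

    // foreach needs no knowledge of where the indexes start.
    public static int Sum(Array values)
    {
        int sum = 0;
        foreach (int j in values)
            sum += j;
        return sum;
    }

    public static void Main()
    {
        Array test = CreateOneBased();
        Console.WriteLine("bounds: {0}..{1}",
            test.GetLowerBound(0), test.GetUpperBound(0)); // bounds: 1..5
        Console.WriteLine("sum: {0}", Sum(test));           // sum: 150
    }
}
```

The writing loop must mention both bounds; the reading loop mentions neither.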
foreach adds other language benefits for you. The loop variable is read-only: You can't replace the objects in a collection using foreach. Also, there is explicit casting to the correct type. If the collection contains the wrong type of objects, the iteration throws an exception.
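A short sketch of both points, using hypothetical names (`CastDemo`, `TrySum`): the commented-out assignment shows where the compiler rejects a write to the loop variable, and the second list shows the runtime cast failing on a wrongly typed element.

```csharp
using System;
using System.Collections;

// Hypothetical helper class for illustration; not part of the original item.
public static class CastDemo
{
    // Sums the list as ints. Returns false if any element is not an int.
    public static bool TrySum(ArrayList items, out int sum)
    {
        sum = 0;
        try
        {
            foreach (int value in items)
            {
                // value = 0;  // does not compile: the loop variable is read-only
                sum += value;
            }
            return true;
        }
        catch (InvalidCastException)
        {
            // The implicit cast to int failed on an element of another type.
            return false;
        }
    }

    public static void Main()
    {
        ArrayList good = new ArrayList();
        good.Add(1);
        good.Add(2);

        ArrayList bad = new ArrayList();
        bad.Add(1);
        bad.Add("two");

        int sum;
        Console.WriteLine(TrySum(good, out sum)); // True
        Console.WriteLine(TrySum(bad, out sum));  // False
    }
}
```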
foreach gives you similar benefits for multidimensional arrays. Suppose that you are creating a chess board. You would write these two fragments:
private Square[,] _theBoard = new Square[ 8, 8 ];
// elsewhere in code:
for ( int i = 0; i < _theBoard.GetLength( 0 ); i++ )
  for( int j = 0; j < _theBoard.GetLength( 1 ); j++ )
    _theBoard[ i, j ].PaintSquare( );
Instead, you can simplify painting the board this way:
foreach( Square sq in _theBoard )
  sq.PaintSquare( );
The foreach statement generates the proper code to iterate across all dimensions in the array. If you make a 3D chessboard in the future, the foreach loop just works. The other loop needs modification:
for ( int i = 0; i < _theBoard.GetLength( 0 ); i++ )
  for( int j = 0; j < _theBoard.GetLength( 1 ); j++ )
    for( int k = 0; k < _theBoard.GetLength( 2 ); k++ )
      _theBoard[ i, j, k ].PaintSquare( );
In fact, the foreach loop would work on a multidimensional array that had different lower bounds in each direction. I don't want to write that kind of code, even as an example. But when someone else codes that kind of collection, foreach can handle it.
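For readers who do want to see it, the sketch below records the order in which foreach visits the elements of a regular two-dimensional array and of one whose dimensions start at 1 and 10. The names (`BoardDemo`, `VisitOrder`) are hypothetical. Note the visiting order: the rightmost index varies fastest, so if your algorithm needs another order, nested for loops are still the tool.

```csharp
using System;
using System.Text;

// Hypothetical helper class for illustration; not part of the original item.
public static class BoardDemo
{
    // Joins the elements in the order foreach visits them.
    public static string VisitOrder(Array grid)
    {
        StringBuilder order = new StringBuilder();
        foreach (int cell in grid)
        {
            if (order.Length > 0)
                order.Append(' ');
            order.Append(cell);
        }
        return order.ToString();
    }

    public static void Main()
    {
        // Ordinary 2 x 3 array with zero lower bounds.
        int[,] grid = { { 1, 2, 3 }, { 4, 5, 6 } };
        Console.WriteLine(VisitOrder(grid));    // 1 2 3 4 5 6

        // 2 x 2 array whose first dimension starts at 1 and second at 10.
        Array shifted = Array.CreateInstance(typeof(int),
            new int[] { 2, 2 }, new int[] { 1, 10 });
        shifted.SetValue(7, 1, 10);
        shifted.SetValue(8, 1, 11);
        shifted.SetValue(9, 2, 10);
        shifted.SetValue(10, 2, 11);
        Console.WriteLine(VisitOrder(shifted)); // 7 8 9 10
    }
}
```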
foreach also gives you the flexibility to keep much of the code intact if you find later that you need to change the underlying data structure from an array. We started this discussion with a simple array:
int [] foo = new int[100];
Suppose that, at some later point, you realize that you need capabilities that are not easily handled by the array class. You can simply change the array to an ArrayList:
// Set the initial size:
ArrayList foo = new ArrayList( 100 );
Any hand-coded for loops are broken:
int sum = 0;
for ( int index = 0;
  // won't compile: ArrayList uses Count, not Length
  index < foo.Length;
  index++ )
  // won't compile: foo[ index ] is object, not int.
  sum += foo[ index ];
However, the foreach loop compiles to different code that automatically casts each operand to the proper type. No changes are needed. It's not just changing to standard collections classes, either; any collection type can be used with foreach.
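As a sketch of that point, the two methods below (hypothetical names `SumArray` and `SumList` in a hypothetical `SumDemo` class) differ only in the declared type of `foo`. The foreach statement and its body are character-for-character identical.

```csharp
using System;
using System.Collections;

// Hypothetical helper class for illustration; not part of the original item.
public static class SumDemo
{
    // Before the change: foo is an array.
    public static int SumArray(int[] foo)
    {
        int sum = 0;
        foreach (int value in foo)
            sum += value;
        return sum;
    }

    // After the change: foo is an ArrayList. The loop is untouched.
    public static int SumList(ArrayList foo)
    {
        int sum = 0;
        foreach (int value in foo)
            sum += value;
        return sum;
    }

    public static void Main()
    {
        int[] asArray = new int[] { 1, 2, 3 };

        // Set the initial capacity, then add the same values.
        ArrayList asList = new ArrayList(100);
        asList.Add(1);
        asList.Add(2);
        asList.Add(3);

        Console.WriteLine(SumArray(asArray)); // 6
        Console.WriteLine(SumList(asList));   // 6
    }
}
```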
Users of your types can use foreach to iterate across members if you support the .NET environment's rules for a collection. For the foreach statement to consider it a collection type, a class must have one of a number of properties. The presence of a public GetEnumerator() method makes a collection class. Explicitly implementing the IEnumerable interface creates a collection type. In both cases, the object that GetEnumerator() returns must provide MoveNext() and Current, which is the contract the IEnumerator interface defines. foreach works with either form.
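The first of those rules can be sketched with a type that implements no collection interface at all. The `Countdown` and `CountdownEnumerator` classes below are hypothetical; all that foreach relies on is the public `GetEnumerator()` method and the `MoveNext()` and `Current` members of the object it returns.

```csharp
using System;

// Hypothetical collection type: no IEnumerable, only a public GetEnumerator().
public class Countdown
{
    private readonly int _start;

    public Countdown(int start)
    {
        _start = start;
    }

    public CountdownEnumerator GetEnumerator()
    {
        return new CountdownEnumerator(_start);
    }
}

// Hypothetical enumerator: no IEnumerator, only MoveNext() and Current.
public class CountdownEnumerator
{
    private int _next;
    private int _current;

    public CountdownEnumerator(int start)
    {
        _next = start;
    }

    // Strongly typed, so iterating causes no boxing.
    public int Current
    {
        get { return _current; }
    }

    public bool MoveNext()
    {
        if (_next <= 0)
            return false;
        _current = _next;
        _next--;
        return true;
    }
}

public static class CountdownDemo
{
    public static int Sum(Countdown source)
    {
        int sum = 0;
        foreach (int value in source)
            sum += value;
        return sum;
    }

    public static void Main()
    {
        foreach (int value in new Countdown(3))
            Console.WriteLine(value);               // 3, then 2, then 1

        Console.WriteLine(Sum(new Countdown(3)));   // 6
    }
}
```

In practice most types implement IEnumerable as well, so that code written against the interface can consume them; the pattern-only form is shown here just to isolate what foreach itself requires.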
foreach has one added benefit regarding resource management. The IEnumerable interface contains one method: GetEnumerator(). The foreach statement on an enumerable type generates the following, with some optimizations:
IEnumerator it = foo.GetEnumerator( ) as IEnumerator;
using ( IDisposable disp = it as IDisposable )
{
  while ( it.MoveNext( ))
  {
    int elem = ( int ) it.Current;
    sum += elem;
  }
}
The compiler automatically optimizes the code in the finally clause if it can determine for certain whether the enumerator implements IDisposable. But for you, it's more important to see that, no matter what, foreach generates correct code.
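The cleanup behavior can be observed with an enumerator that records its own disposal. This is a sketch with hypothetical names (`TrackedSequence`, `TrackedEnumerator`, `DisposeDemo`); it leaves the loop early with `break` to show that Dispose() still runs.

```csharp
using System;
using System.Collections;

// Hypothetical sequence that yields 1, 2, 3 and records enumerator disposal.
public class TrackedSequence : IEnumerable
{
    public bool EnumeratorDisposed;

    public IEnumerator GetEnumerator()
    {
        return new TrackedEnumerator(this);
    }

    private class TrackedEnumerator : IEnumerator, IDisposable
    {
        private readonly TrackedSequence _owner;
        private int _value;

        public TrackedEnumerator(TrackedSequence owner)
        {
            _owner = owner;
        }

        public object Current
        {
            get { return _value; }
        }

        public bool MoveNext()
        {
            _value++;
            return _value <= 3;
        }

        public void Reset()
        {
            _value = 0;
        }

        // Called by the code that foreach generates, even on early exit.
        public void Dispose()
        {
            _owner.EnumeratorDisposed = true;
        }
    }
}

public static class DisposeDemo
{
    public static bool DisposedAfterBreak()
    {
        TrackedSequence sequence = new TrackedSequence();
        foreach (int value in sequence)
        {
            if (value == 2)
                break; // leave the loop before the sequence is exhausted
        }
        return sequence.EnumeratorDisposed;
    }

    public static void Main()
    {
        Console.WriteLine(DisposedAfterBreak()); // True
    }
}
```

A hand-written while loop over `MoveNext()` would need its own try/finally to give the same guarantee.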
foreach is a very versatile statement. It generates the right code for upper and lower bounds in arrays, iterates multidimensional arrays, coerces the operands into the proper type (using the most efficient construct), and, on top of that, generates the most efficient looping constructs. It's the best way to iterate collections. With it, you'll create code that is more likely to last, and it's simpler for you to write in the first place. It's a small productivity improvement, but it adds up over time.