by skeet via Jon Skeet: Coding Blog on 7/27/2010 6:52:23 PM
Do you know the hardest thing about presenting code with surprising results? It's hard to do so without effectively inviting readers to look for the trick. Not that that's always enough - I failed the last of Neal and Eric's C# puzzlers at NDC, for example. (If you haven't already watched the video, please do so now. It's far more entertaining than this blog post.) Anyway, this one may be obvious to some of you, but there are some interesting aspects even when you've got the twist, as it were.
What does the following code print?
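The listing itself did not survive in this copy of the post. What follows is a reconstruction from the walkthrough below, not necessarily the original text. The class name `WeirdIterators` and the exact layout are assumptions; `values`, `iterator`, `ShowNext()`, the `using` statement and the three calls are all taken from the discussion that follows.

```csharp
using System;
using System.Collections.Generic;

// Reconstructed puzzle; the class name is an assumption.
public class WeirdIterators
{
    // Moves the iterator on by one step and prints what it finds.
    static void ShowNext(IEnumerator<int> iterator)
    {
        if (iterator.MoveNext())
        {
            Console.WriteLine(iterator.Current);
        }
        else
        {
            Console.WriteLine("Done");
        }
    }

    public static void Main()
    {
        List<int> values = new List<int> { 1, 2 };
        // 'var' takes its type from the declared return type of GetEnumerator().
        using (var iterator = values.GetEnumerator())
        {
            ShowNext(iterator);
            ShowNext(iterator);
            ShowNext(iterator);
        }
    }
}
```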
If you guessed "1, 2, Done" despite the title of the post and the hints that it was surprising, then you're at least brave and firm in your beliefs. I suspect most readers will correctly guess that it prints "1, 1, 1" - but I also suspect some of you won't have worked out why.
Let's look at the signature of List<T>.GetEnumerator(). We'd expect it to be
public IEnumerator<T> GetEnumerator()
right? That's what IEnumerable<T> says it'll look like. Well, no. List<T> uses explicit interface implementation for IEnumerable<T>. The signature actually looks like this:
public List<T>.Enumerator GetEnumerator()
Hmm... that's an unfamiliar type. Let's have another look in MSDN...
public struct Enumerator : IEnumerator<T>, IDisposable, IEnumerator
(It's nested in List<T> of course.) Now that's odd... it's a struct. You don't see many custom structs around, beyond the familiar ones in the System namespace. And hang on, don't iterators fundamentally have to be mutable?
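If you'd rather check this than take the documentation's word for it, a quick reflection sketch will do. The class name `EnumeratorTypeCheck` is made up for illustration; everything it calls is standard framework API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, only here to host the check.
public static class EnumeratorTypeCheck
{
    public static void Main()
    {
        var values = new List<int> { 1, 2 };
        var iterator = values.GetEnumerator();

        Type type = iterator.GetType();
        Console.WriteLine(type);                                      // the nested Enumerator type
        Console.WriteLine(type.IsValueType);                          // True: it's a struct
        Console.WriteLine(type == typeof(List<int>.Enumerator));      // True
    }
}
```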
Ah. "Mutable value type" - a phrase to strike terror into the hearts of right-headed .NET developers everywhere.
So what's going on? If we're losing all the changes to the value, why is it printing "1, 1, 1" instead of throwing an exception due to printing out Current without first moving?
Well, we're fetching the iterator into a variable of type List<int>.Enumerator, and then calling ShowNext() three times. On each call, the value is boxed (creating a copy), and the reference to the box is passed to ShowNext().
Within ShowNext(), the value within the box changes when we call MoveNext() - which is how it's able to get the real first element with Current. So that mutation isn't lost... until we return from the method. The box is now eligible for garbage collection, and no change has been made to the iterator variable's value. On the next call to ShowNext(), a new box is created and we see the first item again...
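The copying can be seen in isolation, without any method call at all. This is only a sketch (the class name `BoxingCopies` and the variables `firstBox` and `secondBox` are invented for the purpose), but it shows that every conversion from the struct to the interface type produces a separate box, and that mutating one box touches neither the other box nor the original variable.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class; names are illustrative only.
public static class BoxingCopies
{
    public static void Main()
    {
        var values = new List<int> { 1, 2 };
        List<int>.Enumerator iterator = values.GetEnumerator();

        // Each conversion to the interface boxes a fresh copy of the struct.
        IEnumerator<int> firstBox = iterator;
        IEnumerator<int> secondBox = iterator;
        Console.WriteLine(object.ReferenceEquals(firstBox, secondBox)); // False

        // Advance the first box twice; the changes stay inside that box.
        firstBox.MoveNext();
        firstBox.MoveNext();
        Console.WriteLine(firstBox.Current);  // 2

        // The second box still starts from the beginning...
        secondBox.MoveNext();
        Console.WriteLine(secondBox.Current); // 1

        // ...and so does the original variable.
        iterator.MoveNext();
        Console.WriteLine(iterator.Current);  // 1
    }
}
```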
There are various things we can do to fix the code - or at least, to make it display "1, 2, Done". We can then find other ways of breaking it again :)
How does the compiler work out the type of the iterator variable? Why, it looks at the return type of values.GetEnumerator(). And how does it find that? It looks at the type of the values variable, and then finds the GetEnumerator() method. In this case it finds List<int>.GetEnumerator(), so it makes the iterator variable type List<int>.Enumerator.
Suppose we just change values to be of type IList<int> (or IEnumerable<int>, or ICollection<int>) and leave everything else alone.
The compiler uses the interface implementation of GetEnumerator() on List<T>. Now that could return a different type entirely - but it actually returns a boxed List<T>.Enumerator. We can see that by just printing out iterator.GetType().
So if it's just returning the same value as before, why does it work?
Well, this time we're boxing once - the iterator gets boxed on its way out of the GetEnumerator() method, and the same box is used for all three calls to ShowNext(). No extra copies are created, and the changes within the box don't get lost.
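A sketch of this first fix, for anyone who wants to run it. The class name `BoxOnce` is an assumption, and `ShowNext()` is written as the post describes it; the only meaningful change from the puzzle is the declared type of `values`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class name; only the type of 'values' differs from the puzzle.
public static class BoxOnce
{
    static void ShowNext(IEnumerator<int> iterator)
    {
        if (iterator.MoveNext())
        {
            Console.WriteLine(iterator.Current);
        }
        else
        {
            Console.WriteLine("Done");
        }
    }

    public static void Main()
    {
        IList<int> values = new List<int> { 1, 2 };
        // 'var' is now IEnumerator<int>, so the struct is boxed exactly once,
        // on its way out of GetEnumerator().
        using (var iterator = values.GetEnumerator())
        {
            Console.WriteLine(iterator.GetType()); // still the nested Enumerator type
            ShowNext(iterator); // 1
            ShowNext(iterator); // 2
            ShowNext(iterator); // Done
        }
    }
}
```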
This is exactly the same as the previous fix - except we don't need to change the type of values. We can just explicitly state the type of iterator as IEnumerator<int> instead of letting the compiler infer it.
The reason this works is the same as before - we box once, and the changes within the box don't get lost.
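As a sketch, with the class name `ExplicitIteratorType` made up for illustration, the second fix looks something like this. `values` stays a `List<int>`; the conversion to `IEnumerator<int>` in the declaration is where the single boxing operation happens.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class name; only the declared type of 'iterator' has changed.
public static class ExplicitIteratorType
{
    static void ShowNext(IEnumerator<int> iterator)
    {
        if (iterator.MoveNext())
        {
            Console.WriteLine(iterator.Current);
        }
        else
        {
            Console.WriteLine("Done");
        }
    }

    public static void Main()
    {
        List<int> values = new List<int> { 1, 2 };
        // The struct is boxed once here, when it is converted to the interface.
        using (IEnumerator<int> iterator = values.GetEnumerator())
        {
            ShowNext(iterator); // 1
            ShowNext(iterator); // 2
            ShowNext(iterator); // Done
        }
    }
}
```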
The initial problem was due to the mutations involved in ShowNext() getting lost due to repeated boxing. We've seen how to solve it by reducing the number of boxing operations down to one, but can we remove them entirely?
Well yes, we can. If we want changes to the value of the parameter in ShowNext() to be propagated back to the caller, we just need to pass the variable by reference. When passing by reference the parameter and argument types have to match exactly of course, so we can't leave the iterator variable as type List<T>.Enumerator without changing the parameter type. Now we could explicitly change the type of the parameter to List<T>.Enumerator - but that would tie our implementation down rather a lot, wouldn't it? Let's use generics instead: a type parameter T constrained to IEnumerator<int>, with the iterator passed as ref T.
Now we can pass iterator by reference and the compiler will infer the type. The interface members (MoveNext() and Current) will be called using constrained calls, so there's no boxing involved...
... except that when you try to just change the method calls to use ref, it doesn't work - because apparently you can't pass a "using variable" by reference. I'd never come across that rule before. Interesting. Fortunately, we can (roughly) expand out the using statement ourselves, like this:
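The listing is missing from this copy, so here is a sketch of what the generic, by-reference version might look like with the `using` statement expanded by hand into `try`/`finally`. The class name `PassByReference` is an assumption, and the expansion is only the rough one the text describes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class name; a sketch of the pass-by-reference approach.
public static class PassByReference
{
    // T is inferred as List<int>.Enumerator at the call site, so MoveNext()
    // and Current are invoked on the caller's variable, with no boxing.
    static void ShowNext<T>(ref T iterator) where T : IEnumerator<int>
    {
        if (iterator.MoveNext())
        {
            Console.WriteLine(iterator.Current);
        }
        else
        {
            Console.WriteLine("Done");
        }
    }

    public static void Main()
    {
        List<int> values = new List<int> { 1, 2 };
        // An ordinary local, not a using variable, so it can be passed by ref.
        var iterator = values.GetEnumerator();
        try
        {
            ShowNext(ref iterator); // 1
            ShowNext(ref iterator); // 2
            ShowNext(ref iterator); // Done
        }
        finally
        {
            iterator.Dispose();
        }
    }
}
```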
Again, this fixes the problem - and this time there's no boxing involved.
Let's quickly look at one more example of it not working, before I finish...
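What happens if we change the type of iterator to dynamic (and set everything else back the way it was)? I'll readily admit, I really didn't know what was going to happen here. There are two competing forces. First, dynamic is really just object behind the scenes, which suggests the value gets boxed once and the code ought to work. Second, dynamic typing is meant to behave as if the real type had been known all along, which suggests the value type copying ought to be preserved and the code ought to fail.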
What actually happens? I believe it actually boxes the value returned from GetEnumerator() - and then the C# binder (working on top of the DLR) makes sure that the value type behaviour is preserved by copying the box before passing it to ShowNext(). In other words, both bits of intuition are right, but the second effectively overrules the first. Wow. (See the comment below from Chris Burrows for more information about this. I'm sure he's right that it's the only design that makes sense. This is a pretty pathological example in various ways.)
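For anyone who wants to try the dynamic variant, a minimal sketch follows. The class name `DynamicIterator` is invented, and the `using` statement has been dropped to keep the example small, which is a simplification rather than what the post describes. According to the explanation above, the copy made on each call means every call should start again from the first element.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class name; the 'using' statement is omitted for brevity.
public static class DynamicIterator
{
    static void ShowNext(IEnumerator<int> iterator)
    {
        if (iterator.MoveNext())
        {
            Console.WriteLine(iterator.Current);
        }
        else
        {
            Console.WriteLine("Done");
        }
    }

    public static void Main()
    {
        List<int> values = new List<int> { 1, 2 };
        // The struct is boxed into the dynamic variable here.
        dynamic iterator = values.GetEnumerator();

        // Each call is bound at execution time against the value's real type.
        ShowNext(iterator);
        ShowNext(iterator);
        ShowNext(iterator);
    }
}
```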
Just say "no" to mutable value types. They do weird things.
(Fortunately the vast majority of the time this particular one won't be a problem - it's rare to use iterators explicitly in the first place, and when you do you very rarely pass them to another method.)
Original Post: Iterate, damn you!