What is a Collection?

October 18, 2006. Posted in: .NET | LINQ

Just found Mads Torgersen's blog on the MSDN site; check out his first serious entry here on what the definition of a collection is and how it affected the C# 3.0/LINQ design.

Rediscovering IEnumerable<T>

October 13, 2006. Posted in: .NET | Architecture | LINQ

My recent experiments with LINQ have led me to rediscover the power of the IEnumerable<T> interface in .NET. As many of you know by now, one of the cornerstones on which LINQ is built, and one of the things that makes it so accessible, is that it fully supports the very basic IEnumerable<T> interface.

This is extremely powerful because it means the barrier to entry to play in the LINQ world is very low: Pretty much all typed containers in .NET support IEnumerable<T> in one way or another, which means that if you're using any of them then you're ready to start using LINQ and take advantage of its powerful query capabilities. Obviously, for more advanced scenarios, LINQ does define a more powerful mechanism in the IQueryable<T> interface.

However, the goodness of IEnumerable<T> does not start or stop with LINQ. It's far better than that, because you can take advantage of its properties right now. The IEnumerable<T> interface has two primary benefits:

  1. It enables a very simple and useful language construct: The "for each" loop.
  2. It is a highly restricted interface.

The second property is the key that makes IEnumerable<T> so nice to use in your designs: Because it serves pretty much only one purpose (to allow iteration), your intent in using it is extremely clear to users of your API, and that in turn gives them more flexibility. It might sound strange at first that using a more restrictive interface leads to more flexibility, so let me explain what I mean.

When you're exposing .NET collections directly as part of your API, either as properties or as arguments or return values of methods, tools like FxCop [1] encourage you to prefer one of the base collection interfaces over the concrete collection types (except if you're using arrays directly). So, for example, instead of List<T> you're encouraged to use IList<T>, and instead of Dictionary<K,V> you're encouraged to use IDictionary<K,V>.

This is a good practice because it gives you some flexibility in switching collection implementations later on if your original choice wasn't the right one, and I've made a habit of trying to follow it whenever I remember to do so (and FxCop is always there to remind me, as well). However, for many scenarios, even the base interfaces are way too open. For a given scenario, you really have to ask yourself: Do I really need the full capabilities expressed by an interface like IList<T>?

For collection properties, I would say that most of the time you do actually need them, so I won't consider them here. However, for method arguments and return values, that's more debatable. A lot of scenarios don't require all those capabilities, and indeed, just the capability to iterate over the collection is more than enough! Those are the scenarios where you will want to restrict your public interface and just use IEnumerable<T> instead.

Method Arguments

If all you want to do is walk over a collection inside one of your methods, then by all means you'll want to restrict your interface and ask for an IEnumerable<T> instead:

void DoSomething(IEnumerable<X> list) {
}

As expressed, the interface for the method presents a clearer intent. For the method implementor, it says: All you're supposed to do with this collection is iterate over it. For the method user, it gives him a lot more flexibility in choosing what he passes in, and gives him a clear idea of what the method does with the collection.
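To make the flexibility argument concrete, here is a minimal sketch. The Totals class and its Sum method are made-up names used only for illustration; the point is that a method asking for IEnumerable<int> can be handed an array, a List<int>, a Queue<int> or a LinkedList<int> without any conversion on the caller's side.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of any library.
public static class Totals
{
    // Only iterates over its argument, so IEnumerable<int> is all it asks for.
    public static int Sum(IEnumerable<int> values)
    {
        int total = 0;
        foreach (int value in values)
        {
            total += value;
        }
        return total;
    }

    // Callers are free to pass whatever container they already have.
    public static int[] SumFromDifferentSources()
    {
        int[] array = { 1, 2, 3 };
        List<int> list = new List<int>(array);
        Queue<int> queue = new Queue<int>(array);
        LinkedList<int> linked = new LinkedList<int>(array);

        return new int[] { Sum(array), Sum(list), Sum(queue), Sum(linked) };
    }
}
```

Had Sum asked for IList<int> instead, the Queue<int> and LinkedList<int> callers would have had to copy their data into a list first.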

This is possibly the case where you will most often want the restricted interface because you, as the developer writing the method, have full control over and knowledge of the intent.

Method Return Values

If you're defining a method that will return a collection, think about what you expect the user to do with the collection you return. Normally, returning the full collection interface here is the right thing to do because if you're returning a "free flying" (i.e. detached) collection, you want your client to be able to do with it whatever it pleases after that.

However, in some scenarios the opposite is true. Sometimes you get to define a method of which you'll be the consumer and not the implementor. For example, if you're defining an interface to act as an extension hook into your own library/application, then you get to design that interface and your own code will consume it, but someone else is going to be implementing it. In those cases, you can apply the same reasoning as in the "Method Arguments" case above! Since you're the consumer and you have intimate knowledge of what you'll do with the collection you get, you can restrict it to IEnumerable<T> and thus give more flexibility to those who are going to be implementing your extension interface.

public IEnumerable<X> GetAll(){
}

One of the nice side effects of having a method defined as returning IEnumerable<T> instead of a collection is that whoever implements it might even do away with returning a collection at all. This is particularly nice because you can then implement the method using the Iterators feature in C# 2.0 [2] with the "yield return" statement instead of manually building a collection and returning it. This can also lead to memory savings under the right circumstances.
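Here is a rough sketch of that idea. INumberSource, ListBackedSource, IteratorSource and Consumer are all hypothetical names invented for this example; the only thing taken from the discussion above is the shape of GetAll(). One implementor builds a list, the other uses an iterator, and the consuming code cannot tell the difference.

```csharp
using System.Collections.Generic;

// Hypothetical extension hook: the library only ever iterates over the result.
public interface INumberSource
{
    IEnumerable<int> GetAll();
}

// One implementor builds a collection and returns it...
public class ListBackedSource : INumberSource
{
    public IEnumerable<int> GetAll()
    {
        List<int> numbers = new List<int>();
        for (int i = 1; i <= 3; i++)
        {
            numbers.Add(i * i);
        }
        return numbers;
    }
}

// ...another uses an iterator and never materializes a collection.
public class IteratorSource : INumberSource
{
    public IEnumerable<int> GetAll()
    {
        for (int i = 1; i <= 3; i++)
        {
            yield return i * i; // produced lazily, one item per MoveNext()
        }
    }
}

public static class Consumer
{
    // The consuming code is identical for both implementations.
    public static int Sum(INumberSource source)
    {
        int total = 0;
        foreach (int n in source.GetAll())
        {
            total += n;
        }
        return total;
    }
}
```

With the iterator version, each value exists only while the consumer is looking at it, which is where the potential memory savings come from when the sequence is long.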

[1] Actually, FxCop does give you a choice: Either use the collection interfaces, or use the collection types in System.Collections.ObjectModel.

[2] It might not be evident at first look, but you can define iterator methods as returning either an IEnumerator<T> or an IEnumerable<T>. Juval Lowy has a good discussion of iterators in his MSDN Magazine article.
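As an illustration of footnote [2], the following sketch shows both kinds of iterator method in one class. Countdown and EvenOnly are hypothetical names chosen for the example.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical type used only for illustration.
public class Countdown : IEnumerable<int>
{
    private readonly int start;

    public Countdown(int start)
    {
        this.start = start;
    }

    // Iterator method returning IEnumerator<T>: implements the interface directly.
    public IEnumerator<int> GetEnumerator()
    {
        for (int i = start; i > 0; i--)
        {
            yield return i;
        }
    }

    // The non-generic version simply forwards to the generic one.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    // Iterator method returning IEnumerable<T>: its result can be used in a foreach.
    public IEnumerable<int> EvenOnly()
    {
        foreach (int i in this)
        {
            if (i % 2 == 0)
            {
                yield return i;
            }
        }
    }
}
```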

Object Initializers in C# 3.0

August 28, 2006. Comments [2]. Posted in: .NET | LINQ

One of the new language features in C# 3.0, thanks to the LINQ stuff, is the new syntax for object and collection initializers. Basically, this feature allows you to create an object instance (i.e. a new expression) and assign values to one or more of its properties in a single expression, or to initialize a collection with a set of values in a single expression, similar to how you already initialize arrays in earlier versions of C#.

Here's an example of using the new syntax:

Person tomas = new Person() { Name="Tomas", Age=28, Hobby="Music" };
Person clone = new Person() { tomas.Hobby, tomas.Name, tomas.Age };

List<Person> people = new List<Person>() { tomas, clone };

The first line initializes a Person object with the default constructor (though you could call a constructor with parameters here), and assigns explicit values to the Name, Age and Hobby properties.

The second line initializes another Person object with the default constructor, and again assigns values to all the properties. However, notice that here we don't specify the names of the properties to initialize explicitly. Instead, the compiler will "guess" which property to initialize by matching the name of the property of the object used in the expression (i.e. tomas.Hobby is matched to the Hobby property).

The third line initializes a collection of type List<Person> with the objects tomas and clone as items in it. Look how the expression is in general very similar to an array initialization expression.
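For reference, here is a self-contained sketch of the explicit forms. As far as I know, the released C# 3.0 compiler only infers member names for anonymous types; for a named type such as Person, a brace list of bare expressions is parsed as a collection initializer instead. The sketch therefore spells out every property name. The Person class and the InitializerDemo wrapper are assumptions, since the post never shows them.

```csharp
using System.Collections.Generic;

// Assumed shape of the Person class; the post does not show its definition.
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
    public string Hobby { get; set; }
}

public static class InitializerDemo
{
    public static List<Person> Build()
    {
        // Object initializer: constructor call followed by property assignments.
        Person tomas = new Person() { Name = "Tomas", Age = 28, Hobby = "Music" };

        // Same form, copying each property explicitly from another instance.
        Person clone = new Person() { Name = tomas.Name, Age = tomas.Age, Hobby = tomas.Hobby };

        // Collection initializer: each element is passed to List<Person>.Add.
        List<Person> people = new List<Person>() { tomas, clone };
        return people;
    }
}
```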

This entire feature can be quite handy and I believe it can significantly improve the readability of the code when used correctly. Personally, I prefer to use the explicit form as much as possible, in which you clearly specify which property you're assigning a value to (i.e. Hobby=tomas.Hobby), particularly when using the very similar syntax to initialize anonymous types:

var minime = new { Name=tomas.Name, Age=tomas.Age/2 };
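Anonymous types are where the name matching does apply. A small sketch, with AnonymousTypeDemo and the local variable names being illustrative only, shows the explicit and the inferred forms side by side:

```csharp
public static class AnonymousTypeDemo
{
    public static string Describe()
    {
        var tomas = new { Name = "Tomas", Age = 28, Hobby = "Music" };

        // Explicit names: every property of the new type is spelled out.
        var minime = new { Name = tomas.Name, Age = tomas.Age / 2 };

        // Inferred names: the compiler calls the properties Name and Hobby,
        // after the members used to initialize them.
        var summary = new { tomas.Name, tomas.Hobby };

        return minime.Name + " " + minime.Age + " " + summary.Hobby;
    }
}
```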

One thing did strike me as curious about the new object initialization syntax in C# 3.0: C# already had a special syntax that did something very similar, namely the syntax for attributes:

[Attribute(1, Property=value)]

While this was a very restricted syntax, it served much of the same purpose, and it was somewhat curious to me why a different syntax was used instead of reusing the existing one. That said, I think I can guess at some of the reasons this was not done:

  1. The new object initialization syntax is more in line with the array initialization syntax.
  2. The new syntax clearly separates what is being initialized as part of the object construction process (i.e. everything in the ()) and what is being initialized with property setters (i.e. everything in the {}).

Number 2 in particular might be important because it is more explicit when you introduce the whole "automatic matching of the member being used for initialization to the name of the member being initialized" feature. If the existing attribute initialization syntax had been reused, this would've been far more confusing, as there would be no easy way to spot whether a given initializing expression was actually using one of the available constructors or was implicitly initializing a given property, which would've significantly reduced the readability of the language in these cases.
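To put the two syntaxes next to each other, here is a sketch built around a made-up attribute. NoteAttribute, Report and SyntaxComparison are hypothetical names; the same class is initialized once through attribute syntax and once through an object initializer.

```csharp
using System;

// Hypothetical attribute, defined only to illustrate the syntax.
[AttributeUsage(AttributeTargets.Class)]
public class NoteAttribute : Attribute
{
    private readonly int priority;

    public NoteAttribute(int priority)
    {
        this.priority = priority;
    }

    public int Priority
    {
        get { return priority; }
    }

    public string Text { get; set; }
}

// Attribute syntax: constructor arguments and property assignments share one list.
[Note(1, Text = "reviewed")]
public class Report
{
}

public static class SyntaxComparison
{
    // Object initializer syntax: constructor arguments in (), property setters in {}.
    public static NoteAttribute CreateDirectly()
    {
        return new NoteAttribute(1) { Text = "reviewed" };
    }

    // Reads back the attribute applied to Report through reflection.
    public static NoteAttribute ReadFromReport()
    {
        object[] found = typeof(Report).GetCustomAttributes(typeof(NoteAttribute), false);
        return (NoteAttribute)found[0];
    }
}
```

Both paths end up with an instance whose Priority came from the constructor and whose Text came from a property setter; only the initializer form makes that split visible at the call site.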

That said, I don't care all that much for the automatic property name guessing; I much prefer to be explicit as to what property I'm initializing.

About

Tomas Restrepo is co-founder of devdeo ltda. His interests include .NET, Connected Systems, PowerShell and, lately, dynamic programming languages.

email: tomas@winterdom.com
msn: tomasr@passport.com
twitter: tomas_restrepo

Copyright © 2002-2008, Tomas Restrepo.
