While using LINQ, you may want to choose unique items in a list with the DISTINCT clause. To use this clause with your objects or projections or an object that represents stored proc. results, you will need to provide the logic used to compare 2 object instances and return if they are equal or not. This is because .NET does not inherently know how to compare these objects.

Lets say you have a Person class:

public class Person {
    public long ID;
    public string firstName;
    public string lastName;

The comparison logic is defined in a class PersonComparer needs to implement the IEqualityComparer interface that has 2 methods as shown.

public class PersonComparer : IEqualityComparer<Person>
    #region IEqualityComparer<Person> Members

    public bool Equals(Person object1, Person object2)

        //return true if equal
        return object1.firstName.Equals(object2.firstName)
               && object1.lastName.Equals(object2.lastName);


    public int GetHashCode(Person obj)
        return obj.ID.GetHashCode();


Now to use this, let’s say we have a list of Person objects IEnumerable<Person> lstPerson, and we want to get unique objects in this.

Declare an instance of your Comparer object

PersonComparer myComparer = new PersonComparer();

and then use it with the Distinct clause

IEnumerable<Person> lstPersonDistinct = lstPerson.Distinct(myComparer)

 If you want to go deeper, Distinct is defined in the .NET framework as an extension on class Enumerable.

public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source, IEqualityComparer<TSource> comparer);

So essentially, the interface (IEqualityComparer) of myComparer is used in your Distinct call.

Hope this post helps you to implement Distinct for your objects.


I see a lot of questions and articles about doing CRUD operations with LINQ to SQL (DLinq), especially about Updates, DAL design, etc., and think a little guidance could help a few people…
This article only focuses on update scenarios, typically encountered with web/async apps.
Let’s start with a simple scenario without Transactions or multiple Data Contexts.

Say we want to update a Person’s name in our Employee database, from our app.

First retrieve the Person row, then Update it

dbPerson p = (from p in myContext.Persons where p.Id = myId select p).First()
p.Name = “foo”;

Note: Do not use Attach unless your database object, in this case p, comes from another Data Context instance. If it is in a single unit of work, there is no need to use Attach, and it can get pretty confusing if used incorrectly.

In the update above, LINQ takes an optimistic concurrency approach under the covers, and generates a SQL with a complex Where clause, that essentially compares all the fields from your passed in object to the ones in the database.

This provides a concurrency check, whereby if another user snuck in and updated the Person record, the update will fail because the old value checks will detect that the underlying SQL data has changed after the fetch.
This is a good thing, but there is an overhead of the Where clause, which is proportional to the number of columns. So, if you had 30 columns in the Person table, the Where clause would include 30 comparisons.

Although this might be jarring, if the Update is not used that often a little overhead  is fine, and that’s the price you pay for the convenience of LINQ.
However, at the other extreme end, if the Update operation is on a critical path and needs to be perform like a dragster with clean pipes, use a stored procedure instead.
As an Architect, I recommend that you first understand your app’s profile because software design is always subjective.

A middle ground, without using stored procs but improving performance by avoid the various Where clause checks, is to use a Timestamp column in DLinq as follows…


update linq Let’s say we have a LastModifiedAt column in the database, to use as a timestamp.

As shown, select it in the OR/M designer and View Properties.
Set the Timestamp property to True, as shown.

Now, DLinq will use this column for concurrency checks instead of the in-efficient Where clause


Remember to set the default value of the timestamp column in your database to getUtcDate() or something, and don’t update this column with LINQ.

Note the Property values that are there. They can be set to improve the efficiency for different scenarios. For e.g. I set Auto-sync to Never because I don’t update an Entity multiple times, so I don’t need it to be updated in LINQ every time.

In the future, I will explore Transactions and multiple Data Contexts when it comes to DAL design.

In this post, I will lay out a basic Design Pattern for you to show data in a View.

Let’s start from the web request. This is typically routed to a method on a Controller.
In the method, you would typically execute some business logic and that may call down in the data layer that accesses the database.
After all that, you would end up returning a View of some sort, that is then rendered.
http request –> Controller.action –> returns View –> rendered by Framework

Controller Action

In the Action/method, let’s call a business method that returns a PersonModel for a given Id.
The business method could check for Authorization, etc., and maybe call a method GetPerson in the Data Layer. You could have a thin DataHelper layer that would translate Database entities to Business objects/Models; it provides re-usability to the translation functionality, for consumption by multiple business objects.

Action –> DataHelper –> Data Ops -> Database

Database Operations

With .NET 4.0, Microsoft is pushing us away from LINQ to SQL and towards Entity Data framework which is greatly improved. So, that’s what we’ll use.
Choose Project: Add new item: (I typically click on the Data group to filter): ADO.NET Entity Data Model.
It’s pretty straightforward to connect to a database, etc., and after you’re done, drag tables from the Server Explorer onto the ER designer surface. In our example, you would drag the Person table onto it, and that would create a Person Entity object mapping to the database.

Let’s say you add a GetPerson method to the DataHelper that takes an id, and returns a PersonModel, and the method access the database to get the data.

GetPerson –> GetDBPerson(id) –> Create and populate a PersonModel

The getDBPerson could be a method that fires a LINQ query and returns the ER entity.
For e.g.

MyContext db = new MyContext();

var pers = from p in db.Persons
                     where p.Id = id
                     select p.

return p.FirstOrDefault();

GetPerson could then translate the ER Entity to a model.

PersonModel pers = new PersonModel
                      Name = dbPers.FirstName;
return pers;

So, our business method now has the (populated) model.

The controller needs to take this and return a View.

Returning Typed Views

You may know MVC 2 supports strongly typed Views. What this means is that you can create a View based on a typed object, and directly access that object in the aspx code while rendering the View. This provides compile-time checks that make you code more robust and tightly coupled, and also provides Intellisense benefits.

This is done by a View inheriting from a typed ViewPage, which is declared in the View markup by:


The View is created like this:

PersonModel person = DataHelper.GetPerson(12345);

return View(person);

In the markup, you can access the model (named as model) directly.
For e.g. <%: model.FirstName) %>

In this case, model is of type Person.

Hopefully now you understand typed Views, and have a basic pattern for showing database data on the web page from a View.

As you know, data operations on a database that need to execute together should always be in a transaction, in order to maintain data integrity.

LINQ automatically uses database transactions.
So, when you call UpdateChanges(), all LINQ operations done from creation of the data context or the last UpdateChanges are executed within a database Transaction.

Unlike what many developers think, you do not need to use TransactionScope if you just use a single call to UpdateChanges(), after a number of InsertOnSubmit(), DeleteOnSubmit(), etc..

If you do use multiple UpdateChanges(), you will need to manually handle the transactions by wrapping the Operations in a TransactionScope block. My previous post has a scenario in which you would need to do this.

Contrary to what you might think, SQL that gets executed on a database as a result of LINQ operations, might not be in the order of your LINQ calls.

Let’s consider a scenario in which Customer data needs to be updated. You’d like to delete the old row and then insert the new data rather than an Update, because of some triggers which run some complex business logic.

Unfortunately, a sequence of an Insert followed by a Delete does not execute on SQL Server like you would expect. For some reason, LINQ always executes the DELETE operation at the end, no matter where it was in the execution sequence.
So, a Delete and Insert in your LINQ would execute as an Insert and a Delete on SQL Server. This could cause problems like Primary key violations, depending on your table design.
There is no way to get around this, except by calling SubmitChanges() after the different Operations.
So, you would have Delete, DataContext.SubmitChanges and then Insert and DataContext.SubmitChanges. This would preserve the sequence of Deleting before an Insert.

To maintain data integrity, the executions should be in a Transaction. Since you call SubmitChanges() more than once, you need to manually create/manage the Transaction.

This is done by enclosing the operations in a block:

using (var ts = new TransactionScope())
    try {
   ts.Complete();   //commit the transaction

         ts.Dispose();   //rollback

Note: To use TransactionScope, you need to reference System.Transactions.

If you are using SQL Profiler to watch the SQL being executed, note that you will always see the Update and Insert statements irrespective of whether the Transaction was rolled back or not, because SQL Profiler does not indicate Transactions, but let that not confuse you. The data will not be affected in case of a Rollback.

If you use LINQ in a data-access component, and are looking to manage connection strings in various projects and environments, this article is for you…

When you create a new ER mapping in LINQ (as a dbml), you may end up with the connection string in the app.config file of your DA component. When your website or other client project uses this component, it can get quite confusing to know where the connection string is picked up from, and you most likely have pulled out your hair out in that process.
It is likely that the connection string is actually hard-coded in your Settings.Designer.cs file in the Properties folder in our project. There is a default setting attribute for the connection key property in your settings class, and if the required key is not found in your configs at run-time, this default value will be used.
So, what are the best practices to specify connection strings for LINQ? Here is my solution…

The key is to have the right settings key (no pun intended) in the connectionStrings section in your client config file.
Something like:

  <add name=”NamespaceFoo.SettingFoo.ConnectionFoo” connectionString=”DataSource=xxx;blah-blah”
    providerName=”System.Data.SqlClient” />

To figure out the precise key name, open your Settings.Designer.cs file. Note the Namespace, the partial class name and the name of the related connection string Property. Concatenate these together with periods and you have your config key. e.g. Outerspace.Settings.EarthDBConnection

Add this setting to the config file of your calling project, e.g. web.config of your main website. At run-time, the connection string value for that key will be used by your LINQ data context.

As you’re aware, different environments like dev, integration, qa, staging, production, etc. have different connection strings. You could change the value of the connection string but how do you maintain the various values? In the past, I have had these all in the config files and commented and uncommented them as required, but that was clunky and exposed them to everybody.
A good solution to this is to have different config files for your environments with the required security privileges and to specify which one to use in your config, like so:

<connectionStrings configSource=”dev.config” />

Create a dev.config in the same folder and specify the value in there. Note, that it should be the entire xml node

  <add name=”xxx” connectionString=”yyy”
    providerName=”System.Data.SqlClient” />

For QA, create another file qa.config with the required settings, and simply change the configSource value to point to qa.config. That way, you just need to change this value in a post-deployment step. Of course, your constraints for security and who sees the config files may be different but you have all the building blocks to adapt to your modus-operandi.
Hope that helps…