Friday, February 29, 2008

Microsoft's ADO.NET Team readies Entity Framework and Tools 1.0 for release as a VS 2008 add-in with enterprise-level features that LINQ to SQL doesn't offer -- domain object modeling, flexible inheritance techniques, multiple database vendors, and do-it-yourself n-tier deployment.

 

The ADO.NET Entity Framework (EF) will be Microsoft's first production-quality object/relational mapping (O/RM) platform that promises to be fully competitive with entrenched open source or commercial O/RM tools and object-persistence code generators for .NET. The goal of these tools is to minimize the time and effort required to transfer business objects to and from storage in relational databases. O/RM tools reduce the programming disconnect between relational and object models and, in many cases, generate the class code for the business objects. O/RM platforms -- often combined with automated, template-based Web-site generation frameworks (scaffold generators), such as Ruby on Rails and SubSonic -- have become critical components of professional developers' toolkits. .NET O/RM tools and code generators are a growth industry: In early 2008, the Sharp Toolbox's .NET Object-Relational Mapping category listed 48 products, not including EF and LINQ to SQL, most of which are commercial offerings. NHibernate and NPersist are widely used open source O/RM tools for .NET; LLBLGen Pro and WilsonORMapper are popular commercial offerings.

 

The Entity Data Model (EDM), Entity SQL (eSQL) query language, and LINQ to Entities are the components that distinguish EF from other .NET O/RMs. Microsoft released EF beta 3 and the EF Tools community technical preview (CTP) 2 in early December 2007; you can expect the release to manufacturing (RTM) versions in the first half of 2008. I'll give you a quick EF refresher, describe what's new in EF beta 3 and EF Tools CTP 2, show you how to quickly create an EDM from the Northwind sample database, demonstrate simple eSQL and LINQ to Entities queries, preview the ADO.NET Team's EntityBag components for n-tier EF deployment, and compare EF with LINQ to SQL for production object persistence use.

EDM is an entity-relationship (ER) data model that's based on the pioneering work of Dr. Peter Chen, who introduced the concept as a "Unified View of Data" in a 1976 Association for Computing Machinery paper (see Additional Resources). EDM defines entities as instances of EntityTypes (Order, for example) and EntitySets as keyed collections of entities (Orders). An EntityKey (usually--but not necessarily--a primary key, such as OrderID) implements object identity with an EntityReference, which uniquely identifies an entity instance for creating, retrieving, updating, or deleting (CRUD operations) and prevents creating duplicate instances in memory. The EntityKey lets entities participate in relationships, which are logical connections between entities called associations. EntitySets implement 1:many associations (Customer.Orders) and EntityReferences (Order.Employee) implement many:1 associations. NavigationProperty instances identify entities at either End of an Association (see Figure 1). The Multiplicity of an association's End is indicated by 0..1 for zero or one, 1 for exactly one, and * for many. Unfortunately, EF also introduces a terminology disconnect for relationally oriented .NET developers.

Layer Logical and Conceptual Schemas
Three layers of XML mapping files implement the EDM, which minimizes dependency of domain object design on the underlying database schema. An EF-enabled ADO.NET data store provider connects the database instance to the logical data store layer, which a ModelName.ssdl schema file matches to the physical database schema. A ModelName.msl mapping schema file defines the relationship between the logical layer and the conceptual layer's ModelName.csdl schema that defines the EDM (see Figure 2). The mapping schema isolates the EDM at the conceptual layer from subsequent changes to the database schema or vice versa and enables support for the three common domain object inheritance models: table per hierarchy (TPH), table per type (TPT), and table per concrete type (TPCT). It's possible to design your domain model in EF and then implement the database and mapping schemas by hand. However, it's much more practical to use EF v1's graphical mapping tools and code generator to create the mapping schemas and classes.

The ADO.NET Team describes EntityClient as an ADO.NET data provider that's "a gateway for entity-level queries." EntityClient executes queries against the EDM's conceptual layer by using its own provider-agnostic query language, eSQL, with the familiar Connection, Command, and Parameter objects whose names carry an Entity prefix. EntityCommands that you run against EntityClient can execute eSQL text and parameterized queries as well as stored procedures. The EntityDataReader returns tabular or hierarchical DbDataReader objects, depending on the query and associations of the affected entities. eSQL is a SQL dialect that extends ANSI SQL with object-oriented keywords that support EntitySets, EntityTypes, and composability, but it doesn't include join-related commands. Joins are implemented by the eSQL NAVIGATE operator, which operates on associations.

Here's a simple EntityCommand against the nwEntities EntityContainer that's shown in Figure 1's Model Browser pane:

EntityCommand eCmd = nwEntities.CreateCommand();
eCmd.CommandText = @" SELECT VALUE o
    FROM nwEntities.Orders AS o
    WHERE o.ShipCountry = 'USA'
    ORDER BY o.OrderDate DESC";

eSQL requires aliases, such as o in the preceding example, and its SELECT VALUE clause tells the query processor to return a collection (Order entities) instead of implicitly wrapped rows. SELECT VALUE queries are limited to returning a sequence of a single EntityType and don't support projections, such as a column list. eSQL doesn't support T-SQL's * wildcard to represent all columns of a table; Count(*) becomes Count(0) in eSQL.

In its query pipeline, EntityClient parses the eSQL query, validates it against the conceptual model, and then sends the query to the database-specific data provider in the form of a Canonical Query Tree (CQT). The provider translates the CQT to the database's SQL dialect.

Object Services Does the Heavy Lifting
Object Services is the EF's top layer; it orchestrates O/RM operations and data transfers. Object Services autogenerates the code for CLR partial classes that define strongly typed EntityTypes and EntitySets<EntityType> collections. Object Services implements querying with LINQ or eSQL, object-identity management based on EntityKeys, event-based change tracking for object state management, eager and lazy loading of related entities, value-based optimistic concurrency management, and routine CRUD operations. eSQL v1.0 doesn't include SQL-style data manipulation language (DML) commands, so you must manipulate domain objects directly to emulate traditional SQL CRUD commands; alternatively, both EntityClient and Object Services support stored procedures for CRUD operations. Object Services dramatically reduces the amount of planning, design, and programming effort required to manage the transition to and from a relational object persistence store.

ObjectContext is Object Services' top-level object and contains an EntityConnection to the data model. In fact, when you name the connection string in the Entity Data Model Wizard, you name the derived EntityContainer type for the model; the default is DatabaseNameEntities. The connection string contains a metadata section that specifies the names and locations of the .CSDL, .SSDL, and .MSL files separated by the pipe symbol (|) and the connection string or fully qualified name of the EF-enabled store-specific data provider. The remainder of the string corresponds to a conventional ADO.NET connection string plus the EntityClient providerName; this example utilizes the Northwind database running on SQL Server 2005+:

connectionString="metadata=.\Northwind.csdl|.\Northwind.ssdl|.\Northwind.msl;
    provider=System.Data.SqlClient;
    provider connection string=&quot;Data Source=localhost\SQLEXPRESS;
    Initial Catalog=Northwind;
    Integrated Security=True;
    MultipleActiveResultSets=True&quot;"
    providerName="System.Data.EntityClient"

Entity Framework Tools CTP 2 and later add MultipleActiveResultSets=True to enable MARS automatically for SQL Server 2005 and later. By default, the EDM Wizard places connection strings in the App.config or Web.config file.

ObjectContext is represented by the autogenerated EntityContainer derived from it; it's also the EF's programming target. ObjectContext encapsulates the metadata defined by the conceptual schema (ModelName.csdl) in a MetadataWorkspace object and the ObjectStateManager that's responsible for identifying and managing entity instances in the memory cache. The EntityContainer holds the autogenerated EntitySets and AssociationSets, which act as the data source for an ObjectQuery that adds entity instances to the cache. Persisting updates to entity instances made by modifying their properties or adding them to or deleting them from EntitySets requires executing the ObjectContext.SaveChanges() method. ObjectContext supports the Unit of Work pattern with an implicit local DbTransaction or by explicitly enlisting in a distributed System.Transactions transaction.

ObjectQuery implements IQueryable<T> and IEnumerable<T> and supports two eSQL query expression formats: query-string and query-builder methods. This is the eSQL query-string version of the earlier EntityCommand example for nwEntities that returns an ObjectQuery<Order> for iteration in a foreach loop:

string query = @" SELECT VALUE o
    FROM nwEntities.Orders AS o
    WHERE o.ShipCountry = 'USA'
    ORDER BY o.OrderDate DESC";
ObjectQuery<Order> orderQuery =
    new ObjectQuery<Order>(query, nwEntities, MergeOption.NoTracking);

ObjectQueries share LINQ queries' lazy execution feature. The preceding query won't execute until it's iterated in a foreach loop, assigned to a List<T> collection, or run explicitly by invoking the Execute() method. The MergeOption enumeration offers AppendOnly (default), NoTracking, OverwriteChanges, and PreserveChanges options for concurrency management. NoTracking makes no changes to the ObjectStateManager when the query executes.
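The deferred execution described above can be observed without a database. What follows is a minimal sketch that uses LINQ to Objects over an in-memory list as a stand-in for an ObjectQuery; the Order class and the DeferredExecutionDemo type are hypothetical and aren't part of the EF-generated model or the article's sample code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the generated Order EntityType.
public class Order
{
    public int OrderID { get; set; }
    public string ShipCountry { get; set; }
}

public static class DeferredExecutionDemo
{
    // Returns { count captured by ToList(), count seen on re-enumeration }.
    public static int[] Run()
    {
        var orders = new List<Order>
        {
            new Order { OrderID = 1, ShipCountry = "USA" },
            new Order { OrderID = 2, ShipCountry = "UK" }
        };

        // Defining the query doesn't run it.
        IEnumerable<Order> usOrders = orders.Where(o => o.ShipCountry == "USA");

        // ToList() forces execution now and snapshots the result.
        List<Order> snapshot = usOrders.ToList();

        // Change the source after the snapshot was taken.
        orders.Add(new Order { OrderID = 3, ShipCountry = "USA" });

        // Count() enumerates the query again, so it sees the new order.
        return new[] { snapshot.Count, usOrders.Count() };
    }

    public static void Main()
    {
        int[] counts = Run();
        // Prints "snapshot: 1, re-enumerated: 2"
        Console.WriteLine("snapshot: {0}, re-enumerated: {1}", counts[0], counts[1]);
    }
}
```

With an ObjectQuery, each enumeration would presumably mean another round trip to the store, which is one reason to materialize the results once with ToList() when you intend to bind or reuse them.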

The query-builder method emulates LINQ's method call syntax to enable composable queries:

ObjectQuery<Order> orderQuery =
    nwEntities.Orders
        .Where("it.ShipCountry = 'USA'")
        .OrderBy("it.OrderDate DESC");

The "Query Builder Method (Entity Framework)" topic of the Entity Framework API online help file has a list of query-builder methods and their eSQL command counterparts. (The EF Tools setup program installs the help file as an ADO.NET Entity Framework Tools Preview menu choice.) Each query-builder ObjectQuery execution returns an ObjectQuery, which enables query chaining. The output of a query-builder method query is an ObjectQuery<T> that can be the data source for another ObjectQuery<T>. You can mix and match LINQ to Entities' Standard Query Operators (SQOs) with query-builder methods, but adding an SQO returns IQueryable<T>, not ObjectQuery<T>.

LINQ to Entities is the EDM's most important query technique, primarily because you don't need to master eSQL to return the entities or scalar values you want. Like EntityClient, LINQ to Entities uses a command tree to communicate queries to the store-specific ADO.NET data provider, which translates the command tree to the database's SQL dialect. This snippet illustrates the LINQ to Entities version of the preceding query example:

var usOrders =
    from o in nwEntities.Orders
    where o.ShipCountry == "USA"
    orderby o.OrderDate descending
    select o;

The Entity Framework beta 3 Samples download from CodePlex (EFBeta3Samples.zip, 28MB) includes an EF version of the LINQ Query Explorer included with Visual Studio (VS) 2008's LINQ samples (see Additional Resources). The Entity Framework Sample Query Explorer demonstrates query-string, query-builder method, and LINQ to Entities queries with tree-view Output, Text Output, and Generated SQL output tabs. The samples include some other simple examples of EF running in WinForms and WebForms.

Test Drive the New EDM Designer
The ADO.NET EF team has delivered a surfeit of new EF features and enhancements since the EF August 2006 CTP that I wrote about in "Objectify Data with ADO.NET vNext" in the October 2006 issue (see Additional Resources). EF beta 1, which arrived in Orcas beta 1 without a visual designer, was far from ready to compete with the then-current stable of .NET O/RM tools. Creating an EDM required running the EdmGen.exe command-line tool and then manually editing the three schema files. However, August 2007's EF beta 2/EF Tools CTP 1 restored and improved the graphic EDM Designer, and EF beta 3/EF Tools CTP 2 of Dec. 6, 2007, delivered numerous improvements to both EF and the Designer, including a substantial performance boost with compiled queries (see Table 1). If you haven't downloaded the latest EF bits, now's the time (see Additional Resources). According to a post in the MSDN ADO.NET (Pre-Release) forum, one more beta/CTP is scheduled prior to EF's RTM.

Creating an nwModel EDM and nwEntities EntityContainer is almost as quick as generating a new DataContext with LINQ to SQL. The EntityDataSource component for ASP.NET -- EF's answer to LINQ to SQL's LinqDataSource -- isn't ready yet, so start a new Visual Basic or C# WinForm project, add a new ADO.NET Entity Data Model template with the file name changed to Northwind.edmx to start the EDM Wizard, click on Next to accept the default Generate from Database option in the EDM Wizard's Choose Model Contents dialog, and click on Next to select a Northwind database connection. Next, rename the connection string to nwEntities in the Choose Your Data Connection dialog. Finally, select the tables and stored procedures to add to your model, change the model's namespace to nwModel in the Choose Your Database Objects dialog, and click on Finish to let the EDM Wizard populate the Northwind.edmx EDM Designer file and autogenerate the nwEntities class files. When autogeneration completes, save the diagram, and right-click on an entity or association line to open the Mapping Details pane. Optionally, singularize the names of the entity types by double-clicking on their names to open an edit box.

Databinding to EF entities isn't as simple as binding to LINQ to SQL entities; for example, Object Data Sources created from EF EntitySets don't expose associations as relationships to serve as the DataMember for child grids in parent/child forms. Instead, you must navigate the association with a query that returns an IEnumerable<T> from the association's EntityCollection. This requires that you write a query in the event handler for the parent's BindingSource_CurrentChanged event to load the correct set of orders:

var query = ctxNwind.Customers
    .Where(cust => cust.CustomerID == customerIDTextBox.Text)
    .Select(c => c.Orders.Select(o => o))
    .FirstOrDefault()
    .OrderByDescending(o => o.OrderID);
orderBindingSource.DataSource = query.ToList();

The preceding query is based on the design of the Entity Framework Sample Query Explorer's LINQ to Entities | Relationship Navigation | Relationship Collection 1 (LinqToEntities70) query. Where() selects the matching Customer entity, the outer Select() projects that customer to its Orders collection, and the inner Select() returns the Order entities it contains; FirstOrDefault() returns the IEnumerable<Order> sequence that, once sorted, supplies the BindingSource's DataSource property value. Download this article's sample code and run the NwindEdmCS.sln project, a simple WinForm databinding example with BindingSources and DataGridViews.
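To see the shape of that navigation query in isolation, here's a minimal sketch that runs the same operator chain with LINQ to Objects over in-memory data. The Customer and Order classes, the NavigationDemo type, and the OrderIdsFor method are hypothetical stand-ins, not the EF-generated entities or the sample project's code. In LINQ to Objects, FirstOrDefault() returns null when no customer matches, so the sketch adds a guard before sorting.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the generated Northwind entities.
public class Order
{
    public int OrderID { get; set; }
}

public class Customer
{
    public string CustomerID { get; set; }
    public List<Order> Orders { get; set; }
}

public static class NavigationDemo
{
    // Returns the OrderIDs of one customer's orders, newest ID first.
    public static List<int> OrderIdsFor(IEnumerable<Customer> customers, string customerId)
    {
        // Same operator chain as the article's query, up to FirstOrDefault().
        IEnumerable<Order> orders = customers
            .Where(c => c.CustomerID == customerId)
            .Select(c => c.Orders.Select(o => o))
            .FirstOrDefault();

        // No matching customer: return an empty list instead of sorting null.
        if (orders == null)
        {
            return new List<int>();
        }

        return orders
            .OrderByDescending(o => o.OrderID)
            .Select(o => o.OrderID)
            .ToList();
    }

    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer
            {
                CustomerID = "ALFKI",
                Orders = new List<Order>
                {
                    new Order { OrderID = 10643 },
                    new Order { OrderID = 10692 }
                }
            }
        };

        List<int> ids = OrderIdsFor(customers, "ALFKI");
        // Prints "10692, 10643"
        Console.WriteLine(string.Join(", ", ids.Select(i => i.ToString()).ToArray()));
    }
}
```

Whether the event handler needs a similar guard depends on whether the text box can ever hold a CustomerID that has no matching entity.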

The ADO.NET Team wanted to avoid LINQ to SQL's "no out-of-the box n-tier story" stigma but, like LINQ to SQL's DataContext, the ObjectContext isn't serializable with Windows Communication Foundation (WCF)'s DataContractSerializer. So Daniel Simmons, a development lead on Microsoft's ADO.NET EF team, crafted a top-level EntityBag object that creates a ContextSnapshot, which is a DataContractSerializable Data Transfer Object (DTO) containing the ObjectContext's contents, including original and modified values. Messages with a ContextSnapshot as a SOAP payload pass between the WCF service's ObjectContext and a service client ObjectContext that's identical to the service's, but doesn't have a database connection. To learn more about the EntityBag and its related classes and extension methods, see the sidebar, "Retrieve and Update Entities Across Tiers with WCF and the EntityBag." The sample code includes the EntityBagCS.sln project that retrieves and updates Northwind entities with SOAP messages and a basicHttpBinding.

Numerous posts on MSDN's LINQ Project General and ADO.NET (Pre-Release) forums indicate that developers find it difficult to decide whether to adopt LINQ to SQL or EF and LINQ to Entities for upcoming data-intensive .NET 3.5 projects. LINQ to SQL is part of the VS 2008 RTM bits and EF is scheduled to release in the "first half of 2008." But the promised EntityDataSource, which will substitute for ASP.NET's LinqDataSource for LINQ to SQL, hadn't appeared as a beta implementation when I wrote this article. In any case, LINQ to SQL is riveted to 1:1 table:entity mapping, only supports LINQ queries, and is joined at the hip to SQL Server 2000+. (LINQ to SQL offers partial support for SQL Server Compact Edition 3.5.) EF offers extremely flexible mapping, is database-agnostic, and enables three distinctly different query mechanisms (see Table 2 for a detailed, feature-by-feature comparison).

As Data Programmability architect Mike Pizzo noted in his "Data Access API of the Day: Programming to the Conceptual Model" blog post of Jan. 23, 2007, "[T]he bet on the Entity Framework, and the Entity Data Model in particular, is big. You can expect to see more and more services within SQL Server, as well as technologies throughout the company, embrace and leverage the Entity Data Model as the natural way to describe data in terms of real-world concepts." Microsoft is devoting extraordinary resources to ensuring that EF will RTM on time as an enterprise-class O/RM tool. ADO.NET Data Services (formerly codenamed "Project Astoria") has adopted EF as its default data source for relational data, and it's likely that other new data-related Microsoft products will deploy with EF. I'm betting that EF will eclipse LINQ to SQL as the preferred Microsoft O/RM tool in the next year or two.
