:::: MENU ::::

Wednesday, January 28, 2009

This example demonstrates three ways to compare two file listings:

·         By querying for a Boolean value that specifies whether the two file lists are identical.

·         By querying for the intersection to retrieve the files that are in both folders.

·         By querying for the set difference to retrieve the files that are in one folder but not the other.

Bb546137.alert_note(en-us,VS.90).gifNote:

The techniques shown here can be adapted to compare sequences of objects of any type.

The FileComparer class shown here demonstrates how to use a custom comparer class together with the Standard Query Operators. The class is not intended for use in real-world scenarios. It just uses the name and length in bytes of each file to determine whether the contents of each folder are identical or not. In a real-world scenario, you should modify this comparer to perform a more rigorous equality check.

http://i.msdn.microsoft.com/Global/Images/clear.gif Example

Visual Basic

http://i.msdn.microsoft.com/Global/Images/clear.gifCopy Code

Module CompareDirs
    Public Sub Main()
 
        ' Create two identical or different temporary folders 
        ' on a local drive and add files to them.
        ' Then set these file paths accordingly.
        Dim pathA As String = "C:\TestDir"
        Dim pathB As String = "C:\TestDir2"
 
        ' Take a snapshot of the file system.
        Dim list1 = GetFiles(pathA)
        Dim list2 = GetFiles(pathB)
 
        ' Create the FileCompare object we'll use in each query
        Dim myFileCompare As New FileCompare
 
        ' This query determines whether the two folders contain
        ' identical file lists, based on the custom file comparer
        ' that is defined in the FileCompare class.
        ' The query executes immediately because it returns a bool.
        Dim areIdentical As Boolean = list1.SequenceEqual(list2, myFileCompare)
        If areIdentical = True Then
            Console.WriteLine("The two folders are the same.")
        Else
            Console.WriteLine("The two folders are not the same.")
        End If
 
        ' Find common files in both folders. It produces a sequence and doesn't execute
        ' until the foreach statement.
        Dim queryCommonFiles = list1.Intersect(list2, myFileCompare)
 
        If queryCommonFiles.Count() > 0 Then
 
 
            Console.WriteLine("The following files are in both folders:")
            For Each fi As System.IO.FileInfo In queryCommonFiles
                Console.WriteLine(fi.FullName)
            Next
        Else
            Console.WriteLine("There are no common files in the two folders.")
        End If
 
        ' Find the set difference between the two folders.
        ' For this example we only check one way.
        Dim queryDirAOnly = list1.Except(list2, myFileCompare)
        Console.WriteLine("The following files are in dirA but not dirB:")
        For Each fi As System.IO.FileInfo In queryDirAOnly
            Console.WriteLine(fi.FullName)
        Next
 
        ' Keep the console window open in debug mode
        Console.WriteLine("Press any key to exit.")
        Console.ReadKey()
    End Sub
 
    ' This implementation defines a very simple comparison
    ' between two FileInfo objects. It only compares the name
    ' of the files being compared and their length in bytes.
    Public Class FileCompare
        Implements System.Collections.Generic.IEqualityComparer(Of System.IO.FileInfo)
 
        Public Function Equals1(ByVal x As System.IO.FileInfo, ByVal y As System.IO.FileInfo) _
            As Boolean Implements System.Collections.Generic.IEqualityComparer(Of System.IO.FileInfo).Equals
 
            If (x.Name = y.Name) And (x.Length = y.Length) Then
                Return True
            Else
                Return False
            End If
        End Function
 
        ' Return a hash that reflects the comparison criteria. According to the 
        ' rules for IEqualityComparer(Of T), if Equals is true, then the hash codes must
        ' also be equal. Because equality as defined here is a simple value equality, not
        ' reference identity, it is possible that two or more objects will produce the same
        ' hash code.
        Public Function GetHashCode1(ByVal fi As System.IO.FileInfo) _
            As Integer Implements System.Collections.Generic.IEqualityComparer(Of System.IO.FileInfo).GetHashCode
            Dim s As String = fi.Name & fi.Length
            Return s.GetHashCode()
        End Function
    End Class
 
    ' Function to retrieve a list of files. Note that this is a copy
    ' of the file information.
    Function GetFiles(ByVal root As String) As IEnumerable(Of System.IO.FileInfo)
        Return From file In My.Computer.FileSystem.GetFiles _
                  (root, FileIO.SearchOption.SearchAllSubDirectories, "*.*") _
               Select New System.IO.FileInfo(file)
    End Function
End Module
 

C#

http://i.msdn.microsoft.com/Global/Images/clear.gifCopy Code

namespace QueryCompareTwoDirs
{
    class CompareDirs
    {
 
        static void Main(string[] args)
        {
            // Create two identical or different temporary folders 
            // on a local drive and change these file paths.
            string pathA = @"C:\TestDir";
            string pathB = @"C:\TestDir2";
 
            // Take a snapshot of the file system.
            IEnumerable<System.IO.FileInfo> list1 = GetFiles(pathA);
            IEnumerable<System.IO.FileInfo> list2 = GetFiles(pathB);
 
            //A custom file comparer defined below
            FileCompare myFileCompare = new FileCompare();
 
            // This query determines whether the two folders contain
            // identical file lists, based on the custom file comparer
            // that is defined in the FileCompare class.
            // The query executes immediately because it returns a bool.
            bool areIdentical = list1.SequenceEqual(list2, myFileCompare);
 
            if (areIdentical == true)
            {
                Console.WriteLine("the two folders are the same");
            }
            else
            {
                Console.WriteLine("The two folders are not the same");
            }
 
            // Find the common files. It produces a sequence and doesn't 
            // execute until the foreach statement.
            var queryCommonFiles = list1.Intersect(list2, myFileCompare);
 
            if (queryCommonFiles.Count() > 0)
            {
                Console.WriteLine("The following files are in both folders:");
                foreach (var v in queryCommonFiles)
                {
                    Console.WriteLine(v.FullName); //shows which items end up in result list
                }
            }
            else
            {
                Console.WriteLine("There are no common files in the two folders.");
            }
 
            // Find the set difference between the two folders.
            // For this example we only check one way.
            var queryList1Only = (from file in list1
                                  select file).Except(list2, myFileCompare);
 
            Console.WriteLine("The following files are in list1 but not list2:");
            foreach (var v in queryList1Only)
            {
                Console.WriteLine(v.FullName);
            }
 
            // Keep the console window open in debug mode.
            Console.WriteLine("Press any key to exit.");
            Console.ReadKey();
        }            
 
        // This method assumes that the application has discovery 
        // permissions for all folders under the specified path.
        static IEnumerable<System.IO.FileInfo> GetFiles(string path)
        {
            if (!System.IO.Directory.Exists(path))
                throw new System.IO.DirectoryNotFoundException();
 
            string[] fileNames = null;
            List<System.IO.FileInfo> files = new List<System.IO.FileInfo>();
 
            fileNames = System.IO.Directory.GetFiles(path, "*.*", System.IO.SearchOption.AllDirectories);            
            foreach (string name in fileNames)
            {
                files.Add(new System.IO.FileInfo(name));
            }
            return files;
        }
    }
 
    // This implementation defines a very simple comparison
    // between two FileInfo objects. It only compares the name
    // of the files being compared and their length in bytes.
    class FileCompare : System.Collections.Generic.IEqualityComparer<System.IO.FileInfo>
    {
        public FileCompare() { }
 
        public bool Equals(System.IO.FileInfo f1, System.IO.FileInfo f2)
        {
            return (f1.Name == f2.Name &&
                    f1.Length == f2.Length);
        }
 
        // Return a hash that reflects the comparison criteria. According to the 
        // rules for IEqualityComparer<T>, if Equals is true, then the hash codes must
        // also be equal. Because equality as defined here is a simple value equality, not
        // reference identity, it is possible that two or more objects will produce the same
        // hash code.
        public int GetHashCode(System.IO.FileInfo fi)
        {
            string s = String.Format("{0}{1}", fi.Name, fi.Length);
            return s.GetHashCode();
        }
    }
}
 

http://i.msdn.microsoft.com/Global/Images/clear.gif Compiling the Code

·         Create a Visual Studio project that targets the .NET Framework version 3.5. By default, the project has a reference to System.Core.dll and a using directive (C#) or Imports statement (Visual Basic) for the System.Linq namespace. In C# projects, add a using directive for the System.IO namespace.

·         Copy the code into your project.

·         Press F5 to compile and run the program.

·         Press any key to exit the console window.

http://i.msdn.microsoft.com/Global/Images/clear.gif Robust Programming

For intensive query operations over the contents of multiple types of documents and files, consider using the Windows Desktop Search engine.

http://i.msdn.microsoft.com/Global/Images/clear.gif See Also

Concepts

LINQ to Objects

LINQ and File Directories