Replace System.IO.Directory with WinAPI

Posted by johnny on Sun, 19 May 2019 21:05:21 +0200

Contents

Introduction

Using the code

How the code works

Introduction

Recently, I was working on a project that needed to read the contents of Windows directories, so I used the EnumerateDirectories, EnumerateFiles and EnumerateFileSystemEntries methods of the System.IO.Directory class provided by .NET. Unfortunately, these methods have a big disadvantage: when they encounter a file system entry that the current user is not allowed to access, they stop immediately. Instead of handling the error and continuing, they leave you with whatever they had collected up to that point, and the rest of the tree is never visited.

This cannot be worked around from outside the method: even if you catch the error, you only get the partial results that the returned IEnumerable produced before the failure.
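To make the failure mode concrete, here is a minimal sketch. The class and method names are hypothetical and not part of the project. It walks a tree with the built-in method and keeps whatever was collected before an access error ends the enumeration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class PartialEnumerationDemo
{
    // Collects the directories that Directory.EnumerateDirectories yields
    // until it either finishes or fails with an access error.
    public static List<string> CollectUntilFailure(string root, out Exception failure)
    {
        var found = new List<string>();
        failure = null;
        try
        {
            foreach (string dir in Directory.EnumerateDirectories(
                root, "*", SearchOption.AllDirectories))
            {
                found.Add(dir);
            }
        }
        catch (UnauthorizedAccessException ex)
        {
            // The enumerator cannot be resumed after this point;
            // 'found' holds only the entries seen so far.
            failure = ex;
        }
        return found;
    }
}
```

If failure is not null after the call, the returned list is incomplete, and the caller has no way to ask the enumerator to skip the offending entry and continue.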

I looked everywhere for a solution to this problem, but I couldn't find one that didn't rely on the methods above. So I decided to use the Windows API and write alternatives. Not only is the result better (the methods are not derailed by access-denied entries), but it also seems to be faster than the original .NET methods.

Using the code

The project itself is a class library, so it is not executable. Building it compiles the methods into a DLL file, which you can reference from another project and use as follows:

using System.IO;
using System.Linq; // needed for ToList()

DirectoryAlternative.EnumerateDirectories
(path, "*", SearchOption.AllDirectories).ToList<string>();

I used the same namespace as the original class (System.IO) and named the class DirectoryAlternative, so that using it is as similar as possible to using the original class.

The methods have the same names, take the same parameters, and look exactly the same from the outside.

The following are examples of method usage:

System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
string path = "V:\\MUSIC";
List<string> en = new List<string>();
sw.Start();
try { en = Directory.EnumerateDirectories
  (path, "*", SearchOption.AllDirectories).ToList<string>(); } catch { }
sw.Stop();
Console.WriteLine("Directory.EnumerateDirectories : {0} ms / {1} entries", 
  sw.ElapsedMilliseconds.ToString("N0"), en.Count.ToString("N0"));
sw.Reset();
en = new List<string>();
sw.Start();
en = DirectoryAlternative.EnumerateDirectories(path, "*", 
  SearchOption.AllDirectories).ToList<string>();
sw.Stop();
Console.WriteLine("DirectoryAlternative.EnumerateDirectories : {0} ms / {1} entries", 
  sw.ElapsedMilliseconds.ToString("N0"), en.Count.ToString("N0"));
sw.Reset();
en = new List<string>();
sw.Start();
try { en = Directory.EnumerateFiles(path, "*", 
  SearchOption.AllDirectories).ToList<string>(); } catch { }
sw.Stop();
Console.WriteLine("Directory.EnumerateFiles : {0} ms / {1} entries", 
  sw.ElapsedMilliseconds.ToString("N0"), en.Count.ToString("N0"));
sw.Reset();
en = new List<string>();
sw.Start();
en = DirectoryAlternative.EnumerateFiles
  (path, "*", SearchOption.AllDirectories).ToList<string>();
sw.Stop();
Console.WriteLine("DirectoryAlternative.EnumerateFiles : {0} ms / {1} entries", 
  sw.ElapsedMilliseconds.ToString("N0"), en.Count.ToString("N0"));
sw.Reset();
en = new List<string>();
sw.Start();
try { en = Directory.EnumerateFileSystemEntries
  (path, "*", SearchOption.AllDirectories).ToList<string>(); } catch { }
sw.Stop();
Console.WriteLine("Directory.EnumerateFileSystemEntries : {0} ms / {1} entries", 
  sw.ElapsedMilliseconds.ToString("N0"), en.Count.ToString("N0"));
sw.Reset();
en = new List<string>();
sw.Start();
en = DirectoryAlternative.EnumerateFileSystemEntries
  (path, "*", SearchOption.AllDirectories).ToList<string>();
sw.Stop();
Console.WriteLine("DirectoryAlternative.EnumerateFileSystemEntries : {0} ms / {1} entries", 
  sw.ElapsedMilliseconds.ToString("N0"), en.Count.ToString("N0"));

Console.ReadKey();

The snippet above directly compares the performance of the original methods with the DirectoryAlternative methods. I ran it against a very large directory with more than 70,000 file system entries.

In that run, the DirectoryAlternative methods were nearly twice as fast.
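The six timing blocks in the comparison differ only in their label and in the method they call, so they could be folded into one helper. The following is only a sketch: Bench and Measure are hypothetical names, not part of the project. Unlike the ToList() calls above, it counts entries while iterating, so a run that fails halfway still reports how far it got.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class Bench
{
    // Runs one enumeration, prints the elapsed time and entry count,
    // and returns the number of entries seen.
    public static int Measure(string label, Func<IEnumerable<string>> enumerate)
    {
        Stopwatch sw = Stopwatch.StartNew();
        int count = 0;
        try
        {
            foreach (string entry in enumerate()) count++;
        }
        catch (UnauthorizedAccessException)
        {
            // Keep the partial count; the built-in methods stop here.
        }
        sw.Stop();
        Console.WriteLine("{0} : {1:N0} ms / {2:N0} entries",
            label, sw.ElapsedMilliseconds, count);
        return count;
    }
}
```

A call would then look like Bench.Measure("Directory.EnumerateFiles", () => Directory.EnumerateFiles(path, "*", SearchOption.AllDirectories)); with the same pattern for the DirectoryAlternative methods.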

How the code works

The code uses several Windows API functions to traverse the file system (I believe the original .NET methods use the same functions):

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)] // match the CharSet of the DllImport declarations
struct WIN32_FIND_DATA
{
    public uint dwFileAttributes;
    public System.Runtime.InteropServices.ComTypes.FILETIME ftCreationTime;
    public System.Runtime.InteropServices.ComTypes.FILETIME ftLastAccessTime;
    public System.Runtime.InteropServices.ComTypes.FILETIME ftLastWriteTime;
    public uint nFileSizeHigh;
    public uint nFileSizeLow;
    public uint dwReserved0;
    public uint dwReserved1;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
    public string cFileName;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
    public string cAlternateFileName;
}

[DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
private static extern bool FindClose(IntPtr hFindFile);

[DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
private static extern IntPtr FindFirstFile
  (string lpFileName, out WIN32_FIND_DATA lpFindFileData);

[DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
private static extern bool FindNextFile
  (IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);

In short:

  • FindFirstFile searches for the first file system entry that matches the given pattern (lpFileName) and returns a HANDLE (IntPtr) for the search.
  • FindNextFile continues the search and returns the next matching file system entry; we use it to iterate over all files and directories.
  • FindClose closes the search HANDLE.

The information about each entry is returned in a WIN32_FIND_DATA struct through an out parameter.
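Two fields of the struct need a little work before they are useful: the file size is split into two 32-bit halves, and the timestamps are FILETIME values. The helper below is a sketch only. FindDataHelpers and its methods are hypothetical names, not part of the project, and it takes the raw field values as arguments so that it does not depend on the struct declaration above.

```csharp
using System;
using FILETIME = System.Runtime.InteropServices.ComTypes.FILETIME;

static class FindDataHelpers
{
    // Combines nFileSizeHigh and nFileSizeLow into one 64-bit size.
    public static long GetSize(uint nFileSizeHigh, uint nFileSizeLow)
    {
        return ((long)nFileSizeHigh << 32) | nFileSizeLow;
    }

    // FILETIME holds a 64-bit count of 100-nanosecond intervals since
    // 1601-01-01 (UTC), stored as two 32-bit ints.
    public static DateTime ToDateTimeUtc(FILETIME ft)
    {
        long ticks = ((long)unchecked((uint)ft.dwHighDateTime) << 32)
                     | unchecked((uint)ft.dwLowDateTime);
        return DateTime.FromFileTimeUtc(ticks);
    }
}
```

For an entry in findData, the calls would be FindDataHelpers.GetSize(findData.nFileSizeHigh, findData.nFileSizeLow) and FindDataHelpers.ToDateTimeUtc(findData.ftLastWriteTime).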

For more information on these functions, see the Windows API documentation.

The main method is Enumerate. All the other methods are built around it.

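private static void Enumerate(string path, string searchPattern, 
  SearchOption searchOption, ref List<string> retValue, EntryType entryType)
{
    WIN32_FIND_DATA findData;
    // Make sure the path ends with a separator before appending the pattern.
    if (path.Last<char>() != '\\') path += "\\";
    IntPtr hFile = FindFirstFile(path + searchPattern, out findData);

    // FindFirstFile returns INVALID_HANDLE_VALUE (-1) on failure, e.g. access denied.
    // Compare as IntPtr: ToInt32() can throw for large handle values in a 64-bit process.
    if (hFile == new IntPtr(-1)) return;

    // Note: only subdirectories whose names match searchPattern are found here,
    // so a pattern other than "*" also limits which subdirectories are searched.
    List<string> subDirs = new List<string>();
    do
    {
        // Skip the "current directory" and "parent directory" pseudo-entries.
        if (findData.cFileName == "." || findData.cFileName == "..") continue;
        if ((findData.dwFileAttributes & 
           (uint)FileAttributes.Directory) == (uint)FileAttributes.Directory)
        {
            // Remember the subdirectory so it can be searched once this level is done.
            subDirs.Add(path + findData.cFileName);
            if (entryType == EntryType.Directories || 
                entryType == EntryType.All) retValue.Add(path + findData.cFileName);
        }
        else
        {
            if (entryType == EntryType.Files || 
                entryType == EntryType.All) retValue.Add(path + findData.cFileName);
        }
    } while (FindNextFile(hFile, out findData));
    // Close the (valid) handle before recursing so open handles do not pile up.
    FindClose(hFile);

    if (searchOption == SearchOption.AllDirectories)
        foreach (string subdir in subDirs)
            Enumerate(subdir, searchPattern, searchOption, ref retValue, entryType);
}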

This method takes all the parameters of the original Enumerate* methods (path, searchPattern, searchOption), plus the by-reference parameter retValue and the parameter entryType, which is an enum:

private enum EntryType { All = 0, Directories = 1, Files = 2 };

This enum is used as a selector to return only directories, files or both.

The Enumerate method starts the search by calling FindFirstFile and then walks through all remaining file system entries with FindNextFile. If entryType is Files, it adds only files to the retValue list; for Directories, it adds only directories; and for All, it adds both.
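The selection rule can be isolated into a small predicate, which makes it easy to check on its own. This is a sketch under assumptions: EntryFilter, IsDirectory and ShouldInclude are hypothetical names, and the enum is repeated here only so that the snippet is self-contained.

```csharp
using System.IO;

enum EntryType { All = 0, Directories = 1, Files = 2 }

static class EntryFilter
{
    // True when the directory bit is set in dwFileAttributes.
    public static bool IsDirectory(uint dwFileAttributes)
    {
        return (dwFileAttributes & (uint)FileAttributes.Directory) != 0;
    }

    // Decides whether an entry belongs in the result list for the given selector.
    public static bool ShouldInclude(uint dwFileAttributes, EntryType entryType)
    {
        if (entryType == EntryType.All) return true;
        return IsDirectory(dwFileAttributes)
            ? entryType == EntryType.Directories
            : entryType == EntryType.Files;
    }
}
```

Note that this predicate only decides what is reported. Subdirectories still have to be remembered for recursion regardless of entryType, as Enumerate does with its subDirs list.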

If searchOption is AllDirectories, the method calls itself recursively for each subdirectory. The results of all (recursive) calls are collected in the retValue variable. In the first version of the method, I used List as the return type and appended each recursive call's result to the caller's retValue, but the solution using the by-ref parameter proved much faster.
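The difference between the two designs can be sketched without touching the file system. In the snippet below, Node is a hypothetical in-memory stand-in for a directory tree, and both method names are made up for illustration. The first variant allocates a list per call and copies it into the caller's list; the second appends to one shared list.

```csharp
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a directory with subdirectories.
sealed class Node
{
    public string Name;
    public List<Node> Children = new List<Node>();
    public Node(string name) { Name = name; }
}

static class Accumulation
{
    // Variant 1: every call returns its own list, which the caller copies.
    // Entries deep in the tree are copied once per level on the way up.
    public static List<string> CollectReturning(Node node)
    {
        var result = new List<string>();
        foreach (Node child in node.Children)
        {
            result.Add(child.Name);
            result.AddRange(CollectReturning(child));
        }
        return result;
    }

    // Variant 2: all calls append to one shared list, so nothing is copied.
    public static void CollectInto(Node node, List<string> sink)
    {
        foreach (Node child in node.Children)
        {
            sink.Add(child.Name);
            CollectInto(child, sink);
        }
    }
}
```

Because List<string> is a reference type, the shared list in the second variant works without the ref keyword; ref is only required if the method needs to replace the list instance itself.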

Finally, the HANDLE of each file search is closed by calling the FindClose method.
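An alternative to calling FindClose by hand is to wrap the search handle in a SafeHandle, so that it is released even if an exception is thrown during the loop. This is not how the project does it; the class below is a sketch, and the name SafeFindHandle is an assumption.

```csharp
using System;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

// Owns a search handle; treats both 0 and -1 (INVALID_HANDLE_VALUE) as invalid.
sealed class SafeFindHandle : SafeHandleZeroOrMinusOneIsInvalid
{
    // The interop marshaller creates instances through this constructor.
    public SafeFindHandle() : base(true) { }

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool FindClose(IntPtr hFindFile);

    // Called by the runtime only for valid handles, at most once.
    protected override bool ReleaseHandle()
    {
        return FindClose(handle);
    }
}
```

With this in place, FindFirstFile could be declared to return SafeFindHandle instead of IntPtr, the validity check becomes hFile.IsInvalid, and a using block around the loop replaces the explicit FindClose call.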

 

Original address: https://www.codeproject.com/Articles/1383832/System-IO-Directory-Alternative-using-WinAPI
