An Improved Stream.CopyToAsync() that Reports Progress

Updated on 2020-07-23

Add progress reporting to your downloading or copying using this code

Introduction

The .NET Framework's Stream.CopyToAsync() method supports cancellation but not progress reporting, even though there is a standard interface, IProgress<T>, for reporting the progress of tasks. That makes it a poor fit for very long copies, or for slower streams like NetworkStream, where you'd like to update the user periodically. I aim to provide a version that does report progress, and to explain both how to use it and how it was built. It's a very simple project; the demo is more complicated than the CopyToAsync() with progress implementation. The demo, "furl", is a little tool primarily intended for downloading from remote sites, but it can also copy files locally.

Conceptualizing this Mess

I recently fell in love with the Task framework and awaitable methods, which led me to dive into its dark corners and produce some articles. I decided to come up for air and do something simple but useful. We like the Task framework, am I right? Well, happily Stream provides awaitable methods that follow the Task-based Asynchronous Pattern (TAP), but we can't pass in a progress object. A progress object is simply an instance of a type that implements IProgress<T>. Such an object lets the task report its progress back to the initiator. If CopyToAsync() accepted a progress object, a single call would cover most of the code needed for lengthy downloads with progress reporting. What we want is a set of extension methods that act as overloads of CopyToAsync() and accept a progress object.

The IProgress<T> Interface

This interface provides one member, the Report() method. The task doing the work calls Report() periodically to report its progress. How that gets reported back to the caller isn't defined by the interface, but the Progress<T> class the framework provides implements it and exposes a ProgressChanged event that can be hooked to update the progress.
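To make the relationship between Report() and ProgressChanged concrete, here is a minimal sketch. The ProgressDemo class, its DoWork() worker and the 4096-unit total are hypothetical names and values made up for illustration; only Progress<T>, IProgress<T> and ProgressChanged come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

static class ProgressDemo
{
    // Hypothetical worker: reports (units done, total units) five times.
    public static void DoWork(IProgress<KeyValuePair<long, long>> progress)
    {
        const long total = 4096;
        for (long done = 0; done <= total; done += 1024)
            progress?.Report(new KeyValuePair<long, long>(done, total));
    }

    // Hooks ProgressChanged, runs the worker, and returns how many reports arrived.
    public static int CountReports()
    {
        var seen = 0;
        using (var allSeen = new CountdownEvent(5))
        {
            var progress = new Progress<KeyValuePair<long, long>>();
            progress.ProgressChanged += (sender, p) =>
            {
                Interlocked.Increment(ref seen);
                allSeen.Signal();
            };
            DoWork(progress);
            // With no SynchronizationContext (as in a console app), handlers run
            // on the thread pool, so wait for them instead of assuming they ran.
            allSeen.Wait(TimeSpan.FromSeconds(5));
        }
        return seen;
    }
}
```

The worker only ever sees the IProgress<T> interface, so it doesn't care whether the caller supplied a Progress<T> or some other implementation, or nothing at all.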

The New CopyToAsync() Extension Methods

These methods provide overloads that accept a progress object. In order to do so, we had to reinvent the functionality in the base CopyToAsync() methods so that we could report progress inside the copy loop. We use asynchronous reads and writes to accomplish this.

Coding this Mess

The New CopyToAsync() Extension Methods

We'll cover these first, since they're the heart of the project. There's really only one that has any code in it, as the rest are just overloads for the primary method, which is here:

/// <summary>
/// Copies a stream to another stream
/// </summary>
/// <param name="source">The source <see cref="Stream"/> to copy from</param>
/// <param name="sourceLength">The length of the source stream,
/// if known - used for progress reporting</param>
/// <param name="destination">The destination <see cref="Stream"/> to copy to</param>
/// <param name="bufferSize">The size of the copy block buffer</param>
/// <param name="progress">An <see cref="IProgress{T}"/> implementation
/// for reporting progress</param>
/// <param name="cancellationToken">A cancellation token</param>
/// <returns>A task representing the operation</returns>
public static async Task CopyToAsync(
    this Stream source,
    long sourceLength,
    Stream destination,
    int bufferSize,
    IProgress<KeyValuePair<long,long>> progress,
    CancellationToken cancellationToken)
{
    if (0 == bufferSize)
        bufferSize = _DefaultBufferSize;
    var buffer = new byte[bufferSize];
    if(0>sourceLength && source.CanSeek)
        sourceLength = source.Length - source.Position;
    var totalBytesCopied = 0L;
    if (null != progress)
        progress.Report(new KeyValuePair<long, long>(totalBytesCopied, sourceLength));
    var bytesRead = -1;
    while(0!=bytesRead && !cancellationToken.IsCancellationRequested)
    {
        // pass the token along so a pending read or write can be canceled too
        bytesRead = await source.ReadAsync(buffer, 0, buffer.Length, cancellationToken);
        if (0 == bytesRead || cancellationToken.IsCancellationRequested)
            break;
        // write only the bytes actually read, not the whole buffer
        await destination.WriteAsync(buffer, 0, bytesRead, cancellationToken);
        totalBytesCopied += bytesRead;
        if (null != progress)
            progress.Report(new KeyValuePair<long, long>(totalBytesCopied, sourceLength));
    }
    // progress may be null, so guard the final report as well
    if(null != progress && 0<totalBytesCopied)
        progress.Report(new KeyValuePair<long, long>(totalBytesCopied, sourceLength));
    cancellationToken.ThrowIfCancellationRequested();
}

The first thing you'll probably notice is that this method takes a lot of parameters. That's okay, as there are several overloads that omit one or more of them, but each takes a progress argument.

Next, you'll see that we set bufferSize to _DefaultBufferSize (81920, like the framework versions of CopyToAsync()) if it was passed as zero. Then we create a buffer of that size. Next, if sourceLength wasn't specified (it's negative) and the source Stream CanSeek, we use that feature to compute the length of the copy operation as the bytes remaining from the current position. Otherwise, we can't report a bounded progress and must resort to reporting an unbounded one. This can happen over the web if you're downloading content that was sent chunked, with no Content-Length response header, which is common.
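The bounded versus unbounded distinction can be sketched in isolation. The ProgressLength class and its Resolve() and Describe() helpers below are hypothetical and not part of the article's code; they only assume the convention used above, where a negative length means "unknown".

```csharp
using System;
using System.IO;

static class ProgressLength
{
    // Returns the number of bytes expected, or a negative value if unknown.
    public static long Resolve(Stream source, long sourceLength)
    {
        // A seekable stream can tell us how many bytes remain.
        if (sourceLength < 0 && source.CanSeek)
            return source.Length - source.Position;
        // Otherwise trust the caller's value, which may still be negative.
        return sourceLength;
    }

    // Formats a (copied, total) pair for display.
    public static string Describe(long copied, long total)
    {
        // Treat zero like unknown to avoid dividing by zero.
        return total > 0
            ? string.Format("{0}%", copied * 100 / total)
            : string.Format("{0} bytes (total unknown)", copied);
    }
}
```

A consumer of the progress reports has to handle both cases, since a chunked HTTP response gives it nothing to divide by.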

Next, if the progress argument is not null, we Report() the initial progress at zero. Inside the loop, which terminates if the operation is canceled or if there are no more bytes, we asynchronously read into our earlier buffer. Then we check to see if no bytes were read or if the operation was cancelled, in which case we stop. Otherwise, we write and update our total bytes copied, which we use for progress reporting.

Once again, if the progress is not null, we Report() the current progress, in bytes.

If we actually copied anything, then we Report() the final status.

Finally, if the operation is cancelled, we throw to let the Task know that it was cancelled.
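To try the steps above without a network or a file, here is a condensed, self-contained sketch that copies between two MemoryStream instances. It is an approximation under stated assumptions, not the article's exact source: the three-argument overload's shape is inferred from the demo's call site, the loop is compressed, and the StreamProgressExtensions and RecordingProgress names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

static class StreamProgressExtensions
{
    const int _DefaultBufferSize = 81920;

    // Convenience overload: default buffer size, no cancellation.
    public static Task CopyToAsync(this Stream source, long sourceLength,
        Stream destination, IProgress<KeyValuePair<long, long>> progress)
    {
        return source.CopyToAsync(sourceLength, destination, 0, progress,
            CancellationToken.None);
    }

    // Condensed version of the copy loop described above.
    public static async Task CopyToAsync(this Stream source, long sourceLength,
        Stream destination, int bufferSize,
        IProgress<KeyValuePair<long, long>> progress,
        CancellationToken cancellationToken)
    {
        var buffer = new byte[0 == bufferSize ? _DefaultBufferSize : bufferSize];
        if (0 > sourceLength && source.CanSeek)
            sourceLength = source.Length - source.Position;
        var total = 0L;
        progress?.Report(new KeyValuePair<long, long>(total, sourceLength));
        int bytesRead;
        while (0 != (bytesRead = await source.ReadAsync(
            buffer, 0, buffer.Length, cancellationToken)))
        {
            // Write only what was read; the last block is usually partial.
            await destination.WriteAsync(buffer, 0, bytesRead, cancellationToken);
            total += bytesRead;
            progress?.Report(new KeyValuePair<long, long>(total, sourceLength));
            cancellationToken.ThrowIfCancellationRequested();
        }
    }
}

// Hypothetical IProgress<T> that records every report synchronously.
sealed class RecordingProgress : IProgress<KeyValuePair<long, long>>
{
    public readonly List<KeyValuePair<long, long>> Reports =
        new List<KeyValuePair<long, long>>();

    public void Report(KeyValuePair<long, long> value) { Reports.Add(value); }
}
```

Calling src.CopyToAsync(-1, dst, new RecordingProgress()) on a seekable source should record an initial report at zero followed by one report per block, with the destination ending up the same length as the source.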

The Demo Project "furl"

As I said, this project downloads a URL to a file, or copies one file to another. It's pretty simple, thanks to the above. Here's the entry point code, so you can see how we make the request:

if(2!=args.Length)
{
    _PrintUsage();
    throw new ArgumentException("Two arguments expected");
}
var url = args[0];
var stopwatch = new Stopwatch();
stopwatch.Start();
if (-1 < url.IndexOf("://"))
{
    var wreq = WebRequest.Create(url);
    using (var wresp = await wreq.GetResponseAsync())
    {
        var httpresp = wresp as HttpWebResponse;
        var sourceLen = -1L;
        if (null != httpresp)
            sourceLen = httpresp.ContentLength;
        // disposed with wresp:
        var src = wresp.GetResponseStream();
        await _CopyToDstAsync(src, args[1],stopwatch);
    }
} else
    using (var src = File.OpenRead(url))
        await _CopyToDstAsync(src, args[1],stopwatch);

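We have two major code paths here, depending on whether you specified a file or a URL. Both end up delegating to _CopyToDstAsync() to do the copying. The URL fork checks whether the response is HTTP and, if it is, reads the length given by its Content-Length header, which is intended to serve as the copy operation's total length. This way, downloading from the web will at least sometimes give you a bounded progress. Note that in the listing above sourceLen is read but not passed along, and the listing below passes -1 to CopyToAsync(), so the value has to be forwarded for that to take effect. Let's look at the body of _CopyToDstAsync():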

// BUG: Progress doesn't always report the last block, so it may not end at 100%
// I'm not sure why
var totalBytes = 0L;
using (var dst = File.Create(path, 81920, FileOptions.Asynchronous))
{
    dst.SetLength(0L);
    var prog = new Progress<KeyValuePair<long, long>>();
    var first = true;
    var i = 0;

    prog.ProgressChanged += delegate (object s, KeyValuePair<long, long> p)
    {
        var str = " Downloaded";
        lock (_Lock)
        {
            if (-1 != p.Value)
            {
                ConsoleUtility.WriteProgressBar((int)(p.Key / (double)p.Value * 100), !first);
                str += string.Format(" {0}kb/{1}kb", p.Key / 1024, p.Value / 1024);
            }
            else
            {
                ConsoleUtility.WriteProgress(i, true);
                ++i;
                str += string.Format(" {0}kb/???kb", p.Key / 1024);
            }
            totalBytes = p.Key;
            first = false;
            Console.Write(str);
            Console.Write(new string('\b', str.Length));
        }
    };
    await src.CopyToAsync(-1, dst, prog);
    stopwatch.Stop();
}
lock (_Lock)
{
    Console.WriteLine();
    // totalBytes / 1024 is kilobytes, so label the rate as KB/s rather than kbps
    Console.WriteLine("Done @ " +
            Math.Round(totalBytes / 1024d / stopwatch.Elapsed.TotalSeconds) + "KB/s");
}

Note the bug flagged in the comment. For reasons I haven't been able to find, the progress sometimes doesn't report the final block, leaving the display stuck at 99% or so. See the Bugs section at the end.

Moving on, we create a file for asynchronous I/O and then set its length to zero. I always do this because some ways of opening an existing file leave its contents in place and just set the position to the beginning, which means that if the new file is shorter than the old one, the extra bytes remain at the end of the file. File.Create() already truncates an existing file, so here the call is only a safeguard, but SetLength(0) ensures the problem can't happen.
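The leftover-bytes problem is easy to reproduce. The sketch below uses a hypothetical TruncationDemo helper (not part of furl) that overwrites a 100-byte temp file with 10 bytes under a given FileMode and returns the resulting file length.

```csharp
using System;
using System.IO;

static class TruncationDemo
{
    // Rewrites a 100-byte temp file with 10 bytes and returns its final length.
    public static long LengthAfterRewrite(FileMode mode, bool setLengthZero)
    {
        var path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, new byte[100]);
            using (var fs = new FileStream(path, mode, FileAccess.Write))
            {
                if (setLengthZero)
                    fs.SetLength(0L); // discard whatever the old file held
                fs.Write(new byte[10], 0, 10);
            }
            return new FileInfo(path).Length;
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

With FileMode.OpenOrCreate and no SetLength(0) call, the old tail survives and the file stays at 100 bytes; with either SetLength(0) or FileMode.Create (which is what File.Create() uses), it ends up at 10.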

Next, we create a progress object and a couple of bookkeeping variables. Then we hook the ProgressChanged event on the progress object and, in the handler, take a lock and write the progress to the Console. Without the lock, handlers running at the same time could interleave their console output, leading to a messed up status screen.

Bugs

There is one bug I haven't been able to track down: the demo sometimes doesn't report the final progress, so it sticks at 99%, for example. I don't think this affects the CopyToAsync() method I wrote. I believe the bug is in the demo project, but I'm not sure where. It's intermittent.

Since this was in the demo app, and not show stopping, I decided to release as is.

If anyone spots this bug, please say so in the comments.
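One possible explanation, offered as a hypothesis and not a confirmed diagnosis: Progress<T> captures SynchronizationContext.Current when it is constructed and raises ProgressChanged through it, and when there is none, as is typical in a console application, each callback is queued to the thread pool. Callbacks can then run out of order, so an earlier report may be drawn after the final one, or still be pending when "Done" is printed. A way to test that theory is to swap in an IProgress<T> that invokes its handler inline. The SynchronousProgress<T> class below is a hypothetical helper, not part of the framework or of furl.

```csharp
using System;

// Hypothetical IProgress<T> that calls its handler on the reporting thread,
// in the order the reports are made.
sealed class SynchronousProgress<T> : IProgress<T>
{
    readonly Action<T> _handler;

    public SynchronousProgress(Action<T> handler)
    {
        if (null == handler)
            throw new ArgumentNullException(nameof(handler));
        _handler = handler;
    }

    // Runs the handler before returning, so no report can be reordered or lost.
    public void Report(T value)
    {
        _handler(value);
    }
}
```

The trade-off is that the handler now runs inside the copy loop, so slow console output would slow the copy itself; that is the cost Progress<T> avoids by posting callbacks elsewhere.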

Thanks!

History

  • 24th July, 2020 - Initial submission