Updated on 2020-05-04
Start out with the basics of command line argument processing and exception handling for your console-based utilities
I write a lot of command line utilities - typically code generators and such. These tools follow a basic overall pattern of command line parsing, usage reporting and exception handling. Due to this basic structure being similar across tools, I start with some boilerplate code which I then modify to create them. I will be sharing and explaining that code here. After trying many different techniques for building these tools over the years, I've settled on a basic process that works well and that begins with this code.
We have several problems here we need to solve. These problems are general to virtually any command line tool application: We need to provide a usage screen, process command line arguments, and report errors.
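Using our technique, switches take the form of a leading / followed by the switch name, optionally followed by an argument, as in /output <outputfile> or /ifstale.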
With the code as listed, it also takes a variable number of unswitched arguments before any of the switches. This can be changed in the code.
Processing command line arguments is something that almost seems like it can be generated or generalized. The truth is, it can be, but the additional complexity cost often isn't worth it. Even if we had a way to generalize argument processing, adding, removing or changing an argument would still require changes in the logic of the application itself, so you're not really removing much work, if any. This has been my experience for the most part. The other issue is that while it's quite easy to generalize 80% of the argument cases, the other 20% are non-trivial. At that point, a general solution is harder to justify than simply writing specialized argument processing code.
What we're going to do is take the partially processed string[] args array that C# gives us in our Main() and do a simple loop over it, driving a switch/case that handles our flags. The upside is that it's easy to maintain and modify, and it handles quoted filenames and such. The downside is that it's basic, and it will also weirdly accept things like "/switch" with the quotes around the switch itself. That's not ideal, but it shouldn't do any harm either. There's a small wrinkle when you accept an arbitrary number of unnamed arguments prior to any switches. We handle this in the boilerplate code provided.
On a related note, experience has taught me it's not a great idea to accept input from STDIN despite it enabling piping because the console "cooks" input which can corrupt it, causing hard to track down errors. At the same time, it can be desirable to send data to STDOUT for display or printing purposes, or directly to a file, for an "uncooked" copy. For this reason, my tools require you to specify at least one input file if they take input at all, but they don't require an output file to be specified. This is one of those things that you learn just from experience. Some of my earlier projects were designed for piping, and it could create problems, especially for Unicode streams.
Also note that because we do not require an output file, we send any messages to Console.Error rather than Console.Out. If we're sending output to STDOUT, we want out-of-band information to go to STDERR. This is important for a clean command line interface.
The idea is for our executable to report any errors and then return 0 as its exit code on success, or some other value on failure. We handle this by wrapping the whole mess in a try/catch/finally block and then using the catch portion to return an error code divined from the exception thrown. One small wrinkle here is that this global exception handling is not what we want while debugging, so if we're compiling with the DEBUG compile time constant, we nix the catch block and just let the exception be raised. This greatly simplifies debugging the application.
The usage screen is our built in "help" feature. It gives a basic description of the application, the version, and the information on the command line usage and what the switches mean. We report it any time an error occurs simply because we assume something must be wrong with the input parameters. You can change this behavior if it's undesirable. We gather much of this information from the assembly attributes specified in AssemblyInfo.cs.
I've included a staleness check because I use it in virtually all of my command line generator tools. This is critical for tools that can take a long time to work, such as DFA lexers and parser generators, but that can be true of so many projects that I think there are more use cases for the feature than against it. If your tool generates output files and might take significant time to execute, it can be helpful to provide a feature that will only rerun if the output is older than the input. This makes it so the tool will only perform work if the input files have changed. We allow for this feature through the /ifstale switch. If your tool does not generate files this won't be necessary, but it was provided here since it's often the case that a tool will generate a file. Note that we also check if the executable itself is newer than the output. This makes it easier to use as a pre-build step in a related project while also working on developing the tool itself. Basically, if the executable has changed, it will rerun the generation process.
Here's the boilerplate code almost in its entirety. The only things I've omitted are the surrounding namespace and the using declarations. Otherwise, we'll cover the code from top to bottom:
class Program
{
    static readonly string _CodeBase =
        Assembly.GetEntryAssembly().GetModules()[0].FullyQualifiedName;
    static readonly string _File = Path.GetFileName(_CodeBase);
    static readonly Version _Version = Assembly.GetEntryAssembly().GetName().Version;
    static readonly string _Name = _GetName();
    static readonly string _Description = _GetDescription();
    static int Main(string[] args)
    {
        int result = 0; // the exit code
        // command line args
        List<string> inputFiles = new List<string>(args.Length);
        string outputFile = null;
        bool ifStale = false;
        // holds the output writer
        TextWriter output = null;
        try
        {
            // no args prints the usage screen
            if (0 == args.Length)
            {
                _PrintUsage();
                result = -1;
            }
            else if (args[0].StartsWith("/"))
            {
                throw new ArgumentException("Missing input files.");
            }
            else
            {
                int start = 0;
                // process the command line args:
                // process input file args. keep going until we find a switch
                for (start = 0; start < args.Length; ++start)
                {
                    var a = args[start];
                    if (a.StartsWith("/"))
                        break;
                    inputFiles.Add(a);
                }
                // process the switches
                for (var i = start; i < args.Length; ++i)
                {
                    switch (args[i].ToLowerInvariant())
                    {
                        case "/output":
                            if (args.Length - 1 == i) // check if we're at the end
                                throw new ArgumentException(string.Format(
                                    "The parameter \"{0}\" is missing an argument", args[i].Substring(1)));
                            ++i; // advance
                            outputFile = args[i];
                            break;
                        case "/ifstale":
                            ifStale = true;
                            break;
                        default:
                            throw new ArgumentException(
                                string.Format("Unknown switch {0}", args[i]));
                    }
                }
                // now that the switches are parsed
                // would be a good time to validate them
                // now let's check if our output is stale
                var stale = true;
                if (ifStale && null != outputFile)
                {
                    stale = false;
                    foreach (var f in inputFiles)
                    {
                        if (_IsStale(f, outputFile) || _IsStale(_CodeBase, outputFile))
                        {
                            stale = true;
                            break;
                        }
                    }
                }
                if (!stale)
                {
                    Console.Error.WriteLine(
                        "{0} skipped generation of {1} because it was not stale.", _Name, outputFile);
                }
                else
                {
                    // DO WORK HERE!
                    // TextWriter output will be cleaned up automatically on exit,
                    // so set it to your output source when ready to generate.
                    // It's a good idea not to open the output until everything
                    // else has been done so that errors in the input will not
                    // cause an existing file to be overwritten.
                }
            }
        }
#if !DEBUG
        // error reporting (Release only)
        catch (Exception ex)
        {
            result = _ReportError(ex);
        }
#endif
        finally
        {
            // clean up
            if (null != outputFile && null != output)
            {
                output.Close();
                output = null;
            }
        }
        return result;
    }
    static string _GetName()
    {
        foreach (var attr in Assembly.GetEntryAssembly().CustomAttributes)
        {
            if (typeof(AssemblyTitleAttribute) == attr.AttributeType)
            {
                return attr.ConstructorArguments[0].Value as string;
            }
        }
        return Path.GetFileNameWithoutExtension(_File);
    }
    static string _GetDescription()
    {
        foreach (var attr in Assembly.GetEntryAssembly().CustomAttributes)
        {
            if (typeof(AssemblyDescriptionAttribute) == attr.AttributeType)
            {
                return attr.ConstructorArguments[0].Value as string;
            }
        }
        return "";
    }
#if !DEBUG
    // do our error handling here (release builds)
    static int _ReportError(Exception ex)
    {
        _PrintUsage();
        Console.Error.WriteLine("Error: {0}", ex.Message);
        return -1;
    }
#endif
    static bool _IsStale(string inputfile, string outputfile)
    {
        var result = true;
        // File.Exists doesn't always work right
        try
        {
            if (File.GetLastWriteTimeUtc(outputfile) >= File.GetLastWriteTimeUtc(inputfile))
                result = false;
        }
        catch { }
        return result;
    }
    static void _PrintUsage()
    {
        var t = Console.Error;
        // write the title and version of our app
        t.WriteLine("{0} Version {1}", _Name, _Version);
        t.WriteLine(_Description);
        t.WriteLine();
        // this uses the actual name of the executable so it will
        // always be correct even if the executable file was renamed.
        t.Write(_File);
        t.WriteLine(" <inputfile1> { <inputfileN> } [/output <outputfile>] [/ifstale]");
        t.WriteLine();
        t.WriteLine(" <inputfile> An input file to use.");
        t.WriteLine(" <outputfile> The output file to use - default stdout.");
        t.WriteLine(" /ifstale Do not generate unless <outputfile> is older than <inputfile>.");
        t.WriteLine();
        t.WriteLine("Any other switch displays this screen and exits.");
        t.WriteLine();
    }
}
The first thing you'll notice is several static readonly fields. These contain basic information about our executable, primarily for use by the usage screen, but they may well be useful elsewhere in your code, so they're provided here for easy access.
After that, there's the Main() routine. Note we return an int here. This is so we can have as much control over our return value as we need, which is critical if this tool is to be used in batch files or build steps. For the most part however, we'll be dealing with errors by throwing exceptions just like we usually would, and let the boilerplate logic translate that to an exit code.
The next point of interest is the list of command line arg variables. I like to make one per argument. Whenever we add or remove a command line argument, its corresponding variable should be declared or removed here. Having them all declared in one spot makes things clearer. Whenever I modify these, the next thing I do is modify the _PrintUsage() routine accordingly so I don't forget.
Now we have our output variable. This should be set to Console.Out if /output is not specified, or to a StreamWriter or some such over the specified outputFile. If it refers to a file, the finally block closes it automatically when the program exits. It's important to only set this at the last possible moment so that if any errors occur beforehand, they won't wipe the previous contents of your output file. Obviously, if you're not going to have an output file, all of the corresponding code should be removed.
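As a rough illustration of opening the output at the last possible moment, here is a minimal sketch. The OutputDemo class and the OpenOutput() helper are hypothetical names used only for this example; they are not part of the boilerplate, and the "generated" string stands in for whatever your tool actually produces.

```csharp
using System;
using System.IO;

static class OutputDemo
{
    // Hypothetical helper (not part of the boilerplate): returns Console.Out
    // when no output file was given, otherwise a writer that overwrites the file.
    public static TextWriter OpenOutput(string outputFile)
    {
        if (null == outputFile)
            return Console.Out;
        return new StreamWriter(outputFile, false);
    }

    static int Main(string[] args)
    {
        // stand-in for the boilerplate's parsed /output value
        string outputFile = args.Length > 0 ? args[0] : null;
        TextWriter output = null;
        try
        {
            // gather and process the input first; a failure here
            // leaves any existing output file untouched
            string generated = "// generated content";
            // only now do we open (and possibly overwrite) the output
            output = OpenOutput(outputFile);
            output.WriteLine(generated);
            // status messages go to STDERR so STDOUT stays clean
            Console.Error.WriteLine("Generation complete.");
        }
        finally
        {
            // same rule as the boilerplate: only close real files
            if (null != outputFile && null != output)
                output.Close();
        }
        return 0;
    }
}
```

Because Console.Out is never closed here, the same cleanup code works whether or not an output file was specified.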
Now we may need to keep variables for any resources we're going to need to close later. If you access a database for example, you're going to probably want to hang on to a connection, and then close it later. If so, declare a variable for it here and set it to null. Populate it later. There's a finally block where we'll close it below.
Finally, we have the beginning of our try/catch/finally block that surrounds most of our code. In here is where we start to do real work.
After that, we do some pre-argument validation: we print the usage screen and exit if no arguments were specified, and we throw if the first argument is a switch, since that means no input files were given.
Next, we loop until we find a leading / indicating a switch. Each argument that occurs until then winds up in the inputFiles list. If your app doesn't take multiple input files, you'll want to modify this code to only read the first argument into an inputFile variable instead of looping and reading into inputFiles. Obviously, if you'll not be using input files at all, all of the associated code should be removed.
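For the single input file case, the gathering loop might reduce to something like the sketch below. The SingleInputDemo class, the ReadInputFile() helper and the sample file name are hypothetical, and rejecting a second unswitched argument is my assumption about what such a tool would want. Unlike the boilerplate, this sketch throws on an empty argument list instead of printing the usage screen.

```csharp
using System;

static class SingleInputDemo
{
    // Hypothetical variant of the input gathering loop: reads exactly one
    // input file and returns the index where switch processing should start.
    public static int ReadInputFile(string[] args, out string inputFile)
    {
        if (0 == args.Length || args[0].StartsWith("/"))
            throw new ArgumentException("Missing input file.");
        inputFile = args[0];
        // a second unswitched argument is treated as an error in this variant
        if (args.Length > 1 && !args[1].StartsWith("/"))
            throw new ArgumentException("Only one input file may be specified.");
        return 1;
    }

    static void Main()
    {
        string inputFile;
        int start = ReadInputFile(new[] { "grammar.txt", "/ifstale" }, out inputFile);
        Console.Error.WriteLine("input: {0}, switches start at index {1}", inputFile, start);
    }
}
```

The returned index plays the role of start in the boilerplate's switch loop.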
Now we do the switch processing. Basically, we spin a loop and in each iteration, we see what switch we're at. If the switch takes an argument, we need to check that we are not on the last argument, and then we need to increment i one extra time before we store the value, as follows:
    case "/output":
        if (args.Length - 1 == i) // check if we're at the end
            throw new ArgumentException(string.Format
                ("The parameter \"{0}\" is missing an argument", args[i].Substring(1)));
        ++i; // advance
        outputFile = args[i];
        break;
Here, since /output expects a single argument, we check to make sure we're not at the end, throwing if we are. Otherwise, we advance i by 1, and then set the appropriate command arg variable. This can be copied and pasted to make new switches that accept a single argument, like so:
    case "/name":
        if (args.Length - 1 == i) // check if we're at the end
            throw new ArgumentException(string.Format
                ("The parameter \"{0}\" is missing an argument", args[i].Substring(1)));
        ++i; // advance
        name = args[i];
        break;
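The only two changes are the case label ("/name") and the variable being assigned (name), which illustrates a /name switch taking a single argument. A matching name variable would also need to be declared alongside the other command line variables. The code was made to be copied and pasted.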
It's a similar thing with a boolean switch:
    case "/ifstale":
        ifStale = true;
        break;
The same two basic changes need to be made as above to add more.
If you need to create a switch that takes a variable number of arguments, you'd create code under your new switch that works very much like the inputFiles gathering code, except it will use i instead of start as its working variable.
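A switch with a variable number of arguments could look roughly like this. The /references switch, the MultiValueSwitchDemo class and the ParseSwitches() helper are hypothetical names for illustration only, and requiring at least one value is my assumption.

```csharp
using System;
using System.Collections.Generic;

static class MultiValueSwitchDemo
{
    // Hypothetical example: collects every value following /references
    // until the next switch or the end of the arguments.
    public static List<string> ParseSwitches(string[] args, int start)
    {
        var references = new List<string>();
        for (var i = start; i < args.Length; ++i)
        {
            switch (args[i].ToLowerInvariant())
            {
                case "/references":
                    var before = references.Count;
                    // same idea as the inputFiles loop, but driven by i
                    while (i + 1 < args.Length && !args[i + 1].StartsWith("/"))
                    {
                        ++i; // advance onto the value
                        references.Add(args[i]);
                    }
                    // assumption: at least one value is required
                    if (references.Count == before)
                        throw new ArgumentException(string.Format(
                            "The parameter \"{0}\" is missing an argument", args[i].Substring(1)));
                    break;
                case "/ifstale":
                    break; // ignored here; present only to show a neighboring switch
                default:
                    throw new ArgumentException(
                        string.Format("Unknown switch {0}", args[i]));
            }
        }
        return references;
    }

    static void Main()
    {
        var refs = ParseSwitches(
            new[] { "in.txt", "/references", "a.dll", "b.dll", "/ifstale" }, 1);
        Console.Error.WriteLine(string.Join(", ", refs));
    }
}
```

The inner loop peeks at args[i + 1] before advancing, so when it finishes, i still points at the last value consumed and the outer loop's ++i lands on the next switch.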
The default case throws since that indicates an unrecognized switch.
The overarching idea, in case it wasn't clear already, is that this switch/case sets the command-line variables that were declared earlier.
Sometimes, you'll have command line arguments that are illegal to specify alongside other command line arguments. For example, you might have a /debug option you can't specify with the /optimize option. After the switch loop is done, you'll want to do any post validation on your command-line variables to handle these cases, throwing exceptions as necessary. There is no code to that here, since we don't have such a scenario in the boilerplate code.
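For the hypothetical /debug and /optimize pair mentioned above, post validation might be sketched like this. Neither switch exists in the boilerplate, and the PostValidationDemo class and Validate() helper are names I made up for the example.

```csharp
using System;

static class PostValidationDemo
{
    // Hypothetical post validation: runs after the switch loop has
    // finished setting the command line variables.
    public static void Validate(bool debug, bool optimize)
    {
        if (debug && optimize)
            throw new ArgumentException(
                "The /debug and /optimize switches cannot be specified together.");
    }

    static int Main(string[] args)
    {
        bool debug = false;
        bool optimize = false;
        try
        {
            foreach (var a in args)
            {
                switch (a.ToLowerInvariant())
                {
                    case "/debug":
                        debug = true;
                        break;
                    case "/optimize":
                        optimize = true;
                        break;
                    default:
                        throw new ArgumentException(
                            string.Format("Unknown switch {0}", a));
                }
            }
            // all switches are parsed, so cross-checks can happen now
            Validate(debug, optimize);
        }
        catch (ArgumentException ex)
        {
            Console.Error.WriteLine("Error: {0}", ex.Message);
            return -1;
        }
        return 0;
    }
}
```

Throwing an ArgumentException keeps this consistent with the rest of the boilerplate, where the catch block turns exceptions into an exit code.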
Now we move on to the /ifstale feature. As described earlier, it skips the generation of the output unless an input file is newer than the output, or the executable itself is newer than the output. The code that handles this immediately follows the post-validation above. The only change you may need to make is for a single input file: remove the loop inside the stale checking code block and make it work on inputFile instead of inputFiles.
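With a single input file, the stale check might collapse to something like the following sketch. IsStale() mirrors the boilerplate's _IsStale(), while the SingleFileStaleDemo class and the ShouldGenerate() helper are hypothetical names introduced only for this example.

```csharp
using System;
using System.IO;
using System.Reflection;

static class SingleFileStaleDemo
{
    // same comparison as the boilerplate's _IsStale()
    public static bool IsStale(string inputFile, string outputFile)
    {
        var result = true;
        try
        {
            if (File.GetLastWriteTimeUtc(outputFile) >= File.GetLastWriteTimeUtc(inputFile))
                result = false;
        }
        catch { }
        return result;
    }

    // Hypothetical helper: the loop over inputFiles becomes one expression.
    public static bool ShouldGenerate(
        string inputFile, string outputFile, string codeBase, bool ifStale)
    {
        // without /ifstale or an output file, always generate
        if (!ifStale || null == outputFile)
            return true;
        return IsStale(inputFile, outputFile) || IsStale(codeBase, outputFile);
    }

    static void Main(string[] args)
    {
        if (args.Length < 2)
        {
            Console.Error.WriteLine("usage: <inputfile> <outputfile>");
            return;
        }
        var codeBase = Assembly.GetEntryAssembly().GetModules()[0].FullyQualifiedName;
        Console.Error.WriteLine(ShouldGenerate(args[0], args[1], codeBase, true)
            ? "stale: would generate"
            : "not stale: would skip");
    }
}
```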
After all that, we arrive at the commented else block, which is where we do our work. The steps here are to gather your data, process it, and then finally, open the output stream and generate the output. You can delegate to a routine here to do the work, and that might be a good idea, but I didn't want to confuse the flow. The only issue with delegating here is you'll probably need to pass a lot of variables, namely most of the command line argument variables you've declared. Whether and how I do this depends heavily on the application, but in practice I find it easier to just do a lot of the work here, which itself delegates to other things, like a code generator class.
In the finally block that follows, you'll want to release any resources in addition to output, like if you declared a database connection from the earlier hypothetical. Remember to check for nulls.
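To make the resource handling concrete, here is a minimal sketch of the declare-early, populate-later, release-in-finally pattern. FakeConnection is a hypothetical stand-in for a real database connection, and the CleanupDemo class, Run() method and Last field exist only so the sketch can be run and inspected. Unlike the boilerplate, the catch block here is unconditional to keep the example short.

```csharp
using System;

// Hypothetical stand-in for a database connection or similar resource.
public sealed class FakeConnection : IDisposable
{
    public bool IsOpen { get; private set; }
    public FakeConnection() { IsOpen = true; }
    public void Dispose() { IsOpen = false; }
}

static class CleanupDemo
{
    // exposed only so the sketch can be inspected after a run
    public static FakeConnection Last;

    public static int Run(bool fail)
    {
        int result = 0;
        // declared up front and set to null, populated later
        FakeConnection connection = null;
        try
        {
            connection = new FakeConnection();
            Last = connection;
            if (fail)
                throw new InvalidOperationException("Simulated failure.");
        }
        catch (Exception ex)
        {
            Console.Error.WriteLine("Error: {0}", ex.Message);
            result = -1;
        }
        finally
        {
            // remember to check for nulls: the failure may have
            // happened before the resource was ever created
            if (null != connection)
            {
                connection.Dispose();
                connection = null;
            }
        }
        return result;
    }

    static int Main()
    {
        return Run(false);
    }
}
```

In the boilerplate, this null check would sit next to the existing output cleanup in the finally block.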
Nothing that follows except for _PrintUsage() should need to be modified in your applications, as it's all support code that gathers assembly attributes and compares file dates. Note that when we compare files, we don't rely on File.Exists(), since it doesn't always work for UNC network paths.
Making your app MSBuild "friendly" in terms of communicating with things like Visual Studio when running as a pre-build step involves structuring your console messages the way that MSBuild likes them. You'll have to modify the error reporting, and you'll have to be careful how you structure your status messages besides, but doing so is beyond the scope of this article. Even if your tool doesn't do this, it will still work with Visual Studio. It just won't have some of the frills, like getting error and warning details with line numbers to show up in the build errors list. Know that it's possible, though.