Good questions, and I don't claim to have definitive answers.
Excellence is subjective, but there are definitely technical aspects that are excellent predictors of overall font quality. The signature mark of an amateur freeware font is poor spacing. Of course, using only spacing would underrate fonts like Mrs. Eaves, but I think that's the exception that proves the rule, and I think it's certainly possible to discuss the overall excellence of that font in other areas.
There are also uninspired faces with perfect geometric spacing, achieved at the sacrifice of other qualities. While spacing might be the first thing to look for, it is indeed only part of the puzzle.
No matter where you draw the line, you're going to have faces that straddle it. For example, Centaur is on the "arty" side of the workhorse text spectrum, and is more than viable as a display or titling face. Perhaps faces on the borderline get reviewed in both categories, with different criteria?
Perhaps there should be a spectrum to choose from. Instead of breaking it down by the formation of the letters, or history, or other existing ontologies, let the submitter describe the intended functionality? At one end the densest use, at the other the sparsest, meaning that the only codification would be the foundry's anticipated application. Off the top of my head, from densest it might go: long copy, short copy, highlights, display, decorative? Dingbats could even go out past decorative. Better names required, obviously.
Anyway. Depending on which category a submitter felt their work belonged in, it could be graded differently. Primarily, the individual letterforms and large-scale proportions would matter most at one end, while the other would emphasize color, consistency, etc.
The first weakness I can think of is that where we draw the lines between the applications changes over time. Existing fonts would move along that scale over the course of years. Are the workhorse sands drifting as we speak from short copy to long copy?