They can make the determination holistic and opaque, like the super-selective universities, but then they would be accused of corruption, favoritism, and unfairness at every turn. Or they can set automatically gradable criteria beforehand to avoid that perception, but then they run into the kind of problem you are describing, where the preset criteria fail to show who really is a top-end achiever and/or create undesired incentives.