Programming the Evaluation Robot
Stating where you want to go is acknowledging that you are not yet there. Seen from this perspective, the CCMC's recently published vision statement is an interesting list of its own non-accomplishments.
This is especially true of terms like “objectivity” and “repeatability,” which the visionaries of the new direction in Common Criteria repeat like a Tibetan prayer wheel. As I understand it, the CCMC has utterly failed to establish a common and sufficiently high standard for the evaluations underlying CC certificates. Apparently, some nations concluded that they cannot trust other nations' certificates up to the assurance levels that have been accepted in the past.
Don't get me wrong; I, too, want evaluations to be as objective and repeatable as possible. However, from all my experience working in IT security over the past 30 years, I know for a fact that an evaluation that reduces itself to a checklist activity has no value at all. It is the ultimate victory of form over content. In IT security, not looking at the content is not looking at security at all.
This really makes me nervous. From my worm's-eye view, I get the impression that some CCMC members, rather than bringing their labs up to a higher standard, seem to believe that they can make their collaborative Protection Profiles so specific that they can program their evaluation robots from such a specification, and that, without further ado, these robots will deliver objective and repeatable results at a sufficiently meaningful level.
It won't work. It cannot work. The “simple” reason: complexity! Even if products fall into the same category, they are not standardized to a point where the evaluation robot could deal with them in a meaningful way.
Do you remember the hype around artificial intelligence and expert systems in the '80s? If you don't, ask yourself why this topic disappeared so quietly. My explanation is that all the rules and checklists cannot possibly go down to the level of a single individual or product; they must always stop at some higher level and live with some uncertainty below it. That is what the CCMC needs to understand.
Think of medicine or the law. Did you ever wonder why you cannot go to court, state your case in front of a machine, turn a crank and get the verdict? No, you did not. You know that even with a vast body of laws and regulations, you cannot possibly cram life's complexity into a set of rules covering every aspect and combination. Instead, you rely on a judge to come to a verdict; he shall take all the relevant details of your case into account, even if those details have not been spelled out explicitly but can only be deduced from other cases. You want judges to have sufficient experience and enough common sense to come to a fair verdict (o.k., you don't insist on “fair” as long as you win ;-) ). What you also want is that the verdict comes with a rationale that allows you, your lawyer or other judges to follow the chain of arguments that led to it, and to challenge it if it is not sound.
Medicine provides similar examples. You don't want to grab somebody off the street, hand them a checklist and have them diagnose you. Nor do you want to be diagnosed by a robot that cannot look left or right of its pre-programmed algorithm. Again, what you expect is expertise, experience and common sense.
Every product is different; therefore, every evaluation is different, too! No checklist will be detailed enough!
Evaluations are very similar to these scenarios:
- I expect an ITSEF to diagnose a product under evaluation and come to a verdict based on expertise, experience and common sense. Actually, I don't have a problem if a doctor has some checklists that I fill out as an efficient start for the anamnesis, or if he uses one just to be sure that no important step was forgotten. I would, however, leave as soon as he told me that he was only allowed, and only able, to diagnose the diseases on his list.
- I expect ITSEFs (and CBs) to be qualified for their job. However, I don't expect all of them to be at the same level of expertise or to specialize in all product types. I'm fine with visiting my doctor if I have a cold, but I would not have him do brain surgery on me.
- I expect ITSEFs to be accredited by the CBs much as a doctor is licensed to practice, i.e., based on proven expertise and experience. As a patient, I expect doctors to be under supervision and charlatans to be banned from practice.
- I expect ITSEFs to document their evaluation work in a way that lets me understand what they did, how they did it, and which arguments led to their verdict. I accept that two judges may come to different verdicts on the same case, although I would prefer that this did not happen. However, if it does, it is crucial that the verdict comes with a rationale that can be followed and that allows a higher court to assess whether all aspects have been considered with due care.
by Gerald Krummeck
Head of the German ITSEF
Reading this, I think that objectivity should be a goal -- and should already be achieved. The opposite of objectivity is bias towards a vendor or other organization, and there should be no trace of that in evaluations.
I agree that repeatability is harder to achieve, and that a checklist approach is not the way to achieve it. Repeatability, however, can be achieved in a number of ways. It can be achieved by dictating the specifics of how something is to be tested. I'll argue that's more than repeatability -- that's standardized testing (akin to what we see in US public schools), which doesn't always have the intended effect of creating quality.
But repeatability can be achieved in other ways, such as through documentation of test plans and procedures, in such a way that another organization could come in and perform the exact same tests and, presuming the same product, get the same results. That also is repeatability... and that is something that is achievable without degradation of quality.
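To make that concrete, here is a minimal sketch, not taken from the discussion itself, of what a test procedure documented for repeatability might look like. The record layout, the names (TestProcedure, TestStep), the ATE_IND-style identifier and the product are purely illustrative assumptions; the point is only that once preconditions, exact steps and expected outcomes are written down at this level of detail, a second lab can re-run the same tests on the same product and compare results.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TestStep:
    action: str    # the exact command or manual action to perform
    expected: str  # the observable result that counts as a pass


@dataclass
class TestProcedure:
    test_id: str              # stable identifier; hypothetical here
    target: str               # exact product name, version and build tested
    preconditions: List[str]  # configuration required before the steps run
    steps: List[TestStep] = field(default_factory=list)

    def report(self) -> str:
        """Render the procedure so another lab can repeat it verbatim."""
        lines = [f"{self.test_id} on {self.target}"]
        lines += [f"  precondition: {p}" for p in self.preconditions]
        for i, step in enumerate(self.steps, start=1):
            lines.append(f"  step {i}: {step.action} -> expect: {step.expected}")
        return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical example: both the identifier and the product are made up.
    proc = TestProcedure(
        test_id="ATE_IND-FW-003",
        target="ExampleGateway 2.4.1 (build 7712)",
        preconditions=["factory-reset appliance", "audit logging enabled"],
        steps=[
            TestStep("send TCP SYN to closed port 8443", "connection rejected"),
            TestStep("review the audit log", "rejected connection is recorded"),
        ],
    )
    print(proc.report())
```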
Dear Dan,
I fully agree with your comments. My current concern is that assurance is lowered with the argument that evaluation activities are not objective and repeatable, using a very narrow definition of these terms.
We shall strive for objectivity and repeatability, but we must also accept that there will be differences in the details, and that such individual variations are not bad as long as they are documented and well-founded rather than arbitrary, as you have said. I know that in many areas, when the evaluator's experience comes into play, it adds tremendous value to the evaluation, and I don't want this value to be thrown away with pseudo-formal arguments.
Regards,
Gerald
Gerald,
Your blog makes some sense, but you don't discuss the reason for this change in the first place: the time, effort and cost of the current process, which leads to evaluations of products that are no longer being sold or supported. The problem NIAP is trying to solve is getting COTS into the hands of acquirers while the product is actually still sold by the vendor. By the way, the one area that is not being compromised in the new approach is VAN, and I would argue that is the one area where some subjectivity is OK.