Posted by Albert
I was recently invited by Bill Schambra, director of the Bradley Center for Philanthropy & Civic Renewal, to speak on a panel titled “Metrics Mania.” Panelists were asked to comment on an essay commissioned from Gary Walker, the founding director of Public/Private Ventures, on the subject of evaluating social programs. Here is the text of my remarks ...
Bill Schambra saw my role on this panel as the guy who, when asked about metrics, replies “metrics schmetrics,” but I’m going to have to disappoint him—to some extent. Measurement and evaluation, when done properly, are not just a bit of value-added for philanthropic or nonprofit work, they’re absolutely essential. Only a fool would disagree with that proposition.
But here I mean not just the kinds of formal evaluations described by Gary Walker in his essay, but informal evaluation as well: the kinds of course corrections we naturally make when we embark on a project, take a false step, and adjust what we do accordingly. Evaluation is not and should not be the sole province of the highly compensated consultant. We evaluate all the time; our own eyes and ears notice things the most astute consultant will never notice; and we’ll often be our own worst critics.
Now here’s where the metrics schmetrics comes in, perhaps: We’ve written more nonsense about evaluation than just about any other subject in philanthropy. Worries about evaluation, engendered in part by logic models the length of whale intestines, have become the math anxiety of the philanthropic world.
My general thesis—if I could call it that—is that from the perspective of somebody like Mr. Walker whose organization has been commissioned to conduct lucrative, large-scale evaluations of social programs (lucrative by nonprofit standards), the Impact Revolution might seem like a good thing. But from the ground, from the perspective of many people working in community-based organizations, this so-called revolution has brought with it new sources of irritation, new ways of adding meaningless make-work to already overburdened nonprofit staff members.
It has not been a people’s revolution, in other words, but rather one championed by elites—like myself, I’m afraid—sometimes unable to see far enough beyond our own measuring sticks to understand the limitations of formal evaluation techniques, and the trade-offs in staff time and other resources that these formal techniques require.
Some of these limitations have been well rehearsed—I’m thinking, for example, of the charge that this revolution has in many cases (not all) prompted us to measure the things that can be measured rather than the things that are important. Other limitations, like the absurdity of many logic models, are more technical and less well understood or discussed.
I’ll get to what I believe is perhaps the most challenging aspect of this revolution in a minute, but first I’d like to make a few quick comments about Gary’s paper:
After pointing out that “outcomes” are not the same as “impacts,” Gary comes to the startling conclusion that “the overwhelming majority of social programs with impact studies do not show a significant change in participants’ lives a year or two after the program.” He tries to soften the apparent harshness of this conclusion by claiming that these dismal results are an “artifact” of our approach to conducting these impact studies. He points out, for example, that these studies are typically conducted for new projects that haven’t been given the time or resources to work out their kinks. New projects, lots of problems that haven’t been worked out, dismal results.
As you ponder Gary’s startling claim, I urge you to keep the following in mind:
1. First, those of us who have worked in federally funded programs, like those evaluated by Gary’s organization, know how notoriously stingy the feds are when it comes to payments for program expenses and overhead. Throw in the Byzantine reporting requirements, and I’d be surprised if any of these programs succeeded.
2. It’s difficult to assess Gary’s claim without having all the data in front of us. Is it possible that the same organizations conducting the impact studies might also have been called upon to fix the problems they identified? If so, couldn’t this have introduced a source of bias?
3. Given that Gary himself says that impacts might not be apparent for ten or fifteen years, how many of the impact studies he refers to were conducted over this very long time span? How can any impact study with a pretense to being scientific ever control for the bias introduced by the self-selection of the clients for a particular program? And the list of methodological worries goes on.
Unfortunately, the ink hadn’t had time to dry on Gary’s essay before his words were yanked out of their context and given pride of place in the invitation for today’s panel discussion. And here I quote: “Why is it that philanthropy has learned so much about metrics and yet has so little by way of measurable success to show for it?”
I vigorously reject both of the implied claims. I’m not convinced we’ve learned much about metrics—we’re doing more of it, perhaps, but we’re not doing it better; and I certainly reject the notion that philanthropy has little to show by way of measurable success.
But the damage is done. The meme is loose and I can’t call it back.
I resonate most with Gary’s essay when he writes that “the first things funders need to be accountable for is the quality of the program which they’re funding. That requires patience, and a use of funds for things like training ....” I can think of other investments funders should be willing to make, but I like how Gary’s words suggest a certain amount of care and support for grantees, a genuine collegiality.
And this gets to my primary worry about the metrics revolution: I find the image of a funder with a stopwatch in one hand and a clipboard in the other, hunched over a perspiring grantee, rather ghastly, frankly. It’s uncivilized, so clearly opposed to what I believe should be the ethos of the charitable sector, an ethos rooted in love for our fellow men and women, expressed through our work, and incorporating the values of cooperation and mutual support, among others.
Too often, however, we funders use evaluations like blunt weapons, barely understanding—if at all—the limitations of these tools, and certainly being unwilling, for the most part, to turn these weapons on ourselves.
A few very quick points about this impact revolution as seen from the ground, then I’m done.
1. I’d like to suggest an “outputs counter-revolution.” I find the whole progression from outputs to outcomes to impacts one of the great bugbears of contemporary thinking about evaluation. For those of you unfamiliar with this bit of chicanery, outputs are the things you do, like mentoring a young person for five hours a week; outcomes are the changes that happen because of the things you do; and impacts, I assume, are outcomes that stick or that extend beyond our original goals.
I wouldn’t be surprised if someday we started requiring grantees to demonstrate not just their impacts but their—what would we call them?— hyperimpacts: the effects of a given social program on the afterlife, or on universes parallel to our own. Note that it would be absurd for us to call the gas company, thank them for their outputs (namely, the gas they deliver to our houses), and then complain that they haven’t demonstrated to us any outcomes or impacts. Why is it that we reserve this requirement for the people who work in the nonprofit sector?
And what’s so bad about outputs? As a donor, it’s enough for me to know that you delivered a quality youth-development program, in a church basement, to 25 kids who wouldn’t otherwise have had the opportunity. For God’s sake, don’t incur the expense of trying to track the effect of your program on these kids ten or fifteen years down the pike. That would be a ridiculous waste of resources.
Unfortunately, all this talk of outputs, outcomes, and impacts blinds us to the fact that in many cases—again not all—simple outputs are all that we can reasonably hope for or require.
2. I don’t know how many of you have followed the development of “distributed computing.” In this model, instead of installing and running an entire application like Photoshop on your desktop, the application is distributed across two or more computers connected by a network. There are many advantages to this model, among them your ability to access and use the most up-to-date version of Photoshop without having to purchase the entire thing and load it onto your machine.
Just as we’ve seen the advent of distributed computing in the digital world, I’d like to suggest we try something like “distributed evaluation” in the nonprofit world. Here’s how it would work in the case of a youth development organization, to take one example. We assume that academics and others have already researched the question to death and determined which elements of a youth development program are likely to yield good outcomes for young people. If you’re a grantee, we make it your responsibility to demonstrate that you’ve incorporated these elements into your program. We then make it my responsibility as a funder to know (because I’ve done the homework and read the literature) what these success-generating characteristics are, and to verify that they’re characteristics of your program. In this way, the burden of evaluation is shared three ways, and neither the funder nor the grantee needs to prove for the eleven-billionth time that young people respond well to nurturing environments that stimulate their hearts and minds.
3. I want to make clear that I’m not in the least anti-evaluation. As I’ve written elsewhere, I’m concerned that we tend to seek a kind of scientific or moral certainty from a formal evaluation where none exists. The questions that funders most often bring to an evaluator—Was this program worth our $25,000 investment? Should we continue funding it?—are questions only they, the funders, can answer. Say we measure a 25 percent drop in the truancy rate for a hundred kids in some program, and a 25 percent increase in their test scores. Is that worth $25,000 to you? Each donor needs to answer that question for him- or herself. As donors we will never be absolved of our responsibility to use our good judgment.
One of the greatest benefits an organization like Grantmakers for Effective Organizations can provide to the field is not a training on how to conduct evaluations—we have plenty of those—but a training on the questions that evaluations will never be able to answer. We might also benefit from being reminded that in a business context, we often strive to convert all our currencies to a single coin—namely, money—but that in many nonprofit contexts, values like mercy, justice, and love frequently motivate decisions that don’t always make sense to the bottom line and whose effects can’t always be measured.