Why not data science?

Before we talk about anything, we need to know what ‘it’ is.  What is data science?

First, what is science?  Science is “a systematic study, by observation or experiment; by an expert in scientific method and the field of study.”

And what is data?  Data is anything which can be measured, counted, or put in sequence.

For valuation, the systematic study, towards useful information, involves four basic steps:  1) problem identification/assumptions; 2) data selection; 3) predictive algorithm; and 4) information delivery [enhanced data stream].

For valuation, the choice cannot be ‘experiment’.  We can only rely on observation of market actions.

For valuation, the best expert is the experienced appraiser.  What is needed is a full understanding of today’s scientific method for valuation and risk measures.

The traditional, vintage appraisal “approaches to value” are scientific.  They follow the same pattern as above:  problem, data, prediction, report.  What is different is that today we can use most, if not all, of the relevant data in our analysis.

When I became an appraiser, data (comps) took up 80% of my time.  Once I had four or five comps, my job was almost done.  The typist took care of most of the rest.

Later, MLS, public records, and published comp services gave me printed, then electronic, data sheets.  My job became easier.  I could look at all the data, then get rid of all but six or seven, and ‘analyze’ four or five.  Easy.  Data discarding was the magic tool.  Faster and easier.  Just get rid of what feels odd or is difficult to verify.  Easy.

So what happened?  We had good science, based on the sparse and difficult data we had.  We had good science based on the three algorithms/models possible – the three ‘approaches’ to value.

I always think of the old approach as ‘tippy toeing’ up to the market data.  But don’t dive in and look at the full data set.  Just ‘approach’ it gingerly!  Why look at the whole market segment, when it’s more comfortable to just look at three comps?  Comfortable, easy, and accepted by all my peers!  Comfortable, easy, and accepted by all my user clients!  No need to change the way I’ve done things all along.  No need to fear the obtuse statistical stuff.  No need to learn anything new.

Tippy toe works.  It was good science once.  Just outdated today.

Today’s science says:  1) ask the right question; 2) use the right analyses; and 3) provide the user a clear perception of the research.  Today’s science says strive for the ‘true’ value.  But also strive for (and measure) the most reliable result (sureness).

Sureness and reliability require the use of all the relevant, available data – not just part of it.

When you use only part of the available information, the result is biased.  It is biased by accident of selection.
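
To make the point concrete, here is a minimal sketch in Python.  Everything in it is an assumption made for illustration: the sale prices are invented, and the selection rule (keep the five sales closest to what the appraiser already expects, discard the rest) is only one hypothetical way that ‘getting rid of what feels odd’ might play out.

```python
from statistics import median

# Hypothetical sale prices (in $1,000s) for one market segment.
# These numbers are invented for illustration only.
market_sales = [310, 325, 330, 342, 348, 355, 361, 367,
                372, 380, 388, 395, 402, 415, 430]


def pick_comps(sales, anchor, k=5):
    """Keep the k sales closest to a prior expectation; discard the rest."""
    closest = sorted(sales, key=lambda price: abs(price - anchor))[:k]
    return sorted(closest)


def estimate(sales):
    """Central tendency (median) of whatever sales were kept."""
    return median(sales)


# Estimate from the whole market segment.
full_market = estimate(market_sales)

# Estimates from five hand-picked comps, under two different expectations.
low_anchor = estimate(pick_comps(market_sales, anchor=335))
high_anchor = estimate(pick_comps(market_sales, anchor=400))

print(full_market, low_anchor, high_anchor)
```

With these invented numbers, the full segment gives 367, while the two sets of five comps give 342 and 395.  Same market, same sales.  The only thing that changed was which sales were kept, so the answer followed the expectation rather than the market.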

Selection bias is not personal bias.  Selection bias is analytic bias.  But analytic bias can enable, or even cover up, personal bias.  It makes prejudicial bias possible.

So why data science?  The science of data addresses this issue of selection bias.  It helps block the potential for personal bias, whether accidental or intentional.

Until we are able to implement correct, modern analytics, we will not be able to intelligently address the other bias issues we see today.  Good science is the path to good fairness.  It is good public policy.