Saturday, August 20, 2011

A look at Prosper's 2008 Loans

Previously I published a list of words which did poorly in pre-2008 loans and words that did well in pre-2008 loans. I wanted to see if those lists could predict what would happen in 2008 loans on Prosper.

I created a new value, WordValue, which will become negative if a loan has more words which had previously failed in it and become positive if a loan has more words which had previously succeeded in it. (Additional description of the value is below.)

Suffice it to say, I expected that loans with lower WordValues would be repaid less often than loans with higher WordValues. It turns out that this was not the case for 2008 loans:

Description                  Paid     Recovered   Never Recovered
2008, D, All                 49.8%    50.7%       34.7%
2008, D, WordValue < -1      50.9%    51.7%       34.2%
2008, D, WordValue >= -1     47.5%    48.3%       35.8%
2008, E, WordValue < -1      44.7%    45.6%       39.1%
2008, E, WordValue >= -1     42.2%    42.7%       46.4%

What I found is that loans with lower WordValues were actually repaid at a higher rate than loans with higher WordValues--exactly the opposite of what I was expecting. This means that some of the words that performed poorly in loans before 2008 performed better than average in 2008 loans.

Since my original conjecture was that loans containing the word "need" perform worse than average, I tested that on this same set of loans and found the following:

Description                                          Paid     Recovered   Never Recovered
2008, D, All                                         49.8%    50.7%       34.7%
2008, D, Title or Description contain "need"         49.7%    50.7%       36.2%
2008, D, Title or Description do not contain "need"  49.9%    50.7%       33.8%
2008, E, Title or Description contain "need"         43.5%    44.6%       39.3%
2008, E, Title or Description do not contain "need"  44.6%    45.4%       41.6%
2009, D, All                                         27.9%    27.9%       13.4%
2009, D, Title or Description contain "need"         25.5%    25.5%       14.7%

We get mixed messages here, too. 2008 D loans with "need" were about 2.4 percentage points more likely to never recover (meaning they were confirmed to Default or Charge Off) than those without it, but roughly equally likely to have ended with a "Paid" status. 2008 E loans with "need" are, to date, less likely to have finished with a "Paid" status but also less likely to never recover (39.3% versus 41.6%, about 5.5% lower in relative terms). 2009 D loans with "need" are less likely to have ended as Paid and more likely to never recover than 2009 D loans overall.
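The comparisons above mix two ways of stating a gap: the difference in percentage points and the relative difference. A small sketch of both, using the Never Recovered figures from the table (the function name is my own, made up for illustration):

```python
def point_and_relative_diff(with_need, without_need):
    """Compare two rates given in percent.

    Returns (difference in percentage points,
             relative difference in percent of the 'without' rate).
    """
    points = with_need - without_need
    relative = points / without_need * 100
    return points, relative


# Never Recovered rates from the table: loans with "need" vs. without it.
d_points, d_relative = point_and_relative_diff(36.2, 33.8)  # 2008, D
e_points, e_relative = point_and_relative_diff(39.3, 41.6)  # 2008, E

print(f"2008 D: {d_points:+.1f} points, {d_relative:+.1f}% relative")
print(f"2008 E: {e_points:+.1f} points, {e_relative:+.1f}% relative")
```

The two measures always share a sign, but they can look quite different in size, so it helps to say which one is being quoted.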

Now obviously not all 2008 and 2009 3-year loans have reached the end of their term. We'll be able to draw better conclusions in the coming months, but it's entirely possible that there is no correlation between "need" and loans which aren't repaid.

In future posts I'll whittle down my list of words that fail and see if I can find a set of words that consistently has results which are worse than the average.



About the WordValue number:
I created the WordValue by taking, for each word, the difference between the Paid percentage of pre-2008 loans containing that word and the average Paid percentage for all loans before 2008. I only used words whose Paid percentage was more than 0.5% above or below that average.

The WordValue number for a loan is the sum of those differences over the words that appear in the loan, with each word counted only once no matter how often it appears.
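Here is a minimal sketch of that calculation in Python. It is not the code I actually ran; the function names, the way text is split into words, and the sample rates are all assumptions made up for illustration.

```python
import re


def word_deltas(paid_rate_by_word, average_paid_rate, threshold=0.5):
    """Keep each word's difference from the average Paid percentage,
    dropping words that sit within `threshold` points of the average."""
    deltas = {}
    for word, rate in paid_rate_by_word.items():
        delta = rate - average_paid_rate
        if abs(delta) > threshold:
            deltas[word] = delta
    return deltas


def word_value(text, deltas):
    """Sum the deltas of the scored words found in the text.

    A set is used so that each word counts once, however often it appears.
    The tokenization (lowercase runs of letters) is an assumption.
    """
    words = set(re.findall(r"[a-z']+", text.lower()))
    return sum(deltas[w] for w in words if w in deltas)


# Hypothetical numbers, NOT taken from the Prosper data.
sample_rates = {"need": 46.0, "thanks": 53.5, "car": 50.2}
deltas = word_deltas(sample_rates, average_paid_rate=50.0)

listing = "Need help, need a car. Thanks!"
print(word_value(listing, deltas))
```

With these made-up rates, "car" is dropped because it is within 0.5 of the average, and "need" is counted once even though it appears twice in the listing.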
