Sunday, August 7, 2011

Needs Series: Comparing to Lending Club

Of course, all of this needs data does us absolutely no good if it isn't generalizable beyond the loans it came from. Since the initial data was taken from Prosper, Lending Club loans seemed like a good comparison, one that would tell us whether we're finding something relevant or just random noise.

Data Set: All Lending Club Loans made in 2007 and 2008

| 2007-2008 Loans | Total Loans | Percent Charged Off or Defaulted |
|---|---|---|
| All Loans | 2996 | 21.2% |
| "Need" in title or description | 814 | 26.8% |
| "Need" not in title nor description | 2182 | 19.2% |
| "Payday" in title or description | 7 | 57.1% |
| "Payday" not in title nor description | 2989 | 21.1% |

So there you have it: loans with the word "Need" in the title or description were charged off or defaulted more often (26.8% versus 19.2%) in Lending Club as well. (I included "Payday" data just for fun -- with only 7 loans the data isn't likely to be meaningful, but it does fall in line with what we'd expect.)
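For anyone who wants to reproduce this kind of split, here is a minimal sketch of the counting step. It is not the code used for this post: the field names (`title`, `description`, `status`), the status strings, and the case-insensitive substring match (which also catches "needs" and "needed") are all assumptions, and the sample loans are made up.

```python
def failure_rate_by_keyword(loans, keyword, bad_statuses=("Charged Off", "Default")):
    """Split loans by whether `keyword` appears in the title or description,
    then count how many loans in each group have a 'bad' status.

    `loans` is a list of dicts with hypothetical keys: title, description, status.
    """
    groups = {True: [0, 0], False: [0, 0]}  # has_keyword -> [total, failed]
    needle = keyword.lower()
    for loan in loans:
        # Treat missing or None fields as empty text.
        text = (loan.get("title") or "") + " " + (loan.get("description") or "")
        has_keyword = needle in text.lower()
        groups[has_keyword][0] += 1
        if loan.get("status") in bad_statuses:
            groups[has_keyword][1] += 1
    return {
        has_kw: {
            "total": total,
            "failed": failed,
            "rate": failed / total if total else None,
        }
        for has_kw, (total, failed) in groups.items()
    }


# Made-up sample records, only to show the call.
sample = [
    {"title": "Need cash fast", "description": "", "status": "Charged Off"},
    {"title": "Car loan", "description": "I need a car", "status": "Fully Paid"},
    {"title": "Wedding", "description": "Happy day", "status": "Fully Paid"},
]
print(failure_rate_by_keyword(sample, "need"))
```

The set of statuses counted as failures is a parameter, so the same function can be reused with a broader definition of a bad loan.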

Since so many of the 2009 loans won't be paid off until 2012, I included the >30 days late category with the already Charged Off and Defaulted loans and found the following stats:

| 2009 Loans | Total Loans | Percent Charged Off, Defaulted or >30 days late |
|---|---|---|
| All Loans | 5281 | 10.7% |
| "Need" in title or description | 1263 | 13.6% |
| "Need" not in title nor description | 4018 | 9.8% |
| "Payday" in title or description | 6 | 83.3% |

The trend continues...
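One rough way to check whether gaps like these are more than random noise is a two-proportion z-test. The sketch below is an illustration, not something that was run for this post. The failure counts are back-calculated from the rounded percentages above (for example, 26.8% of 814 is about 218), so treat them as approximations. The normal approximation is also a poor fit for the tiny "Payday" groups, so they are left out.

```python
import math


def two_proportion_z(fail_a, n_a, fail_b, n_b):
    """Pooled two-proportion z-test.

    Returns (z, two_sided_p_value) for the difference between the
    failure rates fail_a/n_a and fail_b/n_b.
    """
    p_a = fail_a / n_a
    p_b = fail_b / n_b
    pooled = (fail_a + fail_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Approximate counts reconstructed from the rounded percentages (assumption).
z_0708, p_0708 = two_proportion_z(218, 814, 418, 2182)   # "Need" vs. no "Need", 2007-2008
z_2009, p_2009 = two_proportion_z(172, 1263, 394, 4018)  # "Need" vs. no "Need", 2009
print(f"2007-2008: z={z_0708:.2f}, p={p_0708:.2g}")
print(f"2009:      z={z_2009:.2f}, p={p_2009:.2g}")
```

A large absolute z with a small p-value would suggest the difference is unlikely to be chance alone, though it says nothing about why the word and the outcome go together.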

All Articles in the Needs Series
An Introduction
Initial Findings
Correlation Matrix
Comparing to Lending Club

1 comment:

  1. Hiya,

    A very interesting post. I'm the CEO of Estonian P2P lending site

    We ran a similar text analysis on our data when we built our internal credit score and found similar correlations. In case you are interested I could send you the full list of 'problematic' words. Interestingly payday also proved to be very problematic even on a larger data set.

    In case you're interested you could e-mail me at