Saturday, July 30, 2011

Words of Win

Let's take a look at the 10 most likely words to have been used in a loan that repaid:

lender
risk
card
rate
fund
investment
rates
0
minimum
ratio

These words range from lender, at 8% more likely than average to repay, to ratio, at 4% more likely than average to repay.

Interesting here is the discrepancy between the words of loss -- maxing out at 23% less likely to pay than average -- and the words of win -- maxing out at only 8% more likely to pay than average. It could be that the words a borrower chooses only help in identifying borrowers who are less likely than average to repay, and do not reveal borrowers who are more likely to repay.

Update Aug 30: I have added a post on the methodology used to find these words.

Words of Loss

I took a look at all loans that were made on Prosper up through the end of 2007 to see what words people who failed to repay their loans used most often. Looking only at words that were used more than 1000 times, the bottom 10 list is quite interesting:

payday
chance
behind
son
daughter
mother
children
child
track
deleted

Requests where the word 'payday' was used were 22% less likely than average to repay their loans. Requests containing the word 'deleted' were 13% less likely than average to repay their loans.
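
One way to read a figure like "22% less likely than average" is as a relative difference between the repayment rate of loans using a word and the overall repayment rate. The sketch below assumes that reading; the actual method is covered in the separate methodology post and may differ. The function name, the tokenization, the per-loan counting, and the sample loans are all made up for illustration.

```python
import re
from collections import defaultdict

def relative_repayment_by_word(loans, min_count=1):
    """loans: list of (description, repaid) pairs, repaid being True/False.

    Returns {word: relative difference from the overall repayment rate}.
    A value of -0.22 reads as "22% less likely than average to repay".
    """
    overall = sum(1 for _, repaid in loans if repaid) / len(loans)
    used = defaultdict(int)   # number of loans containing the word
    paid = defaultdict(int)   # number of those loans that were repaid
    for text, repaid in loans:
        # Count each word once per loan, ignoring case (an assumption).
        for word in set(re.findall(r"[a-z0-9]+", text.lower())):
            used[word] += 1
            paid[word] += bool(repaid)
    return {
        w: (paid[w] / used[w]) / overall - 1
        for w in used
        if used[w] >= min_count
    }

# Hypothetical toy data, not real Prosper listings.
sample = [
    ("need a payday loan paid off", False),
    ("payday lenders are killing me", False),
    ("payday advance consolidation", True),
    ("lower my card rate", True),
    ("refinance card at a better rate", True),
    ("card payoff", False),
]
scores = relative_repayment_by_word(sample, min_count=2)
for word, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{word:8s} {score:+.0%}")
```

The min_count argument plays the role of the "used more than 1000 times" cutoff: rare words produce noisy rates, so they are dropped before ranking.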

Most interesting, in my mind, are all of the words that invoke family. Son, daughter, mother, children, child -- half of the bottom 10 words refer to family members. Husband, at 10% less likely than average to pay, is 24th from the bottom.

In the next post I'll take a look at the words which were associated with a positive loan outcome.

Update Aug 30: I have added a post on the methodology used to find these words.

Monday, July 4, 2011

Need Series: Correlation Matrix

A correlation matrix tells us how strongly changes in various factors relate to one another. We're looking here to see if using the word "need" is strongly associated with some other variable, such as credit grade, Debt-To-Income Ratio (DTI), or any other factor. If there is a strong correlation with some other factor, then we know that we're not on to anything new here and can move along.

My general understanding is that a correlation whose magnitude (positive or negative) is over .5 is strong, and one that is only over .1 is weak. A correlation of 0 would mean that there is no linear relationship between the values at all.

Data set: Loans started between 2005-2008 which were not Cancelled
Credit Grade was scored as: AA=10, A=9, ..., HR = 4
Loan Outcome was scored as: Paid = 3, PaidInFull and RecoveredInFull = 2, Everything Else = 1
Title has need was scored as: Yes = 1, No = 0
Description has need was scored as: Yes = 1, No = 0
DTI and Amount Requested were as published by Prosper

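| | Credit Grade | Description Has Need | Debt To Income | Amount Requested | Title Has Need | Loan Outcome |
| --- | --- | --- | --- | --- | --- | --- |
| Credit Grade | 1 | -.17 | .03 | .41 | -.11 | .30 |
| Description Has Need | -.17 | 1 | .01 | -.04 | .16 | -.11 |
| Debt To Income | .03 | .01 | 1 | .09 | 0 | .04 |
| Amount Requested | .41 | -.04 | .09 | 1 | -.05 | -.06 |
| Title Has Need | -.11 | .16 | 0 | -.05 | 1 | -.06 |
| Loan Outcome | .30 | -.11 | .04 | -.06 | -.06 | 1 |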

As we would expect, we see some stronger correlations, like the .3 between credit grade and loan outcome. We also see a decent correlation between the description and the title having the word "need" in them.

It is interesting to see a -.11 correlation between loan outcome and the word "need" in the description of the loan -- this is stronger than the -.06 correlation when we look at the title. In a future post we'll do a t-test to see if these results are statistically significant.
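
For anyone who wants to reproduce this kind of matrix, here is a minimal sketch of the scoring scheme and a Pearson correlation in plain Python. The scores for grades B through E are inferred from the "AA=10, A=9, ..., HR = 4" pattern, the "Defaulted" status name and the sample loans are hypothetical, and treating "has need" as a case-insensitive substring match is an assumption.

```python
import math

# AA=10 and A=9 down to HR=4; the values for B..E are inferred from that pattern.
GRADE_SCORE = {"AA": 10, "A": 9, "B": 8, "C": 7, "D": 6, "E": 5, "HR": 4}

def outcome_score(status):
    # Paid = 3, PaidInFull and RecoveredInFull = 2, everything else = 1
    if status == "Paid":
        return 3
    if status in ("PaidInFull", "RecoveredInFull"):
        return 2
    return 1

def has_need(text):
    # Yes = 1, No = 0 (substring match is an assumption)
    return 1 if "need" in text.lower() else 0

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical (grade, status, title) rows, not real Prosper data.
loans = [
    ("AA", "Paid", "Kitchen remodel"),
    ("A", "Paid", "Consolidating two cards"),
    ("C", "Defaulted", "Need help with bills"),
    ("HR", "Defaulted", "HELP! Need money"),
    ("D", "PaidInFull", "need to build credit"),
    ("B", "Paid", "New roof"),
]
columns = {
    "Credit Grade": [GRADE_SCORE[g] for g, _, _ in loans],
    "Title Has Need": [has_need(t) for _, _, t in loans],
    "Loan Outcome": [outcome_score(s) for _, s, _ in loans],
}
names = list(columns)
matrix = {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}
for a in names:
    print(f"{a:15s}", "  ".join(f"{matrix[a][b]:+.2f}" for b in names))
```

The diagonal is always 1 and the matrix is symmetric, which is a handy sanity check when rebuilding a table like the one above.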


All Articles in the Needs Series
An Introduction
Initial Findings
Correlation Matrix
Comparing to Lending Club
What We Fund

Sunday, July 3, 2011

Need Series: Initial Findings

Let's look at my initial findings:

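| 2005-2006 Loans | Total Loans | Never Recovered | Recovered | Paid |
| --- | --- | --- | --- | --- |
| Title contains "need" | 652 | 46.9% | 52.8% | 51.2% |
| Title does not contain "need" | 5321 | 37.0% | 62.9% | 61.4% |

| 2007 Loans | Total Loans | Never Recovered | Recovered | Paid |
| --- | --- | --- | --- | --- |
| Title contains "need" | 1092 | 47.7% | 52.3% | 50.6% |
| Title does not contain "need" | 10387 | 37.3% | 62.7% | 61.5% |

| 2008 Loans | Total Loans | Never Recovered | Recovered | Paid |
| --- | --- | --- | --- | --- |
| Title contains "need" | 861 | 34.8% | 48.9% | 48.6% |
| Title does not contain "need" | 10704 | 30.3% | 55.0% | 54.3% |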

It's worth pointing out that not all 2008 loans will have completed yet, so we expect the percent Paid to increase -- for both "need" and not "need" loans -- as the year goes on. Still, for loans which have reached their conclusion (2005-2007 loans) we see roughly a 10 percentage point difference in the share Paid between the groups.
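
As a rough first check on whether a gap like this could be chance, here is a sketch of a two-proportion z-test on the 2007 numbers (1092 loans at 50.6% Paid versus 10387 loans at 61.5% Paid). This is a stand-in I'm choosing for illustration, not the t-test planned for later in the series, and it assumes the loans are independent and that the published percentages are accurate to the rounding shown.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_err
    # Two-sided tail probability of a standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# 2007 loans: title contains "need" vs. title does not contain "need".
z, p_value = two_proportion_z(0.506, 1092, 0.615, 10387)
print(f"z = {z:.2f}, two-sided p = {p_value:.2g}")
```

A large negative z with a tiny p-value would say the gap is unlikely to be sampling noise, but it says nothing about why the gap exists; credit grade could still explain it.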


Now 312lender and frinxor pointed out on the prospers.org forum that use of the word "need" could be directly correlated with credit score and, hence, non-payment rate. Let's take a look at those numbers for all loans which originated between 2005-2007:

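| Credit Score | Paid With "Need" | Paid Without "Need" | Difference |
| --- | --- | --- | --- |
| AA | 79.6% | 87.0% | 7.4% |
| A | 76.3% | 76.2% | -0.1% |
| B | 60.5% | 69.3% | 8.8% |
| C | 57.1% | 62.3% | 5.2% |
| D | 53.0% | 60.0% | 7.0% |
| E | 45.4% | 49.3% | 3.9% |
| HR | 32.3% | 37.0% | 4.7% |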

So it would seem that almost every credit grade has a difference between the groups. Now it's interesting to note that only 5% of A loans used the word "need" in the title and nearly 15% of HR loans used the word "need" in the title. This may, in fact, be relevant when we do more in-depth statistical analysis later on.
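
The Difference column is just "Paid Without" minus "Paid With", in percentage points. A small sketch that recomputes it from the table and summarizes it follows; the unweighted average across grades is my own simplification, since the per-grade loan counts aren't published here.

```python
# (paid with "need", paid without "need") in percent, copied from the table above.
rates = {
    "AA": (79.6, 87.0),
    "A": (76.3, 76.2),
    "B": (60.5, 69.3),
    "C": (57.1, 62.3),
    "D": (53.0, 60.0),
    "E": (45.4, 49.3),
    "HR": (32.3, 37.0),
}

# Difference in percentage points, rounded to match the table's precision.
diffs = {grade: round(without - with_need, 1)
         for grade, (with_need, without) in rates.items()}

# Grades where "need" loans were paid less often than the rest.
need_lower = [grade for grade, d in diffs.items() if d > 0]

# Unweighted mean gap across grades (ignores how many loans each grade has).
mean_gap = sum(diffs.values()) / len(diffs)

for grade, d in diffs.items():
    print(f"{grade:2s} {d:+.1f}")
print(f"{len(need_lower)} of {len(diffs)} grades lower with 'need'; "
      f"mean gap {mean_gap:.1f} points")
```

Comparing within each grade like this is what keeps the credit-grade mix (5% of A loans versus nearly 15% of HR loans using "need") from driving the result on its own.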

Initially, however, it still looks like we're on to something.



Need Series: An Introduction

I look at the titles of many Prosper loans, and they sadden me:
Need money to pay off the high interest credit card bills !!
Fresh Start Needed!
need to build credit
HELP! Need money til I refi

I see borrowers needing money and I want to avoid those loans like the plague. It bothers me. I don't like the idea of needing things from others--and I don't like the idea of others needing things from me. It just seems to me that if a borrower really needs the money then they're not trying hard enough or thinking widely enough about the problem. And I associate that attitude with a failure to pay back loans.

So I set out to explore the question: "Do loans with the word 'need' in the title pay back less often than loans without the word 'need' in the title?"

My initial results are interesting. As you can see from the first batch of 2008 loans I tested, there was a 5.7 percentage point difference in the share of loans that were paid as agreed:
| Description | Number of Loans | Paid |
| --- | --- | --- |
| 2008 Loans, with "need" in the title | 861 | 48.6% |
| 2008 Loans, without "need" in the title | 10704 | 54.3% |

Five percent isn't huge, but it's big enough to keep my interest for a while. In future posts I'll break open the statistics book to start to explore these findings. I'll explore whether the numbers are even statistically significant and I'll see if there is a better explanation for my findings, such as credit rating and current delinquencies. Maybe I'm on to something new. Maybe I'm just tilting at windmills.
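
The comparison itself is simple to sketch: flag each title, then compute the Paid share for each group. The version below assumes a case-insensitive substring match, so "Needed" counts as a hit just like the sample titles above suggest. The function names are hypothetical and the paid/unpaid outcomes attached to the titles are invented for the example.

```python
def title_has_need(title):
    # Case-insensitive substring match, so "Needed" and "NEED" both count.
    return "need" in title.lower()

def paid_rate_by_need(loans):
    """loans: iterable of (title, paid) pairs.

    Returns {True: paid share of "need" titles, False: paid share of the rest}.
    """
    totals = {True: 0, False: 0}
    paid_counts = {True: 0, False: 0}
    for title, paid in loans:
        flag = title_has_need(title)
        totals[flag] += 1
        paid_counts[flag] += bool(paid)
    return {flag: paid_counts[flag] / totals[flag]
            for flag in totals if totals[flag]}

# Titles from the post plus two neutral ones; the outcomes are made up.
sample = [
    ("Need money to pay off the high interest credit card bills !!", False),
    ("Fresh Start Needed!", True),
    ("need to build credit", False),
    ("HELP! Need money til I refi", False),
    ("Debt consolidation", True),
    ("Paying off my car", False),
]
rates = paid_rate_by_need(sample)
print(f"with need: {rates[True]:.1%}, without need: {rates[False]:.1%}")
```

A substring match will also catch unrelated words that happen to contain "need", so a word-boundary match is a reasonable alternative if that turns out to matter.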



Background

Chasing good returns, I've been lending on Prosper since summer of 2006. For me, and so many other early Prosper lenders, those returns never materialized and I lost money on my initial investments.

Here it is, some 5 years later. The housing and stock markets have tanked, Prosper has gone through an SEC registration, and LendingClub has emerged. To date my only peer-to-peer lending remains with Prosper, but it's time to explore lending my money a bit more wisely.

In this blog I'll be exploring some ideas I have about lending, re-learning long-forgotten statistics, and ultimately chasing a healthy return for my investment.

My first series will explore my hunch that Prosper loans with the word "need" in the title end up paying out less than loans without it. I've got a statistics book on my desk, a stats professor ready to tell me my assumptions are wrong, and 7 years of Prosper loan numbers to analyze.