Percent Paid (By Loan): The percentage of loans containing the indicated word at least once that finished with a status of Paid.
Percent Paid (By Word): The percentage of the time that a loan ended with the status Paid, weighted by the frequency of the word in the listing. (For example, a loan titled "Help, help, help, help!" that did not pay would count four times as much as a loan with "Help" listed only once.)
Word Count: The number of listings containing the word at least once. (Notably not the total number of times the word was used--the maximum here is once per listing.)
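To make the three definitions concrete, here is a minimal Python sketch of how the columns could be computed. It is an illustration, not the code behind the tables: the function names, the `(text, paid)` input format and the tokenizer are all assumptions.

```python
import re
from collections import Counter

def tokenize(text):
    # Hypothetical tokenizer: lowercase, keep runs of letters, digits and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())

def word_stats(listings):
    """listings: iterable of (text, paid) pairs; paid is True if the loan's status is Paid."""
    loans_with_word = Counter()       # listings containing the word at least once
    paid_loans_with_word = Counter()  # ...of which the loan was paid
    uses = Counter()                  # total occurrences of the word
    paid_uses = Counter()             # occurrences that sit in paid loans
    for text, paid in listings:
        for word, n in Counter(tokenize(text)).items():
            loans_with_word[word] += 1
            uses[word] += n
            if paid:
                paid_loans_with_word[word] += 1
                paid_uses[word] += n
    return {
        word: {
            "pct_paid_by_loan": 100.0 * paid_loans_with_word[word] / loans_with_word[word],
            "pct_paid_by_word": 100.0 * paid_uses[word] / uses[word],
            "word_count": loans_with_word[word],
        }
        for word in loans_with_word
    }

# The "Help" example from the definition above: one unpaid loan using the
# word four times, one paid loan using it once.
sample = [("Help, help, help, help!", False), ("Help", True)]
stats = word_stats(sample)
print(stats["help"])
```

In this toy example half of the loans containing "help" were paid, but only one of the five uses of the word sits in a paid loan, so the by-word figure is pulled well below the by-loan figure.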
As in the original posts, these are words from Prosper loans that were created before 2008. My methodology is at the bottom of the post; in short, loans were assigned to groups randomly, and there were 8728 loans in each group.
Group 1 Worst Performing Words
Word | Percent Paid (By Loan) | Percent Paid (By Word) | Word Count |
[average Paid] | 61.1% | | |
payday | 38.9% | 38.6% | 596 |
behind | 42.9% | 43.6% | 592 |
mother | 43.5% | 44.8% | 566 |
chance | 44.5% | 42.1% | 631 |
track | 46.8% | 45.6% | 581 |
son | 47.1% | 44.8% | 597 |
daughter | 48.1% | 46.3% | 516 |
child | 48.7% | 47.9% | 520 |
husband | 49% | 51.3% | 896 |
single | 49.5% | 49.7% | 707 |
Group 2 Worst Performing Words
Word | Percent Paid (By Loan) | Percent Paid (By Word) | Word Count |
[average Paid] | 59.6% | | |
payday | 37.5% | 39.2% | 595 |
behind | 42.4% | 41.3% | 566 |
chance | 43.5% | 41.5% | 575 |
son | 45.7% | 44.2% | 514 |
mother | 46.6% | 46.7% | 601 |
children | 47% | 45.6% | 854 |
daughter | 47.7% | 44.6% | 539 |
DELETED | 47.7% | 46.6% | 507 |
child | 47.8% | 46.3% | 552 |
30000 | 48.3% | 47.6% | 532 |
So, as with the original Words of Loss post, we see the word 'payday' at the very bottom, with 'behind', 'chance' and family words like 'mother', 'child', etc. also near the bottom for both groups.
Now let's take a look at the best performing words:
Group 1 Best Performing Words
Word | Percent Paid (By Loan) | Percent Paid (By Word) | Word Count |
[average Paid] | 61.1% | | |
tax | 67.1% | 66.7% | 504 |
early | 67.2% | 67.7% | 534 |
rate | 67.6% | 68.6% | 1952 |
term | 67.6% | 66.6% | 509 |
risk | 67.8% | 70.3% | 565 |
fund | 68.2% | 70.2% | 666 |
rates | 68.3% | 68.8% | 609 |
lender | 68.3% | 70.2% | 707 |
minimum | 68.4% | 68.4% | 583 |
investment | 69.1% | 69.3% | 679 |
Group 2 Best Performing Words
Word | Percent Paid (By Loan) | Percent Paid (By Word) | Word Count |
[average Paid] | 59.6% | | |
risk | 64.2% | 66.1% | 592 |
card | 64.2% | 65.9% | 2910 |
higher | 64.4% | 63.2% | 765 |
style | 64.5% | 58.8% | 968 |
span | 64.9% | 59.5% | 1069 |
don't | 65% | 64.3% | 861 |
rate | 65.2% | 65.3% | 1938 |
student | 66% | 66.3% | 1078 |
lender | 66.2% | 67.5% | 754 |
I've | 66.2% | 63.6% | 888 |
It's interesting to see that lending words appear on both of these lists -- but there are fewer matches between the groups than there were among the worst performing words. It looks like we've got 'risk', 'rate(s)' and 'lender' as matches, and all of these sit much closer to the average Percent Paid than the worst performing words do.
It could be that the words a loan uses can only tell us that it is more likely to fail, not that it is more likely to succeed.
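The overlap can be checked directly from the four tables above. The short Python sketch below copies the Percent Paid (By Loan) column from each table; the variable names are just for illustration.

```python
# Percent Paid (By Loan), copied from the four tables above.
worst_g1 = {"payday": 38.9, "behind": 42.9, "mother": 43.5, "chance": 44.5,
            "track": 46.8, "son": 47.1, "daughter": 48.1, "child": 48.7,
            "husband": 49.0, "single": 49.5}
worst_g2 = {"payday": 37.5, "behind": 42.4, "chance": 43.5, "son": 45.7,
            "mother": 46.6, "children": 47.0, "daughter": 47.7,
            "DELETED": 47.7, "child": 47.8, "30000": 48.3}
best_g1 = {"tax": 67.1, "early": 67.2, "rate": 67.6, "term": 67.6,
           "risk": 67.8, "fund": 68.2, "rates": 68.3, "lender": 68.3,
           "minimum": 68.4, "investment": 69.1}
best_g2 = {"risk": 64.2, "card": 64.2, "higher": 64.4, "style": 64.5,
           "span": 64.9, "don't": 65.0, "rate": 65.2, "student": 66.0,
           "lender": 66.2, "I've": 66.2}
avg_g1, avg_g2 = 61.1, 59.6

# Words that show up in both groups' top-ten lists (exact matches only,
# so 'rates' and 'children' do not count as matches for 'rate' and 'child').
shared_worst = sorted(set(worst_g1) & set(worst_g2))
shared_best = sorted(set(best_g1) & set(best_g2))
print(len(shared_worst), shared_worst)
print(len(shared_best), shared_best)

# Largest distance from each group's average, in percentage points.
worst_gap_g1 = round(avg_g1 - min(worst_g1.values()), 1)
best_gap_g1 = round(max(best_g1.values()) - avg_g1, 1)
worst_gap_g2 = round(avg_g2 - min(worst_g2.values()), 1)
best_gap_g2 = round(max(best_g2.values()) - avg_g2, 1)
print(worst_gap_g1, best_gap_g1, worst_gap_g2, best_gap_g2)
```

Seven of the ten worst words match exactly across the two groups, against three of the ten best words, and the worst word in each group sits roughly 22 points below average while the best word is less than 10 points above it.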
Methodology:
Similar to the methodology I used in the previous two studies, I began with all Pre-2008 Prosper Loans.
I then placed all the loans in a random order and assigned them to Group 1 or Group 2 sequentially. From there I built a list of all the words in the title and body of the listing for those loans, tallying the number of times the word was used in each loan.
To come up with the Percent Paid (By Loan) I divided the number of loans with that word that finished with a status Paid by the number of loans with that word in total.
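The grouping and tallying steps could be sketched in Python roughly as follows. This is a hedged illustration rather than the original code: the function names, the fixed seed, the tokenizing regex and the reading of "sequentially" as alternating between the two groups are all assumptions.

```python
import random
import re
from collections import Counter

def split_into_groups(loans, seed=0):
    # Put the loans in a random order, then deal them out alternately.
    # (Assumption: "sequentially" means alternating; taking the first and
    # second halves of the shuffled list would be an equally random split.)
    shuffled = list(loans)
    random.Random(seed).shuffle(shuffled)
    return shuffled[0::2], shuffled[1::2]

def tally_words(title, body):
    # Count how many times each word appears in one listing's title and body.
    # Hypothetical tokenizer: lowercase runs of letters, digits and apostrophes.
    return Counter(re.findall(r"[a-z0-9']+", (title + " " + body).lower()))

# Stand-in loan IDs; two groups of 8728 loans imply 17456 loans in total.
group1, group2 = split_into_groups(range(17456))
print(len(group1), len(group2))

counts = tally_words("Help, help!", "Please help")
print(counts["help"], counts["please"])
```

Fixing the seed only makes the sketch repeatable; any random order gives two groups of equal size when the total number of loans is even.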