The Popular Bits

NT Heat Map

Click this thumbnail to see a heat-map of the popularity of different verses. You may have to click again once it appears to zoom in.

What are the most popular bits of the bible to quote and discuss? Often, trolling around the religious blogs, it is clear that lots of folks don’t have a wide vocabulary to choose from. So tonight I did an experiment.

Using some code I have, I queried Google to ask how many unique pages it has in its index referring to each verse of the New Testament. I won’t go too much into methodology here. It’s kind of like looking at the number of results when you do a regular Google query, but there’s a lot of faffing about needed to exclude duplicate content, to account for aliases in the names (1Co, 1 Cor, 1 Corinthians, etc.), and to correct for tendencies such as the fact that verses at the start of popular ranges get mentioned more (e.g. 1 Cor 12:1-10 should be credited either to all ten verses or to none, not just to 1 Cor 12:1, which is what a normal search would do).
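As a rough illustration of the alias and range handling described above (a minimal sketch, not the code actually used for this post), the function below normalises book aliases and credits every verse in a range rather than only the first. The `ALIASES` table, the `REF` pattern and the name `credit_verses` are all assumptions made for the example, and the table covers only two books.

```python
import re
from collections import Counter

# Hypothetical alias table; a real one would cover all 27 NT books.
ALIASES = {
    "1co": "1 Corinthians",
    "1 cor": "1 Corinthians",
    "1 corinthians": "1 Corinthians",
    "jn": "John",
    "john": "John",
}

# Matches references such as "1 Cor 12:1-10", "1Co 13:4" or "John 3:16".
REF = re.compile(
    r"((?:[1-3]\s?)?[A-Za-z]+)\.?\s+(\d+):(\d+)(?:\s*[-\u2013]\s*(\d+))?"
)


def credit_verses(text):
    """Count every (book, chapter, verse) covered by references in text."""
    counts = Counter()
    for book, chapter, first, last in REF.findall(text):
        name = ALIASES.get(book.lower().strip())
        if name is None:
            continue  # not a book in the (assumed) alias table
        start = int(first)
        end = int(last) if last else start
        # Credit the whole range, not just its first verse.
        for verse in range(start, end + 1):
            counts[(name, int(chapter), verse)] += 1
    return counts
```

Here the choice is to credit all verses in a range; the alternative mentioned above (crediting none) would just skip any match where `last` is non-empty.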

The searching process involved 8000 separate searches, and the collation of a fair amount of data on about 15m pages. The top ten passages are:

John 3:16  For God so loved the world that he gave his one and only Son, that whoever believes in him shall not perish but have eternal life.
John 14:6  Jesus answered, “I am the way and the truth and the life. No one comes to the Father except through me.”
John 1:1  In the beginning was the Word, and the Word was with God, and the Word was God.
Matthew 28:19  Therefore go and make disciples of all nations, baptising them in the name of the Father and of the Son and of the Holy Spirit.
Matthew 7:16  By their fruit you will recognize them. Do people pick grapes from thorn bushes, or figs from thistles?
Acts 1:8  But you will receive power when the Holy Spirit comes on you; and you will be my witnesses in Jerusalem, and in all Judea and Samaria, and to the ends of the earth.
Colossians 4:3  Pray for us, too, that God may open a door for our message, so that we may proclaim the mystery of Christ, for which I am in chains.
Acts 2:38  Peter replied, “Repent and be baptized, every one of you, in the name of Jesus Christ for the forgiveness of your sins. And you will receive the gift of the Holy Spirit.”
John 10:10  The thief comes only to steal and kill and destroy; I have come that they may have life, and have it to the full.
Mark 16:15  He said to them, “Go into all the world and preach the gospel to all creation.”

(Quotations from NIV)

Some of these were a *big* surprise to me. Colossians 4:3, really? Wow. And there’s an interesting pattern among the top 50 – many are missional like this. You can also see in this the problem of dividing by verses. Some quotes, like John 10:10, are rarely quoted in full; only the second half is normally used.

If you deal with frequencies at all, you know that distributions tend to be massively skewed: the winners win big, everyone else is way behind. Sure enough this distribution is a power law, with John 3:16 scoring more highly than the next 6 verses combined. And there are a lot of passages with only a handful of mentions.
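One common way to check a power-law claim like this is to look at log(frequency) against log(rank): a true power law gives a straight line whose slope is the exponent. The sketch below is only an illustration of that check, with a hypothetical function name and no connection to the real data behind the graph.

```python
import math


def loglog_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    Needs at least two positive frequencies; zero counts are dropped
    because their logarithm is undefined.
    """
    ranked = sorted((f for f in frequencies if f > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(f) for f in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var
```

A single slope says nothing about how well the line fits, so plotting the points on log-log axes is still the more honest test.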

Frequency Graph of Mentions of NT Verses

A frequency graph showing how often each verse in the NT is discussed. The verses are ordered by decreasing frequency. The y scale is an adjusted number of pages - it won't be the value you get if you try this, because you won't share my underlying mathematical model.

Here is a graph of the frequency distribution of results. You can see that the first few are huge, and everyone else is basically nowhere. Obviously a very tiny proportion of the NT gets talked about.

A better diagram is the heat map above. This shows the relative distribution of popular bits. Note that the colors are generated by rank (not by absolute score), because if we did a heat map by score the whole thing would be purple with one or two blues, and then John 3:16 in red.
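To show why ranking helps, here is a minimal sketch of colouring by rank instead of by score. The five-colour palette and the function name are assumptions for the example; only the purple-to-red direction comes from the description above.

```python
def rank_colours(scores, palette=("purple", "blue", "green", "yellow", "red")):
    """Map each verse to a palette colour by rank, not by raw score."""
    ordered = sorted(scores, key=scores.get)  # lowest score first
    n = len(ordered)
    colours = {}
    for rank, verse in enumerate(ordered):
        # Spread the ranks evenly across the palette.
        index = min(rank * len(palette) // n, len(palette) - 1)
        colours[verse] = palette[index]
    return colours
```

Because only the ordering matters, one enormous outlier takes the top colour without squashing every other verse into the bottom one.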

You can see some interesting trends on there (click for a bigger view, really it is worth it! – you may have to click again once it appears though, because it is taller than your screen and some browsers shrink it to fit). In particular you can see that John is the rockstar gospel, although Matthew’s sermon on the mount does pretty well. Among the letters Colossians is the clear winner, although Galatians is also pretty hot. I was surprised at 1 John, being so important, and 1 Corinthians being so sparse. And Christians obviously don’t like encouragement, because Paul’s least scathing letter, 1 Thessalonians, is practically entirely purple. And it is interesting that the top passage in Mark (16:15) isn’t really part of Mark at all.

There’s only so many conclusions you can draw around this, it is meant for fun rather than serious study. But if there are specific questions you’d like me to answer with the data, leave a comment.




14 responses to “The Popular Bits”

  1. Ian

    Doh, it just occurred to me that I’ve got a bug in the heat-map above because I failed to account for an ambiguity. I’ll edit and repost the heat-map later. The league table and discussion are fine, and the frequency map will not change noticeably.

    — done.

  2. Astounded that Colossians is the top letter. “Image of the invisible God” I suppose.

  3. Ian

    Yes I did have to check that twice, in case there was something else out there with a similar name that I was getting hits for. But it really does seem to be the case. Okay Colossians has some good bits, but still.

    Incidentally Col 1:15 (Image of the Invisible God) comes 35th overall and is the second Colossians verse (after 4:3). Other popular bits are 2:3 “In whom are hid all the treasures of wisdom and knowledge.”, 3:1 “Since, then, you have been raised with Christ, set your hearts on things above, where Christ is, seated at the right hand of God.” and 1:16 “For in him all things were created: things in heaven and on earth, visible and invisible, whether thrones or powers or rulers or authorities; all things have been created through him and for him.”

    I guess Colossians might be just more quotable than the others. Individual verses in other letters might not be so noteworthy, but ranges of verses may be.

  4. Ian, that’s an intriguing analysis (you really are full of good ideas!). I wonder is there any “reverse theology” we can derive from the least popular bits? The tail-off in the graph is striking – I am surprised it is quite *so* steep, but it does mean that if we were to try to derive a reverse theology, we’d maybe be dealing with a larger corpus than is used for regular theology… [Nah, I don’t really think so].
    Is there any proximity analysis that can be done on this, e.g. if a verse is cited in one place/webpage, is it cited next to another verse from elsewhere that would allow us to link them up and therefore pick out key conceptual networks?
    Not that I’m trying to give you extra work, you understand! How do you write these scripts? What language, processing etc?

  5. Another thought… when we are dealing with parallel passages in the synoptic gospels, is there any systematic relationship between which version tends to get quoted? You’re definitely right about John being the rock star gospel. Another thing; it has been oft noted that the epistles were tagged on largely in size order, largest first, shortest last. What’s the state of play of thoughts as to why a big fecker like Hebrews is stuck in where it is? Is it coz Paul’s epistles come first, and the others later? A Pauline primacy thingy? (Sorry if I am displaying my ignorance here!)

  6. Ian


    I don’t understand what you mean by reverse theology in that comment. Do you mean a theology of what passages are ignored? If so that could be tricky, because of the verse-bias in the method. Quite a lot of the lowest passages are things like: Luke 20:32 “Finally, the woman died too.” which are obviously the middle or end bits of a range of verses, that are overall reasonably well quoted. My favorite of the very low scorers is third from bottom 2 Cor 10:9 “I do not want to seem to be trying to frighten you with my letters.” Given what we’ve been saying about Paul on this blog for a while, that amuses me 🙂 Also, you realise, by quoting them in this page, I’ve just marginally bumped their scores!

    As for cross-correlation – that would be interesting, yes. And fairly easy to do, we could just run joint queries across a bunch of pairs and with the individual data, we have everything we need. The problem is just one of scale – deciding which pairs to run. Since each query takes around a second.

    Yes, I was surprised at the steepness of the curve. In a log plot it is still quite significantly curved.

    And as for the synoptic question – that would be interesting too. But would rely on getting hold of, or building, a comprehensive verse-by-verse synopsis. I don’t have one in digital form.

    Finally Hebrews. Yes, because Hebrews wasn’t traditionally associated with Paul (that comes later), it was after the Pauline corpus. The Pauline corpus is Romans through Philemon (of course, we don’t believe they are all Pauline now). Roughly in order of size, with two exceptions: 1) Second letters always follow the first letter, 2) 1 Timothy is larger than 1 Thess, but follows it. 1) is logical, I don’t know why 2). Then after the Pauline corpus, we get the other letters, again in decreasing size order except for caveat 1.

    We have early evidence of the Pauline letters circulating as a single collection. Well before the notion of a NT canon – so that is why they tend to be grouped. The NT canon is a kind of collection of collections – the 4 gospels were circulating as a group also early on. We have manuscript evidence for this.

    As for what language. This code is written in Python, though the google API is just a REST API, so language is really irrelevant. The code is mostly used for imposing the data-model on the returned data (i.e. building lookup tables of urls, checking for duplicates, applying fudge factors). I then dump the data into a SQL database so I can query it simply and interactively.
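For anyone curious what the "check for duplicates, then dump into SQL" step might look like, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout and the function names are invented for the illustration and are not the schema actually used; a real version would also need to normalise URLs before comparing them.

```python
import sqlite3


def store_hits(conn, hits):
    """Insert (verse, url) pairs, silently skipping pairs already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits ("
        "verse TEXT NOT NULL, url TEXT NOT NULL, "
        "PRIMARY KEY (verse, url))"
    )
    # The primary key plus OR IGNORE does the duplicate check.
    conn.executemany(
        "INSERT OR IGNORE INTO hits (verse, url) VALUES (?, ?)", hits
    )
    conn.commit()


def top_verses(conn, limit=10):
    """Return (verse, unique_page_count) rows, most mentioned first."""
    return conn.execute(
        "SELECT verse, COUNT(*) AS pages FROM hits "
        "GROUP BY verse ORDER BY pages DESC, verse LIMIT ?",
        (limit,),
    ).fetchall()
```

With the data in a table like this, a league table is a single `GROUP BY` query, which is the kind of simple interactive querying described above.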

  7. Ian

    On the synoptic issue, it seems from a cursory glance at a few choice passages that Matthew strongly wins over Luke in Q, and wins a little less dramatically over Luke and Mark in triple tradition material.

  8. very interesting stuff Ian. i understand why this is, many want the cuddly passages.

    the other day i was with a congregant who was having an operation and has had a series of operations and infections and operations over a two year period. this person stated they feel their faith is shattered. when i asked about this faith, it was pretty much these passages with some rainbows and unicorns sprinkled in for good measure. things like Psalm 62, John 14 etc. I stated, “You’re not there yet! your faith should be shattered cause you’re looking in the wrong places, you’re at Job, Habakkuk, and Psalm 69, not this stuff!”

    i don’t want to sound like a heartless guy. comfort is a good and necessary thing in the healing process. but we can’t jump to the good parts just cause we want the process to go faster. we can’t get to easter unless we go through Good Friday, so to speak. great post.

  9. Ian

    @01 You didn’t sound at all heartless. My wife and I both appreciate that approach.

    You went further than me in your interpretation of the woes of the top verses though 🙂 I personally would have thought there would be more salve in the top 10. The amount of missional stuff was more surprising.

    And I think that John 1:1 deserves to be in the top ten list of things ever written. Never has there been a more succinct or elegant bringing together of two alien cultures.

  10. i guess i skipped that step where i tie in the missional stuff to the fluffy stuff. i always tend to skip important steps. anyway, i think the missional stuff is spawned from this “I got the joy, joy, joy, joy, down in my heart (where?)” and these Christians want to spread their ‘happiness’ all over the place. hence the rise of the prosperity gospel and the extreme growth of mega-churches. prosperity gospel is about 90% truth and 10% heresy. 5% being the foundation in material things and the other 5% being in the goal (personal, not corporate salvation). the middle is good (for the most part).

    i agree on John 1. love that stuff!

  11. Adonais

    Actually, Colossians 3:8-10 rather than the one quoted is what has been put on my heart to share with others lately. You can simply remove all the negative connotations associated with religion and simply look at the scripture mentioned above for ‘what it is’.

    I don’t claim to be anything other than what others say… and usually they are dead wrong, because they don’t listen. Pretty much what scripture says, indicating their ears are deaf and eyes blind…. but the blind and deaf will always misinterpret. God bless you Ian… and by the way… I am not in the business of converting anyone… nor do I have intention of converting anyone… because it’s not up to me. Again, God bless you.

  12. Ian

    Hi Adonais, good to hear from you again.

    Col 3:8-10 is a good ‘un, definitely. I don’t know if there’s many who couldn’t use being encouraged to do that regularly.

    “nor do I have intention of converting anyone… because it’s not up to me.”

    It isn’t up to you, no (I suspect we differ about who it *is* up to, but we can agree that it isn’t up to you). But that’s no reason not to try. You’d be a pretty heartless person if you believed what you believe, and did nothing to help others change their fate. In your model of the world, it’s not like this is a trivial matter!

    And I’ll make no bones about wanting to convert you too, because I’ve come through where you are, and seen the light and want to share it with folks who are stuck where I was — to help them.

    It’s what happens when that doesn’t work that tells you a lot about a person’s character. Are we interested in finding out why, in going on a journey, with respect and curiosity? Or is that failure a personal blow that means no relationship is possible? Do we take the other person’s position seriously?

    You are always welcome here, Adonais. You’ll be treated roughly at times, but I expect you to return the favour. But you’ll not be hated.

  13. Pingback: The Books of the Bible | Irreducible Complexity

  14. Pingback: Google Completion Polling | Irreducible Complexity
