Heal Your Church WebSite


Teaching, rebuking, correcting & training in righteous web design.

VerseScrape 0.2

Got a comment from Matt Irvine over at TerraNova.org regarding VerseScrape 0.1 that read "uh oh! when the verse has parenthesis in it the verse scrape acts up! i guess there’s no real way around that."
For those of you unaware, I’ve combined a small screen-scraping Perl program with crontab to update my site with the International Bible Society’s Verse of the Day.

Yo Matt, good catch! Unfortunately, I didn’t have time until 2am last night to work in a fix. Part of the problem was this: the IBS Scripture of the Day arrives wrapped in a JavaScript document.write() call, so I first strip away the JavaScript, that is, everything outside the parentheses of that call. I then used the remaining opening parenthesis character "(" to split the Scripture reference away from the verse. This works great if there are no parenthetical remarks within the verse itself. Only, in dealing with the NIV, there are.

That smug snickering you hear is the "King James onlyists" nodding their heads right about now saying "see … told you so!"

Theological arguments aside, there are two approaches. One would be to implement Sermonizer::Scripturizer, and when I convert this into a module, I think I will. The other is to split the verse of the day on the left parenthesis character "(", then peel off the last array member as my Bible reference, concatenating the rest back into the Scripture verse. Sounds like a lot of work, but it’s actually only modifying one line of code and adding two more.

The line we change is the one we use to split up the reference from the verse:

($verse, $reference) = split(/\(/, $content);

Instead of reading the results into two variables, we read the results of the split function into an array. We then use the pop function to remove the last element from the array, and use the join function to put the other array members back together, re-inserting the "(" characters that split consumed:

@verses = split(/\(/, $content);
$reference = pop(@verses);
$verse = join("(", @verses);
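
To see the difference between the two splits side by side, here’s a minimal, self-contained sketch. The sample string is made up for illustration (it is not real IBS output), and it assumes the JavaScript wrapper and the trailing " NIV)" have already been stripped, so the reference begins at the final "(".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample: verse text containing a parenthetical remark,
# followed by "(" and the reference. Not real IBS output.
my $content = "Sample text (with an aside) continues here. (Book 1:2";

# Old approach: only the first two pieces of the split are kept,
# so the parenthetical remark is mistaken for the reference.
my ($old_verse, $old_reference) = split(/\(/, $content);

# New approach: split into an array, take the last piece as the
# reference, and glue the remaining pieces back together with "(".
my @verses    = split(/\(/, $content);
my $reference = pop(@verses);
my $verse     = join("(", @verses);

print "old verse:     [$old_verse]\n";
print "old reference: [$old_reference]\n";
print "new verse:     [$verse]\n";
print "new reference: [$reference]\n";
```

With this sample, the old split cuts the verse off at "Sample text " and treats the aside as the reference, while the new version keeps the aside inside the verse and reports "Book 1:2" as the reference.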

A bit kludgy, I’ll admit. But it gives me a chance to show you how many different ways there are to solve a problem. In fact, I suspect that some of you are thinking that very same thing as you read this. If you’re one of those, don’t be shy, throw down some code in the comment section so we can compare notes. There are no wrong answers, just so long as the verse of the day gets parsed correctly.
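
One possible alternative, offered only as a sketch and not what VerseScrape itself does: let a greedy regex find the last "(" in a single pass. The sample string below is made up, and the chomp call is an assumption that the scraped content may end with a newline.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample with a trailing newline; not real IBS output.
my $content = "Sample text (with an aside) continues here. (Book 1:2\n";
chomp($content);    # drop the trailing newline, if any

my ($verse, $reference);

# The greedy (.*) grabs as much as it can, so the literal "(" that
# follows it can only match the LAST opening parenthesis. [^(]* then
# captures the rest of the string as the reference.
if ($content =~ /^(.*)\(([^(]*)$/s) {
    ($verse, $reference) = ($1, $2);
} else {
    # No "(" at all: treat everything as verse text.
    ($verse, $reference) = ($content, "");
}

print "verse:     [$verse]\n";
print "reference: [$reference]\n";
```

It trades the array shuffling for a pattern that is a little harder to read, so which one is less kludgy is a matter of taste.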

In the meantime, here’s your new version:


8 Comments

  1. Pingback: The Riehl World

  2. hey, just wanted to let you know that the $reference var has a ‘\n’ at the end. in case you were using this in another context it makes a difference. i parsed it out using the line:

    $reference =~ s/\n$//gi;

    this script is awesome, nice site – God bless.

  3. hey, just wanted to let you know that the $reference var has a ‘\n’ at the end. in case you were using this in another context it makes a difference. i parsed it out using the line:

    $reference =~ s/\n$//gi;

    this script is awesome, nice site – God bless.

  4. Pingback: The Journal

  5. Hello Dean, I wanted to let you know that I think versescrape is great, I’m using it on my site.

    I was wondering how a similar thing could be done for people who use IIS/ASP, so I hope you don’t mind, but I’ve kinda taken the idea & done a version for IIS/ASP, which can be seen at http://glenn.bluemountains.net.au/mt/archives/000019.php

  6. It seems that IBS have added a link to an audio reading of the verse, included in their verse of the day text. To adapt versescrape I’ve added a .* in the regex that strips out the javascript:
    was: $content =~ s/ NIV\)"\)\;//gi;
    now: $content =~ s/ NIV\).*"\)\;//gi;

    I’m writing up a mod to the script to include the audio link, I’ll post when it’s done, but for now this adjustment will just get rid of the link & allow the votd to display properly.

    cheers

  7. OK that took less time than I thought.

    The updated script is here: http://glenn.bluemountains.net.au/dist/versescrape/versescrape.pl, you can see the output it produces on my blog: http://glenn.bluemountains.net.au

    I’ve just added a $audio variable to extract the anchor tag into, then do the removal as stated in the previous comment.

    As Dean would say, your mileage may vary, please let me know if this doesn’t work. ;-)

  8. Hello and Thank you for the script it will come in very handy.

    I wonder if it is possible to do something similar with/to include a “Read the Bible in a Year” type of content. Here is one place that I found it: http://bible.crosswalk.com/BibleInAYear/DailyReading.cgi

    Thanks again for the script.
    r