Heal Your Church WebSite


Teaching, rebuking, correcting & training in righteous web design.

VerseScrape 0.1

It dawned on me how often I mention that I’m a coder without having really shown you all that much useful code. So I sat down at my trusty new computer after my daughter rousted me at 5am, and 20 minutes later I had a program I’ve wanted to write to give Redland Baptist some more compelling content: a daily scripture verse that updates automatically.

The problem is, while there are many programs and web sites offering a daily scripture verse, many of them add all sorts of advertising and/or deprecated HTML formatting. Now mind you, I believe in giving credit where credit is due, but some of the back references were ridiculous – a space alien would think the verse provider actually penned the verse themselves! Furthermore, most, if not all, of the verse providers only offer the verse via Javascript – something many people/firewalls are turning off due to excessive online ads. That, and it means every time someone loads my page with the verse reference, they have to go out over the web and get the same content over-n-over again.

A potential problem created by the insidious varieties of <font> and <center> tags inflicted upon the output is that such tags get in the way of my not-so-evil plans to stash this verse in XML so I can dynamically render it to a wide variety of platforms. No, the solution was obvious: I had to find a relatively unpolluted stream of data, screen-scrape it, then render it so it fit my site’s personality and formatting style, while still rendering unto Caesar what is his. Doing it automatically in the middle of the night had me thinking of my old friend Perl and its forward-thinking friend cron.

Yes, you can get the same effect in a variety of other languages, but Perl has this nice little library known as LWP::Simple which is just perfect for the job. The next trick was to find the stream. After a 10-minute search through various HTML atrocities, I came across the daily scripture offered by the good folks at the International Bible Society. Their solution was also in Javascript, but sans hyper-formatting. It would be a very simple task to regex the input and redirect it into a little file I can then include in other documents via SSI. Okay, enough talk. On to the code:

#!/usr/bin/perl
# -----------------------------------------------------------------------
# copyright Dean Peters © 2002 - all rights reserved
# http://www.HealYourChurchWebSite.org
# -----------------------------------------------------------------------
#
# versescrape.pl is free software. You can redistribute and modify it
# freely without any consent of the developer, Dean Peters, if and
# only if the following conditions are met:
#
# (a) The copyright info and links in the headers remains intact.
# (b) The purpose of distribution or modification is non-commercial.
#
# Commercial distribution of this product without a written
# permission from Dean Peters is strictly prohibited.
# This script is provided on an as-is basis, without any warranty.
# The author does not take any responsibility for any damage or
# loss of data that may occur from use of this script.
#
# You may refer to our general terms & conditions for clarification
# http://www.healyourchurchwebsite.com/archives/000002.shtml
#
# -----------------------------------------------------------------------
# The good folks at the IBS asked us to include the following disclaimer:
#
# THIS SITE/SERVICE IS PROVIDED BY IBS ON AN "AS IS" BASIS, AND IBS MAKES
# NO REPRESENTATION OR WARRANTIES OF ANY KIND, EXPRESSED OR IMPLIED, AS TO
# THE OPERATION OF THE SITE OR THE INFORMATION, CONTENT, MATERIALS, OR
# PRODUCTS INCLUDED ON THIS SITE. TO THE FULL EXTENT PERMISSIBLE BY
# APPLICABLE LAW, IBS DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED,
# INCLUDING FITNESS FOR A PARTICULAR PURPOSE. IBS WILL NOT BE LIABLE FOR
# ANY DAMAGES OF ANY KIND ARISING FROM THE USE OF THIS SITE, INCLUDING,
# BUT NOT LIMITED TO, DIRECT, INDIRECT, INCIDENTAL, PUNITIVE, AND
# CONSEQUENTIAL DAMAGES.
#
# See also http://www.ibs.org/bibles/termsofuse.php
# -----------------------------------------------------------------------
use strict;
use warnings;
use LWP::Simple;

# --------- name of the output file -----------
# * change the path and output file to suit your system
# * make sure it exists, and is chmod'ed 766 or 777
my $verseFile = "/home/YOURACCOUNT/www/includes/dailyverse.html";

# --------- url containing the daily scripture verse ---------
my $scriptureURL = "http://www.gospelcom.net/ibs/dm/syndicate/scripture.txt";

# --------- url of online scripture lookup ---------
my $bibleURL = "http://bible.gospelcom.net/cgi-bin/bible?language=English&version=NIV&x=12&y=8&passage=";

# --------- get the scripture (get() returns undef on failure) ---------
my $content = get($scriptureURL);
die("Cannot fetch $scriptureURL") unless defined $content;

# --------- strip out javascript at beginning, then end ---------
$content =~ s/^document\.writeln\("//gi;
$content =~ s/ NIV\)"\)\;//gi;

# --------- split content into verse & reference ---------
my ($verse, $reference) = split(/\(/, $content);

# --------- create output ---------
$bibleURL .= $reference;
$bibleURL = qq~<a href="$bibleURL" title="NIV - Bible Gateway">$reference</a>~;

# --------- print output -----------
my $output = qq~<div class="dailyScripture">
$verse
<span class="lookup">($bibleURL)</span>
<br /><br />
<a href="http://www.ibs.org/" title="International Bible Society &copy; 2002"><span class="credits">Provided by International Bible Society</span></a>
|
<a href="http://www.ibs.org/dm/" title="Subscribe to the IBS Daily Verse"><span class="credits">Subscribe</span></a>
</div>~;

# --------- write output ------------
open(my $out, ">", $verseFile) || die("Cannot Open File $verseFile: $!");
print $out $output;
close($out);

# -----------------------------------------------------------------------
exit;
# -----------------------------------------------------------------------
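
One caveat with the split step in the script: it cuts at the first opening parenthesis, so a verse that happens to contain parentheses of its own would get chopped early. Here is a minimal sketch of one way around that, assuming the feed always ends with the reference in one final parenthesized group. The helper name split_verse_and_reference is hypothetical (it is not part of versescrape.pl), and the sample line is made up to match the format the regexes above expect:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of versescrape.pl: split the scraped line
# into verse text and reference at the LAST opening parenthesis.
sub split_verse_and_reference {
    my ($content) = @_;

    # strip the javascript wrapper, same idea as the script above
    $content =~ s/^document\.writeln\("//i;
    $content =~ s/ NIV\)"\);\s*$//i;

    # the greedy (.*) pushes the match out to the final "(" in the line
    if ($content =~ /^(.*)\(([^()]*)$/s) {
        my ($verse, $reference) = ($1, $2);
        $verse =~ s/\s+$//;    # drop the space left before the "("
        return ($verse, $reference);
    }

    # no parenthesis found: hand back everything as the verse
    return ($content, '');
}

# made-up sample line in the assumed feed format
my $sample = 'document.writeln("Love (agape) never fails. (1 Corinthians 13:8 NIV)");';
my ($verse, $reference) = split_verse_and_reference($sample);
print "$verse\n$reference\n";
```

If the feed format differs from that assumption, the two substitutions at the top of the helper would need adjusting before the final match can do its job.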

usage notes:

  1. Using your favorite editor, copy and paste the above code.
  2. edit $verseFile so the path and filename reflect YOUR system’s settings
  3. save the file
  4. chmod 755 versescrape.pl (set permission)
  5. create an empty file using the name defined by $verseFile
  6. set the permissions of the target output file to 766 or 777
  7. run the script once from the shell – test for errors
  8. crontab to taste
  9. include the verse file in your documents
  10. edit your .css documents to accommodate the <div> containing the reference
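
For step 8, a crontab entry might look something like the line below. This is only a sketch: the 4:15am schedule and the script location are assumptions, so substitute wherever you actually saved versescrape.pl.

```
# min hour day-of-month month day-of-week  command
15 4 * * * /home/YOURACCOUNT/versescrape.pl
```

For step 9, on a server with SSI enabled, the include would typically be a directive along the lines of <!--#include virtual="/includes/dailyverse.html" --> placed wherever the verse should appear in the page (the path again assumes the $verseFile location used in the script and that www is your document root).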

Keep in mind, this particular version has been written NOT as a .CGI script, but as a small script that will get called early in the morning by cron to create an output file that will be included in other documents via SSI or some similar mechanism.

If you don’t have shell access, then I’m afraid you’re going to have to wait – or do something similar with Javascript. Otherwise, give it a try, let me know how your mileage varies. Sometime within the next week or two, I’ll create a CGI version, then an XML version.

Not bad for an early Monday morning, huh?

Posted by Dean at July 22, 2002 01:00 PM

6 Comments

  1. Yeah, but what’s the final output?

  2. Sean

    Good comment – and yes – I should practice what I preach. Funny thing, I added it about 2 minutes before you must have revisited my site. No, you didn’t miss it the first time. It was missing the first time. You just happened to revisit often (good reader, smart reader, excellent reader) and catch the new version.

    Thanks for pointing it out.

  3. This is neat. I do a similar thing, except with daily emails. I’ve subscribed to the Light List (http://www.thelightministry.com), which sends me an email with the verse and thought for the day like this:

    2 Corinthians 3:3
    You show that you are a letter from Christ…

    Thought
    Most of the world may not read the Bible, but the lives of believers speak volumes.

    __________________________________________________
    For subscription information, please visit
    http://www.thelightministry.com/

    The email is passed through procmail, parsed by a similar (Python) script, and fed right into the database.

  4. uh oh! when the verse has parentheses in it the verse scrape acts up! i guess there’s no real way around that.

  5. i really want to get this to work on my site lesbrown.net. I believe that I’ve correctly done steps 1 through 6 on your usage notes, but I’m a little hazy on the rest. Any chance of getting a little help?
    Here are my questions:
    Where do I put the file “versescrape.pl”? I’m assuming I put it in my cgi-bin with my movable type stuff, but I’m not sure.

    I don’t really understand directions 7 through 9 at all. Any chance that you could explain it to me?

    I understand number 10 fine. I just don’t know what to put in my site in order to make the script execute.

    Thanks for any help…

    Les Brown

  6. I am looking for a daily verse script. I know enough about cgi to follow instructions, but you lost me. Please come help me get this going. Can we see an example of it working too?