KCOCCO ~ photo
Perl Stripping 
Over the holiday, Dad and I hacked together a Perl script that parses HTML pages from FoodNetwork.com. The goal was to get the episode numbers and names for Molto Mario recipes. Our first step was to save the index page of Mario recipes from FoodNetwork.com. We then used the Unix command wget to scan that page for URL links and save the linked HTML pages into one directory. Finally, we wrote a Perl script that opens each HTML file in the directory and pulls out the data we needed.
Here is an example of the output (episode number, episode name, recipe name), now ready to load into a database:
MB2G15~Trastevere On a Sunday~Gnudi con Fiori di Zucca
MB2G15~Trastevere On a Sunday~Chicken with Sweet Peppers: Pollo con Peperoni
MB2G17~Antica Bessetta~Bigoli - Basic Recipe
MB2G17~Antica Bessetta~Bigoli Bianchi with Duck Ragu
MB2G17~Antica Bessetta~Bigoli Scuri
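
Before loading, each output line has to be split back into its three fields. Here is a minimal sketch of that step, assuming the tilde-delimited layout shown above. The sub name parse_record and the hash keys are made up for illustration and do not come from any real database schema.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split one "episode~episode name~recipe name" line into a hash reference.
# parse_record and its key names are hypothetical, chosen for this sketch.
sub parse_record {
    my ($line) = @_;
    chomp $line;
    # A limit of 3 keeps any "~" inside the recipe name intact.
    my ($episode, $episode_name, $recipe) = split /~/, $line, 3;
    return {
        episode      => $episode,
        episode_name => $episode_name,
        recipe       => $recipe,
    };
}

# Two lines copied from the sample output above.
my @sample = (
    "MB2G15~Trastevere On a Sunday~Gnudi con Fiori di Zucca",
    "MB2G17~Antica Bessetta~Bigoli Scuri",
);

for my $line (@sample) {
    my $rec = parse_record($line);
    print "$rec->{episode} | $rec->{episode_name} | $rec->{recipe}\n";
}
```

The tilde works as a delimiter here because it does not appear in any of the sample titles. If a show ever used one in an episode name, a different separator would be needed.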

Here is the general script that can easily be altered to parse other shows:
#!/usr/bin/perl
####################################################################
# recipestrip.pl
# Text stripping script. Used on Foodnetwork html pages.
# 3.28.2005 K & L Cocco
#
# Program used to capture episode number & title and recipe titles
# from html files.
# The raw html files were gathered from foodnetwork.com with wget command
# using flags -i and -F.
# example: wget -F -i capturedhtmlfile.html
#
####################################################################
use Getopt::Long;
use File::Basename;

sub trimwhitespace($)
{
my $string = shift;
$string =~ s/^\s+//;
$string =~ s/\s+$//;
return $string;
}

$path="/video1/mario/shows/";

opendir(SHOWS, $path) || die ("Could not open directory. $!");
@Allnames = readdir(SHOWS);
closedir(SHOWS);

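open (koutputfile, ">episoderecipelist.txt") || die ("Could not open file. $!");
foreach $Name (@Allnames) {
# readdir returns bare file names, so prepend $path when testing or opening
if (-d "$path$Name") {next};
# store the directory part in $dir so that $path is not overwritten
($show, $dir, $suffix) = fileparse($Name, ".html");
if ($suffix ne ".html") {next};
#print $Name,"\n";

open (inputfile, "$path$Name") || die ("Could not open file. <br> $!");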

foreach $text (<inputfile>){
# chomp $text;
#print $text;
if ($text =~ /articleshowname/){
$stopsl = index($text,"Episode ")+8;
$lensl = index($text,"</SPAN><P") - $stopsl;
$episodenum = substr($text,$stopsl,$lensl);
#print substr($text,$stopsl,$lensl),"\n";
print $episodenum,"\n";
# print $text,"\n";
#print "stopsl: ",$stopsl, " lensl: ",$lensl,"\n";
#print "*****************************************\n";
}
if ($text =~ /episodename/){
$stopsl = index($text,"name'>")+6;
$lensl = index($text,"</SPAN") - $stopsl;
$episodename = substr($text,$stopsl,$lensl);
#print substr($text,$stopsl,$lensl),"\n";
#print $episodename,"\n";
# print $text,"\n";
#print "stopsl: ",$stopsl, " lensl: ",$lensl,"\n";
#print "******************************************\n";
}
if ($text =~ /recipes\/recipe/){
$startsl = index($text,"html'>")+6;
$stopsl = index($text, "</a></TD>");
# captures lines with no ending tag
if ($stopsl == -1) {
$stopsl = length($text);
}
$lensl = $stopsl - $startsl;
print koutputfile trimwhitespace($episodenum)."~".trimwhitespace($episodename)."~".trimwhitespace(substr($text,$startsl,$lensl)),"\n";
#print $Name,"\n";
#print "stopsl: ",$stopsl, " lensl: ",$lensl,"\n";
#print "******************************************";
}
}
close (inputfile);
}
print "*** Run completion ***\n";
close (koutputfile);
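
The script repeats the same index/substr arithmetic for each field. One way to adapt it to other shows is to pull that step into a helper. The sketch below is only an illustration: the sub name between and the sample HTML line are made up, and the markers are the ones the script above searches for.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the trimmed text found between $start and $end in $text.
# Returns nothing when $start is missing. When $end is missing, the rest of
# the line is used, like the script's "no ending tag" fallback.
# The name "between" is hypothetical.
sub between {
    my ($text, $start, $end) = @_;
    my $from = index($text, $start);
    return if $from == -1;
    $from += length($start);
    # Search for the end marker only after the start marker.
    my $to = index($text, $end, $from);
    $to = length($text) if $to == -1;
    my $found = substr($text, $from, $to - $from);
    $found =~ s/^\s+|\s+$//g;
    return $found;
}

# Hypothetical line, invented for this sketch; real pages may differ.
my $line = "<TD><a href='/recipes/recipe/example.html'> Bigoli Scuri </a></TD>\n";
print between($line, "html'>", "</a></TD>"), "\n";
```

With a helper like this, parsing another show would mostly mean changing the start and end markers passed in, assuming its pages follow a similar layout.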
  |  [ 0 trackbacks ]   |  permalink  |  related link
Clayton Peak 

2/27 Sunday with Mike & Dan, Clayton Peak, Brighton. Single ride pass to the top of Great Western Chair. Breaking out the old school boot packing with 5 laps!
The panoramic above was created with Panoramic Factory. My last panoramic (Snowbasin) had better results using the PhotoStitch software that comes with Canon digital cameras. PhotoStitch is amazingly easy, while Panoramic Factory has many slick configurable features. The original is 23,160 pixels wide. If you click on the image, it will bring up a larger (yet still 10x smaller than the original) 2,300-pixel version.


Mike carving it up.


Me entering the chute, Mike taking pictures. You can see the chute that leads to the untouched snow skirt in the picture above.


Dan sliding out of the bottom chutes of 10420 peak 'good buddy'. Left to right lines: Dan, Mike, Kevin. Check out the rollerballs.

Picasa - by Google 
Google has put out a new image/photo editing package. Dan turned me on to the program and said I should check it out. Here is what I found: the software is FREE and quite slick. Picasa will inventory all photos by year and directory name. My hard drive cranked in the background for over an hour as it auto-indexed the gazillion images on my PC. I enjoyed playing around with the photo editing features. Hats off to the Picasa developers for putting intuitive controls on Photoshop-type adjustments.
Picasa - Google link

Here is a test of the Picasa 'Export as Web Page' feature. This is Alex getting some nice face shots at Brighton. The export process is very basic, easy and fast.
http://www.kcocco.com/2005_1_10_brighto ... index.html

PHP Blog installed 
Okay, here we go. I have just installed SPHPBLOG, a blog system, on my ISP's server. The program is a PHP project hosted on sourceforge.net.
The goal is to post my journey through 2005, with an emphasis on a photo/image journal.
Enjoy....
~Kevin
