Doper programmers: little help please?

Skywatcher · July 13, 2004, 3:47am

Saw this on another message board I frequent:

By “use a reader on a web page and transfer the data”, he means an automated process to gather baseball statistics from sites like ESPN and MLB.com and transfer them to a data file. Is this guy just talking out his ass or what?

ccwaterback · July 13, 2004, 4:13am

Not at all. I have a script (written by a generous SDMB fellow) that “reads” a website every night at 7PM and saves it to my PC. To get baseball stats, all you would need to do is program more “page parsing” logic.

' Fetch the page over HTTP (synchronous request).
Dim oXMLHTTP, sHTML
Dim oFS, oTS, sFileName, sFilePath
Dim sNewHTML
Set oXMLHTTP = CreateObject("MSXML2.XMLHTTP.3.0")
oXMLHTTP.open "GET", "http://finance.yahoo.com/a3?o=l:0&d=t", False
oXMLHTTP.send
sHTML = oXMLHTTP.responseText
Set oXMLHTTP = Nothing

' Rewrite "/q" links as absolute URLs.
sNewHTML = Replace(CStr(sHTML), "/q", "http://finance.yahoo.com/q")

' Build a dated file name, e.g. up2004-7-13.htm.
sFileName = "up" & CStr(Year(Date)) & "-" & CStr(Month(Date)) & "-" & CStr(Day(Date)) & ".htm"
' Trailing backslash so the file lands inside the StkStuff folder.
sFilePath = "C:\Documents and Settings\Administrator\Desktop\StkStuff\"
Set oFS = CreateObject("Scripting.FileSystemObject")
Set oTS = oFS.CreateTextFile(sFilePath & sFileName, True, False)
oTS.Write CStr(sNewHTML)
oTS.Close

Set oTS = Nothing
Set oFS = Nothing

' Run manually the first time to get Norton's blessing.
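
As a rough illustration of what the extra “page parsing” logic might look like, here is a minimal sketch in Python (not the VBScript used above). It assumes the stats sit in a plain HTML table; the sample markup, the player names, and the names TableRows and html_table_to_csv are all made up for the example, and a real ESPN or MLB.com page would need its own markup inspected first.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text pieces of the current cell, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_csv(html):
    """Return the table cells found in `html` as CSV text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Made-up sample page, standing in for the HTML a fetch would return.
SAMPLE_HTML = """
<table>
  <tr><th>Player</th><th>AB</th><th>H</th></tr>
  <tr><td>Example Player A</td><td>300</td><td>90</td></tr>
  <tr><td>Example Player B</td><td>250</td><td>70</td></tr>
</table>
"""

print(html_table_to_csv(SAMPLE_HTML))
```

The returned text could be written to a data file in the same way the script above saves the raw page.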

friedo · July 13, 2004, 5:42am

I’ve written many such “screen-scraping” programs in Perl. Pretty trivial stuff, as long as they don’t radically change the layout of the page. (But if they do, all you have to do is change your parsing logic to match.)

rjung · July 13, 2004, 7:20am

Note that “scraping” is falling out of vogue, though, because it requires maintaining the reader to follow changes in the format of the parent page. Any site that’s going to present updated information on a regular basis really should have an RSS feed instead, which can be parsed easily.
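
To give a sense of why a feed is easier to handle than a scraped page, here is a small Python sketch that pulls the title and link out of each item in an RSS 2.0 document using only the standard library. The feed content and the function name rss_items are invented for the example; a real feed would be fetched from whatever URL the site publishes.

```python
import xml.etree.ElementTree as ET

# Made-up feed, standing in for what a site might publish.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Scores</title>
    <item>
      <title>Home 5, Away 3</title>
      <link>http://example.com/games/1</link>
    </item>
    <item>
      <title>Home 2, Away 4</title>
      <link>http://example.com/games/2</link>
    </item>
  </channel>
</rss>
"""


def rss_items(xml_text):
    """Return a list of (title, link) pairs, one per <item> in the feed."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


for title, link in rss_items(SAMPLE_RSS):
    print(title, link)
```

Because the element names are fixed by the feed format, this keeps working when the site redesigns its HTML pages.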

friedo · July 13, 2004, 8:20am

Generally scraping is only used as a last resort; sites which want to distribute their information usually provide an XML feed of some sort (usually RSS). But if you need to extract information from a site that doesn’t want to bother, then scraping is the way to go.

Skywatcher · July 13, 2004, 1:06pm

Thanks, I knew I could count on Dopers to fill me in.

Trigonal_Planar · July 13, 2004, 1:50pm

I did this for an online game I play. There’s an economic portion to it and I wrote a script that would regularly collect the prices, supply and demand on the different items; this way I could plot graphs of said item over time in an effort to gain an edge on the competition.
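
A collector like the one described above could be sketched in Python roughly as follows: each run appends timestamped readings to a CSV log, and a series for one item can be read back for plotting. The column layout, the item name "ore", the numbers, and the function names are all hypothetical; the scraping step that would produce the readings is left out.

```python
import csv
import io
from datetime import datetime

# Assumed column order for the log; no header row is written.
FIELDS = ["timestamp", "item", "price", "supply", "demand"]


def append_snapshot(out, when, readings):
    """Append one row per item to an open, writable CSV file-like object."""
    writer = csv.writer(out, lineterminator="\n")
    for item, (price, supply, demand) in readings.items():
        writer.writerow([when.isoformat(), item, price, supply, demand])


def price_series(csv_text, item):
    """Return [(timestamp, price), ...] for one item, in file order."""
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=FIELDS)
    return [
        (row["timestamp"], float(row["price"]))
        for row in reader
        if row["item"] == item
    ]


log = io.StringIO()  # stands in for a file opened in append mode
append_snapshot(log, datetime(2004, 7, 12, 19, 0), {"ore": (12.5, 400, 350)})
append_snapshot(log, datetime(2004, 7, 13, 19, 0), {"ore": (13.0, 380, 390)})
print(price_series(log.getvalue(), "ore"))
```

The resulting list of (timestamp, price) pairs is the kind of data a plotting tool would take to graph an item over time.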
