Archives on the World-Wide Web are now starting to hold some weight in their own right. With a lifespan of barely 10 years, the Web had not been considered a serious research tool when it came to archival content.
However, as the metabolism of the Internet is at least 3 times faster than that of the average person, we see more happening faster, and more content available. Without restriction, this WWW machine is receiving content by the giga-truckloads 24/7: good, bad, and useful for Investigators.
The trouble is that as quickly as this content rolls in, it overwrites the current material. That can make things difficult for the historical researcher who is looking for yesterday’s news, a copy of a website from last year, or material that has since been erased from the screen. Investigators can use historical information to locate:
– A photo that was once present, but since removed.
– Content, views, writings and opinions which may have been rescinded by a particular author or group.
– Products being sold illegally, then pulled when the sellers realized they were being watched.
– Old job ads, company statements, company affiliations, or product lines.
Solutions are found in at least two services, free and easy to use…
Google.com, as always, comes through with some sort of solution. In the results of a standard Google.com search you will see cache results. The cache is a snapshot of the site, taken when Google.com indexed it for its servers. Unfortunately, no dates are given, so you are never sure when that site was scanned and indexed. Google.com and its cache option are perfect when you are not sure where your answer will appear because you are not familiar with the web address or starting point. If you do have the address, you can narrow your Google.com cache search with the site: function, written with no space after the colon: site:anyaddress.com term. You have to add a term; use something common such as contact if you aren’t sure of anything specific. Names work very well.
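As a rough illustration of the site: search described above, the query can be assembled in a few lines of Python. This is only a sketch: the helper names build_site_query and build_search_url are hypothetical, the https://www.google.com/search?q= address pattern is an assumption that does not come from this article, and anyaddress.com is the same placeholder used above.

```python
from urllib.parse import urlencode


def build_site_query(domain: str, term: str) -> str:
    """Return a query limited to one site (hypothetical helper)."""
    # The operator is written with no space between "site:" and the address.
    return f"site:{domain.strip()} {term.strip()}"


def build_search_url(domain: str, term: str) -> str:
    """Return a search address for the query (assumed URL pattern)."""
    query = build_site_query(domain, term)
    # urlencode escapes the colon and turns the space into "+".
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    print(build_site_query("anyaddress.com", "contact"))
    print(build_search_url("anyaddress.com", "contact"))
```

Swapping "contact" for a person's name gives the name-based search suggested above.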
If you know the website you are examining, waste no time in visiting the WayBack Machine located at http://www.archive.org
On Archive.org you will see a simple-to-use, plug-and-play search box where you enter the address and click Go. After the http://, enter the address for which you wish to view older pages. We tried enron.com to see what was available. Visiting the Enron.com website today, you will see content involving the company's Chapter 11 filings, post-collapse investigations and fallout documents. It’s current and it’s relevant, yet the information there today does not tell us what happened 3 years ago, as told through the company's own website. We would prefer to see Internet press releases, photos of executives shaking hands with vendors, and the “about us” section offering the biographies of the top employees prior to the collapse of their empire.
After typing in www.enron.com, we see results dating back as early as December 10, 1997. Clicking on a link brings you to the actual site as it appeared on that date. The links will work! So if you are trying to find press releases, department heads, affiliations, or any other detail you would look for on a current site, you can find it here with the older data.
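The lookup described above can also be written directly as a web address. The sketch below assumes the address pattern the WayBack Machine uses at the time of writing, web.archive.org/web/ followed by a date stamp and the site address, with * in place of the date stamp to list every capture. That pattern and the helper names are assumptions, not something stated in this article, and they may change.

```python
from datetime import date

# Assumed address prefix for archived pages; not taken from the article.
WAYBACK_PREFIX = "https://web.archive.org/web"


def capture_list_url(address: str) -> str:
    """Address of the page listing all captures of a site (hypothetical helper)."""
    return f"{WAYBACK_PREFIX}/*/{address}"


def snapshot_url(address: str, when: date) -> str:
    """Address of the capture nearest a given date (hypothetical helper)."""
    stamp = when.strftime("%Y%m%d")  # date stamp in YYYYMMDD form
    return f"{WAYBACK_PREFIX}/{stamp}/{address}"


if __name__ == "__main__":
    print(capture_list_url("www.enron.com"))
    print(snapshot_url("www.enron.com", date(1997, 12, 10)))
```

If no capture exists for the exact date requested, the archive generally shows the nearest one, so the date in the address is best treated as a starting point.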
With a company as large as Enron, you can most likely find information in other sources as well. But when investigating a small firm, such as a one-man operation with a website and not much national press, the WayBack Machine really offers some incredible leads.
For example, we picked La Strada Ristorante, a Chicago-based restaurant, at random. In particular, we looked at their Wedding Package offers. The web site address is http://www.lastradaristorante.com/weddingpackages.html. Today you can order the Prime Rib dinner for $89.25. After running this address through the WayBack Machine, we see the price in 2000 was $67.50 (including horseradish sauce). We also noticed that in 2000 the Surf and Turf was $78.95. Now it’s marketed as Filet Mignon and Lobster Tail at $108.00.
We expect prices and product offerings to grow with the business, yet we note the jazzier style of the presentation, the careful wording of the menu items, and a measurable increase in prices, which will help us predict pricing over the next 2 years. In other words, the WayBack Machine is helpful as a looking glass to the future, or as a competitive intelligence tool.
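To put numbers on that "measurable increase", the menu prices quoted above can be compared with a little arithmetic. This is a minimal sketch: the function names are hypothetical, and because the article does not give the year of today's prices, the number of years passed to annual_rate is something you would have to supply yourself.

```python
def percent_change(old: float, new: float) -> float:
    """Total percentage change from the old price to the new price."""
    if old <= 0:
        raise ValueError("old price must be positive")
    return (new - old) / old * 100


def annual_rate(old: float, new: float, years: float) -> float:
    """Average compound yearly increase in percent; years is supplied by the reader."""
    if old <= 0 or years <= 0:
        raise ValueError("old price and years must be positive")
    return ((new / old) ** (1 / years) - 1) * 100


# Prices quoted in the article: (price in 2000, price today).
MENU = {
    "Prime Rib": (67.50, 89.25),
    "Surf and Turf / Filet Mignon and Lobster Tail": (78.95, 108.00),
}

if __name__ == "__main__":
    for item, (then, now) in MENU.items():
        print(f"{item}: {percent_change(then, now):.1f}% total increase")
```

On these figures the Prime Rib dinner rose by roughly 32 percent and the Surf and Turf by roughly 37 percent. Projecting forward 2 years would need the yearly rate, which depends on how many years separate the two menus.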
Some technical notes on the advanced features…
The asterisks next to some of the dates tell us when WayBack noticed changes from one version of the site to another, so you do not have to view each and every page looking for changes. There is also a comparison tool. After you run the initial search, click on “Compare Archive Pages” at the top of the screen, then choose two dates to compare against each other. The older text appears as strikeouts right on the page.
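If you have saved the text of two versions of a page yourself, a similar comparison can be approximated offline. The sketch below uses Python's standard difflib module; it is not how the WayBack comparison tool works internally, the function name is hypothetical, and the sample text is made up for illustration apart from the two prices quoted earlier.

```python
import difflib


def changed_lines(old_text: str, new_text: str) -> list[str]:
    """Return only the lines removed ("- ") or added ("+ ") between two versions."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits unchanged lines and "? " hint lines; drop those.
    return [line for line in diff if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    # Made-up sample pages; only the prices come from the article.
    old_page = "Wedding Packages\nPrime Rib $67.50"
    new_page = "Wedding Packages\nPrime Rib $89.25"
    for line in changed_lines(old_page, new_page):
        print(line)
```

Lines starting with "- " play the role of the strikeouts: they are the older text that no longer appears in the newer version.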
If you click on Advanced Features, you can narrow your search by year and convert your results into PDF files (which is helpful for reports and printing results). You will also find a large assortment of check boxes for restricting or expanding your search options.
The WayBack Machine also offers a toolbar utility, which is handy when you are visiting a website: click on the toolbar and the Archive results screen pops up, so you can go from the actual site you are researching to its historical views in one click. The toolbar option is found at http://www.archive.org/web/web.php.
The WayBack Machine is not only archiving the big companies and Fortune 500 types. According to the site there are “30 billion web pages archived from 1996 to a few months ago.” What about searching for specific data from archived pages? Read our article on the Recall Machine! Combining the WayBack Machine with the Google.com cache should give Investigators a decent start on archival research into Web sites.
Author of Business Background Investigations (2007) and The Manual to Online Public Records (2008).