I have the following program at home, which draws on a batch of pre-processed files (word frequency text files, compiled using ToLowerCase, Tally and Sort). Here is an example of the kind of file it draws on:
{{"i", 3073}, {"you", 2860}, {"the", 1741}, {"and", 1518}, {"a", 1218},
{"me", 1209}, {"to", 1153}, {"my", 913}, {"t", 855}, {"that", 843}... and so on. Each file is about 50KB; it was written using Put and is thus read back using Get.
I use them for the following program, which essentially plots word frequency for a given year (each year has its own file). It's along similar lines to the Google ngram viewer, but on a very specific historical dataset.
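For readers unfamiliar with this file format, here is a minimal sketch of how one such file might be produced and read back. The sample text, the file name 1964.txt, the temporary directory and the descending sort by count are all assumptions made for illustration, not the actual preprocessing pipeline.

```mathematica
(* Hypothetical sample text; the real corpus is not shown in the post *)
text = "The war and the peace. The war!";

(* Lower-case the text and pull out runs of letters as words *)
words = StringCases[ToLowerCase[text], LetterCharacter ..];

(* Count each word, then sort by descending count (assumed sort order) *)
freq = SortBy[Tally[words], -Last[#] &];

(* Write the list with Put and read it back with Get;
   the file name and location are illustrative only *)
file = FileNameJoin[{$TemporaryDirectory, "1964.txt"}];
Put[freq, file];
roundTrip = Get[file];
```

Put writes the expression in input form, so Get returns the same list of {word, count} pairs that was saved.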
Manipulate[
 viewerCount1 = {};
 viewerCount2 = {};
 SetDirectory["/users/myNAME/desktop/DB/Put/"];
 filenames = FileNames["*.txt"];
 Do[
  input = Get[file];
  yearLength = Length[input];
  AppendTo[viewerCount1,
   If[Length[Flatten[Cases[input, {word1, _}]]] == 0, 0,
     Flatten[Cases[input, {word1, _}]][[2]]]/yearLength];
  AppendTo[viewerCount2,
   If[Length[Flatten[Cases[input, {word2, _}]]] == 0, 0,
     Flatten[Cases[input, {word2, _}]][[2]]]/yearLength];
  , {file, filenames}];
 DateListPlot[{Tooltip[viewerCount1, word1],
   Tooltip[viewerCount2, word2]}, {1964}, Joined -> True,
  PlotLabel -> "Number of Total Word Appearances by % of All Words",
  PlotStyle -> {{Red}, {Blue}}],
 {{word1, "war", "First Word (Red)"}},
 {{word2, "peace", "Second Word (Blue)"}}]

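The per-year lookup in the program can be isolated into a small helper, which may make it easier to see what is being plotted. This is only a sketch: relFreq is a hypothetical name, and it divides by the total of all counts, whereas the program above divides by Length[input] (the number of distinct words). Which denominator is intended is an assumption based on the plot label.

```mathematica
(* relFreq is a hypothetical helper, not part of the original program.
   It returns the count of word divided by the total of all counts,
   or 0 when the word does not appear in the tally list. *)
relFreq[tally_List, word_String] :=
 With[{hit = Cases[tally, {word, n_} :> n]},
  If[hit === {}, 0, First[hit]/Total[tally[[All, 2]]]]]

(* Small sample in the same format as the files (values taken from the excerpt) *)
sample = {{"i", 3073}, {"you", 2860}, {"the", 1741}};

found = relFreq[sample, "you"];
missing = relFreq[sample, "war"];
```

Mapping such a helper over data loaded once, outside the Manipulate, would avoid re-reading every file each time a word changes.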
Here's the crux of the question, however. Is there a good way to deploy this online so other people can access it? I assume this is too big for the CDF format? Alternatively, webMathematica sounds promising, but before I upgrade my Mathematica purchase I would love to hear whether the community thinks that would be useful.
I know that there have been discussions elsewhere about sharing dynamic content with non-Mathematica users, but these seem to not rely on specific external files.
Or, am I using the wrong language to deploy the final version of this?
... Import or Get from non-official sources, would I then include all of that information in the Mathematica code itself (i.e. paste the MB or so of info into the notebook itself)? – canadian_scholar Feb 15 '12 at 15:19
... Compress and store Compress-ed data. You can make them self-uncompressing upon the first call to the data. The technique can be similar to what I used to answer your other question here: http://stackoverflow.com/questions/8247005/efficiently-working-with-and-generating-large-text-files/8250860#8250860 – Leonid Shifrin Feb 15 '12 at 15:21
... Import. – Eli Lansey Feb 15 '12 at 17:09
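The Compress suggestion in the comments could look roughly like the following. This is a minimal sketch under stated assumptions, not necessarily the technique from the linked answer: yearData is a hypothetical name and the three-entry list stands in for a full year's file.

```mathematica
(* Stand-in for one year's data; in a notebook the compressed string
   itself would be pasted in place of this Compress call *)
compressed = Compress[{{"i", 3073}, {"you", 2860}, {"the", 1741}}];

(* yearData is a hypothetical name. The first call uncompresses the string
   and replaces the delayed definition with the plain data, so later
   calls do not uncompress again. *)
yearData[1964] := yearData[1964] = Uncompress[compressed];

firstCall = yearData[1964];
```

Compress returns a string, so the data for every year could be embedded in the notebook this way without relying on external files.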