There is a lot going on with your code, as already pointed out by others. But, I would like to point out a few more things, and this was much, much to long for a comment.
First, I would use SparseArray to generate H, instead of the pre-allocate, fill-in method you are using, as follows:
H = SparseArray[
{l_, m_} /; m >= l :> Sqrt[pp]/(Sqrt[n + l - m] (m - l + 1)),
{n, n}
];
Edit: looking at the timing for this, and as pointed out in the comments by ssch, the calculation of H can be sped up significantly by using Array, instead:
diags = Sqrt[1./HarmonicNumber[n, 2]]/(Sqrt[n - #] (# + 1)& @
Range[0., n - 1]);
H = Array[If[#2 < #1, 0., diags[[#2 - #1 + 1]]] &, {n, n}];
For large matrices, this scales significantly better for large n. For instance, at n = 500 the SparseArray method takes 3.3 secs v. 0.028 secs for the Array method.
End Edit
This generates an upper-triangular matrix, with m - l taking the place of i in your loops. Also, I got rid of the extra N as it is not necessary because pp is already a machine number from your use of N in its definition. Similarly, all N[H] can be replaced with H, further down in the code. While it may not provide much of a speed up, do not overuse a function if you don't have to.
Second, in terms of overuse, the way you have defined T means it will be regenerated every time it is used. Orthogonalize is not a fast operation, so don't make more work for yourself. I would define T as
T = A.Orthogonalize[H.A];
where I am using Set (=), instead of SetDelayed (:=), and I have removed the second argument to Orthogonalize. By default, Orthogonalize effectively uses Dot, but by specifying the argument, you may be forgoing potential speed ups. If this is true in this case, I do not know, I have not tested it. But, it holds true in a lot of other cases, so be cautious about overriding defaults.
Third, A.M.A == M. This simplifies the definition of Z to
Z = T.H.Transpose[T]
Edit: That is not true, in general. But, $A^T = A$ and $AA = I$, so
$$A(ABA)^TA = AAB^TAA = B^T,$$
and combined with
Othogonalize[H.A] == Orthogonalize[H].A
as discussed by Daniel, both T and Z can be rewritten:
T = Orthogonalize[H];
Z = A.T.A.H.Transpose[T];
But, as A is just a permutation, it can be best effected by using Part, and using Daniel's notation:
negrange = -Range[n];
Z = T[[negrange, negrange]].H.Transpose[T];
Fourth, timing shows that the longest running part of the code is Orthogonalize. On my machine, at n = 2500 this step along can take over a minute. An alternative is to use QR decomposition which generates an upper triangular matrix, $R$, and an orthogonal matrix, $Q$, that forms a basis for the column space of the matrix being decomposed. So, to get an orthogonal basis out of the row space, you take the transpose. The definition of T becomes
T = QRDecomposition[Transpose[H]][[1]];
The function below has been edited with these changes.
End Edit
Lastly, I would put the entire contents of the outer For into a separate function, sans the Print statement. Print does not provide results in a form usable for further processing, but as you were using For, its use is understandable. Additionally, putting it into a separate function allows you to use any number of methods for generating your data. With the changes I have suggested above, this looks like
myFunction[n_Integer?Positive] :=
Block[{negrange, diags, H, T, Z},
diags = Sqrt[1./HarmonicNumber[n, 2]]/(Sqrt[n - #] (# + 1)& @
Range[0., n - 1]);
H = Array[If[#2 < #1, 0., diags[[#2 - #1 + 1]]] &, {n, n}];
T = QRDecomposition[Transpose[H]][[1]];
negrange = -Range[n];
Z = T[[negrange, negrange]].H.Transpose[T];
Total[(Flatten[UpperTriangularize[Z,1]])^2]
]
Edit: Timings on my computer:
n Timing (s)
2500 15.05
3000 24.31
3500 37.61
4000 54.56
So, while not fast, it scales significantly better than the previous versions.
End Edit
As this is likely to be a long running process, I would run it as follows:
Internal`WithLocalSettings[
strm = OpenWrite["output.log"];
AppendTo[ $Output, strm ],
Scan[Print[#, "\t", myFunction[#]]&, Range[2500, 10000, 500]],
$Output = Drop[$Output, -1];
Close[strm]
]
where a Print will output to both a log-file and the screen. This allows you to save the output as you go. If you don't need this, then just use Table.
Edit - Last words:
The key here is simply to profile your code. Examine each step and find out which is slow. As shown above, while I like the simplicity of using SparseArray, it scales badly compared to using Array, so for efficiency sake SparseArray is out. The same thing goes for Orthogonalize v. QRDecomposition. As you might not be familiar with the properties of QRDecomposition, that is when you ask: Orthogonalize scales badly, are there alternatives?
The other advice is to reduce redundancy. Only do something once, if you can get away with it. There are times where you won't be able to, like you run out of memory, but if you can, it often will save time.
Sum[1/i^2, {i, 1, n}] == HarmonicNumber[n, 2]and is always better to work with built in functions. I have also bad experience withForI would useDoif there is no way to implementNest,Mapetc. – Kuba Aug 08 '13 at 12:02A = Reverse[IdentityMatrix[n]]should provide the same result as your two lines initializingA, but it should a little faster/cleaner. – Jacob Akkerboom Aug 08 '13 at 12:26UpperTriangularize[Z,1]to zero everything on the main and lower diagonals. And sum it like:Total[(Flatten[UpperTriangularize[Z,1]])^2]– ssch Aug 08 '13 at 12:37