In C, I can use feof to test whether the file pointer is at the end of the file.

while(!feof(fp))
{
    //to do
}

In Mathematica, I usually use the following method to simulate the C loop.

file = OpenRead["http://home.ustc.edu.cn/~xiaozh/SE/stu.dat", BinaryFormat -> True];    
Reap[While[(tempRecord = BinaryRead[file, "Byte"]) =!= EndOfFile, 
     Sow@tempRecord]][[2, 1]]        
(*same as BinaryReadList[file,"Byte"]*)
Close[file];

{0, 0, 0, 0, 97, 0, 0, 0, 1, 0, 0, 0, 98, 0, 0, 0, 2, 0, 0, 0, 99, 0, 0, 0, 3, 0, 0, 0, 100, 0, 0, 0, 4, 0, 0, 0, 101, 0, 0, 0, 5, 0, 0, 0, 102, 0, 0, 0, 6, 0, 0, 0, 103, 0, 0, 0, 7, 0, 0, 0, 104, 0, 0, 0, 8, 0, 0, 0, 105, 0, 0, 0, 9, 0, 0, 0, 106, 0, 0, 0, 10, 0, 0, 0, 107, 0, 0, 0, 11, 0, 0, 0, 108, 0, 0, 0, 12, 0, 0, 0, 109, 0, 0, 0, 13, 0, 0, 0, 110, 0, 0, 0, 14, 0, 0, 0, 111, 0, 0, 0, 15, 0, 0, 0, 112, 0, 0, 0, 16, 0, 0, 0, 113, 0, 0, 0, 17, 0, 0, 0, 114, 0, 0, 0, 18, 0, 0, 0, 115, 0, 0, 0, 19, 0, 0, 0, 116, 0, 0, 0, 20, 0, 0, 0, 117, 0, 0, 0, 21, 0, 0, 0, 118, 0, 0, 0, 22, 0, 0, 0, 119, 0, 0, 0, 23, 0, 0, 0, 120, 0, 0, 0, 24, 0, 0, 0, 121, 0, 0, 0, 25, 0, 0, 0, 122, 0, 0, 0}

This time, however, I want to use Mathematica to load the data as struct records rather than as individual bytes.

file = OpenRead["http://home.ustc.edu.cn/~xiaozh/SE/stu.dat", BinaryFormat -> True];
Reap[While[(tempRecord = BinaryRead[file, "Integer32"]) =!= EndOfFile,
     Sow@{tempRecord, BinaryRead[file, "Character8"]}; 
     Skip[file, "Byte", 3]]][[2, 1]]
Close[file];

{{0, "a"}, {1, "b"}, {2, "c"}, {3, "d"}, {4, "e"}, {5, "f"}, {6, "g"}, {7, "h"}, {8, "i"}, {9, "j"}, {10, "k"}, {11, "l"}, {12, "m"}, {13, "n"}, {14, "o"}, {15, "p"}, {16, "q"}, {17, "r"}, {18, "s"}, {19, "t"}, {20, "u"}, {21, "v"}, {22, "w"}, {23, "x"}, {24, "y"}, {25, "z"}}

So I would like to know: can we define a feof function to make the process more like C?

PS: stu.dat was generated by the C++ program below.

It defines a struct whose first member is the student's index and whose second member is the student's name.

#include <cstdio>
#include <cstring>   // memset
#include <iostream>
using namespace std;

typedef struct student
{
    int id;
    char name;
} stu;

int main()
{
    const int num = 26;
    student stu[num];
    // Zero the whole array so that the padding bytes are written as 0.
    memset(stu, 0, sizeof(student) * num);

    for (int i = 0; i < num; i++)
    {
        stu[i].id = i;
        stu[i].name = (char)(97 + i);   // 'a' .. 'z'
    }
    for (int i = 0; i < num; i++)
        cout << stu[i].id << " " << stu[i].name << endl;

    FILE* fp = fopen("C:\\stu.dat", "wb");
    if (fp == NULL)
        return 1;   // could not open the output file
    fwrite(stu, sizeof(student), num, fp);
    fclose(fp);
    return 0;
}
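
A note on the record layout: the byte listing above shows 8 bytes per record, although an int and a char need only 5. That is consistent with the compiler padding each struct to a multiple of the int alignment. The following C sketch is not part of the program above; student_tail_padding and print_student_layout are hypothetical helper names, and the struct is copied from the question. Padding is implementation-defined, so the reported numbers may differ from one platform to another.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdio.h>

/* Same members as the struct in the question. */
typedef struct student {
    int id;
    char name;
} student;

/* The first member of a struct always starts at offset 0. */
static_assert(offsetof(student, id) == 0, "id must be at offset 0");

/* Hypothetical helper: number of padding bytes after `name` in each record. */
size_t student_tail_padding(void)
{
    return sizeof(student) - (offsetof(student, name) + sizeof(char));
}

/* Hypothetical helper: print the layout chosen by the current compiler. */
void print_student_layout(void)
{
    printf("sizeof(student)    = %zu\n", sizeof(student));
    printf("offset of id       = %zu\n", offsetof(student, id));
    printf("offset of name     = %zu\n", offsetof(student, name));
    printf("trailing padding   = %zu\n", student_tail_padding());
}
```

On a platform where student_tail_padding() returns 3, those are the bytes that the Skip[file, "Byte", 3] call steps over after each "Character8" read.
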
partida
  • It depends on what you are doing. More context is needed. To see a while loop just like the one you show, but written in Mathematica, take a look here. – Szabolcs Nov 18 '16 at 14:15
  • partida, is this a C# or a C++ command? I have not mastered these languages, but you can certainly run the code from Mathematica. – Jose Enrique Calderon Nov 18 '16 at 14:27
  • You should perhaps be aware that "while(!feof(fp))" is the wrong way to read a file in C. See also: http://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong – Thomas Padron-McCarthy Nov 18 '16 at 17:44
  • As @ThomasPadron-McCarthy noted, while ( ! feof(fp) ) ; is not how you read files in C – cat Nov 19 '16 at 00:04
  • You can read multiple values with a single BinaryRead and even arrange them into whatever expression you like. BinaryRead[stream, {"Byte", {"Integer32", "Integer32", "Integer32"}, "Byte"}] – Szabolcs Nov 20 '16 at 09:15
  • @Szabolcs What is the result of that code? I can't run it. – partida Nov 20 '16 at 09:58
  • Why not? Did you open a file first using OpenRead[..., BinaryFormat -> True]? Did you look at the example I linked to multiple times, which shows how to use OpenRead? – Szabolcs Nov 20 '16 at 10:03
  • @Szabolcs Yes, BinaryReadList[stream, {"Integer32", "Character8", "Byte", "Byte", "Byte"}][[All, ;; 2]] gives the correct answer! – partida Nov 20 '16 at 10:19
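
The objection raised in the comments to while(!feof(fp)) is that feof only reports the end of the file after a read has already failed, so the loop body runs one extra time without fresh data. A minimal C sketch of the usual alternative, which loops on the return value of fread instead, might look like this. The name read_students is a hypothetical helper, and the struct is copied from the question.

```c
#include <assert.h>
#include <stdio.h>

/* Same members as the struct in the question. */
typedef struct student {
    int id;
    char name;
} student;

/*
 * Hypothetical helper: read up to `max` records from `fp` into `out`.
 * The loop condition is the read itself, so the body only runs after
 * a complete record has actually been read. Returns the record count.
 */
size_t read_students(FILE *fp, student *out, size_t max)
{
    assert(fp != NULL && out != NULL);

    size_t n = 0;
    while (n < max && fread(&out[n], sizeof out[n], 1, fp) == 1)
        n++;
    return n;
}
```

After the function returns, the caller can use ferror(fp) to tell a read error apart from a normal end of file. This has the same shape as the Mathematica loops in the question, which test the result of BinaryRead against EndOfFile.
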

3 Answers

Your ideas are too C-oriented. The Wolfram Language is not at all like C, despite some syntactic sugar built into it to make some constructs C-like. In my opinion, these features more often lead newcomers astray than help them. It is better to learn to do things the WL way than to write code that forces WL to behave like C.

When working with external data, one must be defensive: it's a jungle out there in file land. So file-reading code of any kind, in any language, should check that each file system operation actually succeeds.

In Mathematica versions up to V11, what you want to do is done with streams. V11 introduced File objects, which, when used, can make the code look a little more like C.

Here is how I would implement reading a file byte by byte.

Using a stream

readFileBytes[path_String] :=
  Module[{strm, item, data = {}},
    If[(strm = OpenRead[path, BinaryFormat -> True]) === $Failed, Return[$Failed]];
    CheckAbort[
      While[(item = BinaryRead[strm, "Byte"]) =!= EndOfFile, 
        data = {data, item}];
      Close[strm];
      Flatten[data],
      Close[strm]]]

Using a file object

readFileBytes[path_String] :=
  Module[{file, item, data = {}},
    If[FileExistsQ[path], file = File[path], Return[$Failed]];
    CheckAbort[
      While[(item = BinaryRead[file, "Byte"]) =!= EndOfFile, 
        data = {data, item}];
      Close[file];
      Flatten[data],
      Close[file]]]

Testing

I have a file, ~/Documents/Miscellanea/data.txt that looks like this:

1 2
2 4
3 6
4 8
5 10

bytes = readFileBytes["~/Documents/Miscellanea/data.txt"]

{49, 32, 50, 10, 50, 32, 52, 10, 51, 32, 54, 10, 52, 32, 56, 10, 53, 32, 49, 48, 10}

bytes // FromCharacterCode

1 2
2 4
3 6
4 8
5 10

m_goldberg

I don't think it is possible to check for the end of file on a generic stream (which may be a pipe, or something else from $InputStreamMethods), as opposed to a file (for which you have a solution), without also trying to read from it.

What you can do is attempt reading something. If the end of file is reached, then functions such as Read, ReadString, BinaryRead, etc. will return EndOfFile. Functions such as ReadList, BinaryReadList, etc. will return {}.

For most tasks, this is quite sufficient. See here for such an application with a While loop very similar to the one in your question:


Why do I think that this check isn't possible without actually performing the read? Because in certain situations the read operation may block until it actually receives some information. One example is from the EndOfBuffer documentation page:

process = StartProcess[$SystemShell];

This will output one line, then wait for 20 seconds before terminating:

BinaryWrite[process, "echo line 1
  sleep 20
  exit\n"];

This returns the first line:

Read[process, String]
(* "line 1" *)

This won't tell us that the end of the stream is reached until the process actually terminates. It will block until the 20 seconds are up. Up to that point Mathematica cannot know if the process will decide to output more data.

Read[process, String]
(* EndOfFile *)
Szabolcs

This method works for me.

feof[file_]:=StreamPosition[file] >= FileByteCount[file]
partida
  • This only works on files, not general streams. What if you're reading from a pipe? What if you are reading from a string stream? I think the correct solution to dealing with the end of the stream is either what Kuba or I said in the comments. – Szabolcs Nov 18 '16 at 15:27
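
For comparison, the same position-versus-size idea can be sketched in C with ftell and fseek. The name at_end_by_size is hypothetical, and the sketch assumes a seekable stream opened on a regular file. On a pipe the position query fails and the function returns -1, which is the limitation described in the comment above. The C standard also does not require binary streams to support SEEK_END meaningfully, although common implementations do.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: compare the current position with the file size.
 * Returns 1 at (or past) the end, 0 before the end, and -1 when the
 * stream cannot report or change its position (for example, a pipe).
 * Note: a successful fseek clears the stream's end-of-file indicator.
 */
int at_end_by_size(FILE *fp)
{
    assert(fp != NULL);

    long pos = ftell(fp);                 /* remember where we are */
    if (pos < 0)
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0)     /* jump to the end */
        return -1;
    long size = ftell(fp);                /* position at end = size in bytes */
    if (fseek(fp, pos, SEEK_SET) != 0)    /* restore the original position */
        return -1;
    if (size < 0)
        return -1;
    return pos >= size;
}
```

Unlike feof, this check does not depend on a previous read having failed, but it pays for that with the restriction to seekable files.
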