The documentation currently out there seems to be all over the map, and never makes the topic especially clear, particularly for beginners. Hopefully this document will address that need.
Almost the exact same document can be found here, but it does/says things slightly differently. Comparing the differences might help the beginner understand the concepts a little better.
Pretty much a no-brainer, right?
Now let’s create a text file to be read, something like:
We could name the file anything, but we’ll call it
There are basically three ways a file can be read:
(Side Note: Be wary of that term “character”. In plain English text, one letter of the alphabet (or one numeral, or one punctuation mark, etc.) corresponds to one character stored in one byte, but in other languages a single “letter” or symbol may take more than one byte to store, or even more than one character. For our purposes, using an ASCII text file like the one above, we are safe to think of one character as one byte of data and as one symbol on the keyboard (or in an ASCII chart). But ideally, you should get familiar with UTF-8 (of which ASCII is a subset), and think in those terms, rather than thinking in terms of ASCII characters.
Also, you might be in the habit of thinking of a “string” as a connected series of “char”s, as in the string “Rover is a good dog” being a connected series of “R” + “o” + “v” + “e” and so on. Be aware that in Rust, a “String” type is not simply an array, a connected series, of “char” types; it is stored as UTF-8 bytes.
But both of these points are for a later time. For today, you can think of ASCII text as chars and as the symbols on your keyboard plus a few others you have to look up in an ASCII chart (like the tab and other control characters), and a string as a connected series of chars.)
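To make the String-versus-char point a little more concrete, here is a small sketch. The helper name `byte_and_char_counts` is made up for this illustration; `len()`, `chars()` and `std::mem::size_of` are standard library calls.

```rust
// Hypothetical helper for illustration: returns (bytes, chars) for a piece of text.
fn byte_and_char_counts(text: &str) -> (usize, usize) {
    // len() counts bytes of UTF-8 data; chars() walks the individual chars.
    (text.len(), text.chars().count())
}

fn main() {
    // Pure ASCII text: the byte count and the char count are the same.
    println!("{:?}", byte_and_char_counts("Rover is a good dog"));
    // Non-ASCII letters take more than one byte each in UTF-8,
    // so the two counts differ.
    println!("{:?}", byte_and_char_counts("\u{f1}and\u{fa}"));
    // A single Rust char always occupies 4 bytes in memory,
    // which is why a String cannot simply be an array of chars.
    println!("{}", std::mem::size_of::<char>());
}
```

For an ASCII-only file like ours, the two counts always match, which is why the simplified mental model is safe for today.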
We’re only going to look at the first of these three methods, because if you understand it, there’s a good chance that the existing online documentation will be understandable enough for you to follow for the second and third methods.
This is probably the easiest way to read a text file. Your program simply opens the text file and gulps the entire file in one big bite into a string. Below is the basic idea (adding the highlighted lines), although trying to compile it as it is currently written will result in an error. Go ahead, try it. We no longer need the “Hello, World!” line, so let’s get rid of it (the strikeout line) to clean up our code.
As you can see, we don’t have an “open file” or “close file” statement as in many other programming languages; we’re just grabbing (or trying to grab) the file contents in one big gulp, and then printing them to the terminal window.
But the compiler complains that “read_to_string” is “not found in this scope”. In other words, the compiler needs some “path” information to follow the trail to finding the command “read_to_string”. We can do this in a couple of different ways.
We can add the path directly in the command:
let file_contents = std::fs::read_to_string("src/data.txt");
or we can put the path information in a use statement at the beginning of the program file:
The program still doesn’t run, though, giving you this error:
| println!("{}", file_contents);
|                ^^^^^^^^^^^^^ `std::result::Result<String, std::io::Error>` cannot be formatted with the default formatter
That simply means that what the “read_to_string” command hands back is not a plain string; it comes packaged in a wrapper, and the “println!” command (a “macro”, actually, as indicated by the bang (“!”)) doesn’t know how to print that wrapper with the default “{}” format.
We can tell the “println!” macro to use a special “debug” format instead. You don’t really need to know exactly what this is doing; roughly, it tells println! to print the value the way a programmer would want to inspect it, wrapper and all. We do this by adding a “:?” inside the “{}” in the println! statement.
println!("{:?}", file_contents);
The program should now compile and run, and give you some output that looks like this:
Ok("Kent\nMason\njugular\ntiger, river-dancing tse-tse “flyboys in the air”\nThis.Is.The.End.Of.This.Text.File!\n")
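Putting the pieces so far together, the whole program at this stage might look something like the sketch below. It uses the `use std::fs;` form for the path information and the `{:?}` debug format for printing. The helper function `read_whole_file` is not part of the original program; it is added here only so the same read can be tried on any path. The sketch assumes your text file is saved as src/data.txt.

```rust
use std::fs;

// Hypothetical helper: gulps the whole file into a String,
// and returns it still inside its wrapper.
fn read_whole_file(path: &str) -> Result<String, std::io::Error> {
    fs::read_to_string(path)
}

fn main() {
    // Assumes the text file created earlier is saved as src/data.txt.
    let file_contents = read_whole_file("src/data.txt");
    // {:?} prints the wrapper along with whatever is inside it.
    println!("{:?}", file_contents);
}
```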
Here again is the output of our program:
Ok("Kent\nMason\njugular\ntiger, river-dancing tse-tse “flyboys in the air”\nThis.Is.The.End.Of.This.Text.File!\n")
The “\n” is the “newline” character (singular, although it’s actually two characters when typed out in English text, a backslash and an ‘n’). This corresponds to hitting ENTER to create a new line in your text editor.
When the “read_to_string” command reads a file, there are two possible outcomes: either there will be no errors, or there will be errors.
If the file is read successfully, the contents of the file are put into a "wrapper", and then a label is put onto that wrapper that says "Ok". If the file is not read successfully, an error message is put into that wrapper, and then a label is put onto that wrapper that says "Err".
This "wrapper" is then returned to the calling routine as a special type of data called a "Result" type (just as an integer might be an "i32" type, or a string of letters might be a "String" type, etc). A "Result" type holds either an "Ok" or an "Err", along with either the expected results tied to the "Ok" or an error message tied to the "Err".
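As a rough sketch of those two possible outcomes, the snippet below simply asks which label is on the wrapper. `is_ok()` is a standard method on the wrapper type; the helper name `wrapper_label` and the second file name are invented for this example.

```rust
use std::fs;

// Hypothetical helper: tries to read the file and reports which label
// ("Ok" or "Err") ended up on the wrapper.
fn wrapper_label(path: &str) -> &'static str {
    let result: Result<String, std::io::Error> = fs::read_to_string(path);
    if result.is_ok() {
        "Ok"
    } else {
        "Err"
    }
}

fn main() {
    // Assumes src/data.txt exists and the second file does not.
    println!("{}", wrapper_label("src/data.txt"));
    println!("{}", wrapper_label("src/no-such-file.txt"));
}
```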
>Try temporarily renaming your data file from “src/data.txt” to “src/data.tx”, and re-running your program.
Now you get this output from the program:
Err(Os { code: 2, kind: NotFound, message: "No such file or directory" })
Nifty, eh? Some error information is “wrapped” in a package labeled “Err”.
But the program continues on. You can see this by adding another message at the end of the program:
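A minimal sketch of that experiment follows. The wording of the final message and the helper name `try_read` are made up here; the file name is deliberately misspelled so the read fails.

```rust
use std::fs;

// Hypothetical helper: tries the read, prints the wrapper, and hands it back.
fn try_read(path: &str) -> Result<String, std::io::Error> {
    let file_contents = fs::read_to_string(path);
    println!("{:?}", file_contents);
    file_contents
}

fn main() {
    // The misspelled name makes the read fail (assuming no such file exists).
    let _ = try_read("src/data.tx");
    // This line still runs, even though the read above failed.
    println!("The program made it to the end.");
}
```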
Now if you run the program you'll see the error message printed out, and then the proof that this did not stop the program. In many cases, that could be bad; do you really want your program continuing after it has failed in some previous part?
It’s good Rust style to handle errors.
There are four ways of dealing with errors:
One way to handle errors is to simply not handle them, expecting them to not occur, as we did above.
A marginally better way is to not handle the error, but to stop the program from going any farther. We can do this by forcing the program to "panic" and quit. Try the following code (with your data file still misnamed).
let file_contents = fs::read_to_string("src/data.txt").unwrap();
When you run the program now, the program crashes. You get essentially the same message as before, but before, the program didn’t panic/crash. Both situations (crashing and not crashing) can be bad, but not crashing, when running with bad/missing data, might turn out worse than just crashing.
If there are no errors, though (go ahead and fix the misspelling of “data.txt” now), you get your data, and you no longer get it wrapped up in an “Ok” (or an “Err”) package. This is what the “.unwrap()” option does for us. It unwraps the data from the wrapper that results from the “read_to_string” command. And since the package is already unwrapped by the time it gets printed, you no longer need the "debug" format:
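Here is a sketch of how that might look. The helper name `read_or_crash` is invented for illustration, and the sketch assumes src/data.txt exists; if it does not, the program panics.

```rust
use std::fs;

// Hypothetical helper: returns the file contents as a plain String.
fn read_or_crash(path: &str) -> String {
    // unwrap() hands back the contents on Ok, and panics on Err.
    fs::read_to_string(path).unwrap()
}

fn main() {
    let file_contents = read_or_crash("src/data.txt");
    // A plain String prints fine with the default {} format.
    println!("{}", file_contents);
}
```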
We can improve this non-handling of an error. Rather than using “.unwrap()”, we can use “.expect()”. This does much the same thing, except that if we don’t get an “Ok” like we expect, we can print out a customized message:
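Something along these lines, for instance. The wording of the message and the helper name `read_or_explain` are only examples; `.expect()` itself is a standard method that takes the message to show if the result turns out to be an “Err”.

```rust
use std::fs;

// Hypothetical helper: like unwrap(), but with our own message on failure.
fn read_or_explain(path: &str) -> String {
    // On Ok we get the contents; on Err the program panics and prints
    // this message followed by the underlying error.
    fs::read_to_string(path).expect("Something went wrong reading the file")
}

fn main() {
    // Assumes src/data.txt exists; rename it to see the custom message.
    let file_contents = read_or_explain("src/data.txt");
    println!("{}", file_contents);
}
```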
Now if you run your program, and there is no error, you get the data, but it’s not wrapped in the "Result" wrapper, so the file contents can be printed without using the "debug" format. But if something does go wrong, your customized message gets printed along with the error from the failure.
There's a finer-grained way to handle errors:
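The sketch below shows roughly what such a snippet could look like. The names `file` and `error`, the wording of the panic message, and the helper `read_with_match` are illustrative choices, not necessarily the original ones.

```rust
use std::fs;

// Hypothetical helper so the same logic can be tried on any path.
fn read_with_match(path: &str) -> String {
    // First, file_contents holds the Result wrapper (Ok or Err).
    let file_contents = fs::read_to_string(path);
    // Then we match on it and reuse the same name for the unwrapped String.
    let file_contents = match file_contents {
        Ok(file) => file,
        Err(error) => panic!("Problem reading the file {}: {:?}", path, error),
    };
    file_contents
}

fn main() {
    // Assumes src/data.txt exists; otherwise the Err arm panics.
    let file_contents = read_with_match("src/data.txt");
    println!("{}", file_contents);
}
```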
In the above snippet of code, we're initially creating a variable named "file_contents", and putting a "Result" data-type into it that contains either an "Ok" along with the file contents or an "Err" along with the error message given from the operating system to the "read_to_string" function. Then we're running a "match" against that variable to find out which result we have; if it's an "Ok" (along with the file), we return that file and put it into a new-but-same-named variable, "file_contents". If the result is an "Err" (along with the error message from the OS), we panic, along with printing out a custom message which includes that error message from the OS.
The format of the returned data is not particularly helpful. It would be better if we could break up the data into logical “chunks”. A logical way of breaking up this data would be line-by-line. So if we could break up the data at the newlines, that’d be great. You can see that would be especially helpful on a data file that is more uniform, like this new “data2.txt”:
(Remember to change the instance of “data.txt” in your program to “data2.txt”, and to change it in the error message instance as well, so your error message won’t be in error.)
When we read this file, we read it into a single String variable ("file_contents") as one long string, like this:
West,Kent,157,555-123-4567,Male\nJohnson,Bendi,42,888-329-7555,Female\nDalcern,Frederique,2230,415-444-4448,Female<EOF>
It's easy to break up the data line-by-line, and convert each line into an element of a vector (a one-dimensional array).
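One way this might be done is sketched below, using the standard `lines()` method, which splits a string at its newlines. The function name `split_into_lines` is made up for this example, and the sample text stands in for the contents of “data2.txt”.

```rust
// Hypothetical helper: turns one long string into a vector of lines.
fn split_into_lines(file_contents: &str) -> Vec<String> {
    let mut lines: Vec<String> = Vec::new();
    // lines() yields each line without its trailing newline.
    for line in file_contents.lines() {
        lines.push(line.to_string());
    }
    lines
}

fn main() {
    // Stand-in for the String read from data2.txt.
    let file_contents = "West,Kent,157,555-123-4567,Male\nJohnson,Bendi,42,888-329-7555,Female\nDalcern,Frederique,2230,415-444-4448,Female";
    let lines = split_into_lines(file_contents);
    // Each element of the vector is now one line of the file.
    for line in &lines {
        println!("{}", line);
    }
}
```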
The above can be condensed to the following:
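A condensed form could look like the sketch below, assuming the longer version builds the vector by pushing one line at a time. The function name `split_into_lines_short` is invented here; `lines()`, `map()` and `collect()` are standard library methods.

```rust
// Hypothetical helper: same result as a push-in-a-loop version, in one chain.
fn split_into_lines_short(file_contents: &str) -> Vec<String> {
    // lines() splits at newlines, map() turns each piece into an owned String,
    // and collect() gathers the pieces into a vector.
    file_contents.lines().map(String::from).collect()
}

fn main() {
    // Stand-in for the String read from data2.txt.
    let file_contents = "West,Kent,157,555-123-4567,Male\nJohnson,Bendi,42,888-329-7555,Female";
    let lines = split_into_lines_short(file_contents);
    println!("{:?}", lines);
}
```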