This document aims to explain, in a simple, step-by-step manner, the basics of reading a text file in Rust.
Almost the exact same document can be found here, but it does/says things slightly differently. Comparing the differences might help the beginner understand the concepts a little better.
You need a working Rust setup, and to be able to compile and run the simple “Hello, World!” program. So let’s start with that.
/* Learning to Read a Text File
   By Kent West, 4 Sept 2023 */

fn main() {
    println!("Hello, World!");
} // end of main()
Pretty much a no-brainer, right?
Now let’s create a text file, something like below (or the Dr. Seuss version above in the "Absolute minimal" green section):
Kent
Mason
Jugular tiger, river-dancing tse-tse “flyboys in the air”
This.Is.The.End.Of.This.Text.File!
We could name the file anything, but we’ll call it “data.txt”. The file will need to be in the same directory as your current prompt. If you’re in “/home/kent/PROGRAMMING/Rust/file_experiment/”, that’s where your “data.txt” file should be. Since this is Rust, I’d move my current working directory down one more level, into “/home/kent/PROGRAMMING/Rust/file_experiment/src”, and put my “data.txt” file here, and run my commands like “cargo run” from here. Alternatively, in our coding later, when we read “data.txt”, we could specify the path in that read command.
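If you're not sure which directory a relative path like "data.txt" will be looked up in, a small sketch like the one below may help. The helper name "resolve_against_cwd" is made up for this illustration; "std::env::current_dir()" is the standard-library call that reports the current working directory.

```rust
use std::env;
use std::io;
use std::path::PathBuf;

/// Hypothetical helper: joins a relative path onto the current working
/// directory, to show where the program would look for the file.
fn resolve_against_cwd(relative: &str) -> io::Result<PathBuf> {
    let cwd = env::current_dir()?; // Fails if the directory can't be determined.
    Ok(cwd.join(relative))
}

fn main() {
    match resolve_against_cwd("data.txt") {
        Ok(full_path) => println!("The program will look for: {}", full_path.display()),
        Err(e) => println!("Could not determine the working directory: {}", e),
    }
}
```

Running this from different directories should show the printed path changing along with your prompt's location.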
There are basically three ways a file can be read:
Side Note: Be wary of that term “character”; in American English one letter of the alphabet (or one numeral, or one punctuation mark, etc.) corresponds to one character and one byte, but in other languages, a single “letter”/symbol may take more than one byte, or even more than one character. But for our purposes, using an ASCII text file like above, we are safe to think of one character as one byte of data as one symbol on the keyboard (or in an ASCII chart). But ideally, you should get familiar with UTF-8 (of which ASCII is now a sub-set), and think in those terms, rather than thinking in terms of ASCII characters.
Also, you might be in the habit of thinking of a “string” as a connected series of “char”s, as in the string “Rover is a good dog” being a connected series of the characters “R” + “o” + “v” + “e” and so on. Be aware that in Rust (because of the above side note), a “String” type is not simply an array, a connected series, of “char” types; it is a sequence of UTF-8 bytes.
But both of these points are for a later time, and for today, you can think of ASCII text as chars and as symbols on your keyboard plus a few others you have to look up in an ASCII chart to use (like the tab and other control characters), and a string as a connected series of chars.
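To make the byte-versus-character point a little more concrete, here is a minimal sketch. The function name "byte_and_char_counts" is just something invented for this example; "len()" and "chars().count()" are the real standard-library calls, the first counting bytes and the second counting chars.

```rust
/// Returns (number of bytes, number of chars) for a piece of text.
fn byte_and_char_counts(text: &str) -> (usize, usize) {
    (text.len(), text.chars().count())
}

fn main() {
    // Pure ASCII: one byte per character.
    let (bytes, chars) = byte_and_char_counts("Rover");
    println!("ASCII text:     {} bytes, {} chars", bytes, chars);

    // Not ASCII: \u{00BF} is an upside-down question mark and \u{00E9} is
    // an accented "e"; each one takes two bytes in UTF-8.
    let (bytes, chars) = byte_and_char_counts("\u{00BF}Qu\u{00E9}?");
    println!("Non-ASCII text: {} bytes, {} chars", bytes, chars);
}
```

For the ASCII text the two counts match; for the second string they do not, which is why a String can't be treated as a simple array of one-byte characters.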
We’re only going to look at the first of these three methods, because if you understand it, there’s a good chance that the existing online documentation should be understandable enough for you to follow for the second and third methods.
Reading the whole file at once is probably the easiest way to read a text file. Your program simply opens the text file and gulps the entire file in one big bite into a string. Using the other methods, we'd be reading a bit of the file, manipulating it, then reading a bit more, manipulating that, reading some more, etc. With this method, we bring the whole file into fast RAM, and then we can manipulate it as needed. This method is not well suited to huge files, since the whole file has to fit in memory, but it will be suitable for most files and most projects.
Below is the basic idea (although trying to compile it will result in an error – go ahead, try it). Notice we removed the "Hello, World" print statement; we no longer need that.
/* Learning to Read a Text File
   By Kent West, 4 Sept 2023 */

fn main() {
    let file_contents = std::fs::read_to_string("data.txt");
    println!("{}", file_contents);
} // end of main()
Or, if you prefer to use a "use" statement instead of a path in the "read_to_string()" line:
/* Learning to Read a Text File
   By Kent West, 4 Sept 2023 */

use std::fs::*;

fn main() {
    let file_contents = read_to_string("data.txt");
    println!("{}", file_contents);
} // end of main()
Either method will work.
As you can see, we don’t have an “open file” or “close file” statement as is true in many other programming languages; we’re just grabbing (or trying to grab) the file contents in one big gulp, and then printing them to the terminal window.
If your working directory is different than where the "data.txt" file is located, your "read_to_string" line will need to specify that path, like so:
read_to_string("src/data.txt").
But as mentioned, compilation fails, giving you this error:
  |     println!("{}", file_contents);
  |                    ^^^^^^^^^^^^^ `std::result::Result<String, std::io::Error>` cannot be formatted with the default formatter
This is because the "read_to_string" function doesn't simply return a String. It returns a value of type Result (here, a Result<String, std::io::Error>).
Think of a Result as a package from Amazon left on your front porch; it's a wrapped box that contains something. Imagine that Amazon has a policy (they don't) such that the delivery person scans the package with an x-ray machine just as the package is being left on your porch, and if the package-contents are in good shape, an "OK" label is attached to the package; if the contents are damaged, a label that says "Error" is attached, with a brief explanation of what's wrong. A Result-type of data is like this Amazon package: a wrapper labeled either "Ok", in which case it holds the contents, or "Err", in which case it holds an error describing what went wrong.
The "{}" placeholder in "println!" only works with values that know how to display themselves (types that implement the Display trait), such as a String. We're giving it a Result, which does not.
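The package analogy can be sketched in a few lines of code. Everything here is made up for illustration (the "deliver" function and its messages are not part of any library); only "Result", "Ok", "Err", "is_ok()" and "is_err()" are real Rust.

```rust
use std::io::{Error, ErrorKind};

/// Hypothetical stand-in for a delivery: hands back an "Ok" package when
/// `damaged` is false, and an "Err" package with an explanation otherwise.
fn deliver(damaged: bool) -> Result<String, Error> {
    if damaged {
        Err(Error::new(ErrorKind::Other, "contents were damaged"))
    } else {
        Ok(String::from("a good book"))
    }
}

fn main() {
    for damaged in [false, true] {
        let package = deliver(damaged);
        // We can read the label without opening the package.
        println!("Ok label? {}   Err label? {}", package.is_ok(), package.is_err());
    }
}
```

Note that a Result is always one or the other: it holds either the contents or the error, never both.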
There are several ways of dealing with this Result-type of "Amazon Package".
The normal println!() formatting capabilities do not know how to handle a Result-type of "Amazon package". But we can tell println!() to use its debug formatter. This is kind of like tossing your Amazon package to a three-year old kid, or a puppy, and saying, "See if you can get the contents out of this, Kid."
For "quick-and-dirty" results, this is often a good choice. It looks like this:
println!("{:?}", file_contents);
and results in this:
$ cargo run
   Compiling bub v0.1.0 (/home/westk/OneDrive/bub)
    Finished dev [unoptimized + debuginfo] target(s) in 0.14s
     Running `/home/westk/OneDrive/bub/target/debug/bub`
Ok("Kent\nMason\nJugular tiger, river-dancing tse-tse “flyboys in the air”\nThis.Is.The.End.Of.This.Text.File!")
$
As you can see, the contents of the file are wrapped as a String, in a "package" of parentheses, with an "Ok" label slapped onto the front.
As mentioned, this solution works for "quick-and-dirty", but it's probably not what you want in production-quality code.
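If you'd like to see what the debug formatter does with both kinds of label without touching a file, here's a small sketch. To keep it short it uses a plain String as the error type instead of std::io::Error; that's a simplification for this example, and "debug_view" is a made-up name.

```rust
/// Formats a Result with the debug formatter and returns the text.
/// (Simplification: the error type here is a String, not std::io::Error.)
fn debug_view(package: &Result<String, String>) -> String {
    format!("{:?}", package)
}

fn main() {
    let good: Result<String, String> = Ok(String::from("Kent\nMason"));
    let bad: Result<String, String> = Err(String::from("file not found"));

    println!("{}", debug_view(&good)); // prints Ok("Kent\nMason")
    println!("{}", debug_view(&bad)); // prints Err("file not found")
}
```

Notice that the debug formatter shows the newline as a backslash-n rather than actually breaking the line, which is one more reason it's better suited to quick checks than to finished output.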
The unwrap() method takes the contents out of the "package", assuming that everything will be okay and that the label says "Ok". If things are not okay (say, the file is unreadable for some reason), your program will just crash (in Rust terms, it panics). If you're not terribly worried about your program crashing, this is a quick and not-quite-so-dirty method of getting to the data.
In this code below, we are unwrapping the package after reading the file, but before we hand the contents off to the println!() statement (from which we have removed the debug formatter).
...
fn main() {
    let file_contents = read_to_string("data.txt");
    let unwrapped_file_contents = file_contents.unwrap();
    println!("{}", unwrapped_file_contents);
} // end of main()
Of course, if you wanted to re-use the same variable name, the above could be written thusly:
    let file_contents = file_contents.unwrap();
    println!("{}", file_contents);
More often though, we'd unwrap the package at the same time that we're reading it, rather than as a separate step later:
...
fn main() {
    let file_contents = read_to_string("data.txt").unwrap();
    println!("{}", file_contents);
} // end of main()
Either way, you'll see that the contents are just the file contents, not wrapped as a String in parentheses, and not with an "Ok" or an "Err" label attached.
     Running `/home/westk/OneDrive/bub/target/debug/bub`
Kent
Mason
Jugular tiger, river-dancing tse-tse “flyboys in the air”
This.Is.The.End.Of.This.Text.File!
The expect() method is very similar to unwrap(), except that if a crash-causing problem occurs, your own message is printed as part of the crash report.
That looks like this:
...
fn main() {
    let file_contents = read_to_string("data.txt")
        .expect("Something went wrong; we didn't get what we expected.");
    println!("{}", file_contents);
} // end of main()
Below is a successful run, followed by me changing the name of the data file to force an error on the next run.
$ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
     Running `/home/westk/OneDrive/bub/target/debug/bub`
Kent
Mason
Jugular tiger, river-dancing tse-tse “flyboys in the air”
This.Is.The.End.Of.This.Text.File!
$ mv data.txt data.tx
$ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
     Running `/home/westk/OneDrive/bub/target/debug/bub`
thread 'main' panicked at 'Something went wrong; we didn't get what we expected.: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/main.rs:4:52
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
$
As a general rule, using expect() is probably a better option than using unwrap().
Matching on the Result is probably the best method: acting according to what the label says. If this were an Amazon package that said "Error", indicating damage, you'd want to return it for a replacement.
That's the very purpose of a Result-type, so that the program can act according to the label attached. If the label is "Err", the program can branch off to handle the error condition. We won't go deep into this option, but will only give a skeletal example.
...
fn main() {
    let file_contents = read_to_string("data.txt");
    match file_contents {
        Ok(contents) => println!("{}", contents),
        Err(e) => println!("BOOM! Your program has exploded due to the following error: {}", e),
    }
} // end of main()
Below is a successful run, followed by me changing the name of the data file to force an error on the next run.
$ cargo run
   Compiling bub v0.1.0 (/home/westk/OneDrive/bub)
    Finished dev [unoptimized + debuginfo] target(s) in 0.15s
     Running `/home/westk/OneDrive/bub/target/debug/bub`
Kent
Mason
Jugular tiger, river-dancing tse-tse “flyboys in the air”
This.Is.The.End.Of.This.Text.File!
$ mv data.txt data.tx
$ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
     Running `/home/westk/OneDrive/bub/target/debug/bub`
BOOM! Your program has exploded due to the following error: No such file or directory (os error 2)
$
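Going one small step past the skeletal example, the "Err" branch can look at what kind of error it got and respond differently. The sketch below is only one way this might be done; "describe_read" and its messages are invented for this illustration, while "e.kind()" and "ErrorKind::NotFound" are the real standard-library pieces.

```rust
use std::fs::read_to_string;
use std::io::ErrorKind;

/// Hypothetical helper: tries to read a file and describes the outcome
/// in a sentence instead of crashing.
fn describe_read(path: &str) -> String {
    match read_to_string(path) {
        Ok(contents) => format!("Read {} bytes from '{}'.", contents.len(), path),
        // A match "guard" lets us single out the missing-file case.
        Err(e) if e.kind() == ErrorKind::NotFound => {
            format!("'{}' does not exist; check the name and the working directory.", path)
        }
        // Any other error (permissions, invalid UTF-8, etc.) lands here.
        Err(e) => format!("Could not read '{}': {}", path, e),
    }
}

fn main() {
    println!("{}", describe_read("data.txt"));
    println!("{}", describe_read("data.tx"));
}
```

The program keeps running in every case, which is the whole point of acting on the label rather than ignoring it.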
Now that we have the data in RAM, as one long continuous String-type variable, it can be manipulated in a variety of ways. For example, we might want to break up the single string into individual lines. Below are two different ways to accomplish this.
...
fn main() {
    // Reading the data from a file into a string.
    let data_raw = read_to_string("data.txt")
        .expect("Something went wrong; we didn't get what we expected.");

    // Splitting the data string at newlines using 'split()'.
    let data_processed = data_raw.split('\n');

    println!("\nThis is debug's approximation of what the split'd data looks like internally:\n");
    println!("{:?}", data_processed);

    println!("\nAnd this is the split'd data item-by-item:\n");
    for item in data_processed {
        println!("Item = {}", item);
    }
} // end of main()
and ...
...
fn main() {
    // Reading the data from a file into a string.
    let data_raw = read_to_string("data.txt")
        .expect("Something went wrong; we didn't get what we expected.");

    // Splitting the data string at newlines using 'lines()'.
    let data_processed = data_raw.lines();

    println!("\nThis is debug's approximation of what the lined data looks like internally:\n");
    println!("{:?}", data_processed);

    println!("\nAnd this is the lined data item-by-item:\n");
    for item in data_processed {
        println!("Item = {}", item);
    }
} // end of main()
We can even collect the lines (garnered either by .lines() or by .split()) into a vector:
...
    // Splitting the data string at newlines using 'lines()'.
    let data_processed = data_raw.lines();
    let data_processed_into_vec: Vec<&str> = data_processed.collect();
...
    println!("{:?}", data_processed_into_vec);

    println!("\nAnd this is the lined data item-by-item:\n");
    for item in data_processed_into_vec {
...
Notice that whereas we started out with a String-type of value in "data_raw", the vector "data_processed_into_vec" contains &str-type of values. This is because the "split()" and "lines()" methods both produce &str-types of values, which are slices that borrow from "data_raw" rather than copies of it. Also note that we had to specify the type ("Vec<&str>") of the "data_processed_into_vec" variable, because "collect()" can build many kinds of collections and needs to be told which one we want.
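The two splitting approaches are not quite interchangeable, and a short sketch may make the difference easier to see. The helper names "with_split" and "with_lines" are made up for this example; they simply wrap the two methods used above.

```rust
/// Splits exactly at each '\n', keeping whatever lies between newlines.
fn with_split(text: &str) -> Vec<&str> {
    text.split('\n').collect()
}

/// Uses lines(), which ignores a final newline and also strips "\r\n" endings.
fn with_lines(text: &str) -> Vec<&str> {
    text.lines().collect()
}

fn main() {
    let text = "Kent\nMason\n"; // Note the newline at the very end.
    println!("{:?}", with_split(text)); // prints ["Kent", "Mason", ""]
    println!("{:?}", with_lines(text)); // prints ["Kent", "Mason"]
}
```

Since many text editors end a file with a newline, split('\n') will often hand you one extra, empty item at the end, and files saved with Windows-style line endings will leave a stray "\r" on each item. For reading lines of text, lines() is usually the less surprising choice.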
Often these steps will be combined together:
...
fn main() {
    // Reading the data from a file.
    let data_raw: String = read_to_string("data.txt") // Read data as a string.
        .expect("Something went wrong; we didn't get what we expected.");

    let data: Vec<&str> = data_raw
        .lines()    // Then break up the single string into lines.
        .collect(); // Then collect those lines into a vector of &str values.

    println!("\nThis is debug's approximation of what the data looks like internally:\n");
    println!("{:?}", data);

    println!("\nAnd this is the data element-by-element:\n");
    for element in data {
        println!("Item = {}", element);
    }
} // end of main()
What if you want to put your file-reading capability into a function?
Rust doesn’t care if the function definitions go before or after “main()”, but I tend to put them after.
/* Learning to Read a Text File
   By Kent West, 4 Sept 2023 */

use std::fs::*;

fn main() {
    // Filepath to the data file:
    let data_file = "./data.txt";

    // Get a vector of Strings, one element per line in the above data file.
    let data = read_data_from_file(data_file);

    println!("\n===ALL ELEMENTS, one at a time, without element index:");
    for i in &data {
        println!("{}", i);
    }

    println!("\n===ALL ELEMENTS, one At a time, with index:");
    for i in 0..data.len() {
        println!("Element {} = {}", i, data[i]);
    }

    println!("\nApproximation of vector:\n{:?}", data);

    println!("\nAnd the third element, just to show you can access individual elements in the vector:\n\t{}", data[2]);
} // end of main()

fn read_data_from_file(data_file: &str) -> Vec<String> {
    // Prepare error message in case it's needed. ('format!' works just like
    // 'println!', except it returns a value instead of printing it.)
    let err_msg = format!("Something went wrong reading '{}'.", data_file);

    // Read data file into a String.
    let contents_raw: String = read_to_string(data_file).expect(&err_msg);

    // Convert the String into a vector of &str elements, one per line of data file.
    let contents: Vec<&str> = contents_raw.lines().collect();

    // Convert those &str values to String values.
    let mut return_vec = Vec::new();
    for item in contents {
        return_vec.push(item.to_string());
    }

    return return_vec;
} // end of read_data_from_file()
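You may wonder why the function bothers converting the &str values to String values before returning them. The &str values only borrow from the String that holds the file contents, and that String goes away when the function ends; owned String values carry their own copy of the data. Here's a minimal sketch of the idea, with "owned_lines" being a made-up name for this example:

```rust
/// Returns owned lines; to_string() copies each line out of `text`,
/// so the result does not borrow from the input.
fn owned_lines(text: &str) -> Vec<String> {
    text.lines().map(|line| line.to_string()).collect()
}

fn main() {
    let lines = {
        // `contents` only lives until the end of this inner block...
        let contents = String::from("Kent\nMason");
        owned_lines(&contents)
    };
    // ...but the Strings in `lines` own their data, so they are still usable.
    println!("{:?}", lines); // prints ["Kent", "Mason"]
}
```

Had the helper tried to hand a Vec<&str> borrowed from "contents" out of that inner block, the compiler would have refused, because the borrowed data would no longer exist.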
And here's that function with better error-handling code, and a slightly different conversion method.
fn read_data_from_file(data_file: &str) -> Vec<String> {
    // Read data file into a Result.
    let contents_raw = read_to_string(data_file);

    // Process Result for errors
    let contents_string = match contents_raw {
        Ok(contents) => contents,
        Err(e) => {
            let err_msg = format!("\nBOOM!\nYour program failed in the \
                'read_data_from file' function, trying to read the data file:\n\
                '{}'.\nThe reason provided is:\n'{}'\n", data_file, e);
            panic!("{}", err_msg);
        }
    };

    // Convert the String into a vector of String elements, one per line of data file.
    let contents = contents_string        // Start with the String value.
        .lines()                          // Break it into one &str item per newline.
        .map(|line| line.to_string())     // Map each &str line into a String line.
        .collect();                       // Collect the lines into a vector of Strings.

    return contents;
} // end of read_data_from_file()
(You can test it by misnaming the "data.txt" file, as we did before.)
$ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
     Running `/home/westk/OneDrive/bub/target/debug/bub`

===ALL ELEMENTS, one at a time, without element index:
Kent
Mason
Jugular tiger, river-dancing tse-tse “flyboys in the air”
This.Is.The.End.Of.This.Text.File!

===ALL ELEMENTS, one At a time, with index:
Element 0 = Kent
Element 1 = Mason
Element 2 = Jugular tiger, river-dancing tse-tse “flyboys in the air”
Element 3 = This.Is.The.End.Of.This.Text.File!

Approximation of vector:
["Kent", "Mason", "Jugular tiger, river-dancing tse-tse “flyboys in the air”", "This.Is.The.End.Of.This.Text.File!"]
$ mv data.txt data.tx
$ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s
     Running `/home/westk/OneDrive/bub/target/debug/bub`
thread 'main' panicked at '
BOOM!
Your program failed in the 'read_data_from file' function, trying to read the data file:
'./data.txt'.
The reason provided is:
'No such file or directory (os error 2)'
', src/main.rs:40:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
$
One final modification of the function, which simplifies the code, but perhaps makes it more difficult for the beginner to comprehend:
fn read_data_from_file(data_file: &str) -> Vec<String> {
    // Read data file into a Result.
    let contents_raw = read_to_string(data_file);

    // Process Result for errors
    let contents = match contents_raw {
        Ok(contents) => contents.lines().map(|line| line.to_string()).collect(),
        Err(e) => {
            let err_msg = format!("\nBOOM!\nYour program failed in the \
                'read_data_from file' function, trying to read the data file:\n\
                '{}'.\nThe reason provided is:\n'{}'\n", data_file, e);
            panic!("{}", err_msg);
        }
    };

    return contents;
} // end of read_data_from_file()
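For the curious, here is one more variation, offered only as a sketch and not as something the rest of this document depends on. Instead of panicking inside the function, it hands the Result back to the caller using Rust's "?" operator and lets the caller decide what to do. The name "try_read_data_from_file" is made up to keep it distinct from the function above.

```rust
use std::fs::read_to_string;
use std::io;

/// Sketch: same job as read_data_from_file, but an error is returned
/// to the caller rather than causing a panic.
fn try_read_data_from_file(data_file: &str) -> io::Result<Vec<String>> {
    // The '?' means: on Err, return early from this function with that error.
    let contents = read_to_string(data_file)?;
    Ok(contents.lines().map(|line| line.to_string()).collect())
}

fn main() {
    match try_read_data_from_file("./data.txt") {
        Ok(lines) => println!("Read {} lines.", lines.len()),
        Err(e) => println!("Could not read the data file: {}", e),
    }
}
```

This keeps the function short, at the cost of pushing the "what do we do about it?" question up to whoever calls it.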
You now know how to read data as a String from a text file, and process that data, in Rust.