Standard File I/O

Here, we'll discuss system functions for straightforward file input and output in VCSSL.

- Table of Contents -

Standard File Output
Writing Line by Line
Standard File Input
Reading Line by Line
Specification of Standard File I/O Functions
Writing Values of y = x^2 to a CSV Numerical Data File
Reading a CSV Numerical Data File of y = x^2
Specifying Character Encoding

Standard File Output

Let's start with an example that writes a string to a file named "world.txt":


// Open the file in text writing mode
int fileID = open("world.txt", "w");

// Write "Hello world!" to the file
write(fileID, "Hello world!");

// Close the file
close(fileID);

WriteFile.vcssl

Executing this code creates a file named "world.txt" in the same directory as the program, containing the text "Hello world!".

In the "open" function, the first argument is the file name "world.txt", and the second argument is the text writing mode "w" (alternatively, you can use the constant WRITE). There are various other modes available for file operations.

The open function returns a unique identifier for the opened file, referred to as the "file ID". In the example above, it's stored in an integer variable named "fileID".

To perform file write/read operations or to close the file, you use this file ID, allowing you to manage multiple files simultaneously.
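
As a minimal sketch of handling two files at once, the following keeps two file IDs in separate variables. The file names "a.txt" and "b.txt" and the variable names are placeholders chosen for illustration only.

```vcssl
// Open two files in text writing mode (file names are placeholders)
int fileA = open("a.txt", "w");
int fileB = open("b.txt", "w");

// Each call targets the file whose ID is passed as the first argument
write(fileA, "This goes to a.txt");
write(fileB, "This goes to b.txt");

// Close both files
close(fileA);
close(fileB);
```

Because every function takes the file ID as its first argument, the two files can stay open at the same time without interfering with each other.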

Writing Line by Line

When you need to write content line by line, users familiar with the C language might write:


write(fileID, "Hello world! \n");

WriteFileLineBackslashN.vcssl

This works, but it's discouraged in VCSSL because newline characters differ between operating environments, which can cause confusion. Instead, use the "writeln" function, which automatically appends the correct newline character for the operating environment:


// Open the file in text writing mode
int fileID = open("world.txt", "w");

// Write three lines to the file, one at a time
writeln(fileID, "Hello");
writeln(fileID, "World");
writeln(fileID, "!");

// Close the file
close(fileID);

WriteFileLine.vcssl

This approach ensures that each string is correctly followed by a newline, simplifying the writing process and avoiding cross-platform issues.

Standard File Input

Let's now try reading the file we previously wrote:


// Open the file in text reading mode
int fileID = open("world.txt", "r");

// Read from the file
string text = read(fileID);

// Close the file
close(fileID);

// Display the read content
print(text);

ReadFile.vcssl

In the open function above, the "r" argument specifies the text reading mode (alternatively, you can use the constant READ). The "read" function retrieves the file's contents.

The return value of the "read" function is typically a string array. However, in text reading mode ("r"), it returns an array containing a single element. In VCSSL, you can directly assign an array with one element to a scalar variable. Thus, in the example above, the return value of the read function is stored in the scalar variable "text".

In binary reading mode ("rb"), the returned array from the "read" function contains bytes. In TSV reading mode ("rtsv"), the returned array contains contents separated by tabs or spaces. In CSV reading mode ("rcsv"), the returned array contains contents separated by commas. This means you need to handle the results differently depending on the mode used.

Reading Line by Line

To read the file line by line, follow these steps:


// Open the file in text reading mode
int fileID = open("world.txt", "r");

// Determine the number of lines in the file
int n = countln(fileID);

for (int i = 0; i < n; i++) {

	// Read one line from the file
	string line = readln(fileID);

	// Display the read content
	println("LINE_" + i + ": " + line);
}

// Close the file
close(fileID);

ReadFileLine.vcssl

Here, we first ascertain the number of lines using the "countln" function. Then, we read through the file one line at a time using the readln function. Unlike the read function, which retrieves the entire file's contents at once, readln extracts content line by line, but otherwise it's the same.
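
If the lines are needed after the file has been closed, they can be kept in a string array. The following is a sketch along those lines, using only the functions introduced above; the array name "lines" is an illustrative choice.

```vcssl
// Open the file in text reading mode
int fileID = open("world.txt", "r");

// Determine the number of lines, and prepare an array to hold them
int n = countln(fileID);
string lines[n];

for (int i = 0; i < n; i++) {

	// Read one line and store it in the array
	string line = readln(fileID);
	lines[i] = line;
}

// The file is no longer needed once all lines are stored
close(fileID);

// Display the stored lines in reverse order
for (int i = 0; i < n; i++) {
	println(lines[n - 1 - i]);
}
```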

Specification of Standard File I/O Functions

The functions used in the examples above, such as "open" and "readln", are called standard file I/O functions, provided by the "System" library.

List of Standard File I/O Functions

Function	Arguments	Details
open	file name: string, mode: string	Opens the file with the specified file name in the specified mode, assigns a file ID, and returns it. The modes are summarized in the table below. Modes can be specified using string literals like "w", or using constants like WRITE.
write	file ID: int, content1: any type, content2: any type, ...	Writes the contents to the target file. If multiple arguments are specified, they are written continuously in text writing mode, tab-separated in TSV writing mode, comma-separated in CSV writing mode, or byte-by-byte in binary writing mode.
writeln	file ID: int, content1: any type, content2: any type, ...	Writes the contents to the target file and adds a newline. The behavior when multiple arguments are specified is the same as the "write" function.
read	file ID: int	Reads the contents of the specified file and returns them as a string array. In text reading mode, it returns an array with one element containing the entire file content. In TSV reading mode, it returns an array separated by tabs or spaces; in CSV reading mode, by commas; and in binary reading mode, by bytes. Note that TSV reading mode treats spaces and tabs as equivalent, and treats consecutive spaces or tabs as a single delimiter, because it assumes numerical data files. For stricter handling of text TSV files, use the open.file.TextFile library.
readln	file ID: int	Reads one line from the specified file and returns it as a string array, similar to the "read" function but for individual lines.
countln	file ID: int	Counts the number of lines in the file and returns it as an int.
close	file ID: int	Closes the file, ensuring all data is written and file resources are released.

List of File I/O Modes

The modes specified as the second argument of the "open" function are outlined below.

Writing Modes

Specified Value for Writing Mode	Details
"w" or constant WRITE	Opens a file in text writing mode.
"a" or constant APPEND	Opens a file in append mode, adding text to the end.
"wtsv" or constant WRITE_TSV	Opens a file in TSV writing mode.
"wcsv" or constant WRITE_CSV	Opens a file in CSV writing mode.
"wb" or constant WRITE_BINARY	Opens a file in binary writing mode.
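
As an illustration of append mode and of specifying a mode by constant, here is a short sketch. The file name "log.txt" is a placeholder.

```vcssl
// Open the file in append mode, using the constant instead of "a"
int fileID = open("log.txt", APPEND);

// This line is added after whatever the file already contains
writeln(fileID, "One more line");

// Close the file
close(fileID);
```

Since append mode adds text to the end, running this program twice should leave two such lines in the file rather than one.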

Reading Modes

Specified Value for Reading Mode	Details
"r" or constant READ	Opens a file in text reading mode.
"rtsv" or constant READ_TSV	Opens a file in TSV reading mode, treating tabs and spaces as delimiters. (*)
"rcsv" or constant READ_CSV	Opens a file in CSV reading mode.
"rb" or constant READ_BINARY	Opens a file in binary reading mode.

* Note: TSV reading mode interprets both tabs and spaces as delimiters, not just tabs. This is because the mode was implemented in the early days of VCSSL for reading numerical data files. If you want values to be separated strictly by tabs only, please use the open.file.TextFile library.

Details and Notes for Each Mode

Text Writing/Reading Modes

These modes are intended for regular text files. In writing mode, the specified strings or values are written directly to the file as text. If multiple arguments are specified, they are written sequentially without any separators. In reading mode, the entire contents of a file or a line are returned as a string array with one element.
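
The following sketch shows the behavior for multiple arguments in text writing mode. The file name "point.txt" and the values are arbitrary.

```vcssl
// Open the file in text writing mode
int fileID = open("point.txt", "w");

// The four arguments are written back to back, with nothing inserted between them
writeln(fileID, "x=", 1, ", y=", 2);

// Close the file
close(fileID);
```

Going by the description above, the file should then contain the single line "x=1, y=2". Any separators you want in the output have to be passed as arguments yourself.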

Binary Writing/Reading Modes

These modes allow for the direct reading and writing of byte values. Since VCSSL lacks a type specifically for handling single-byte values, reading and writing are performed using int type values. For example, writing a sequence of int values 1, 2, and 3 will directly write them as a sequence of unsigned byte values 1, 2, and 3 (represented in binary as 00000001, 00000010, and 00000011). Note that the maximum value that can be written in one byte is 255 (11111111).
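
A sketch of the byte sequence described above follows. The file name "data.bin" is a placeholder, and the code assumes that the array returned by "read" can be received directly in an int array through implicit type conversion.

```vcssl
// Write the three byte values 1, 2, and 3
int outID = open("data.bin", "wb");
write(outID, 1, 2, 3);
close(outID);

// Read them back; each byte is expected to arrive as one array element
// (receiving the result as int[] is an assumption, see the note above)
int inID = open("data.bin", "rb");
int[] bytes = read(inID);
close(inID);

// Display the three values
println(bytes[0], bytes[1], bytes[2]);
```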

TSV Writing/Reading Modes

These modes are designed for handling tab- or space-separated numerical data files. They are optimized for handling numerical data and perform simple operations without additional processing such as quoting or escaping values that contain delimiter characters or newlines.
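
As a small round-trip sketch, the following writes one line of three values in TSV writing mode and reads it back in TSV reading mode. The file name "point.tsv" is a placeholder, and receiving the result as an int array assumes implicit conversion from the returned string array.

```vcssl
// Write three values on one line, separated by tabs
int outID = open("point.tsv", "wtsv");
writeln(outID, 1, 2, 3);
close(outID);

// Read the line back; it is split at tabs (or spaces) into array elements
int inID = open("point.tsv", "rtsv");
int[] values = readln(inID);
close(inID);

// Use the values as numbers
println(values[0] + values[1] + values[2]);
```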

CSV Writing/Reading Modes

Similar to TSV modes, these are for comma-separated numerical data files. Like TSV modes, no complex processing is performed for values that include delimiters or newlines.

For both TSV and CSV modes, if you require handling of files that need more sophisticated processing (such as escaping delimiters within values), you should use the open.file.TextFile library.

Writing Values of y = x^2 to a CSV Numerical Data File

Let's demonstrate writing to a CSV numerical data file. We'll calculate y = x^2 and output the results to a file:


// Open the file in CSV writing mode
int fileID = open("x2.csv", "wcsv");

for(int i = 0; i <= 10; i++){

	// Write one line to the file, appending a newline
	writeln(fileID, i, i*i);
}

close(fileID);

WriteFileCSV.vcssl

Executing this code will generate a file named "x2.csv" in the same directory as the program. The content of the file will look like this:

0,0
1,1
...
10,100

Files in this format can be graphed using general graphing software or VCSSL's graph plotting functionality.

Reading a CSV Numerical Data File of y = x^2

Next, we'll read from the file we just created:


// Open the file in CSV reading mode
int fileID = open("x2.csv", "rcsv");

// Get the number of lines in the file
int n = countln(fileID);

// Variables to store values (comma-separated) for each line
int x[n];
int y[n];

for(int i = 0; i < n; i++){

	// Read the next line and separate values by commas
	int[] line = readln(fileID);

	// Assign the first and second elements to x and y, respectively
	x[i] = line[0];
	y[i] = line[1];
}

close(fileID);

// Display the read data
for(int i = 0; i < n; i++){
	println(x[i], y[i]);
}

ReadFileCSV.vcssl

This program reads the "x2.csv" file, storing the separated values into x and y arrays, and then displays their contents on the console.

The "readln" function, when called for the first time, retrieves the first line of the file, separates it by the comma symbol ",", and stores it in a string array, then returns it. In the above code, this array is received in the integer array "line". Here implicit type conversion casts the strings to integers.

Here, line[0] is the first value (the x-value) on the line, so it's stored in the integer array x. Similarly, the next value, line[1], is stored in y. This completes the reading of one line.

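Since "readln" is inside a for loop, it's called once for each line of the file: n times in total, which is 11 times for the file written above (one line for each x from 0 to 10). The second call reads the 2nd line, and so on up to the last line. The program then closes the file and displays the stored values.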

Specifying Character Encoding

When exchanging files created on different operating systems, issues like character encoding differences can lead to "mojibake" or garbled text. To address this, you can specify the character encoding in the "open" function by adding an encoding parameter.


int file = open("utf8.txt", "w", "UTF-8");
write(file, "ABCDE!");
close(file);

WriteFileEncoding.vcssl

Here are the character encodings you can use:

Encoding	Details
UTF-8	One of the Unicode Transformation Formats. UTF-8 is versatile and widely supported across platforms, becoming the global standard character encoding.
UTF-16	One of the Unicode Transformation Formats. It was once popular for certain applications but is less commonly used today outside of specific contexts.
UTF-32	One of the Unicode Transformation Formats. This format uses four bytes for each character, providing a fixed width for all Unicode characters. It is rarely used because it takes more space than UTF-8 or UTF-16.
Shift_JIS	Previously the mainstream character encoding in Japanese computing environments. There has been a significant shift towards UTF-8 due to its broader compatibility.
EUC-JP	Commonly used in older Japanese versions of Linux environments. Like Shift_JIS, it has been largely superseded by UTF-8 in modern applications.

It's crucial to use the correct formal notation for each encoding. For example, you must write "Shift_JIS" (with an underscore), not "Shift-JIS" (with a hyphen), to avoid errors. Similarly, you must write "UTF-8", not "UTF_8".
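
One practical use is converting a file from one encoding to another. The sketch below assumes that the encoding argument of "open" is also accepted in text reading mode; both file names are placeholders.

```vcssl
// Read the whole file as Shift_JIS text
// (passing an encoding in reading mode is an assumption)
int inID = open("sjis.txt", "r", "Shift_JIS");
string text = read(inID);
close(inID);

// Write the same text back out as UTF-8
int outID = open("utf8.txt", "w", "UTF-8");
write(outID, text);
close(outID);
```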

Acknowledgement: We greatly appreciate the cooperation of two ChatGPT AIs in translating this page.