Standard File I/O

Here, we'll discuss system functions for straightforward file input and output in VCSSL.

Standard File Output

Let's start with an example that writes a string to a file named "world.txt":

WriteFile.vcssl
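A minimal sketch of such a program, built only from the open, write, and close functions described on this page. It assumes these System functions can be called without an explicit import:

```vcssl
// Open "world.txt" in text writing mode and keep the returned file ID.
int fileID = open("world.txt", "w");

// Write the string to the file identified by fileID.
write(fileID, "Hello world!");

// Close the file so that all data is written and resources are released.
close(fileID);
```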

Executing this code creates a file named "world.txt" in the same directory as the program, containing the text "Hello world!".

In the "open" function, the first argument is the file name "world.txt", and the second argument is the text writing mode "w" (alternatively, you can use the constant WRITE). There are various other modes available for file operations.

The open function returns a unique identifier for the opened file, referred to as the "file ID". In the example above, it's stored in an integer variable named "fileID".

To perform file write/read operations or to close the file, you use this file ID, allowing you to manage multiple files simultaneously.
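Because each call to open returns its own file ID, one program can work with several files at once. The sketch below is only an illustration: the file names "a.txt" and "b.txt" and the variable names are hypothetical, and WRITE is the constant form of "w" mentioned above.

```vcssl
// Open two files at the same time; each one gets its own file ID.
int fileA = open("a.txt", WRITE);
int fileB = open("b.txt", WRITE);

// The file ID passed as the first argument selects the target file.
write(fileA, "text for the first file");
write(fileB, "text for the second file");

// Close both files through their IDs.
close(fileA);
close(fileB);
```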

Writing Line by Line

When you need to write content line by line, users familiar with the C language might write:

WriteFileLineBackslashN.vcssl
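A sketch of this C-style approach, in which each string ends with an explicit "\n". The two strings are illustrative:

```vcssl
int fileID = open("world.txt", "w");

// Each line break is written by hand as "\n".
write(fileID, "Hello\n");
write(fileID, "world!\n");

close(fileID);
```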

This works, but it is discouraged in VCSSL because newline characters differ between environments, which can cause confusion. Instead, use the "writeln" function, which automatically appends the correct newline character for the operating environment:

WriteFileLine.vcssl
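A sketch of the same output written with writeln. The two strings are illustrative:

```vcssl
int fileID = open("world.txt", "w");

// writeln appends the newline suited to the current environment.
writeln(fileID, "Hello");
writeln(fileID, "world!");

close(fileID);
```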

This approach ensures that each string is correctly followed by a newline, simplifying the writing process and avoiding cross-platform issues.

Standard File Input

Let's now try reading the file we previously wrote:

ReadFile.vcssl
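A minimal sketch of such a program, following the description below. Printing the result with println is an assumption about how the text is displayed:

```vcssl
// Open "world.txt" in text reading mode.
int fileID = open("world.txt", "r");

// In text reading mode, read returns a one-element array,
// which can be assigned directly to a scalar string variable.
string text = read(fileID);

close(fileID);

// Show the file contents on the console.
println(text);
```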

In the open function above, the "r" argument specifies the text reading mode (alternatively, you can use the constant READ). The "read" function retrieves the file's contents.

The return value of the "read" function is typically a string array. However, in text reading mode ("r"), it returns an array containing a single element. In VCSSL, you can directly assign an array with one element to a scalar variable. Thus, in the example above, the return value of the read function is stored in the scalar variable "text".

In binary reading mode ("rb"), the returned array from the "read" function contains bytes. In TSV reading mode ("rtsv"), the returned array contains contents separated by tabs or spaces. In CSV reading mode ("rcsv"), the returned array contains contents separated by commas. This means you need to handle the results differently depending on the mode used.

Reading Line By Line

To read the file line by line, follow these steps:

ReadFileLine.vcssl
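A sketch of these steps, using the countln and readln functions described below. The variable names are illustrative, and println is assumed as the way to display each line:

```vcssl
int fileID = open("world.txt", "r");

// Get the number of lines in the file.
int lineCount = countln(fileID);

// Each call to readln returns the next line.
for (int i=0; i<lineCount; i++) {
    string line = readln(fileID);
    println(line);
}

close(fileID);
```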

Here, we first get the number of lines with the "countln" function, then read the file one line at a time with the "readln" function. Unlike "read", which retrieves the entire file's contents at once, "readln" returns one line per call; otherwise the two behave the same.

Specification of Standard File I/O Functions

The functions used in the examples above, such as "open" and "readln", are called standard file I/O functions, provided by the "System" library.

List of Standard File I/O Functions

| Function | Arguments | Details |
|---|---|---|
| open | file name: string, mode: string | Opens the file with the specified file name in the specified mode, assigns a file ID, and returns it. The modes are summarized in the table below. Modes can be specified using string literals like "w", or using constants like WRITE. |
| write | file ID: int, content1: any type, content2: any type, ... | Writes the contents to the target file. If multiple arguments are specified, they are written continuously in text writing mode, tab-separated in TSV writing mode, comma-separated in CSV writing mode, or byte-by-byte in binary writing mode. |
| writeln | file ID: int, content1: any type, content2: any type, ... | Writes the contents to the target file and adds a newline. The behavior when multiple arguments are specified is the same as the "write" function. |
| read | file ID: int | Reads the contents of the specified file and returns them as a string array. In text reading mode, it returns an array with one element containing the entire file content. In TSV reading mode, it returns an array separated by tabs or spaces; in CSV reading mode, by commas; and in binary reading mode, by bytes. Please note that, in TSV reading mode, spaces and tabs are treated as equivalent, and consecutive spaces or tabs are treated as one, assuming numerical data TSV files. The open.file.TextFile library provides functionality for handling more strict text TSV files. |
| readln | file ID: int | Reads one line from the specified file and returns it as a string array, similar to the "read" function but for individual lines. |
| countln | file ID: int | Counts the number of lines in the file and returns it as an int. |
| close | file ID: int | Closes the file, ensuring all data is written and file resources are released. |

List of File I/O Modes

The modes specified as the second argument of the "open" function are outlined below.

Writing Modes

| Specified Value for Writing Mode | Details |
|---|---|
| "w" or constant WRITE | Opens a file in text writing mode. |
| "a" or constant APPEND | Opens a file in append mode, adding text to the end. |
| "wtsv" or constant WRITE_TSV | Opens a file in TSV writing mode. |
| "wcsv" or constant WRITE_CSV | Opens a file in CSV writing mode. |
| "wb" or constant WRITE_BINARY | Opens a file in binary writing mode. |

Reading Modes

| Specified Value for Reading Mode | Details |
|---|---|
| "r" or constant READ | Opens a file in text reading mode. |
| "rtsv" or constant READ_TSV | Opens a file in TSV reading mode, treating tabs and spaces as delimiters. (*) |
| "rcsv" or constant READ_CSV | Opens a file in CSV reading mode. |
| "rb" or constant READ_BINARY | Opens a file in binary reading mode. |
* Note: TSV reading mode interprets both tabs and spaces as delimiters, not just tabs. This is because the mode was implemented in the early days of VCSSL for reading numerical data files. If you need values separated strictly by tabs, please use the open.file.TextFile library.

Details and Notes for Each Mode

Text Writing/Reading Modes

These modes are intended for regular text files. In writing mode, the specified strings or values are written directly to the file as text. If multiple arguments are specified, they are written sequentially without any separators. In reading mode, the entire contents of a file or a line are returned as a string array with one element.

Binary Writing/Reading Modes

These modes allow for the direct reading and writing of byte values. Since VCSSL lacks a type specifically for handling single-byte values, reading and writing are performed using int type values. For example, writing a sequence of int values 1, 2, and 3 will directly write them as a sequence of unsigned byte values 1, 2, and 3 (represented in binary as 00000001, 00000010, and 00000011). Note that the maximum value that can be written in one byte is 255 (11111111).
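As a sketch of the binary modes, the following writes the three byte values from the example above and reads them back. The file name "data.bin" and the variable names are hypothetical. Receiving the result of read in an int array is an assumption: it relies on the same implicit conversion that this page describes for CSV reading.

```vcssl
// Write the int values 1, 2, and 3 as three unsigned bytes.
int outID = open("data.bin", "wb");
write(outID, 1, 2, 3);
close(outID);

// Read the bytes back; each array element holds one byte value (0 to 255).
// Assumption: the returned array can be received as an int array.
int inID = open("data.bin", "rb");
int bytes[] = read(inID);
close(inID);
```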

TSV Writing/Reading Modes

These modes are designed for handling tab- or space-separated numerical data files. They are optimized for handling numerical data and perform simple operations without additional processing such as quoting or escaping values that contain delimiter characters or newlines.

CSV Writing/Reading Modes

Similar to TSV modes, these are for comma-separated numerical data files. Like TSV modes, no complex processing is performed for values that include delimiters or newlines.

For both TSV and CSV modes, if you require handling of files that need more sophisticated processing (such as escaping delimiters within values), you should use the open.file.TextFile library.

Writing Values of y = x^2 to a CSV Numerical Data File

Let's demonstrate writing to a CSV numerical data file. We'll calculate y = x^2 and output the results to a file:

WriteFileCSV.vcssl
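A sketch of such a program, assuming x runs over the integers 0 to 10 as in the output shown below:

```vcssl
// Open "x2.csv" in CSV writing mode.
int fileID = open("x2.csv", "wcsv");

for (int x=0; x<=10; x++) {
    int y = x * x;

    // In CSV writing mode, multiple arguments are written comma-separated,
    // and writeln ends the line.
    writeln(fileID, x, y);
}

close(fileID);
```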

Executing this code will generate a file named "x2.csv" in the same directory as the program. The content of the file will look like this:

0,0
1,1
...
10,100

Files in this format can be graphed using general graphing software or VCSSL's graph plotting functionality.

Reading a CSV Numerical Data File of y = x^2

Next, we'll read from the file we just created:

ReadFileCSV.vcssl
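A sketch of such a program, following the description below: 10 lines are read into the int arrays x and y. The declaration form `int line[] = ...` and the println calls are assumptions about details the description does not spell out.

```vcssl
// Arrays that receive the x and y values of the first 10 lines.
int x[10];
int y[10];

int fileID = open("x2.csv", "rcsv");

for (int i=0; i<10; i++) {
    // readln returns the comma-separated values of one line as a string array;
    // assigning it to an int array converts the strings to integers.
    int line[] = readln(fileID);
    x[i] = line[0];
    y[i] = line[1];
}

close(fileID);

// Display the values on the console.
for (int i=0; i<10; i++) {
    println("x=" + x[i] + ", y=" + y[i]);
}
```

If "x2.csv" holds 11 lines (x from 0 to 10), this sketch leaves the last line unread; enlarging the arrays and the loop count to 11 would include it.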

This program reads the "x2.csv" file, storing the separated values into x and y arrays, and then displays their contents on the console.

When "readln" is called for the first time, it retrieves the first line of the file, splits it at each comma ",", and returns the pieces as a string array. In the above code, this array is received in the integer array "line"; implicit type conversion casts the strings to integers.

line[0] holds the first value on the line (the x-value), so it is stored in the integer array x. Similarly, the next value, line[1], is stored in y. This completes the reading of one line.

Since "readln" is inside a for loop, it is called 10 times. The second call reads the 2nd line of the file, and so on up to the 10th line.

Specifying Character Encoding

When exchanging files created on different operating systems, issues like character encoding differences can lead to "mojibake" or garbled text. To address this, you can specify the character encoding in the "open" function by adding an encoding parameter.

WriteFileEncoding.vcssl
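A minimal sketch, assuming the encoding name is passed as an additional third argument of open. The text above states that an encoding parameter is added but not its position. The file name and string are taken from the earlier examples.

```vcssl
// Assumption: the encoding name is the third argument of open.
int fileID = open("world.txt", "w", "UTF-8");

writeln(fileID, "Hello world!");

close(fileID);
```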

Here are the character encodings you can use:

| Encoding | Details |
|---|---|
| UTF-8 | One of the Unicode Transformation Formats. UTF-8 is versatile and widely supported across platforms, and has become the global standard character encoding. |
| UTF-16 | One of the Unicode Transformation Formats. It was once popular for certain applications but is less commonly used today outside of specific contexts. |
| UTF-32 | One of the Unicode Transformation Formats. This format uses four bytes for each character, providing a fixed width for all Unicode characters. It is rarely used because it is less space-efficient than UTF-8 or UTF-16. |
| Shift_JIS | Previously the mainstream character encoding in Japanese computing environments. There has been a significant shift towards UTF-8 due to its broader compatibility. |
| EUC-JP | Commonly used in older Japanese versions of Linux environments. Like Shift_JIS, it has been largely superseded by UTF-8 in modern applications. |

It's crucial to use the correct formal notation for each encoding. For example, "Shift_JIS" instead of "Shift-JIS" (using an underscore instead of a hyphen) is required to avoid errors. Similarly, "UTF-8" must be used instead of "UTF_8". Adherence to these formats is essential to ensure correct function usage.

Acknowledgement: We greatly appreciate the cooperation of two ChatGPT AIs in translating this page.