Lesson 11: File Input and Output
Files are an important part of any game whether to hold high scores, player information, stats and configuration settings, or assets such as sounds and images. This lesson is going to assume you already have a working knowledge of drive and directory structure. If you know what the following terms mean and how they pertain to drive and directory structure then continue on with this lesson.
Root, directory, sub-directory, tree, command line interface, shell, extension, and path.
If you are unfamiliar with these terms I highly suggest you view a tutorial on using the Windows command line. A good place to start would be Computer Hope's tutorial found here.
The programs you write communicate with the underlying operating system. In order to speak correctly to the operating system's drive and directory structure the programmer needs to know how to format the parameter structures sent to it. This is true for any programming language and operating system combination you may decide to use. If you are using Linux or MacOS with this course then find a good tutorial on your operating system's drive and directory structure and how to navigate it at the command prompt. In Linux and MacOS the command prompt is often referred to as the Terminal.
Directory Structure: The _STARTDIR$ Statement
The _STARTDIR$ statement returns the path where the program resides. Users may decide to install your program anywhere on their computer. The _STARTDIR$ statement lets your program know exactly where it resides on the user's hard drive.
PRINT _STARTDIR$
If you followed the directions in the Install QB64 section and installed the qb64 folder onto your computer's desktop then:
C:\Users\<your login name>\Desktop\qb64
should have been printed to your screen. The <your login name> area should contain the account name you use on your computer.
Directory Structure: The _CWD$ Statement
The _CWD$ statement returns the path of the current working directory.
PRINT _CWD$
If you followed the directions in the Install QB64 section and installed the qb64 folder onto your computer's desktop then:
C:\Users\<your login name>\Desktop\qb64
should have been printed to your screen. The <your login name> area should contain the account name you use on your computer.
As you navigate around the hard drive the path contained in _CWD$ will change accordingly.
The current working directory is where files will be opened from and new files created. If you need to change the current working directory then use the CHDIR statement.
Directory Structure: The CHDIR Statement
Before files can be opened or created they need a place to reside within the drive's directory structure. The CHDIR statement allows you to navigate the directory structure by changing the current working directory. The asset file you downloaded and placed into your qb64 directory contains a "Lesson11" subdirectory located at the following path:
.\tutorial\Lesson11
To navigate to this folder you would issue the command:
CHDIR ".\tutorial\Lesson11" ' navigate to Lesson11 subdir from the current directory
The ".\" signifies the current working directory, "tutorial" is the subdirectory you are navigating to and from there you are navigating to another subdirectory contained in "tutorial" named "Lesson11". Your current working directory is now "Lesson11" and any files you want to open need to be located here, and likewise, any files you create will also be created here. At the command prompt you get a nice visual path indicator of this:
C:\users\<your login name>\desktop\qb64\tutorial\Lesson11>
However in QB64 there is no visual indication of what your current working directory is. You'll need to keep track of this through the code you write.
You can also use parent directory notation ( .. ) to move back one level just like at the command prompt. To navigate back to the qb64 directory you could issue the command:
CHDIR "..\.." ' move up the tree two levels
and this would move you up the tree two levels, toward the root, back to the qb64 directory.
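Since QB64 gives no visual indication of the current working directory, one way to keep track of it is to print _CWD$ after each CHDIR. The lines below are a minimal sketch, assuming the program starts in the qb64 folder and that the "tutorial\Lesson11" subdirectory from the asset file is present:

```basic
PRINT "Starting in: "; _CWD$ '              show where we begin
IF _DIREXISTS(".\tutorial\Lesson11") THEN ' only navigate if the folder is there
    CHDIR ".\tutorial\Lesson11" '           move into the Lesson11 subdirectory
    PRINT "Now in:      "; _CWD$ '          path now ends in tutorial\Lesson11
    CHDIR "..\.." '                         go back two levels using parent notation
    PRINT "Back in:     "; _CWD$ '          same path we started with
ELSE
    PRINT "Lesson11 folder not found." '    nothing was changed
END IF
```

Printing the path like this is only a debugging aid; a finished program would more likely store the path in a string variable.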
To change to a different drive such as the D: drive issue:
CHDIR "D:\" ' change drive letter
If you want to go directly to the root of the current drive you are on issue the command:
CHDIR "\" ' go directly to the root
As you can see CHDIR is basically QB64's version of the Windows and Linux command line's CD command.
Note: Windows (and DOS) operating systems use the backslash "\" to separate directory entries. Linux and MacOS use the forward slash "/" to accomplish this. QB64 takes into account which operating system it's running on and changes these accordingly at compile time. However, you should get into the habit of using the proper slash for the appropriate operating system. There are metacommands to identify which operating system your program is currently running in to aid you in selecting the appropriate notation. Metacommands will be discussed at the end of this lesson.
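As a rough illustration of choosing the proper slash, the sketch below uses QB64's _OS$ function (a run time check, not one of the metacommands mentioned above) to pick a separator. The variable names Slash$ and LessonPath$ are made up for this example:

```basic
DIM Slash$ '      directory separator for the current operating system
DIM LessonPath$ ' path built with the chosen separator

IF INSTR(_OS$, "[WINDOWS]") THEN ' _OS$ contains "[WINDOWS]" when running in Windows
    Slash$ = "\" '                 Windows uses the backslash
ELSE
    Slash$ = "/" '                 Linux and MacOS use the forward slash
END IF
LessonPath$ = "." + Slash$ + "tutorial" + Slash$ + "Lesson11"
PRINT LessonPath$ '                show the path that was built
```

Building paths from a single separator variable keeps the operating system decision in one place.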
Directory Structure: The MKDIR Statement
The MKDIR statement is used to create a directory or subdirectory in the current working directory. The following command will create a directory called "MyFolder" inside the qb64 folder:
MKDIR "MyFolder" ' create subdirectory in current working directory
You can also supply a path to a valid subdirectory to create a directory anywhere on your drive regardless of your current working directory.
MKDIR "\RootFolder" ' create a directory on the root of the current drive
MKDIR ".\tutorial\Lesson11\NewFolder" ' path from current working directory
MKDIR "D:\MyFolder" ' create a directory on another drive
MKDIR is basically QB64's version of the Windows command line's MD command and the Linux command line's mkdir command.
Directory Structure: The RMDIR Statement
The RMDIR statement is used to remove, or delete, a directory or subdirectory. Before a directory can be deleted it must be empty; no files or subdirectories can exist within it. If you attempt to remove a directory that is not empty you'll get the error shown in Figure 1 below:
Figure 1: Directory not empty
The following lines of code will create and then remove a directory from the current working directory.
MKDIR "MyFolder"
PRINT "Folder created. Press any key to delete it."
SLEEP
RMDIR "MyFolder"
PRINT "Folder has been deleted."
RMDIR is basically QB64's version of the Windows command line's RD command and the Linux command line's rmdir command.
Note: A deleted directory will not get placed into the Windows Recycle Bin.
Directory Structure: The _DIREXISTS Statement
If you attempt to navigate to a directory that does not exist QB64 will issue an error as seen in Figure 2 below:
Figure 2: Non-existent directory
The _DIREXISTS statement will test for the existence of a directory and return a numeric value of -1 (true) if the directory is found or a numeric value of 0 (false) if the directory is not found. It's always a good idea to test for the existence of a directory before attempting any other directory or file related statements with it. The following lines of code will test for the existence of the "Lesson11" subdirectory within the "tutorial" subdirectory.
IF _DIREXISTS(".\tutorial\Lesson11") THEN ' does the folder exist?
PRINT "Folder found!" ' yes, inform user
ELSE ' no, the folder was not found
PRINT "Folder not found!" ' inform user
END IF
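The same test can guard the other directory statements. The following is a small sketch, not part of the original listing, that creates a folder only when it is missing and then navigates into it. The folder name "MyFolder" is just an example:

```basic
DIM Folder$ ' name of the folder to work with (example name)

Folder$ = "MyFolder"
IF NOT _DIREXISTS(Folder$) THEN ' is the folder missing?
    MKDIR Folder$ '               yes, create it in the current working directory
    PRINT "Folder created."
END IF
CHDIR Folder$ '                   safe to navigate now, the folder is known to exist
PRINT "Current directory: "; _CWD$
```

Because _DIREXISTS returns -1 or 0, NOT flips it cleanly between true and false.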
Directory Structure: The NAME...AS Statement
The NAME...AS statement is used to rename either a file or directory name.
NAME "HighScores.txt" AS "HighScores.old" ' rename a file
NAME "C:\GAMES" AS "C:\MYGAMES" ' rename a folder
The NAME...AS statement can also be used to move a directory and its contents from one location to another.
NAME "C:\User\Terry\Desktop\qb64\Project\" AS "C:\Games\Finished\NewGame\"
The directory named "Project" is getting renamed to "NewGame" and since the paths are different it and all of the files contained within it are getting moved from one location to another.
Directory Structure: The KILL Statement
The KILL statement is used to delete one or more files from the drive. The use of wildcards is also acceptable.
KILL "HiScore.txt" ' delete file from current working directory
KILL "*.tmp" ' delete all files with tmp extension
KILL ".\SupportFiles\Image.bak" ' delete file in specified path
KILL ".\backup\??04*.*" ' delete files with 3rd and 4th characters of 04
There are a few things you should be aware of when using the KILL statement.
- Files that are deleted are not moved to the Windows Recycle Bin.
- Open files (files in use) can't be deleted. The program using the file must close it first.
- Files marked as read only can't be deleted. They must first have the read only attribute disabled.
- The KILL statement can't be used to delete directories. Use RMDIR instead.
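Given those cautions, a careful program would check that a file is present before trying to delete it. Here is a minimal sketch, assuming a throwaway file named "Scratch.tmp" (a hypothetical name) in the current working directory:

```basic
DIM FileName$ ' the file to delete (hypothetical name)

FileName$ = "Scratch.tmp"
IF _FILEEXISTS(FileName$) THEN ' is the file there?
    KILL FileName$ '             yes, delete it permanently (no Recycle Bin)
    PRINT FileName$; " deleted."
ELSE '                           no, avoid the error KILL would generate
    PRINT FileName$; " not found, nothing to delete."
END IF
```

This does not protect against open or read only files; those cases would still generate an error.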
Directory Structure: The _FILEEXISTS Statement
If you attempt to work with a file that does not exist QB64 will issue an error message as seen in Figure 2 above outlined in the _DIREXISTS statement. The _FILEEXISTS statement will test for the existence of a file and return a numeric value of -1 (true) if the file is found or a numeric value of 0 (false) if the file is not found. It is always a good idea to test for the existence of a file before attempting any file related statements with it. The following lines of code will test for the existence of the "I_Exist.txt" file within the ".\tutorial\Lesson11\" path. (The file was included with the tutorial asset file.)
IF _FILEEXISTS(".\tutorial\Lesson11\I_Exist.txt") THEN ' does the file exist?
PRINT "File found!" ' yes, inform user
ELSE ' no, the file was not found
PRINT "File not found!" ' inform user
END IF
File Access: The OPEN Statement
The OPEN statement is used to open a file in a predetermined mode of operation.
OPEN Filename$ FOR mode AS #Filenumber&
Filename$ is the name of the file you wish to open and can be a literal string in quotes or a string variable that contains the name and path to the file.
The mode parameter denotes how the file is to be opened. The available OPEN modes are:
INPUT - read contents from a file (sequential access)
OUTPUT - write contents to a file (sequential access)
APPEND - write contents to a file beginning at the end of the file (sequential access)
BINARY - read and write to a file at any position (database access)
RANDOM - read and write records from a file at any position (database access)
The Filenumber& parameter is used to give the file a handle number to reference. It's possible to have more than one file open at a time (over 2 billion if necessary) and the value given to Filenumber& is used to reference the individual file.
Sequential access to a file means that when a file is opened it can only be read from or written to in a sequential, or line by line, style. Here we are opening a file for sequential read access:
OPEN ".\tutorial\Lesson11\I_Exist.txt" FOR INPUT AS #1 ' open for sequential read
When lines of data are read from this file the first line read will be at the top and last line read will be the bottom. There is no control of which line is read next. After a line is read in the following line becomes the next default line to be read in.
Here the file is being opened for sequential write access:
OPEN ".\tutorial\Lesson11\NewFile.txt" FOR OUTPUT AS #1
When lines of data are written to this file the lines will be written from top to bottom. There is no control of which line is written to next. After a line is written the following line becomes the next default line to be written.
Note: If opening a file for OUTPUT and the file already exists the existing contents will be deleted to make room for the new content.
The final sequential method of opening a file is for appending, or adding to the end of a file:
OPEN ".\tutorial\Lesson11\I_Exist.txt" FOR APPEND AS #1
This mode is the same as OUTPUT except that the file's contents are preserved and the writing of new lines begins at the end of the file.
We'll discuss the other two modes BINARY and RANDOM in a bit. For now let's discuss files that can be opened for sequential access. We'll start by creating a simple high score table that could be used in any game.
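A minimal sketch of such a high score writer, consistent with the description that follows, might look like this. Only the name "Fred Haise" comes from the lesson text; the other names and all of the score values are placeholders made up for illustration:

```basic
OPEN "HISCORES.TXT" FOR OUTPUT AS #1 ' create (or overwrite) the high score file
PRINT #1, "Fred Haise" '               first player's name
PRINT #1, 10000 '                      first player's score (placeholder value)
PRINT #1, "Player Two" '               second player's name (placeholder)
PRINT #1, 9000 '                       second player's score (placeholder value)
PRINT #1, "Player Three" '             third player's name (placeholder)
PRINT #1, 8000 '                       third player's score (placeholder value)
CLOSE #1 '                             finished, close the file
```

Each PRINT #1 places one value on its own line in the file.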
This code example creates a file called HISCORES.TXT in the current working directory. Keep in mind that if HISCORES.TXT already exists it will be overwritten and any existing contents will be destroyed. The handle number given to the file is #1 and from this point on is used to reference this open file. Line 2 of the code uses the PRINT statement to place information into the file:
PRINT #1, "Fred Haise"
By referencing the file handle of #1 the PRINT statement is instructed to place the string into the open file instead of to the screen. Opening the file in Windows Notepad reveals the file structure:
Figure 3: HISCORES.TXT opened in Notepad
The PRINT statements placed each piece of data on separate lines sequentially. When you are finished working with a file you need to close it and the CLOSE statement as seen in line 8 accomplishes this:
CLOSE #1
The file HISCORES.TXT opened with the handle of #1 is closed. If you do not close a file before your program ends execution there is the possibility of file corruption due to the fact that the file is left in an open state.
The next piece of code will be used to retrieve the data from the HISCORES.TXT file we just created.
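One way this could be written is sketched below. It assumes HISCORES.TXT holds exactly three name and score pairs, one value per line, and that the scores fit in an integer. Line positions in this sketch are not guaranteed to match the line numbers mentioned in the discussion:

```basic
DIM HSname$(3) ' player names read from the file
DIM HScore%(3) ' player scores read from the file
DIM Count% '     generic counter

IF _FILEEXISTS("HISCORES.TXT") THEN '      make sure the file is there first
    OPEN "HISCORES.TXT" FOR INPUT AS #1 '  open for sequential read
    FOR Count% = 1 TO 3 '                  three entries are assumed
        INPUT #1, HSname$(Count%) '        get name from file
        INPUT #1, HScore%(Count%) '        get score from file
        PRINT HSname$(Count%), HScore%(Count%)
    NEXT Count%
    CLOSE #1 '                             close the file
ELSE
    PRINT "HISCORES.TXT not found."
END IF
```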
Figure 4: High score data retrieved
In this example line 6 of the code opens the file for INPUT and gives it a file handle of #1. Lines 11 and 12 of the code use the INPUT statement to read data from the file:
INPUT #1, HSname$(Count%) ' get name from file
INPUT #1, HScore%(Count%) ' get score from file
By referencing the file handle of #1 the INPUT statement is told to get data from the open file instead of the keyboard.
Data can be appended, or added, to the file by opening it in APPEND mode. The following code will open the HISCORES.TXT file and add more lines of data to the end of it.
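A short sketch of appending, assuming HISCORES.TXT already exists in the one value per line format. The name and score added here are placeholders:

```basic
OPEN "HISCORES.TXT" FOR APPEND AS #1 ' open the file, existing contents are kept
PRINT #1, "Player Four" '              new name goes after the last existing line
PRINT #1, 7000 '                       new score (placeholder value)
CLOSE #1 '                             close the file
```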
Figure 5: Appended data
Opening the file once again in Notepad shows that the lines of data were added to the existing lines of data in the file.
File Access: The EOF Statement
In the code examples so far it is known how many pieces of data have been written to the HISCORES.TXT file. However, it's not always possible to know the exact number of lines contained within a file that need to be read in. The EOF statement can tell code when it has reached the end of a file. The following code will read the data contained in HISCORES.TXT but instead of setting up a loop with a known value it will use EOF to determine when the end of the file has been reached.
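A minimal sketch of this approach is shown here. The array names are assumptions carried over from the earlier INPUT lines, and line positions in the sketch may not match the line numbers mentioned in the discussion:

```basic
REDIM HSname$(0) ' dynamic array of names, grows as data is read
REDIM HScore%(0) ' dynamic array of scores, grows as data is read
DIM Count% '       generic counter

OPEN "HISCORES.TXT" FOR INPUT AS #1 '                open for sequential read
WHILE NOT EOF(1) '                                   keep going until end of file
    REDIM _PRESERVE HSname$(UBOUND(HSname$) + 1) '   make room for one more name
    REDIM _PRESERVE HScore%(UBOUND(HScore%) + 1) '   make room for one more score
    INPUT #1, HSname$(UBOUND(HSname$)) '             get name from file
    INPUT #1, HScore%(UBOUND(HScore%)) '             get score from file
WEND
CLOSE #1 '                                           close the file
FOR Count% = 1 TO UBOUND(HSname$) '                  show everything that was read
    PRINT HSname$(Count%), HScore%(Count%)
NEXT Count%
```

Index 0 of each array is left unused so that the entry count equals UBOUND.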
In this example we use dynamic arrays that grow in size depending on the amount of data read in from the file. Line 11 of the example:
WHILE NOT EOF(1)
only allows entry into the loop if the end of the file has not been reached. The EOF statement will return a value of -1 (true) when the end of the file has been reached and a value of 0 (false) when not at the end of the file. EOF(1) refers to the file that has been opened with a file handle of #1. Once inside the loop the dynamic arrays are increased in size and the data read in using the INPUT statements.
File Access: The WRITE Statement
There is a more efficient way to read and write data to and from a sequential file using the WRITE statement instead of the PRINT statement. The WRITE statement creates a Comma Separated Values, or CSV, file that can be read in more efficiently. First, we need to create a new HISCORES.TXT file written as a CSV file.
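A sketch of writing the file with WRITE could look like the following. As before, only "Fred Haise" comes from the lesson; the remaining names and all scores are placeholders:

```basic
OPEN "HISCORES.TXT" FOR OUTPUT AS #1 ' create (or overwrite) the file
WRITE #1, "Fred Haise", 10000 '        name and score on one line, comma separated
WRITE #1, "Player Two", 9000 '         placeholder entry
WRITE #1, "Player Three", 8000 '       placeholder entry
CLOSE #1 '                             close the file
```

Each WRITE #1 produces one line holding a quoted string and a number separated by a comma.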
Figure 6: Comma separated values
When opened in Notepad as seen in Figure 6 above strings are surrounded in quotes and numeric values are written as is. Most spreadsheet programs can read and write CSV files allowing the programmer a way to create very large CSV files for use with QB64 and other software. The example code below shows how a CSV file's data can be read while keeping the code as efficient as possible.
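Here is one possible sketch of reading that CSV data into a dynamic array of a user defined type. The type name HISCORE and the 20 character name length are assumptions; Pname and Pscore follow the names used in the discussion. Line positions in the sketch may not match the line number mentioned there:

```basic
TYPE HISCORE '               one high score entry (type name is an assumption)
    Pname AS STRING * 20 '   player name (length of 20 is an assumption)
    Pscore AS INTEGER '      player score
END TYPE

REDIM Score(0) AS HISCORE '  dynamic array of entries
DIM Count% '                 generic counter

OPEN "HISCORES.TXT" FOR INPUT AS #1 '                       open for sequential read
WHILE NOT EOF(1) '                                          until end of file
    REDIM _PRESERVE Score(UBOUND(Score) + 1) AS HISCORE '   make room for one more
    INPUT #1, Score(UBOUND(Score)).Pname, Score(UBOUND(Score)).Pscore
WEND
CLOSE #1 '                                                  close the file
FOR Count% = 1 TO UBOUND(Score) '                           show what was loaded
    PRINT RTRIM$(Score(Count%).Pname), Score(Count%).Pscore
NEXT Count%
```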
Figure 7: CSV file loaded
In line 15 of the example:
INPUT #1, Score(UBOUND(Score)).Pname, Score(UBOUND(Score)).Pscore
two variables were read in using a single INPUT statement. You can keep adding commas and variables as needed to read entire lines of CSV values in. The quotes surrounding the strings are automatically removed before the values are placed into string variables.
LINE INPUT can also be used to read data from sequential files but it's best not to use it with comma separated data. LINE INPUT will grab a line exactly as it appears in the file and place it into a string. Remember that LINE INPUT can only accept a string as a variable parameter. Try this little example program that uses LINE INPUT to read in the HISCORES.TXT CSV file that was just created.
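A small sketch of such a program, assuming HISCORES.TXT is in the current working directory. The variable name Text$ is made up for this example:

```basic
DIM Text$ ' holds one complete line from the file

OPEN "HISCORES.TXT" FOR INPUT AS #1 ' open for sequential read
WHILE NOT EOF(1) '                    until end of file
    LINE INPUT #1, Text$ '            grab the whole line, quotes and commas included
    PRINT Text$ '                     show it exactly as it appears in the file
WEND
CLOSE #1 '                            close the file
```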
Figure 8: LINE INPUT reading a CSV file
As Figure 8 illustrates the data brought in mirrors what was seen in Notepad when viewing the file. The programmer would have to perform extra string manipulation steps to parse the data into something meaningful.
File Access: The FREEFILE Statement
The FREEFILE statement returns an unused file handle number to be used when opening files. Up to this point we have been manually assigning #1 as the file handle in the example code listings. If another file would happen to be using that same file handle then your program would generate an error. For small programs assigning file handle values manually is probably alright. However, as your programs get larger and especially if you have subroutines and/or functions opening files then the possibility of using the same file handle number twice greatly increases. It's considered good programming practice to use the FREEFILE statement to return an unused file handle value for you.
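In practice that might look like the sketch below, where the variable name FileHandle& is an assumption made for this example:

```basic
DIM FileHandle& ' file handle number supplied by FREEFILE
DIM Text$ '       one line of text from the file

FileHandle& = FREEFILE '                        ask for an unused handle number
OPEN "HISCORES.TXT" FOR INPUT AS #FileHandle& ' open the file with that handle
WHILE NOT EOF(FileHandle&) '                    the same variable is used everywhere
    LINE INPUT #FileHandle&, Text$
    PRINT Text$
WEND
CLOSE #FileHandle& '                            close the file
```

Get the value from FREEFILE immediately before each OPEN, since the next free number changes as files are opened and closed.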
File Access: The LOF Statement
The LOF statement returns the size in bytes of a file's contents, or the length of the file. LOF will return the numeric value of zero if no data is present in the file. The following example shows how to use LOF to load an entire file into a string for searching. Notice also that the file was opened in BINARY mode which will be discussed later.
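One way to do this is sketched below. The search text and variable names are assumptions for illustration. The file is tested with _FILEEXISTS first because opening a missing file in BINARY mode would create an empty one:

```basic
DIM FileHandle& ' file handle number
DIM Contents$ '   the entire file held in a string
DIM Search$ '     text to look for (example value)

Search$ = "Fred Haise"
IF _FILEEXISTS("HISCORES.TXT") THEN
    FileHandle& = FREEFILE '                         get an unused handle
    OPEN "HISCORES.TXT" FOR BINARY AS #FileHandle& ' open for binary access
    Contents$ = SPACE$(LOF(FileHandle&)) '           make the string as long as the file
    GET #FileHandle&, , Contents$ '                  fill the string with the file's bytes
    CLOSE #FileHandle& '                             close the file
    IF INSTR(Contents$, Search$) THEN '              is the text somewhere in the file?
        PRINT Search$; " was found."
    ELSE
        PRINT Search$; " was not found."
    END IF
ELSE
    PRINT "HISCORES.TXT not found."
END IF
```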
Updating a Sequential File
The benefit of sequential files is their ease of use but the one major drawback is their sequential nature. You must read and write data from the top to bottom. This makes it very difficult to insert data somewhere in the middle of a sequential file. This can be done in a number of different ways but always entails extra work on the programmer's part. The example code below shows one method in which this can be done.
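A simplified sketch of one such method follows. It assumes HISCORES.TXT is in the CSV format produced by WRITE and is sorted from highest to lowest score. The temporary file name, the new player's name and score, and the plain variables used here are all assumptions, and the structure may differ from the lesson's own listing:

```basic
DIM Pname$, Pscore% '      entry read from the original file
DIM NewName$, NewScore% '  entry to insert (placeholder values below)
DIM Inserted% '            -1 (true) once the new entry has been written

NewName$ = "New Player"
NewScore% = 8500
OPEN "HISCORES.TXT" FOR INPUT AS #1 '       original file
OPEN "HISCORES.TMP" FOR OUTPUT AS #2 '      temporary file (assumed name)
WHILE NOT EOF(1)
    INPUT #1, Pname$, Pscore% '             read an original entry
    IF NOT Inserted% AND Pscore% <= NewScore% THEN
        WRITE #2, NewName$, NewScore% '     new entry belongs just before this one
        Inserted% = -1 '                    remember it has been written
    END IF
    WRITE #2, Pname$, Pscore% '             copy the original entry over
WEND
IF NOT Inserted% THEN WRITE #2, NewName$, NewScore% ' lowest score, add at the end
CLOSE #1
CLOSE #2
KILL "HISCORES.TXT" '                       delete the original file
NAME "HISCORES.TMP" AS "HISCORES.TXT" '     temporary file takes its place
```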
Figure 9: New player and score inserted
Figure 10: HISCORES.TXT with new information inserted
The example code shows a method of opening the original file for INPUT and another temporary file for OUTPUT. Each line of data read in from the original file is copied over to the temporary file. Line 17 of the code:
IF Score.Pscore <= NewScore% THEN ' is this score less than the new score?
chooses the right time to insert the new values into the temporary file. Once all the original lines of data have been copied from the original file to the temporary file the original file is deleted. The temporary file that was created is then renamed to the original file.
Random Access and Binary Files
QB64 also offers file access methods that store data in records, which can be accessed quickly through the use of record numbers. A file record has a set length and by knowing the length a simple calculation can be made to determine where any record is stored within this type of file, known as a database. The RANDOM and BINARY modes of file access allow the programmer to store information in a manner that makes retrieval fast no matter where the information is located within the file. However, these methods of file access are beyond the scope of these lessons because of their complexity, therefore sequential files will serve our purpose for the most part. If you are interested in pursuing database file creation using QB64 then the QB64 Wiki and the following statements and commands are where you need to start.
OPEN Filename$ FOR RANDOM AS #Filenumber%
OPEN Filename$ FOR BINARY AS #Filenumber%
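For the curious, here is a very small taste of record based access, offered only as a starting point and not as a full treatment. The file name, type name, field sizes and values are all assumptions made for this sketch:

```basic
TYPE PLAYERRECORD '          one fixed length record (names and sizes are assumptions)
    Pname AS STRING * 20 '   20 bytes
    Pscore AS INTEGER '      2 bytes
END TYPE

DIM Rec AS PLAYERRECORD ' record used for reading and writing
DIM FileHandle& '         file handle number

FileHandle& = FREEFILE
OPEN "PLAYERS.DAT" FOR RANDOM AS #FileHandle& LEN = LEN(Rec) ' record length = 22 bytes
Rec.Pname = "Fred Haise": Rec.Pscore = 10000
PUT #FileHandle&, 1, Rec '   write record number 1
Rec.Pname = "Player Two": Rec.Pscore = 9000
PUT #FileHandle&, 2, Rec '   write record number 2
GET #FileHandle&, 1, Rec '   jump straight back to record 1, no sequential reading
PRINT RTRIM$(Rec.Pname), Rec.Pscore
PRINT "Record 2 begins at byte"; (2 - 1) * LEN(Rec) + 1 ' the simple position calculation
CLOSE #FileHandle&
```

Because every record is the same length, the position of record N is (N - 1) times the record length plus 1.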
Your Turn
You were recently hired by the FBI as a computer programmer attached to the code cypher group. A code was recently intercepted on the Internet that the team believes they have decrypted but need you to write code to verify that their analysis is correct. They have supplied you with the data in a file named "POINTDATA.XYP" and have concluded that it contains polygon coordinates that spell out a hidden message. Your job is to write a program that takes this data and displays the hidden message on the screen. Here is the information the cypher team has given you.
The coded file "POINTDATA.XYP" is in the ".\tutorial\Lesson11" directory. You may want to open it in Notepad to view the file's structure while reading the next statements.
Within the file the team believes:
A single value greater than 0 contained on a line is a command to draw a polygon.
That single value contains the number of x,y coordinate pairs contained in the polygon.
The line following this single value contains the x,y coordinate pairs of the polygon.
A single value equaling -1 contained on a line is a command to paint the previous polygon.
The line following this -1 value contains the x,y coordinate pair where the paint color resides and the red, green, and blue values of the paint color.
A single value equaling 0 contained on a line means the end of the data file has been reached.
Given the above information the team has asked that you draw the polygon lines in white and to use white as the border color when painting the polygons in. Hurry! Time is critical! This message must get to the White House as soon as possible for critical analysis.
The team has asked that you save this critical program as MyQB64.BAS when finished. Don't fail in your task! The country is counting on you.
Click here to see the solution. (Don't cheat! Remember, your country needs you!)
Commands and Concepts Learned
New commands introduced in this lesson:
_DIREXISTS
_FILEEXISTS
CHDIR
MKDIR
RMDIR
KILL
NAME...AS
OPEN
INPUT (file statement)
OUTPUT
APPEND
BINARY
RANDOM
PRINT (file statement)
WRITE (file statement)
LINE INPUT (file statement)
FREEFILE
LOF
EOF
New concepts introduced in this lesson:
sequential file
Comma Separated Values file (CSV)
records
database