LESSON 19: Input and output
FOCUS QUESTION: How can I save and retrieve my data?
Data from the real world is often complicated by messy formats, missing data, and input errors. This lesson uses the MATLAB data import tools to read and manipulate real data. The lesson also introduces the MATLAB vectorized logical operations for manipulating data.
Contents
- DATA FOR THIS LESSON
- SETUP FOR LESSON 19
- EXAMPLE 1: Load the sleep diary data and output the number of rows and columns
- EXAMPLE 2: Create a diaries directory in the current directory
- EXAMPLE 3: Write the individual sleep diaries as tab-delimited text files
- EXAMPLE 4: Catch the error when trying to open a bad file
- EXAMPLE 5: Read a sleep diary containing a NaN
- EXAMPLE 6: Read a sleep diary containing an empty field
- EXAMPLE 7: Check myAlarm is valid
- EXAMPLE 8: Output the invalid positions in myAlarm
- EXAMPLE 9: Count the lines in a sleep diary using low-level MATLAB I/O
- SUMMARY OF SYNTAX
DATA FOR THIS LESSON
| File | Description |
| diaries.mat | contains the extracted, cleaned, and consolidated sleep diary data in MATLAB variables. The arrays have a column for each person, and the vectors have an element for each person. The values in column n correspond to the same person as the value in position n of each vector. The file contains the variables that Example 1 loads into the Workspace: bedTimes, dayCaffeine, gender, nightCaffeine, section, toSleepMinutes, useAlarm, and wakeTimes. |
| Doe_Jack_male.csv | contains a sample sleep diary. The third entry in the wake-up to alarm field is a NaN. |
| Doe_Jane_female.csv | contains a sample sleep diary. The third entry in the wake-up to alarm field is empty. |
SETUP FOR LESSON 19
- Set the Current Directory to Z:\working\MATLAB\Lesson19. (You will need to make a new directory for Lesson19.)
- Download diaries.mat from Blackboard or copy it from Lesson 10 to your Lesson19 directory.
- Download the data file Doe_Jack_male.csv to your Lesson19 directory.
- Download the data file Doe_Jane_female.csv to your Lesson19 directory.
- Create a new script called Lesson19Script.m. (Use File->New->Blank M-File from the main MATLAB menubar.) You will enter each of the examples in a new cell in this script.
EXAMPLE 1: Load the sleep diary data and output the number of rows and columns
Create a new cell in which you type and execute:
load diaries.mat;                         % Load the sleep diaries
[numDays, numDiaries] = size(bedTimes);   % How many rows and columns?
fprintf('\n\n%g Sleep diaries kept for %g days\n', numDiaries, numDays);
You should see bedTimes, dayCaffeine, gender, nightCaffeine, section, toSleepMinutes, useAlarm, wakeTimes, numDays, and numDiaries variables in your Workspace. You should also see the following output in the Command Window:
144 Sleep diaries kept for 21 days
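The two-output form of size used above returns the row count first and the column count second. A minimal sketch with a small made-up matrix (demoTimes and its values are illustrative assumptions, not the diary data):

```matlab
% Stand-in data: 3 days (rows) by 2 diaries (columns); the values are made up
demoTimes = [0.90 0.95; 0.92 0.97; 0.91 0.96];
[demoDays, demoDiaries] = size(demoTimes);    % rows first, then columns
fprintf('%g diaries kept for %g days\n', demoDiaries, demoDays);
numRowsOnly = size(demoTimes, 1);             % ask for a single dimension
```

Asking for one dimension with size(X, 1) or size(X, 2) is convenient when only the row or column count is needed.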
EXAMPLE 2: Create a diaries directory in the current directory
Create a new cell in which you type and execute:
directoryName = './diaries';
if ~isdir(directoryName)
    mkdir(directoryName);
else
    fprintf('The %s directory already exists\n', directoryName);
end;
You should see a directoryName variable in your Workspace Browser. The first time you execute this cell, a diaries directory will appear in your Current Directory. Subsequent executions of this cell produce the following output in the Command Window:
The ./diaries directory already exists
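Newer MATLAB releases document isfolder as the replacement for isdir. If your script may run on several versions, one possible alternative is the exist function, which returns 7 when the name refers to a directory. The sketch below uses a hypothetical scratch directory under tempdir so that it does not touch your Lesson19 directory:

```matlab
demoDir = fullfile(tempdir, 'lesson19_demo_dir');   % hypothetical scratch directory
if exist(demoDir, 'dir') ~= 7          % exist returns 7 for a directory
    mkdir(demoDir);
end
dirExists = (exist(demoDir, 'dir') == 7);   % true once the directory is there
rmdir(demoDir);                        % clean up the scratch directory
```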
EXAMPLE 3: Write the individual sleep diaries as tab-delimited text files
Create a new cell in which you type and execute:
sleepHours = (wakeTimes - bedTimes)*24;       % Also include hours of sleep
for k = 1:numDiaries                          % For each subject
    thisFile = ['./diaries/subject' num2str(k), '.txt'];
    fid = fopen(thisFile, 'w');               % Create a file
    fprintf(fid, '%g subject is a %s from section %g\n\n', ...
        k, gender{k}, section(k));            % Write gender and section info
    fprintf(fid, '%20s\t%20s\t%s\t%s\t%s\t%s\t%s\n', ...
        'Bed-time', 'Wake-time', '2Sleep', 'Hours', 'Alarm', 'Day', 'Night');
    for j=1:numDays                           % For each day
        fprintf(fid, '%s\t %s\t %d\t %6.3g\t %d\t%d\t%d\n', ...
            datestr(bedTimes(j, k), 0), datestr(wakeTimes(j, k), 0), ...
            toSleepMinutes(j, k), sleepHours(j, k), ...
            useAlarm(j, k), dayCaffeine(j, k), nightCaffeine(j, k));
    end;                                      % Done writing days for this subject
    fclose(fid);
end;                                          % Done with all of the subjects
You should see sleepHours, thisFile, fid, k, and j variables in your Workspace Browser. You should also see an individual sleep diary file for each subject in the diaries subdirectory.
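The factor of 24 in Example 3 converts days to hours, on the assumption that bedTimes and wakeTimes hold MATLAB serial date numbers (the calls to datestr suggest this). A small sketch with made-up times shows the arithmetic and what one formatted row looks like; the demo names are illustrative only:

```matlab
demoBed  = datenum(2009, 9, 22, 22, 20, 0);   % made-up bed time: 22:20
demoWake = datenum(2009, 9, 23,  5, 49, 0);   % made-up wake time: 05:49 next day
demoHours = (demoWake - demoBed)*24;          % difference in days times 24 hours
demoLine = sprintf('%s\t%s\t%6.3g\n', ...
    datestr(demoBed, 0), datestr(demoWake, 0), demoHours);
fprintf('%s', demoLine);                      % one tab-delimited row
```

Building the row with sprintf first is a convenient way to inspect a format string before writing a whole file with fprintf.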
EXAMPLE 4: Catch the error when trying to open a bad file
Create a new cell in which you type and execute:
badFile = 'BadName.csv';
try
    fid = fopen(badFile);      % Trying to open a non-existent file returns -1
    fclose(fid);               % fclose(-1) raises the error
catch theError
    fprintf('%s for file %s\n', theError.identifier, badFile);
end;
The try-catch block lets you handle errors without terminating your script. You should see badFile and fid variables in the Workspace Browser and the following output in your Command Window:
MATLAB:FileIO:InvalidFid for file BadName.csv
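Because fopen itself does not raise an error when it fails, another option is to test the returned handle directly. The sketch below uses the two-output form of fopen, whose second output is a system message describing the failure; it assumes that BadName.csv does not exist in the current directory:

```matlab
badFile = 'BadName.csv';                 % assumed not to exist
[fid, message] = fopen(badFile);         % message describes any failure
if fid == -1
    fprintf('Could not open %s: %s\n', badFile, message);
else
    fclose(fid);                         % only close a valid handle
end
```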
EXAMPLE 5: Read a sleep diary containing a NaN
Create a new cell in which you type and execute:
fName = 'Doe_Jack_male.csv';                  % File is in the current directory
fid = fopen(fName);                           % The fid is a handle to the open file
formatString = '%s %s %n %s %n %n %n';        % Assume 7 items on each line
dataJohn = textscan(fid, formatString, 'HeaderLines', 1, 'Delimiter', ',');
fclose(fid);
You should see fName, fid, formatString, and dataJohn variables in the Workspace Browser.
EXAMPLE 6: Read a sleep diary containing an empty field
Create a new cell in which you type and execute:
fName = 'Doe_Jane_female.csv';                % File is in the current directory
fid = fopen(fName);                           % The fid is a handle to the open file
formatString = '%s %s %n %s %n %n %n';        % Assume 7 items on each line
dataJane = textscan(fid, formatString, 'HeaderLines', 1, ...
    'Delimiter', ',');
fclose(fid);
You should see fName, fid, formatString, and dataJane variables in the Workspace Browser.
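Examples 5 and 6 differ only in how the missing alarm value is written in the file. With its default settings, textscan fills an empty numeric field with NaN, so both cases end up as NaN in the fifth cell. The sketch below illustrates this without a file by passing text directly to textscan; the three rows are made up in the style of the diary files:

```matlab
% Three made-up diary rows: a valid alarm value, a literal NaN, an empty field
sampleText = sprintf('%s\n%s\n%s', ...
    '9/23/2009,22:20,10,5:49,0,1,0', ...
    '9/24/2009,22:30,120,6:00,NaN,1,0', ...
    '9/25/2009,22:00,5,4:30,,1,0');
formatString = '%s %s %n %s %n %n %n';
demoData = textscan(sampleText, formatString, 'Delimiter', ',');
demoAlarm = demoData{5};        % numeric column for the alarm field
disp(isnan(demoAlarm)');        % which rows have a missing alarm value
```

Each element of the returned cell array holds one column: the %s columns are cell arrays of character arrays and the %n columns are numeric vectors.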
EXAMPLE 7: Check myAlarm is valid
Create a new cell in which you type and execute:
myAlarm = dataJane{5}; % myAlarm for this diary is in 5th column of data;
if sum(isnan(myAlarm))
    fprintf('Alarm field has missing or NaN values\n');
end;
if sum(myAlarm ~= 0 & myAlarm ~= 1)
    fprintf('Alarm field is not all 0''s or 1''s\n');
end;
You should see a myAlarm variable in the Workspace Browser and the following output.
Alarm field has missing or NaN values
Alarm field is not all 0's or 1's
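The tests in Example 7 rely on vectorized logical operations: each comparison produces a logical array the same size as myAlarm, and sum counts the true entries. A minimal sketch on a made-up vector (demoAlarm is illustrative, not the diary data) shows the intermediate masks; any is an alternative to sum when only a yes or no answer is needed:

```matlab
demoAlarm = [0; 1; NaN; 0; 1; 9];                % made-up values
missingMask = isnan(demoAlarm);                  % true only at position 3
invalidMask = demoAlarm ~= 0 & demoAlarm ~= 1;   % true for the NaN and for the 9
numMissing = sum(missingMask);                   % number of true entries
numInvalid = sum(invalidMask);
hasMissing = any(missingMask);                   % true if at least one NaN
fprintf('%g missing, %g not 0 or 1\n', numMissing, numInvalid);
```

Note that NaN is not equal to anything, so a NaN entry also fails the 0-or-1 test.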
EXAMPLE 8: Output the invalid positions in myAlarm
Create a new cell in which you type and execute:
myAlarm = dataJane{5}; % myAlarm for this diary is in 5th column of data;
nanPositions = find(isnan(myAlarm)); % Find the positions of NaN's
if ~isempty(nanPositions)
    fprintf('Alarm field has NaN''s in positions ');
    fprintf(' %g', nanPositions);
    fprintf('\n');
end;
badPositions = find(myAlarm ~= 0 & myAlarm ~= 1);
if ~isempty(badPositions)
    fprintf('Alarm field has bad values in positions ');
    fprintf(' %g', badPositions);
    fprintf('\n');
end;
You should see myAlarm, nanPositions, and badPositions variables in the Workspace Browser and the following output.
Alarm field has NaN's in positions 3
Alarm field has bad values in positions 3 8 14
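Position 3 appears in both lists because a NaN also fails the 0-or-1 test. If you want the out-of-range entries reported separately from the missing ones, one possible approach is to exclude the NaN positions from the mask. The sketch below uses a made-up vector and also shows logical indexing to retrieve the offending values themselves:

```matlab
demoAlarm = [0; 1; NaN; 0; 1; 9; 0; 2];          % made-up values
outOfRange = ~isnan(demoAlarm) & demoAlarm ~= 0 & demoAlarm ~= 1;
outPositions = find(outOfRange);                 % positions 6 and 8
outValues = demoAlarm(outOfRange);               % the values 9 and 2
fprintf('Position %g holds %g\n', [outPositions'; outValues']);
```

Passing a two-row matrix to fprintf works here because fprintf consumes its arguments column by column and reuses the format for each pair.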
EXAMPLE 9: Count the lines in a sleep diary using low-level MATLAB I/O
Create a new cell in which you type and execute:
fName = 'Doe_Jane_female.csv';        % File is in the current directory
fid = fopen(fName);                   % The fid is a handle to the open file
lineCount = 0;
tLine = fgetl(fid);                   % Read the first line
while ischar(tLine)                   % fgetl returns -1 (not text) at end of file
    lineCount = lineCount + 1;        % Successfully read a line
    fprintf('%g: %s\n', lineCount, tLine);   % Output it
    tLine = fgetl(fid);               % Get another line
end;
fclose(fid);                          % Must close the file when done
You should see fName, fid, lineCount, and tLine variables in your workspace and the following output in the Command Window:
1: Wake-up Date,Bed-time (24-hour time),Minutes to fall asleep,Wake-up time (24 -hour time),"Alarm wake-up? (1 = Yes, 0 = No)","Daytime caffeine? (1 = Yes, 0 = No)","Evening caffeine? (1 = Yes, 0 = No)"
2: 9/23/2009,22:20,10,5:49,0,1,0
3: 9/24/2009,22:30,120,6:00,1,1,0
4: 9/25/2009,22:00,5,4:30,,1,0
5: 9/26/2009,22:00,5,7:30,0,1,0
6: 9/27/2009,0:58,10,7:18,0,1,0
7: 9/28/2009,23:00,10,6:00,1,1,0
8: 9/29/2009,22:10,10,6:00,1,1,0
9: 9/30/2009,22:50,10,6:00,9,1,0
10: 10/1/2009,21:38,40,6:00,0,1,0
11: 10/2/2009,23:00,10,6:00,1,1,0
12: 10/3/2009,22:00,10,7:18,0,1,0
13: 10/4/2009,22:00,10,6:00,0,1,0
14: 10/5/2009,21:50,10,5:00,0,1,0
15: 10/6/2009,22:30,60,6:00,2,1,0
16: 10/7/2009,22:40,10,6:00,1,1,0
17: 10/8/2009,22:25,10,6:00,1,1,0
18: 10/9/2009,22:35,10,6:00,1,1,0
19: 10/10/2009,23:00,5,7:41,0,1,0
20: 10/11/2009,22:00,5,7:30,0,1,0
21: 10/12/2009,22:30,5,6:00,1,1,0
22: 10/13/2009,0:15,5,6:15,0,1,0
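One detail of fgetl is worth knowing when a file may contain blank lines: fgetl returns an empty character array for a blank line and the number -1 only at the end of the file. A loop that tests ischar(tLine) therefore counts blank lines and stops only at the true end. The sketch below writes a hypothetical three-line scratch file (the second line is blank) and counts its lines:

```matlab
demoFile = [tempname '.txt'];                 % hypothetical scratch file
fid = fopen(demoFile, 'w');
fprintf(fid, 'first line\n\nthird line\n');   % the second line is blank
fclose(fid);

fid = fopen(demoFile);
lineCount = 0;
tLine = fgetl(fid);
while ischar(tLine)              % a blank line is still a character array
    lineCount = lineCount + 1;
    tLine = fgetl(fid);
end
fclose(fid);
delete(demoFile);                % remove the scratch file
fprintf('%g lines read\n', lineCount);
```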
SUMMARY OF SYNTAX
| MATLAB syntax | Description |
| fclose(fid) | Closes the file represented by the handle fid, causing system resources to be released and allowing other programs to access the file. |
| tline = fgetl(fid) | Returns the next line of the open file represented by fid after discarding newline characters. |
| Y = find(X) | Returns the indices of the nonzero elements of X. |
| fid = fopen(filename) | Readies the file represented by filename for reading (opens the file) and returns a handle for future operations. If the fid handle is -1, MATLAB could not successfully open the file. |
| fid = fopen(filename, 'w') | Readies the file represented by filename for writing (opens the file) and returns a handle for future operations. If the fid handle is -1, MATLAB could not successfully open the file. |
| Y = isnan(X) | Returns a logical array that is the same size as X. The array Y has 1's (true) where the corresponding elements of X are NaN and 0's elsewhere. |
| isdir(filename) | Returns 1 (true) if filename represents a directory and 0 (false) otherwise. |
| mkdir(directoryName) | Creates a directory with the name directoryName. |
| data = textscan(fid, formatString, 'HeaderLines', 1, 'Delimiter', ',') | Returns a cell array containing the result of a formatted read. The formatString specifies the format. This example specifies that the first line should be treated as a header and ignored. The values in the file are treated as comma-delimited. |
| The while loop: while (expression) statements end | Repeatedly executes statements as long as expression is true (nonzero). |
This lesson was written by Kay A. Robbins of the University of Texas at San Antonio and last modified on 31-Dec-2010. Please contact krobbins@cs.utsa.edu with comments or suggestions. The image is of a 5.25" floppy diskette taken by Andreas Frank on July 29, 2005 and available at http://commons.wikimedia.org/wiki/File:5.25%22-Diskette.jpg.