Tuesday, May 8, 2012

FILE HANDLING IN C

So, finally a post after some time!
In this post, I'll be discussing simple operations with files in C, including creating, reading and updating.

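Starting with some basic stuff: C views each file simply as a sequential stream of bytes, i.e. whenever a file is opened in a C program, a stream is associated with the file. These streams provide communication channels between files and programs. The end of a file is reported to the program as an 'end-of-file' (EOF) condition. Note that no special marker byte is stored at the end of an ordinary disk file; the operating system simply knows the file's length. When typing input at a terminal, you signal end-of-file with "Ctrl+D" in Linux and "Ctrl+Z" (followed by Enter) in Windows. You will find use of this end-of-file condition in many of your C programs that work with files.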
 
It is quite interesting to know that whenever we execute any C program, three streams are automatically opened: 'Standard Input' (lets the program read data, normally from the keyboard), 'Standard Output' (prints data, normally on the screen) and 'Standard Error' (prints error messages). These three streams are manipulated using the file pointers 'stdin', 'stdout' and 'stderr', respectively.
An output stream such as stdout is usually buffered, i.e. data written to the stream is stored in a buffer and may not be printed immediately. Typically stdout is flushed at every newline when it is connected to a terminal and only when the buffer is full otherwise, while stderr is unbuffered by default. To force pending output to be written right away, we can use the fflush() function.

Opening a file returns a pointer to the 'FILE' structure declared in <stdio.h> (described in detail in the previous post), which contains the information used to process a file. On most systems this structure includes a 'file descriptor' that is used as an index into an operating system array (the open file table). Each entry of this array (an FCB, file control block) is used to administer a particular file.

Now, time for some practical things:
For each file operation, I'll be discussing only the functions used, not whole programs.

So, let's start with creating a file:
The standard library function 'fopen' is used to create a file or to refer to an existing file through a file pointer.
Syntax:        fp = fopen("filename", "open-mode");
The above statement opens the file 'filename' in the specified mode and associates the file pointer fp with it. If the file doesn't exist, the "w" and "a" modes create a new one, whereas "r" fails.
The second argument of fopen specifies the mode in which we want to open the file. There are mainly three modes: "r" (reading), "w" (writing; discards the previous contents of the file) and "a" (append). Extensions of these modes also exist, like "r+" (for update, i.e. both reading and writing) and "rb", "wb", "ab", where the 'b' character is added to indicate that the file is a binary file.
fopen returns NULL if the file can't be opened (for example, the file doesn't exist in "r" mode, you don't have permission, or a new file can't be created).

But what to do after opening the file?
Our next operation is reading from and writing to a file, which is done using the functions 'fscanf' and 'fprintf'.
Both these functions are similar to scanf and printf, except that they take the required file pointer as their first argument. If the file pointer is given as stdin (for fscanf) or stdout (for fprintf), they work exactly like scanf and printf.
Syntax:     fscanf(fp,"%d",&number)
        fprintf(fp,"%d",number)
fscanf returns EOF if end-of-file or a read error occurs before any data item is read; otherwise it returns the number of data items successfully read.

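Other functions also exist for reading a character or a string from a file. These are fgetc and fgets. fgetc works like getchar, while fgets is a safer counterpart of gets that also takes the size of the string and the file pointer.
Syntax:        fgetc(fp)
        fgets(str,max_size,fp)
str refers to the character array in which the string will be stored; max_size, as the name indicates, is the size of that array, so fgets reads at most max_size-1 characters and stops early at a newline (which it keeps) or at end-of-file. Note that 'fgets' can therefore be used to read a file line by line, provided the array is large enough to hold a whole line. fgetc returns EOF and fgets returns NULL when nothing more can be read.
Similarly, for writing, we have the analogous 'fputc' function (and 'fputs' for strings).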

Now, while reading a file, we may need to move the file position back to the start of the file. This can be done with the 'rewind' function.
Syntax:        rewind(fp)

All the functions I've discussed above are mainly used for sequential files, i.e. files where data is stored sequentially as text (character by character). Therefore, in this type of file, it is very difficult to edit a value in place, because even numbers are stored as characters (i.e. they don't occupy a fixed number of bytes, such as the 4 bytes of a typical int, so different numbers have different sizes: 7 takes one character but 12345 takes five). Therefore, editing can result in overwriting the neighbouring data.
So, we use another type of file known as a random-access file. In this type of file, the length of every record is fixed, i.e. a variable of each data type is stored in its full size in bytes. This type of file is widely used wherever records need to be updated in place.
Functions used for reading from and writing to a random-access file are 'fread' and 'fwrite'.
Syntax:        fread(&n,sizeof(n),1,fp)
n refers to the variable in which we want to store the data read, the second argument is the size in bytes of one element, and '1' refers to the number of elements of that size to be read; a larger count is mainly used to read multiple elements of an array.
Similar syntax is for fwrite:         fwrite(&n,sizeof(n),1,fp)
If 'n' is an array, fwrite can be used to write multiple elements of this array: pass the array itself (without '&'), the size of one element, and the required number in place of '1'.
fwrite returns the number of items it successfully wrote. If this number is less than the 3rd argument, there is a problem. The same holds for fread, where a smaller number can also simply mean that the end of the file was reached.

Finally, the most important thing: how do we reach the record that we want to edit?
The function used for moving the file position is 'fseek'.
Syntax:        fseek(fp,long int offset,int whence)
offset refers to the number of bytes to seek from the location 'whence' in the file. The argument whence can have three values: SEEK_SET (start of the file), SEEK_CUR (current position in the file) and SEEK_END (end of the file). With fixed-size records, record number k (counting from 0) starts k * (size of one record) bytes from SEEK_SET.
fseek returns a non-zero value if the seek operation can't be performed.

These are all the functions that you will require for working with files. Any queries regarding the same are welcome.

And like always, hope this post is helpful :)
