There are times when you need to quickly change a field delimeter in a flat file and for this, once again the Linux command line saves the day. You can use the tr command to quickly and easily replace one delimeter with another.
In this example we will change the comma delimeter to a pipe.
tr ‘,’ ‘|’ < comma-delimitedfile.txt > pipe-delimitedfile.txt
#tr command changes the delimiter from a comma to a pipe in this example.
The syntax works like this.
tr is the command or “function” that you are calling
‘,’ is the original delimeter inside of single quotes
“|” is the new delimeter that we want to replace inside of single quotes.
< gets the input from the input file
> outputs the output to a new file.
If you want to change the order of the fields in the flat file here is how you would do this.
awk -F, ‘BEGIN { OFS=”,” }
{
print $1,$2,$3,$4,$5,$6,$7,$8,$13,$12,$9,$10,$11
}’ inputfile.txt > standardized.txt
# change the variables $4,$1,$3, etc to the desired column order
#Usage: Substitute inputfile.txt on line 4 with your database file.
#Recommended Standard is:
#email,fname,lname,address,city,state,zip,phone,dob,gender,source,ip,datetime
#change OFS=”,” from , to desired delimiter if required.
Code explained:
Copy the code to a text file and save it.
Use ./textfilename to execute the script.
awk (also written as Awk and AWK) is a utility that enables a programmer to write tiny but effective programs in the form of statements that define text patterns that are to be searched for in each line of a document and the action that is to be taken when a match is found within a line.
The -F flag tells awk that we want to work with the field seperator.
OFS= comma delimiter
print $1, $2 etc is the field order where $1 is field 1 and $2 is field 2 etc. So if you want to re-arrange that order simply substitute them in the order they appear in the script. For example if you change:
print $1,$2,$3,$4,$5,$6,$7,$8,$13,$12,$9,$10,$11
to
print $2,$1,$3,$4,$5,$6,$7,$8,$13,$12,$9,$10,$11
The second field would then become the first field and the first field would be come the second field when output