Import Examples

The following examples illustrate importing data from a variety of formats into the Country table in the sample Employee database (located in $firebird/firebird/examples/empbuild/employee.fdb in a typical Firebird installation).

Delimited Text

In a delimited text file, data records are separated by line breaks, while the fields within each record are separated consistently by a single 7-bit character: either a TAB or a rarely used printable character such as the "pipe" symbol (|, ASCII 124/0x7C).

Your delimited text data file might contain data like this:

USA|Dollar

England|Pound

Canada|CdnDlr

Switzerland|SFranc

Japan|Yen

Italy|Euro

France|Euro

Germany|Euro

Australia|ADollar

Hong Kong|HKDollar

Netherlands|Euro

Belgium|Euro

Austria|Euro

Fiji|FDollar

A control file for the import could be:

options db=127.0.0.1:/opt/firebird/examples/empbuild/employee.fdb

options file=countries.txt

options direction=i 

options passwd=masterkey 

options user=SYSDBA

options variables=2

options delimiter=|

UPDATE OR INSERT INTO COUNTRY (COUNTRY,CURRENCY) VALUES (?,?)
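
To make the record layout concrete, here is a minimal Python sketch of how one pipe-delimited line breaks down into two field values, one for each ? parameter in the statement. It only illustrates the file format and is not a description of how dbFile works internally; the function name split_delimited is hypothetical.

```python
def split_delimited(line, delimiter="|"):
    """Split one delimited record into its field values.

    The line break that terminates the record is removed first, so it
    does not end up inside the last field.
    """
    return line.rstrip("\r\n").split(delimiter)


# A few records taken from the sample data above.
sample = "USA|Dollar\nEngland|Pound\nHong Kong|HKDollar\n"

# One list of field values per record; blank lines are skipped.
records = [split_delimited(line) for line in sample.splitlines() if line]

for country, currency in records:
    print(country, currency)
```

Each record yields exactly two values, which is what options variables=2 declares for the import.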

.CSV Format

OpenOffice, Microsoft Excel and many other data storage applications can export data as plain text in comma-separated values ("CSV") format.  The characteristics of CSV include double-quoting on string values and fields separated by commas or some other configurable separator character.

For example, an input file for our COUNTRY table, using a semicolon as the field separator, might look like this:

"COUNTRY";"CURRENCY"

"USA";"Dollar"

"England";"Pound"

"Canada";"CdnDlr"

"Switzerland";"SFranc"

"Japan";"Yen"

"Italy";"Euro"

"France";"Euro"

"Germany";"Euro"

"Australia";"ADollar"

"Hong Kong";"HKDollar"

"Netherlands";"Euro"

"Belgium";"Euro"

"Austria";"Euro"

"Fiji";"FDollar"

The control file could be:

options excel quote=" delimiter=; direction=i file=countries.txt 

options variables=2 user=SYSDBA passwd=masterkey 

options db=127.0.0.1:employee 

UPDATE OR INSERT INTO COUNTRY (COUNTRY,CURRENCY) VALUES (?,?) 
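
If you want to check a CSV file before importing it, the quote and delimiter settings above correspond closely to the quotechar and delimiter arguments of Python's standard csv module. The sketch below is an illustration under that assumption, not part of dbFile. It treats the first row as a header purely for the sake of the example; how dbFile itself handles that row should be confirmed in a test environment.

```python
import csv
import io

# The first lines of the sample file: semicolon-separated, double-quoted.
sample = '"COUNTRY";"CURRENCY"\n"USA";"Dollar"\n"Hong Kong";"HKDollar"\n'

# delimiter and quotechar mirror the delimiter=; and quote=" options.
reader = csv.reader(io.StringIO(sample), delimiter=";", quotechar='"')
rows = list(reader)

# Assumption for this sketch only: the first row holds column names.
header, data = rows[0], rows[1:]

for country, currency in data:
    print(country, currency)
```

The reader strips the surrounding double quotes, so each data row again provides two plain string values.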

Fixed Position Text Records

In a fixed-length format, there are no fields of variable length and no delimiters.  Interpretation of this input data as fields and records depends on two attributes:

1. The starting position of each field
2. The size of each field (number of characters)

Position and Size Arguments

A specialised use of embedded directive descriptors tells dbFile these two required attributes for each field in the fixed position text record.  The interpolated clause (one for each input field) has the form

{position=p size=s}

where p is the 1-based start position in the record and s is the maximum number of characters in the field. For example, a record with the form

Switzerland    SFranc

has two fields, the first defined by {position=1 size=15}, the second by {position=16 size=10}.
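
The arithmetic behind position and size can be sketched in Python as follows. The helper extract_field is hypothetical, and stripping the trailing padding is an assumption made for readability here rather than documented dbFile behaviour.

```python
def extract_field(record, position, size):
    """Return the field that starts at 1-based `position` and spans
    at most `size` characters, with trailing padding removed."""
    start = position - 1  # convert the 1-based position to a 0-based index
    return record[start:start + size].rstrip()


# 15 characters for the country, followed by 10 for the currency.
record = "Switzerland".ljust(15) + "SFranc".ljust(10)

country = extract_field(record, position=1, size=15)
currency = extract_field(record, position=16, size=10)
print(country, currency)
```

Because the first field occupies positions 1 to 15, the second field necessarily starts at position 16.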

Suppose your countries.txt file contains fixed position data such as the following:

USA            Dollar     

England        Pound      

Canada         CdnDlr     

Switzerland    SFranc     

Japan          Yen        

Italy          Euro       

France         Euro       

Germany        Euro       

Australia      ADollar    

Hong Kong      HKDollar   

Netherlands    Euro       

Belgium        Euro       

Austria        Euro       

Fiji           FDollar    

A control file for this import could be:

options direction=i file=countries.txt variables=2 

options user=SYSDBA passwd=masterkey db=127.0.0.1:employee  

UPDATE OR INSERT INTO COUNTRY (COUNTRY {position=1 size=15},  

  CURRENCY {position=16 size=10}) VALUES (?,?)

The variables= option specified here indicates that the DSQL statement has two parameters.

Fixed Position Data without Line Breaks

Certain fixed position data formats do not separate records by line breaks: the data runs as one continuous string.  An additional option keyword is available for specifying the record length for such formats:

lrecl=nn

where nn is the length of one record.

For fixed position input in certain character sets and/or file formats, nn could be the byte count, rather than the character count.  If you have non-ANSI character sets or platform-specific file formats in the picture, experimentation in a test environment is strongly recommended, to establish the specifics of the case.
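
As a rough way to explore this in a test environment, the Python sketch below cuts a continuous byte string into records of a given length and shows how the byte count and the character count of the same field can differ. The helper split_records and the choice of UTF-8 are assumptions made for illustration; they are not taken from dbFile.

```python
def split_records(data, lrecl):
    """Cut a continuous byte string into records of `lrecl` bytes each."""
    if len(data) % lrecl != 0:
        raise ValueError("data length is not a multiple of lrecl")
    return [data[i:i + lrecl] for i in range(0, len(data), lrecl)]


def make_record(country, currency):
    """Build one 25-character record: 15 for country, 10 for currency."""
    return country.ljust(15) + currency.ljust(10)


# Two records run together with no line break between them.
stream = (make_record("USA", "Dollar") + make_record("England", "Pound")).encode("ascii")
records = [r.decode("ascii") for r in split_records(stream, 25)]

# With a multi-byte encoding, the same field has more bytes than characters.
padded = "Österreich".ljust(15)
char_count = len(padded)
byte_count = len(padded.encode("utf-8"))
print(char_count, byte_count)
```

For pure ASCII data the two counts agree, so a record length of 25 works either way. Once a multi-byte character appears, a length expressed in characters no longer matches the length in bytes, which is why testing with your actual data is advisable.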