Fun with Firebird and Python
Pavel Císař, IBPhoenix
Firebird Conference 2012, Luxenbourg

Pavel Císař, IBPhoenix
Firebird Conference 2012, Luxenbourg
FDB – pure Python driver with no special dependencies (just fbclient library).
Supports Firebird 2.0+
Extremely easy to use, yet powerful
- Data (types, automatic conversions, stream BLOBs)
- SQL
- Transactions (multiple per connection, DT, parameters)
- Firebird Events
- Firebird Services
Analyze database content
After some unfortunate event we have two (or more) copies of the same database where one (or all) has some rows missing, and we need to assemble one good database.
Get primary keys from single table from both copies as sets.
>>> import fdb
>>> con1 = fdb.connect(dsn='db1',user='sysdba',password='pas')
>>> con2 = fdb.connect(dsn='db2',user='sysdba',password='pas')
>>> c1 = con1.cursor()
>>> c2 = con2.cursor()
>>> c1.execute("select pk from tbl")
>>> c2.execute("select pk from tbl")
>>> pks1 = set(c1)
>>> pks2 = set(c2)
Is there any difference?
>>> s1 == s2
False # yes, it is
Do they have something in common at all?
>>> s1.isdisjoint(s2)
False # yes, they have
Do both miss something or just only one?
>>> s1 < s2
False
>>> s1 > s2
False # Bad day, we have to merge
Rows missing in both db's
>>> miss1 = s2 - s1
>>> miss2 = s1 - s2
How many?
>>> print len(s1), len(miss1)
999866 100
>>> print len(s2), len(miss2)
950000 49966
It seems that most efficient is to take everything from db1 and add 100 missing rows from db2.
# helper function
def read_rows(c, pks):
for pk in pks:
c.execute("select * from tbl where pk = ?",pk)
for out_row in c:
yield out_row
# Prepare output
con3 = fdb.connect(dsn='db3', user='sysdba', password='pas')
c3 = con3.cursor()
insert = 'insert into tbl values (?,?)'
# Move data from A
c1.execute('select * from tbl')
c3.executemany(insert,c1)
# Move data from B
c3.executemany(insert,read_rows(c2,miss1))
Now it's time for rewrite into reusable function(s) / module... (see fbmerge.py)
>>> import fdb
>>> from fbmerge import merge_table
>>> c_A = fdb.connect(dsn='db1.fdb',user='sysdba',password='pas')
>>> c_B = fdb.connect(dsn='db2.fdb',user='sysdba',password='pas')
>>> c_T = fdb.connect(dsn='db3.fdb',user='sysdba',password='pass')
>>> merge_table(c_A, c_B, c_T, 'TABLE_A', 'PK_A')
>>> merge_table(c_A, c_B, c_T, 'TABLE_B', 'PK_B')
With few additional lines we can automate the whole process... (see merge_all.py)
Share the code with others! Via Firebird project, PyPI etc.
Analyze Firebird log
Which is big (3MB for example) and full of entries, most of them repeating, but some are special.
Typical needles in the haystack situation.
SOMESERVER Wed Apr 06 16:33:11 2011
INET/inet_error: read errno = 10054
SOMESERVER Wed Apr 06 16:33:12 2011
Unable to complete network request to host "SOMESERVER".
Error reading data from the connection.
SOMESERVER Wed Apr 06 16:36:16 2011
Shutting down the server with 1 active connection(s) to
1 database(s), 0 active service(s)
SOMESERVER Mon Nov 21 01:29:13 2011
SERVER/process_packet: broken port, server exiting
SOMESERVER Wed Sep 21 14:45:26 2011
Database: MYDATABASE
internal gds software consistency check (page in use
during flush (210), file: cch.cpp line: 3715)
Feed lines from log:
>>> log_file = open("firebird.log")
>>> # File is an iterable object
Filter out empty lines (using generator expression):
>>> non_empty = (line for line in log_file if line.lstrip())
>>> non_empty
<generator object <genexpr> at 0xe79140>
Group lines to "entries":
# Generator function
def group_entries(lines):
entry = []
for line in lines:
if not line[0].isspace():
if entry:
yield tuple(entry)
entry = []
entry.append(line)
if entry:
yield tuple(entry)
# Generator expression
entries = (x for x in group_entries(non_empty))
Parse entries:
from datetime import datetime
def parse_entries(entries):
f = '%a %b %d %H:%M:%S %Y'
for entry in entries:
(server,ts) = entry[0].split('\t')
ts = datetime.strptime(ts.strip(),f)
msg = [m.strip() for m in entry[1:]]
key = msg[0]
if key.startswith('Database:') and len(msg) > 1:
key = msg[1]
yield (ts, key, msg)
tuples = (x for x in parse_entries(entries))
Convert entries to dictionaries
columns = ['timestampt','key','msg']
log = (dict(zip(columns,t)) for t in tuples)
As we packaged all in functions in single file (fblog.py), we can:
>>> from fblog import *
>>> show(frequency(fblog()))
>>> show(frequency(normalize_keys(fblog())))
>>> show(frequency(normalize_keys(fblog(exclude="INET"))))
>>> show(full(fblog(only="consistency check")))
>>> show(frequency(between(normalize_keys(fblog()),
to_time=dt("14.10.11 10:22:02"))))
>>> show(brief(between(fblog(),d("13.10.11"),d("14.10.11"))))
There is a way how to make it more simple, but that's for another session (or an article for IBPhoenix/Foundation newsletter).
Firebird log once again
But this time we want to be notified about daily events
from fblog import *
now = datetime.now()
day_off = timedelta(days=1,minutes=5)
last24hours = list(between(fblog(),from_time=now-day_off))
brief = '\n'.join(frequency(last24hours))
full = '\n'.join(full(last24hours))
Firebird log once again
But this time we want to monitor it and see entries as they are written
def follow(thefile):
# Go to the end of the file
thefile.seek(0,2)
while True:
line = thefile.readline()
if not line:
# Sleep briefly
time.sleep(1)
continue
yield line
from fblog import *
log = follow(open('/opt/firebird/firebird.log'))
try:
show(full(fblog(log_file=log)))
except KeyboardInterrupt:
pass
except Exception as e:
raise e
Firebird Trace Service
You want to know which indices are used to optimize queries and which are useless
Stupid idea! We can use our favorite GUI Firebird manager to do that, or Firebird-provided fbsvcmgr.
But navigating GUIs is time consuming...
And who on Earth would like type things like:
> fbsvcmgr service_mgr -user sysdba -password masterkey
-action_trace_list
> fbsvcmgr service_mgr -user sysdba -password masterkey
-action_trace_stop trc_id 1
When writting CLI script in Python would take less than 5 minutes and save us time and typing?
The PowerConsole.
Enhanced python interpreter that can host user defined commands.
- Pavel Císař
- pcisar@ibphoenix.cz
- IBPhoenix
- www.ibphoenix.com