Python Bindings for GNU Recutils

This manual documents version 1.0 of the Python bindings for GNU Recutils, version 1.5.

Documentation for Recutils is available online. You can also find more information about Recutils by running info recutils or by looking at /usr/doc/recutils/, /usr/local/doc/recutils/, or similar directories on your system.

Introduction

The Python bindings for GNU Recutils follow the Python/C API and should work with Python 2.7 or higher. The extension module has been written to implement new built-in object types similar to the structures in librec (the GNU Recutils C library), and to call the corresponding C library functions. There is also a Python module that is part of the bindings. It acts as an exception handler and binds the enum datatypes from the C library.

You can install the bindings from source using python setup.py install or, if you have pip installed, using pip install python/. Installing with pip is recommended, since it makes it easier to uninstall or reinstall the package when you want to upgrade. You may also consider using virtualenv to create isolated Python environments.

The python subdirectory in torture/ contains some example-test programs in Python that show how to perform operations on recfiles. It would be a good idea to have a look at these as you read this manual.

The object types that extend the C library in Python are classes, and the usual rules of classes apply.

Modules

The bindings consist of two modules, recutils and pyrec. The module recutils is the extension written in the Python/C API that provides the librec library functions in Python. The module pyrec is a Python module that handles exceptions raised within those functions and wraps the C enum datatypes.

recutils — GNU Recutils

This module provides access to functions in Recutils. Please note that some functions require the pyrec module to be imported for exceptions to be handled.

class recutils.recdb

The constructor creates and returns a Database class object. It has the following methods:

size()

Return the number of record sets contained in a given database.

pyloadfile(filename)

Load a file into a Database object. filename is a string containing the name of any recfile. Does not handle exception on failure. See module pyrec.

pywritefile(filename)

Write to file from a Database object. This function overwrites a non-empty file. Does not handle exception on failure. See module pyrec.

pyappendfile(filename)

Append to file from a Database object. This function appends to a non-empty file. Does not handle exception on failure. See module pyrec.

get_rset(position)

Return the record set occupying the given position in the database. If no such record set is contained in the database then None is returned.

pyinsert_rset(recset, position)

Insert the given record set into the given database at the given position. If POSITION >= rec_db_size (DB), RSET is appended to the list of record sets. If POSITION < 0, RSET is prepended. Otherwise RSET is inserted at the specified position. Does not handle exception on failure. See module pyrec.

pyremove_rset(position)

Remove the record set at the given position from the given database. If POSITION >= rec_db_size (DB), the last record set is deleted. If POSITION <= 0, the first record set is deleted. Otherwise the record set occupying the specified position is deleted. Does not handle exception on failure. See module pyrec.
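The position rules for pyinsert_rset() and pyremove_rset() can be illustrated on a plain Python list. This is only a model of the documented behaviour, not a call into the bindings: the helper names insert_at and remove_at are hypothetical, positions are assumed to be zero-based, and the behaviour on an empty database is not documented, so the model simply does nothing in that case.

```python
def insert_at(items, item, position):
    """Model of the documented insert rule, applied to a plain list."""
    if position >= len(items):
        items.append(item)            # at or past the end: append
    elif position < 0:
        items.insert(0, item)         # negative position: prepend
    else:
        items.insert(position, item)  # otherwise: insert at that position
    return items


def remove_at(items, position):
    """Model of the documented remove rule, applied to a plain list."""
    if not items:
        return items                  # empty case is undocumented; do nothing
    if position >= len(items):
        items.pop()                   # at or past the end: drop the last one
    elif position <= 0:
        items.pop(0)                  # zero or negative: drop the first one
    else:
        items.pop(position)           # otherwise: drop the one at that position
    return items


rsets = ["movies", "actors"]
insert_at(rsets, "studios", 10)       # appended
insert_at(rsets, "awards", -1)        # prepended
remove_at(rsets, 99)                  # removes the last entry
print(rsets)
```

Out-of-range positions are clamped rather than rejected, so a caller cannot detect a wrong position from the result alone.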

type()

Determine whether an rset named TYPE exists in a database. If TYPE is None then it refers to the default record set. Returns 1 for success and 0 otherwise.

get_rset_by_type()

Get the rset with the given type from the DB. Returns None if there is no record set having that type.

DATABASE HIGH-LEVEL FUNCTIONS

query(type, join, index, sexp, fast_string, random, fexp, password, group_by, sort_by, flags)

Query for some data in a database. The resulting data is returned in a record set. This function takes the following arguments:

TYPE

The type of records to query. This string must identify a record set contained in the database. If TYPE is None then the default record set, if any, is queried.

JOIN

If not None, this argument must be a string denoting a field name. This field name must be a foreign key (field of type ‘rec’) defined in the selected record set. The query operation will do an inner join using T1.Field = T2.Field as join criteria.

INDEX

If not None, this argument is a pointer to a buffer containing pairs of Min, Max indexes, identifying intervals of valid records. The list ends with the pair REC_Q_NOINDEX,REC_Q_NOINDEX. INDEX is mutually exclusive with any other selection option.

SEX

Selection expression which is evaluated for every record in the referred record set. If SEX is None then all records are selected. This argument is mutually exclusive with any other selection option.

FAST_STRING

If this argument is not None then it is a string which is used as a fixed pattern. Records featuring fields containing FAST_STRING as a substring in their values are selected. This argument is mutually exclusive with any other selection option.

RANDOM

If not 0, this argument indicates the number of random records to select from the referred record set. This argument is mutually exclusive with any other selection option.

FEX

Field expression to apply to the matching records to build the records in the result record set. If FEX is None then the matching records are unaltered.

PASSWORD

Password to use to decrypt confidential fields. If the password does not work then the encrypted fields are returned as-is. If PASSWORD is None, or if it is the empty string, then no attempt to decrypt encrypted fields will be performed.

GROUP_BY

If not None, group the record set by the given field names.

SORT_BY

If not None, sort the record set by the given field names.

FLAGS

ORed value of any of the following flags:

REC_Q_DESCRIPTOR

If set, the returned record set will feature a record descriptor. If the query involves a single record set then the descriptor will be a copy of the descriptor of the referred record set, and will feature the same record type name. Otherwise it will be built from the descriptors of the involved record sets, and the record type name will be formed by concatenating the type names of the involved record sets. If this flag is not set then the returned record set won’t feature a record descriptor.

REC_Q_ICASE

If set, the string operations in the selection expression will be case-insensitive. If not set, any string operation will be case-sensitive.

Return None if there is not enough memory to perform the operation.
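Since INDEX, SEX, FAST_STRING and RANDOM are documented as mutually exclusive, it can help to check the arguments before calling query(). The following is a minimal sketch of such a check in plain Python. The helper names selection_options_used and check_selection are hypothetical and are not part of the bindings; the keyword names mirror the parameter names in the signature above.

```python
def selection_options_used(index=None, sexp=None, fast_string=None, random=0):
    """Return the names of the selection options that are actually set."""
    used = []
    if index is not None:
        used.append("index")
    if sexp is not None:
        used.append("sexp")
    if fast_string is not None:
        used.append("fast_string")
    if random != 0:                   # RANDOM counts as unset when it is 0
        used.append("random")
    return used


def check_selection(**options):
    """Raise ValueError when more than one selection option is set."""
    used = selection_options_used(**options)
    if len(used) > 1:
        raise ValueError(
            "mutually exclusive selection options: " + ", ".join(used))
    return used


print(check_selection(sexp="Audio = 'German'"))
```

The same check applies to insert(), delete() and set(), which take the same selection options.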

insert(type, index, sexp, fast_string, random, password, recp, flags)

Insert a new record into a database, either appending it to some record set or replacing one or more existing records. This function takes the following arguments:

TYPE

Type of the new record. If there is an existing record set holding records of that type then the record is added to it. Otherwise a new record set is appended into the database.

INDEX

If not None, this argument is a pointer to a buffer containing pairs of Min, Max indexes, identifying intervals of valid records. The list ends with the pair REC_Q_NOINDEX,REC_Q_NOINDEX. INDEX is mutually exclusive with any other selection option.

SEX

Selection expression which is evaluated for every record in the referred record set. If SEX is None then all records are selected. This argument is mutually exclusive with any other selection option.

FAST_STRING

If this argument is not None then it is a string which is used as a fixed pattern. Records featuring fields containing FAST_STRING as a substring in their values are selected. This argument is mutually exclusive with any other selection option.

RANDOM

If not 0, this argument indicates the number of random records to select from the referred record set. This argument is mutually exclusive with any other selection option.

PASSWORD

Password to use to decrypt confidential fields. If the password does not work then the encrypted fields are returned as-is. If PASSWORD is None, or if it is the empty string, then no attempt to decrypt encrypted fields will be performed.

RECORD

Record to insert. If more than one record is replaced in the database they will be substituted with copies of this record.

FLAGS

ORed value of any of the following flags:

REC_F_ICASE

If set, the string operations in the selection expression will be case-insensitive. If not set, any string operation will be case-sensitive.

REC_F_NOAUTO

If set then no auto-fields will be added to the newly created records in the database.

If no selection option is used then the new record is appended to either an existing record set identified by TYPE or to a newly created record set. If some selection option is used then the matching existing records will be replaced. This function returns 0 if there is not enough memory to perform the operation.

delete(type, index, sexp, fast_string, random, flags)

Delete records from a database, either physically removing them or commenting them out. This function takes the following arguments:

TYPE

Type of the records to remove.

INDEX

If not None, this argument is a pointer to a buffer containing pairs of Min, Max indexes, identifying intervals of valid records. The list ends with the pair REC_Q_NOINDEX,REC_Q_NOINDEX. INDEX is mutually exclusive with any other selection option.

SEX

Selection expression which is evaluated for every record in the referred record set. If SEX is None then all records are selected. This argument is mutually exclusive with any other selection option.

FAST_STRING

If this argument is not None then it is a string which is used as a fixed pattern. Records featuring fields containing FAST_STRING as a substring in their values are selected. This argument is mutually exclusive with any other selection option.

RANDOM

If not 0, this argument indicates the number of random records to select from the referred record set. This argument is mutually exclusive with any other selection option.

FLAGS

ORed value of any of the following flags:

REC_F_ICASE

If set, the string operations in the selection expression will be case-insensitive. If not set, any string operation will be case-sensitive.

REC_F_COMMENT_OUT

If set the selected records will be commented out instead of physically removed from the database.

Return 0 if there is not enough memory to perform the operation.

set(type, index, sexp, fast_string, random, fexp, action, action_arg, flags)

Manipulate the fields of the selected records in a database: remove them, set their values or rename them. This function takes the following arguments:

TYPE

The type of records to act in.

INDEX

If not None, this argument is a pointer to a buffer containing pairs of Min, Max indexes, identifying intervals of valid records. The list ends with the pair REC_Q_NOINDEX,REC_Q_NOINDEX. INDEX is mutually exclusive with any other selection option.

SEX

Selection expression which is evaluated for every record in the referred record set. If SEX is None then all records are selected. This argument is mutually exclusive with any other selection option.

FAST_STRING

If this argument is not None then it is a string which is used as a fixed pattern. Records featuring fields containing FAST_STRING as a substring in their values are selected. This argument is mutually exclusive with any other selection option.

RANDOM

If not 0, this argument indicates the number of random records to select from the referred record set. This argument is mutually exclusive with any other selection option.

FEX

Field expression selecting the fields in the selected records which will be modified.

ACTION

Action to perform to the selected fields. Valid values for this argument are:

REC_SET_ACT_RENAME

Rename the matching fields to the string pointed by ACTION_ARG.

REC_SET_ACT_SET

Set the value of the matching fields to the string pointed by ACTION_ARG.

REC_SET_ACT_ADD

Add new fields with the names specified in the fex to the selected records. The new fields will have the string pointed by ACTION_ARG as their value.

REC_SET_ACT_SETADD

Set the selected fields to the value pointed to by ACTION_ARG. If the fields don't exist then create them with that value.

REC_SET_ACT_DELETE

Delete the selected fields. ACTION_ARG is ignored by this action.

REC_SET_ACT_COMMENT

Comment out the selected fields. ACTION_ARG is ignored by this action.

ACTION_ARG

Argument to the selected action. It is OK to pass None for actions which don't require an argument.

FLAGS

ORed value of any of the following flags:

REC_F_ICASE

If set, the string operations in the selection expression will be case-insensitive. If not set, any string operation will be case-sensitive.

REC_F_COMMENT_OUT

If set the selected records will be commented out instead of physically removed from the database.

Return 0 if there is not enough memory to perform the operation.
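The relationship between ACTION and ACTION_ARG can be sketched in plain Python. The constants below are local stand-ins numbered 0 to 6, as this manual describes for pyrec.RecSetenum, assuming the attributes are numbered in the order in which they are listed. The helper action_arg_for is hypothetical and is not part of the bindings.

```python
# Local stand-ins for the pyrec.RecSetenum attributes (assumed order 0..6).
(REC_SET_ACT_NONE, REC_SET_ACT_RENAME, REC_SET_ACT_SET, REC_SET_ACT_ADD,
 REC_SET_ACT_SETADD, REC_SET_ACT_DELETE, REC_SET_ACT_COMMENT) = range(7)

# Actions documented as using the string passed in ACTION_ARG.
ACTIONS_USING_ARG = frozenset([
    REC_SET_ACT_RENAME, REC_SET_ACT_SET, REC_SET_ACT_ADD, REC_SET_ACT_SETADD,
])


def action_arg_for(action, action_arg=None):
    """Return the ACTION_ARG value to pass along with ACTION."""
    if action in ACTIONS_USING_ARG:
        if action_arg is None:
            raise ValueError("this action needs an ACTION_ARG string")
        return action_arg
    return None  # DELETE and COMMENT ignore ACTION_ARG, so None is fine


print(action_arg_for(REC_SET_ACT_SET, "German"))
print(action_arg_for(REC_SET_ACT_DELETE, "ignored"))
```

In a real program the constants would come from pyrec.RecSetenum rather than being redefined locally.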

int_check(check_descriptors_p, remote_descriptors_p, errors)

Check the integrity of all the record sets stored in a given database. This function returns the number of errors found. Descriptive messages about the errors are appended to ERRORS.

class recutils.rset

The constructor creates and returns a Record-set class object. It has the following methods:

num_records()

Return the number of records stored in the given record set.

descriptor()

Return the record descriptor of a given record set. None is returned if the record set does not feature a record descriptor.

type()

Return the type name of a record set. None is returned if the record set does not feature a record descriptor.
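As a rough illustration of how the database and record set methods combine, the sketch below walks every record set in a database and collects its type name and record count. It uses only size(), get_rset(), type() and num_records() as documented above, and assumes that positions are zero-based. The function summarize and the FakeRset and FakeDb classes are hypothetical; the fake classes exist only so that the sketch runs without the bindings installed.

```python
def summarize(db):
    """Return a list of (type_name, record_count) pairs, one per record set."""
    summary = []
    for position in range(db.size()):
        rset = db.get_rset(position)
        if rset is None:              # documented result for a missing rset
            continue
        summary.append((rset.type(), rset.num_records()))
    return summary


# Stand-in objects that mimic the documented method names.
class FakeRset(object):
    def __init__(self, type_name, count):
        self._type_name = type_name
        self._count = count

    def type(self):
        return self._type_name        # None when there is no descriptor

    def num_records(self):
        return self._count


class FakeDb(object):
    def __init__(self, rsets):
        self._rsets = rsets

    def size(self):
        return len(self._rsets)

    def get_rset(self, position):
        if 0 <= position < len(self._rsets):
            return self._rsets[position]
        return None


demo = FakeDb([FakeRset("movies", 3), FakeRset(None, 1)])
print(summarize(demo))
```

With the bindings installed, a pyrec.Recdb object that has loaded a recfile should be usable in place of FakeDb, since summarize() relies only on the method names listed in this manual.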

class recutils.record

The constructor creates and returns a Record class object. It has the following methods:

num_fields()

Return the number of fields stored in the given record.

contains_value(value, case_insensitive)

Determine whether a record contains some field whose value is VALUE. The string comparison can be either case-sensitive or case-insensitive.

contains_field(field_name, field_value)

Determine whether a record contains a field whose name is FIELD_NAME and value FIELD_VALUE.

class recutils.sex(case_insensitive)

The constructor creates and returns a Selection Expression class object. It has the following methods:

pycompile(expr)

Compile a sex. Sexes must be compiled before being used. If there is a parse error return 0. Does not handle exception on failure. See module pyrec.

pyeval(rec, status)

Apply a sex expression to a record, setting STATUS accordingly: 1 if the record matched the sex, 0 otherwise. The function returns the same value that is stored in STATUS. Does not handle exception on failure. See module pyrec.

eval_str(rec)

Apply a sex expression to a record and get the result as an allocated string.

class recutils.fex(str, kind)

The constructor parses STR, creates a Field Expression, and returns it. A fex kind shall be specified in KIND (enum type). If STR does not contain a valid FEX of the given kind then None is returned. If there is not enough memory to perform the operation then None is returned. If STR is None then an empty fex is returned. It has the following methods:

size()

Get the number of elements stored in a field expression.

member_p(fname, min, max)

Check whether a given field (or set of fields) identified by their name and indexes, are contained in a fex.

get(position)

Get the element of a field expression occupying the given position. If the position is invalid then None is returned.

append(fname, min, max)

Append an element at the end of the fex and return it. This function returns None if there is not enough memory to perform the operation.

all_calls_p()

Determine whether all the elements of the given FEX are function calls.

check(str, kind)

Check whether a given string STR contains a proper fex description of type KIND.

sort()

Sort the elements of a fex using the ‘min’ index of the elements as the sorting criteria.

str(kind)

Get the written form of a field expression. This function returns None if there is not enough memory to perform the operation.

class recutils.field

The constructor creates and returns a Field class object. It has the following methods:

name()

Return a string containing the name of a field. Note that this function can’t return the empty string for a properly initialized field.

value()

Return a string containing the value of a field, i.e. the string stored in the field. The returned string may be empty if the field has no value, but never None.

set_name(name)

Set the name of a field. This function returns 0 if there is not enough memory to perform the operation.

set_value(value)

Set the value of a given field to the given string. This function returns 0 if there is not enough memory to perform the operation.

source()

Return a string describing the source of the field. The specific meaning of the source depends on the user: it may be a file name, or something else. This function returns None for a field for which a source was never set.

set_source()

Set a string describing the source of the field. Any previous string associated to the field is destroyed and the memory it occupies is freed. This function returns 0 if there is not enough memory to perform the operation.

location()

Return an integer representing the location of the field within its source. The specific meaning of the location depends on the user: it may be a line number, or something else. This function returns 0 for fields not having a defined source.

location_str()

Return the textual representation for the location of a field within its source. This function returns None for fields not having a defined source.

char_location()

Return an integer representing the char location of the field within its source. The specific meaning of the location depends on the user, usually being the offset in bytes since the beginning of a file or memory buffer. This function returns 0 for fields not having a defined source.
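The accessors above can be combined to print a field together with its origin. The sketch below is one possible way to do that, using only name(), value(), source() and location_str() as documented. The function describe_field and the FakeField class are hypothetical; FakeField is a stand-in so that the sketch runs without the bindings, and the exact text returned by location_str() in the real bindings is an assumption.

```python
def describe_field(field):
    """Format a field as 'Name: value', adding the source when one is set."""
    text = "%s: %s" % (field.name(), field.value())
    source = field.source()
    if source is None:                # documented result when no source is set
        return text
    return "%s (%s:%s)" % (text, source, field.location_str())


# Stand-in object that mimics the documented method names.
class FakeField(object):
    def __init__(self, name, value, source=None, location=0):
        self._name = name
        self._value = value
        self._source = source
        self._location = location

    def name(self):
        return self._name

    def value(self):
        return self._value

    def source(self):
        return self._source

    def location_str(self):
        # None for fields without a defined source, as documented.
        return None if self._source is None else str(self._location)


print(describe_field(FakeField("Title", "Metropolis", "movies.rec", 12)))
print(describe_field(FakeField("Audio", "German")))
```

Checking source() first avoids formatting the None that location_str() returns for fields without a source.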

class recutils.comment(text)

The constructor creates and returns a Comment class object. None is returned if there is not enough memory to perform the operation. It has the following methods:

text()

Return a string containing the text in the comment.

set_text(text)

Set the text of a comment. Any previous text associated with the comment is destroyed and its memory freed.

class recutils.buffer(data, size)

The constructor creates and returns a Buffer class object. A flexible buffer is a buffer to which stream-like operations can be applied. Its size will grow as required.

Functions outside Classes

recutils.field_equal_p(field1, field2)

Determine whether two given field objects are equal (i.e. they have equal names but possibly different values).

recutils.comment_equal_p(comment1, comment2)

Determine whether the texts stored in two given comment objects are equal.

pyrec — Handle exceptions and enum datatypes

This module provides exception handling and binds the enum datatypes of the C library. The classes in this module inherit from the base classes in the recutils module; their methods simply add the ability to handle exceptions. This module is also required to use the enum datatypes of the C library.

class pyrec.Recdb

This class inherits from the recutils.recdb class. The constructor creates and returns a Database class object. It has the following methods:

loadfile(filename)

Load a file into a Database object. filename is a string containing the name of any recfile. Catches exception if load fails.

writefile(filename)

Write to file from a Database object. This function overwrites a non-empty file. Catches exception if write fails.

appendfile(filename)

Append to file from a Database object. This function appends to a non-empty file. Catches exception if append fails.

insert_rset(recset, position)

Insert the given record set into the given database at the given position. If POSITION >= rec_db_size (DB), RSET is appended to the list of record sets. If POSITION < 0, RSET is prepended. Otherwise RSET is inserted at the specified position. Catches exception if insertion fails.

remove_rset(position)

Remove the record set at the given position from the given database. If POSITION >= rec_db_size (DB), the last record set is deleted. If POSITION <= 0, the first record set is deleted. Otherwise the record set occupying the specified position is deleted. Catches exception if removal fails.

class pyrec.Fexenum

This class inherits from the recutils.fex class. It is used to handle the enum type that the internal C function for fexes requires. This is achieved by having the following attributes initialized from 0 to 2.

REC_FEX_SIMPLE
REC_FEX_CSV
REC_FEX_SUBSCRIPTS

class pyrec.RecSetenum

This class inherits from the recutils.recdb class. It is used to handle the enum type that the internal C function for the DB set requires. The following attributes are initialized from 0 to 6.

REC_SET_ACT_NONE
REC_SET_ACT_RENAME
REC_SET_ACT_SET
REC_SET_ACT_ADD
REC_SET_ACT_SETADD
REC_SET_ACT_DELETE
REC_SET_ACT_COMMENT

class pyrec.RecSex(case_insensitive)

This class inherits from the recutils.sex class. The constructor takes a boolean argument, to indicate case insensitivity for the selection expression. It has the following methods:

compile(expr)

Compile a sex. Sexes must be compiled before being used. If there is a parse error return 0. Catches exception on error.

eval(rec, status)

Apply a sex expression to a record, setting STATUS accordingly: 1 if the record matched the sex, 0 otherwise. The function returns the same value that is stored in STATUS. Catches exception on error.

A Little Example

You can find some example programs in torture/python/ that will give you an idea of how to use the bindings to access librec in Python. To start, always import the recutils and pyrec modules so the functions are available to your program.

In the example showing read-only processing, some low-level functions are shown to get information from recfiles. You can also check if particular information exists. In the other example program, some high-level functions demonstrate how to query for records/record-sets having desired information, insertion, deletion as well as manipulation of databases. Please note the use of enum datatypes that are required by certain functions. You can also check the integrity of your recfiles in Python.

Once you have the bindings installed, here is a little example to help you get started. The walkthrough is below.

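#!/usr/bin/python
import recutils
import pyrec

# Create two databases and load the movie recfile into the first one.
db = pyrec.Recdb()
db1 = pyrec.Recdb()
db.loadfile("movies.rec")

# Build a case-insensitive selection expression and compile it before use.
sexp = pyrec.RecSex(1)
b = sexp.pycompile("Audio = 'German'")

print("Query for a record set consisting of a list of German language movies")
# RANDOM is 0 here because the selection options are mutually exclusive.
queryrset = db.query("movies", None, None, sexp, None, 0, None, None, None, None, 0)
num_rec = queryrset.num_records()
print("Number of queried records = %d" % num_rec)

# Insert the result into the second database and write it to a new recfile.
db1.insert_rset(queryrset, 2)
db1.writefile("german.rec")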

The above script can be saved into a file (e.g. little_example.py), and run like this:

python little_example.py

Walkthrough

Start by importing the modules that are going to give you access to the recutils functions.

import recutils
import pyrec

Next, create the databases.

db = pyrec.Recdb()
db1 = pyrec.Recdb()
db.loadfile("movies.rec")

The Recdb() constructor creates and returns a database class object. We create two objects here, db and db1. loadfile() then loads the movie database into db.

sexp = pyrec.RecSex(1)
b = sexp.pycompile("Audio = 'German'")

RecSex() creates and returns a selection expression object. Its argument is a case_insensitive boolean flag. Selection expressions need to be compiled before being used, so we compile one that matches German language movies.

print("Query for a record set consisting of a list of German language movies")
queryrset = db.query("movies", None, None, sexp, None, 0, None, None, None, None, 0)
num_rec = queryrset.num_records()
print("Number of queried records = %d" % num_rec)

Call the query() function on db and all the German movies should go into queryrset.

db1.insert_rset(queryrset, 2)
db1.writefile("german.rec")

Lastly, just to be sure we got what we wanted, we insert queryrset into db1, the second database object we created, and then write it to file. Opening german.rec should give you the list of all the German movies you wanted to see.

You can try out more such operations with the help of the example-test programs in torture/python/.