Kapteyn Institute Kapteyn Package

Module tabarray

Author: Hans Terlouw <gipsy@astro.rug.nl>

Module tabarray provides a class which allows the user to read, write and manipulate simple table-like structures. It is based on NumPy and the table-reading part has been optimized for speed. When the flexibility of SciPy’s read_array() function is not needed, Tabarray can be considered as an alternative.

Class tabarray

class tabarray.tabarray(source[, comchar='#!', sepchar=' \t', lines=None, bad=None, segsep=None])

Tabarray is a subclass of NumPy’s ndarray. It provides all of ndarray’s functionality as well as some extra methods and attributes.

Parameters:
  • source – the object from which the tabarray object is constructed. It can be a 2-dimensional NumPy array, a list or tuple containing the table columns as 1-dimensional NumPy arrays, or a string with the name of a text file containing the table. The other arguments are meaningful only in the last case.
  • comchar – a string with characters which are used to designate comments in the input file. The occurrence of any of these characters on a line causes the rest of the line to be ignored. Empty lines and lines containing only a comment are also ignored.
  • sepchar – a string containing the column separation characters to be used. Columns are separated by any combination of these characters.
  • lines – a two-element tuple or list specifying a range of lines to be read. Line numbers are counted from one and the range is inclusive. So (1,10) specifies the first 10 lines of a file. Comment lines are included in the count. If any element of the tuple or list is zero, this limit is ignored. So (1,0) specifies the whole file, just like the default None.
  • bad – a number to be substituted for any field which cannot be decoded as a number. The default None causes a ValueError exception to be raised in such cases.
  • segsep – a string containing segment separation characters. If any of these characters is present in a comment block, this comment block is taken as the end of the current segment. The default None indicates that every comment block will separate segments.
Raises:
  • IOError – when the file cannot be opened.
  • IndexError – when a line with an inconsistent number of fields is encountered in the input file.
  • ValueError – when a field cannot be decoded as a number and no alternative value was specified.
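The effect of comchar, sepchar and bad on a single input line can be sketched in plain Python. This is not tabarray's actual reader, only an illustration of the rules described above; the helper name parse_line is hypothetical.

```python
import re

def parse_line(line, comchar='#!', sepchar=' \t', bad=None):
    """Return the numeric fields of one line, or None if it holds no data.

    Hypothetical helper; it only mimics the documented parsing rules.
    """
    # Ignore everything from the first comment character onwards.
    cut = len(line)
    for ch in comchar:
        pos = line.find(ch)
        if pos != -1:
            cut = min(cut, pos)
    data = line[:cut].rstrip('\r\n')

    # Any combination of separator characters counts as one column break.
    pattern = '[' + re.escape(sepchar) + ']+'
    fields = [field for field in re.split(pattern, data) if field]
    if not fields:
        return None  # empty line or comment-only line

    values = []
    for field in fields:
        try:
            values.append(float(field))
        except ValueError:
            if bad is None:
                raise  # default: undecodable fields are an error
            values.append(bad)
    return values

print(parse_line(' 3.0   4.0 ! classic example'))
print(parse_line('1.5\tx', bad=-1.0))
```

With bad left at None, the second call would raise ValueError instead of substituting a value.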

Attributes:

nrows

the number of rows

ncols

the number of columns

segments

a list of slice objects which can be used to address the different segments of the table. Segments are parts of the table which are separated by comment blocks which meet the conditions specified by argument segsep. The following example illustrates how a program can iterate over all segments:

from kapteyn.tabarray import tabarray

coasts = tabarray('world.txt')

for segment in coasts.segments:
   coast = coasts[segment]
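
How such a list of slices relates to the comment blocks in a file can be sketched without tabarray. The helper find_segments below is hypothetical and rests on stated assumptions: it covers only the default segsep=None case, treats only comment-only lines as comment blocks, and lets empty lines neither count as rows nor separate segments. The real reader may differ in these details.

```python
def find_segments(lines, comchar='#!'):
    """Return one slice per run of data rows between comment blocks.

    Hypothetical helper; assumes segsep=None, i.e. every comment block
    separates segments.
    """
    segments = []
    row = 0        # index of the next data row in the table
    start = None   # first row of the segment being built, if any
    for line in lines:
        stripped = line.strip()
        is_data = bool(stripped) and stripped[0] not in comchar
        if is_data:
            if start is None:
                start = row
            row += 1
        elif stripped and start is not None:
            # A comment line closes the segment that was being built.
            segments.append(slice(start, row))
            start = None
    if start is not None:
        segments.append(slice(start, row))
    return segments

lines = ['# coast A', '1 2', '3 4', '# coast B', '5 6']
rows = [[1, 2], [3, 4], [5, 6]]
for segment in find_segments(lines):
    print(rows[segment])
```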

Methods:

columns(cols=None)
Parameters: cols – a tuple or list with the numbers (zero-relative) of the columns to be extracted.
Returns: a NumPy array.

Extract specified columns from a tabarray and return an array containing these columns. Cols is a tuple or list with the column numbers. As the first index of the resulting array is the column number, multiple assignment is possible. E.g., x,y = t.columns((2,3)) delivers columns 2 and 3 in variables x and y. Default: return all columns.
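
Why multiple assignment works can be shown with plain NumPy: once the column number is the first index, unpacking yields one 1-dimensional array per column. The function columns_sketch is a hypothetical stand-in for the method, written under the assumption that the result is equivalent to indexing the transposed table.

```python
import numpy as np

def columns_sketch(table, cols=None):
    """Return the selected columns with the column number as first index."""
    transposed = np.asarray(table).T
    return transposed if cols is None else transposed[list(cols)]

table = np.array([[0.0, 1.0, 2.0, 3.0],
                  [10.0, 11.0, 12.0, 13.0],
                  [20.0, 21.0, 22.0, 23.0]])

# Unpacking iterates over the first index, i.e. over the columns.
x, y = columns_sketch(table, (2, 3))
print(x)
print(y)
```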

rows(rows=None)
Parameters: rows – a tuple or list containing the numbers (zero-relative) of the rows to be extracted.
Returns: a new tabarray.

This method extracts specified rows from a tabarray and returns a new tabarray. Rows is a tuple or list containing the row numbers to be extracted. Normal Python indexing applies, so (0, -1) specifies the first and the last row. Default: return whole tabarray.
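
The row selection corresponds to ordinary NumPy indexing with a list of row numbers. The sketch below uses a plain ndarray rather than a tabarray, so the result is an ndarray as well.

```python
import numpy as np

table = np.array([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])

# Negative numbers count from the end, as in normal Python indexing.
first_and_last = table[[0, -1]]
print(first_and_last)
```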

writeto(filename, rows=None, cols=None, comment=[], format=[])

Write the contents of a tabarray to a file.

Parameters:
  • filename – the name of the file to be written.
  • rows – a tuple or list with a selection of the rows (zero-relative) to be written. Default: all rows.
  • cols – a tuple or list with a selection of the columns (zero-relative) to be written. Default: all columns.
  • comment – a list with text strings which will be inserted as comments in the output file. These comments will be prefixed by the hash character (#).
  • format – a list with format strings for formatting the output, one element per column, e.g., ['%5d', ' %10.7f', ' %g'].
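
How a list of per-column format strings turns one row into one output line can be sketched as follows. The helper format_row is hypothetical, and raising an error on a length mismatch is a choice made for this sketch, not documented behavior of writeto.

```python
def format_row(row, formats):
    """Format one table row, using one format string per column."""
    if len(row) != len(formats):
        # Sketch-only check; what writeto does in this case is not documented here.
        raise ValueError('need exactly one format string per column')
    return ''.join(fmt % value for fmt, value in zip(formats, row))

line = format_row((3, 4.1, 5.45619), ['%5d', ' %10.7f', ' %g'])
print(repr(line))
```

Because each format string carries its own leading space, the column separation is part of the format list itself.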

Functions

tabarray.readColumns(filename, comment='!#', cols='all', sepchar=', \t', rows=None, lines=None, bad=0.0, rowslice=(None, ), colslice=(None, ))

TableIO-compatible function for directly extracting table data from a file.

Parameters:
  • filename – a string with the name of a text file containing the table.
  • comment – a string with characters which are used to designate comments in the input file. The occurrence of any of these characters on a line causes the rest of the line to be ignored. Empty lines and lines containing only a comment are also ignored.
  • cols – a tuple or list with the column numbers or a scalar with one column number.
  • sepchar – a string containing the column separation characters to be used. Columns are separated by any combination of these characters.
  • rows – a tuple or list containing the row numbers to be extracted.
  • lines – a two-element tuple or list specifying a range of lines to be read. Line numbers are counted from one and the range is inclusive. So (1,10) specifies the first 10 lines of a file. Comment lines are included in the count. If any element of the tuple or list is zero, this limit is ignored. So (1,0) specifies the whole file, just like the default None.
  • bad – a number to be substituted for any field which cannot be decoded as a number.
  • rowslice – a tuple containing a Python slice indicating which rows should be selected. If this argument is used in combination with the argument rows, the latter should be expressed in terms of the new row numbers after slicing. Example: rowslice=(10, None) selects all rows, beginning with the eleventh (the first row has number 0) and rowslice=(10, 13) selects row numbers 10, 11 and 12.
  • colslice – a tuple containing a Python slice indicating which columns should be selected. If this argument is used in combination with the argument cols, the latter should be expressed in terms of the new column numbers after slicing. Selection is analogous to rowslice.
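
The line-range and slice conventions described above can be sketched in plain Python. The helpers select_lines and select_rows are hypothetical and only mirror the documented rules: lines is one-based and inclusive with zero meaning "no limit", while rowslice holds the arguments of a Python slice and rows is applied to the row numbers that remain after slicing.

```python
def select_lines(all_lines, lines=None):
    """Apply a 1-based, inclusive (first, last) range; 0 means no limit."""
    if lines is None:
        return list(all_lines)
    first, last = lines
    start = first - 1 if first else 0
    stop = last if last else None
    return list(all_lines[start:stop])

def select_rows(table_rows, rowslice=(None,), rows=None):
    """Slice first, then pick rows using the numbering after slicing."""
    sliced = table_rows[slice(*rowslice)]
    if rows is None:
        return sliced
    return [sliced[i] for i in rows]

file_lines = list(range(1, 21))       # stand-in for 20 lines of a file
print(select_lines(file_lines, (1, 10)))

table_rows = list(range(20))          # stand-in for 20 table rows
print(select_rows(table_rows, (10, 13)))
print(select_rows(table_rows, (10, None), rows=(0,)))
```

The last call returns original row 10, because row number 0 refers to the first row left after slicing.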

tabarray.writeColumns(filename, list, comment=[])

TableIO-compatible function for directly writing table data to a file.

Parameters:
  • filename – the name of the file to be written.
  • list – a list containing the columns to be written.
  • comment – a list with text strings which will be inserted as comments in the output file. These comments will be prefixed by the hash character (#).

Example

Suppose you have a file with catheti data from right-angled triangles and you want to compute the hypotenuses and write the result to a second file. The input file may be as follows:

# Triangle data
#
 3.0   4.0 ! classic example
 4.1   3.6
10.0  10.0

Then the following simple script will do the job:

#!/usr/bin/env python
import numpy
from kapteyn.tabarray import tabarray

x,y = tabarray('triangles.txt').columns()
tabarray([x,y,numpy.sqrt(x*x+y*y)]).writeto('outfile.txt')

leaving the following result in the output file:

 3      4      5
 4.1    3.6    5.45619
10     10     14.1421

Please cite the Kapteyn Package if you use it in the preparation of a publication. These citations help us to justify the resources spent on this software.