Transtractor

Universal PDF Bank Statement Parsing

Transtractor (Transaction Extractor) is a high-performance library for extracting transaction data from PDF bank statements. It is built in Rust for speed and wrapped in a Python API for ease of use.


Quick Start

Install from PyPI:

pip install transtractor

To parse a bank statement PDF and convert it to CSV:

from transtractor import Parser

# Initialise parser
parser = Parser()

# Convert PDF to CSV
parser.parse('statement.pdf').to_csv('statement.csv')

This writes a CSV file like the following:

date,description,amount,balance
2025-01-01,Transaction 1,50000.0,100000.0
2025-01-01,Transaction 2,-1000.0,99000.0
2025-01-01,Transaction 3,-10000.0,89000.0
2025-01-01,Transaction 4,1350.0,90350.0
2025-01-03,Transaction 5,-530.99,89819.01
2025-01-03,Transaction 6,1532.55,91351.56
2025-01-04,Transaction 7,-568.01,90783.55
2025-01-04,Transaction 8,-23.56,90759.99
...
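Because each row carries both an amount and a running balance, the output can be sanity-checked after conversion. The sketch below is not part of Transtractor. It uses only the Python standard library, assumes the four-column layout shown above, and the helper name `check_running_balance` is hypothetical. The inline sample reuses the first rows of the output above so the block runs on its own.

```python
import csv
import io
from decimal import Decimal


def check_running_balance(rows):
    """Return 1-based row numbers where balance != previous balance + amount.

    `rows` is an iterable of dicts with 'amount' and 'balance' keys,
    for example the rows produced by csv.DictReader.
    """
    mismatches = []
    previous = None
    for index, row in enumerate(rows, start=1):
        # Decimal avoids binary floating-point rounding on currency values.
        amount = Decimal(row["amount"])
        balance = Decimal(row["balance"])
        if previous is not None and previous + amount != balance:
            mismatches.append(index)
        previous = balance
    return mismatches


# Sample copied from the example output above; in practice, open the
# CSV file written by the parser instead of using an in-memory string.
sample = """date,description,amount,balance
2025-01-01,Transaction 1,50000.0,100000.0
2025-01-01,Transaction 2,-1000.0,99000.0
2025-01-01,Transaction 3,-10000.0,89000.0
2025-01-01,Transaction 4,1350.0,90350.0
"""

rows = list(csv.DictReader(io.StringIO(sample)))
print(check_running_balance(rows))  # prints [] when every row is consistent
```

To check a real file, replace the in-memory sample with `open('statement.csv', newline='')` and pass the resulting `csv.DictReader` to the helper.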

Supported Banks

See the supported statements page for a full list of supported banks and statement formats. If your bank is not supported, create your own configuration file by following the guidelines, then load it into the parser before parsing:

from transtractor import Parser

parser = Parser()
parser.load('path/to/your_bank_config.json')
parser.parse('your_statement.pdf').to_csv('output.csv')
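The same calls extend naturally to a folder of statements. The helper below is a minimal sketch rather than a library feature: `convert_folder` is a hypothetical name, and it assumes a single parser instance can be reused across several files and that `parse` and `to_csv` accept string paths as in the examples above. It takes the parser as an argument and touches only the two methods already shown.

```python
from pathlib import Path


def convert_folder(parser, pdf_dir, csv_dir):
    """Convert every *.pdf in pdf_dir to a same-named .csv in csv_dir.

    `parser` is any object exposing parse(path).to_csv(path), such as
    a configured transtractor Parser. Returns the list of CSV paths written.
    """
    out_dir = Path(csv_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    # Sort for a deterministic processing order.
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        csv_path = out_dir / (pdf_path.stem + ".csv")
        parser.parse(str(pdf_path)).to_csv(str(csv_path))
        written.append(csv_path)
    return written
```

With a parser set up as above, `convert_folder(parser, 'statements', 'csv')` would write one CSV per PDF found in `statements`. The folder names here are placeholders.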

Documentation

Community & Support

License

Transtractor is open source software licensed under the MIT License.