The Revolution of Cash Flow Analysis: From Rigid Rules to Intelligent Vectorization

The precise analysis of financial transactions is essential for effective financial planning. However, identifying regularly recurring payments, such as subscriptions, has long been a challenge. Traditional, rule-based approaches are quickly overwhelmed by complex and dynamic financial transactions. wealthAPI has developed a solution that overcomes these limitations. By using artificial intelligence and vectorization, we transform raw transaction data into valuable insights. Our system goes far beyond simple rule sets. It enables precise and reliable identification of recurring payments, even with minor deviations or new payment providers.

Rule-Based Systems: The Limitations of Traditional Approaches

The cash flow analysis in our budget planner distinguishes regular from irregular transactions. Frequent but irregular transactions are ignored, as are unusually large transactions that do not reflect regular spending behavior.

Identifying regularly recurring transactions is technically far from trivial. We, too, faced the challenge of clearly identifying recurring payments in the bank transaction history. The reason: the rule-based systems we used previously rely on fixed, explicitly defined rules that match payments against specific criteria.

An Example of Rule-Based Identification of a Netflix Subscription

The rules:

  • Amount: Each payment must be exactly €15.99.
  • Frequency: Payments must be monthly, on the first working day of the month.
  • Recipient: The recipient must be “Netflix”.
  • Currency: The currency must be Euro.

The system searches the transaction history for payments that meet all of the above rules. If a payment meets every criterion, it is identified as a Netflix subscription.

The problem: This method is very rigid. It only works if the payments match the defined rules exactly. Even small deviations, such as a fee increase or a late payment, cause transactions to go unrecognized. In addition, every subscription and payment provider needs its own set of rules, all of which must be created and maintained. Whenever payment modalities change (for example, a price increase or a new payment date), the affected rules must be adjusted.

Such a rule-based system is not flexible enough. It cannot handle the variety of payment methods and providers, nor can it represent the many forms that payments take in practice. The search for exact matches is particularly problematic. Even small spelling deviations such as “Netflix” versus “Netflix Inc.” cause recurring payments to remain undetected. In short: identification based on rigid rules cannot capture the complexity and dynamics of modern financial transactions, and it cannot guarantee reliable recognition of recurring payments.
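To make this rigidity concrete, the Netflix rules above can be written as a small matcher. This is only an illustrative sketch: the `Transaction` fields, the function names, and the sample payments are assumptions made for this example, not wealthAPI's former rule engine, and public holidays are ignored when determining the first working day.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Transaction:
    # Hypothetical minimal transaction record for this example.
    recipient: str
    amount: float
    currency: str
    booking_date: date


def is_first_working_day(d: date) -> bool:
    """True if d is the first Monday-to-Friday day of its month (holidays ignored)."""
    first = date(d.year, d.month, 1)
    # If the 1st is a Saturday (5) or Sunday (6), shift to the following Monday.
    offset = {5: 2, 6: 1}.get(first.weekday(), 0)
    return d.day == 1 + offset


def is_netflix_subscription(tx: Transaction) -> bool:
    """Rigid rule set: every criterion has to match exactly."""
    return (
        tx.recipient == "Netflix"
        and tx.amount == 15.99
        and tx.currency == "EUR"
        and is_first_working_day(tx.booking_date)
    )


# Sample payments (invented for illustration).
exact = Transaction("Netflix", 15.99, "EUR", date(2024, 3, 1))
renamed = Transaction("Netflix Inc.", 15.99, "EUR", date(2024, 3, 1))
price_increase = Transaction("Netflix", 17.99, "EUR", date(2024, 3, 1))
late_payment = Transaction("Netflix", 15.99, "EUR", date(2024, 3, 4))

for tx in (exact, renamed, price_increase, late_payment):
    print(tx.recipient, tx.amount, tx.booking_date, is_netflix_subscription(tx))
```

Only the first payment passes. The renamed recipient, the price increase, and the late booking each fail a single rule and are therefore missed, although a human would recognize all four as the same subscription.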

Vectorization: Transactions as Data Points

For us at wealthAPI, this is an unacceptable situation. That’s why we have developed a solution that uses AI to convert raw transaction data into usable insights. To do this, our system embeds transactions as vectors and groups them into recurring payment patterns.

Imagine each transaction as a point in a multidimensional space. This space has many dimensions: for example, for the amount, for the merchant, and for the date. The more similar two transactions are, the closer they are in this space. Our system converts these transactions into so-called vectors. A vector corresponds to a numeric code that contains all relevant information about a transaction. These vectors allow us to calculate similarities between transactions mathematically. Vectorization is comparable to creating a digital fingerprint for each transaction.
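The idea of "closeness" between transactions can be sketched with a deliberately simple, hand-built vector and cosine similarity. The feature choice here (letter counts of the merchant name, a scaled amount, a scaled day of the month) and the function names are assumptions for illustration only; a production system would more likely use a learned embedding, and the text above does not specify which vectorization function wealthAPI uses.

```python
import math
import string
from datetime import date
from typing import List


def toy_vector(amount: float, merchant: str, booking_date: date) -> List[float]:
    """Hypothetical hand-built features, not a learned embedding."""
    # 26 dimensions: how often each letter occurs in the merchant name.
    letters = [float(merchant.lower().count(c)) for c in string.ascii_lowercase]
    # Two more dimensions: amount and day of month, scaled to a small range.
    return letters + [amount / 100.0, booking_date.day / 31.0]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors: close to 1.0 means very similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


netflix_march = toy_vector(15.99, "Netflix", date(2024, 3, 1))
netflix_april = toy_vector(17.99, "Netflix Inc.", date(2024, 4, 3))
supermarket = toy_vector(43.20, "Rewe", date(2024, 3, 14))

sim_same = cosine_similarity(netflix_march, netflix_april)
sim_other = cosine_similarity(netflix_march, supermarket)
print(f"Netflix vs. Netflix Inc.: {sim_same:.2f}")
print(f"Netflix vs. supermarket:  {sim_other:.2f}")
```

Despite the different spelling, amount, and booking day, the two Netflix payments score far higher than the unrelated supermarket purchase, which is exactly the tolerance that exact-match rules lack.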

With the vector-based system, we achieve very high accuracy, especially for recurring payments whose features differ only slightly. From subscriptions to insurance payments, our system delivers reliable results. At the same time, it offers the speed and scalability that our partners need.

Astra DB: The Perfect Home for Our Vectors

To store and search these vectors efficiently, we use DataStax Astra DB. This database is specifically designed to process large amounts of data at high speed. It allows us to quickly find and group similar transactions. Astra DB is like a state-of-the-art library where every book (in our case: every vector) has a unique place and can be found quickly.

A Netflix subscription, for example, could be characterized by the following features:

  • Amount: €15.99
  • Merchant: Netflix
  • Frequency: Monthly
  • Payment date: Beginning of the month
  • Description: Streaming service

These features are converted into a vector. When a new transaction is received, its vector is compared with the vectors already stored. If a stored vector is highly similar, the new transaction is assigned to the corresponding cluster, for example, “Netflix subscription.”
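The assignment step can be sketched as a nearest-neighbor lookup with a similarity threshold. The example below keeps the stored vectors in a plain in-memory dictionary as a stand-in for the database; the three-dimensional vectors, the cluster labels, and the threshold of 0.95 are invented for illustration and would need tuning on real data.

```python
import math
from typing import Dict, List, Optional

SIMILARITY_THRESHOLD = 0.95  # hypothetical cut-off, not a value from the article


def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def assign_cluster(
    new_vector: List[float],
    clusters: Dict[str, List[float]],
    threshold: float = SIMILARITY_THRESHOLD,
) -> Optional[str]:
    """Return the label of the most similar stored vector, or None if none is close enough."""
    best_label = None
    best_score = threshold
    for label, stored_vector in clusters.items():
        score = cosine_similarity(new_vector, stored_vector)
        if score >= best_score:
            best_label, best_score = label, score
    return best_label


# Toy stand-in for the vectors already stored in the database.
stored = {
    "Netflix subscription": [0.12, 0.65, 0.78],
    "Gym membership": [0.90, 0.10, 0.20],
}

print(assign_cluster([0.13, 0.66, 0.75], stored))  # close to the Netflix vector
print(assign_cluster([0.50, 0.50, 0.10], stored))  # not close to anything stored
```

A transaction that falls below the threshold for every stored vector is left unassigned in this sketch; it could later seed a new cluster once similar transactions arrive.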

And this is how it works:

  • Data capture: When bank transactions are received, for example, through a data import, the wealthAPI backend publishes them in a message queue for asynchronous processing.
  • Embedding creation: Each transaction is then converted into a numerical vector (for example, [0.12, 0.65, 0.78, …, 0.23]) using the vectorization function.
  • Vector storage and search in the database: The embeddings are then stored in the database. Vector similarity searches allow the system to find and cluster similar transactions.
  • Regularity analysis: The clusters are analyzed to identify recurring payments and classify them into contracts such as “Netflix – streaming service – monthly” or “foreign health insurance – health – annually”.
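The last step, the regularity analysis, can be illustrated by looking at the gaps between the booking dates within one cluster. This is a minimal sketch under stated assumptions: the cadence table, its tolerances, and the minimum of three bookings are hypothetical choices for this example, not the criteria wealthAPI applies.

```python
from datetime import date
from typing import List, Optional

# (label, expected gap in days, tolerance in days); values are assumptions.
CADENCES = (
    ("weekly", 7, 1),
    ("monthly", 30, 4),
    ("annually", 365, 7),
)


def classify_cadence(booking_dates: List[date]) -> Optional[str]:
    """Return a cadence label if all gaps between bookings fit one cadence, else None."""
    if len(booking_dates) < 3:
        return None  # too few bookings to call the pattern recurring
    ordered = sorted(booking_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    for label, expected, tolerance in CADENCES:
        if all(abs(gap - expected) <= tolerance for gap in gaps):
            return label
    return None


netflix_cluster = [date(2024, 1, 2), date(2024, 2, 1), date(2024, 3, 1), date(2024, 4, 2)]
insurance_cluster = [date(2022, 5, 10), date(2023, 5, 10), date(2024, 5, 10)]
bakery_cluster = [date(2024, 1, 3), date(2024, 1, 5), date(2024, 1, 20), date(2024, 2, 27)]

print(classify_cadence(netflix_cluster))
print(classify_cadence(insurance_cluster))
print(classify_cadence(bakery_cluster))
```

The tolerance lets bookings that slip by a few days around weekends still count as monthly, while frequent but irregular purchases such as the bakery visits are not classified as a contract.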

Astra DB ensures that the entire process is scalable and responsive, even with large data volumes. Strict data security measures are also observed. This ensures that end users and their transactions remain anonymous and are protected from external access.

Handling Large Data Volumes

We process thousands of transactions every day. To meet this high data load, we needed a database that is both fast and flexible. Astra DB, based on Apache Cassandra, fully meets these requirements and can also be integrated into AI workflows. Astra DB enables us to process large amounts of data while performing highly accurate similarity searches.

We and our partners are not the only ones who benefit; end users do as well. They can use a search function to find specific expenses without having to know exact categories. For example, if a user enters the word “health,” all transactions related to health expenses are displayed, such as insurance premiums or gym memberships.

This is possible because our system also converts the user’s search query into a vector and then searches the database for the most similar vectors. The result is an intuitive search function that answers semantic queries quickly and accurately.
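The search flow can be sketched as ranking stored transaction vectors by their similarity to the query vector and returning the top matches. Everything concrete below is assumed for illustration: the three vector dimensions (loosely "health", "entertainment", "groceries"), the hand-assigned vectors, and the lookup table standing in for a real embedding model.

```python
import math
from typing import Dict, List

# Hand-assigned toy vectors; a real system would obtain these from an embedding model.
TRANSACTION_VECTORS: Dict[str, List[float]] = {
    "Health insurance premium": [0.9, 0.0, 0.1],
    "FitGym membership": [0.7, 0.3, 0.0],
    "Netflix": [0.0, 1.0, 0.0],
    "Supermarket": [0.1, 0.0, 0.9],
}

# Hypothetical stand-in for embedding the user's search text.
QUERY_VECTORS: Dict[str, List[float]] = {"health": [1.0, 0.0, 0.0]}


def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def semantic_search(query: str, top_k: int = 2) -> List[str]:
    """Return the top_k transaction labels most similar to the query vector."""
    query_vector = QUERY_VECTORS[query]
    ranked = sorted(
        TRANSACTION_VECTORS,
        key=lambda label: cosine_similarity(query_vector, TRANSACTION_VECTORS[label]),
        reverse=True,
    )
    return ranked[:top_k]


print(semantic_search("health"))
```

The gym membership is returned for “health” even though the word never appears in its label, because the match happens between vectors rather than between strings.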

Conclusion: With Vectors to Intelligent Cash Flow Analysis

The vectorization of transaction data has taken the cash flow analysis of our budget planner to a new level. By converting transactions into vectors, we can calculate similarities between payments and identify spending patterns.

Our system, based on the powerful database Astra DB, offers numerous advantages:

  • High accuracy: Even with small deviations in the transaction data, recurring payments are reliably recognized.
  • Scalability: Our system can process large amounts of data efficiently and is equipped for future growth.
  • Flexibility: The system automatically adapts to new payment providers and models.
  • Intuitive search: End users can search for terms and get precise results without having to know exact categories.

With our solution, we enable consumers to better understand and manage their finances. The precise identification of recurring payments is an important step towards intelligent, personalized financial planning, and the beginning of wealth management that stands on solid ground.