Introducing Replay Queue
As a data engineer, you know that many things can go wrong when streaming data from a source to a destination. Sometimes data cannot be pushed to the destination warehouse: a code snippet throws an error, a data type does not match between the source and destination tables, or the destination is unreachable. To ensure no data is lost, engineers often hand-code long scripts to handle these exceptions one at a time. As new sources are added or the schema of incoming events changes, handling these exceptions becomes an ongoing pain point and the risk of data loss grows.
It becomes imperative to have an easy, foolproof mechanism to handle these exceptions and prevent any data loss.
Hevo’s Replay Queue eliminates the need to handle errors manually by holding all failed events that could not be pushed to the destination. You can rely on the Replay Queue to ensure no data is lost from your pipeline. It comes with a simple visual interface that lets you take timely action on the errors detected.
Users receive real-time notifications whenever events enter the Replay Queue. The platform shows additional information such as the error location, the reason for the exception, the number of events, the stage at which the error occurred, and sample event data, so that users have everything they need when they are ready to act. The user can choose to ignore the error, or fix it and “replay” the events into the pipeline.
Hevo scans the Replay Queue every few minutes to check whether errors have been fixed. When it detects a fix, it automatically ingests the affected events back into the pipeline, making exception handling a walk in the park for users.
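To make the idea concrete, here is a minimal Python sketch of the general pattern described above: failed events are parked together with the stage and reason of the failure, and a later pass retries them once the cause has been fixed. This is not Hevo's implementation. The names `FailedEvent`, `ReplayQueue`, `load_row`, and `accepted_types` are hypothetical and exist only for illustration, and the retry pass is triggered by hand here rather than on a timer.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class FailedEvent:
    """An event that could not be loaded, plus the context needed to act on it."""
    payload: Dict[str, Any]
    stage: str   # pipeline stage where the failure occurred
    reason: str  # message of the exception that caused the failure


@dataclass
class ReplayQueue:
    """Parks failed events and retries them on demand (hypothetical sketch)."""
    load: Callable[[Dict[str, Any]], None]  # destination loader; raises on failure
    pending: List[FailedEvent] = field(default_factory=list)

    def ingest(self, payload: Dict[str, Any]) -> None:
        """Try to load an event; park it instead of dropping it on failure."""
        try:
            self.load(payload)
        except Exception as exc:
            self.pending.append(FailedEvent(payload, stage="load", reason=str(exc)))

    def replay(self) -> int:
        """Retry every parked event and return how many now succeed."""
        still_failing: List[FailedEvent] = []
        replayed = 0
        for event in self.pending:
            try:
                self.load(event.payload)
                replayed += 1
            except Exception as exc:
                event.reason = str(exc)  # keep the most recent failure reason
                still_failing.append(event)
        self.pending = still_failing
        return replayed


# Toy destination that rejects rows whose "amount" has an unexpected type.
accepted_types = [int]
destination: List[Dict[str, Any]] = []


def load_row(row: Dict[str, Any]) -> None:
    if not isinstance(row["amount"], tuple(accepted_types)):
        raise TypeError(f"unexpected type {type(row['amount']).__name__} for amount")
    destination.append(row)


queue = ReplayQueue(load=load_row)
queue.ingest({"id": 1, "amount": 10})      # loads normally
queue.ingest({"id": 2, "amount": "10.5"})  # type mismatch, so the event is parked
parked_before_fix = len(queue.pending)

accepted_types.append(str)  # simulate fixing the destination mapping
replayed = queue.replay()   # the parked event is loaded on the retry pass
```

In this sketch the second event is never dropped: it waits in `queue.pending` with its failure reason until the type mismatch is resolved, and the next `replay()` call delivers it. A production system would run that pass on a schedule and persist the parked events durably.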
Sample Event Data for Quick Review
With the Replay Queue, we want to give our users a reliable, fault-tolerant data pipeline with zero data loss. What are your thoughts on the Replay Queue?