When you load data into your Data Warehouse for analysis, it is often not formatted for, or fully aligned with, the Data Warehouse format. The same data could be recorded in different formats in different sources. You may also need to add a few fields before loading the data.
Before loading the data to the Data Warehouse, it’s good practice to perform some Data Formatting to clean it up. If you move Unformatted Data to the Warehouse, it adds computation at the Warehouse end, which slows down the Analytics Process and increases your Data Warehouse cost.
The biggest challenge with Data Formatting is prioritizing Engineering Bandwidth for it. When that bandwidth is not available, the result is badly organized data in the Data Warehouse and longer Analytics processes.
We are excited to introduce you to our No-Code On-The-Fly Formatter. Using this, you can format your data on the fly in your Data Pipeline without any coding!
Format your Data in your Data Pipelines without Engineering Help
Hevo’s No-Code On-The-Fly Formatter has multiple pre-built functions available for formatting the data.
You can set up the Data Formatting in minutes using the drag and drop interface by simply selecting the function you want to apply. You can even set up a sequence of functions to perform multiple formatting steps on an object.
Our customers love the power of setting up Data Formatting with ease and without writing any code. They are able to clean and organize their data for analysis without Engineering Bandwidth. They have set up multiple functions using this feature!
Here is What a Few of Our Customers Say about Our No-Code On-The-Fly Formatter
For a non-python developer like myself, the drag and drop interface is a great way to apply data transformations.
Samvel Khachatryan, Head of Business Intelligence, Omnicore
The drag and drop transformations are a handy way to format data before loading it into the warehouse.
Javier Barragan, Co-founder, WellBefore
When Should You Use the No-Code On-The-Fly Formatter?
- Adding a New Field: You can enrich your data by adding new fields. For example, if you have different datasets and respective Pipelines for each product, you can add the product name for each dataset while loading it into the Data Warehouse.
- Parsing Strings: You may want to remove unwanted characters from fields like order_id and customer_id to make them analytics-ready.
- Dropping a Field: There could be a few events that may not be needed in your Data Warehouse. You can filter out these events easily. For example, if you have an e-commerce store, you can drop events related to cancelled or incomplete orders from your order dataset.
- Cleaning: You can group multiple event values into a single value. For a field like product_name, you can label a certain group of product_name values as ‘Others’. You can also set a field to NULL wherever its value is only zeroes or spaces.
- Formatting: Fields like date or cost could be stored in different formats in different datasets. You can convert them to a single format that matches your Data Warehouse.
- Masking: Data Security is important for your customers. You don’t want customers’ personal information to be accessible to all business users. You can mask fields like phone number and email ID while loading the data.
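For readers who think in code, the six use cases above can be sketched in a few lines of plain Python. This is only an illustration of the logic, not Hevo's formatter or its API: the function format_event, the field names (status, coupon, phone, source_product), the MAIN_PRODUCTS set, and the DD/MM/YYYY input format are all assumptions made up for this example.

```python
import re
from datetime import datetime

# Hypothetical values for illustration only; none of these names come from Hevo.
MAIN_PRODUCTS = {"laptop", "phone"}


def format_event(event, source_product="laptop-store"):
    """Apply the six formatting steps to one event dict; return None to drop it."""
    # Dropping: skip events for cancelled or incomplete orders.
    if event.get("status") in {"cancelled", "incomplete"}:
        return None

    out = dict(event)  # copy so the caller's event is left untouched

    # Adding a new field: tag the event with the product its Pipeline serves.
    out["source_product"] = source_product

    # Parsing strings: keep only the digits of order_id ("ORD-1042" becomes "1042").
    out["order_id"] = re.sub(r"\D", "", out["order_id"])

    # Cleaning: group uncommon product names as 'Others'; blank values become None.
    if out["product_name"] not in MAIN_PRODUCTS:
        out["product_name"] = "Others"
    if (out.get("coupon") or "").strip() == "":
        out["coupon"] = None

    # Formatting: convert an assumed DD/MM/YYYY date to ISO 8601 (YYYY-MM-DD).
    parsed = datetime.strptime(out["order_date"], "%d/%m/%Y")
    out["order_date"] = parsed.date().isoformat()

    # Masking: hide all but the last four characters of the phone number.
    phone = out["phone"]
    out["phone"] = "*" * (len(phone) - 4) + phone[-4:]

    return out
```

Each step here corresponds to one function you would pick in the drag and drop interface, applied in sequence.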
Case Story on when to Apply No-Code On-The-Fly Formatter
Consider a global e-commerce company that has multiple versions of its website, such as .com and .au, one for each location. These sites appear in Google Analytics as different properties. The company wants to integrate data from all website properties and regions into its Google BigQuery account, then build an Overview Dashboard using that data.
For this, they will set up multiple Data Pipelines to load data from Google Analytics to Google BigQuery, with one Pipeline per property.
Now, in the Overview Dashboard, they also need the country name as a field. For this, they can set up a function that adds a new field from the drag-and-drop interface. This field can be ‘Country’. The value of the field can be populated based on the source property of Google Analytics for that Pipeline.
All this can be done without writing a single line of code!
Granular Control over Data Formatting
We have built a powerful and flexible On-The-Fly Formatter that helps our customers to build multiple Data Formatting options with full flexibility. If you’re more comfortable with Python, then we also have a Python-based On-The-Fly Formatter interface.
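If you take the Python route, the ‘Country’ field from the case story might be expressed along these lines. Treat it as a minimal sketch in plain Python rather than Hevo's actual interface: the add_country function, the PROPERTY_TO_COUNTRY mapping, and the property names in it are hypothetical and exist only for this example.

```python
# Hypothetical mapping from a Pipeline's Google Analytics property to a country.
# The property names below are made up for this example.
PROPERTY_TO_COUNTRY = {
    "shop.example.com": "United States",
    "shop.example.au": "Australia",
}


def add_country(event, source_property):
    """Return a copy of the event with a 'Country' field for the dashboard."""
    enriched = dict(event)
    # Fall back to 'Unknown' so an unmapped property never breaks the load.
    enriched["Country"] = PROPERTY_TO_COUNTRY.get(source_property, "Unknown")
    return enriched
```

Because each Pipeline reads from a single property, the source_property value would be fixed per Pipeline, so every event it loads gets the same country.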
Hevo also provides additional functionality for you to view the Data Formatting functions before publishing them. You can make sure that your formatting does not lead to errors in the Data Warehouse, thus not affecting Analytics. You can even revert if there are any errors or changes in your setup.
You can rest assured that all of this is covered by the following functionalities:
- Test and Deploy: Once you have written or modified the Data Formatting, you can test it on Sample Events. You can also test any Data Formatting functions to resolve Failed Events directly within the interface.
- Version History: You can restore any past deployed Data Formatting functions by selecting them from the version history.
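The idea behind testing on Sample Events can be sketched in plain Python as a dry run that applies a formatting function to a handful of events and collects any that fail, instead of stopping at the first error. This is an assumption-laden illustration, not how Hevo implements the feature: dry_run and cost_to_float are hypothetical names, and the cost format is invented for the example.

```python
def dry_run(transform, sample_events):
    """Run transform over sample events; return (results, failures)."""
    results, failures = [], []
    for event in sample_events:
        try:
            # Work on a copy so the sample itself is never modified.
            results.append(transform(dict(event)))
        except Exception as exc:  # broad on purpose: report every failure
            failures.append((event, repr(exc)))
    return results, failures


def cost_to_float(event):
    """Example formatting step: turn a cost like "1,299.50" into a float."""
    event["cost"] = float(event["cost"].replace(",", ""))
    return event
```

Calling dry_run(cost_to_float, samples) before deploying shows which events would fail, for example one whose cost is missing, so the function can be fixed first.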
Get Started with Hevo
Check out how easy and quick it is to build a pipeline with on-the-fly data formatting by signing up for a 14-day free trial. You can connect 100+ data sources (including 40+ free sources) to your Data Warehouse with On-the-Fly Data Formatting.
If you are a Hevo customer and you have to perform any cleaning or formatting on your data, then you can simply go to your respective Pipeline’s Transformation Section. You can set up data formatting with ease without writing any code. Learn more about how to use our drag-and-drop data formatter!