Access to accurate and reliable data is paramount for success in the financial landscape. Zippi, a pioneering Brazilian fintech company, understands this principle all too well. It provides working capital to micro and small entrepreneurs, a segment that is generally overlooked by financial institutions.
Zippi uses a powerful machine learning (ML) platform to calculate each applicant's creditworthiness and loan amount. Using the applicant's financial data as well as third-party data, Zippi leverages a constantly improving risk analysis model to predict a loan's potential risk and calculate the ideal eligibility amount based on this score.
Hence, data is a crucial element of Zippi’s business model, as it powers the ML models that lie underneath the platform to make accurate predictions and analyses.
However, ensuring the integrity and normalization of this critical data presented a significant challenge for Zippi. Let us understand how Zippi leveraged Hevo pipelines to overcome these challenges and achieve the following:
Streamlined data mapping and normalization for improved model training and efficiency.
Reduced overall default rates through more accurate risk assessment.
Enhanced data management processes for their data science teams.
Focus on Data Integrity and Normalization
Prior to implementing Hevo, Zippi used external data sources for their credit calculations, which were stored in Amazon S3 file storage. This data was used for ML model training, while customer interviews helped them improve the platform and serve the customers better.
Since a customer's creditworthiness was calculated based on this information, it was essential to feed this information correctly into their ML platform.
Unfortunately, matching and getting this diverse information into a common format was proving to be difficult. Some of the reasons for this include:
Data Residing in Silos: For starters, their data, which came from customer interviews and third-party sources, resided in silos. Zippi conducted customer interviews using Typeform, a popular online form builder. This valuable information was stored in a central data storage solution, Amazon Redshift. They also brought in financial information from third party vendors, for which data integration was a necessity.
Fragmented Data: The data obtained from external vendors to supplement their internal financial data often lacked consistency in format and structure, further complicating data integration. These inconsistencies or data anomalies significantly impacted the performance of Zippi's ML model, leading to delays in critical decision-making.
Manual & Complicated Normalization Process: A simple JSON normalization process would often take days, and their existing data pipeline solution, DMS, was not well-suited for handling the complex normalization requirements of JSON data, a standard format for semi-structured data.
Data Security & Privacy: The lack of a centralized platform also raised concerns about data security. Since financial data is highly sensitive and sourced information from customer interviews and third-party insights, it needed to be protected from the time it was collected till the processing workflow.
DMS was the first data movement platform used by Zippi before Hevo was onboarded. However, the platform chosen had several challenges in managing the complexities of Zippi’s growing data volume.
As our business expanded, our data storage requirements got more complex. Plus, DMS could not handle the complex normalization requirements of our data, which was mostly in JSON format. We needed a solution to streamline JSON data normalization, accommodate growing data volumes, and offer this at a competitive price point.
Powering Informed Decisions With Hevo
In search of a solution to overcome these data hurdles, Zippi discovered Hevo Data. With its zero-maintenance data pipeline platform designed to simplify the process of moving data from various sources to a central data warehouse, Zippi was able to resolve several data issues.
Here's how Hevo Data benefits various teams at Zippi:
Growth Team: Clean and readily available data allows the growth team to accurately calculate Customer Acquisition Cost (CAC), a crucial business metric. Hevo Data empowers them to monitor and implement initiatives to keep CAC under control.
Risk Team: The risk team leverages Hevo Data to optimize and analyze the decision-making process for their risk analysis model. Having access to high-quality, normalized data empowers them to make more informed risk assessments.
Data Science Team: Hevo Data simplifies data understanding, particularly for JSON files purchased from external sources. This allows the data science team to focus on feature engineering for the product based on this valuable information.
Finance Team: Hevo Data ensures the finance team has easy access to customer payment and account information. This real-time visibility enables them to stay on top of financial activities and make data-driven decisions.
Today, Hevo Data is being used by various teams across the organization, primarily helping them improve data normalization and integration. Here are some of the key benefits they achieved:
Unified Data Platform: It helped Zippi gain a unified data platform, seamlessly integrating customer interview data from Typeform alongside financial data and information retrieved from external vendors. This unified platform gave Zippi a holistic view of each applicant, enabling a more comprehensive creditworthiness assessment.
Improved Data Ingestion: Integrating data from diverse sources often presents challenges due to variations in data structure. Hevo was able to tackle this hurdle with near real-time data ingestion. This allowed Zippi to get fresh and updated data for financial risk analysis and train the ML model, helping them get more accurate results.
Hybrid Schema Mapping: Hevo enables hybrid schema mapping, allowing Zippi to manually map their source files to the relevant files in Redshift. Thus, they can easily switch to manual schema mapping to rectify any inconsistencies, ensuring data integrity even in the face of unexpected issues. This hybrid approach empowers Zippi to leverage the efficiency of automation while maintaining the ability for human intervention when necessary.
Enhanced Normalization Process: Hevo Data's built-in data transformation capabilities automated the process of normalizing data from various sources. This includes efficiently handling JSON data, a pain point for Zippi. By standardizing data formats, Hevo Data significantly reduced data anomalies and improved the overall quality of data used by Zippi's machine learning model.
Cost Efficiency: Hevo Data empowers Zippi to optimize data warehouse costs by offering flexible data ingestion and loading latency options. This allows them to tailor the data pipeline to their specific needs, balancing real-time updates with cost-efficiency. The combination of speed, accuracy, and cost-control empowers Zippi to leverage the power of their data for optimal decision-making and achieve a significant competitive advantage.
Scale As You Grow: The crucial part of Hevo’s data platform is its ability to scale automatically to accommodate growing data volumes. This fit in with Zippi’s need for a cost-effective data storage and management solution that could also scale up as its core business expanded.
Improved Data Quality For Data-Driven Risk Assessment
Zippi streamlined and normalized its customer data into a single platform, combining data from multiple sources, including customer interviews and third-party vendors. The fragmented data landscape that previously hampered their creditworthiness assessments was replaced with a unified platform.
Hevo streamlined the process of collecting data from diverse sources, including customer interviews and external vendors. This not only provided a more holistic view of each applicant but also ensured the quality of the data feeding into their ML model.
The impact of Hevo’s end-to-end zero-maintenance data pipeline platform was clear:
Improved Data Quality and Model Efficiency: Hevo's automated data normalization processes significantly reduced data anomalies and inconsistencies. This led to cleaner, higher-quality data feeding into Zippi's ML model. As a result, the model was more refined, leading to more informed creditworthiness assessments.
Reduced Default Rates: With a more comprehensive view of applicants, Zippi was able to make better and faster lending decisions. This resulted in a measurable decrease in loan defaults, improving Zippi's financial performance.
Enhanced Data Science Productivity: By automating data integration and normalization tasks, Hevo freed up valuable time for Zippi's data science team. This allowed them to focus their efforts on more strategic initiatives, such as model development and refinement.
Enhanced Customer Experience: By efficiently evaluating creditworthiness, Zippi can offer faster loan approvals and a smoother customer experience for qualified applicants.
Hevo helped us streamline the data mapping and normalization process before it was used in the model. The manual dependency in the data normalization process would often take days to complete but can now be done in 15 minutes.
Excited to see Hevo in action and understand how a modern data stack can help your business grow? Sign up for our 14-day free trial or register for a personalized demo with our product expert.