A cloud integration perspective
I got an opportunity to attend the Snowflake Summit (June 3-6) with the SnapLogic team. Snowflake has been one of the two dominant cloud data warehouse platforms in the market, along with Amazon Redshift, and the event showcased that dominance. I saw a diverse set of Snowflake customers and prospects, from retail to biosciences, from Fortune 100 enterprises to small startups. The entire Snowflake ecosystem was also on display at the conference. The expo floor – ‘Basecamp’ – was filled with booths from a number of Snowflake partners, including SnapLogic.
At the event, Snowflake announced some interesting new features, such as materialized views and cross-region, cross-cloud database replication and failover. You can read more about these features here. But my biggest insights came from the various breakout sessions at the Snowflake Summit. I learned about the challenges Snowflake customers are facing and the strategies they’re adopting to tackle those challenges.
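To give a flavor of the materialized views feature, here is a minimal sketch using the snowflake-connector-python package. The account, credentials, and table names are placeholders of my own, not anything from the announcement, and materialized views require an appropriate Snowflake edition:

```python
# Minimal sketch: creating and querying a Snowflake materialized view
# from Python. Account, credentials, and table names are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    user="MY_USER",        # placeholder credentials
    password="MY_PASSWORD",
    account="my_account",
    warehouse="MY_WH",
    database="MY_DB",
    schema="PUBLIC",
)

try:
    cur = conn.cursor()
    # A materialized view precomputes and stores the query result;
    # Snowflake keeps it up to date as the base table changes.
    cur.execute("""
        CREATE MATERIALIZED VIEW IF NOT EXISTS daily_clicks AS
        SELECT page_id, DATE_TRUNC('day', event_ts) AS day, COUNT(*) AS clicks
        FROM clickstream_events
        GROUP BY page_id, DATE_TRUNC('day', event_ts)
    """)
    # Dashboards can now query the view instead of rescanning raw events.
    for row in cur.execute("SELECT * FROM daily_clicks LIMIT 10"):
        print(row)
finally:
    conn.close()
```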
Here are some common challenges IT teams faced before adopting Snowflake.
- Performance issues on Monday mornings, when everyone returns to work and hits refresh on their dashboards
- High costs from provisioning compute for peak loads on other data warehouses that lack auto-elasticity
- Data explosion: being forced to throw out data such as clickstream, carrier logistics, and CRM data due to the limited storage of on-premises appliances
- Time spent managing data centers and servers, which prevents organizations from adapting to market shifts and stifles innovation
- Fragile, hard-to-maintain ETL/ELT processes built from hand-written scripts
Here are three major themes from the solutions customers adopted to deal with the above challenges:
1. Multi-tool strategy
Most companies have a multi-tool strategy when it comes to data integration. The software as a service (SaaS) consumption model enables such a strategy: organizations can evaluate multiple software products in a short period of time and then build a process or workflow that ties those tools together. It is easy to adopt multiple tools as long as the sprawl stays manageable.
However, as business needs evolve and your tools ecosystem changes, you have to periodically – roughly every 18 to 24 months – reevaluate the set of tools you are using. I noticed that while a few organizations constantly reevaluated their tool sets, others kept using suboptimal tools. Every organization has inertia when it comes to enterprise applications and business processes. Hence, it is important to choose a tool based not just on current needs but also on future ones.
2. Faster time-to-value
Startups face constant, intense pressure to stay competitive, stay relevant, and grow their customer base. While startups want tools that give them an edge, they can’t afford long evaluation cycles when choosing a new software vendor. They are looking for tools that make them productive and are also easy to evaluate.
A number of Snowflake customers picked a tool for their Snowflake ecosystem to solve a specific need, such as data loading or data replication. A point-to-point tool like this meets their immediate need while staying within their limited budgets.
So, to attract startups, it is important that a SaaS product provides:
- A free trial, possibly with resources for a self-directed proof of concept (POC)
- A seamless onboarding process for new users
- Support and a user community for trial users so they can quickly find answers
- Quick-start solutions on public cloud platforms
3. Reliance on scripting
SQL and Python scripting remain the dominant approach for data warehouse teams. In fact, some teams have standardized on SQL, with analysts writing the queries and owning the data transformation logic.
Many SaaS applications in the integration and analytics space have moved toward low-to-no-code platforms that enable business users to build integrations or to prepare data for dashboards. However, when it comes to preparing and loading data into a data warehouse such as Snowflake, data engineers still run the show.
They rely heavily on their coding and scripting skills. That said, I also saw signs that they will adopt automated tools and pre-built connectors that free them to focus on more strategic work: making sure the end-user experience is optimal, the data transformation logic is accurate, and the workflows are performant.
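As an illustration of the kind of script these teams maintain by hand, here is a minimal load-and-transform sketch, again using the snowflake-connector-python package. The connection details, file path, and table names are placeholders I made up, and the raw table is assumed to already exist:

```python
# Minimal sketch of a hand-written load-and-transform script of the
# kind data engineers maintain; all names and paths are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    user="MY_USER", password="MY_PASSWORD", account="my_account",
    warehouse="LOAD_WH", database="MY_DB", schema="STAGING",
)
cur = conn.cursor()
try:
    # Upload a local CSV extract to the table's internal stage
    # (assumes the raw_orders table already exists).
    cur.execute("PUT file:///data/exports/orders.csv @%raw_orders")
    # Bulk-load the staged file into the raw table.
    cur.execute("""
        COPY INTO raw_orders
        FROM @%raw_orders
        FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
    """)
    # Transformation logic the team owns in SQL: reshape raw rows
    # into an analytics-ready table.
    cur.execute("""
        CREATE OR REPLACE TABLE daily_orders AS
        SELECT order_date, region, SUM(amount) AS total_amount
        FROM raw_orders
        GROUP BY order_date, region
    """)
finally:
    cur.close()
    conn.close()
```

Every step here is code the team has to test, monitor, and maintain; pre-built connectors take over exactly this kind of plumbing.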
As more and more organizations take steps to collect data and gain insights from it, the importance of data warehouses, data engineering, and data integration tools will continue to grow. The SnapLogic Intelligent Integration Platform – a unified solution that empowers data engineers, business analysts, and integration specialists to build end-to-end data and application integrations – is well positioned to capture this growth opportunity.