
Snowflake External Tables
Snowflake External Tables
All data in Snowflake is maintained in the Databases. External Tables provides a faster accessibility to your external data without building complex data loading pipeline
Design Process: My Role and Responsibilities
As an only designer working with Product Manager I was responsible to bring this feature into the UI and associated workflow of it. In order to do that I was involved early on in the conceptualization phase. In my initial research I talked to several customers to understand the user’s pain point and studied the competitor’s landscape. Also, there was a need to understand the existing database object model for Snowflake. After the initial research I dived into design and taking it to the implementation phase working closely with the engineers to follow-through the release of this feature
Timeline: Dec 2018 - June 2019 (tentative first release), Platform: Web, Team: Me (Designer), Product Manager, 2 Backend Engineer and 1 Front end Engineer

The Problems We are Trying to Solve
Everyday a large amount of raw data is consumed by bigger Snowflake customers like Capital One, Netflix or Expedia. They have their data sitting in the external storage locations such as AWS S3 or Microsoft Azure They use Snowflake for faster compute services but want to avoid building a complex data ingestion pipeline to load the data in the traditional database tables.
Customer Use Case 1: Inefficient to copy and load the large volume of data into Snowflake and keep them in Sync, which involves TIME and COST in terms of storage and compute
Customer Use Case 2: Their BI analysts need the external data to generate queries, reports fairly quickly and may need to generate analysis of logs on data lake (S3) and querying them only during debugging
How External Tables Help to Solve the Problem?
External tables will be the solution to let the users access the massive amount of oncoming data from their external location by behaving the same way as the traditional tables. But unlike regular tables, they will have:
NO STORAGE uses
With Auto-Refresh, they will get instant update of new data
No need to rely on the Database Admin to build the Data Loading Pipeline and thus saves time
Challenges with the current Data Catalog Framework
The current Databases UI is incomplete; we provide a list or master view for each objects with only the output of the SQL show command which often excludes useful meta data or qualitative data that helps characterize the object such as its size, frequency of access etc
There is no visibility of the analytics information such as aggregated Metrics data, monitor an object’s performance and notification if some anomaly happens. Many times our users needs to buy external software or build their own script to develop the monitoring and alerting experience. A lot of times they refuses to simply use our UI for these reasons
Design Process : Setting Goals
Near Term : Achievable within the release timeline
1. Surface the External Table in the UI and its relative workflow.
2. Improve the presentation of the content for the Databases and the Object Details page. Instead of only showing the metadata of the table, what are the other most relevant analytics information that we can show to the detailed view to make it more useful?
3. Add improved visuals to the UI
Long term : Beyond Scope of this project, currently ongoing and WIP
1. Work on improving the information architecture of the overall Databases and object exploration workflow
2. Consider improving the layout and navigation structure
3. Consider coming up with a template for Master-Detailed view with other databases objects as well. Scope: Snowpipe (Snowflake’s continuous data ingestion service) and Regular Tables
Discovery & Definition
Qualitative Research: Interviewed 5 different customers and SEs with a set of questions to understand the app usage to figure out the mental model of the users
Summary of User Interviews:
Our customers are looking for simple solutions in the following:
Navigation through the database namespace as you select an item in the master list to drill down to the next level in the hierarchy
Searching the catalog for a single object or isolating a group of objects based on a filter criteria (from a list)
Object Management: Taking actions (including batch) on these objects. Examples:
Edit Object metadata such as name, comment
Edit Privileges
Drop, Alter Object
Viewing metadata, historical analysis metrics, monitoring metrics and perhaps even usage and other interesting analytics for one or more objects
Snowflake Data Object Model
To make an informed decision about where the External Table will live, I studied the Database objects and put together a tree view to better visualize what each object is capable of coming up with a proper grouping

Design Iterations: Initial Sketches

Final Mocks
Keeping the short term goal in mind, we were able to accomplish the following in the UI:
Improvement of the Data Grid with inline, intuitive actions
Grouping of actions for ease of accessibility
A composite detailed view (deviated from showing only SHOW COMMAND information) with Meaningful Metrics and graph visualization and a tabbed area for preview data and metadata information which was otherwise not surfaced.

Details of the Visualization Elements

Create an External Table from the UI Flow

Future Improvement: Phase 2 Design
We consider tackling the overall Information Architecture of the Data Platforms area. Also, at the very least, keeping the Architecture same and organize Master-Detail concept and make the detailed views with richer information, like Monitoring or Analytics experiences. Below are the example from some of the process and work in progress areas.

Few iterations considered with different navigation and layout styles
Settled Direction for the Detailed view of External Tables

Further improvement after feedback
