Snowflake 101: 5 Ways to Build a Secure Cloud

Today, Snowflake is one of the most popular platforms for enterprise data. The company started as a simple data warehouse platform a decade ago, but it has since evolved into an all-encompassing data cloud that supports a wide range of workloads, including the data lake.

More than 6,000 enterprises currently trust Snowflake to manage their data workloads and produce insights and applications for business growth. Together, they have more than 250 petabytes of data in the cloud, with more than 515 million computing workloads running every day.

At this scale, cybersecurity issues inevitably emerge. Snowflake recognizes this and offers scalable security and access control features designed to protect not only accounts and users, but also the data they store. However, organizations can still miss certain fundamentals, leaving their cloud environments only partially secure.

Here are some quick tips to fill these gaps and build a secure enterprise cloud.

1. Make the connection secure

First and foremost, all organizations using Snowflake, regardless of size, should focus on using secure networks and SSL/TLS protocols to prevent network-level threats. According to Matt Vogt, VP of global solution architecture at Immuta, a good way to start would be to connect to Snowflake over a private IP address using a cloud service provider’s private connection such as AWS PrivateLink or Azure Private Link. This will create private VPC endpoints that allow direct, secure connectivity between the AWS/Azure VPCs and the Snowflake VPC without going through the public Internet. In addition to this, network access controls such as IP filtering can also be used for third-party integrations, further strengthening security.
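
As a rough sketch of what this can look like in practice, here is how an administrator might combine a PrivateLink hostname with an IP-filtering network policy using Snowflake’s Python connector. The account identifier, credentials, and CIDR ranges below are hypothetical placeholders, not a prescribed configuration:

```python
# A minimal sketch: connect over PrivateLink and apply an IP-filtering
# network policy. The account identifier, credentials, and CIDR ranges
# below are hypothetical placeholders.
import os

import snowflake.connector

conn = snowflake.connector.connect(
    user="SEC_ADMIN_USER",
    password=os.environ["SNOWFLAKE_PASSWORD"],
    # PrivateLink accounts use a dedicated ".privatelink" hostname, keeping
    # traffic on the cloud provider's private network rather than the
    # public internet.
    account="myorg-myaccount.privatelink",
    role="SECURITYADMIN",
)
cur = conn.cursor()

# IP filtering: restrict which networks may reach the account at all.
cur.execute("""
    CREATE NETWORK POLICY corp_only
      ALLOWED_IP_LIST = ('203.0.113.0/24', '198.51.100.0/24')
      COMMENT = 'Only corporate egress ranges may connect'
""")
cur.execute("ALTER ACCOUNT SET NETWORK_POLICY = corp_only")
```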

2. Protect source data

While Snowflake offers multiple layers of protection, such as Time Travel and Fail-safe, for data that has already been ingested, these tools cannot help if the source data itself is missing, corrupted, or compromised (maliciously encrypted for ransom, for instance). These kinds of problems, as Clumio’s VP of product Chadd Kenney suggests, can only be solved by adopting measures that protect the data while it resides in an object storage repository like Amazon S3, before ingestion. Furthermore, to protect against logical deletions, it is advisable to maintain continuous, immutable and preferably air-gapped backups that can be immediately restored and re-ingested into Snowflake, for example via Snowpipe.
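
For illustration, one way (among many) to get immutable, restorable copies of source data is S3 Object Lock with versioning. This sketch uses boto3 with a hypothetical bucket name and retention window, and is a generic hardening step rather than any specific vendor’s product:

```python
# A hedged sketch of hardening an S3 staging bucket that feeds Snowflake:
# Object Lock plus versioning yields immutable copies of source data that
# survive accidental or malicious deletion. Bucket name and retention
# window are hypothetical.
import boto3

s3 = boto3.client("s3")

# Object Lock must be requested when the bucket is created; enabling it
# also turns on versioning.
s3.create_bucket(
    Bucket="snowflake-source-backup",
    ObjectLockEnabledForBucket=True,
)

# COMPLIANCE mode: no one, not even the root account, can shorten or
# remove the retention period on locked object versions.
s3.put_object_lock_configuration(
    Bucket="snowflake-source-backup",
    ObjectLockConfiguration={
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Days": 30}},
    },
)
```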

3. Consider SCIM with multi-factor authentication

Enterprises should use SCIM (System for Cross-domain Identity Management) to facilitate automated provisioning and management of user identities and groups (i.e., roles used to authorize access to objects such as tables, views, and functions) in Snowflake. This makes user data more secure and simplifies the user experience by reducing the role of local system accounts. In addition, by using SCIM where possible, enterprises can configure their SCIM provider to synchronize users and roles with Active Directory users and groups.
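
A condensed sketch of what the Snowflake side of SCIM setup looks like, following the documented Azure AD flow; the integration and role names are placeholders, and an Okta or generic SCIM client would use a different SCIM_CLIENT value:

```python
# A hedged sketch of enabling SCIM provisioning from Azure AD. Object and
# role names are placeholders; run as an account administrator.
import os

import snowflake.connector

cur = snowflake.connector.connect(
    user="ADMIN_USER",
    password=os.environ["SNOWFLAKE_PASSWORD"],
    account="myorg-myaccount",
    role="ACCOUNTADMIN",
).cursor()

# A dedicated role the SCIM client runs as, allowed to manage users/roles.
cur.execute("CREATE ROLE IF NOT EXISTS aad_provisioner")
cur.execute("GRANT CREATE USER ON ACCOUNT TO ROLE aad_provisioner")
cur.execute("GRANT CREATE ROLE ON ACCOUNT TO ROLE aad_provisioner")

cur.execute("""
    CREATE SECURITY INTEGRATION aad_scim
      TYPE = SCIM
      SCIM_CLIENT = 'AZURE'
      RUN_AS_ROLE = 'AAD_PROVISIONER'
""")

# Paste this token into the identity provider's SCIM provisioning settings.
token = cur.execute(
    "SELECT SYSTEM$GENERATE_SCIM_ACCESS_TOKEN('AAD_SCIM')"
).fetchone()[0]
```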

On top of this, companies should also use multi-factor authentication to add an extra layer of security. Depending on the interface in use, such as client applications using drivers, the Snowflake UI, or Snowpipe, the platform supports multiple authentication methods, including username/password, OAuth, key pair, remote browser, federated authentication using SAML, and Okta native authentication. Where multiple methods are supported, Snowflake recommends giving top preference to OAuth (either Snowflake OAuth or External OAuth), followed by remote browser authentication, Okta native authentication, and key pair authentication.
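
In terms of the Python connector, the preferred options map to connection parameters roughly as follows; the account, user, and key file names are placeholders:

```python
# Hedged sketches of the stronger options as exposed by the Python connector
# (snowflake-connector-python). Account, user, and file names are placeholders.
import os

import snowflake.connector
from cryptography.hazmat.primitives import serialization

# Option 1: OAuth, passing a token obtained from Snowflake OAuth or an
# external authorization server.
conn = snowflake.connector.connect(
    account="myorg-myaccount",
    user="jdoe",
    authenticator="oauth",
    token=os.environ["SNOWFLAKE_OAUTH_TOKEN"],
)

# Option 2: remote browser / federated SSO; opens the IdP login page.
conn = snowflake.connector.connect(
    account="myorg-myaccount",
    user="jdoe@example.com",
    authenticator="externalbrowser",
)

# Option 3: key pair; the matching public key must be registered with
# ALTER USER jdoe SET RSA_PUBLIC_KEY = '...'.
with open("rsa_key.p8", "rb") as key_file:
    pkey = serialization.load_pem_private_key(key_file.read(), password=None)
conn = snowflake.connector.connect(
    account="myorg-myaccount",
    user="jdoe",
    private_key=pkey.private_bytes(
        encoding=serialization.Encoding.DER,
        format=serialization.PrivateFormat.PKCS8,
        encryption_algorithm=serialization.NoEncryption(),
    ),
)
```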

4. Access control at column level

Organizations should use Snowflake’s dynamic data masking and external tokenization capabilities to limit individual users’ access to sensitive information in certain columns. For example, dynamic data masking, which masks column data at query time based on who is querying it, can be used to limit column visibility according to the user’s country, such that a US employee sees only US order data while a French employee sees only order data from France.
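
As a sketch of that country-based example, a conditional masking policy can consult a second column when deciding whether to mask. All table, column, and role names below are hypothetical:

```python
# A hedged sketch of the country-scoped example above, using a conditional
# masking policy: the second input column drives the decision, the first is
# what gets masked. All table, column, and role names are hypothetical.
import os

import snowflake.connector

cur = snowflake.connector.connect(
    user="POLICY_ADMIN_USER",
    password=os.environ["SNOWFLAKE_PASSWORD"],
    account="myorg-myaccount",
    role="POLICY_ADMIN",
).cursor()

cur.execute("""
    CREATE OR REPLACE MASKING POLICY region_mask
      AS (customer_name STRING, order_country STRING) RETURNS STRING ->
      CASE
        WHEN CURRENT_ROLE() = 'US_SALES' AND order_country = 'US' THEN customer_name
        WHEN CURRENT_ROLE() = 'FR_SALES' AND order_country = 'FR' THEN customer_name
        ELSE '*** MASKED ***'
      END
""")

# Attach the policy; USING lists the masked column first, then the
# condition column(s).
cur.execute("""
    ALTER TABLE orders MODIFY COLUMN customer_name
      SET MASKING POLICY region_mask USING (customer_name, order_country)
""")
```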

Both features are effective, but they depend on masking policies to work. To get the most out of them, organizations should first decide whether they want to centralize masking policy management or decentralize it to individual database-owning teams, depending on their needs. They can also use INVOKER_ROLE() in policy conditions to let otherwise-unauthorized users view aggregate data computed over protected columns while keeping the individual values hidden, as sketched below.
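
Here, the policy unmasks values only when they are read through a view owned by a trusted role, so end users can query aggregates without ever seeing row-level values. Again, every object and role name is hypothetical:

```python
# A hedged sketch of the INVOKER_ROLE() pattern: raw values are exposed only
# when the column is read through a view owned by a trusted role, so users
# querying that view get aggregates without seeing row-level values.
# All object and role names are hypothetical.
import os

import snowflake.connector

cur = snowflake.connector.connect(
    user="POLICY_ADMIN_USER",
    password=os.environ["SNOWFLAKE_PASSWORD"],
    account="myorg-myaccount",
    role="POLICY_ADMIN",
).cursor()

cur.execute("""
    CREATE OR REPLACE MASKING POLICY salary_mask AS (val NUMBER) RETURNS NUMBER ->
      CASE
        -- AGG_VIEWS owns curated views that expose only aggregates.
        WHEN INVOKER_ROLE() IN ('AGG_VIEWS') THEN val
        ELSE NULL
      END
""")
cur.execute("ALTER TABLE hr.employees MODIFY COLUMN salary SET MASKING POLICY salary_mask")

# The view must be owned by AGG_VIEWS for INVOKER_ROLE() to match. Direct
# queries against hr.employees.salary return NULL for unauthorized roles,
# while this view returns real averages.
cur.execute("USE ROLE AGG_VIEWS")
cur.execute("""
    CREATE OR REPLACE VIEW hr.avg_salary_by_dept AS
    SELECT dept, AVG(salary) AS avg_salary FROM hr.employees GROUP BY dept
""")
```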

5. Implement a uniform audit model

Finally, organizations should not forget to implement a unified audit model to ensure transparency in the policies being enforced. This will help them actively monitor policy changes, such as who created which policy granting user X or group Y access to certain data, and it is just as critical as monitoring query and data access patterns.

To see account usage patterns, use a system-defined, read-only shared database called SNOWFLAKE. It has a schema called ACCOUNT_USAGE that contains views that provide access to one year of audit logs.
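
For example, a couple of starter audit queries against ACCOUNT_USAGE might look like the following; note that these views can lag real time by up to a few hours, and the connection details are placeholders:

```python
# A hedged sketch of starter audit queries against the shared SNOWFLAKE
# database. These ACCOUNT_USAGE views retain roughly one year of history
# and can lag real time by up to a few hours.
import os

import snowflake.connector

cur = snowflake.connector.connect(
    user="AUDIT_USER",
    password=os.environ["SNOWFLAKE_PASSWORD"],
    account="myorg-myaccount",
    role="ACCOUNTADMIN",
).cursor()

# Who owns each masking policy, and when was it last changed?
cur.execute("""
    SELECT policy_name, policy_owner, created, last_altered
    FROM snowflake.account_usage.masking_policies
    ORDER BY last_altered DESC
""")
print(cur.fetchall())

# Query and data access patterns over the past week.
cur.execute("""
    SELECT user_name, query_text, start_time
    FROM snowflake.account_usage.query_history
    WHERE start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
    ORDER BY start_time DESC
""")
for user_name, query_text, start_time in cur.fetchall():
    print(start_time, user_name, query_text[:80])
```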
