Avoiding Legal Pitfalls Through Savvy Data Governance
According to Matt Vogt, VP of global solution architecture at Immuta, many governance solutions are equipped with a “data dictionary idea, so users can upload their own definition of what data is. They can even give us different rules from a discovery perspective to go find that data and look for it as we connect to sources.” Some solutions provide regulation-specific templates for authoring access policies, which are enforceable code in source systems that applies controls to specific data elements. Additional resources include what Vogt termed “a policy library so we can help customers through that journey. If you’re a healthcare customer, we have these other 30 healthcare customers that are managing similar types of data assets. So, we can give you that kind of kick-starter for policies so you’re not starting from scratch.”
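To make the data dictionary idea concrete, the following is a minimal Python sketch of user-defined discovery rules. It is not Immuta's implementation or any vendor's API. The names `DATA_DICTIONARY` and `classify_column`, the two regular expressions, and the 80% match threshold are all assumptions chosen for illustration.

```python
import re

# Hypothetical data dictionary: each tag maps to a user-supplied discovery rule.
# These patterns are deliberately simple illustrations, not production validators.
DATA_DICTIONARY = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}


def classify_column(samples, dictionary=DATA_DICTIONARY, threshold=0.8):
    """Return the set of tags whose rule matches enough of the sampled values.

    `threshold` is an assumed cutoff: the fraction of non-empty samples
    that must match a rule before the column receives that tag.
    """
    values = [s for s in samples if s]  # ignore empty or missing samples
    if not values:
        return set()
    tags = set()
    for tag, pattern in dictionary.items():
        hits = sum(1 for v in values if pattern.match(v))
        if hits / len(values) >= threshold:
            tags.add(tag)
    return tags
```

Calling `classify_column` on a sample of values from each column as a source is connected would yield the tags that later policies can reference.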
Access Controls
The policy authoring experience is typically codeless, with some offerings, including Immuta’s, providing natural language policy writing capabilities. In other offerings, this functionality may be coupled with drop-down menus for point-and-click, drag-and-drop approaches. Codeless authoring reduces the time required to implement policies, which can be imperative. “If the SEC makes a new requirement and has a short turnaround due date, you don’t have time to train [machine learning] models; you need to put it into effect immediately,” Kamien mentioned. Access controls for policies typically involve a few paradigms, including these:
♦ Role-Based Access Controls: Frequently accompanied by access control lists (ACLs), this method grants access according to an individual’s role in an organization. However, whenever something about that role changes, such as where someone is accessing data from, a new role must be created, which makes this methodology difficult to scale.
♦ Attribute-Based Access Controls: This model is predicated on attributes of the data and of the tools individuals employ for accessing it. The latter includes “attributes of a user’s session, for instance. What time of day it is, or where they’re connecting from,” remarked Paul Moxon, SVP, data architectures, and chief evangelist at Denodo. Tag-based access controls, which are similar to attribute-based access controls, are predicated entirely on the data’s attributes or tags devised during the discovery and classification process.
♦ Semantic-Based Access Controls: Also referred to as universal access controls, this paradigm enforces controls at a comprehensive data access layer, which may be employed for data products or the data fabric architecture. With this approach, organizations access knowledge, which can stem from any number of departments or sources, from a single location that Saptarshi Sengupta, senior director of product marketing at Denodo, characterized as a “semantic layer.” Access policies can be applied at this universal layer for masking customers’ addresses, for example, even if those addresses originate from multiple departments or sources.
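The differences among these paradigms can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not any vendor's policy engine: the `Session` and `Column` classes, the function names, and the `"***"` mask value are all hypothetical.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class Session:
    """Hypothetical user session; field names are illustrative assumptions."""
    role: str
    country: str
    local_time: time


@dataclass(frozen=True)
class Column:
    """Hypothetical column carrying tags assigned during discovery and classification."""
    name: str
    tags: frozenset = frozenset()


def rbac_allows(session, allowed_roles):
    # Role-based: the decision depends only on the user's role.
    return session.role in allowed_roles


def abac_allows(session, allowed_roles, allowed_countries, start, end):
    # Attribute-based: the role is combined with session attributes
    # (where the user connects from and the time of day).
    return (
        session.role in allowed_roles
        and session.country in allowed_countries
        and start <= session.local_time <= end
    )


def read_row(row, columns, session, masked_tags, unmasked_roles):
    # Tag-based masking at a single access layer: any column carrying a
    # masked tag is hidden, regardless of which source the row came from.
    out = {}
    for col in columns:
        value = row[col.name]
        if col.tags & masked_tags and session.role not in unmasked_roles:
            value = "***"
        out[col.name] = value
    return out
```

In this sketch, a new access condition such as a location restriction becomes one more argument to `abac_allows` rather than a new role, which is the scaling difference the list describes. Because `read_row` keys on tags instead of source systems, one masking rule covers address columns from every department.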
Governed Knowledge Sharing
Organizations frequently classify and tag knowledge, write policies for it, and control access to it so that they can distribute it to different users within and between organizations, departments, and teams. The data product notion, in which individual business units create a data model, definitions, and data governance policies for accessing datasets, is applicable for sharing knowledge among KM practitioners. Data products hinge on centralized oversight: Data governance teams look across and within data products (and how they’re accessed) to ensure compliance with an organization’s governance rules. “The challenge of data products is getting business units to take ownership or responsibility for their data,” Moxon said. “But can we trust them to apply all our corporate legal jurisdiction of policies correctly?”