Managing your Tableau Site Content with Python

By Jeremy Bennet (Consultant BI Engineer)

In this blog, I explore how the Python library Tableau Server Client (TSC) can be used to help manage your Tableau cloud site. Here at Crimson Macaw, we recently developed a suite of TSC Python scripts to assist a client with an internal project to re-structure their Tableau site.  

As a manager or person involved in administering a Tableau Cloud site you have probably been asked the following questions about site content and access: 

  • Who has access to Tableau (or certain projects, workbooks, and data sources)? 
  • What content is available on the Tableau site? 
  • Who is a member of group ‘XYZ’? 
  • What Tableau content can these users see? 

Looking up a single user or workbook is easy to do via the online browser.  But what about 20 users, or if a workbook has four groups assigned, each with over 50 users? Although it is possible to piece this kind of information together it would be time-consuming, manual, and potentially out of date very quickly. Wouldn’t it be handy to have a Tableau-style report that can map out all the content and permissions structures? This is where the Tableau Server Client comes in.  

Admin Insights 

Tableau is adding transparency to the Tableau Cloud deployment through Admin Insights. Admin Insights is a project folder populated by Tableau with data sources and a pre-built workbook of the site’s data. This allows you to create custom workbooks around content usage, group membership and users. Tableau provides some guidance on how to get the best out of admin insights. 

Admin Insights

It is a big step in the right direction and can enable some of the basic questions about content availability and usage but, at present, it does not contain data to answer questions about access and permissions.  

Introducing Tableau Server Client (TSC) 

Tableau Server Client or TSC as it is known is a Python library built to interact with the Tableau Server REST API. It can be used to query almost anything on the Tableau Cloud site – it can also publish content and create or delete users/groups too, but that is another blog entirely! 

Use Cases 

When using TSC in a querying capacity to retrieve data about the site, the key use cases are: 

  • Tableau Governance – can be used to develop custom datasets that can support good data governance and visibility of permissions attached to sensitive or commercially confidential reports 
  • Refactoring and Audit – if like me you have seen numerous Tableau Server and Online deployments, you will probably agree that they can become a little unwieldy over time. ‘Temp’ and ‘Dev’ reports that have been sitting for years, v1, v2, and v3 workbooks or workbooks in various locations with the same name. Project and Group names with little structure or meaning. If you are looking to refactor your site, a good set of audit reports can be an extremely helpful starting place.  It can be used for re-mapping existing content into a new structure  
  • Helping users find what they need – many users complain that they do not know what information is available or where to find it. TSC can be used to provide a site content map and the necessary groups that would be required for access. This helps with user requests and transparency of content. 

Getting Started with the Python TSC 

As always, we need to install the TSC library. 

pip install tableauserverclient

To access your Tableau Cloud site, you will need to authenticate and provide the site details. It is possible to do this with your username and password, but it is recommended to use either a personal access token or SSL certificate method so that you do not store passwords directly. Here is the basic authentication using a personal access token (which is created on the Online site under ‘Manage Account Settings > Personal Access Tokens’). You will need to be an administrator to create access tokens.  

import tableauserverclient as TSC 
 
# access the tableau site 
tableau_auth = TSC.PersonalAccessTokenAuth( 
    "token_name", 
    "token", 
    "tableau_site_name", 
) 
 
server = TSC.Server("https://10ax.online.tableau.com/", 
    use_server_version=True 
) 
 
with server.auth.sign_in(tableau_auth):

The authentication flow is simplified by using TSC’s built-in support for a Python ‘with’ block to perform the sign-in (seen in the last line of the excerpt above). After the block has been executed, the sign-out is called automatically. 

If you are developing a series of scripts, moving on from your proof-of-concept or want to publish your project to a Git repository you should store your Tableau site credentials elsewhere. In the example below credentials are retrieved from environment variables (stored outside of the Git repository).  They are then passed to the TSC sign-in method. 

# Grab credentials from env variables set outside of repository 
token_name = os.environ.get('tableau_token_name') 
token_secret = os.environ.get('tableau_token_secret') 
server_url = os.environ.get('tableau_url') 
site_name = os.environ.get('tableau_site_name') 
 
tableau_auth = TSC.PersonalAccessTokenAuth( 
    token_name, 
    token_secret, 
    site_name 
) 
 
server = TSC.Server(server_url, use_server_version=True) 
 
# auto signs out of the server client after processing with block 
with server.auth.sign_in(tableau_auth):

Querying Content 

 Now that we have access to our Tableau Cloud site we can get started with some querying. Here we simply print out all workbooks on the site. 

with server.auth.sign_in(tableau_auth): 
 
    # get all workbook names and IDs 
    all_workbooks, pagination_item = server.workbooks.get() 
    print( 
        "\nThere are {} workbooks on site: " 
        .format(pagination_item.total_available) 
    ) 
 
    for workbook in all_workbooks: 
        workbook_name = workbook.name 
        workbook_id = workbook.id 
        created_at = workbook.created_at 
        project_name = workbook.project_name 
 
        print( 
            f"Workbook Name: {workbook_name}, Workbook ID: " 
            + f"{workbook_id}; \n  Created at: " 
            + f"{created_at}, Project folder: {project_name}" 
        ) 

In the example above we have accessed the WorkbookItem class to retrieve workbook objects from the Tableau Server REST API. We then utilise some of the built-in methods to operate on the workbook object, for example, the below line queries the server and returns information about the workbooks on the site. 

server.workbooks.get()

Once we have retrieved a workbook item, we can access a range of attributes about the workbook. These include id, name, owner_id, created_at, project_id, project_name etc.  

There are similar classes available for projects, users, groups, jobs, tasks, and data sources. All with their attributes and methods for retrieving information.  

Results from the example above are printed to the Terminal. 

terminal results

Accessing Permissions 

The permissions architecture for Tableau Cloud has developed over time to become quite a powerful and customisable feature. For example, permissions can be set on a workbook, project folder or data source.  

Within the TSC documentation, there are a few examples of how to access the permissions of an item. There is a brief example of how to access project permissions using the populate_permissions method. 

Example code snippet from the TSC Documentation on ‘Querying Project Permissions’ 

all_project_items, pagination_item = server.projects.get() 
project_item = all_project_items[0] 
server.projects.populate_permissions(project_item)

You will also find some examples of this in the Github samples repository.  

We will adapt this and apply it to our workbook item object to populate the permissions for each workbook. 

for workbook in all_workbooks: 
 
    try: 
        server.workbooks.populate_permissions(workbook) 
        permissions = workbook.permissions

The above variable ‘permissions’ now contains a list of each permissions object attached to the workbook in the for loop (i.e. one workbook can have many permissions objects). At this point, the TSC documentation deserts us. We are on our own looking for examples of how to access permissions attributes!  

From various online searches there appear to be three attributes: 

  • grantee.tag_name 
  • grantee.id 
  • capabilities 

Where tag_name is the type of permissions object (either a group or user), ID is the id of the group (or user), and capabilities is a dictionary of key:value pairs where the keys are the capabilities, and the values indicate whether the capability is allowed or denied. Here is an example of the capabilities attribute from the Superstore workbook on my dev site: 

{‘RunExplainData’: ‘Allow’, ‘Read’: ‘Allow’, ‘Filter’: ‘Allow’, ‘Write’: ‘Allow’, ‘ShareView’: ‘Allow’, ‘ViewUnderlyingData’: ‘Allow’, ‘ViewComments’: ‘Deny’, ‘ExportData’: ‘Deny’, ‘ExportImage’: ‘Deny’, ‘AddComment’: ‘Deny’} 

Heading back to our example script to extract permissions for each workbook. We now iterate through each permissions object to capture the type (group or user) and its capabilities. 

for permitee in permissions: 
    group_user_type = permitee.grantee.tag_name 
    group_user_id = permitee.grantee.id 
    capabilities = permitee.capabilities 
     
    row_data = [] 
 
    if group_user_type == 'user': 
        user_item = server.users.get_by_id(permitee.grantee.id) 
        group_user_name = user_item.name 
    elif group_user_type == 'group': 
        for group_item in TSC.Pager(server.groups): 
            if group_item.id == group_user_id: 
                group_user_name = group_item.name 
                break

Aside from capturing the tag_name, id, and capabilities we also obtain the name of the group or user through an IF block. You may have spotted the empty list row_data’ – this is useful if you want to store and extract the information you have obtained. In the final excerpt below I collate the data I want for each permissions object into a list and then append that to my ‘dataset’ list before writing it to a CSV file. 

# create row_data record for adding to the csv file 
row_data.extend( 
    [ 
        workbook.name, 
        workbook.project_name, 
        group_user_type.capitalize(), 
        group_user_name, 
        capabilities, 
    ] 
) 
# add the record to the dataset 
dataset.append(row_data) 

# write data to csv 
header = ["Workbook Name", "Project name", "Type", "Name", "Capabilities"] 
 
with open( 
    "Exports/Blog/tableau-workbook-permissions.csv", 
    "w", 
    encoding="utf-8", 
    newline="", 
) as f: 
    writer = csv.writer(f) 
 
    # write the header 
    writer.writerow(header) 
 
    # write multiple rows 
    writer.writerows(dataset)

Here is the final output. 

Final output

You can adapt this approach to extract permissions structures for data sources and project folders too.  

Improvements to TSC and Admin Insights 

I can see a need for additional audit history data within the Tableau REST API to support organisations with their governance frameworks. I would like to see the timestamp and user associated with the following actions: 

  • Creation and deletion of users and groups 
  • Addition or removal of users or groups to a project, data source or workbook 
  • Changes to permission capabilities 

The Admin Insights section of Tableau Cloud is a welcome addition. Looking ahead I would like to think that Tableau will add permissions data to the set of standard data sources so that customers can better manage the governance of their site. 

 Summary 

In this blog, I demonstrated how to use the Tableau Server Client (TSC) Python library to extract custom data from your Tableau Cloud site and how this could be useful in supporting data governance and planning for a refactor. If you would like to know more, including how this could be orchestrated into an automated process, or how to navigate deep project folders, please get in touch.