The 4-Hour Problem: Slashing Google Workspace OAuth Audit Time with Parallelism

October 4, 2025 · 7 min read

Auditing security in a large-scale Google Workspace environment can be a time sink. Specifically, compiling a list of all OAuth Client IDs—the third-party apps and services users have authorized—for a massive domain can take hours.

For my organization, a full audit of OAuth tokens was a painful 4-hour task due to the sheer volume of users and the inherent rate limits of the Google Admin SDK API. I knew there had to be a faster way.


I’m happy to share the Python script I developed that leverages multiprocessing to tackle this monumental task, reducing the execution time from 4 hours to a blazing-fast 15 minutes.

The Bottleneck: Sequential API Calls and Rate Limits

When auditing user tokens via the Google Admin SDK’s tokens().list() method, you have to query each user individually.

In a large domain, this means tens of thousands of sequential API calls. Even with the best code, a single-threaded script quickly hits two major walls:

  1. Latency: Each API call has a round-trip time, which adds up massively over thousands of users.
  2. Rate Limits (The Real Killer): Google limits the number of requests you can make in a short period. Hitting a 403 (Forbidden) or 429 (Too Many Requests) error forces you to implement a pause, further dragging out the process.
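
For context, the sequential baseline looks roughly like this (a minimal sketch; all_user_emails and the authenticated Admin SDK service object are assumed to come from elsewhere in the script):

# Sequential baseline (illustrative): one blocking API round trip per user
all_tokens = []
for user_email in all_user_emails:  # tens of thousands of iterations
    response = service.tokens().list(userKey=user_email).execute()
    all_tokens.extend(response.get('items', []))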

My solution was simple in concept, but critical in execution: Parallelism.

The Engine of Speed: Python’s multiprocessing.Pool

The single most critical factor in slashing the audit time from 4 hours to 15 minutes was moving from a sequential, user-by-user loop to parallel processing using the multiprocessing module.

How the Parallelism is Executed in the Code

  1. Divide the Work (Chunking): The main() function retrieves the full user list and splits it into one chunk per worker (NUM_WORKERS is set to 8).
# Calculate a chunk size that spreads users roughly evenly across the workers
chunk_size = total_users_to_process // NUM_WORKERS + 1
user_chunks = [all_users[i:i + chunk_size] for i in range(0, total_users_to_process, chunk_size)]

  2. Spin up the Workers (Pool): The multiprocessing.Pool distributes these chunks to the workers.

import logging
from multiprocessing import Pool

logging.info(f"Starting a pool of {NUM_WORKERS} workers.")
with Pool(processes=NUM_WORKERS) as pool:
    chunked_results = pool.map(worker_process, user_chunks)
  • The pool.map() call distributes the chunks across the worker processes and runs worker_process on each chunk in parallel, blocking until every chunk has been processed.

  3. The Independent Worker: The worker_process function runs entirely independently. Crucially, each worker creates its own authenticated Admin SDK service object (service = get_admin_service()), as sketched below. This isolates API connections, maximizing throughput and ensuring that any rate limiting or connection errors only affect that specific worker, not the entire job.
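
A minimal sketch of what such a worker might look like (worker_process, get_admin_service, and get_user_tokens_with_retry are the functions referenced in this post; the loop body is an illustrative assumption, not the exact production code):

def worker_process(user_chunk):
    """Audit one chunk of users; runs in its own OS process."""
    # Each worker builds its own authenticated Admin SDK client, so
    # connections and rate-limit errors stay isolated to this worker.
    service = get_admin_service()
    results = []
    for user in user_chunk:
        email = user['primaryEmail']
        for token in get_user_tokens_with_retry(service, email):
            token['userEmail'] = email  # tag each token with its owner
            results.append(token)
    return results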

By hitting the Google API from eight processes at once, we overlap the per-request latency of sequential I/O and effectively multiply our processing capacity, leading directly to the dramatic time reduction.

Deeper Code Dive: Robustness and Data Integrity

Speed is useless if the script crashes or produces messy data. These technical choices ensure the script is reliable and secure:

1. Robust Error Handling and Backoff

The get_user_tokens_with_retry function is the hero against rate limits, implementing exponential backoff.

  • Rate Limit Detection: It specifically catches HttpError with status codes 403 or 429.
  • Exponential Backoff: The wait time is calculated as 2^i + random(0, 1) seconds (where i is the retry attempt), so the pause grows sharply after repeated failures, respecting Google’s limits while giving every request a strong chance of eventually succeeding.
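
A minimal sketch of that retry wrapper (the function name comes from the post; MAX_RETRIES and the exact control flow are illustrative assumptions):

import random
import time

from googleapiclient.errors import HttpError

MAX_RETRIES = 5  # assumed cap for illustration

def get_user_tokens_with_retry(service, user_email):
    """List OAuth tokens for one user, backing off on 403/429 responses."""
    for attempt in range(MAX_RETRIES):
        try:
            response = service.tokens().list(userKey=user_email).execute()
            return response.get('items', [])
        except HttpError as error:
            if error.resp.status in (403, 429):
                time.sleep(2 ** attempt + random.random())  # exponential backoff plus jitter
            else:
                raise  # unrelated errors should surface immediately
    return []  # retries exhausted; skip this user rather than crash the worker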

2. Secure Authentication: Impersonation is Key 🔑

To audit all users, the script uses a Service Account with Domain-Wide Delegation (DWD), impersonating a Super Admin.

from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    SERVICE_ACCOUNT_FILE,
    scopes=SCOPES,
    subject=IMPERSONATED_USER  # The magic line for DWD
)

This impersonation is what grants the script the Admin SDK permissions it needs, and it keeps the sensitive service account key securely managed, away from any user-facing interaction.

3. The Power of Pandas for JSON Normalization 🧹

The raw JSON from the API contains nested lists (like scopes), which are difficult to handle in CSV. I used Pandas to instantly clean and structure the output.

import pandas as pd

# Use pandas.json_normalize to correctly flatten the JSON
df = pd.json_normalize(tokens_list)
# Convert the 'scopes' list into a comma-separated string
df['scopes'] = df['scopes'].apply(lambda x: ','.join(x) if isinstance(x, list) else '')

The pd.json_normalize function automatically flattens the complex JSON into a clean DataFrame, allowing us to standardize the final output columns and save a perfectly structured CSV.
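
As an illustration of that last step, something along these lines produces the standardized file (OUTPUT_COLUMNS and OUTPUT_CSV are assumed names; clientId and displayText are standard fields of the Admin SDK token resource):

import os

# Keep a stable column order and append to the audit file, writing the
# header only when the CSV does not exist yet (this also supports resuming).
OUTPUT_CSV = 'oauth_token_audit.csv'  # assumed output path
OUTPUT_COLUMNS = ['userEmail', 'clientId', 'displayText', 'scopes']
df = df.reindex(columns=OUTPUT_COLUMNS)
df.to_csv(OUTPUT_CSV, mode='a', header=not os.path.exists(OUTPUT_CSV), index=False)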

4. Implementing Robust Checkpointing (Resuming the Audit) 💾

The main() function implements checkpointing by checking if the output CSV already exists. If it does, it reads the userEmail column to create a processed_users set. Any users found in this set are then excluded from the master user list retrieved from the Google API. This allows the script to be stopped and restarted without restarting the entire 15-minute process, resuming exactly where it left off.
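
A minimal sketch of that resume logic (variable names are illustrative; OUTPUT_CSV is the same assumed constant as in the previous sketch, and userEmail is the column written for every token row):

import os

import pandas as pd

processed_users = set()
if os.path.exists(OUTPUT_CSV):
    # Anything already written to the CSV has been audited and can be skipped
    processed_users = set(pd.read_csv(OUTPUT_CSV)['userEmail'].dropna())

# Drop already-processed accounts from the master list before chunking
all_users = [u for u in all_users if u['primaryEmail'] not in processed_users]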

Key Takeaways for Speed

This script turned a four-hour administrative burden into a routine, quarterly fifteen-minute task. It’s a powerful demonstration of how a relatively small investment in parallel programming can yield massive productivity gains:

  1. Parallelize at the I/O Level: Use multiprocessing.Pool to turn network latency waits into parallel work execution.
  2. Smart Retries are Mandatory: Implement Exponential Backoff to manage rate limits gracefully.
  3. Isolate Resources: Ensure each parallel worker has its own independent API connection object (service).
  4. Adopt Robustness: Use Checkpointing and Pandas for reliable execution and clean data output.

What other time-consuming domain tasks are you looking to automate with Python? Let me know in the comments!
