Getting Started with GoCloudera
GoCloudera is an AI infrastructure cost intelligence platform that monitors GPU and LLM spend across AWS, Azure, GCP, Kubernetes, and SageMaker. This guide walks you through setup from zero to cost visibility in under 30 minutes.
Prerequisites
- A GoCloudera account (sign up at gocloudera.com)
- GPU instances running on at least one supported cloud provider
- Python 3.9+ (for the agent) or Docker
Step 1: Log In and Get Your API Key
- Log in to the GoCloudera dashboard at https://app.gocloudera.com
- Navigate to Cloud Accounts in the left sidebar (admin section)
- Click Platform Info to see your tenant’s API key
- Copy your API Key and Tenant ID — you’ll need these to configure the agent
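One way to keep the API Key and Tenant ID out of scripts and shell history is to read them from environment variables. The sketch below is illustrative only: the variable names GOCLOUDERA_API_KEY and GOCLOUDERA_TENANT_ID and the AgentCredentials type are assumptions made for this example, not names documented by GoCloudera.

```python
import os
from dataclasses import dataclass

# Hypothetical variable names, chosen for this example only.
API_KEY_VAR = "GOCLOUDERA_API_KEY"
TENANT_ID_VAR = "GOCLOUDERA_TENANT_ID"


@dataclass(frozen=True)
class AgentCredentials:
    api_key: str
    tenant_id: str


def load_credentials(env=None):
    """Read both values from the environment and fail early if either is missing."""
    env = os.environ if env is None else env
    missing = [name for name in (API_KEY_VAR, TENANT_ID_VAR)
               if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return AgentCredentials(
        api_key=env[API_KEY_VAR].strip(),
        tenant_id=env[TENANT_ID_VAR].strip(),
    )
```

Failing at startup with the names of the missing settings tends to be easier to debug than an authentication error from the dashboard later on.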
Step 2: Install the GPU Agent
The GoCloudera agent runs in your infrastructure and collects GPU metrics, instance data, and cost information. It sends data to the dashboard via HTTPS or gRPC.
Option A: Docker (Recommended)
Option B: Python
Option C: Kubernetes
Step 3: Configure Cloud Access
The agent needs read-only access to your cloud accounts to collect metrics and cost data.
AWS
Create an IAM role with these managed policies:
- AmazonEC2ReadOnlyAccess
- AWSCostExplorerReadOnlyAccess (for cost data)
- AmazonSageMakerReadOnly (if using SageMaker)
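When attaching these policies from a script, each one is referenced by ARN rather than by name. The sketch below builds the ARNs from the policy names exactly as this guide lists them. It assumes the standard prefix for AWS managed policies in the commercial partition and assumes none of the policies sits under a sub-path. The helper name policy_arns is made up for this example. Confirm each policy name in the IAM console before relying on the result.

```python
# Prefix used by AWS managed policies that have no sub-path (assumption: commercial partition).
AWS_MANAGED_POLICY_PREFIX = "arn:aws:iam::aws:policy/"

# Policy names copied from this guide as written; verify them in your account.
REQUIRED_POLICIES = ["AmazonEC2ReadOnlyAccess", "AWSCostExplorerReadOnlyAccess"]
SAGEMAKER_POLICY = "AmazonSageMakerReadOnly"


def policy_arns(use_sagemaker=False):
    """Return the managed-policy ARNs to attach to the agent's IAM role."""
    names = list(REQUIRED_POLICIES)
    if use_sagemaker:
        names.append(SAGEMAKER_POLICY)
    return [AWS_MANAGED_POLICY_PREFIX + name for name in names]


if __name__ == "__main__":
    for arn in policy_arns(use_sagemaker=True):
        print(arn)
```

Each ARN can then be passed to the IAM attach-role-policy operation together with the name of the role you created.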
Azure
Create a service principal with Reader and Cost Management Reader roles:
Google Cloud
Create a service account with Compute Viewer and BigQuery Data Viewer (for billing export) roles:
Step 4: Verify Data Is Flowing
Once the agent is running, data should appear in the dashboard within 5 minutes (one monitoring cycle).
- Open the GoCloudera dashboard
- Check the Dashboard page — you should see GPU instance counts and utilization charts
- Check GPU Instances — your instances should appear with state, type, and utilization
- Check Costs — cost data appears after the first cost collection cycle
If data does not appear:
- Check agent logs: run docker logs gocloudera-agent, or check ./logs/unified_agent.log
- Verify your API key is correct
- Verify cloud credentials have the required permissions
- Check the agent health:
python main.py --health
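If you want to automate the "data within 5 minutes" check, for example in a deployment script, a polling loop with a deadline is one option. This is a minimal sketch: has_data stands for any callable you supply that returns True once data is visible, and the function name, the 15-second interval, and the injectable clock and sleep parameters are assumptions made for this example. Nothing here calls a GoCloudera API.

```python
import time


def wait_for_data(has_data, timeout_s=300.0, interval_s=15.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll has_data() until it returns True or timeout_s elapses.

    Returns True if data showed up in time, False otherwise.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if has_data():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The default timeout of 300 seconds matches the one monitoring cycle mentioned above. If the function returns False, continue with the troubleshooting steps in this section.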
Step 5: Set Up Alerts
- Navigate to Alert Rules in the sidebar
- Click Create Rule
- Configure your first alert:
  - Metric: GPU Utilization
  - Operator: Less than
  - Threshold: 10%
  - Duration: 30 minutes
  - Severity: Medium
- Set up a notification channel in Customization → Notifications:
  - Slack webhook URL for real-time alerts
  - Email for daily digests
  - PagerDuty for critical alerts
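To make the threshold-plus-duration semantics of the example rule concrete, here is a small sketch of how such a rule could be evaluated over a series of utilization samples. It is an illustration under stated assumptions, not GoCloudera's evaluation logic: the AlertRule type, the rule_fires function, and the interpretation "every sample in the most recent run is below the threshold, and that run spans at least the duration" are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AlertRule:
    metric: str
    threshold: float      # percent
    duration: timedelta
    severity: str


def rule_fires(rule, samples):
    """Evaluate a "less than" rule.

    samples is a list of (timestamp, value) pairs sorted oldest to newest.
    The rule fires when the latest unbroken run of readings below the
    threshold spans at least rule.duration.
    """
    breach_start = None
    for timestamp, value in samples:
        if value < rule.threshold:
            if breach_start is None:
                breach_start = timestamp
        else:
            breach_start = None  # a reading at or above the threshold resets the run
    if breach_start is None:
        return False
    return samples[-1][0] - breach_start >= rule.duration


idle_gpu = AlertRule("GPU Utilization", 10.0, timedelta(minutes=30), "Medium")
start = datetime(2024, 1, 1, 12, 0)
# Eight samples, five minutes apart, all at 5% utilization.
idle_samples = [(start + timedelta(minutes=5 * i), 5.0) for i in range(8)]
print(rule_fires(idle_gpu, idle_samples))
```

Requiring a sustained breach rather than a single low reading keeps short pauses between jobs from triggering the alert.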
Step 6: Configure Enforcement Policies
- Navigate to Enforcement in the sidebar
- Use a template to get started:
  - “Idle GPU Auto-Stop”: automatically stops instances idle for 15+ minutes
  - “Weekend Cost Saver”: scales down 75% on Sat/Sun
  - “Dev/Test Auto-Shutdown”: stops dev instances at 7pm
- Set the execution mode:
  - Notify Only: sends alerts but takes no action (start here)
  - Approval Required: queues actions for admin approval
  - Auto: executes immediately (use once you trust the rules)
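The three execution modes differ only in what happens once a policy matches. The sketch below models that decision as a small dispatch function. The ExecutionMode enum, the dispatch function, and the returned step names are hypothetical and exist only to illustrate the progression from notify-only to fully automatic. They are not part of the GoCloudera API.

```python
from enum import Enum


class ExecutionMode(Enum):
    NOTIFY_ONLY = "notify_only"
    APPROVAL_REQUIRED = "approval_required"
    AUTO = "auto"


def dispatch(mode, action):
    """Return the step taken for a matched policy under the given mode."""
    if mode is ExecutionMode.NOTIFY_ONLY:
        return ("notify", action)              # alert only, no change to the instance
    if mode is ExecutionMode.APPROVAL_REQUIRED:
        return ("queue_for_approval", action)  # an admin must approve before anything runs
    if mode is ExecutionMode.AUTO:
        return ("execute", action)             # runs immediately
    raise ValueError(f"Unknown execution mode: {mode!r}")


print(dispatch(ExecutionMode.NOTIFY_ONLY, "stop instance"))
```

Starting in the notify-only mode lets you review what a policy would have done before any instance is actually stopped or scaled.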
What’s Next
- API Reference — integrate GoCloudera into your workflows
- Agent Configuration — advanced agent settings
- Architecture — understand how the platform works
- Customization — branding, SSO, and tenant settings