Google AI Platform (Vertex AI)
Overview
Google AI Platform (now Vertex AI) is Google's unified AI and machine learning platform, providing a comprehensive set of services that cover the entire ML lifecycle from development to production deployment. As of 2025 it has matured into an enterprise-grade foundation for AI development, featuring the Gemini 2.5 model family, Model Garden, and extensive AutoML capabilities.
Details
Google AI Platform (Vertex AI) serves as Google Cloud's core AI and machine learning service, consisting of the following key components:
Core Services
- Vertex AI Workbench: Integrated JupyterLab environment for data analysis and model development
- AutoML: Build high-quality custom models without writing code
- Model Garden: Unified access to Google, third-party, and open-source models
- Vertex AI Pipelines: Automation and management of ML workflows
- Feature Store: Centralized management and reuse of machine learning features
Major 2025 Updates
- Gemini 2.5 Pro: Latest multimodal model for complex reasoning tasks (see the usage sketch after this list)
- Gemini 2.0 Flash: High-speed model with image generation capabilities (preview)
- Expanded Model Garden: New models including Anthropic Claude, Llama 4, Qwen3, Gemma 3
- Gen AI Evaluation Service: Generative AI model evaluation service now GA (Generally Available)
- Private Service Connect: Enhanced private connectivity for ML pipeline execution
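The Gemini models above are callable directly through the Vertex AI SDK. A minimal text-generation sketch follows; the model ID string and prompt are illustrative assumptions, and the model versions enabled for a given project may differ.
import vertexai
from vertexai.generative_models import GenerativeModel

# Initialize the SDK (placeholder project and region)
vertexai.init(project="your-project-id", location="us-central1")

# Model ID is an assumption; check Model Garden for the versions available to you
model = GenerativeModel("gemini-2.5-pro")
response = model.generate_content("Summarize the ML lifecycle in three bullet points.")
print(response.text)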
Technical Features
- Unified Platform: Centralized management from data preparation to model deployment
- Scalable: Automatic scaling on Google Cloud infrastructure
- Enterprise-Ready: Security, compliance, and governance features
- Multi-Cloud Support: Integration with other cloud providers via Anthos
Advantages and Disadvantages
Advantages
- Comprehensive ML Platform: Consistent workflow from data preprocessing to production operations
- Rich Pre-trained Models: Access to cutting-edge Google models like Gemini, Imagen, Chirp
- AutoML Capabilities: Build high-quality models without extensive ML expertise
- Google Cloud Integration: Seamless connectivity with BigQuery, Cloud Storage, GKE (see the sketch after this list)
- Powerful MLOps: Automated version control, experiment tracking, and model monitoring
- Global Endpoints: Low-latency service delivery worldwide
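As a concrete sketch of the BigQuery integration noted above, a Vertex AI tabular dataset can be created directly from a BigQuery table; the project, region, and table path are placeholders.
from google.cloud import aiplatform

aiplatform.init(project="your-project-id", location="us-central1")

# Create a tabular dataset directly from a BigQuery table
dataset = aiplatform.TabularDataset.create(
    display_name="churn-dataset-from-bq",
    bq_source="bq://your-project.your_dataset.customer_table",
)
print(dataset.resource_name)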
Disadvantages
- Steep Learning Curve: The breadth of features adds complexity and requires significant initial learning time
- Complex Pricing: Usage-based pricing model makes budget management challenging
- Vendor Lock-in: Google Cloud-specific features make migration to other platforms difficult
- AutoML Limitations: Some AutoML features were migrated to Gemini prompt tuning after September 2024
- Private Connectivity Constraints: Private connectivity (e.g., Private Service Connect) requires detailed network configuration in enterprise environments
Code Examples
1. Basic Setup and Authentication
# Install required libraries
%pip install --upgrade --quiet google-cloud-aiplatform
# Initialize Vertex AI SDK
import vertexai
from google.cloud import aiplatform
# Project configuration
PROJECT_ID = "your-project-id"
LOCATION = "us-central1"
STAGING_BUCKET = "gs://your-staging-bucket"
# Initialize Vertex AI
vertexai.init(project=PROJECT_ID, location=LOCATION)
aiplatform.init(
    project=PROJECT_ID,
    location=LOCATION,
    staging_bucket=STAGING_BUCKET,
    experiment='my-ml-experiment'
)
print(f"Vertex AI initialized for project: {PROJECT_ID}")
2. AutoML and Custom Model Training
# Tabular dataset creation and AutoML training
from google.cloud import aiplatform
# Load dataset
dataset = aiplatform.TabularDataset.create(
    display_name="customer-churn-dataset",
    gcs_source=["gs://your-bucket/customer_data.csv"],
    sync=True
)
# Configure AutoML training job
automl_job = aiplatform.AutoMLTabularTrainingJob(
    display_name="churn-prediction-automl",
    optimization_prediction_type="classification",
    optimization_objective="maximize-au-prc",
    column_specs={
        # Transformation type per input column; the target column ("churn")
        # is passed to run() below rather than listed here
        "customer_id": "auto",
    }
)
# Execute model training
model = automl_job.run(
    dataset=dataset,
    target_column="churn",
    training_fraction_split=0.8,
    validation_fraction_split=0.1,
    test_fraction_split=0.1,
    budget_milli_node_hours=1000,
    model_display_name="churn-prediction-model",
    disable_early_stopping=False,
)
print(f"AutoML training completed. Model: {model.display_name}")
3. Model Deployment and Prediction
# Model endpoint deployment
from google.cloud import aiplatform
# Get an existing model (MODEL_ID is a placeholder for your trained model's ID)
MODEL_ID = "your-model-id"
model = aiplatform.Model(f"projects/{PROJECT_ID}/locations/{LOCATION}/models/{MODEL_ID}")
# Create endpoint and deploy
endpoint = model.deploy(
    deployed_model_display_name="churn-prediction-endpoint",
    machine_type="n1-standard-2",
    min_replica_count=1,
    max_replica_count=5,
    traffic_percentage=100,
    sync=True
)
# Execute online predictions (AutoML tabular models expect column-keyed instances)
prediction_data = [
    {"age": "25", "income": "50000", "years_used": "2", "support_tickets": "1", "premium_member": "0"},
    {"age": "45", "income": "85000", "years_used": "5", "support_tickets": "0", "premium_member": "1"},
]
predictions = endpoint.predict(instances=prediction_data)
print(f"Churn predictions: {predictions.predictions}")
# Execute batch prediction job
batch_job = model.batch_predict(
    job_display_name="batch-churn-prediction",
    gcs_source=["gs://your-bucket/batch_prediction_data.csv"],
    gcs_destination_prefix="gs://your-bucket/predictions/",
    instances_format="csv",
    predictions_format="jsonl",
    machine_type="n1-standard-4",
    sync=False
)
print(f"Batch prediction job started: {batch_job.display_name}")
4. Batch Processing and Pipeline Management
# ML pipeline built with Vertex AI Pipelines (components defined via the KFP v2 SDK)
from kfp import compiler, dsl
from google.cloud.aiplatform import PipelineJob

# Data preprocessing component (placeholder body; real logic would clean and split the data)
@dsl.component(base_image="python:3.10")
def preprocess_data(dataset_path: str, processed_data: dsl.Output[dsl.Dataset]):
    with open(processed_data.path, "w") as f:
        f.write(f"processed from {dataset_path}")

# Model training component (placeholder body; real logic would fit and save a model)
@dsl.component(base_image="python:3.10")
def train_model(
    processed_data: dsl.Input[dsl.Dataset],
    learning_rate: float,
    epochs: int,
    model: dsl.Output[dsl.Model],
):
    with open(model.path, "w") as f:
        f.write(f"model(lr={learning_rate}, epochs={epochs})")

# Model evaluation component (placeholder body; real logic would score held-out data)
@dsl.component(base_image="python:3.10")
def evaluate_model(model: dsl.Input[dsl.Model], metrics: dsl.Output[dsl.Metrics]):
    metrics.log_metric("accuracy", 0.0)

# Pipeline definition wiring the components together
@dsl.pipeline(name="ml-training-pipeline")
def training_pipeline(project_id: str, dataset_path: str, model_name: str):
    preprocess_task = preprocess_data(dataset_path=dataset_path)
    training_task = train_model(
        processed_data=preprocess_task.outputs["processed_data"],
        learning_rate=0.01,
        epochs=100,
    )
    evaluate_model(model=training_task.outputs["model"])

# Compile the pipeline to the JSON spec submitted below
compiler.Compiler().compile(training_pipeline, package_path="pipeline.json")
# Execute the compiled pipeline
pipeline_job = PipelineJob(
    display_name="automated-ml-pipeline",
    template_path="pipeline.json",
    parameter_values={
        "project_id": PROJECT_ID,
        "dataset_path": "gs://your-bucket/raw_data.csv",
        "model_name": "automated-model"
    },
    pipeline_root="gs://your-bucket/pipeline-root/"
)
# Run pipeline
pipeline_job.run(
    service_account="[email protected]",
    sync=True
)
print(f"Pipeline execution completed: {pipeline_job.display_name}")
5. Feature Store and Data Management
# Using Vertex AI Feature Store
from google.cloud.aiplatform import Featurestore, EntityType, Feature
# Create Feature Store
featurestore = Featurestore.create(
    featurestore_id="customer-features",
    online_store_fixed_node_count=1,  # provision online serving nodes
    location=LOCATION,
    sync=True
)
# Create entity type
entity_type = EntityType.create(
    entity_type_id="customer",
    featurestore_name=featurestore.resource_name,
    sync=True
)
# Define features
age_feature = Feature.create(
    feature_id="age",
    entity_type_name=entity_type.resource_name,
    value_type="INT64",
    sync=True
)
income_feature = Feature.create(
    feature_id="annual_income",
    entity_type_name=entity_type.resource_name,
    value_type="DOUBLE",
    sync=True
)
# Ingest feature data
import pandas as pd
# Prepare sample data
feature_data = pd.DataFrame({
    "customer_id": ["cust_001", "cust_002", "cust_003"],
    "age": [25, 45, 35],
    "annual_income": [50000.0, 85000.0, 70000.0],
    "event_time": pd.to_datetime(["2025-01-01", "2025-01-01", "2025-01-01"])
})
# Execute batch ingestion from the DataFrame into the features created above
entity_type.ingest_from_df(
    feature_ids=["age", "annual_income"],
    feature_time="event_time",
    df_source=feature_data,
    entity_id_field="customer_id",
)
print(f"Feature Store setup completed: {featurestore.resource_name}")
6. MLOps and Model Monitoring
# Model monitoring for the deployed endpoint
from google.cloud import aiplatform
from google.cloud.aiplatform import model_monitoring

# Skew detection compares serving data against the training data baseline
skew_config = model_monitoring.SkewDetectionConfig(
    data_source="gs://your-bucket/customer_data.csv",
    target_field="churn",
    skew_thresholds={"age": 0.8, "annual_income": 0.8},
)
# Drift detection compares recent serving data against earlier serving data
drift_config = model_monitoring.DriftDetectionConfig(
    drift_thresholds={"age": 0.8, "annual_income": 0.8},
)
objective_config = model_monitoring.ObjectiveConfig(skew_config, drift_config)
# Create the monitoring job against the endpoint deployed earlier
monitoring_job = aiplatform.ModelDeploymentMonitoringJob.create(
    display_name="model-performance-monitoring",
    endpoint=endpoint,
    logging_sampling_strategy=model_monitoring.RandomSampleConfig(sample_rate=0.8),
    schedule_config=model_monitoring.ScheduleConfig(monitor_interval=1),  # hours
    alert_config=model_monitoring.EmailAlertConfig(user_emails=["[email protected]"]),
    objective_configs=objective_config,
)
# Model version management and experiment tracking with Vertex AI Experiments
from google.cloud import aiplatform

aiplatform.init(
    project=PROJECT_ID,
    location=LOCATION,
    experiment="model-optimization-experiment",
)
# Start a run, then log parameters and metrics against it
with aiplatform.start_run("optimization-run-1"):
    # Log parameters
    aiplatform.log_params({
        "learning_rate": 0.01,
        "batch_size": 32,
        "optimizer": "adam",
    })
    # Log metrics
    aiplatform.log_metrics({
        "accuracy": 0.95,
        "precision": 0.93,
        "recall": 0.96,
        "f1_score": 0.94,
    })
    # Model artifacts can additionally be registered as versions in the
    # Vertex AI Model Registry for lineage alongside the experiment run
# Traffic splitting for A/B testing; keys are the endpoint's deployed model IDs
endpoint.update(traffic_split={
    "deployed-model-id-v1": 70,  # 70% of traffic
    "deployed-model-id-v2": 30,  # 30% of traffic
})
print("MLOps monitoring and experimentation setup completed")