# Setting Up Triggers

Learn how to configure triggers to automatically or manually execute data pipelines.

## Overview

Triggers determine when and how data pipelines execute. FoundationaLLM currently supports two trigger types, with a third planned:
- Schedule: Run pipelines on a recurring schedule (cron-based) - ✅ Available
- Manual: Run pipelines on-demand through the UI or API - ✅ Available
- Event: Run pipelines in response to data source events - ⚠️ Not Currently Available (planned for future release)
Note: Triggers are defined as part of the data pipeline configuration, not as separate resources.
## Currently Available Trigger Types

### Schedule Triggers

Execute pipelines automatically on a recurring schedule.
Use Cases:
- Daily data refreshes
- Nightly batch processing
- Weekly full reindexing
- Regular data synchronization
Configuration:

| Field | Type | Description | Example |
|---|---|---|---|
| `name` | string | Trigger identifier | "Daily Refresh" |
| `trigger_type` | string | Must be "Schedule" | "Schedule" |
| `trigger_cron_schedule` | string | Cron expression | "0 6 * * *" |
| `parameter_values` | object | Parameter overrides | See below |
Cron Expression Format:

```
┌───────────── minute (0 - 59)
│ ┌───────────── hour (0 - 23)
│ │ ┌───────────── day of month (1 - 31)
│ │ │ ┌───────────── month (1 - 12)
│ │ │ │ ┌───────────── day of week (0 - 6, Sunday = 0)
│ │ │ │ │
│ │ │ │ │
* * * * *
```
Common Cron Patterns:

| Pattern | Description | Example |
|---|---|---|
| `0 6 * * *` | Daily at 6:00 AM | Daily refresh |
| `0 */6 * * *` | Every 6 hours | Frequent updates |
| `0 0 * * 0` | Weekly on Sunday at midnight | Weekly full sync |
| `0 2 1 * *` | Monthly on the 1st at 2:00 AM | Monthly refresh |
| `0 22 * * 1-5` | Weekdays at 10:00 PM | Business-day processing |
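To sanity-check expressions like those above before saving a trigger, a minimal stdlib-only matcher covering just the syntax used in this table (`*`, `*/n`, ranges, comma lists, and single values, with Sunday = 0) might look like this sketch:

```python
from datetime import datetime

def _field_matches(field: str, value: int) -> bool:
    """Check one cron field (supports *, */n, a-b, comma lists, single values)."""
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            if value % int(part[2:]) == 0:
                return True
        elif "-" in part:
            lo, hi = map(int, part.split("-"))
            if lo <= value <= hi:
                return True
        elif int(part) == value:
            return True
    return False

def cron_matches(expr: str, when: datetime) -> bool:
    """True if `when` satisfies a five-field cron expression (Sunday = 0)."""
    minute, hour, dom, month, dow = expr.split()
    return (_field_matches(minute, when.minute)
            and _field_matches(hour, when.hour)
            and _field_matches(dom, when.day)
            and _field_matches(month, when.month)
            # Python's weekday() uses Monday = 0; shift to cron's Sunday = 0.
            and _field_matches(dow, (when.weekday() + 1) % 7))

# "0 22 * * 1-5": weekdays at 10:00 PM
print(cron_matches("0 22 * * 1-5", datetime(2025, 1, 6, 22, 0)))  # Monday -> True
print(cron_matches("0 22 * * 1-5", datetime(2025, 1, 5, 22, 0)))  # Sunday -> False
```

This intentionally ignores cron features the table does not use (names like `MON`, `L`, seconds fields); a production scheduler handles far more.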
Example Schedule Trigger:

```json
{
  "name": "Daily Morning Refresh",
  "trigger_type": "Schedule",
  "trigger_cron_schedule": "0 6 * * *",
  "parameter_values": {
    "DataSource.MyDataLake.Folders": ["data/documents"],
    "Stage.Partition.PartitioningStrategy": "Token",
    "Stage.Embed.EmbeddingModel": "text-embedding-3-large"
  }
}
```
### Manual Triggers

Execute pipelines on demand through user action or an API call.
Use Cases:
- Ad-hoc data processing
- Testing configurations
- Initial data loads
- On-demand updates
Configuration:

| Field | Type | Description |
|---|---|---|
| `name` | string | Trigger identifier |
| `trigger_type` | string | Must be "Manual" |
| `parameter_values` | object | Default parameters (can be overridden) |
Invocation Methods:
- Management Portal "Run" button
- REST API endpoint
- SDK method call
Example Manual Trigger:

```json
{
  "name": "Manual Full Refresh",
  "trigger_type": "Manual",
  "parameter_values": {
    "DataSource.DataLake.Folders": ["all-documents"],
    "Stage.Partition.PartitionSizeTokens": 500
  }
}
```
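Invoking a manual trigger over REST can be sketched with the standard library. Note that the base URL, instance ID, request body shape, and `/trigger` route below are illustrative assumptions, not the documented FoundationaLLM endpoint; consult the Management API reference for the actual route.

```python
import json
import urllib.request

# Hypothetical values for illustration only.
BASE_URL = "https://management.example.com"
INSTANCE_ID = "00000000-0000-0000-0000-000000000000"

def build_trigger_request(pipeline: str, trigger: str,
                          token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for a manual pipeline run."""
    url = (f"{BASE_URL}/instances/{INSTANCE_ID}"
           f"/providers/FoundationaLLM.DataPipeline/dataPipelines/{pipeline}/trigger")
    body = json.dumps({"trigger_name": trigger}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )

req = build_trigger_request("my-pipeline", "Manual Full Refresh", "<access-token>")
print(req.get_method(), req.full_url)
```

Sending the request is then a single `urllib.request.urlopen(req)` call once the real endpoint and a valid token are in place.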
## Future Feature: Event Triggers
⚠️ Not Currently Available: Event-based triggers are planned for a future release but are not yet implemented in the current version of FoundationaLLM.
Planned Use Cases:
- Process files as soon as they're added
- Respond to data updates
- Real-time data ingestion
- Incremental processing
Planned Configuration (for future reference):

| Field | Type | Description |
|---|---|---|
| `name` | string | Trigger identifier |
| `trigger_type` | string | Will be "Event" |
| `parameter_values` | object | Parameter overrides |
Future Event Types (planned):
- File Added: New files in storage
- File Modified: Existing files updated
- File Deleted: Files removed (for cleanup)
Note: When event triggers become available, they will require data source plugin support. Not all data sources will necessarily support event-based triggering.
## Parameter Values

Triggers must provide parameter values for all required pipeline parameters. These parameters are defined by the data pipeline plugins (see PluginPackageManager.cs for complete parameter definitions).
### Parameter Naming Convention

Parameters use a hierarchical naming structure:

Data Source Parameters:

```
DataSource.{DataSourceName}.{ParameterName}
```

Stage Parameters:

```
Stage.{StageName}.{ParameterName}
```

Stage Dependency Parameters:

```
Stage.{StageName}.Dependency.{DependencyPluginName}.{ParameterName}
```
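The three patterns can be captured as small helpers when generating trigger configurations programmatically (illustrative only; the names passed in the example call are hypothetical):

```python
def data_source_param(data_source: str, name: str) -> str:
    """DataSource.{DataSourceName}.{ParameterName}"""
    return f"DataSource.{data_source}.{name}"

def stage_param(stage: str, name: str) -> str:
    """Stage.{StageName}.{ParameterName}"""
    return f"Stage.{stage}.{name}"

def stage_dependency_param(stage: str, dependency_plugin: str, name: str) -> str:
    """Stage.{StageName}.Dependency.{DependencyPluginName}.{ParameterName}"""
    return f"Stage.{stage}.Dependency.{dependency_plugin}.{name}"

print(stage_dependency_param("Partition", "TokenContentTextPartitioning",
                             "PartitionSizeTokens"))
# Stage.Partition.Dependency.TokenContentTextPartitioning.PartitionSizeTokens
```

Building keys this way avoids the typos that exact-match parameter lookup would otherwise surface only at run time.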
### Example Parameter Values

```jsonc
{
  "parameter_values": {
    // Data source parameters
    "DataSource.VGDataLake.Folders": [
      "vectorization-input/Documents"
    ],

    // Stage parameters
    "Stage.Partition.PartitioningStrategy": "Token",
    "Stage.Embed.EmbeddingModel": "text-embedding-3-large",
    "Stage.Embed.EmbeddingDimensions": 2048,
    "Stage.Index.IndexName": "documents-index",
    "Stage.Index.IndexPartitionName": "Documents",

    // Dependency parameters
    "Stage.Partition.Dependency.TokenContentTextPartitioning.PartitionSizeTokens": 400,
    "Stage.Partition.Dependency.TokenContentTextPartitioning.PartitionOverlapTokens": 100
  }
}
```
### Required vs. Optional Parameters
Schedule and Event Triggers:
- Must provide ALL required parameters
- No user interaction possible during execution
- Missing parameters cause immediate failure
Manual Triggers:
- Can provide default values
- User may be prompted for missing values
- More flexibility during invocation
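Because Schedule triggers fail immediately on a missing required parameter, it is worth checking a trigger definition before saving it. A pre-save check along these lines is a sketch, not FoundationaLLM's actual validation logic, and the required-parameter names are examples:

```python
def missing_parameters(trigger: dict, required: set) -> set:
    """Return the required parameter names absent from a trigger definition."""
    provided = set(trigger.get("parameter_values", {}))
    return required - provided

# Example required set; real pipelines derive this from their plugins.
required = {
    "DataSource.MyDataLake.Folders",
    "Stage.Partition.PartitioningStrategy",
}

trigger = {
    "name": "Daily Morning Refresh",
    "trigger_type": "Schedule",
    "trigger_cron_schedule": "0 6 * * *",
    "parameter_values": {
        "DataSource.MyDataLake.Folders": ["data/documents"],
    },
}

missing = missing_parameters(trigger, required)
if trigger["trigger_type"] in ("Schedule", "Event") and missing:
    # Schedule/Event runs have no user to prompt, so this is a hard error.
    print(f"Trigger would fail at run time; missing: {sorted(missing)}")
```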
## Creating Triggers

### In the Management Portal

1. Navigate to Data Pipelines
2. Edit the pipeline
3. Scroll to the Triggers section
4. Click Add Trigger
5. Select the trigger type
6. Configure parameters
7. Save the pipeline
### Via API

```http
POST /instances/{instanceId}/providers/FoundationaLLM.DataPipeline/dataPipelines
Content-Type: application/json

{
  "name": "my-pipeline",
  "triggers": [
    {
      "name": "Scheduled Trigger",
      "trigger_type": "Schedule",
      "trigger_cron_schedule": "0 6 * * *",
      "parameter_values": { ... }
    }
  ]
}
```
## Managing Multiple Triggers
A single pipeline can have multiple triggers.
Common Pattern: Combine scheduled and manual triggers
```json
{
  "triggers": [
    {
      "name": "Daily Scheduled Run",
      "trigger_type": "Schedule",
      "trigger_cron_schedule": "0 6 * * *",
      "parameter_values": { ... }
    },
    {
      "name": "On-Demand Full Refresh",
      "trigger_type": "Manual",
      "parameter_values": { ... }
    }
  ]
}
```
Benefits:
- Regular automated processing
- Flexibility for immediate updates
- Different parameter sets for different scenarios
## Best Practices

### Schedule Triggers
Timing:
- Schedule during low-usage periods
- Avoid peak business hours
- Consider time zones
Frequency:
- Balance freshness vs. resource usage
- More frequent = higher costs
- Less frequent = stale data
Overlap Prevention:
- Ensure previous run completes before next starts
- Use appropriate schedule intervals
- Monitor run durations
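The interval-vs-duration rule of thumb can be checked numerically. This is a rough sketch: it assumes you know the average run duration and a fixed schedule interval, and the 1.5x safety factor is an arbitrary choice:

```python
from datetime import timedelta

def overlap_risk(interval: timedelta, avg_duration: timedelta,
                 safety_factor: float = 1.5) -> bool:
    """True if a run may still be in flight when the next scheduled run starts."""
    return avg_duration * safety_factor >= interval

# Hourly schedule ("0 * * * *") with runs averaging 50 minutes: risky.
print(overlap_risk(timedelta(hours=1), timedelta(minutes=50)))  # True
# Daily schedule ("0 6 * * *") with the same runs: comfortable headroom.
print(overlap_risk(timedelta(days=1), timedelta(minutes=50)))   # False
```

If the check flags a risk, either widen the schedule interval or shrink the work each run performs.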
### Manual Triggers
Default Values:
- Provide sensible defaults
- Document expected parameters
- Test common invocation patterns
Permissions:
- Control who can trigger manually
- Use RBAC appropriately
- Audit manual executions
## Monitoring Triggers

### Schedule Execution
Check that scheduled runs execute on time:
- Navigate to Data Pipeline Runs
- Filter by pipeline name
- Review execution times
- Verify against schedule
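If run timestamps can be exported, the on-time verification above can be scripted. This is a sketch; the list-of-datetimes input shape is an assumption, not the actual run-history format:

```python
from datetime import datetime, timedelta

def late_runs(expected, actual, tolerance=timedelta(minutes=5)):
    """Flag expected start times with no actual run within the tolerance window."""
    problems = []
    for exp in expected:
        # Nearest actual run to this expected start, if any runs exist at all.
        nearest = min(actual, key=lambda a: abs(a - exp), default=None)
        if nearest is None or abs(nearest - exp) > tolerance:
            problems.append(exp)
    return problems

# Daily 6:00 AM schedule over three days; the Jan 7 run never happened.
expected = [datetime(2025, 1, d, 6, 0) for d in (6, 7, 8)]
actual = [datetime(2025, 1, 6, 6, 1), datetime(2025, 1, 8, 6, 2)]
print(late_runs(expected, actual))  # only the Jan 7 start is flagged
```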
### Trigger History
Track trigger performance:
- Success rate per trigger
- Average execution time
- Failure patterns
- Resource utilization
## Troubleshooting

### Schedule Trigger Not Firing
Possible Causes:
- Pipeline not active
- Invalid cron expression
- Time zone mismatch
- System maintenance
Solutions:
- Verify pipeline is active
- Test cron expression
- Check system status
- Review trigger configuration
### Manual Trigger Fails
Possible Causes:
- Missing required parameters
- Invalid parameter values
- Insufficient permissions
- Pipeline already running
Solutions:
- Review parameter requirements
- Validate parameter values
- Check user permissions
- Verify no active runs
### Parameter Errors
Problem: "Missing required parameter" error
Solution:
- Review pipeline configuration
- Identify required parameters
- Add them to the trigger's `parameter_values`
- Verify parameter names match exactly
Problem: "Invalid parameter value" error
Solution:
- Check parameter type (string, int, etc.)
- Verify value format
- Confirm resource references exist
- Test with known-good values
## Common Scenarios

### Scenario 1: Daily Full Refresh
```json
{
  "name": "Daily Full Refresh",
  "trigger_type": "Schedule",
  "trigger_cron_schedule": "0 2 * * *",
  "parameter_values": {
    "DataSource.Storage.Folders": ["/all-data"],
    "Stage.Partition.PartitionSizeTokens": 400
  }
}
```
### Scenario 2: Hourly Incremental Update
```json
{
  "name": "Hourly Incremental",
  "trigger_type": "Schedule",
  "trigger_cron_schedule": "0 * * * *",
  "parameter_values": {
    "DataSource.Storage.Folders": ["/recent"],
    "Stage.Index.IndexPartitionName": "incremental"
  }
}
```
### Scenario 3: Business Hours Processing
```json
{
  "name": "Business Hours Processing",
  "trigger_type": "Schedule",
  "trigger_cron_schedule": "0 9-17 * * 1-5",
  "parameter_values": {
    "DataSource.SharePoint.DocumentLibraries": ["Active Documents"]
  }
}
```
### Scenario 4: On-Demand Testing
```json
{
  "name": "Manual Test Run",
  "trigger_type": "Manual",
  "parameter_values": {
    "DataSource.Storage.Folders": ["/test-data"],
    "Stage.Embed.EmbeddingModel": "text-embedding-3-small"
  }
}
```
Note: Event-based triggers are not currently available. For near-real-time processing, consider a frequent scheduled trigger instead (e.g., `*/15 * * * *` to run every 15 minutes).