Monitoring Data Pipelines
Learn how to monitor data pipeline execution, track processing status, and understand performance characteristics.
Overview
Monitoring data pipelines helps you:
- Track processing progress in real-time
- Identify and diagnose failures quickly
- Understand processing performance and latency
- Plan capacity and optimize scheduling
- Measure the effectiveness of performance improvements
Performance and Latency
Understanding Processing Latency
Data pipeline latency is the time between when data is submitted and when it becomes available for agent queries. FoundationaLLM reduces this latency through the following optimizations:
| Optimization | Benefit |
|---|---|
| Parallel Stage Processing | Multiple stages can run concurrently where dependencies allow |
| Batch Processing | Documents are processed in optimized batches |
| Efficient Embedding | Text embedding uses optimized batch sizes |
| Incremental Indexing | Only changed content is reprocessed |
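As a minimal sketch of the latency definition above, the following computes latency from two timestamps. The function name and the sample timestamps are hypothetical and chosen for illustration; they are not part of FoundationaLLM.

```python
from datetime import datetime, timezone


def pipeline_latency_seconds(submitted_at: datetime, available_at: datetime) -> float:
    """Return the seconds between data submission and query availability."""
    if available_at < submitted_at:
        raise ValueError("available_at must not precede submitted_at")
    return (available_at - submitted_at).total_seconds()


# Hypothetical timestamps for a single document.
submitted = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
available = datetime(2024, 1, 1, 12, 4, 30, tzinfo=timezone.utc)

latency = pipeline_latency_seconds(submitted, available)
print(latency)  # 270.0
```

Recording this value per run gives a baseline for judging whether a configuration change actually improved latency.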
Factors Affecting Latency
| Factor | Impact | Mitigation |
|---|---|---|
| Document Size | Larger documents take longer to process | Split large documents |
| Document Count | More documents increase total time | Use parallel processing |
| Embedding Model | Model complexity affects speed | Balance quality vs. speed |
| Index Size | Large indexes may slow indexing | Use index partitions |
| Network Latency | Remote services add overhead | Use regional deployments |
Performance Configuration
To optimize pipeline performance:
- Adjust Batch Sizes: Larger batches can improve throughput but increase memory usage
- Configure Parallelism: Set appropriate concurrent processing limits
- Tune Chunk Sizes: Balance context quality against processing speed
- Schedule Off-Peak: Run large pipelines during low-usage periods
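The interaction between batch size and parallelism can be sketched generically as follows. This illustrates the trade-off only and is not FoundationaLLM's implementation; `batched`, `process_in_batches`, and `count_characters` are hypothetical names, and the handler stands in for a real processing stage.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def process_in_batches(
    items: Sequence[str],
    handler: Callable[[Sequence[str]], int],
    batch_size: int,
    max_workers: int,
) -> List[int]:
    """Run handler over each batch, with at most max_workers batches in flight."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input batches.
        return list(pool.map(handler, batched(items, batch_size)))


def count_characters(batch: Sequence[str]) -> int:
    """Stand-in for real work such as embedding a batch of documents."""
    return sum(len(document) for document in batch)


documents = ["alpha", "beta", "gamma", "delta", "epsilon"]
totals = process_in_batches(documents, count_characters, batch_size=2, max_workers=2)
print(totals)  # [9, 10, 7]
```

Larger `batch_size` values mean fewer handler calls but more data held in memory per call, while `max_workers` caps how many batches are processed at once.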
Monitoring Locations
Pipeline monitoring is available in two places:
- Data Pipelines - View pipeline configurations and status
- Data Pipeline Runs - View detailed execution history
Pipeline Status Indicators
In the Pipelines List
| Column | Description |
|---|---|
| Active | Whether the pipeline is enabled |
| Last Run | Status of the most recent execution, when this column is displayed |
In Pipeline Runs
| Status | Description |
|---|---|
| Running | Currently processing data |
| Completed | Finished successfully |
| Failed | Encountered an error |
| Cancelled | Manually stopped |
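If you process run statuses in your own tooling, the four states above can be modeled as an enumeration. This is a sketch under the assumption that statuses arrive as the display strings shown in the table; the `RunStatus` type and `is_terminal` helper are hypothetical.

```python
from enum import Enum


class RunStatus(Enum):
    RUNNING = "Running"
    COMPLETED = "Completed"
    FAILED = "Failed"
    CANCELLED = "Cancelled"


# A run in any of these states will not change status again.
TERMINAL_STATUSES = frozenset(
    {RunStatus.COMPLETED, RunStatus.FAILED, RunStatus.CANCELLED}
)


def is_terminal(status: RunStatus) -> bool:
    """Return True when the run has stopped, whether or not it succeeded."""
    return status in TERMINAL_STATUSES


print(is_terminal(RunStatus("Running")))  # False
print(is_terminal(RunStatus("Failed")))   # True
```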
Real-Time Monitoring
Watching Active Runs
- Navigate to Data Pipeline Runs
- Filter by Status: Running
- Use the refresh button to update status
- Watch for completion or failures
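The manual refresh steps above amount to a polling loop. The sketch below shows that loop in generic form. `fetch_status` is a hypothetical callable that you would supply yourself; this page does not document a status API, so none is assumed here.

```python
import time
from typing import Callable


def wait_for_run(
    fetch_status: Callable[[], str],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 3600.0,
) -> str:
    """Poll until the run leaves the Running state, then return its final status."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status != "Running":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("run was still Running when the timeout elapsed")
        time.sleep(poll_seconds)


# Simulated status source standing in for a real lookup.
simulated = iter(["Running", "Running", "Completed"])
final_status = wait_for_run(lambda: next(simulated), poll_seconds=0.0)
print(final_status)  # Completed
```

A fixed polling interval keeps the example short; for long runs, a longer interval reduces load on whatever service answers the status lookups.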
Progress Tracking
TODO: Document real-time progress indicators if available, such as:
- Items processed count
- Current stage indicator
- Estimated time remaining
- Processing rate metrics
Run Details
Click a specific run to view its details.
Execution Log
TODO: Document the detailed execution log view, including:
- Stage-by-stage progress
- Timestamps for each step
- Items processed per stage
- Error messages and stack traces
Performance Metrics
TODO: Document available performance metrics:
- Total duration
- Time per stage
- Items per second
- Resource utilization
Alerting and Notifications
TODO: Document alerting capabilities if available:
- Failure notifications
- Completion notifications
- Integration with Azure Monitor or other systems
Historical Analysis
Viewing Trends
Use the Pipeline Runs page filters to analyze patterns:
- Filter to a specific pipeline
- Set a time range (e.g., Last 30 Days)
- Review success rates and durations
- Identify recurring issues
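If you export or record run history, the success rate and typical duration can be computed as sketched below. The `RunRecord` shape and the sample values are hypothetical, chosen for illustration rather than taken from a FoundationaLLM export format.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class RunRecord:
    pipeline: str
    status: str
    duration_seconds: float


def summarize(runs: List[RunRecord]) -> Dict[str, Optional[float]]:
    """Compute success rate over finished runs and mean duration of successful ones."""
    finished = [r for r in runs if r.status in ("Completed", "Failed")]
    if not finished:
        return {"success_rate": None, "mean_duration_seconds": None}
    completed = [r for r in finished if r.status == "Completed"]
    return {
        "success_rate": len(completed) / len(finished),
        "mean_duration_seconds": (
            mean(r.duration_seconds for r in completed) if completed else None
        ),
    }


# Hypothetical history for one pipeline.
history = [
    RunRecord("docs-ingest", "Completed", 100.0),
    RunRecord("docs-ingest", "Completed", 120.0),
    RunRecord("docs-ingest", "Failed", 30.0),
    RunRecord("docs-ingest", "Completed", 140.0),
]
summary = summarize(history)
print(summary)  # {'success_rate': 0.75, 'mean_duration_seconds': 120.0}
```

Failed runs are excluded from the mean duration because a run that stops early would otherwise make the pipeline look faster than it is.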
Common Patterns
| Pattern | Possible Cause |
|---|---|
| Intermittent failures | Network issues, resource contention |
| Increasing duration | Growing data volume, performance degradation |
| Consistent failures | Configuration error, permission issue |
| Success after retry | Transient errors, timeout issues |
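As a rough aid for telling the failure patterns in the table apart, the sketch below labels a sequence of recent run statuses. The labels and the all-or-some rule are simplifying assumptions made for illustration; a real diagnosis still requires reading the error messages.

```python
from typing import Sequence


def classify_failures(statuses: Sequence[str]) -> str:
    """Label recent run history as healthy, intermittent, or consistently failing."""
    finished = [s for s in statuses if s in ("Completed", "Failed")]
    if not finished:
        return "no data"
    failures = sum(1 for s in finished if s == "Failed")
    if failures == 0:
        return "healthy"
    if failures == len(finished):
        return "consistent failures"
    return "intermittent failures"


print(classify_failures(["Failed", "Failed", "Failed"]))        # consistent failures
print(classify_failures(["Completed", "Failed", "Completed"]))  # intermittent failures
```

Consistent failures usually point to configuration or permissions, which retries will not fix, while intermittent failures suggest transient network or resource conditions.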
Troubleshooting from Monitoring
Identifying Issues
- Failed Status: Check error messages in run details
- Long Duration: Review stage timing for bottlenecks
- Repeated Failures: Look for patterns in failure timing or type
Common Issues
| Issue | Investigation Steps |
|---|---|
| Connection failures | Check data source configuration and network connectivity |
| Timeout errors | Increase timeout settings, reduce batch size |
| Resource errors | Check storage capacity, API quotas |
| Data errors | Review source data quality, parsing settings |
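The "reduce batch size" remedy for timeout errors can be illustrated generically: when a batch times out, retry the same items with a smaller batch. This is a sketch of the idea, not a FoundationaLLM feature; the function names and the simulated handler are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def process_with_shrinking_batches(
    items: Sequence[int],
    handler: Callable[[Sequence[int]], int],
    initial_batch_size: int,
    min_batch_size: int = 1,
) -> Tuple[List[int], int]:
    """Process items in batches, halving the batch size after each timeout."""
    if initial_batch_size < 1 or min_batch_size < 1:
        raise ValueError("batch sizes must be at least 1")
    results: List[int] = []
    batch_size = initial_batch_size
    index = 0
    while index < len(items):
        batch = items[index:index + batch_size]
        try:
            results.append(handler(batch))
            index += len(batch)  # Advance only after the batch succeeds.
        except TimeoutError:
            if batch_size <= min_batch_size:
                raise  # Cannot shrink further; surface the error.
            batch_size = max(min_batch_size, batch_size // 2)
    return results, batch_size


def slow_handler(batch: Sequence[int]) -> int:
    """Simulated stage that times out on batches larger than two items."""
    if len(batch) > 2:
        raise TimeoutError("batch too large")
    return sum(batch)


results, final_batch_size = process_with_shrinking_batches(
    list(range(7)), slow_handler, initial_batch_size=8
)
print(results, final_batch_size)  # [1, 5, 9, 6] 2
```

The batch size that finally succeeds is a reasonable starting value for the pipeline's configured batch size.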
Best Practices
Regular Monitoring
- Check pipeline runs daily during initial setup
- Set up alerts for critical pipelines
- Review trends weekly or monthly
Proactive Management
- Address warnings before they become failures
- Plan maintenance during low-activity periods
- Monitor storage and quota utilization
Documentation
- Record common failure patterns and solutions
- Document expected processing times
- Track changes that affect performance