Invoking Data Pipelines
Learn how to manually trigger data pipeline execution.
Overview
Data pipelines can be invoked manually to process new or updated data on demand. This is useful for:
- Initial data loading
- Processing specific datasets
- Testing pipeline configurations
- Ad-hoc data updates
Accessing Pipeline Invocation
- Navigate to Data Pipelines in the sidebar
- Locate the pipeline you want to run
- Use the run action to invoke it
Manual Invocation
From the Pipelines List
- Find the pipeline in the list
- Click the Run icon (▶️) in the actions column
- Confirm the invocation if prompted
- The pipeline begins execution
From the Pipeline Edit Page
- Open the pipeline for editing
- Click Run Pipeline (if available)
- The pipeline starts processing
Invocation Options
TODO: Document specific invocation options available in the UI, such as:
- Full vs. incremental processing
- Folder/file selection for partial runs
- Override parameters for this run
Monitoring the Run
After invoking a pipeline:
- Navigate to Data Pipeline Runs
- Find the run for your pipeline in the list (most recent runs appear first)
- Monitor the status as it progresses
- Review results when complete
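If run status is also available programmatically, the monitoring steps above could be sketched as a simple polling loop. This is a minimal sketch, not a documented API: `get_status` is a hypothetical callable you would supply yourself, and the status names in `TERMINAL_STATUSES` are assumptions that may not match the states shown in Data Pipeline Runs.

```python
import time
from typing import Callable

# Assumed status names; the product's actual run states may differ.
TERMINAL_STATUSES = frozenset({"succeeded", "failed", "cancelled"})


def wait_for_run(
    get_status: Callable[[], str],
    poll_seconds: float = 10.0,
    timeout_seconds: float = 3600.0,
) -> str:
    """Poll get_status() until it returns a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still '{status}' after {timeout_seconds} seconds")
        time.sleep(poll_seconds)


if __name__ == "__main__":
    # Simulated status sequence standing in for a real status lookup.
    simulated = iter(["queued", "running", "succeeded"])
    print(wait_for_run(lambda: next(simulated), poll_seconds=0))
```

The timeout keeps a stalled run from blocking the caller indefinitely; choose `poll_seconds` so that status checks do not add meaningful load.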
Run Scheduling
TODO: Document scheduled/automatic pipeline invocation if supported, including:
- Cron-based scheduling
- Event-triggered runs
- Continuous processing modes
Best Practices
Before Running
- Verify the data source is accessible
- Check that the target storage or index has enough free capacity for the new data
- Review pipeline configuration for correctness
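For teams that prefer to script this checklist, the checks above can be collected into one preflight helper. This is a sketch under stated assumptions: `run_preflight` and the check names are hypothetical, and each placeholder lambda stands in for a real probe that you would write for your own data source and target.

```python
from typing import Callable, Dict, List


def run_preflight(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run each named check and return the names of the checks that failed.

    A check that raises is counted as failed, so one broken probe does not
    hide the results of the others.
    """
    failed = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    return failed


if __name__ == "__main__":
    # Placeholder checks; replace each lambda with a real probe.
    checks = {
        "data source reachable": lambda: True,
        "target index has capacity": lambda: True,
        "configuration reviewed": lambda: False,
    }
    failed = run_preflight(checks)
    print("Safe to run" if not failed else f"Fix before running: {failed}")
```

Running every check instead of stopping at the first failure gives a complete list of problems to fix before you invoke the pipeline.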
During Execution
- Monitor the run status in Data Pipeline Runs
- Check for early errors
- Be prepared to cancel if issues arise
After Completion
- Verify data was processed correctly
- Check the target index/storage for new content
- Test agent queries against the updated data
Troubleshooting
Pipeline Won't Start
- Verify you have permission to run the pipeline
- Check if another run is already in progress
- Ensure the pipeline is in an active state
Run Fails Immediately
- Check data source connectivity
- Verify authentication credentials
- Review error messages in run details
Processing Slower Than Expected
- Compare the run time against the dataset size; large datasets take longer to process
- Check embedding model throughput
- Review stage configurations
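To judge whether a run is slower than it should be, a rough estimate of the embedding stage can help. The sketch below assumes throughput is roughly constant and that you know the number of items to embed; the function name and the figures are illustrative only, not measurements from any real pipeline.

```python
def estimate_embedding_minutes(item_count: int, items_per_second: float) -> float:
    """Estimate minutes needed to embed item_count items at a steady throughput."""
    if items_per_second <= 0:
        raise ValueError("items_per_second must be positive")
    return item_count / items_per_second / 60


if __name__ == "__main__":
    # Illustrative numbers: 120,000 items at 50 items per second.
    print(estimate_embedding_minutes(120_000, 50))  # 40.0
```

If the observed run time is far above this kind of estimate, the bottleneck is more likely in another stage or in the data source than in the embedding model.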