API Performance
Our system handles three types of jobs: validation, training, and prediction. This page has one section for each job type. Throughout this page, we use the number of employees as a metric to describe the size of a dataset sent to our API. API performance varies with data characteristics. The estimates below are based on datasets with different characteristics, but actual performance may differ.
Performance may differ based on:
- Number of transactions per payslip
- Number of unique subTypes in the dataset and how they are distributed across payslips, that is, how many unique subTypes appear on a typical employee's payslip
- Network conditions
Validation
When integrators upload data, it triggers a validation job to ensure correct formatting. Here, we estimate the validation time on our end, but total upload time also depends on factors like your network speed.
Employees | Estimated Time |
---|---|
< 1,000 | 1-5 minutes |
1,000 | 5-10 minutes |
3,000 | 5-15 minutes |
8,000 | 30-40 minutes |
15,000-25,000 | 60-90 minutes |
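Because validation time grows with dataset size, one option is to split a large dataset into several smaller uploads. The following is a minimal sketch of the splitting step only. The function name `split_into_batches` and the batch size are illustrative assumptions, not part of the API, and the upload call itself is left out.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def split_into_batches(records: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `records`, each holding at most `batch_size` items.

    `records` stands in for per-employee data; its real shape depends on
    your integration (assumption).
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


# Example: 2,500 hypothetical employee records split into uploads of 1,000.
employees = list(range(2500))
batches = list(split_into_batches(employees, 1000))
print([len(b) for b in batches])
```

Each batch can then be uploaded as its own request, so a failure affects only one batch rather than the whole dataset.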
Training
Training is the most time-consuming step. We recommend running training monthly, after uploading the latest approved payroll data. The first training takes longer, as models are created from scratch. Later runs only retrain models if the data has changed. The times below reflect initial training. Regular monthly trainings typically take about 30% less time.
Training Times
Employees | Estimated Time |
---|---|
< 250 | < 3 hours |
250-500 | 2-4 hours |
1,000-3,000 | 3-8 hours |
6,000-9,000 | 4-10 hours |
15,000-25,000 | 6-12 hours |
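To turn the initial-training estimates above into a rough figure for regular monthly runs, apply the "about 30% less time" rule of thumb. The sketch below does only that arithmetic. The helper name and the rounding to one decimal place are assumptions made for illustration, and the result is an estimate, not a guarantee.

```python
# "About 30% less time" than initial training means roughly 70% of the initial estimate.
RETRAIN_FACTOR = 0.7


def estimate_retraining_hours(initial_low: float, initial_high: float) -> tuple[float, float]:
    """Scale an initial-training range (in hours) to an approximate monthly-retraining range."""
    return (
        round(initial_low * RETRAIN_FACTOR, 1),
        round(initial_high * RETRAIN_FACTOR, 1),
    )


# 15,000-25,000 employees: initial training is estimated at 6-12 hours.
print(estimate_retraining_hours(6, 12))  # (4.2, 8.4)
```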
Predictions
Prediction Approaches
For small prediction jobs (1–3 payslips), we recommend using the /create_prediction endpoint. If you use presigned_url for small jobs, expect around 10% longer prediction times due to minor overhead, which becomes negligible with larger payloads.
The create_prediction endpoint supports payloads up to 6MB. Larger uploads will result in errors:
- Over 6MB: 413 Request too long
- Over 10MB: 413 HTTP content length exceeded 10485760 bytes
For batch prediction jobs, we recommend using presigned_url prediction.
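The guidance above can be expressed as a small client-side check that runs before sending a request. This is a sketch under stated assumptions: the payload is assumed to be JSON-serialized, 6MB is assumed to mean 6 × 1024 × 1024 bytes (by analogy with the 10485760-byte figure in the error message), and the function name and return labels are hypothetical.

```python
import json

# Assumption: "6MB" is interpreted as 6 * 1024 * 1024 bytes.
MAX_DIRECT_PAYLOAD_BYTES = 6 * 1024 * 1024
# Small prediction jobs are described as 1-3 payslips.
MAX_SMALL_JOB_PAYSLIPS = 3


def choose_prediction_approach(payslips: list) -> str:
    """Return "create_prediction" for small jobs that fit the size limit, else "presigned_url"."""
    payload_size = len(json.dumps(payslips).encode("utf-8"))
    is_small_job = len(payslips) <= MAX_SMALL_JOB_PAYSLIPS
    fits_limit = payload_size <= MAX_DIRECT_PAYLOAD_BYTES
    if is_small_job and fits_limit:
        return "create_prediction"
    return "presigned_url"


# A single small payslip goes to the direct endpoint; a larger batch does not.
print(choose_prediction_approach([{"employee": "A"}]))
print(choose_prediction_approach([{"employee": str(i)} for i in range(50)]))
```

Checking the size locally avoids a round trip that would end in a 413 response.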
We estimate the following prediction times:
Small Predictions (1-3 payslips)
- Using the /create_prediction endpoint: 1-3 minutes
- Using a presigned URL for small jobs adds ~10% overhead
Batch Predictions
Employees | Estimated Time |
---|---|
< 150 | < 5 minutes |
150-500 | 5-10 minutes |
500-1,000 | 10-20 minutes |
2,000-4,000 | 20-30 minutes |
6,000-9,000 | 30-60 minutes |
15,000-25,000 | 60-120 minutes |
Performance Optimization
- Validation
  - Batch large datasets and send multiple requests
- Training
  - Schedule large jobs during off-peak hours
  - Start training only when new, approved payroll data is available
- Prediction
  - Implement proper error handling and retry logic
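As an illustration of the retry advice, the sketch below wraps any request in exponential backoff with jitter. It is not tied to a specific HTTP client. `TransientError`, `call_with_retries`, and the default attempt count and delays are all hypothetical names and values chosen for this example. Your own code would decide which failures count as transient, such as timeouts. A 413 response is not transient, because the payload stays too large, so it should not be retried.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. timeouts)."""


def call_with_retries(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying on TransientError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            delay = base_delay * 2 ** (attempt - 1)  # 1s, 2s, 4s, ...
            sleep(delay + random.uniform(0, delay / 2))  # Jitter spreads out retries.
    raise ValueError("max_attempts must be at least 1")
```

To use it, pass a zero-argument callable that sends the prediction request and raises `TransientError` on a retryable failure. Any other exception propagates immediately.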