
API Performance

Performance Considerations

Our system handles three types of jobs: validation, training, and prediction. This page is split into three sections, one for each job type. Throughout this page, we use the number of employees as a metric to describe the size of a dataset sent to our API. API performance varies based on data characteristics: the estimates below are based on datasets with a range of characteristics, but actual performance may differ.

Performance may differ based on:

  • Number of transactions per payslip
  • Number of unique subTypes in the dataset, and how they are distributed across payslips - essentially: how many unique subTypes are normally present on a typical employee's payslip?
  • Network conditions

Validation

When integrators upload data, it triggers a validation job to ensure correct formatting. Here, we estimate the validation time on our end, but total upload time also depends on factors like your network speed.
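Integrators typically poll for the validation result after the upload call returns. The sketch below is illustrative only: the base URL, the bearer-token auth, the GET /jobs/{job_id} status endpoint, and the status values are all assumptions rather than documented API details; the only figures taken from this page are the validation-time estimates used for the timeout.

```python
import time
import requests

API_BASE = "https://api.example.com/v1"        # hypothetical base URL
HEADERS = {"Authorization": "Bearer <token>"}  # hypothetical auth scheme


def wait_for_validation(job_id: str, timeout_s: int = 90 * 60, poll_s: int = 60) -> dict:
    """Poll a hypothetical job-status endpoint until the validation job finishes.

    Validation of the largest datasets (15,000-25,000 employees) can take
    60-90 minutes, so the default timeout is generous.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        resp = requests.get(f"{API_BASE}/jobs/{job_id}", headers=HEADERS, timeout=30)
        resp.raise_for_status()
        job = resp.json()
        if job.get("status") in ("succeeded", "failed"):  # status values are assumed
            return job
        time.sleep(poll_s)
    raise TimeoutError(f"Validation job {job_id} did not finish within {timeout_s} seconds")
```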

Employees          Estimated Time
< 1,000            1-5 minutes
1,000              5-10 minutes
3,000              5-15 minutes
8,000              30-40 minutes
15,000-25,000      60-90 minutes

Training

Training is the most time-consuming step. We recommend running training monthly, after uploading the latest approved payroll data. The first training takes longer, as models are created from scratch. Later runs only retrain models if the data has changed. The times below reflect initial training; regular monthly training runs typically take about 30% less time.
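As a rough illustration of the monthly cadence, the sketch below starts a training job once the latest approved payroll data has been uploaded. The /create_training endpoint name, the request body, and the job_id response field are assumptions for illustration; only the monthly schedule and the approved-data-first ordering come from this page.

```python
import requests

API_BASE = "https://api.example.com/v1"        # hypothetical base URL
HEADERS = {"Authorization": "Bearer <token>"}  # hypothetical auth scheme


def start_monthly_training(dataset_id: str) -> str:
    """Kick off a training job after the latest approved payroll data is uploaded.

    Returns the job ID so the run can be polled like any other long job;
    initial training on large datasets can take 6-12 hours.
    """
    resp = requests.post(
        f"{API_BASE}/create_training",          # hypothetical endpoint name
        json={"dataset_id": dataset_id},        # hypothetical request body
        headers=HEADERS,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["job_id"]
```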

Training Times

Employees          Estimated Time
< 250              < 3 hours
250-500            2-4 hours
1,000-3,000        3-8 hours
6,000-9,000        4-10 hours
15,000-25,000      6-12 hours

Predictions

Prediction Approaches

For small prediction jobs (1–3 payslips) we recommend using the /create_prediction endpoint. If using presigned_url for small jobs, expect around 10% longer prediction times due to minor overhead, which becomes negligible with larger payloads.
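A minimal call to /create_prediction might look like the sketch below. The endpoint name and the 1-3 payslip recommendation come from this page; the request body shape ("payslips"), the base URL, and the auth header are assumptions for illustration.

```python
import requests

API_BASE = "https://api.example.com/v1"        # hypothetical base URL
HEADERS = {"Authorization": "Bearer <token>"}  # hypothetical auth scheme


def predict_small_batch(payslips: list[dict]) -> dict:
    """Send a small prediction job (1-3 payslips) directly to /create_prediction."""
    if len(payslips) > 3:
        raise ValueError("Use the presigned_url flow for larger prediction jobs")
    resp = requests.post(
        f"{API_BASE}/create_prediction",
        json={"payslips": payslips},            # hypothetical request body shape
        headers=HEADERS,
        timeout=5 * 60,                         # small jobs typically finish in 1-3 minutes
    )
    resp.raise_for_status()
    return resp.json()
```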

Prediction Capacity

The /create_prediction endpoint supports payloads up to 6MB. Larger uploads will result in errors (a size-check sketch follows this list):

  • Over 6MB: 413 Request too long
  • Over 10MB: 413 HTTP content length exceeded 10485760 bytes
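Because oversized requests fail with 413 errors, it can be worth checking the serialized payload size client-side and routing anything larger through the presigned_url flow. The helper below is a minimal sketch; only the 6MB limit comes from this page.

```python
import json

MAX_CREATE_PREDICTION_BYTES = 6 * 1024 * 1024  # documented 6MB limit on /create_prediction


def fits_create_prediction(payload: dict) -> bool:
    """Return True if the serialized request body is small enough for /create_prediction."""
    body = json.dumps(payload).encode("utf-8")
    return len(body) <= MAX_CREATE_PREDICTION_BYTES
```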

For batch prediction jobs we recommend using the presigned_url prediction flow (a sketch follows the batch table below). We estimate the following prediction times:

Small Predictions (1-3 payslips)

  • Using /create_prediction endpoint: 1-3 minutes
  • Using presigned URL for small jobs adds ~10% overhead

Batch Predictions

Employees          Estimated Time
< 150              < 5 minutes
150-500            5-10 minutes
500-1,000          10-20 minutes
2,000-4,000        20-30 minutes
6,000-9,000        30-60 minutes
15,000-25,000      60-120 minutes
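The exact presigned-URL request sequence is not documented on this page, so the sketch below is an assumption-heavy illustration: the /create_prediction_url endpoint name, the presigned_url and job_id response fields, and the plain HTTP PUT upload are all hypothetical. Only the recommendation to use presigned_url for batch jobs comes from this page.

```python
import json
import requests

API_BASE = "https://api.example.com/v1"        # hypothetical base URL
HEADERS = {"Authorization": "Bearer <token>"}  # hypothetical auth scheme


def predict_batch(payslips: list[dict]) -> str:
    """Upload a batch prediction payload via a presigned URL and return the job ID."""
    # 1. Ask the API for a presigned upload URL (hypothetical endpoint).
    resp = requests.post(f"{API_BASE}/create_prediction_url", headers=HEADERS, timeout=30)
    resp.raise_for_status()
    info = resp.json()

    # 2. PUT the batch payload to the presigned URL; no auth header is needed
    #    because the signature is embedded in the URL itself.
    body = json.dumps({"payslips": payslips}).encode("utf-8")
    upload = requests.put(info["presigned_url"], data=body, timeout=10 * 60)
    upload.raise_for_status()

    # 3. Poll the returned job ID the same way as validation and training jobs.
    return info["job_id"]
```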

Performance Optimization

Best Practices
  1. Validation

    • Batch large datasets and send multiple requests
  2. Training

    • Schedule large jobs during off-peak hours
    • Start training when only new and approved payroll data is available
  3. Prediction

    • Implement proper error handling and retry logic (see the sketch after this list)
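The retry policy below is a generic example of the kind of error handling we recommend, not something prescribed by this page. It retries timeouts, connection errors, 429s, and 5xx responses with exponential backoff, and deliberately does not retry 4xx errors such as the 413 payload-size errors, since resending the same request cannot succeed.

```python
import time
import requests

RETRYABLE_STATUSES = {429, 500, 502, 503, 504}


def post_with_retries(url: str, payload: dict, headers: dict, max_attempts: int = 5) -> requests.Response:
    """POST with exponential backoff on transient failures."""
    last_exc = None
    for attempt in range(1, max_attempts + 1):
        try:
            resp = requests.post(url, json=payload, headers=headers, timeout=60)
        except (requests.ConnectionError, requests.Timeout) as exc:
            last_exc = exc                      # network-level failure: retry
        else:
            if resp.status_code not in RETRYABLE_STATUSES:
                resp.raise_for_status()         # non-retryable 4xx (e.g. 413) raises here
                return resp                      # success
            last_exc = requests.HTTPError(f"retryable status {resp.status_code}", response=resp)
        if attempt < max_attempts:
            time.sleep(2 ** attempt)            # 2s, 4s, 8s, ... between attempts
    raise last_exc
```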