
API Performance

Performance Considerations

Our system handles three types of jobs: validation, training, and prediction. This page is split into three sections, one for each job type. Throughout this page, we use the number of employees as a metric to describe the size of a dataset sent to our API. API performance varies based on data characteristics: the estimates below are based on datasets with a range of characteristics, but actual performance may differ.

Performance may differ based on:

  • Number of transactions per payslip
  • Number of unique subTypes in the dataset, and how they are distributed across payslips - essentially: how many unique subTypes are normally present on a typical employee's payslip?
  • Network conditions

Validation

When integrators upload data, it triggers a validation job to ensure correct formatting. Here, we estimate the validation time on our end, but total upload time also depends on factors like your network speed.
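Integrators typically poll for the validation result after the upload call returns. The sketch below is illustrative only: the base URL, the bearer-token auth, the GET /jobs/{job_id} status endpoint, and the status values are all assumptions rather than documented API details; the only figures taken from this page are the validation-time estimates used for the timeout.

```python
import time
import requests

API_BASE = "https://api.example.com/v1"        # hypothetical base URL
HEADERS = {"Authorization": "Bearer <token>"}  # hypothetical auth scheme


def wait_for_validation(job_id: str, timeout_s: int = 90 * 60, poll_s: int = 60) -> dict:
    """Poll a hypothetical job-status endpoint until the validation job finishes.

    Validation of the largest datasets (15,000-25,000 employees) can take
    60-90 minutes, so the default timeout is generous.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        resp = requests.get(f"{API_BASE}/jobs/{job_id}", headers=HEADERS, timeout=30)
        resp.raise_for_status()
        job = resp.json()
        if job.get("status") in ("succeeded", "failed"):  # status values are assumed
            return job
        time.sleep(poll_s)
    raise TimeoutError(f"Validation job {job_id} did not finish within {timeout_s} seconds")
```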

Employees          Estimated Time
< 1,000            1-5 minutes
1,000              5-10 minutes
3,000              5-15 minutes
8,000              30-40 minutes
15,000-25,000      60-90 minutes

Training

Training is the most time-consuming step. We recommend running training monthly, after uploading the latest approved payroll data. The first training takes longer, as models are created from scratch. Later runs only retrain models if the data has changed. The times below reflect initial training; regular monthly training runs typically take about 30% less time.
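As a rough illustration of the monthly cadence, the sketch below starts a training job once the latest approved payroll data has been uploaded. The /create_training endpoint name, the request body, and the job_id response field are assumptions for illustration; only the monthly schedule and the approved-data-first ordering come from this page.

```python
import requests

API_BASE = "https://api.example.com/v1"        # hypothetical base URL
HEADERS = {"Authorization": "Bearer <token>"}  # hypothetical auth scheme


def start_monthly_training(dataset_id: str) -> str:
    """Kick off a training job after the latest approved payroll data is uploaded.

    Returns the job ID so the run can be polled like any other long job;
    initial training on large datasets can take 6-12 hours.
    """
    resp = requests.post(
        f"{API_BASE}/create_training",          # hypothetical endpoint name
        json={"dataset_id": dataset_id},        # hypothetical request body
        headers=HEADERS,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["job_id"]
```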

Training Times

Employees          Estimated Time
< 250              < 3 hours
250-500            2-4 hours
1,000-3,000        3-8 hours
6,000-9,000        4-10 hours
15,000-25,000      6-12 hours

Predictions

Prediction Approaches

For small prediction jobs (1–3 payslips) we recommend using the /create_prediction endpoint. If using presigned_url for small jobs, expect around 10% longer prediction times due to minor overhead, which becomes negligible with larger payloads.
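A minimal call to /create_prediction might look like the sketch below. The endpoint name and the 1-3 payslip recommendation come from this page; the request body shape ("payslips"), the base URL, and the auth header are assumptions for illustration.

```python
import requests

API_BASE = "https://api.example.com/v1"        # hypothetical base URL
HEADERS = {"Authorization": "Bearer <token>"}  # hypothetical auth scheme


def predict_small_batch(payslips: list[dict]) -> dict:
    """Send a small prediction job (1-3 payslips) directly to /create_prediction."""
    if len(payslips) > 3:
        raise ValueError("Use the presigned_url flow for larger prediction jobs")
    resp = requests.post(
        f"{API_BASE}/create_prediction",
        json={"payslips": payslips},            # hypothetical request body shape
        headers=HEADERS,
        timeout=5 * 60,                         # small jobs typically finish in 1-3 minutes
    )
    resp.raise_for_status()
    return resp.json()
```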

Prediction Capacity

The /create_prediction endpoint supports payloads up to 6MB. Larger uploads will result in errors (a size-check sketch follows this list):

  • Over 6MB: 413 Request too long
  • Over 10MB: 413 HTTP content length exceeded 10485760 bytes
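Because oversized requests fail with 413 errors, it can be worth checking the serialized payload size client-side and routing anything larger through the presigned_url flow. The helper below is a minimal sketch; only the 6MB limit comes from this page.

```python
import json

MAX_CREATE_PREDICTION_BYTES = 6 * 1024 * 1024  # documented 6MB limit on /create_prediction


def fits_create_prediction(payload: dict) -> bool:
    """Return True if the serialized request body is small enough for /create_prediction."""
    body = json.dumps(payload).encode("utf-8")
    return len(body) <= MAX_CREATE_PREDICTION_BYTES
```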

For batch prediction jobs we recommend using the presigned_url prediction flow (a sketch follows the batch table below). We estimate the following prediction times:

Small Predictions (1-3 payslips)

  • Using /create_prediction endpoint: 1-3 minutes
  • Using presigned URL for small jobs adds ~10% overhead

Batch Predictions

Employees          Estimated Time
< 150              < 5 minutes
150-500            5-10 minutes
500-1,000          10-20 minutes
2,000-4,000        20-30 minutes
6,000-9,000        30-60 minutes
15,000-25,000      60-120 minutes
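The exact presigned-URL request sequence is not documented on this page, so the sketch below is an assumption-heavy illustration: the /create_prediction_url endpoint name, the presigned_url and job_id response fields, and the plain HTTP PUT upload are all hypothetical. Only the recommendation to use presigned_url for batch jobs comes from this page.

```python
import json
import requests

API_BASE = "https://api.example.com/v1"        # hypothetical base URL
HEADERS = {"Authorization": "Bearer <token>"}  # hypothetical auth scheme


def predict_batch(payslips: list[dict]) -> str:
    """Upload a batch prediction payload via a presigned URL and return the job ID."""
    # 1. Ask the API for a presigned upload URL (hypothetical endpoint).
    resp = requests.post(f"{API_BASE}/create_prediction_url", headers=HEADERS, timeout=30)
    resp.raise_for_status()
    info = resp.json()

    # 2. PUT the batch payload to the presigned URL; no auth header is needed
    #    because the signature is embedded in the URL itself.
    body = json.dumps({"payslips": payslips}).encode("utf-8")
    upload = requests.put(info["presigned_url"], data=body, timeout=10 * 60)
    upload.raise_for_status()

    # 3. Poll the returned job ID the same way as validation and training jobs.
    return info["job_id"]
```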

Performance Optimization

Best Practices
  1. Validation

    • Batch large datasets and send multiple requests
  2. Training

    • Schedule large jobs during off-peak hours
    • Start training when only new and approved payroll data is available
  3. Prediction

    • Implement proper error handling and retry logic (see the sketch after this list)
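The retry policy below is a generic example of the kind of error handling we recommend, not something prescribed by this page. It retries timeouts, connection errors, 429s, and 5xx responses with exponential backoff, and deliberately does not retry 4xx errors such as the 413 payload-size errors, since resending the same request cannot succeed.

```python
import time
import requests

RETRYABLE_STATUSES = {429, 500, 502, 503, 504}


def post_with_retries(url: str, payload: dict, headers: dict, max_attempts: int = 5) -> requests.Response:
    """POST with exponential backoff on transient failures."""
    last_exc = None
    for attempt in range(1, max_attempts + 1):
        try:
            resp = requests.post(url, json=payload, headers=headers, timeout=60)
        except (requests.ConnectionError, requests.Timeout) as exc:
            last_exc = exc                      # network-level failure: retry
        else:
            if resp.status_code not in RETRYABLE_STATUSES:
                resp.raise_for_status()         # non-retryable 4xx (e.g. 413) raises here
                return resp                      # success
            last_exc = requests.HTTPError(f"retryable status {resp.status_code}", response=resp)
        if attempt < max_attempts:
            time.sleep(2 ** attempt)            # 2s, 4s, 8s, ... between attempts
    raise last_exc
```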