Enjoy reliable web scraping thanks to rigorous quality controls
When you crawl the web, multiple third parties control the sources of your data. Even when site owners have no particular desire to stop your data-gathering efforts, they do have their own agendas. They may, for example, change the structure of their site at any time, publish inconsistently formatted content or false data, suffer downtime, fall victim to spammers, and so on. All of these occurrences can have a visible and potentially costly effect on your data feed.
Stratalis mitigates the consequences of these disturbances with continuous sanity monitoring and quality assurance procedures.
Sanity monitoring means that we check that the data feed we generate stays within expected bounds: not too big, not too small, and not too different from the day before. If any of these checks fails, one of our operational staff looks into it and decides whether corrective action is needed.
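The three bounds described above can be sketched as a simple check. This is an illustrative example, not Stratalis's actual implementation; the function name and all thresholds are assumptions chosen for the sketch.

```python
def sanity_check(records, previous_records,
                 min_size=1_000, max_size=100_000, max_change=0.2):
    """Return a list of warnings; an empty list means the feed looks sane.

    Hypothetical thresholds: the feed should hold between min_size and
    max_size records, and its size should not swing by more than
    max_change (a fraction) relative to the previous day's feed.
    """
    warnings = []
    size = len(records)
    if size < min_size:
        warnings.append(f"feed too small: {size} < {min_size}")
    if size > max_size:
        warnings.append(f"feed too large: {size} > {max_size}")
    if previous_records:
        change = abs(size - len(previous_records)) / len(previous_records)
        if change > max_change:
            warnings.append(f"feed size changed by {change:.0%} since yesterday")
    return warnings
```

In practice, any non-empty warning list would be routed to an operator rather than triggering automatic correction, matching the human-in-the-loop process described above.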
Quality assurance is a process through which a quality assurance operator regularly verifies a random sample of your data feed. A software application tells the operator where to look for information and asks questions about important fields. The operator does not know what the robot found or what result is expected. If their input doesn't match the robot's results, we take a deeper look to determine which of the two failed to find the correct information. If it is the robot, we take corrective action.
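The blind-verification step above amounts to sampling records and comparing the operator's answers against the robot's, field by field. The following is a minimal sketch under assumed names: `sample_for_review`, `find_mismatches`, and the field names are all hypothetical.

```python
import random

def sample_for_review(feed, sample_size=5, seed=None):
    """Pick a random sample of records for an operator to re-verify."""
    return random.Random(seed).sample(feed, min(sample_size, len(feed)))

def find_mismatches(robot_record, operator_record, fields=("price", "title")):
    """Compare the important fields; return those where the operator's
    blind entry disagrees with what the robot extracted."""
    return [f for f in fields
            if robot_record.get(f) != operator_record.get(f)]
```

Any field returned by `find_mismatches` would be escalated for a human to decide whether the robot or the operator was wrong.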
Not all projects have the same data-quality demands, so we let you choose your preferred Quality Assurance strategy from our offering, ranging from automated sanity checking alone to full-time dedicated Quality Assurance agents.