#community-help

Discussing Full-Text Indexing of a Website

TLDR Mac asked for advice on crawling a website with inconsistent structure so it can be full-text indexed. Jason suggested using a SaaS service such as Apify.

Solved
May 12, 2022 (18 months ago)
Mac
02:08 AM
Hi guys

I am running a cluster for our team, and my collections all have well-designed schemas. A new team now wants to full-text index a website that has had many developers working on it over time.
As you can imagine, the structure and naming styles are a bit of a mixed bag, and we have no set crawler tags.
What is the best way to crawl this?
Jason
02:17 AM
Maybe a SaaS service like this could help: https://apify.com/
Mac
03:06 AM
Thanks for this, Jason. I have DM'd you to run a couple of concepts past you on this.
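
For a site with mixed markup and no set crawler tags, as described in the question, one structure-agnostic option is to ignore class names and layout entirely and index only the visible text of each page. The following is a minimal sketch of that idea using Python's standard `html.parser`. It is not taken from the thread, and the names `TextExtractor` and `page_to_document` and the `url` / `title` / `content` fields are hypothetical. Fetching the pages and sending the records to the search cluster are left out.

```python
from html.parser import HTMLParser

# Tags whose contents are never shown to a reader, so they are not indexed.
SKIP_TAGS = {"script", "style", "noscript", "template"}


class TextExtractor(HTMLParser):
    """Collects the page title and visible text, ignoring page structure."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title_parts = []
        self.text_parts = []
        self._skip_depth = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so legacy <TABLE>-style markup is fine.
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._skip_depth:
            return
        if self._in_title:
            self.title_parts.append(data)
        else:
            self.text_parts.append(data)


def _collapse(parts):
    """Joins text fragments and collapses runs of whitespace to single spaces."""
    return " ".join(" ".join(parts).split())


def page_to_document(url, html):
    """Turns one HTML page into a flat record (hypothetical field names)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return {
        "url": url,
        "title": _collapse(parser.title_parts),
        "content": _collapse(parser.text_parts),
    }
```

Running every fetched page through `page_to_document` gives records of the same shape no matter which developer wrote the page, so a single collection schema with `url`, `title`, and `content` fields could hold them all. A hosted crawler such as the one Jason linked would replace the fetching step, not this normalization step.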