# community-help
t
I have successfully set up Typesense running on an Amazon EC2 instance, except that one of my collections, containing ~140k records, keeps deleting itself around once a day. There is no process running on the application side that could explain this behaviour, and there are no helpful messages in the Typesense log file. I'm also using the PHP library with the Laravel adapter. Are there any reasons why Typesense would drop a collection by itself? Any idea where I could start diagnosing something like this?
k
👋 Can you please clarify what you mean by deleting itself? Is the collection empty, does it not exist, or does it contain fewer documents than previously indexed? Have you checked that there is enough RAM, and checked the logs to confirm that Typesense isn't restarting?
t
The collection does not exist, I have to recreate it from scratch. I also thought it could be a RAM issue, but the problem persists even after I upgraded to a server with 16 GB of RAM, of which 13.7 GB is still available. I have one smaller collection that is also doing this but a bit less often, and four collections that are stable. I don't see anything relevant in the log, no mentions of a restart. But a cold start from disk should anyways be possible?
k
Can you check if you are using ephemeral storage (if you are on EC2) and whether the host instance is getting restarted or something?
> But a cold start from disk should anyways be possible?
Yes, certainly. What are the contents of the Typesense data directory? Check the date time of the files.
t
The host instance is running fine, the data is kept (EBS storage), no restarts. So I don't expect it to be an instance issue. Maybe something operating system related.
Here are all the unique log messages (I replaced all the numbers with #). Nothing suspicious here
k
That log snippet looks fine to me. Once a day sounds suspiciously like a cron job to me.
t
It's not exactly once a day, I'd say around 15-30 hours
And this server was purpose-built for Typesense, nothing else installed. Quite frustrating, I know there's a reason for this but can't think of anything that might cause it. Server uptime is 4 days, and the collection has been dropped multiple times since then.
k
Can you check the date time of the files in your data directory?
Also, do you have metrics that show how the free memory on the instance varies over time, and whether that correlates with the collections going missing?
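If no monitoring is in place yet, a rough sketch like the one below could record available memory over time on a Linux host. It assumes `/proc/meminfo` is present and exposes a `MemAvailable` field; the helper name `memAvailableKb` and the fallback sample values are made up for illustration.

```php
<?php
// Hypothetical helper: extract MemAvailable (in kB) from /proc/meminfo-style text.
function memAvailableKb(string $meminfo): ?int
{
    if (preg_match('/^MemAvailable:\s+(\d+)\s+kB$/m', $meminfo, $m) === 1) {
        return (int) $m[1];
    }
    return null; // field not found
}

// On Linux, read the real file; otherwise fall back to made-up sample values.
$text = is_readable('/proc/meminfo')
    ? file_get_contents('/proc/meminfo')
    : "MemTotal:       16000000 kB\nMemAvailable:   13700000 kB\n";

// One timestamped line per run; appending these to a file builds a history.
printf("%s MemAvailable=%s kB\n", date('c'), memAvailableKb($text) ?? 'unknown');
```

Running this every few minutes and keeping the output would give timestamps to compare against the moments the collections go missing.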
t
They are all quite recent
No metrics yet, but the server is quite beefy for this dataset, so I wouldn't expect all 16 GB of RAM to be eaten up. I'll set something up if I can't think of anything else.
k
How many times does the phrase `Starting Typesense` occur in the logs? I presume the logs themselves are preserved for all the 4 days of uptime.
(edited last message to remove the version number, it must be just `Starting Typesense`)
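A rough sketch of that check in PHP, assuming the log is a plain text file. The helper name `countStarts`, the sample lines, and the commented-out file path are all made up for illustration and are not real Typesense output or defaults.

```php
<?php
// Hypothetical helper: count how often the startup phrase occurs in log text.
function countStarts(string $logText, string $phrase = 'Starting Typesense'): int
{
    return substr_count($logText, $phrase);
}

// Made-up sample lines, for illustration only.
$sample = "Starting Typesense\nsome other line\nStarting Typesense\n";
echo countStarts($sample), PHP_EOL; // prints 2

// Against a real file (the path is an assumption, adjust it to your setup):
// echo countStarts(file_get_contents('/path/to/typesense.log')), PHP_EOL;
```

A count higher than the number of deliberate starts would point to unnoticed restarts.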
t
Just twice, when I initially started it. So it seems like it has been running
k
So strange. Can you try upgrading to v0.20.0? That shouldn't really change much, but it will be helpful to be on the latest version to compare logs etc.
t
Yep, I'll try that next.
k
One other experiment I would try is to also create other data on the server, like generate an API key and see whether that also goes missing when collections are missing.
t
Here are the six collections. The smallest three have survived, the others I've had to recreate
So at least it's keeping some of the collections. Don't know about API keys, but at least the initial ones I created are still working.
k
Do you have any cron or periodic jobs running? This certainly seems like an issue of some process dropping the collection during re-indexing. I might be wrong, but if only some collections go missing then that certainly is very strange.
t
I have cron jobs that are able to rebuild indexes, but none of them have been running. I only run those manually when I need to recreate a dropped collection. But I wouldn't put it past me to find out something on the application side is actually doing the damage.
But as there's no clear reason for now, I'll just try to upgrade to 0.20 and add some metrics so I can get some more reliable data
k
One quick way to verify: just stop all inbound traffic to the instance by modifying the security group and see what happens.
t
Or actually, I can spin up an identical instance and leave that untouched traffic-wise, and see if the collections are still dropped. At least then we'll know if it's caused by some traffic, or if it's happening without outside influence.
k
👍
t
Could there be a situation where somehow corrupted or malformed data could cause Typesense to freak out and drop the collection? That's a possibility also.
k
Unlikely, because the lookup that checks whether a collection exists is done off an in-memory hash map, so it will never be wrong even if the disk becomes corrupted.
Also 0.19.0 has been successfully deployed and used by multiple customers for 2+ months now with no issues. This is such a serious issue, it should have surfaced by now.
t
Yep. I'm also expecting (and hoping) this to be something stupid created by myself, but we'll see.
All right, I'll keep investigating and keep you posted. Thanks so far.
k
Welcome.
@Tatu Ulmanen Did you figure out what was happening here?
@Tatu Ulmanen Sorry to follow up on this once again: did you get to the bottom of what was happening here? Since this seemed like a serious issue, I just want to make sure that there are no gotchas that we might have missed.
t
@Kishore Nallan Sorry for being inactive here and not replying to this issue. The problem ended up being just me using the PHP library wrong. The correct way to delete a single record is `$client->collections['products']->documents($id)->delete()`, not `$client->collections['products']->delete($id)`, which I was doing and which coincidentally deletes the whole collection. The syntax "looks" valid, which is why it took me a long time to figure out what's wrong. Maybe a point of improvement to the PHP library would be to throw a warning if a parameter was used with collection deletion, but I can raise an issue about this in the typesense-php repo.
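A minimal sketch of what such a guard could look like, using stand-in classes rather than the real typesense-php code. It assumes the collection-level `delete()` declares no parameters: PHP ignores surplus arguments passed to a user-defined method, which is why the wrong call runs without complaint, and `func_num_args()` can detect them. The class names `StubCollection` and `GuardedCollection` are hypothetical.

```php
<?php
// Stand-in for a collection object; NOT the real typesense-php class.
class StubCollection
{
    public bool $dropped = false;

    // Declares no parameters, so PHP silently ignores any that are passed.
    public function delete(): array
    {
        $this->dropped = true;
        return ['name' => 'products'];
    }
}

// Hypothetical guarded variant: refuses surplus arguments instead of ignoring them.
class GuardedCollection extends StubCollection
{
    public function delete(): array
    {
        if (func_num_args() > 0) {
            throw new InvalidArgumentException(
                'delete() drops the whole collection and takes no arguments; '
                . 'delete single documents through the documents API instead.'
            );
        }
        return parent::delete();
    }
}

$plain = new StubCollection();
$plain->delete(42);              // surplus argument ignored, collection dropped
var_dump($plain->dropped);       // bool(true)

$guarded = new GuardedCollection();
try {
    $guarded->delete(42);        // now rejected before anything is dropped
} catch (InvalidArgumentException $e) {
    echo 'Refused: ', $e->getMessage(), PHP_EOL;
}
var_dump($guarded->dropped);     // bool(false)
```

Whether an exception or a softer warning is the better choice would be up to the library maintainers; the exception is used here only because it stops the destructive call outright.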
k
Oh that's interesting. Thanks for pointing this out. We will take a look 👍