I tried upgrading from 0.26 to 0.27, using the ope...
# community-help
n
I tried upgrading from 0.26 to 0.27, using the open source docker version. With 0.27 I'm getting this error:
typesense-1  | E20240831 20:32:14.127265     1 store.cpp:54] Error while initializing store: IO error: No such file or directory: While mkdir if missing: No such file or directory
typesense-1  | E20240831 20:32:14.127300     1 store.cpp:56] It seems like the data directory  is already being used by another Typesense server.
typesense-1  | E20240831 20:32:14.127313     1 store.cpp:58] If you are SURE that this is not the case, delete the LOCK file in the data db directory and try again.
r
I tried deleting the LOCK files and trying again, but same issue. I was able to revert back to 0.26 for now.
k
We've had several people upgrade so far with no issues. Can you post the exact flags that you use to start Typesense?
n
command: '--log-dir /logs --data-dir /data --api-key=REDACTED --enable-cors --healthy-read-lag 10000 --healthy-write-lag 1000'
k
The actual error in the log is this:
Error while initializing store: IO error: No such file or directory: While mkdir if missing: No such file or directory
Seems like that data directory somehow does not exist when you started with the new version. Can you double check that command indeed points to the correct directory?
n
When I go into the 0.26 running container, there is a
/data
directory:
root@10553484d797:/# ls /data
db  meta  models  state
k
Not sure what's happening, but somehow the data directory is not being found by the Typesense process at startup. Maybe try again to see if it was a one-off thing?
n
Ok, I tried it again. The only change between runs is updating the image in docker-compose from
image: typesense/typesense:26.0
to
image: typesense/typesense:27.0
Still getting the above errors. I can get into the container running 0.27 and I can verify that it can see and has access to all of the files in the
/data
directory.
As an aside, the links to "create a backup" on this page https://typesense.org/docs/27.0/api/#deprecations-behavior-changes go to a 404
@Kishore Nallan This is still not working for me. The server has been running Typesense since 24, and I've been able to upgrade through the previous versions by changing the version in docker-compose.yml with no problems. I've tried starting it multiple times and I've tried restoring from the last snapshot; every attempt returns the error above. I tried following the code to understand it. The error message comes from here: https://github.com/typesense/typesense/blob/4b6ef566859b0f15fb7ca907d5b4e4c88b845a73/src/store.cpp#L54 It appears my "state_dir_path" is an empty string? My docker-compose.yaml looks like:
typesense:
    image: typesense/typesense:27.0
    restart: on-failure
    ports:
      - "5724:8108"
    environment:
      TYPESENSE_LOG_DIR: /logs
      TYPESENSE_DATA_DIR: /data
      TYPESENSE_API_KEY: REDACTED
      TYPESENSE_HEALTHY_READ_LAG: 10000
      TYPESENSE_HELATHY_WRITE_LAG: 1000
      TYPESENSE_DB_COMPACTION_SECONDS: 0
    volumes:
      - /var/local/typesense/data:/data
      - /var/local/typesense/logs:/logs
I've managed to make a file system snapshot of the server and launched another one to play around with (which is exhibiting the same error).
So if you have any debugging suggestions, I'd appreciate it!
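A quick way to compare the two configurations posted in this thread is to derive the environment variable names from the command-line flags and see which names in the compose file do not match. This is only a sketch: it assumes the convention of prefixing with `TYPESENSE_`, upper-casing, and replacing dashes with underscores, and the helper names `flag_to_env` and `unexpected_env_names` are made up for illustration. A name it flags is not necessarily invalid; it just has no counterpart in the earlier command.

```python
# Flags copied from the `command:` line posted earlier in the thread.
COMMAND_FLAGS = [
    "--log-dir", "--data-dir", "--api-key=REDACTED",
    "--enable-cors", "--healthy-read-lag", "--healthy-write-lag",
]

# Environment variable names copied from the docker-compose.yaml above.
COMPOSE_ENV = [
    "TYPESENSE_LOG_DIR", "TYPESENSE_DATA_DIR", "TYPESENSE_API_KEY",
    "TYPESENSE_HEALTHY_READ_LAG", "TYPESENSE_HELATHY_WRITE_LAG",
    "TYPESENSE_DB_COMPACTION_SECONDS",
]


def flag_to_env(flag: str) -> str:
    """Turn a flag such as '--data-dir' into 'TYPESENSE_DATA_DIR' (assumed convention)."""
    name = flag.lstrip("-").split("=", 1)[0]  # drop leading dashes and any '=value'
    return "TYPESENSE_" + name.replace("-", "_").upper()


def unexpected_env_names(flags, env_names):
    """Return env names that do not correspond to any of the given flags."""
    expected = {flag_to_env(f) for f in flags}
    return sorted(n for n in env_names if n not in expected)


if __name__ == "__main__":
    for name in unexpected_env_names(COMMAND_FLAGS, COMPOSE_ENV):
        print("no matching flag in the earlier command:", name)
```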
k
We are unable to offer in-depth support for self-hosted instances because there are so many variables. The best I can do is this: if you can zip the data directory and share it with me over DM, I can try starting it against v27 and check what's happening.
n
I can't do that: there is data that I can't share, and besides, it's over 60 GB compressed. I understand there is only so much support you can offer for a self-hosted instance. Can you at least tell me about one variable:
state_dir_path
is that supposed to be the state directory under the /data directory?
k
state_dir_path is
/<data_dir_path>/db
What type of underlying disk are you using?
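Given that layout, a small check run inside the container can confirm what is actually visible at the configured data directory. This is a minimal sketch, not part of Typesense: `describe_data_dir` is a hypothetical helper, and the only facts it relies on from this thread are that the state directory is the `db` subdirectory and that the log message mentions a `LOCK` file inside it.

```python
import os


def describe_data_dir(data_dir: str) -> dict:
    """Report whether the data directory, its db subdirectory and the LOCK file exist."""
    state_dir = os.path.join(data_dir, "db")  # state_dir_path as described above
    return {
        "data_dir_exists": os.path.isdir(data_dir),
        "state_dir": state_dir,
        "state_dir_exists": os.path.isdir(state_dir),
        "lock_file_exists": os.path.isfile(os.path.join(state_dir, "LOCK")),
    }


if __name__ == "__main__":
    # "/data" is the mount point used in the compose file from this thread.
    for key, value in describe_data_dir("/data").items():
        print(f"{key}: {value}")
```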
n
Hmm, so I went back and tried the RC builds: 27.0.rc22 works, and 27.0.rc23 is the first build that starts with this error.
It's an SSD on a dedicated cloud server hosted by Hetzner.
k
Jun 27 -> July 3 -- that's the time frame for those two RC builds. I will check what changed between those dates.
You can also do another thing. Can you create another test typesense server with dummy data on the same server and check if the same issue happens?
n
yes, I can try that. I guess I should create it with 26 and then upgrade it?
k
Create with v26, run a snapshot and then upgrade to v27
n
this commit touches that code from the date period you mentioned above https://github.com/typesense/typesense/commit/a1f8d7257a63d1acde86664868af379d8f14acfb
k
I saw that. Can you try on
28.0.rc3
n
the error still happens for me on
28.0.rc3
k
Ok I think the easiest way to debug is if you can share a dummy data directory which reproduces the problem.
n
Ok I was able to reproduce it with the sample collection/documents from the documentation as the dummy data. Here is the
data
directory.
k
Ok let me check
n
I started with empty
/data
and
/logs
directories with 26.0, ran the commands to create the schema and add a couple of documents, then shut down, changed Docker to point to 27.0, and the problem appeared. I didn't try taking a snapshot.
k
Ok now I can reproduce the issue locally. I will investigate and get back to you.
n
🙏 Thank you, much appreciated
If it's of any use, the server that it's running on is Ubuntu 22.04 LTS.
k
Ok I know why this is happening now but it should not really cause any issues. You can ignore the error and it should work fine. I will fix the error.
The root cause is that we now accept an analytics directory path for the analytics DB, but if it's not provided, we try to open the store with an empty path; since that fails, an error is logged. This should not affect any regular operations.
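The failure mode described here can be illustrated outside Typesense: on a POSIX system, asking the OS to create a directory at an empty path fails with "No such file or directory", which matches the wording in the log. The sketch below is only an analogy in Python, not the store's actual code, and `mkdir_errno` is a hypothetical helper name.

```python
import errno
import os


def mkdir_errno(path: str):
    """Try to create a directory; return None on success or the errno on failure."""
    try:
        os.mkdir(path)
        return None
    except OSError as exc:
        return exc.errno


if __name__ == "__main__":
    code = mkdir_errno("")  # an empty path, like an unset analytics directory
    if code == errno.ENOENT:
        print("empty path:", os.strerror(code))
```

This also fits the doubled space in "the data directory  is already being used" in the log, where an empty path seems to have been printed between the two words.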
n
Thanks @Kishore Nallan. Ok, I'll ignore that for now. I actually didn't even realize that it was continuing after that, as I was stopping the server. When I start up Typesense, I only see this in the log, and then, because of my data set size, it's about 20 minutes before it's ready:
Attaching to typesense-1
typesense-1  | Log directory is configured as: /logs
typesense-1  | E20240908 16:32:35.939123     1 store.cpp:54] Error while initializing store: IO error: No such file or directory: While mkdir if missing: No such file or directory
typesense-1  | E20240908 16:32:35.939168     1 store.cpp:56] It seems like the data directory  is already being used by another Typesense server.
typesense-1  | E20240908 16:32:35.939189     1 store.cpp:58] If you are SURE that this is not the case, delete the LOCK file in the data db directory and try again.
typesense-1  | E20240908 16:41:44.446480   992 raft_server.h:62] Peer refresh failed, error: Doing another configuration change
It'd be great if it were to actually say that it's building the indices and that it's ready to take requests. I couldn't see any place in the docs that mentioned a log level or setting for this.
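Until the log says more, one option is to filter the output and separate the known harmless store error from anything else. The sketch below parses lines in the shape shown above (severity letter, date, time, thread id, source file and line, then the message) and flags the message where the directory path is empty. The regular expression is inferred from the lines in this thread only, and `parse_log_line` and `has_empty_directory_path` are hypothetical helper names.

```python
import re

# Shape inferred from the log lines in this thread, e.g.
# "E20240908 16:32:35.939168     1 store.cpp:56] message"
LOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{8}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<thread>\d+) "
    r"(?P<source>[^:\s]+):(?P<line>\d+)\] (?P<message>.*)$"
)


def parse_log_line(raw: str):
    """Parse one log line, with or without the 'typesense-1  | ' compose prefix."""
    _, sep, rest = raw.partition("| ")
    text = (rest if sep else raw).strip()
    match = LOG_LINE.match(text)
    return match.groupdict() if match else None


def has_empty_directory_path(message: str) -> bool:
    """True when nothing was printed between 'data directory' and 'is'."""
    return "data directory  is" in message


if __name__ == "__main__":
    sample = (
        "typesense-1  | E20240908 16:32:35.939168     1 store.cpp:56] "
        "It seems like the data directory  is already being used by another Typesense server."
    )
    entry = parse_log_line(sample)
    if entry and has_empty_directory_path(entry["message"]):
        print("store error with an empty directory path at", entry["source"], entry["line"])
```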
k
In the initial state, the server has to download data from the leader. This takes some time, and there should be some logs related to that earlier.