Kevin Donovan
06/30/2022, 1:04 PM
Jason Bosco
06/30/2022, 3:28 PM
Kevin Donovan
07/01/2022, 6:19 AM
Kevin Donovan
07/01/2022, 6:21 AM
Jason Bosco
07/01/2022, 3:55 PM
Kevin Donovan
07/04/2022, 12:00 PM
Kevin Donovan
07/04/2022, 1:48 PM
Kevin Donovan
07/04/2022, 3:36 PM
Jason Bosco
07/04/2022, 8:52 PM
Kevin Donovan
07/05/2022, 3:02 PM
Kevin Donovan
07/05/2022, 3:21 PM
Jason Bosco
07/05/2022, 6:02 PM
Kevin Donovan
07/06/2022, 7:55 AM
Kevin Donovan
07/06/2022, 9:45 AM
Jason Bosco
07/06/2022, 4:08 PM
WARNING:py.warnings:/root/.local/share/virtualenvs/root-BuDEOXnJ/lib/python3.6/site-packages/scrapy/spidermiddlewares/offsite.py:69: PortWarning: allowed_domains accepts only domains without ports. Ignoring entry host.docker.internal:3000 in allowed_domains.
  warnings.warn(message, PortWarning)
The file that's raising that warning is inside the scrapy package. I also searched their source code for that message, and confirmed that it is coming from within scrapy.
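To illustrate what the warning is asking for, here is a minimal sketch (not the scraper's or scrapy's own code) that splits a host:port entry so only the bare hostname remains. The helper names split_host_port and without_ports are hypothetical, and whether the scraper config lets you override allowed_domains this way is an assumption I have not verified.

```python
from urllib.parse import urlsplit


def split_host_port(entry):
    """Split an allowed_domains-style entry such as 'host:3000' into (hostname, port).

    Hypothetical helper for illustration only; port is None when the entry has none.
    """
    # Prefixing '//' makes urlsplit treat the entry as a network location.
    parts = urlsplit("//" + entry)
    return parts.hostname, parts.port


def without_ports(entries):
    """Return the entries with any ':port' suffix dropped."""
    return [split_host_port(entry)[0] for entry in entries]


# The first entry is the one the PortWarning reports as ignored.
cleaned = without_ports(["host.docker.internal:3000", "www.algotrader.com"])
print(cleaned)
```

With the port stripped, the entry has the shape the warning says allowed_domains accepts; the port itself would still have to come from the start URL or sitemap URL.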
re: workaround, the iptables-based approach mentioned in the GitHub issue that was shared seems to work.
Kevin Donovan
07/07/2022, 7:33 AM
DEBUG:scrapy.core.engine:Crawled (404) <GET https://www.algotrader.com/docs/virtual_spot_positions> (referer: http://host.docker.internal:3000/sitemap.xml)
Kevin Donovan
07/07/2022, 7:55 AM
virtual_spot_positions, which is the name of a page in the Docusaurus site. Unfortunately, it substitutes the IP address implied by host.docker.internal with the URL of the organization. And it can't find virtual_spot_positions, of course. Arrgh. Redirecting the port does solve one problem, but in this case it led to another. I will attempt to duplicate the environment that was described in the GitHub posting. In the case where I encountered this problem, the Docusaurus site was running in Windows, whereas Typesense was running in Docker on WSL Ubuntu on the same physical machine. In the configuration where port redirection succeeded, the Docusaurus site was running on Ubuntu. This might explain why it works in one situation but not another.
Jason Bosco
07/08/2022, 7:21 PM