Discussion:
[gdal-dev] GDAL /vsistdin/ 1MB limit
Romeo Alvaraz
2021-05-04 05:04:22 UTC
Hi there,

I've got a PDF geo-referencing implementation in place which makes use of
the gdal_translate utility. Recently I've been testing out an
implementation to read in the input content from standard input stream using
the /vsistdin/ handler.

According to the documentation for /vsistdin/ handler - "Full seek in the
first MB of a file is possible" and I've encountered the following error(s)
when the content I've attempted to pipe in is greater than 1MB.

ERROR 6: Seek(SEEK_END) unsupported on /vsistdin when stdin > 1 MB
ERROR 6: backward Seek() unsupported on /vsistdin above first MB

Would this be a hard limit of 1MB placed on the content size which can be piped
in via /vsistdin/ (the source code and the error message suggest it might be)
or can it be configured? Has anyone encountered this before?

Any suggestions will be appreciated

Thanks.



Platform: Windows 64-bit
Version: GDAL 2.4.4, released 2020/01/08 sourced from
https://www.gisinternals.com/release.php




--
Sent from: http://osgeo-org.1560.x6.nabble.com/GDAL-Dev-f3742093.html
Even Rouault
2021-05-04 09:25:58 UTC
/vsistdin/ is only meant for formats that can be read in a streamed
fashion, that is, where bytes are read as they come from stdin. That
is only a subset of formats, and PDF is not in that subset (typically
the cross-reference table to objects is put at the end of the
file, hence the Seek(SEEK_END) error you get). The 1 MB limit corresponds
to the amount of data that is cached at the beginning of the file (this
helps for drivers that do almost streamed reading, but do some backward
seeks in their header part). Beyond that, /vsistdin/ doesn't buffer
anything, so if the driver tries to seek back, that fails. This 1 MB
limit could potentially be changed to be runtime configurable, but that's
not the case currently.
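
To make the caching behaviour described above concrete, here is a toy model in Python. It is not GDAL's implementation: the class name HeadCachedStream and its methods are invented for illustration, and the exact conditions under which the real handler fails may differ. It only shows why backward seeks and SEEK_END work on small inputs and fail once more than the cached head has been consumed.

```python
import io

CACHE_LIMIT = 1024 * 1024  # 1 MB, the size of the cached head mentioned above


class HeadCachedStream:
    """Toy forward-only stream that remembers only its first bytes.

    Hypothetical illustration, not GDAL code.
    """

    def __init__(self, raw, cache_limit=CACHE_LIMIT):
        self._raw = raw            # underlying non-seekable stream
        self._limit = cache_limit  # how many leading bytes are kept
        self._cache = bytearray()  # the cached head of the stream
        self._pos = 0              # position seen by the caller
        self._raw_pos = 0          # bytes consumed from the raw stream

    def read(self, n):
        out = bytearray()
        if self._pos < len(self._cache):
            # Serve as much as possible from the cached head.
            chunk = self._cache[self._pos:self._pos + n]
            out += chunk
            self._pos += len(chunk)
            n -= len(chunk)
        if n > 0:
            if self._pos != self._raw_pos:
                # Bytes between the cache and the raw position are gone.
                raise OSError("data beyond the cached head was not kept")
            fresh = self._raw.read(n)
            room = self._limit - len(self._cache)
            if room > 0:
                self._cache += fresh[:room]
            self._raw_pos += len(fresh)
            self._pos = self._raw_pos
            out += fresh
        return bytes(out)

    def seek(self, offset):
        """Absolute seek; backward seeks only work inside the cached head."""
        if offset >= self._pos:
            # Forward seek: read and discard.
            remaining = offset - self._pos
            while remaining > 0:
                chunk = self.read(min(remaining, 65536))
                if not chunk:
                    break
                remaining -= len(chunk)
        elif offset <= len(self._cache):
            self._pos = offset
        else:
            raise OSError("backward seek unsupported above the cached head")

    def seek_end(self):
        """Find the end by draining the stream; fail if it outgrew the cache."""
        while self.read(65536):
            pass
        if self._raw_pos > self._limit:
            raise OSError("seek to end unsupported when stream exceeds the cache")
```

A PDF reader typically starts by seeking to the end of the file to find the cross-reference table and then jumps backwards, which is exactly the access pattern this model rejects once the input is larger than the cache.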
_______________________________________________
gdal-dev mailing list
https://lists.osgeo.org/mailman/listinfo/gdal-dev
--
http://www.spatialys.com
My software is free, but my time generally not.
Romeo Alvaraz
2021-05-05 02:02:38 UTC
Thanks Even for the response and clarifications on /vsistdin/. Our reason
for attempting to use /vsistdin/ is to avoid having to read a file from the
local file system (which we would have to write to first by other means) on
our remote server. We've been using this in conjunction with /vsistdout/.

/vsicurl/ was another option we considered for the input but it's not ideal
for our situation.

I'm just wondering if there is another handler/method you could suggest
that could feed the input directly to gdal_translate in order to bypass use
of the file system?

Thanks.




Andrew C Aitchison
2021-05-05 07:32:26 UTC
Post by Romeo Alvaraz
Thanks Even for the response and clarifications on /vsistdin/. Our reasons
for attempting to use /vsistdin/ is to avoid having to read a file from the
local file system (which we would have to write to first by other means) on
our remote server. We've been using this in conjunction with /vsistdout/.
/vsicurl/ was another option we considered for the input but it's not ideal
to our situation.
I'm just wondering if there is another handler/method you could suggest
that could feed the input directly to gdal_translate in order to bypass use
of file system?
A Python script could store the incoming data in /vsimem/ and call
gdal_translate internally.
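
A minimal sketch of that idea, assuming the GDAL Python bindings (osgeo) are installed and that the PDF driver in your build can read from /vsimem/. The helper names read_all and translate_from_bytes, the random in-memory file name, and the GTiff default are illustrative choices, not something prescribed in this thread.

```python
import sys
import uuid


def read_all(stream, chunk_size=1 << 20):
    """Read a binary stream to the end and return its contents as bytes."""
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def translate_from_bytes(data, dst_path, fmt="GTiff"):
    """Expose `data` as a /vsimem/ file and run the translate step on it."""
    # Imported here so read_all() stays usable without GDAL installed.
    from osgeo import gdal

    gdal.UseExceptions()
    # Hypothetical in-memory name; any unique /vsimem/ path would do.
    src_path = "/vsimem/" + uuid.uuid4().hex + ".pdf"
    gdal.FileFromMemBuffer(src_path, data)
    try:
        out = gdal.Translate(dst_path, src_path, format=fmt)
        if out is None:
            raise RuntimeError("gdal.Translate returned no dataset")
        out = None  # drop the reference so the output is flushed and closed
    finally:
        gdal.Unlink(src_path)  # free the in-memory copy


def main(argv):
    """Read the whole input from stdin and write the result to argv[1]."""
    translate_from_bytes(read_all(sys.stdin.buffer), argv[1])
```

Calling main(sys.argv) from a script would let it sit at the end of a pipe in place of gdal_translate /vsistdin/. Because the whole input is held in memory, the driver can seek freely, at the cost of needing enough RAM for the file.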
--
Andrew C. Aitchison Kendal, UK
***@aitchison.me.uk
Robert Coup
2021-05-05 08:14:08 UTC
Post by Romeo Alvaraz
I'm just wondering if there is another handler/method you could suggest
that could feed the input directly to gdal_translate in order to bypass use
of file system?
If your data easily fits in memory, write it to a tmpfs filesystem
(eg: to /dev/shm/my.pdf)? Doesn't touch the disk so you get the same
performance without the downside of it needing to work as a stream.
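
A rough sketch of that approach in Python, assuming gdal_translate is on the PATH. The helper names are made up for illustration. /dev/shm only exists on Linux, so the sketch falls back to the regular temp directory elsewhere (for example on the Windows setup mentioned at the top of the thread), where the file may then touch the disk.

```python
import os
import shutil
import subprocess
import tempfile


def pick_ram_dir(preferred="/dev/shm"):
    """Return the tmpfs directory if usable, else the default temp dir."""
    if os.path.isdir(preferred) and os.access(preferred, os.W_OK):
        return preferred
    return tempfile.gettempdir()


def spool_to_file(stream, directory, suffix=".pdf"):
    """Copy a binary stream into a new uniquely named file; return its path."""
    fd, path = tempfile.mkstemp(suffix=suffix, dir=directory)
    with os.fdopen(fd, "wb") as out:
        shutil.copyfileobj(stream, out)
    return path


def translate_via_tmpfs(stream, dst_path, fmt="GTiff"):
    """Spool the stream to a RAM-backed file, then run gdal_translate on it."""
    src_path = spool_to_file(stream, pick_ram_dir())
    try:
        subprocess.run(
            ["gdal_translate", "-of", fmt, src_path, dst_path],
            check=True,
        )
    finally:
        os.remove(src_path)  # always clean up the spooled input
```

Passing sys.stdin.buffer as the stream keeps the pipe-based workflow while giving the driver a real, fully seekable file.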

Rob :)
