
Processing time for huge shape files

Horst18519

I guess another of those only-Arno-knows questions. :teacher:

I got some shape files here I want to process, but the "replace polygons" step takes forever to process. This is (a bit simplified) the file I'm using:

Code:
IMPORTOGR|myfile.shp|*|*|NOREPROJ
#
SPLITGRID|AGN|*
#
REPLACEPOLYGONBYVEGETATIONRECTANGLES|*|0.001|0.25|0.25
#
CREATEAGNRECTVEG|*|{22a1acbd-2438-4131-ad7d-41d971c5f742}
#
EXPORTAGN|FSX|myfiles\

The shp file is only 200kb, but scenproc has already been working on it for more than 5 minutes. The original file of 133MB started 20 hours ago and is still processing. :eek:
Maybe I should use vegpolys instead of veg rectangles, would that make sense? My shape file is rather complex so there are a lot of big and small polys in it. I thought FSX/P3D would be happier to have rectangles to work with instead of polygons, but that's just an assumption.
Or is it the numbers I used in the replacePoly argument?
 
Well, I guess I can answer this myself, at least in part. Looks like I got confused by those create and replace polygon commands. It seems scenproc doesn't need to replace anything: you just import your shp file, create the rectangles and export everything. So deleting the "replace..." line pretty much solved the long processing times. :)
 
Hi,

The replace step can indeed be slow, especially with complex shapes, because testing whether a point is inside the polygon takes longer for them.

Personally I first use polygonal autogen. Only if that's not what I want I use rectangles.

If you create rectangles from polygons directly the rectangle will be the bounding box of the polygon.
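To illustrate the two points above, here is a minimal Python sketch of a bounding box and a ray-casting point-in-polygon test. It is not scenProc's implementation (that is not shown in this thread); the function names and the L-shaped sample polygon are made up for illustration. It does show why the inside test gets slower for complex shapes: every test walks all edges of the polygon.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def bounding_box(polygon: List[Point]) -> Tuple[float, float, float, float]:
    """Return (min_x, min_y, max_x, max_y) of the polygon's vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count how many edges a ray going right from the point crosses."""
    px, py = point
    inside = False
    n = len(polygon)
    for i in range(n):  # one pass over every edge, so cost grows with vertex count
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line y = py can be crossed.
        if (y1 > py) != (y2 > py):
            # x coordinate where the edge meets that horizontal line.
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


# Hypothetical L-shaped polygon: its bounding box covers area the polygon does not.
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
box = bounding_box(l_shape)
in_corner = point_in_polygon((0.5, 0.5), l_shape)
in_notch = point_in_polygon((3, 3), l_shape)
```

The point (3, 3) lies inside the bounding box but outside the polygon, which is the kind of area a rectangle created directly from the polygon would wrongly fill with vegetation.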
 
Thanks for the information. I wasn't aware that it's creating a bounding box. Guess it makes sense to at least process the bigger polys as polygons then.
 
I now have a similar problem with the Feature Detection. scenproc has been running for more than 24 hours now trying to detect features.
I use this code:
Code:
ImportGDAL|Input_GeoTiff.tif|NOREPROJ

#
SplitGrid|AGN|*
#
DetectFeatures|FTYPE="RASTER"|Detect_Veg_v2.tfc|String;veg|tree|NONE
#
MergeGrid
#
ExportSHP|FTYPE="POLYGON"|veg_trees|C:\Export\veg
It's almost identical to the code from the (very nice!) scenproc manual. (I think you should add "|NOREPROJ" and "|NONE" in the manual, though, as I get errors using the exact code you used there.)

The input file is 1.5GB in size; the detection file worked very nicely on a small test image in the texture tool. Does the detection time depend mainly on the size of the input image or on the values in the tfc file? Would it improve the workflow if I used an image that only includes vegetation areas (like a 1 bit file) instead of using the original geotiff?
 
Hi Thorsten,

The performance for the feature detection depends on many factors.

Of course the input image matters, as that determines how many pixels there are to evaluate. As a reference for a 125 sq km island using 1 meter resolution imagery it took me about 1 hour to detect all vegetation.
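As a rough sanity check on that reference figure, the pixel count and throughput can be worked out in a few lines of Python. This is only back-of-the-envelope arithmetic based on the numbers quoted above (125 sq km, 1 meter resolution, about 1 hour); the function names are made up, and the real detection time also depends on the other factors discussed in this post.

```python
def pixel_count(area_sq_km: float, resolution_m: float) -> int:
    """Number of pixels needed to cover an area at a given ground resolution."""
    pixels_per_km = 1000.0 / resolution_m
    return round(area_sq_km * pixels_per_km ** 2)


def pixels_per_second(pixels: int, seconds: float) -> float:
    """Average detection throughput."""
    return pixels / seconds


# Reference case from the post: 125 sq km at 1 m resolution in about 1 hour.
reference_pixels = pixel_count(125, 1.0)
reference_rate = pixels_per_second(reference_pixels, 3600)


def estimated_hours(pixels: int, rate: float = reference_rate) -> float:
    """Very rough time estimate, assuming the same throughput as the reference."""
    return pixels / rate / 3600
```

So the reference run evaluated 125 million pixels, roughly 35,000 per second. An image with four times as many pixels would need about four hours at that rate, before the extra postprocessing for dense vegetation is counted.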

But the amount of vegetation will also matter, because if there is more vegetation the postprocessing and vectorization will take more time.

I mainly test with the NDVI filter, but for the Histogram match filter the number of sample points you are checking against will also have an influence on the performance.

At the moment scenProc only supports 3 band images, so a one bit file will not work. But how do you imagine that? If you already have the vegetation areas you don't need to detect them anymore, do you? You can use the mask feature to limit where the detection is run, so that way you can use landcover data to guide the detection a bit and save time.
 
I guess it's a combination of image size and amount of green areas then. Will probably create 1bit images from the source imagery and process that. Problem is that QGIS also takes a huge amount of time to vectorize raster images. The only working way I found is to use the contour option, but I can't seem to be able to get polygon holes to work that way. If I process the resulting shape files in scenproc I get vegetation polygons without those holes. I'll run a few more tests on this, though.

If you already have 1 bit imagery covering the vegetation areas what would be the fastest way to create vegetation agn from that?
 
Both QGIS and scenProc use GDAL to do the vectorization, so performance is probably similar :)

For the moment scenProc can't read 1 bit images. But I'm also working on that. Once it can, the feature detection would still be the way to go. I don't have plans for a raster to vector step at the moment.
 
raster to vector is quite a challenge even for the best of apps like GlobalMapper,
it is not a reliable method to convert to shape; you will end up with jagged lines you still need to manually fix,
from personal experience, this is not a practical route for our use.
 
I don't fully agree with you there. For regions without vegetation data it is very valuable to be able to classify them from the raster. Even jagged edges are not a big issue, because the autogen engine will scatter trees inside the polygon and you don't see the edges. And in scenProc I smooth/optimize the polygons first after vectorization.

If you have good data available only as raster it's a valuable technique to have. And I feel the results are quite good in general.
 
ahh, i see your point!
i wasn't thinking in terms of a one large poly;
i was using a personal experience with converting fine smaller shapes as a reference,
 

Hi Arno:

I am remembering the AGN detection and run time rendering options discussed in these threads as I read of the above internal work-flow performed by ScenProc:

http://www.fsdeveloper.com/forum/th...togen-configuration.428760/page-3#post-679056

http://www.fsdeveloper.com/forum/threads/shapefile-with-holes-islands-in-it.351937/

http://www.fsdeveloper.com/forum/threads/accuracy-of-generating-autogen-trees.426360/

http://www.fsdeveloper.com/forum/th...nd-rectangular-vegetation.430935/#post-679255


Can the above cited step in ScenProc to "smooth/optimize the polygons first after vectorization" be disabled / skipped, if we have highly detailed raster data sources we are submitting to ScenProc, and we want to preserve accuracy at the expense of possible aliasing / blocking / jagged edges in our data when converted to vector format ?


Since "Polyline vegetation is rasterized into a sampling bitmap" by the FS rendering engine at run time:

https://msdn.microsoft.com/en-us/library/cc526979.aspx#CreatingBuildingFootprints


...some end users might wish to preserve intended AGN annotation accuracy via use of this parameter in the FSX.Cfg [TERRAIN] sub-section:

IMAGE_PIXELS_FOR_AUTOGEN_POLYGONS=

...with parameter values of 1024 or 2048 at run time, and may want to see accurate vector 'shapes' derived from rasterized "path" images.

https://www.google.com/#q=graphic+image+vector+path


Thanks in advance for your reply, and for considering this option for ScenProc, as it could save a lot of work for end users attempting to perform such raster to vector conversions in non-GIS software what with all the additional steps otherwise required to Geo-rectify their source imagery and derived vector output files. :)

https://en.wikipedia.org/wiki/Comparison_of_raster-to-vector_conversion_software


GaryGB
 
Hi Gary,

I know the autogen engine does create a raster again. But I still think it is more efficient to do some optimization after creating the vectors. scenProc selects the level of smoothing so that it only removes the jagged edges from the pixels; the shape itself is not affected. That way fewer vertices are needed to describe the shapes, which results in smaller AGN files (and probably slightly better performance).

Also for buildings the optimization is needed, else they would not get the right orientation. With the jagged 90 degree edges of the pixels all buildings would end up heading north and that is not useful.
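The effect of such an optimization can be sketched with the standard Douglas-Peucker simplification in Python. Which algorithm scenProc actually uses is not stated in this thread, so take this as an illustration of the idea only; the function names and the staircase sample are hypothetical. With a tolerance just under one pixel, a pixel staircase collapses to a single straight edge.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def _distance_to_line(p: Point, a: Point, b: Point) -> float:
    """Perpendicular distance from p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length


def simplify(points: List[Point], tolerance: float) -> List[Point]:
    """Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the intermediate vertex farthest from the line first -> last.
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _distance_to_line(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        # Every intermediate vertex is close enough: keep only the endpoints.
        return [first, last]
    # Otherwise keep the farthest vertex and simplify both halves.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical staircase as produced by vectorizing pixels along a diagonal edge.
staircase = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2), (3, 2), (3, 3)]
smoothed = simplify(staircase, 0.75)
```

Here seven vertices become two, and the resulting edge runs diagonally instead of in 90 degree steps, which is also what lets a building footprint get a meaningful orientation.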
 
Unfortunately I have to bring this up again.
Thanks to the new version, memory usage has decreased drastically (great job!), but the long processing times are still a problem for me. I already split my huge files into big ones, but scenproc still needs a lot of time to process them. To avoid unnecessary steps I already provide ESRI shp files and do not use the agn detection, since that included additional steps that slowed down the process. I ran a test with a rather small shp file (2MB); the total processing time was around 20 minutes. My files are between 15 and 50MB in size, though, so you can imagine how long that will take, even more so if I import several of them. RAM and CPU don't seem to have a big influence on this: I am running a test on 2 different PCs at the moment, and both progress bars seem to move at the same speed. Approx. 1 minute per 100kb seems to be the average for scenproc when importing shp files. There also doesn't seem to be a difference if a database file (dbf) is provided.
Now I wonder if there's a way of accelerating the import step. I assume scenproc reads the input shp file byte by byte. Reading through the ESRI documentation (https://www.esri.com/library/whitepapers/pdfs/shapefile.pdf) I would assume it is possible to read only the important bytes and skip the others, as long as the user makes sure the shp file only includes multipolys. Would that help improve the import process?

I wouldn't mind the long wait, but my total data for the current project is around 2GB of shp files. At 1 minute per 100kb that would mean about 20,000 minutes, or roughly two weeks of continuous processing, just for importing the files. I don't think users would understand that delay. :D
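Taking the observed rate of roughly 1 minute per 100kb at face value, the total for 2GB can be checked with a short Python calculation. This is a plain linear extrapolation (the function name is made up, and decimal units are assumed), so it says nothing about how import time really scales with file size or feature count.

```python
def estimated_minutes(total_kb: float, minutes_per_100_kb: float = 1.0) -> float:
    """Linear extrapolation of import time from an observed rate."""
    return total_kb / 100.0 * minutes_per_100_kb


total_kb = 2 * 1000 * 1000  # 2 GB expressed in kB (decimal units assumed)
minutes = estimated_minutes(total_kb)
days = minutes / (60 * 24)
```

That comes to 20,000 minutes, or just under 14 days of uninterrupted processing.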

EDIT:

Looking at the shp files I realized they include an attribute entry "DN". Not sure what that's about, but it turns out 30-40% of the polygons in the shp file are tiny rectangles (single pixels from the rasterize conversion I made to create those shp files from aerial imagery) with DN values of 64, 128 or 191. I tried using "DN;255" as an option in the scenproc importOGR command. That actually seemed to help, but eventually scenproc crashed importing the file using that addition.
For a test I removed the polys with DN<255 to decrease shp size by some 30-40%, but of course import time is still around 1min/100kb.
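For what it's worth, "DN" is the default attribute name that GDAL's polygonize tool writes for the source pixel value, so the low values most likely mark partially covered edge pixels. The filter applied above can be sketched in Python on GeoJSON-like feature dictionaries. This is only an illustration with made-up sample features and a hypothetical function name; it does not read a real shp file and is not how scenProc's import filter works internally.

```python
from typing import Any, Dict, List

Feature = Dict[str, Any]


def keep_features_with_value(
    features: List[Feature], attribute: str = "DN", required_value: int = 255
) -> List[Feature]:
    """Keep only features whose attribute equals the required value."""
    return [
        f for f in features
        if f.get("properties", {}).get(attribute) == required_value
    ]


# Made-up sample: one real vegetation polygon and three single-pixel leftovers.
sample_features = [
    {"properties": {"DN": 255}, "geometry": "large vegetation polygon"},
    {"properties": {"DN": 64}, "geometry": "single pixel"},
    {"properties": {"DN": 128}, "geometry": "single pixel"},
    {"properties": {"DN": 191}, "geometry": "single pixel"},
]
kept = keep_features_with_value(sample_features)
```

Filtering before the import reduces the feature count, which matters if import time depends more on the number of features than on the file size.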
 
Hi Thorsten,

Is it the import step that you feel is slow, or is it the total script? In the latter case it would help to know which steps you have in your script.

The import SHP sounds a bit slow on your side, I'm often testing with bigger files than 1 MB and they are not taking ages. I would have to check on my PC how big they really are.

But having said that, 2 GB of SHP data is a lot in general. I mean, if I grab all OSM buildings from the Netherlands, that's less than 2 GB I think.
 
Hi Thorsten,

Just did a quick test here. A 493 MB SHP file loaded in 194 seconds. But I think the performance is determined more by the number of features than by the disk space. My file contained 2993429 features. So that's around 2.5 MB/sec and circa 15000 features per second.

How many features does your file have?

If you have a lot of features with holes in them, loading will be slower. You can see what happens if you add the DONTPROCESSHOLES option.
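The rates quoted above follow directly from the test numbers, and a short Python calculation also shows how far they are from the 1 minute per 100kb reported earlier in the thread. The function name is made up and decimal units are assumed; it is only arithmetic on the figures already given.

```python
from typing import Tuple


def throughput(size_mb: float, features: int, seconds: float) -> Tuple[float, float]:
    """Return (MB per second, features per second) for one import run."""
    return size_mb / seconds, features / seconds


# Test run from this post: 493 MB, 2993429 features, 194 seconds.
mb_per_s, features_per_s = throughput(493, 2_993_429, 194)

# Rate reported earlier in the thread: about 100 kB per minute.
reported_mb_per_s = 0.1 / 60
speed_ratio = mb_per_s / reported_mb_per_s
```

The test run works out to about 2.54 MB/sec and about 15,400 features per second, around 1,500 times faster than the reported rate. A gap that large points to something specific in the slow files, such as the feature count or holes, rather than to file size alone.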
 
It's the importogr step. To improve processing performance I removed those tiny polys and used the simplify feature in QGIS, as I thought this might help. The shp files lost 30% because of the first step and another 10-20% from the second. scenproc has still been working for more than 6 hours now, trying to import a 13MB file.

EDIT:

Wow, that's a big difference to what I get here. 6 hours for 13MB is quite different from your numbers.
I won't be able to work on those files for 1 week now but I'll do some more tests (don't process polyholes) after I get back.
 
Otherwise, can you send me one file to test here?
 