
Fun with BSPs and pre-computed visibility

January 16, 2012

Long ago (according to my last blog post date), I embarked upon a fun journey to revive some of my memories of the days of the mighty BSP tree, first fully realized by Quake.  I set about coding up a builder in C#/XNA based on my memories of how everything worked.

Here’s an early shot of the tool, with teamfort’s lumberyard loaded:

lumberyard

Progress was slow, as at the time I was working full time and then some. I’d plug away at it on holidays or the occasional weekend if I wasn’t caught up in some game or other, or working on other fun projects.

Here’s one such side project: a 2D conversion of the BSP stuff, with grid-based pathfinding:

PathMaker2

Eventually I hit an impasse: hidden leaf removal. This means cutting out all unnecessary geometry, such as outward-facing surfaces. At the time I didn’t really care about having pre-computed visibility or lighting, or even drawing the BSP. All I wanted was a collision hull, the idea being that I would export a map out of Hammer to DXF, import it into Max, and build up the detail from there.

Without hidden leaf removal, though, the tree was inefficient, and I found myself colliding against internal faces that butted against each other in solid space, like two giant blocks of stone side by side.

The key to proper hidden leaf removal is a good set of portals. I couldn’t remember how Quake or Genesis did their portal generation, but my basic algorithm was to take the infinite node planes, extend them into large polygons, and then clip them against all the parent and child nodes. Eventually you’d have a set of portals leading from empty to empty, solid to solid, and solid to empty.
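In rough Python, the "extend each node plane into a big polygon, then clip" idea looks something like this. This is just a sketch of the algorithm described above, not the actual tool code; the plane convention (`dot(normal, p) - dist >= 0` is "in front") and all names are illustrative.

```python
def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def big_polygon_on_plane(normal, dist, size=65536.0):
    """Build a huge quad lying on the plane (normal, dist)."""
    ax = min(range(3), key=lambda i: abs(normal[i]))  # least-aligned axis
    axis = [0.0, 0.0, 0.0]
    axis[ax] = 1.0
    right = cross(normal, axis)
    mag = dot(right, right) ** 0.5
    right = tuple(c / mag for c in right)
    up = cross(normal, right)
    org = tuple(n * dist for n in normal)
    corners = [(1, 1), (1, -1), (-1, -1), (-1, 1)]
    return [tuple(org[i] + size * (a * right[i] + b * up[i]) for i in range(3))
            for a, b in corners]

def clip_to_plane(poly, normal, dist, eps=1e-5):
    """Keep the part of poly on the front side of the plane."""
    out = []
    n = len(poly)
    for i in range(n):
        a, b = poly[i], poly[(i + 1) % n]
        da = dot(normal, a) - dist
        db = dot(normal, b) - dist
        if da >= -eps:
            out.append(a)
        if (da > eps and db < -eps) or (da < -eps and db > eps):
            t = da / (da - db)  # edge crosses the plane: emit the intersection
            out.append(tuple(a[j] + t * (b[j] - a[j]) for j in range(3)))
    return out

def make_portal(node_plane, bounding_planes):
    normal, dist = node_plane
    poly = big_polygon_on_plane(normal, dist)
    for n, d in bounding_planes:   # parent and child node planes
        poly = clip_to_plane(poly, n, d)
        if len(poly) < 3:
            return None            # the portal was clipped away entirely
    return poly
```

Whatever survives all the clips is a candidate portal on that node plane, to be sorted later by which kind of leaf sits on each side.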

Somewhere, something was going wrong, and only on complex maps. Drawing the portals was useless, as they seemed only an unfathomable jumble of squares. Eventually the frustration with my busted portals overcame my desire to write this stuff on my own. Reinventing wheels is fun, but I was tired of the endless bug hunt.

And so I took my first peek at Genesis 3D. I was there at Eclipse Entertainment when it was being built, and though very little of the guts of the engine was written by me, I tried to keep up with what was going on with it. Sometimes John Pollard (who wrote probably 90% of it) would bounce ideas off me, though quite often I had no idea what he was talking about. :)

A long and painful port of the C++ portal code got me working hidden leaf removal, and I decided that I would just port the whole thing, and asked David Stafford (owner of Genesis and founder of Eclipse) for his blessing.

I didn’t end up porting all of it, really just the core build tool and some of the collision routines. I figured I would learn a lot by doing so, and I certainly did, though there are parts of the code that I still don’t fully understand (the face merging stuff, for instance, is still a pure port from C++).

I also found that building in Quake-style editors is just so much easier than fooling around with Max and trying to match up the collision hull to the art.

The other two elements I was missing were vis and light. Vis (the subject of most of this post) is being able to ask the question “What can I see from this spot?” at runtime and get a fast result. It’s just a packed bitvector built in an offline process that can take a long time.
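The runtime side of vis really is just a bit test. Here's a minimal sketch, assuming the vis data is stored as one packed bit row per cluster (real engines also run-length compress the rows, which I'm skipping):

```python
# One packed row of num_clusters bits per cluster; row i, bit j answers
# "can cluster i see cluster j?"
num_clusters = 3000
row_bytes = (num_clusters + 7) // 8
visdata = bytearray(num_clusters * row_bytes)

def set_visible(visdata, row_bytes, from_cluster, to_cluster):
    # Flip one bit in from_cluster's row (done offline by the vis tool).
    visdata[from_cluster * row_bytes + (to_cluster >> 3)] |= 1 << (to_cluster & 7)

def can_see(visdata, row_bytes, from_cluster, to_cluster):
    # The entire runtime query: one index and one bit test.
    return bool(visdata[from_cluster * row_bytes + (to_cluster >> 3)]
                & (1 << (to_cluster & 7)))
```

All the expense lives in the offline build that fills `visdata` in; the game just looks bits up.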

Light is just a ton of raycasts baked down into lightmaps.  It can also take a long time, but is helped a lot by the vis stage.

I went ahead and ported the Genesis vis and light calculators, and aside from a few bugs, they worked fairly well.  When I tried building large maps however, things took a turn for the slow.

Coag_willem, a tough map to vis:

willem

My vis and light weren’t exactly quick, and I had made no effort to optimize anything yet, but when I ran across the Elements map by Vondur (http://www.quaddicted.com/reviews/eels.html), I learned a new definition of slow: it took over a week to vis.

Now the intelligent thing to do here would have been to try a more modern vis tool to compare results, but since I was doing all of this for fun, and to keep my skills sharp, I decided to thread it across my 4 cores.

This turned out to be fairly easy to do, but there were a few gotchas. My first attempt created 4 threads (for my 4-core machine), split the portals evenly, and fed them to each thread. This didn’t work very well, however, as the portal list is sorted by a rough estimate (based on facing) of how many portals each might see. This led to the first two threads finishing very quickly and the last two grinding on for many times longer, with 2 cores sitting idle.

Here’s a shot of a build where I’m trying to do a narrowing scale of portals as cores increase:

Closer

My friend Kythorn then told me about the .NET 4 threading stuff, with Parallel.For. This was better, and kept all 4 cores utilized, but it still wasn’t really practical to wait several days for a build.
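The scheduling difference is easy to sketch. Here's the Parallel.For-style fix in Python (threads as a stand-in; the GIL means this particular sketch won't actually scale, but the scheduling shape is the same, and `flood_portal` and its costs are made up):

```python
from concurrent.futures import ThreadPoolExecutor

def flood_portal(job):
    # Stand-in for the real vis flood; actual costs vary by orders of magnitude.
    portal_id, cost = job
    acc = 0
    for _ in range(cost):
        acc += 1
    return portal_id, acc

# Costs fall off sharply, like a portal list sorted by estimated visibility.
portals = [(i, max(1, 100000 >> i)) for i in range(16)]

# A static split hands each of 4 workers a fixed quarter of the list up
# front, so whoever drew the expensive portals finishes long after the
# rest go idle. Pulling one portal at a time from a shared pool instead
# keeps every worker busy until the list is drained.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(flood_portal, portals))
```

Each worker grabs the next portal the moment it finishes its current one, which is exactly why Parallel.For kept all four cores lit where the even split didn't.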

I experimented a bit with optimizing, and got my garbage collection times down via pooling, but it was still too slow.
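The pooling fix amounts to reusing scratch buffers between floods instead of allocating fresh ones in the inner loop, so the collector has nothing to chase. A minimal sketch of the idea (names are illustrative, not the tool's actual code):

```python
class RowPool:
    """Recycle scratch bit rows instead of allocating per flood."""

    def __init__(self, nbytes):
        self.nbytes = nbytes
        self.free = []

    def get(self):
        if self.free:
            row = self.free.pop()
            for i in range(self.nbytes):
                row[i] = 0             # clear a recycled row instead of realloc
            return row
        return bytearray(self.nbytes)  # pool empty: allocate just this once

    def put(self, row):
        self.free.append(row)          # hand the row back for the next flood
```

Every flood grabs a row with `get()` on the way in and returns it with `put()` on the way out; after warm-up, allocation in the hot path drops to zero.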

So I decided to write a mini client/server that would distribute the load across multiple machines. I decided to use Windows Communication Foundation (WCF) as the go-between, since with that I could connect to Azure cloud services and spin up 100 machines to work on a map if I wanted. I still have yet to try this, but it sounds awesome.

I got it working fairly quickly, but it was very difficult to decide how much work to give each machine. In most large maps, certain sections of portals took many thousands of times longer than the typical portal to compute. Inevitably something slow like my netbook would pick up these sections.

In an effort to measure it, I began printing out iteration counts and discovered that parts of Elements were reaching almost a billion iterations through the flooding routine. Inevitably these sections would hit the 24-hour timeout on WCF and end up being sent to another machine after failing.

Somewhere in here I discovered that distributed vis is not going to be as good as local for two reasons:

#1: the boost gained from sorting portals is nullified. When you do the full computation on portals in order, starting with those likely to see the fewest bits of the map, later portals can end their vis flooding earlier. The portals near the front of the list are often tight corners, and once their final vis data is complete, future portals that flood into them often finish early.

#2: since only a rough idea of what a portal can see is sent across the network, the vis ends up slightly less tight, as each flood is flooding through unfinished portals. On a single machine, once the final vis data for a portal A is finished, other portals use it when flooding through A, tightening the vis (and speeding it up, as in #1).
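Both points come down to the same check inside the flood. A toy illustration, with made-up `Portal` fields: a portal finished locally contributes its exact vis row, while a remote worker only ever sees the rough estimate.

```python
class Portal:
    def __init__(self, mightsee):
        self.mightsee = set(mightsee)  # rough estimate from the cheap pass
        self.vis = None                # exact result, filled in when finished

def can_flood_through(portal, target):
    if portal.vis is not None:
        return target in portal.vis    # finished locally: exact, prunes more
    return target in portal.mightsee   # unfinished/remote: lets more through

p = Portal(mightsee={1, 2, 3})
assert can_flood_through(p, 3)         # rough data says "maybe"
p.vis = {1}                            # final flood proved 2 and 3 unreachable
assert not can_flood_through(p, 3)     # later floods now stop here early
```

On one machine, every portal that finishes tightens the check for everyone still flooding; distributed workers are stuck on the first branch's fallback.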

Back to iterations: I took a look at Quake 2’s vis compiler, and it was reaching a max of around one million, with the average portal iterating far less. I dug into the code on both sides and couldn’t really see much of a difference. Genesis did a sort of leaf-at-a-time vis, while Q2 did portals, then gathered the leaf vis data at the end.

I experimented with porting some of the Q2 portal stuff over and eventually saw the difference. Q2 merges leaves into clusters, sort of by reverse BSP order, so you get several leaves crammed together into a section of empty space. On Elements it has around 3000 of these clusters, while Genesis had, I think, around 12000, if memory serves.
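The merging step can be sketched as a union-find over leaves, where `merge_pairs` stands in for whatever rule decides two empty leaves share a pocket of space (Q2's actual rule differs; this only shows the collapse-and-renumber part):

```python
def assign_clusters(num_leaves, merge_pairs):
    """Collapse merged leaves together, then hand out dense cluster ids."""
    parent = list(range(num_leaves))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps finds cheap
            x = parent[x]
        return x

    for a, b in merge_pairs:
        parent[find(a)] = find(b)          # union the two pockets

    ids = {}                               # root -> dense cluster id
    return [ids.setdefault(find(i), len(ids)) for i in range(num_leaves)]

# Five empty leaves; the first three share one pocket of empty space.
clusters = assign_clusters(5, [(0, 1), (1, 2)])
```

Shrinking 12000 leaves toward 3000 clusters shrinks every vis row, and every flood, along with them, which is where the speedup comes from.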

Here’s a shot of a 12 portal cluster in the E4M1 map from Quake 1:

ClusterMode

A more complex 22 portal cluster from Elements:

ClusterMode2

I then remembered why. Genesis at the time (the 1.3 code I was looking at) was still tied heavily to software rendering, as we were beginning to target the web. A tighter vis was important for drawing, so Genesis didn’t do the clustering thing, since you get a slightly sloppier vis as a result of the decreased granularity.

I had never intended to use the vis data for drawing, though I do use it to create a material vis, which is sort of “what materials can I see from this position?”. For that, a sloppy vis is fine, so I kept the clustered approach.

Here’s a sort of worst case looking up a stairway that leads down a hallway in Elements:

Vis

Here’s a shot flying above and to the side with the vis point frozen at the previous camera spot:

VisFrozen

And a shot of the material vis from the same positions (cam and vis):

MatVis

With clustering, my vis times fell tremendously. I could now vis Elements in an hour and a half on a single machine with four threads. Quake 2’s C++ tool does it in around 2.6 hours single-threaded, so in the same ballpark, though the C++ is of course faster once you account for the threads.

Other than drawing, what is vis good for? Well, it greatly speeds up the lighting stage, allowing lights to check whether a surface is visible before attempting to raycast to it. It’s also handy for generating pathing nodes, and as a quick, rough visibility test for AI.
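The lighting speedup is just a one-bit test in front of the expensive work. A sketch, assuming `pvs_row` is the light cluster's packed bit row; the function names and surface layout are illustrative, not the tool's actual API:

```python
def light_surface(light_pos, surface, pvs_row, raycast):
    cluster = surface["cluster"]
    # Cheap bit test: if the surface's cluster isn't in the light's PVS
    # row, no ray could possibly connect them, so skip the cast entirely.
    if not pvs_row[cluster >> 3] & (1 << (cluster & 7)):
        return 0.0
    return raycast(light_pos, surface)  # expensive occlusion test
```

On a visibility-heavy map most surfaces fail the bit test, so most raycasts never happen, which is why light leans so hard on the vis stage.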

I’ve since used this little BSP engine in a ludum dare entry:  http://www.ludumdare.com/compo/ludum-dare-22/?action=preview&uid=5649

The experience from this was invaluable. I found a lot of bugs that I hadn’t encountered before simply by testing downloaded maps. It also exercised the gameplay functionality, stuff like entities and triggers that I had barely touched. I definitely recommend using quick games as a means to test an engine or framework.

I’ll do another post soon about detail brushes, the lighting stage, and what I hope to do with the tools in the future. Hopefully it won’t take 4 years. :)
